
263 points by itzlambda | 1 comment
lsy:
If you have a decent understanding of how LLMs work (you put in basically every piece of text you can find, get a statistical machine that models text really well, then use contractors to train it to model text in conversational form), then you probably don't need to consume a big diet of ongoing output from PR people, bloggers, thought leaders, and internet rationalists. That diet seems likely to send you down some millenarian path that isn't helpful.

Despite the feeling that it's a fast-moving field, most of the differences in actual models over the last few years are of degree, not kind, and the majority of ongoing work is in tooling and integrations, which you can probably keep up with as it becomes useful for your work. Remembering that it's a model of text, and ungrounded, goes a long way toward discerning what kinds of work it's useful for (where verification of output is either straightforward or unnecessary) and what kinds it's not.

qsort:
I agree, but with the caveat that it's probably a bad time to fall asleep at the wheel. I'm very much a "nothing ever happens" kind of guy, but I see a lot of people who aren't taking the time to actually understand how LLMs work, and I think that's a huge mistake.

Last week I showed some colleagues how to do some basic things with Claude Code and they were like "wow, I didn't even know this existed". Bro, what are you even doing.

There is definitely a lot of hype, and the lunatics on LinkedIn are having a blast, but, to put it mildly, I don't think it's a bad investment to experiment a bit with what's possible with the SOTA.

crystal_revenge:
> I see a lot of people who aren't taking the time to actually understand how LLMs work

The trouble is that the advice in the post will have very little impact on "understanding how LLMs work". The number of people who talk about LLMs daily but have never run one locally, and certainly never "opened it up to mess around", is very large.
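
To make that concrete, here's about the smallest possible version of "messing around" with a local model: load a small LLM and look directly at the next-token distribution it assigns after a prompt. This is a minimal sketch assuming Hugging Face transformers, with DistilGPT-2 as the model purely for illustration (my choice, not the commenter's):

    # Load a small local model and inspect its raw next-token
    # distribution. DistilGPT-2 is an arbitrary small choice.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("distilgpt2")
    model = AutoModelForCausalLM.from_pretrained("distilgpt2").eval()

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]      # scores for the next token
    probs = torch.softmax(logits, dim=-1)
    for p, i in zip(*torch.topk(probs, 5)):    # top 5 candidate tokens
        print(f"{tok.decode(i.item())!r:>10}  {p.item():.3f}")

Seeing that the model is literally just a probability distribution over next tokens, one you can print and poke at, does more for intuition than most explainers.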

A fun weekend exercise that anyone can do is to implement speculative decoding[0] using local LLMs. You'll learn a lot more about how LLMs work than from reading every blog or Twitter stream mentioned there.

0. https://research.google/blog/looking-back-at-speculative-dec...
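
For a sense of what that weekend exercise involves, here is a minimal sketch of speculative decoding, again assuming Hugging Face transformers, with GPT-2 as the target model and DistilGPT-2 as the draft (my model choices for illustration; they share a tokenizer). The small model drafts k tokens cheaply; the large model verifies all of them in one forward pass, accepting each with probability min(1, q/p) and resampling from the residual distribution on the first rejection:

    # A minimal sketch of speculative decoding: a cheap draft model
    # proposes tokens, the expensive target model verifies them in a
    # single forward pass. Model choices are illustrative assumptions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    target = AutoModelForCausalLM.from_pretrained("gpt2").eval()
    draft = AutoModelForCausalLM.from_pretrained("distilgpt2").eval()

    @torch.no_grad()
    def speculative_step(ids, k=4):
        # 1. Draft: the small model proposes k tokens autoregressively.
        draft_ids, draft_dists = ids, []
        for _ in range(k):
            p = torch.softmax(draft(draft_ids).logits[:, -1], dim=-1)
            draft_dists.append(p)
            draft_ids = torch.cat([draft_ids, torch.multinomial(p, 1)], dim=1)

        # 2. Verify: ONE target forward pass scores every drafted position.
        t_logits = target(draft_ids).logits
        n = ids.shape[1]
        out = ids
        for i in range(k):
            q = torch.softmax(t_logits[:, n - 1 + i], dim=-1)  # target dist
            p = draft_dists[i]                                  # draft dist
            tok_id = draft_ids[:, n + i]
            # Accept the drafted token with prob min(1, q/p); this keeps
            # the output distribution identical to the target model's.
            if torch.rand(1) < (q[0, tok_id] / p[0, tok_id]).clamp(max=1):
                out = torch.cat([out, tok_id.unsqueeze(0)], dim=1)
            else:
                # Rejected: resample from the residual max(q - p, 0),
                # renormalized, and discard the rest of the draft.
                res = torch.clamp(q - p, min=0)
                out = torch.cat([out, torch.multinomial(res / res.sum(), 1)], dim=1)
                break
        else:
            # All k accepted: the target's final distribution yields one
            # extra "bonus" token at no additional cost.
            q = torch.softmax(t_logits[:, -1], dim=-1)
            out = torch.cat([out, torch.multinomial(q, 1)], dim=1)
        return out

    ids = tok("Speculative decoding works because", return_tensors="pt").input_ids
    for _ in range(8):
        ids = speculative_step(ids)
    print(tok.decode(ids[0]))

Getting the accept/reject bookkeeping right forces you to understand exactly what a forward pass computes and why the draft model can't change the target's output distribution, which is the point of the exercise.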