
278 points | itzlambda | 1 comment
lsy No.44608975
If you have a decent understanding of how LLMs work (you put in basically every piece of text you can find, get a statistical machine that models text really well, then use contractors to train it to model text in conversational form), then you probably don't need a steady diet of ongoing output from PR people, bloggers, thought leaders, and internet rationalists. That diet seems likely to send you down some millenarian path that isn't helpful.

Despite the feeling that it's a fast-moving field, most of the differences in actual models over the last few years are of degree, not kind, and the majority of ongoing work is in tooling and integrations, which you can keep up with as it becomes useful for your work. Remembering that it's an ungrounded model of text goes a long way toward discerning what kinds of work it's useful for (where verification of output is either straightforward or unnecessary) and what kinds it's not.

crystal_revenge No.44609322
I strongly agree with this sentiment, and found the blog's list of "high signal" accounts to be more a list of self-promoters (it includes some good people I've interacted with a fair bit, but the list is more 'buzz' than insight).

I also have not experienced the post's claim that "Generative AI has been the fastest moving technology I have seen in my lifetime." I can't speak for the author, but I've been in this field from when "SVMs are the new hotness and neural networks are a joke!" through the entire explosion of deep learning and an insane number of DL frameworks in the 20-teens, all within a decade (remember implementing restricted Boltzmann machines and pre-training?). Similarly, I watched webdev go from "don't use JS for anything other than enhancing the UX" to single-page apps being the standard in the same timeframe.

Unless your aim is to be on that list of "high signal" people, it's far better to just keep your head down until you actually need these solutions. As an example, I left webdev work around the time of backbone.js, one of the first attempts at front-end MVC for single-page apps. Then the great React/Angular wars began, and I just ignored them. A decade later I was working with a webdev team and learned React in a few days, very glad I had not stressed about "keeping up" during the period of non-stop change. Another example: just five years ago everyone was trying to learn how to implement LSTMs from scratch, only to have that architecture become essentially obsolete with the rise of transformers.

Multiple times over my career I've learned the lesson that "moving fast" is another way of saying "immature." You would find more success learning about the GLM (or, god forbid, learning to identify survival analysis problems) and all of its still-underappreciated uses for day-to-day problem solving (old does not imply obsolete) than learning the "prompt hack of the week".
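The comment doesn't give a concrete example, but for a sense of what "day-to-day GLM use" might look like: fitting a Poisson regression (a GLM with log link) to count data, e.g. events per user. Below is a minimal IRLS (iteratively reweighted least squares) sketch in numpy, purely for illustration; in practice you would reach for statsmodels' `sm.GLM` or R's `glm()` rather than hand-rolling the fit.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson GLM with log link via IRLS (Fisher scoring).

    X: (n, p) design matrix (include a column of ones for an intercept).
    y: (n,) non-negative counts.
    Returns the (p,) coefficient vector.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta          # linear predictor
        mu = np.exp(eta)        # inverse log link
        # Working response and weights for the log link:
        # z = eta + (y - mu) * d(eta)/d(mu), with d(eta)/d(mu) = 1/mu
        z = eta + (y - mu) / mu
        W = mu                  # Poisson variance = mu; IRLS weight = mu
        XtW = X.T * W           # broadcasts W across columns of X.T
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Illustrative usage on simulated data with known coefficients.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 1.2])
y = rng.poisson(np.exp(X @ true_beta))
beta_hat = fit_poisson_glm(X, y)
```

With n = 2000 the recovered coefficients land close to `true_beta`; the same skeleton covers logistic regression by swapping the link, variance, and working-response lines, which is the whole point of the GLM framing.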

megh-khaire No.44616271
I completely get where you're coming from. There’s a ton of noise in the space right now, and the hype is very real. I think that's mostly because the AI wave reached a broader, non-technical audience pretty quickly. That visibility has created a lot of excitement.

However, this AI wave does feel a bit different. What stands out is the speed of progress in multiple directions. We’ve seen new model architectures, prompting techniques, and agent frameworks. And every time one of those advances, it opens up new possibilities that startups are quick to explore.

I’m with you that chasing every shiny thing isn’t practical or even useful most of the time. But as someone curious about the space, I still find it exciting.