Most active commenters

jimmaswell(5)
soulofmischief(5)
Ygg2(4)
fuzzfactor(3)
mecsred(3)
datavirtue(3)

Popular/hot comments

>>41890621 #
>>41891845 #
>>41894099 #
>>41892532 #
>>41892560 #
>>41892901 #
>>41895172 #

←back to thread

AI engineers claim new algorithm reduces AI power consumption by 95%

(www.tomshardware.com)

1. djoldman ◴[19 Oct 24 19:09 UTC] No.41889903[source]▶

>>41889414 (OP) #

https://arxiv.org/abs/2410.00907

ABSTRACT

Large neural networks spend most computation on floating point tensor multiplications. In this work, we find that a floating point multiplier can be approximated by one integer adder with high precision. We propose the linear-complexity multiplication (L-Mul) algorithm that approximates floating point number multiplication with integer addition operations. The new algorithm costs significantly less computation resource than 8-bit floating point multiplication but achieves higher precision. Compared to 8-bit floating point multiplications, the proposed method achieves higher precision but consumes significantly less bit-level computation. Since multiplying floating point numbers requires substantially higher energy compared to integer addition operations, applying the L-Mul operation in tensor processing hardware can potentially reduce 95% energy cost by elementwise floating point tensor multiplications and 80% energy cost of dot products. We calculated the theoretical error expectation of L-Mul, and evaluated the algorithm on a wide range of textual, visual, and symbolic tasks, including natural language understanding, structural reasoning, mathematics, and commonsense question answering. Our numerical analysis experiments agree with the theoretical error estimation, which indicates that L-Mul with 4-bit mantissa achieves comparable precision as float8 e4m3 multiplications, and L-Mul with 3-bit mantissa outperforms float8 e5m2. Evaluation results on popular benchmarks show that directly applying L-Mul to the attention mechanism is almost lossless. We further show that replacing all floating point multiplications with 3-bit mantissa L-Mul in a transformer model achieves equivalent precision as using float8 e4m3 as accumulation precision in both fine-tuning and inference.

replies(4): >>41890324 #>>41892025 #>>41901112 #>>41921796 #

2. onlyrealcuzzo ◴[19 Oct 24 20:00 UTC] No.41890324[source]▶

>>41889903 (TP) #

Does this mean you can train efficiently without GPUs?

Presumably there will be a lot of interest.

replies(2): >>41890353 #>>41901656 #

3. crazygringo ◴[19 Oct 24 20:05 UTC] No.41890353[source]▶

>>41890324 #

No. But it does potentially mean that either current or future-tweaked GPUs could run a lot more efficiently -- meaning much faster or with much less energy consumption.

You still need the GPU parallelism though.

replies(2): >>41890621 #>>41893598 #

4. fuzzfactor ◴[19 Oct 24 20:40 UTC] No.41890621{3}[source]▶

>>41890353 #

I had a feeling it had to be something like massive waste due to a misguided feature of the algorithms that shouldn't have been there in the first place.

Once the "math is done" quite likely it would have paid off better than most investments for the top people to have spent a few short years working with grossly underpowered hardware until they could come up with amazing results there before scaling up. Rather than grossly overpowered hardware before there was even deep understanding of the underlying processes.

When you think about it, what we have seen from the latest ultra-high-powered "thinking" machines is truly so impressive. But if you are trying to fool somebody into believing that it's a real person it's still not "quite" there.

Maybe a good benchmark would be to take a regular PC, and without reliance on AI just pull out all the stops and put all the effort into fakery itself. No holds barred, any trick you can think of. See what the electronics is capable of this way. There are some smart engineers, this would only take a few years but looks like it would have been a lot more affordable.

Then with the same hardware if an AI alternative is not as convincing, something has got to be wrong.

It's good to find out this type of thing before you go overboard.

Regardless of speed or power, I never could have gotten an 8-bit computer to match the output of a 32-bit floating-point algorithm by using floating-point myself. Integers all the way and place the decimal where it's supposed to be when you're done.

Once it's really figured out, how do you think it would feel being the one paying the electric bills up until now?

replies(5): >>41890824 #>>41891053 #>>41892039 #>>41892366 #>>41895079 #

5. jimmaswell ◴[19 Oct 24 21:12 UTC] No.41890824{4}[source]▶

>>41890621 #

Faster progress was absolutely worth it. Spending years agonizing over theory to save a bit of electric would have been a massive disservice to the world.

replies(2): >>41890834 #>>41891732 #

6. BolexNOLA ◴[19 Oct 24 21:15 UTC] No.41890834{5}[source]▶

>>41890824 #

“A bit”?

replies(1): >>41891112 #

7. pcl ◴[19 Oct 24 21:54 UTC] No.41891053{4}[source]▶

>>41890621 #

Isn’t this paper pretty much about spending a few short years to improve the performance? Or are you arguing that the same people who made breakthroughs over the last few years should have also done the optimization work?

replies(1): >>41891328 #

8. bartread ◴[19 Oct 24 22:02 UTC] No.41891112{6}[source]▶

>>41890834 #

Yes, a large amount for - in the grand scheme of things - a short period of time (i.e., a quantity of energy usage in an intense spike that will be dwarfed by energy usage over time) can accurately be described as “a bit”.

Of course, the impact is that AI will continue to become cheaper to use, and induced demand will continue the feedback loop driving the market as a result.

9. fuzzfactor ◴[19 Oct 24 22:36 UTC] No.41891328{5}[source]▶

>>41891053 #

>the same people who made breakthroughs over the last few years should have also done the optimization work

I never thought it would be ideal if it was otherwise, so I guess so.

When I first considered neural nets from state-of-the art vendors to assist with some non-linguistic situations over 30 years ago, it wasn't quite ready for prime time and I could accept that.

I just don't have generic situations all the time which would benefit me, so it's clearly my problems that have the deficiencies ;\

What's being done now with all the resources being thrown at it is highly impressive, and gaining all the time, no doubt about it. It's nice to know there are people that can afford it.

I truly look forward to more progress, and this may be the previously unreached milestone I have been detecting that might be a big one.

Still not good enough for what I need yet so far though. And I can accept that as easily as ever.

That's why I put up my estimation that not all of those 30+ years has been spent without agonizing over something ;)

10. rossjudson ◴[19 Oct 24 23:48 UTC] No.41891732{5}[source]▶

>>41890824 #

You're sort of presuming that LLMs are going to be a massive service to the world there, aren't you? I think the jury is still out on that one.

replies(1): >>41891845 #

11. jimmaswell ◴[20 Oct 24 00:08 UTC] No.41891845{6}[source]▶

>>41891732 #

They already have been. Even just in programming, even just Copilot has been a life changing productivity booster.

replies(5): >>41891944 #>>41892396 #>>41892532 #>>41894732 #>>41900955 #

12. wruza ◴[20 Oct 24 00:32 UTC] No.41891944{7}[source]▶

>>41891845 #

Are you sure it’s a life changing productivity booster? Sometimes I look at my projects and wonder how would I explain it to an LLM what this code should have done if it didn’t exist yet. Must be a shitton of boilerplate programming for copilot to be a life-changing experience.

replies(1): >>41892560 #

13. etcd ◴[20 Oct 24 00:52 UTC] No.41892025[source]▶

>>41889903 (TP) #

I feel like I have seen this idea a few times but don't recall where but stuff posted via HN.

Here https://news.ycombinator.com/item?id=41784591 but even before that. It is possibly one of those obvious ideas to people steeped in this.

To me intuitively using floats to make ultimatelty boolean like decisions seems wasteful but that seemed like the way it had to be to have diffetentiable algorithms.

14. Scene_Cast2 ◴[20 Oct 24 00:55 UTC] No.41892039{4}[source]▶

>>41890621 #

This is a bit like recommending to skip vacuum tubes, think hard and invent transistors.

replies(1): >>41892316 #

15. fuzzfactor ◴[20 Oct 24 01:58 UTC] No.41892316{5}[source]▶

>>41892039 #

This is kind of thought-provoking.

That is a good correlation when you think about how much more energy-efficient transistors are than vacuum tubes.

Vacuum tube computers were a thing for a while, but it was more out of desperation than systematic intellectual progress.

OTOH you could look at the present accomplishments like it was throwing more vacuum tubes at a problem that can not be adequately addressed that way.

What turned out to be a solid-state solution was a completely different approach from the ground up.

To the extent a more power-saving technique using the same hardware is only a matter of different software approaches, that would be something that realistically could have been accomplished before so much energy was expended.

Even though I've always thought application-specific circuits would be what really helps ML and AI a lot, and that would end up not being the exact same hardware at all.

If power is truly being wasted enough to start rearing its ugly head, somebody should be able to figure out how to fix it before it gets out-of-hand.

Ironically enough with my experience using vacuum tubes, I've felt that there were some serious losses in technology when the research momentum involved was so rapidly abandoned in favor of "solid-state everything" at any cost.

Maybe it is a good idea to abandon the energy-intensive approaches, as soon as anything completely different that's the least bit promising can barely be seen by a gifted visionary to have a glimmer of potential.

16. michaelmrose ◴[20 Oct 24 02:08 UTC] No.41892366{4}[source]▶

>>41890621 #

This comment lives in a fictional world where there is a singular group that could have collectively acted counterfactually. In the real world any actor that individually went this route would have gone bankrupt while the others collected money by showing actual results even if ineffeciently earned.

replies(1): >>41892732 #

17. giraffe_lady ◴[20 Oct 24 02:16 UTC] No.41892396{7}[source]▶

>>41891845 #

"Even just in programming" the jury is still out. None of my coworkers using these are noticeably more productive than the ones who don't. Outside of programming no one gives a shit except scammers and hype chasers.

replies(1): >>41894449 #

18. recursive ◴[20 Oct 24 02:54 UTC] No.41892532{7}[source]▶

>>41891845 #

I've been using copilot for several months. If I could figure out a way to measure its impact on my productivity, I'd probably see a single digit percentage boost in "productivity". This is not life-changing for me. And for some tasks, it's actually worse than nothing. As in, I spend time feeding it a task, and it just completely fails to do anything useful.

replies(3): >>41892918 #>>41895485 #>>41903092 #

19. AYBABTME ◴[20 Oct 24 03:02 UTC] No.41892560{8}[source]▶

>>41891944 #

You haven't used them enough. Everytime an LLM reduces my search from 1min to 5s, the LLM pays.

Just summary features: save me 20min of reading a transcript, turn it into 20s. That's a huge enabler.

replies(3): >>41892901 #>>41893233 #>>41894036 #

20. newyankee ◴[20 Oct 24 03:54 UTC] No.41892732{5}[source]▶

>>41892366 #

Also it is likely that the rise of LLMs gave many researchers in allied fields the impetus to tackle with the problems that are relevant to making it more efficient and people stumbled upon a solution hiding there.

The momentum with LLMs and allied technology may last till it keeps on improving even by a few percentage points and keeps shattering human created new benchmarks every few months

21. mecsred ◴[20 Oct 24 04:39 UTC] No.41892901{9}[source]▶

>>41892560 #

If 20 mins of informations can legitimately be condensed into 20 seconds, it sounds like the original wasn't worth reading in the first place. Could have skipped the llm entirely.

replies(3): >>41893262 #>>41895287 #>>41895941 #

22. jimmaswell ◴[20 Oct 24 04:43 UTC] No.41892918{8}[source]▶

>>41892532 #

I've been using it for over a year I think. I don't often feed it tasks with comments so much as go about things the same as usual and let it autocomplete. The time and cognitive load saved adds up massively. I've had to go without it for a bit while my workplace gets its license in order for the corporate version and the personal version has an issue with the proxy, and it's been agonizing going without it again. I almost forgot how much it sucks having to jump to google every other minute, and it was easy to start to take for granted how much context copilot was letting me not have to hold onto in my head. It really lets me work on the problem as opposed to being mired in immaterial details. It feels like I'm at least 2x slower overall without it.

replies(2): >>41893474 #>>41893539 #

23. wruza ◴[20 Oct 24 06:02 UTC] No.41893233{9}[source]▶

>>41892560 #

Overviews aren’t code though. In code, for me, they don’t pass 80/20 tests well enough, sometimes even on simple cases. (You get 50-80% of an existing function/block with some important context prepended and a comment, let it write the rest and check if it succeeds). It doesn’t mean that LLMs are useless. Or that I am antillamist or a denier - I’m actually an enthusiast. But this specific claim I hear often and don’t find true. Maybe true for repetitive code in boring environments where typing and remembering formats/params over and over is the main issue. Not in actual code.

If I paste the actual non-trivial code, it starts deviating fast. And it isn’t too complex, it’s just less like “parallel sort two arrays” and more like “wait for an image on a screenshot by execing scrot (with no sound) repeatedly and passing the result to this detect-cv2.py script and use all matching options described in this ts type, get stdout json as in this ts type, and if there’s a match, wait for the specified anim timeout and test again to get the settled match coords after an animation finishes; throw after a total timeout”. Not a rocket science, pretty dumb shit, but right there they fall flat and start imagining things, heavily.

I guess it shines if you ask it to make an html form, but I couldn’t call that life-changing unless I had to make these damn forms all day.

24. bostik ◴[20 Oct 24 06:09 UTC] No.41893262{10}[source]▶

>>41892901 #

I upvoted you, because I think you have a valid point. The tone is unnecessarily aggressive though.

Effective and information-dense communication is really hard. That doesn't mean we should just accept the useless fluff surrounding the actual information and/or analysis. People could learn a lot from the Ignoble Prize ceremony's 24/7 presentation model.

Sadly, it seems we are heading towards a future where you may need an LLM to distill the relevant information out of a sea of noise.

replies(1): >>41895758 #

25. rockskon ◴[20 Oct 24 06:58 UTC] No.41893474{9}[source]▶

>>41892918 #

I don't know about you but LLMs spit out garbage nonsense frequent enough that I can't trust their output in any context I cannot personally verify the validity of.

replies(1): >>41901813 #

26. atq2119 ◴[20 Oct 24 07:14 UTC] No.41893539{9}[source]▶

>>41892918 #

> I almost forgot how much it sucks having to jump to google every other minute

Even allowing for some hyperbole, your programming experience is extremely different from mine. Looking anything up outside the IDE, let alone via Google, is by far the exception for me rather than the rule.

I've long suspected that this kind of difference explains a lot of the difference in how Copilot is perceived.

replies(1): >>41894099 #

27. 3abiton ◴[20 Oct 24 07:23 UTC] No.41893598{3}[source]▶

>>41890353 #

This is still amazing work, imagine running chungus models on a single 3090.

replies(1): >>41907214 #

28. andrei_says_ ◴[20 Oct 24 08:58 UTC] No.41894036{9}[source]▶

>>41892560 #

My experience with overviews is that they are often subtly or not so subtly inaccurate. LLMs not understanding meaning or intent carries risk of misrepresentation.

29. namaria ◴[20 Oct 24 09:13 UTC] No.41894099{10}[source]▶

>>41893539 #

Claiming LLMs are a massive boost for coding productivity is becoming a red flag that the claimant has a tenuous grasp on the skills necessary. Yeah if you have to look up everything all the time and you can't tell the AI slop isn't very good, you can put out code quite fast.

replies(5): >>41894349 #>>41895172 #>>41898052 #>>41900076 #>>41903110 #

30. soulofmischief ◴[20 Oct 24 10:16 UTC] No.41894349{11}[source]▶

>>41894099 #

Comments like this are a great example of the Dunning-Kruger effect. Your comment is actually an indication that you don't have the mastery required to get useful, productive output from a high quality LLM.

Maybe you don't push your boundaries as an engineer and thus rarely need to know new things or at least learn new API surfaces. Maybe you don't know how to effectively prompt an LLM. Maybe you lack the mastery to analyze and refine the results. Maybe you just like doing things the slow way. I too remember a time as an early programmer where I eschewed even Intellisense and basic auto complete...

I'd recommend learning a bit more and practicing some humility and curiosity before condemning an entire class of engineers just because you don't understand their workflow. Just because you've had subpar experiences with a new tool doesn't mean it's not a useful tool in another engineer's toolkit.

replies(1): >>41894397 #

31. namaria ◴[20 Oct 24 10:26 UTC] No.41894397{12}[source]▶

>>41894349 #

Funny you should make claims about my skills when you have exactly zero data about my abilities or performance.

Evaluating my skills based on how I evaluated someone else's skills when they tell me about their abilities with and without a crutch, and throwing big academic sounding expressions with 'effect' in them might be intimidating to some but to me it just transparently sounds pretentious and way off mark, since, like I said, you have zero data about my abilities or output.

> I'd recommend learning a bit more and practicing some humility and curiosity before condemning an entire class of engineers

You're clearly coming from an emotional place because you feel slighted. There is no 'class of engineers' in my evaluation. I recommend reading comments more closely, thinking about their content, and not getting offended when someone points out signs of lacking skills, because you might just be advertising your own limitations.

replies(1): >>41894604 #

32. JonChesterfield ◴[20 Oct 24 10:37 UTC] No.41894449{8}[source]▶

>>41892396 #

The people writing articles for journals that aggregate and approximate other sources are in mortal terror of LLMs. Likewise graphic designers and anyone working in (human language) translation.

I don't fear that LLMs are going to take my job as a developer. I'm pretty sure they mark a further decrease in the quality and coherence of software, along with a rapid increase in the quantity of code out there, and that seems likely to provide me with reliable employment forever. I'm basically employed in fixing bugs that didn't need to exist in the first place and that seems to cover a lot of software dev.

replies(1): >>41898223 #

33. soulofmischief ◴[20 Oct 24 11:18 UTC] No.41894604{13}[source]▶

>>41894397 #

> Funny you should make claims about my skills when you have exactly zero data about my abilities or performance.

Didn't you just do that to an entire class of engineers:

> Claiming LLMs are a massive boost for coding productivity is becoming a red flag that the claimant has a tenuous grasp on the skills necessary

Anyway,

> Evaluating my skills based on how I evaluated someone else's skills when they tell me about their abilities with and without a crutch

Your argument rests on the assumption that LLMs are a "crutch", and you're going to have to prove that before the rest of your argument holds any water.

It sucks getting generalized, doesn't it? Feels ostracizing? That's the exact experience someone who productively and effectively uses LLMs will have upon encountering your premature judgement.

> You're clearly coming from an emotional place because you feel slighted.

You start off your post upset that I'm "making claims" about your skills (I used the word "maybe" intentionally, multiple times), and then turn around and make a pretty intense claim about me. I'm not "clearly" coming from an emotional place, you did not "trigger" me, I took a moment to educate you about being overly judgemental before fully understanding something, and pointed out the inherent hypocrisy.

> you might just be advertising your own limitations

But apparently my approach was ineffective, and you are still perceiving a world where people who approach their work differently than you are inferior. Your toxic attitude is unproductive, and while you're busy imagining yourself as some masterful engineer, people are out there getting massive productivity boosts with careful application of cutting-edge generative technologies. LLMs have been nothing short of transcendental to a curious but skilled mind.

replies(1): >>41894901 #

34. haakonhr ◴[20 Oct 24 11:49 UTC] No.41894732{7}[source]▶

>>41891845 #

And here you're assuming that making software engineers more productive would be a service to the world. I think the jury is out on that one as well. At least for the majority of software engineering since 2010.

35. Ygg2 ◴[20 Oct 24 12:27 UTC] No.41894901{14}[source]▶

>>41894604 #

> Didn't you just do that to an entire class of engineers

Not really. He said "if you claim LLM's are next thing since sliced butter I am doubting your abilities". Which is fair. It's not really a class as much as a group.

I've never been wowed over by LLMs. At best they are boilerplate enhancers. At worst they write plausibly looking bullshit that compiles but breaks everything. Give it something truly novel and/or fringe and it will fold like a deck of cards.

Even latest research called LLM's benefits into question: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566

That said. They are fine at generating commit messages and docs than me.

replies(1): >>41894947 #

36. soulofmischief ◴[20 Oct 24 12:35 UTC] No.41894947{15}[source]▶

>>41894901 #

> Not really. He said "if you claim LLM's are next thing since sliced butter I am doubting your abilities". Which is fair.

No, OP said:

> Claiming LLMs are a massive boost for coding productivity is becoming a red flag that the claimant has a tenuous grasp on the skills necessary

Quotation marks are usually reserved for direct quotes, not paraphrases or straw mans.

> I've never been wowed over by LLMs.

Cool. I have, and many others have. I'm unsure why your experience justifies invalidating the experiences of others or supporting prejudice against people who have made good use of them.

> Give it something truly novel and/or fringe and it will fold like a deck of cards.

Few thoughts are truly novel, most are derivative or synergistic. Cutting edge LLMs, when paired with a capable human, are absolutely capable of productive work. I have long, highly technical and cross-cutting discussions with GPT 4o which I simply could not have with any human that I know. Humans like that exist, but I don't know them and so I'm making due with a very good approximation.

Your and OP's lack of imagination at the capabilities of LLMs are more telling than you realize to those intimate with them, which is what makes this all quite ironic given that it started from OP making claims about how people who say LLMs massively boost productivity are giving tells that they're not skilled enough.

replies(1): >>41896844 #

37. VagabundoP ◴[20 Oct 24 12:59 UTC] No.41895079{4}[source]▶

>>41890621 #

That's just not how progress works.

Its iteritive, there are plenty of cul-de-sacs and failures. You can't really optimise until you have something that works and its a messy process that is inefficient.

You're looking at this with hindsight.

38. williamcotton ◴[20 Oct 24 13:15 UTC] No.41895172{11}[source]▶

>>41894099 #

I know plenty of fantastic engineers that use LLM tools as code assistants.

I’m not sure when and why reading documentation and man pages became a sign of a lack of skill. Watch a presentation by someone like Brian Kernighan and you’ll see him joke about looking up certain compiler flags for the thousandth time!

Personally I work in C, C#, F#, Java, Kotlin, Swift, R, Ruby, Python, Postgres SQL, MySQL SQL, TypeScript, node, and whatever hundreds of libraries and DSLs are built on top. Yes, I have to look up documentation and with regularity.

replies(3): >>41896407 #>>41897682 #>>41897867 #

39. AYBABTME ◴[20 Oct 24 13:36 UTC] No.41895287{10}[source]▶

>>41892901 #

Think of the summary of a zoom call. Or of a chapter that you're not sure if you care to read or not.

Not all content is worth consuming, and not all content is dense.

replies(1): >>41899027 #

40. framapotari ◴[20 Oct 24 14:10 UTC] No.41895485{8}[source]▶

>>41892532 #

If you're already a competent developer, I think that's a reasonable expectation of impact on productivity. I think the 'life-changing' part comes in helping someone get to the point of building things with code where before they couldn't (or believed they couldn't). It does a lot better job of turning the enthusiasts and code-curious into amateurs vs. empowering professionals.

replies(1): >>41898903 #

41. mecsred ◴[20 Oct 24 14:58 UTC] No.41895758{11}[source]▶

>>41893262 #

Didn't intend for it to be aggressive, just concise. Spare me from the llm please :)

42. crazygringo ◴[20 Oct 24 15:29 UTC] No.41895941{10}[source]▶

>>41892901 #

> it sounds like the original wasn't worth reading in the first place

But if that's the only place that contained the information you needed, then you have no choice.

There's a lot of material out there that is badly written, badly organized, badly presented. LLM's can be a godsend for extracting the information you actually need without wasting 20 minutes wading through the muck.

replies(1): >>41904244 #

43. FpUser ◴[20 Oct 24 16:19 UTC] No.41896407{12}[source]▶

>>41895172 #

Same opinion here. I work with way too many things to keep everything in my head. I'd rather use my head for design than to remember every function and parameter of say STL

44. Ygg2 ◴[20 Oct 24 17:08 UTC] No.41896844{16}[source]▶

>>41894947 #

> Quotation marks are usually reserved for direct quotes,

Not on HN. Customary is to use > paragraph quotes like you did. However I will keep that in mind.

> Cool. I have, and many others have. I'm unsure why your experience justifies invalidating the experiences of others

If we're both grading a single student (LLM) in same field (programming), and you find it great and I find it disappointing, it means one of us is scoring it wrong.

I gave papers that demonstrate its failings, where is your counter-proof?

> Your and OP's lack of imagination at the capabilities of LLMs

It's not lack of imagination. It's terribleness of results. It can't consistently write good doc comments. I does not understand the code nor it's purpose, but roughly guesses the shape. Which is fine for writing something that's not as formal as code.

It can't read and understand specifications, and even generate something as simple as useful API for it. The novel part doesn't have to be that novel just something out of its learned corpus.

Like Yaml parser in Rust. Maybe Zig or something beyond it's gobbled data repo.

> Few thoughts are truly novel, most are derivative or synergistic.

Sure but you still need A mind to derive/synergize the noise of everyday environment into something novel.

It can't even do that but remix data into plausibly looking forms. A stochastic parrot. Great for DnD campaign. Shit for code.

replies(1): >>41897229 #

45. soulofmischief ◴[20 Oct 24 18:03 UTC] No.41897229{17}[source]▶

>>41896844 #

> Not on HN. Customary is to use > paragraph quotes like you did. However I will keep that in mind.

Hacker News is not some strange place where the normal rules of discourse don't apply. I assume you are familiar with the function of quotation marks.

> If we're both grading a single student (LLM) in same field (programming), and you find it great and I find it disappointing, it means one of us is scoring it wrong.

No, it means we have different criteria and general capability for evaluating the LLM. There are plenty of standard criteria which LLMs are pitted against, and we have seen continued improvement since their inception.

> It can't consistently write good doc comments. I does not understand the code nor it's purpose, but roughly guesses the shape.

Writing good documentation is certainly a challenging task. Experience has led me to understand where current LLMs typically do and don't succeed with writing tests and documentation. Generally, the more organized and straightforward the code, the better. The smaller each module is, the higher the likelihood of a good first pass. And then you can fix deficiencies in a second, manual pass. If done right, it's generally faster than not making use of LLMs for typical workflows. Accuracy also goes down for more niche subject material. All tools have limitations, and understanding them is crucial to using them effectively.

> It can't read and understand specifications, and even generate something as simple as useful API for it.

Actually, I do this all the time and it works great. Keep practicing!

In general, the stochastic parrot argument is oft-repeated but fails to recognize the general capabilities of machine learning. We're not talking about basic Markov chains, here. There are literally academic benchmarks against which transformers have blown away all initial expectations, and they continue to incrementally improve. Getting caught up criticizing the crudeness of a new, revolutionary tool is definitely my idea of unimaginative.

replies(1): >>41898249 #

46. specialist ◴[20 Oct 24 19:01 UTC] No.41897682{12}[source]▶

>>41895172 #

For me, thus far, LLMs help me forage docs. I know what I want and it helps me narrow my search faster. Watching adepts like Simon Willison wield LLMs is on my to do list.

47. fragmede ◴[20 Oct 24 19:29 UTC] No.41897867{12}[source]▶

>>41895172 #

Add Golang and rust and JavaScript and next.js and react to the list for me. ;) If you live and work and breathe in the same kernel, operating system, and user space, and don't end up memorizing the various bits of minutiae, I'd judge you (and me) too, but it's not the 2000's, or the 90's or even the 80's anymore, and some of us don't have the luxury, or have chosen not to, live in one small niche for our entire career. At the end of the day, the client doesn't care what language you use, or the framework, or even the code quality, as long as it works. What they don't want to pay for is overage, and taking the previous developer's work and refactoring it and rewriting it in your preferred language isn't high value work, so you pick up whatever they used and run with it. Yeah that makes me less fluent in that one particular thing, not having done the same thing for 20+ years, but that's not where I deliver value. Some people do, and that's great for them and their employers, but my expertise lies elsewhere. I got real good at MFC, back in the day, and then WX and Qt and I'm working on getting good at react and such.

48. jimmaswell ◴[20 Oct 24 19:57 UTC] No.41898052{11}[source]▶

>>41894099 #

At the risk of sounding like an inflated ego: I'm very good at what I do, the rest of my team frequently looks to me for guidance, my boss and boss's boss etc. have repeatedly said I'm among the most valuable people around, and I'm the one turned to in emergencies, for difficult architectural decisions, and to lead projects. I conceptually understand the ecosystem I work in very well at every layer.

What I'm not good at is memorizing API's and libraries that all use different verbs and nouns for the same thing, and other such things that are immaterial to the actual work. How do you use a mutation observer again? Hell if I remember the syntax but I know the concept, and copilot will probably spit out what I want, and I'll easily verify the output. Or how do you copy an array in JS? Or print a stack trace? Or do a node walk? You can either wade through google and stackoverflow, or copilot can tell you instantly. And I can very quickly tell if the code copilot gave me is sensible or not.

49. giraffe_lady ◴[20 Oct 24 20:26 UTC] No.41898223{9}[source]▶

>>41894449 #

They're not scared of LLMs because of anything about LLMs. It's just that everyone with power is publicly horny to delete the remaining middle class jobs and are happy to use LLMs as a justification whether it can functionally replace those workers or not. So it's not that everyone has evaluated chatgpt and cannily realized it can do their job, they're just reading the room.

50. Ygg2 ◴[20 Oct 24 20:33 UTC] No.41898249{18}[source]▶

>>41897229 #

> Hacker News is not some strange place where the normal rules of discourse don't apply. I assume you are familiar with the function of quotation marks.

Language is all about context. I wasn't trying to be deceitful. And on HN I've never seen anyone using quotation marks to quote people.

> Writing good documentation is certainly a challenging task.

Doctests isn't same as writing documentation. Doctest are the simplest form of documentation. Given function named so and so write API doc + example. It could not even write example that passed syntax check.

> Actually, I do this all the time and it works great. Keep practicing!

Then you haven't given it interesting/complex enough problems.

Also this isn't about practice. It's about its capabilities.

> In general, the stochastic parrot argument is oft-repeated but fails to recognize the general capabilities of machine learning.

I gave it write YAML parser given Yaml org spec, and it wrote following struct:

   enum Yaml {
      Scalar(String),
      List(Vec<Box<Yaml>>),
      Map(HashMap<String, Box<Yaml>>),
   }

This is the stochastic parrot in action. Why? Because it tried to pass of JSON like structure as YAML.

Whatever LLM's are they aren't intelligent. Or they have attention spans of a fruit fly and can't figure out basic differences.

replies(2): >>41898825 #>>41899658 #

51. williamcotton ◴[20 Oct 24 21:58 UTC] No.41898825{19}[source]▶

>>41898249 #

That’s not a good prompt, my friend!

52. jimmaswell ◴[20 Oct 24 22:10 UTC] No.41898903{9}[source]▶

>>41895485 #

> turning the enthusiasts and code-curious into amateurs vs. empowering professionals.

I'm firmly in #2. My other comment goes over how.

I'm intrigued to see how devs in #1 grow. One might be wary those devs would grow into bad habits and not thinking for themselves, but it might be a case of the ancient Greek rant against written books hindering memorization. Could be that they'll actually grow to be even better devs unburdened by time wasted on trivial details.

53. postalrat ◴[20 Oct 24 22:30 UTC] No.41899027{11}[source]▶

>>41895287 #

If I had a recording of the zoom call I could generate a summary on demand with better tools than were available at the time the zoom call was made.

54. soulofmischief ◴[21 Oct 24 00:28 UTC] No.41899658{19}[source]▶

>>41898249 #

> Language is all about context. I wasn't trying to be deceitful. And on HN I've never seen anyone using quotation marks to quote people.

It's still unclear how this apparent lack of knowledge of basic writing mechanics would justify your use of quotation marks to attempt a straw man argument wherein you deliberately attempted to convince me that OP said something completely different.

> Doctests isn't same as writing documentation. Doctest are the simplest form of documentation. Given function named so and so write API doc + example. It could not even write example that passed syntax check.

That truly sounds like a skill issue. This no-true-Scotsman angle is silly. I said documentation and tests, I don't know how you got "doctests" out of that. I said "documentation", and "tests". I didn't say "the simplest form of documentation", that is another straw man on your behalf.

> Then you haven't given it interesting/complex enough problems.

Wow, the arrogance. There is absolutely nothing to justify this assumption. It's exceedingly likely that you yourself aren't capable of interacting meaningfully with LLMs for one reason or another, not that I haven't considered interesting or complex problems. I bring some extraordinarily difficult cross-domain problems to these tools and end up satisfied with the results far more often than not.

My argument is literally that cutting-edge LLMs excel with complex problems, and they do in many cases in the right hands. It's unfortunate if you can't find these problems "interesting" enough, but that hasn't stopped me from getting good enough results to justify using an LLM during research and development.

> Also this isn't about practice. It's about its capabilities.

Unfortunately, this discourse has made it clear that you do need considerable practice, because you seem to get bad results, and you're more interested in defending those bad results even if it means insulting others, instead of just considering that you might not quite be skilled enough.

> This is the stochastic parrot in action. Why? Because it tried to pass of JSON like structure as YAML.

That proves its stochasticity, but it doesn't prove it is a "stochastic parrot". As long as you lack the capability to realistically assess these models, it's no wonder that you've had such bad experiences. You didn't even bother clarifying which LLM you used, nor did you mention any parameters of your experiment or even if you attempted multiple trials with different LLMs or prompts. You failed to follow the scientific method and so it's no surprise that you got subpar results.

> Whatever LLM's are they aren't intelligent.

You have demonstrated throughout this discussion that you aren't capable of assessing machine intelligence. If you learned how to be more open-minded and took the time to learn more about these new technologies, instead of complaining about contemporary shortcomings and bashing those who do benefit from the technologies, it would likely open many doors for you.

replies(1): >>41902066 #

55. h_tbob ◴[21 Oct 24 01:57 UTC] No.41900076{11}[source]▶

>>41894099 #

Hey, we were all beginners once!

On another note, even if you are experienced it helps when doing new stuff and you don’t know the proper syntax for what you want. For example let’s say your using flutter, you can just type

// bold

And it will help put the proper bold stuff in there.

56. vrighter ◴[21 Oct 24 05:35 UTC] No.41900955{7}[source]▶

>>41891845 #

actually, studies seem to show it makes code worse. Just like llms can confidently spout junk, devs using llms confidently check in more bugs.

57. mvkel ◴[21 Oct 24 06:06 UTC] No.41901112[source]▶

>>41889903 (TP) #

Is this effectively quantizing without actually quantizing?

58. pnt12 ◴[21 Oct 24 07:40 UTC] No.41901656[source]▶

>>41890324 #

The GPU main advantage is its parallelism - thousands of cores compared handful of cores in CPUs.

If you're training models with billions of parameters, you're still gonna need that.

59. ab5tract ◴[21 Oct 24 08:10 UTC] No.41901813{10}[source]▶

>>41893474 #

Same here. The Cody autocomplete is so off base all the time that it’s deactivated.

I serve Cody a direct question about 1-3 times a week. Of that, it gets maybe 50% correct on the first try. I don’t bother with a second try because by then I’ve already spent the equivalent amount of time looking at the relevant library source code and/or docs would have taken.

60. Ygg2 ◴[21 Oct 24 09:00 UTC] No.41902066{20}[source]▶

>>41899658 #

> That truly sounds like a skill issue. This no-true-Scotsman angle is silly. I said documentation and tests, I don't know how you got "doctests" out of that. I said "documentation", and "tests". I didn't say "the simplest form of documentation", that is another straw man on your behalf.

What are you on about? Doctest is the simplest form of documentation and test. I.e. you don't have to write an in-depth test, you just need to understand what the function does. I expect even juniors can write a doctest that passes the compiler check. Not a good, not a passing one doctest, a COMPILING one. It's rate of writing a passing one was even worse.

> Wow, the arrogance. There is absolutely nothing to justify this assumption.

Ok, then. Prove what exactly hard problems did you give it?

I gave my examples, I noticed it fails at complex tasks like YAML parser in an unknown language.

I noticed when confronted with anything harder than writing pure boilerplate, it fails. E.g. it would fail 10% of the time.

> Unfortunately, this discourse has made it clear that you do need considerable practice

You can practice with a stochastic parrot all you want, it won't make it an Einstein. Programming is all about converting requirements to math, and LLMs aren't good at it. Do I need to link stuff like doing basic calculation and counting 'r' in the word 'strawberries'.

The best you can do is half the error rate, but that follows a power law. You need to double the energy to half the error rate. So unless you intend to boil the surface of the Earth to get it to be decent at programming, I don't think it's going to change anytime soon.

> You have demonstrated throughout this discussion that you aren't capable of assessing machine intelligence.

Pure ad hominem. You've demonstrated nothing outside your ""Trust me bro, it's not a bubble"" and ""You're wrong"". I'm using double double quotes so you don't assume I'm quoting you.

> You didn't even bother clarifying which LLM you used.

For YAML parser, I used Chat GPT-4o at my friend's place. For the rest of the tasks I used JetBrains AI assistant, which is a mix of Chat GPT-4, GPT-4o and GPT-3.

replies(1): >>42017113 #

61. datavirtue ◴[21 Oct 24 11:46 UTC] No.41903092{8}[source]▶

>>41892532 #

If you are in maintenance mode your visits to Copilot will be rare. If you are building greenfield, use goes through the roof. All those test cases, nevermind all the POC and framework scaffolding and other boilerplate that is now completely unacceptable as a use of developer time.

replies(1): >>41905734 #

62. datavirtue ◴[21 Oct 24 11:49 UTC] No.41903110{11}[source]▶

>>41894099 #

Nope, just want it to write tests and other low value work so I can get shit done. Some of it depends on the stakes of your job. Are you floating along day by day in big corp or are you grinding it out at a startup? Those working at the startup have to use coding assistants, period.

63. mecsred ◴[21 Oct 24 14:00 UTC] No.41904244{11}[source]▶

>>41895941 #

Yeah I can see that use case, I just wouldn't trust an LLM to decide "is this worth reading". May as well flip a coin.

64. recursive ◴[21 Oct 24 16:22 UTC] No.41905734{9}[source]▶

>>41903092 #

I'm building "greenfield". I still use it at least daily, but the benefit just struggles to outweigh the cost of invoking it. Maybe I don't understand how to use it.

replies(1): >>41905945 #

65. datavirtue ◴[21 Oct 24 16:46 UTC] No.41905945{10}[source]▶

>>41905734 #

It really depends on what you are doing and what tech you are using. I use it to teach me or build out ideas quickly or solve for complex issues. Mostly these days I use it as a memory aid or to bounce ideas off it. In my job I have to move quickly and stay focused as I'm driving improvements to a tech stack that reaches across four verticals, each having their own quirks and tech stacks. It's great for jogging my memory and helping to flush out ideas and approaches that I then bounce off the dev teams. Super helpful.

66. Sohcahtoa82 ◴[21 Oct 24 18:57 UTC] No.41907214{4}[source]▶

>>41893598 #

The bottleneck on a consumer-grade GPU like a 3090 isn't the processing power, it's the lack of RAM. The PCI-Express bus ends up being your bottleneck from having to swap in parts of the model.

Even with PCIe 5.0 and 16 lanes, you only get 64 GB/s of bandwidth. If you're trying to run a model too big for your GPU, then for every token, it has to reload the entire model. With a 70B parameter model, 8 bit quantization, you're looking at just under 1 token/sec just from having to transfer parts of the model in constantly. Making the actual computation faster won't make it any faster.

replies(1): >>41907290 #

67. dragonwriter ◴[21 Oct 24 19:04 UTC] No.41907290{5}[source]▶

>>41907214 #

OTOH, doesn't it also mean that (given appropriate software framework support) iGPUs with less processing capacity and slower-but-more RAM available (because system RAM is comparatively cheap and plentiful compared to VRAM) without swapping anything are more competitive against consumer dGPUs with fast-but-small RAM for both inference and training with larger models?

replies(1): >>41908501 #

68. Sohcahtoa82 ◴[21 Oct 24 21:02 UTC] No.41908501{6}[source]▶

>>41907290 #

System memory isn't that fast, either. Even with DDR5-8400, the fastest memory you can get right now, you're only looking at a memory transfer speed of 67.2 GB/s, barely faster than the PCI-E bus. So even if you could store that entire 70B model in RAM, you're still getting just under 1 token/sec, and that's assuming your CPU doesn't become a bottleneck.

Your best bet would likely be a laptop that has integrated system RAM with VRAM, but I don't think any of those offer enough RAM to store an entire 70B model. A 7B parameter model would work fine, but you could do those on a consumer-grade GPU anyways.

replies(1): >>41910031 #

69. DeveloperErrata ◴[22 Oct 24 00:35 UTC] No.41910031{7}[source]▶

>>41908501 #

Macbook Pros with M3 & integrated RAM & VRAM can do 70B models :)

70. yogrish ◴[23 Oct 24 04:20 UTC] No.41921796[source]▶

>>41889903 (TP) #

we used to use Fixed point multiplications (Q Format) in DSP algorithms on different DSP architectures. https://en.wikipedia.org/wiki/Q_(number_format). They used to be so fast and near accurate to floating point multiplications. Probably we need to use those DSPs blocks as part of Tensors/GPUs to realise both fast multiplications & parallelisms.

71. ◴[01 Nov 24 14:18 UTC] No.42017113{21}[source]▶

>>41902066 #

↑