sysmax:
AI can very efficiently apply common patterns to vast amounts of code, but it has no inherent "idea" of what it's doing.

Here's a fresh example that I stumbled upon just a few hours ago. I needed to refactor some code that first computes the size of a popup, and then separately, the top left corner.

For brevity, one part used an "if", while the other one had a "switch":

    if (orientation == Dock.Left || orientation == Dock.Right)
        size = /* horizontal placement */
    else
        size = /* vertical placement */

    var point = orientation switch
    {
        Dock.Left => ...
        Dock.Right => ...
        Dock.Top => ...
        Dock.Bottom => ...
    };
I wanted the LLM to refactor it to store the position rather than applying it immediately. Turns out, it just could not handle different things (if vs. switch) doing a similar thing. I tried several variations of prompts, but it leaned very strongly toward either two ifs or two switches, despite rather explicit instructions not to do so.
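
Something like this is roughly the shape I was after (all type names and numbers below are made up for illustration; only the if/switch asymmetry and the "store, apply later on render" part match the real code):

    // Stand-ins so the sketch is self-contained; the real project uses its own types.
    enum Dock { Left, Right, Top, Bottom }
    record struct Size(double Width, double Height);
    record struct Point(double X, double Y);
    record struct PopupPlacement(Size Size, Point Point);

    class PopupHost
    {
        const double OwnerWidth = 800, OwnerHeight = 600;   // made-up dimensions
        PopupPlacement m_StateStorage;                       // stored here, applied on render

        public void ComputePlacement(Dock orientation)
        {
            // The size computation happened to use an "if"...
            Size size;
            if (orientation == Dock.Left || orientation == Dock.Right)
                size = new Size(250, OwnerHeight);           // horizontal placement
            else
                size = new Size(OwnerWidth, 150);            // vertical placement

            // ...while the top-left corner used a "switch".
            var point = orientation switch
            {
                Dock.Left   => new Point(0, 0),
                Dock.Right  => new Point(OwnerWidth - size.Width, 0),
                Dock.Top    => new Point(0, 0),
                Dock.Bottom => new Point(0, OwnerHeight - size.Height),
                _           => default
            };

            // The refactor: store the placement instead of applying it immediately.
            m_StateStorage = new PopupPlacement(size, point);
        }

        public void OnRender()
        {
            var placement = m_StateStorage;   // read the stored placement here
            // ...apply placement.Size and placement.Point to the popup...
        }
    }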

It sort of makes sense: once the model has "completed" an if, and then encounters the need for a similar thing, it will pick an "if" again, because, well, it is completing the previous tokens.

Harmless here, but in many slightly less trivial examples, it would just steamroll over nuance and produce code that appears good, but fails in weird ways.

That said, splitting tasks into smaller parts devoid of such ambiguities works really well. Way easier to say "store size in m_StateStorage and apply on render" than manually editing 5 different points in the code. Especially with stuff like Cerebras, which can chew through complex code at several kilobytes per second, expanding simple thoughts faster than you could physically type them.

gametorch: [flagged]
sysmax:
I am working on a GUI for delegating coding tasks to LLMs, so I routinely experiment with a bunch of models doing all kinds of things. In this case, Claude Sonnet 3.7 handled it just fine, while Llama-3.3-70B just couldn't get it. But that is literally the simplest example that illustrates the problem.

When I tried giving top-notch LLMs harder tasks (scan an abstract syntax tree coming from a parser in a particular way and generate nodes for particular things), they completely blew it. The output didn't even compile, never mind the logical errors and missed requirements. But once I broke the problem down into making lists of relevant parsing contexts and generating one wrapper class at a time, it saved me a whole ton of work. It took me a day to accomplish what would normally take a week.
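
To give a flavor of the decomposition (the node types below are made up, not the actual parser's): each individual prompt only had to produce one small, mechanical wrapper like this, instead of the whole AST traversal.

    using System.Collections.Generic;
    using System.Linq;

    // Made-up stand-ins for the real parser's node types.
    record ParameterNode(string TypeName, string Name);
    record MethodDeclarationNode(string Identifier, List<ParameterNode> Parameters);

    // The kind of wrapper class each prompt was asked to generate, one at a time.
    class MethodDeclarationWrapper
    {
        readonly MethodDeclarationNode _node;
        public MethodDeclarationWrapper(MethodDeclarationNode node) => _node = node;

        public string Name => _node.Identifier;
        public IReadOnlyList<string> ParameterTypes =>
            _node.Parameters.Select(p => p.TypeName).ToList();
    }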

Maybe they will figure it out eventually, maybe not. The point is, right now the technology has fundamental limitations, and you are better off knowing how to work around them, rather than blindly trusting the black box.

gametorch:
Yeah exactly.

I think it's a combination of

1) wrong level of granularity in prompting

2) lack of engineering experience

3) autistic rigidity regarding a single hallucination throwing the whole experience off

4) subconscious anxiety over the threat to their jerbs

5) unnecessary guilt over going against the tide; anything pro AI gets heavily downvoted on Reddit and is, at best, controversial as hell here

I, for one, have shipped like literally a product per day for the last month and it's amazing. Literally 2,000,000+ impressions, paying users, almost 100 sign ups across the various products. I am fucking flying. Hit the front page of Reddit and HN countless times in the last month.

Idk if I break down the prompts better or what. But this is production grade shit and I don't even remember the last time I wrote more than two consecutive lines of code.

nextlevelwizard:
Can you provide links to these 30 products you have shipped?

I keep hearing how people are so god damn productive with LLMs, but whenever I try to use them, they cannot reliably produce working code. They usually produce something that looks correct at first but doesn't work, either at all or as intended.

Going over your list:

1. If the problem is that I need to be very specific about how I want the LLM to fix the issue, to the point of providing it the solution, why wouldn't I just make the change myself?

2. I don't even know how you can think that not vibe coding means you lack experience

3. Yes. If the model keeps trying to use non-existent language features or completely made-up functions/classes, that is a problem, and it has nothing to do with "autism".

4. This is what all AI maximalists want to think: that the only reason the average software developer isn't knee-deep in the AI swamp with them is that they are luddites who are just scared for their jobs. I personally am not, as I have not seen LLMs actually be useful for anything but replacing Google searches.

5. I don't know why you keep bringing up Reddit so much. I also don't quite get who is going against the tide here: are you going against the tide of the downvotes, or am I, for not using LLMs to "fucking fly"?

> But this is production grade shit

I truly hope it is, because...

> and I don't even remember the last time I wrote more than two consecutive lines of code.

Means if there is a catastrophic error, you probably can't fix it yourself.

gametorch:
> if the problem is that I need to be very specific with how I want LLM to fix the issue, like providing it the solution, why wouldn't I just make the change myself?

I type 105 wpm on a bad day. Try gpt-4.1. It types like 1000 wpm. If you can formally describe your problem in English, and the number of characters in the English prompt is less than in whatever code you would write, gpt-4.1 will make you faster.

Obviously you have to account for gpt-4.1 being wrong sometimes. Even so, if you have to run two or three prompts to get it right, it still is going to be faster.
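
A rough back-of-the-envelope version of that, with made-up but plausible numbers (only the 105 wpm figure is real), and assuming reviewing the output is faster than writing it:

    change to make:   ~60 lines of code   ≈ 450 words if typed by hand
    typing it myself: 450 / 105 wpm       ≈ 4.3 minutes
    one prompt:       ~50 / 105 wpm       ≈ 0.5 minutes
    model output:     450 / ~1000 wpm     ≈ 0.5 minutes
    three attempts:   3 × (0.5 + 0.5)     ≈ 3 minutes, still ahead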

> I don't even know how you can think that not vibe coding means you lack experience

If you lack experience, you're going to prompt the LLM to do the wrong thing and engineer yourself into a corner and waste time. Or you won't catch the mistakes it makes. Only experience and "knowing more than LLM" allows you to catch its mistakes and fix them. (Which is still faster than writing the code yourself, merely by way of it typing 1000 wpm.)

> If the model keeps trying to use non-existent language feature or completely made up functions/classes that is a problem and nothing to do with "autism"

You know that you can tell it those functions are made up and paste it the latest documentation and then it will work, right? That knee-jerk response makes it sound like you have this rigidity problem, yourself.

> I personally am not as I have not seen LLMs actually being useful for anything but replacing google searches.

Nothing really of substance here. Just because you don't know how to use this tool doesn't mean no one does.

This is the least convincing point for me, because I come along and say "Hey! This thing has let me ship far more working code than before!" and then your response is just "I don't know how to use it." I know that it's made me more productive. You can't say anything to deny that. Do you think I have some need to lie about this? Why would I feel the need to go on the internet and reap a bunch of downvotes while peddling some lie that doesn't stand to get me anything even if I convince people of it?

> I also don't quite get who is going against the tide here, are you going against the tide of the downvotes

Yeah, that's what I'm saying. People will actively shame and harass you for using LLMs. It's mind-boggling that a tool, a technology, that works for me and has made me more productive would be so vehemently criticized. That's why I listed these 5 reasons, the only reasons I have thought of so far.

> Means if there is a catastrophic error, you probably can't fix it yourself.

See my point about lacking experience. If you can't do the surgery yourself every once in a while, you're going to hate these tools.

Really, you've just made a bunch of claims about me that I know are false, so I'm left unconvinced.

I'm trying to have a charitable take. I don't find joy in arguing or leaving discussions with a bitter taste. I genuinely don't know why people are so mad at me claiming that a tool has helped me be more productive. They all just don't believe me, ultimately. They all come up with some excuse as to why my personal anecdotes can be dismissed and ignored: "even though you have X, we should feel bad for you because Y!" But it's never anything of substance. Never anything that has convinced me.

Because at the end of the day, I'm shipping faster. My code works. My code has stood the test of time. I know the insults to my engineering ability are demonstrably false. I hope you can see the light one day. These are extraordinary tools that are only getting better, at least by a little bit, in the foreseeable future. Why deny?

rootnod3:
Would also love to see those daily shipped products. What I see on Reddit is the same quiz done several times, just for different categories, and the pixel art generator. That does not look like shipping a product per day, as you claim.

gametorch:
On my main, not gonna dox myself. Being pro AI is clearly a faux pas for personal branding.

Just a few days ago got flamed for only having 62 users on GameTorch. Now up to 91 and more paying subs. Entire thing written by LLMs and hasn't fallen over once. I'd rather be a builder than an armchair critic.

People would rather drag you down in the hole that they're in than climb out.

rootnod3:
Not trying to drag you down; genuinely interested because of the claim.