
358 points andrewstetsenko | 8 comments
sysmax ◴[] No.44360302[source]
AI can very efficiently apply common patterns to vast amounts of code, but it has no inherent "idea" of what it's doing.

Here's a fresh example that I stumbled upon just a few hours ago. I needed to refactor some code that first computes the size of a popup, and then separately, the top left corner.

For brevity, one part used an "if", while the other one had a "switch":

    if (orientation == Dock.Left || orientation == Dock.Right)
        size = /* horizontal placement */
    else
        size = /* vertical placement */

    var point = orientation switch
    {
        Dock.Left => ...,
        Dock.Right => ...,
        Dock.Top => ...,
        Dock.Bottom => ...
    };
I wanted the LLM to refactor it to store the position rather than applying it immediately. It turned out the model just could not handle two different constructs (an "if" vs. a "switch") doing a similar thing. I tried several variations of prompts, but it leaned very strongly toward producing either two ifs or two switches, despite rather explicit instructions not to do so.

It sort of makes sense: once the model has "completed" an if, and then encounters the need for a similar thing, it will pick an "if" again, because, well, it is completing the previous tokens.

Harmless here, but in many slightly less trivial examples, it would just steamroll over nuance and produce code that appears good, but fails in weird ways.

That said, splitting tasks into smaller parts devoid of such ambiguities works really well. Way easier to say "store size in m_StateStorage and apply on render" than manually editing 5 different points in the code. Especially with stuff like Cerebras, that can chew through complex code at several kilobytes per second, expanding simple thoughts faster than you could physically type them.
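
For reference, the shape I actually wanted looked roughly like this (m_StateStorage is from my codebase; the helper names and PopupPlacement are made up for illustration):

```csharp
// Desired refactoring sketch: compute both values up front, stash them,
// and apply them later on render. Helper names are illustrative only.
Size size = (orientation == Dock.Left || orientation == Dock.Right)
    ? ComputeHorizontalSize()
    : ComputeVerticalSize();

Point point = orientation switch
{
    Dock.Left => ...,
    Dock.Right => ...,
    Dock.Top => ...,
    Dock.Bottom => ...
};

m_StateStorage = new PopupPlacement(size, point); // applied later, on render
```

Note that the if/ternary and the switch survive as-is; only the assignment targets change. That is exactly the asymmetry the model kept trying to "normalize" away.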

replies(2): >>44360561 #>>44360985 #
gametorch[dead post] ◴[] No.44360561[source]
[flagged]
sysmax ◴[] No.44360830[source]
I am working on a GUI for delegating coding tasks to LLMs, so I routinely experiment with a bunch of models doing all kinds of things. In this case, Claude Sonnet 3.7 handled it just fine, while Llama-3.3-70B just couldn't get it. But that is literally the simplest example that illustrates the problem.

When I tried giving top-notch LLMs harder tasks (scan an abstract syntax tree coming from a parser in a particular way, and generate nodes for particular things), they completely blew it. The output didn't even compile, to say nothing of the logical errors and missed requirements. But once I broke the problem down into making lists of relevant parsing contexts and generating one wrapper class at a time, it saved me a whole ton of work. It took me a day to accomplish what would normally take a week.
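
To give a flavor of the step size that worked (all names here are hypothetical, including SyntaxNode and the child layout): instead of "wrap the whole AST", each prompt was "given this parsing context, generate one wrapper class":

```csharp
// Hypothetical illustration of the granularity that worked: one prompt,
// one wrapper class over one node kind, with the parsing context spelled
// out in the prompt instead of leaving the model to guess the grammar.
class BinaryExpressionNode
{
    public SyntaxNode Left { get; }
    public string Operator { get; }
    public SyntaxNode Right { get; }

    public BinaryExpressionNode(SyntaxNode raw)
    {
        // Child indices come straight from the parsing-context list
        // supplied in the prompt.
        Left = raw.Child(0);
        Operator = raw.Child(1).Text;
        Right = raw.Child(2);
    }
}
```

One class per prompt keeps each step small enough to review at a glance, which is the whole point.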

Maybe they will figure it out eventually, maybe not. The point is, right now the technology has fundamental limitations, and you are better off knowing how to work around them, rather than blindly trusting the black box.

replies(1): >>44360860 #
gametorch ◴[] No.44360860[source]
Yeah exactly.

I think it's a combination of

1) wrong level of granularity in prompting

2) lack of engineering experience

3) autistic rigidity regarding a single hallucination throwing the whole experience off

4) subconscious anxiety over the threat to their jerbs

5) unnecessary guilt over going against the tide; anything pro AI gets heavily downvoted on Reddit and is, at best, controversial as hell here

I, for one, have shipped like literally a product per day for the last month and it's amazing. Literally 2,000,000+ impressions, paying users, almost 100 sign ups across the various products. I am fucking flying. Hit the front page of Reddit and HN countless times in the last month.

Idk if I break down the prompts better or what. But this is production grade shit and I don't even remember the last time I wrote more than two consecutive lines of code.

replies(2): >>44360937 #>>44364148 #
nextlevelwizard ◴[] No.44364148[source]
Can you provide links to these 30 products you have shipped?

I keep hearing how people are so goddamn productive with LLMs, but whenever I try to use them they cannot reliably produce working code, usually producing something that looks correct at first but doesn't work, either at all or as intended.

Going over your list:

1. if the problem is that I need to be very specific with how I want LLM to fix the issue, like providing it the solution, why wouldn't I just make the change myself?

2. I don't even know how you can think that not vibe coding means you lack experience

3. Yes. If the model keeps trying to use non-existent language feature or completely made up functions/classes that is a problem and nothing to do with "autism"

4. This is what all AI maximalists want to think: that the only reason the average software developer isn't knee-deep in the AI swamp with them is that they are luddites who are just scared for their jobs. I personally am not, as I have not seen LLMs actually being useful for anything but replacing Google searches.

5. I don't know why you keep bringing up Reddit so much. I also don't quite get who is going against the tide here, are you going against the tide of the downvotes or am I for not using LLMs to "fucking fly"?

>But this is production grade shit

I truly hope it is, because...

>and I don't even remember the last time I wrote more than two consecutive lines of code.

Means if there is a catastrophic error, you probably can't fix it yourself.

replies(1): >>44366200 #
1. gametorch ◴[] No.44366200[source]
> if the problem is that I need to be very specific with how I want LLM to fix the issue, like providing it the solution, why wouldn't I just make the change myself?

I type 105 wpm on a bad day. Try gpt-4.1. It types like 1000 wpm. If you can formally describe your problem in English and the number of characters in the English prompt is less than whatever code you write, gpt-4.1 will make you faster.

Obviously you have to account for gpt-4.1 being wrong sometimes. Even so, if you have to run two or three prompts to get it right, it still is going to be faster.

> I don't even know how you can think that not vibe coding means you lack experience

If you lack experience, you're going to prompt the LLM to do the wrong thing and engineer yourself into a corner and waste time. Or you won't catch the mistakes it makes. Only experience and "knowing more than LLM" allows you to catch its mistakes and fix them. (Which is still faster than writing the code yourself, merely by way of it typing 1000 wpm.)

> If the model keeps trying to use non-existent language feature or completely made up functions/classes that is a problem and nothing to do with "autism"

You know that you can tell it those functions are made up and paste in the latest documentation, and then it will work, right? That knee-jerk response makes it sound like you have this rigidity problem, yourself.

> I personally am not as I have not seen LLMs actually being useful for anything but replacing google searches.

Nothing really of substance here. Just because you don't know how to use this tool that doesn't mean no one does.

This is the least convincing point for me, because I come along and say "Hey! This thing has let me ship far more working code than before!" and your response is just "I don't know how to use it." I know that it's made me more productive. You can't say anything to deny that. Do you think I have some need to lie about this? Why would I go on the internet and reap a bunch of downvotes peddling some lie that doesn't stand to gain me anything even if I convince people of it?

> I also don't quite get who is going against the tide here, are you going against the tide of the downvotes

Yeah, that's what I'm saying. People will actively shame and harass you for using LLMs. It's mind boggling that a tool, a technology, that works for me and has made me more productive, would be so vehemently criticized. That's why I listed these 5 reasons, the only reasons I have thought of yet.

> Means if there is a catastrophic error, you probably can't fix it yourself.

See my point about lacking experience. If you can't do the surgery yourself every once in a while, you're going to hate these tools.

Really, you've just made a bunch of claims about me that I know are false, so I'm left unconvinced.

I'm trying to have a charitable take. I don't find joy in arguing or leaving discussions with a bitter taste. I genuinely don't know why people are so mad at me claiming that a tool has helped me be more productive. They all just don't believe me, ultimately. They all come up with some excuse as to why my personal anecdotes can be dismissed and ignored: "even though you have X, we should feel bad for you because Y!" But it's never anything of substance. Never anything that has convinced me. Because at the end of the day, I'm shipping faster. My code works. My code has stood the test of time. Insults to my engineering ability I know are demonstrably false. I hope you can see the light one day. These are extraordinary tools that are only getting better, at least by a little bit, in the foreseeable future. Why deny?

replies(2): >>44366559 #>>44374038 #
2. rootnod3 ◴[] No.44366559[source]
Would also love to see those daily shipped products. What I see on reddit is the same quiz done several times just for different categories and the pixel art generator. That does not look like shipping a product per day as you claim.
replies(1): >>44367328 #
3. gametorch ◴[] No.44367328[source]
On my main, not gonna dox myself. Being pro AI is clearly a faux pas for personal branding.

Just a few days ago got flamed for only having 62 users on GameTorch. Now up to 91 and more paying subs. Entire thing written by LLMs and hasn't fallen over once. I'd rather be a builder than an armchair critic.

People would rather drag you down into the hole that they're in than climb out of it.

replies(1): >>44368385 #
4. rootnod3 ◴[] No.44368385{3}[source]
Not trying to drag down, genuinely interested due to the claim.
5. nextlevelwizard ◴[] No.44374038[source]
This is going to be all over the place and possibly hard to follow, I am just going to respond in "real time" as I read your comment, if you think that is too lazy to warrant reading I completely understand. I hope you have a nice day.

WPM is not my limiting factor. Maybe the difference is that I am not working on trivial software, so a lot of thought goes into the work; typing is the least time-consuming part. I still don't see how your 105 wpm of highly descriptive and instructive English can be faster than just fixing the thing. Even if the LLM takes 1 ms after you prompt it, you have probably already spent more time debugging the issue and writing the prompt.

So your "you lack engineering experience" was actually "you don't know LLMs well"; maybe use the words you intend instead of turning them into actual insults.

I am not going to be pasting in any C++ spec into an LLM.

Yet when I checked your profile, you have shipped one sprite image generator website. I find all these claims so hard to believe. Everyone keeps telling me how they are making millions off of LLMs, but no one has the receipts to show. It just makes me feel like you have stock in OpenAI or something and are trying your hardest to pump it up.

I think the shaming and harassing is mostly between your ears, at least I am not trying to shame or harass you for using LLMs, if anything I want to have superpowers too. If LLMs really work for you that is nice and you should keep doing it, I just have not seen the evidence you are talking about. I am willing to admit that it could very well be a skill issue, but I need more proof than "trust me" or "1000 wpm".

I don't think I have made any claims about you, although you have used loaded language like "autism" and "lack of engineering experience" and heavily implied that I am just too dumb to use the tools.

>I'm trying to have a charitable take.

C'mon, nothing about your comments has been charitable in any way. No one is mad at you personally. Do not take criticism of your tools as personal attacks. Maybe the tools will get good, but again, my problem with LLMs and the hype around them is that no one has been able to demonstrate them actually being as good as the hype suggests.

replies(1): >>44374108 #
6. gametorch ◴[] No.44374108[source]
I appreciate the reply.

What is everyone working on that takes more than five minutes to think about?

For me, the work is insurmountable and infinite, while coming up with the solution is never too difficult. I'm not saying this to be cocky. I mean this:

In 99.9999999999% of the problems I encounter in software engineering, someone smarter than me has already written the battle tested solution that I should be using. Redis. Nginx. Postgres. etc. Or it's a paradigm like depth first search or breadth first search. Or just use a hash set. Sometimes it's a little crazier like Bloom filters but whatever.
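
For concreteness, here's the Bloom filter case in C# (a toy version; the sizes and the double-hashing scheme are illustrative, not tuned). The point is that this is a looked-up, well-understood building block, not something you derive on the job:

```csharp
using System;

// Toy Bloom filter: false positives possible, false negatives impossible.
class BloomFilter
{
    private readonly bool[] m_Bits;
    private readonly int m_HashCount;

    public BloomFilter(int bitCount = 1024, int hashCount = 4)
    {
        m_Bits = new bool[bitCount];
        m_HashCount = hashCount;
    }

    private int Index(string item, int i)
    {
        // Double hashing: index_i = h1 + i * h2, a standard construction.
        unchecked
        {
            int h1 = item.GetHashCode();
            int h2 = h1 * 31 + item.Length | 1; // forced odd so probes spread
            return (int)((uint)(h1 + i * h2) % (uint)m_Bits.Length);
        }
    }

    public void Add(string item)
    {
        for (int i = 0; i < m_HashCount; i++)
            m_Bits[Index(item, i)] = true;
    }

    public bool MightContain(string item)
    {
        for (int i = 0; i < m_HashCount; i++)
            if (!m_Bits[Index(item, i)])
                return false;
        return true; // "maybe" — a false positive is possible
    }
}
```

After `Add("redis")`, `MightContain("redis")` is always true; an unseen key is probably, but not certainly, reported absent. Knowing that trade-off is the cached experience I'm talking about.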

Are you like constantly implementing new data structures and algorithms that only exist in research papers or in your head?

Once you've been engineering for 5 or 10 years, you've seen almost everything there is to see. Most of the solutions should be cached in your brains at that point. And the work just amounts to tedious, unimportant implementation details.

Maybe I'm forgetting that people still get bogged down in polymorphism and all that object oriented nonsense. If you just use flat structs, there's nothing too complicated that could possibly happen.

I worked in HFT, for what it's worth, and that should be considered very intense non-CRUD "true" engineering. That, I agree, LLMs might have a little more trouble with. But it's still nothing insane.

Software engineering is extremely formulaic. That's why it's so easy to statistically model it with LLMs.

replies(1): >>44374476 #
7. nextlevelwizard ◴[] No.44374476{3}[source]
I write embedded software in C++ for industrial applications. We have a lot of proprietary protocols and custom hardware. We have some initiatives to train LLMs with our protocols/products/documentation, but I have not been impressed with the results. Same goes with our end-to-end testing framework. I guess it isn't so popular so the results vary a lot.

I have been doing this for 8 years, and while yes, I have seen a lot, you can't just copy-paste solutions due to flash, memory, and performance constraints.

Again, maybe this is a skill issue and maybe I will be replaced by an LLM, but so far they seem more like cool toys. I have used LLMs to write AddOns for World of Warcraft, since my Lua knowledge is mostly from writing Wireshark plugins for our protocols, and for that they have been nice. But it is nothing someone who actually works with Lua or with the WoW API couldn't produce as fast or faster, because I have to describe what I want and then check whether the API the LLM uses actually exists and works the way it assumed.

replies(1): >>44377173 #
8. gametorch ◴[] No.44377173{4}[source]
Again, I appreciate the reply. I think my view on LLMs is skewed towards the positive because I've only been building CRUD apps, command line tools, and games with them. I apologize if I came off as incendiary or offensive.