(hai.stanford.edu)

mrdependable ◴[10 Apr 25 17:09 UTC] No.43645990[source]▶

I always see these reports about how much better AI is than humans now, but I can't even get it to help me with pretty mundane problem solving. Yesterday I gave Claude a file with a few hundred lines of code, what the input should be, and told it where the problem was. I tried until I ran out of credits and it still could not work backwards to tell me where things were going wrong. In the end I just did it myself and it turned out to be a pretty obvious problem.

The strange part with these LLMs is that they get weirdly hung up on things. I try to direct them away from a certain type of output and somehow they keep going back to it. It's like the same problem I have with Google where if I try to modify my search to be more specific, it just ignores what it doesn't like about my query and gives me the same output.

replies(4): >>43646008 #>>43646119 #>>43646496 #>>43647128 #

simonw ◴[10 Apr 25 17:11 UTC] No.43646008[source]▶

>>43645990 #

LLMs are difficult to use. Anyone who tells you otherwise is being misleading.

replies(2): >>43646190 #>>43666132 #

__loam ◴[10 Apr 25 17:30 UTC] No.43646190[source]▶

>>43646008 #

"Hey these tools are kind of disappointing"

"You just need to learn to use them right"

Ad infinitum as we continue to get middling results from the most overhyped piece of technology of all time.

replies(6): >>43646640 #>>43646655 #>>43646908 #>>43647257 #>>43652095 #>>43663510 #

1. simonw ◴[10 Apr 25 18:16 UTC] No.43646640[source]▶

>>43646190 #

That's why I try not to hype it.

replies(2): >>43649582 #>>43652701 #

2. mvdtnz ◴[11 Apr 25 01:30 UTC] No.43649582[source]▶

>>43646640 (TP) #

You're the biggest hype merchant for this technology on this entire website. Please.

replies(2): >>43649742 #>>43655396 #

3. simonw ◴[11 Apr 25 01:56 UTC] No.43649742[source]▶

>>43649582 #

I've been banging the drum about how unintuitive and difficult this stuff is for over a year now: https://simonwillison.net/2025/Mar/11/using-llms-for-code/

I'm one of the loudest voices about the so-far unsolved security problems inherent in this space: https://simonwillison.net/tags/prompt-injection/ (94 posts)

I also have 149 posts about the ethics of it: https://simonwillison.net/tags/ai-ethics/ - including one of the first high profile projects to explore the issue around copyrighted data used in training sets: https://simonwillison.net/2022/Sep/5/laion-aesthetics-weekno...

One of the reasons I do the "pelican riding a bicycle" thing is that it's a great way to deflate the hype around these tools - the supposedly best LLM in the world still draws a pelican that looks like it was done by a five year old! https://simonwillison.net/tags/pelican-riding-a-bicycle/

If you want AI hype there are a thousand places on the internet you can go to get it. I try not to be one of them.

replies(3): >>43651102 #>>43653084 #>>43660423 #

4. __loam ◴[11 Apr 25 06:39 UTC] No.43651102{3}[source]▶

>>43649742 #

The prompt injection articles you wrote really early in the tech cycle were really good and I appreciated them at the time.

5. JohnKemeny ◴[11 Apr 25 11:34 UTC] No.43652701[source]▶

>>43646640 (TP) #

Uh... You don't do anything but hype them.

I literally don't know who anyone on HN is except you and dang, and you're the one who constantly writes these ads for your LLM database product.

replies(1): >>43652811 #

6. simonw ◴[11 Apr 25 11:55 UTC] No.43652811[source]▶

>>43652701 #

I think you and I must have different definitions of the word "hype".

To me, it means LinkedIn influencers screaming "AGI is coming!", "It's so over", "Programming as a career is dead" etc.

Or implying that LLMs are flawless technology that can and should be used to solve every problem.

To hype something is to provide a dishonest impression of how great it is without ever admitting its weaknesses. That's what I try to avoid doing with LLMs.

replies(1): >>43659344 #

7. andai ◴[11 Apr 25 12:35 UTC] No.43653084{3}[source]▶

>>43649742 #

Could a five year old do it in XML (SVG)? Could an artist? In one shot?

8. maleldil ◴[11 Apr 25 16:05 UTC] No.43655396[source]▶

>>43649582 #

It's true that simonw writes a lot about LLMs, but I find his content to be mostly factual. Much of it is positive, but that doesn't mean it's hype.

9. bluefirebrand ◴[11 Apr 25 22:24 UTC] No.43659344{3}[source]▶

>>43652811 #

> without ever admitting its weaknesses

I don't think this part is necessary

"To hype something is to provide a dishonest impression of how great it is" is accurate.

Marketing hype is all about "provide a dishonest impression of how great it is". Putting the weaknesses in fine print doesn't change the hype.

Anyways, I don't mean to pile on, but I agree with some of the other posters here. An awful lot of the extremely pro-AI posts that I've noticed have your name on them.

I don't think you are as critical of the tech as you think you are.

Take that for what you will

10. annjose ◴[12 Apr 25 01:15 UTC] No.43660423{3}[source]▶

>>43649742 #

I agree - the content you write about LLMs is informative and realistic, not hyped. I get a lot of value from it, especially because you write mostly as a stream of consciousness and explain your approach and/or reasoning. Thank you for doing that.

2025 AI Index Report