2025 AI Index Report

(hai.stanford.edu)

170 points INGELRII | 1 comments | 10 Apr 25 15:13 UTC

mrdependable ◴[10 Apr 25 17:09 UTC] No.43645990[source]▶

I always see these reports about how much better AI is than humans now, but I can't even get it to help me with pretty mundane problem solving. Yesterday I gave Claude a file with a few hundred lines of code, what the input should be, and told it where the problem was. I tried until I ran out of credits and it still could not work backwards to tell me where things were going wrong. In the end I just did it myself and it turned out to be a pretty obvious problem.

The strange part with these LLMs is that they get weirdly hung up on things. I try to direct them away from a certain type of output and somehow they keep going back to it. It's like the same problem I have with Google where if I try to modify my search to be more specific, it just ignores what it doesn't like about my query and gives me the same output.

replies(4): >>43646008 #>>43646119 #>>43646496 #>>43647128 #

namaria ◴[10 Apr 25 18:01 UTC] No.43646496[source]▶

>>43645990 #

It's overfitting.

Some people say they find LLMs very helpful for coding, some people say they are incredibly bad.

I often see people wondering whether some coding task is performed well or not because of the availability of code examples in the training data. It's way worse than that. It's overfitting to the diffs it was trained on.

"In other words, the model learns to predict plausible changes to code from examples of changes made to code by human programmers."

https://arxiv.org/abs/2206.08896

replies(2): >>43646676 #>>43651662 #

1. mdp2021 ◴[11 Apr 25 08:41 UTC] No.43651662[source]▶

>>43646496 #

> overfitting

Are you sure it's not just a matter of being halfwitted?
