(rlancemartin.github.io)

114 points 0x79de | 1 comments | 01 Jul 25 10:56 UTC

ares623 ◴[04 Jul 25 05:22 UTC] No.44461351[source]▶

Another article handwaving or underselling the effects of hallucination. I can't help but draw parallels to layer 2 attempts from crypto.

replies(1): >>44462031 #

FiniteIntegral ◴[04 Jul 25 07:27 UTC] No.44462031[source]▶

>>44461351 #

Apple released a paper showing the diminishing returns of "deep learning" specifically when it comes to math. For example, it has a hard time solving the Tower of Hanoi problem past 6-7 discs, and that's not even giving it the restriction of optimal solutions. The agents they tested would hallucinate steps and couldn't follow simple instructions.

On top of that -- rebranding "prompt engineering" as "context engineering" and pretending it's anything different is ignorant at best and destructively dumb at worst.

replies(7): >>44462128 #>>44462410 #>>44462950 #>>44464219 #>>44464240 #>>44464924 #>>44465232 #

1. senko ◴[04 Jul 25 08:25 UTC] No.44462410[source]▶

>>44462031 #

That's one reading of that paper.

The other is that they intentionally forced LLMs to do the things we know they are bad at (following algorithms, tasks that require more context than is available, etc.) without allowing them to solve the problem the way they're optimized to (writing code that implements the algorithm).
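To illustrate the "write code" route for the Tower of Hanoi case mentioned upthread, here is a minimal sketch of the standard recursive solution. The function name `hanoi` and the peg labels are made up for illustration and are not taken from the paper.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the moves (from_peg, to_peg) that transfer n discs from source to target."""
    if n == 0:
        return []
    return (
        hanoi(n - 1, source, spare, target)    # park the n-1 smaller discs on the spare peg
        + [(source, target)]                   # move the largest disc to the target
        + hanoi(n - 1, spare, target, source)  # bring the smaller discs back on top of it
    )


moves = hanoi(7)
print(len(moves))  # 2**7 - 1 = 127 moves
```

The move list grows as 2^n - 1, so writing out every step by hand gets long quickly, while the program itself stays a few lines regardless of n.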

A cynical read is that the paper is the only AI achievement Apple has managed in the past few years.

(There is another: they managed not to lose MLX people to Meta)

Context Engineering for Agents