←back to thread

S1: A $6 R1 competitor?

(timkellogg.me)
851 points tkellogg | 1 comments | | HN request time: 0.683s | source
Show context
mtrovo ◴[] No.42951263[source]
I found the discussion around inference scaling with the 'Wait' hack so surreal. The fact such an ingeniously simple method can impact performance makes me wonder how many low-hanging fruit we're still missing. So weird to think that improvements on a branch of computer science is boiling down to conjuring the right incantation words, how you even change your mindset to start thinking this way?
replies(16): >>42951704 #>>42951764 #>>42951829 #>>42953577 #>>42954518 #>>42956436 #>>42956535 #>>42956674 #>>42957820 #>>42957909 #>>42958693 #>>42960400 #>>42960464 #>>42961717 #>>42964057 #>>43000399 #
rgovostes ◴[] No.42957909[source]
> a branch of computer science

It should be considered a distinct field. At some level there is overlap (information theory, Kolmogorov complexity, etc.), but prompt optimization and model distillation is far removed from computability, formal language theory, etc. The analytical methods, the techniques to create new architectures, etc. are very different beasts.

replies(2): >>42958732 #>>42966331 #
1. maginx ◴[] No.42966331[source]
I agree - I don't know what field it formally is, but computer science it is not. It is also related to information retrieval aka "Google skills", problem presentation, 'theory of mind', even management and psychology. I'm saying the latter because people often ridicule AI responses for giving bad answers that are 'too AI'. But often it is simply because not enough context-specific information was given to allow the AI to giving a more personalized response. One should compare the response to "If I had asked a random person on the internet this query, what might I have gotten". If you write "The response should be written as a <insert characteristics, context, whatever you feel is relevant>" it will deliver a much less AI. This is just as much about how you pose a problem in general, as it is about computer science.