
1479 points sandslash | 1 comment
OJFord ◴[] No.44324130[source]
I'm not sure about the 1.0/2.0/3.0 classification, but it did lead me to think about LLMs as a programming paradigm: we've had imperative & declarative, procedural & functional languages; maybe we'll come to view deterministic vs. probabilistic (LLMs) similarly.

    def __main__:
        You are a calculator. Given an input expression, you compute the result and print it to stdout, exiting 0.
        Should you be unable to do this, you print an explanation to stderr and exit 1.
(and then, perhaps, a bunch of 'DO NOT express amusement when the result is 5318008', etc.)
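To make that concrete, here is a minimal sketch of what such a "probabilistic function" might look like wired up today. `call_llm` is a hypothetical stand-in for whatever LLM client you actually use; nothing below is a real provider API, and the prompt wording is just an assumed example.

    import sys

    # Hypothetical stand-in for an LLM client (OpenAI, Anthropic, a local
    # model, ...). Not a real library call -- replace with your provider.
    def call_llm(system_prompt: str, user_input: str) -> str:
        raise NotImplementedError("wire up your LLM provider here")

    SYSTEM_PROMPT = (
        "You are a calculator. Given an input expression, compute the result "
        "and reply with only the result. If you cannot, reply starting with "
        "'ERROR:' followed by a short explanation. "
        "DO NOT express amusement when the result is 5318008."
    )

    def main() -> int:
        expression = sys.stdin.read().strip()
        reply = call_llm(SYSTEM_PROMPT, expression).strip()
        if reply.startswith("ERROR:"):
            print(reply, file=sys.stderr)   # explanation to stderr, exit 1
            return 1
        print(reply)                        # result to stdout, exit 0
        return 0

    if __name__ == "__main__":
        sys.exit(main())

The "program" is still the natural-language prompt; the surrounding Python only routes the model's answer to stdout/stderr and an exit code, which is what makes the deterministic-vs-probabilistic framing visible.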
replies(10): >>44324398 #>>44324762 #>>44325091 #>>44325404 #>>44325767 #>>44327171 #>>44327549 #>>44328699 #>>44328876 #>>44329436 #
no_wizard ◴[] No.44328876[source]
I wonder when companies will remove the personality from LLMs by default, especially for tools
replies(1): >>44329065 #
dingnuts ◴[] No.44329065[source]
that would require actually curating the training data and eliminating sources that contain casual conversation

too expensive, since those are all licensed sources; much easier to train on Reddit data

replies(1): >>44329159 #
amelius ◴[] No.44329159[source]
Just ask an LLM to remove the personality from the training data. Then train a new LLM on that.
replies(1): >>44337538 #
omneity ◴[] No.44337538[source]
It will work, but at the scale needed for pretraining you are bound to run into quality issues that will destroy your student model, so your data-cleaning process had better be very capable.

One way to think of it is that any little bias or undesirable path in your teacher model will be amplified in the resulting data and is likely to become overrepresented in the student model.
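As a rough illustration of both the proposal and the pitfall, here is a minimal sketch of an LLM-based cleaning pass with a crude quality gate. `call_llm` is again a hypothetical placeholder for the teacher model, and the length-ratio check is only an assumed example of a filter, not a recommendation.

    # Hypothetical stand-in for the teacher model doing the rewriting.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("wire up a teacher model here")

    REWRITE_PROMPT = (
        "Rewrite the following text in a neutral, impersonal tone. "
        "Preserve all factual content; remove jokes, filler, and chit-chat.\n\n"
        "{doc}"
    )

    def clean_corpus(docs):
        for doc in docs:
            rewritten = call_llm(REWRITE_PROMPT.format(doc=doc))
            # Crude, assumed quality gate: drop rewrites whose length changed
            # too much, as a cheap proxy for "lost or invented content".
            # At pretraining scale even rare failures compound, which is the
            # amplified-bias problem described above.
            ratio = len(rewritten) / max(len(doc), 1)
            if 0.5 <= ratio <= 1.5:
                yield rewritten

In practice the gate would need to be far stronger than a length check, which is exactly the point: the cleaning model's own blind spots get baked into every document the student sees.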