
346 points | swatson741 | 1 comment
joshdavham No.45788432
Given that we're now in the year 2025 and AI has become ubiquitous, I'd be curious to estimate what percentage of developers now actually understand backprop.

It's a bit snarky of me, but whenever I see some web developer or product person with a strong opinion about AI and its future, I like to ask "but can you at least tell me how gradient descent works?"

I'd like to see a future where more developers have a basic understanding of ML even if they never go on to do much of it. I think we would all benefit from being a bit more ML-literate.

kojoru No.45788456
I'm wondering: how can understanding gradient descent help in building AI systems on top of LLMs? To me it feels like the skills of building "AI" are almost orthogonal to the skills of building on top of "AI".
HarHarVeryFunny No.45791030
Sure, but it'd be similar to being a software developer and not understanding roughly what a compiler does. In a world full of neural network based technology, it'd be a bit lame for a technologist not to at least have a rudimentary understanding of how it works.

Nowadays, fine-tuning LLMs is becoming quite mainstream. Even if you are never training a neural net of any kind from scratch, not understanding how gradients are used in the training (and fine-tuning) process is going to limit your ability to fully work with the technology.
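For readers who want the one-line intuition the commenters are alluding to, here is a minimal toy sketch of gradient descent: fit a single weight by repeatedly nudging it against the gradient of a squared-error loss. This is an illustrative example only (the function names and numbers are made up for this sketch), not how any particular framework implements training.

```python
# Toy gradient descent: fit w so that w * x approximates y.

def loss(w, x, y):
    # squared error for one (x, y) pair
    return (w * x - y) ** 2

def grad(w, x, y):
    # analytic derivative of the loss with respect to w:
    # d/dw (w*x - y)^2 = 2 * (w*x - y) * x
    return 2 * (w * x - y) * x

w = 0.0          # initial parameter
lr = 0.1         # learning rate (step size)
x, y = 2.0, 6.0  # training pair; the ideal weight is w = 3

for step in range(50):
    # the core update: move w a small step against the gradient
    w -= lr * grad(w, x, y)

print(round(w, 3))  # converges toward 3.0
```

Backpropagation is the algorithm that computes these gradients efficiently for every parameter in a deep network via the chain rule; the update rule itself is the same idea as this one-parameter loop.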