Backpropagation is a leaky abstraction (2016)

(karpathy.medium.com)

346 points swatson741 | 1 comments | 02 Nov 25 05:20 UTC | HN request time: 0s | source

Show context

gchadwick ◴[02 Nov 25 07:20 UTC] No.45788468[source]▶

Karpathy's contribution to teaching around deep learning is just immense. He's got a mountain of fantastic material from short articles like this, longer writing like https://karpathy.github.io/2015/05/21/rnn-effectiveness/ (on recurrent neural networks) and all of the stuff on YouTube.

Plus his GitHub. The recently released nanochat https://github.com/karpathy/nanochat is fantastic. Having minimal, understandable and complete examples like that is invaluable for anyone who really wants to understand this stuff.

replies(2): >>45788631 #>>45788885 #

kubb ◴[02 Nov 25 09:03 UTC] No.45788885[source]▶

>>45788468 #

I was slightly surprised that my colleagues, who are extremely invested in capabilities of LLMs, didn’t show any interest in Karpathy’s communication on the subject when I recommended it to them.

Later I understood that they don’t need to understand LLMs, and they don’t care how they work. Rather they need to believe and buy into them.

They’re more interested in science fiction discussions — how would we organize a society where all work is done by intelligent machines — than what kinds of tasks are LLMs good at today and why.

replies(9): >>45788975 #>>45789023 #>>45789131 #>>45789241 #>>45789316 #>>45789676 #>>45789975 #>>45791483 #>>45791925 #

Al-Khwarizmi ◴[02 Nov 25 09:31 UTC] No.45789023[source]▶

>>45788885 #

What's wrong or odd about that? You can like a technology as a user and not want to delve into how it works (sentence written by a human despite use of "delve"). Everyone should have some notions on what LLMs can or cannot do, in order to use them successfully and not be misguided by their limitations, but we don't need everyone to understand what backpropagation is, just as most of us use cars without knowing much about how an internal combustion engine works.

And the issue you mention in the last paragraph is very relevant, since the scenario is plausible, so it is something we definitely should be discussing.

replies(2): >>45789298 #>>45789446 #

1. Archelaos ◴[02 Nov 25 11:06 UTC] No.45789446[source]▶

>>45789023 #

> What's wrong or odd about that? You can like a technology as a user and not want to delve into how it works

The question here is whether the details are important for the major issues, or whether they can be abstracted away with a vague understanding. To what extent abstracting away is okay depends greatly on the individual case. Abstractions can work over a large area or for a long time, but then suddenly collapse and fail.

The calculator, which has always delivered sufficiently accurate results, can produce nonsense when one approaches the limits of its numerical representation or combines numbers with very different levels of precision. This can be seen, for example, when one rearranges commutative operations; due to rounding problems, it suddenly delivers completely different results.

The 2008 financial crisis was based, among other things, on models that treated certain market risks as independent of one another. Risk could then be spread by splitting and recombining portfolios. However, this only worked as long as the interdependence of the different portfolios was actually quite small. An entire industry, with the exception of a few astute individuals, had abstracted away this interdependence, acted on this basis, and ultimately failed.

As individuals, however, we are completely dependent on these abstractions. Our entire lives are permeated by things whose functioning we simply have to rely on without truly understanding them. Ultimately, it is the nature of modern, specialized societies that this process continues and becomes even more differentiated.

But somewhere there should be people who work at the limits of detailed abstractions and are concerned with researching and evaluating the real complexity hidden behind them, and thus correcting the abstraction if necessary, sending this new knowledge upstream.

The role of an expert is to operate with less abstraction and more detail in her oder his field of expertise than a non-expert -- and the more so, the better an expert she or he is.

↑