The deep learning boom caught almost everyone by surprise

(www.understandingai.org)

306 points slyall | 1 comments | 06 Nov 24 04:05 UTC


DeathArrow ◴[06 Nov 24 09:10 UTC] No.42058383[source]▶

>>42057139 (OP) #

I think neural nets are just a subset of machine learning techniques.

I wonder what would have happened if we poured the same amount of money, talent and hardware into SVMs, random forests, KNN, etc.

I'm not saying that transformers, LLMs, deep learning and the other great developments in the neural network space aren't valuable, because they are.

But I think in the future we should also study other options which might be better suited than neural networks for some classes of problems.

Can a very large and expensive LLM do sentiment analysis or classification? Yes, it can. But so can simple SVMs and KNN, and sometimes they do it even better.

I saw some YouTube coders doing calls to OpenAI's o1 model for some very simple classification tasks. That isn't the best tool for the job.
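To make the point about simple classifiers concrete, here is a minimal sketch of a k-nearest-neighbours sentiment classifier in plain Python. Everything in it is an assumption made for illustration: the six training sentences in `TRAIN`, the whitespace tokenizer, and the helper names `bag_of_words`, `cosine` and `knn_predict` are all hypothetical. It says nothing about how such a model would compare with an LLM on real data.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count the tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k most similar training examples."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda example: cosine(bag_of_words(example[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, invented for this example.
TRAIN = [
    ("great product works perfectly", "positive"),
    ("i love this great phone", "positive"),
    ("excellent quality and great value", "positive"),
    ("terrible product broke quickly", "negative"),
    ("i hate this awful phone", "negative"),
    ("awful quality and terrible value", "negative"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "this phone is great"))
    print(knn_predict(TRAIN, "awful and terrible"))
```

The whole model is the training list plus a similarity function, so a prediction can be inspected by printing the neighbours that voted for it.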

replies(11): >>42058980 #>>42059047 #>>42059100 #>>42059544 #>>42059813 #>>42060244 #>>42060447 #>>42060561 #>>42060833 #>>42062658 #>>42088131 #

mentalgear ◴[06 Nov 24 10:02 UTC] No.42059047[source]▶

>>42058383 #

KANs (Kolmogorov-Arnold Networks) are one example of a promising exploration pathway to real AGI, with the advantage of full explainability.

replies(2): >>42059624 #>>42073900 #

astrange ◴[06 Nov 24 10:38 UTC] No.42059624[source]▶

>>42059047 #

"Explainable" is a strong word.

As a simple example, if you ask a question and part of the answer is directly quoted from a book from memory, that text is not computed/reasoned by the AI and so doesn't have an "explanation".

But I also suspect that any AGI would necessarily produce answers it can't explain. That's called intuition.

replies(1): >>42059743 #

diffeomorphism ◴[06 Nov 24 10:46 UTC] No.42059743[source]▶

>>42059624 #

Why? If I ask you what the height of the Empire State Building is, then a reference is a great, explainable answer.

replies(1): >>42061157 #

astrange ◴[06 Nov 24 12:28 UTC] No.42061157[source]▶

>>42059743 #

It wouldn't be a reference; "explanation" for an LLM means it tells you which of its neurons were used to create the answer, i.e. what internal computations it did and which parts of the input it read. Their architecture isn't capable of referencing things.

What you'd get is an explanation saying "it quoted this verbatim", or possibly "the top neuron is used to output the word 'State' after the word 'Empire'".

You can try out a system here: https://monitor.transluce.org/dashboard/chat

Of course the AI could incorporate web search, but then what if the explanation is just "it did a web search and that was the first result"? It seems pretty difficult to recursively make every external tool also explainable…

replies(2): >>42061585 #>>42061651 #

Retric ◴[06 Nov 24 12:59 UTC] No.42061585[source]▶

>>42061157 #

LLMs are not the only possible option here. When it comes to AGI, nothing we are currently doing is that promising.

The search is for something that can write an essay, drive a car, and cook lunch, so we need something new.

replies(1): >>42064107 #

Vampiero ◴[06 Nov 24 15:49 UTC] No.42064107[source]▶

>>42061585 #

When people talk about explainability I immediately think of Prolog.

A Prolog query is explainable precisely because, by construction, it itself is the explanation. And you can go step by step and understand how you got a particular result, inspecting each variable binding and predicate call site in the process.
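As a rough illustration of that step-by-step inspection, here is a small Python sketch of depth-first backward chaining with backtracking. It is not Prolog and deliberately leaves out variables and unification, so it only handles ground (fully spelled-out) goals. The `RULES` knowledge base and the `prove` helper are hypothetical names invented for the example, and the CALL/EXIT/FAIL labels only loosely imitate a Prolog tracer. There is no cycle detection, so recursive rules could loop forever.

```python
# Hypothetical knowledge base: each goal maps to a list of alternative rule
# bodies. A fact is a rule whose body is empty.
RULES = {
    "reachable(a,c)": [["edge(a,c)"], ["edge(a,b)", "edge(b,c)"]],
    "edge(a,b)": [[]],
    "edge(b,c)": [[]],
}


def prove(goal, rules, depth=0, trace=None):
    """Try to prove `goal`; return a proof tree (goal, subproofs) or None.

    If `trace` is a list, every call, success and failure is appended to it,
    indented by the depth of the goal.
    """
    indent = "  " * depth
    if trace is not None:
        trace.append(indent + "CALL " + goal)
    for body in rules.get(goal, []):
        subproofs = []
        for subgoal in body:
            proof = prove(subgoal, rules, depth + 1, trace)
            if proof is None:
                break  # this alternative failed: backtrack to the next body
            subproofs.append(proof)
        else:
            # Every subgoal in this body was proven (or the body was empty).
            if trace is not None:
                trace.append(indent + "EXIT " + goal)
            return (goal, subproofs)
    if trace is not None:
        trace.append(indent + "FAIL " + goal)
    return None


if __name__ == "__main__":
    steps = []
    print(prove("reachable(a,c)", RULES, trace=steps))
    print("\n".join(steps))
```

The returned tree is itself the explanation: it records which rule alternative succeeded and which facts it rested on, while the trace shows the failed first alternative that was abandoned along the way.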

Despite all the billions being thrown at modern ML, no one has managed to create a model that does something like what Prolog does with its simple recursive backtracking.

So the moral of the story is that you can 100% trust the result of a Prolog query, but you can't ever trust the output of an LLM. Given that, which technology would you rather use to build software on which lives depend?

And which of the two methods is more "artificially intelligent"?

replies(1): >>42070201 #

astrange ◴[06 Nov 24 22:05 UTC] No.42070201[source]▶

>>42064107 #

The site I linked above does that for LLaMa 8B.

https://transluce.org/observability-interface

LLMs don't have enough self-awareness to produce really satisfying explanations though, no.
