251 points slyall | 26 comments
1. DeathArrow ◴[] No.42058383[source]
I think neural nets are just a subset of machine learning techniques.

I wonder what would have happened if we poured the same amount of money, talent and hardware into SVMs, random forests, KNN, etc.

I'm not saying that transformers, LLMs, deep learning and the other great things that happened in the neural network space aren't very valuable, because they are.

But I think in the future we should also study other options which might be better suited than neural networks for some classes of problems.

Can a very large and expensive LLM do sentiment analysis or classification? Yes, it can. But so can simple SVMs and KNN, and sometimes they do it even better.

I saw some YouTube coders making calls to OpenAI's o1 model for some very simple classification tasks. That isn't the best tool for the job.
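
For illustration, a minimal scikit-learn sketch of the kind of classical baseline meant here; the toy training texts and labels are invented, and in practice you'd fit on a real labelled corpus:

    # Sketch: TF-IDF + linear SVM as a cheap sentiment classifier.
    # Requires scikit-learn; the tiny training set is made up for illustration.
    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    texts = ["great product, loved it", "terrible, waste of money",
             "works fine, would buy again", "broke after one day"]
    labels = ["pos", "neg", "pos", "neg"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    clf.fit(texts, labels)

    print(clf.predict(["absolutely loved it", "what a waste"]))  # e.g. ['pos' 'neg']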

replies(10): >>42058980 #>>42059047 #>>42059100 #>>42059544 #>>42059813 #>>42060244 #>>42060447 #>>42060561 #>>42060833 #>>42062658 #
2. Meloniko ◴[] No.42058980[source]
And based on what, though, do you think that?

I think neural networks are fundamental, and we will focus/experiment a lot more with architectures, layers and the other parts involved, but emergent features arise through size.

3. mentalgear ◴[] No.42059047[source]
KANs (Kolmogorov-Arnold Networks) are one example of a promising exploration pathway to real AGI, with the advantage of full explainability.
replies(1): >>42059624 #
4. trhway ◴[] No.42059100[source]
>I wonder what would have happened if we poured the same amount of money, talent and hardware into SVMs, random forests, KNN, etc.

people did that to horses. No car resulted from it, just slightly better horses.

>I saw some YouTube coders doing calls to OpenAI's o1 model for some very simple classification tasks. That isn't the best tool for the job.

This "not best tool" is just there for the coders to call while the "simple SVMs and KNN" would require coding and training by those coders for the specific task they have at hand.

replies(1): >>42060054 #
5. empiko ◴[] No.42059544[source]
Deep learning is easy to adapt to various domains, use cases, and training criteria. Other approaches do not have the flexibility of combining arbitrary layers and subnetworks and then training them with arbitrary loss functions. The depth in deep learning is also pretty important, as it allows the model to build hierarchical representations of the inputs.
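
As a rough sketch of that flexibility (PyTorch assumed; all sizes and the loss choice are arbitrary placeholders): subnetworks compose freely and train end to end against whatever differentiable criterion you pick.

    # Sketch: two subnetworks composed into one model, trained with an arbitrary loss.
    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
    head    = nn.Sequential(nn.ReLU(), nn.Linear(16, 1))
    model   = nn.Sequential(encoder, head)        # swap layers/subnets at will

    loss_fn = nn.SmoothL1Loss()                   # or any other differentiable loss
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    x, y = torch.randn(8, 32), torch.randn(8, 1)  # dummy batch
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()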
replies(1): >>42060613 #
6. astrange ◴[] No.42059624[source]
"Explainable" is a strong word.

As a simple example, if you ask a question and part of the answer is directly quoted from a book from memory, that text is not computed/reasoned by the AI and so doesn't have an "explanation".

But I also suspect that any AGI would necessarily produce answers it can't explain. That's called intuition.

replies(1): >>42059743 #
7. diffeomorphism ◴[] No.42059743{3}[source]
Why? If I ask you what the height of the Empire State Building is, then a reference is a great, explainable answer.
replies(1): >>42061157 #
8. jasode ◴[] No.42059813[source]
>I wonder what would have happened if we poured the same amount of money, talent and hardware into SVMs, random forests, KNN, etc.

But that's backwards from how new techniques and progress are made. What actually happens is that somebody (maybe a student at a university) has an insight or a new idea for an algorithm that costs nearly $0 to implement as a proof of concept. Then everybody else notices the improvement, and the extra millions/billions get directed toward it.

New ideas -- which didn't cost much at the start -- ATTRACT the follow-on billions in investment.

This timeline of tech progress in computer science is the opposite of other disciplines such as materials science or the biomedical fields. Trying to discover the next superalloy or cancer drug requires expensive experiments. Manipulating atoms & molecules requires very expensive specialized equipment. In contrast, computer science experiments can be cheap. You just need a clever insight.

An example of that was the 2012 AlexNet image recognition algorithm that blew all the other approaches out of the water. Alex Krizhevsky had a new insight: a convolutional neural network running on CUDA. He bought 2 NVIDIA cards (GTX 580 3GB GPUs) from Amazon. It didn't require NASA levels of investment at the start to implement his idea. Once everybody else noticed his superior results, the billions began pouring in to iterate/refine on CNNs.

Both the "attention mechanism" and the refinement of "transformer architecture" were also cheap to prove out at a very small scale. In 2014, Jakob Uszkoreit thought about an "attention mechanism" instead of RNN and LSTM for machine translation. It didn't cost billions to come up with that idea. Yes, ChatGPT-the-product cost billions but the "attention mechanism algorithm" did not.

>into SVMs, random forests, KNN, etc.

If anyone has found an insight into SVMs, KNN, etc. that everybody else in the industry has overlooked, they can do cheap experiments to prove it. E.g. the entire Wikipedia text download is currently only ~25GB; run the new SVM classification idea on that corpus. Very low-cost experiments in computer science algorithms can still be done in the proverbial "home garage".

replies(3): >>42061648 #>>42063764 #>>42065288 #
9. guappa ◴[] No.42060054[source]
[citation needed]
10. edude03 ◴[] No.42060244[source]
Transformers were made for machine translation - someone had the insight that when going from one language to another, the context mattered, such that the tokens that came before would bias which ones came after. It just so happened that transformers were more performant on other tasks, and at the time you could demonstrate the improvement at a small scale.
11. ldjkfkdsjnv ◴[] No.42060447[source]
This is such a terrible opinion. I'm so tired of reading the LLM deniers.
12. f1shy ◴[] No.42060561[source]
> neural nets are just a subset of machine learning techniques.

Fact by definition

13. f1shy ◴[] No.42060613[source]
But it is very hard to validate for important or critical applications.
14. dr_dshiv ◴[] No.42060833[source]
The best tool for the job is, I’d argue, the one that does the job most reliably for the least amount of money. When you consider how little expertise or data you need to use OpenAI's offerings, I’d be surprised if sentiment analysis using classical ML methods is actually better (unless you are an expert and have a good dataset).
15. astrange ◴[] No.42061157{4}[source]
It wouldn't be a reference; "explanation" for an LLM means it tells you which of its neurons were used to create the answer, i.e. what internal computations it did and which parts of the input it read. Their architecture isn't capable of referencing things.

What you'd get is an explanation saying "it quoted this verbatim", or possibly "the top neuron is used to output the word 'State' after the word 'Empire'".

You can try out a system here: https://monitor.transluce.org/dashboard/chat

Of course the AI could incorporate web search, but then what if the explanation is just "it did a web search and that was the first result"? It seems pretty difficult to recursively make every external tool also explainable…
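
(Not the linked tool, but for a flavour of the underlying idea, here is a sketch of recording which hidden units fire for a given input via a forward hook. The tiny model is a stand-in; with a real LLM you would hook its transformer blocks.)

    # Sketch: record a layer's activations with a PyTorch forward hook.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
    activations = {}

    def record(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    model[1].register_forward_hook(record("relu"))  # watch the hidden units

    x = torch.randn(1, 16)                          # stand-in for an embedded input
    model(x)

    top = activations["relu"].squeeze(0).topk(5)
    print(top.indices.tolist())                     # indices of the most active "neurons"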

replies(2): >>42061585 #>>42061651 #
16. Retric ◴[] No.42061585{5}[source]
LLMs are not the only possible option here. When talking about AGI, none of what we are doing currently is that promising.

The search is for something that can write an essay, drive a car, and cook lunch, so we need something new.

replies(1): >>42064107 #
17. FrustratedMonky ◴[] No.42061648[source]
"$0 cost to implement a proof-of concept"

This falls apart for breakthroughs where the proof of concept is not zero-cost.

I think that is what the parent is referring to: that other technologies might have more potential, but would take money to build out.

18. diffeomorphism ◴[] No.42061651{5}[source]
Then you should have a stronger notion of "explanation". Why were these specific neurons activated?

Simplest example: OCR. A network identifying digits can often be explained as recognizing lines, curves, numbers of segments, etc. That is an explanation, not "computer says it looks like an 8".

replies(1): >>42065185 #
19. jensgk ◴[] No.42062658[source]
> I wonder what would have happened if we poured the same amount of money, talent and hardware into SVMs, random forests, KNN, etc.

From my perspective, that is actually what happened between the mid-90s and 2015. Neural networks were dead in that period, but every other ML method was very, very hot.

20. scotty79 ◴[] No.42063764[source]
Do the transformer architecture and attention mechanisms actually give any benefit to anything other than scalability?

I thought the main insights were embeddings, positional encoding, and shortcuts through layers to improve backpropagation.

21. Vampiero ◴[] No.42064107{6}[source]
When people talk about explainability I immediately think of Prolog.

A Prolog query is explainable precisely because, by construction, it itself is the explanation. And you can go step by step and understand how you got a particular result, inspecting each variable binding and predicate call site in the process.

Despite all the billions being thrown at modern ML, no one has managed to create a model that does something like what Prolog does with its simple recursive backtracking.

So the moral of the story is that you can 100% trust the result of a Prolog query, but you can't ever trust the output of an LLM. Given that, which technology would you rather use to build software on which lives depend?

And which of the two methods is more "artificially intelligent"?
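
To make the contrast concrete, here is a toy Python sketch (not Prolog itself; the facts and rule are invented) of a recursive, backtracking query whose trace of bindings is its own explanation:

    # Sketch: grandparent(X, Z) :- parent(X, Y), parent(Y, Z), resolved by
    # simple recursion over explicit facts. Every binding is inspectable.
    FACTS = [
        ("parent", "tom", "bob"),
        ("parent", "bob", "ann"),
        ("parent", "bob", "pat"),
    ]

    def parent(x, y):
        # Yield (x, y) pairs matching the parent/2 facts; None means "unbound".
        for _, a, b in FACTS:
            if (x is None or x == a) and (y is None or y == b):
                yield a, b

    def grandparent(x, z):
        for a, y in parent(x, None):      # bind Y
            for _, c in parent(y, z):     # bind/check Z
                yield {"X": a, "Y": y, "Z": c}

    for binding in grandparent("tom", None):
        print(binding)  # {'X': 'tom', 'Y': 'bob', 'Z': 'ann'}, then ... 'pat'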

replies(1): >>42070201 #
22. krisoft ◴[] No.42065185{6}[source]
But can humans do that? If you show someone a picture of a cat, can they "explain" why is it a cat and not a dog or a pumpkin?

And is that explanation actually how they obtained the "cat-ness" of the picture, or do they just see that it is a cat immediately and obviously, and when you ask them for an explanation they come up with some explaining noises until you are satisfied?

replies(2): >>42067149 #>>42067384 #
23. DeathArrow ◴[] No.42065288[source]
True, you might not need lots of money to test some ideas. But LLMs and transformers are all the rage, so they gather all the attention and research funds.

People don't even think of doing anything else, and those who might are paid to pursue research on LLMs.

24. diffeomorphism ◴[] No.42067149{7}[source]
Wild cat, house cat, lynx,...? Sure, they can. They will tell you about proportions, shape of the ears, size as compared to other objects in the picture etc.

For cat vs pumpkin they will think you are making fun of them, but it very much is explainable. Though now I am picturing a puzzle about finding orange cats in a picture of a pumpkin field.

25. fragmede ◴[] No.42067384{7}[source]
Shown a picture of a cloud, why it looks like a cat sometimes does need an explanation before others can see the cat, and it's not just "explaining noises".
26. astrange ◴[] No.42070201{7}[source]
The site I linked above does that for LLaMa 8B.

https://transluce.org/observability-interface

LLMs don't have enough self-awareness to produce really satisfying explanations though, no.