Most active commenters

kragen(8)
bryanrasmussen(3)
BobbyTables2(3)

Popular/hot comments

>>45080001 #
>>45080114 #

←back to thread

Enrollment at trade schools is expected to grow

(finance.yahoo.com)

Show context

prisenco ◴[30 Aug 25 23:35 UTC] No.45078963[source]▶

>>45078651 (OP) #

For junior devs wondering if they picked the right path, remember that the world still needs software, ai still breaks down at even a small bit of complexity, and the first ones to abandon this career will be those who only did it for money anyways and they’ll do the same once the trades have a rough year (as they always do).

In the meantime keep learning and practicing cs fundamentals, ignore hype and build something interesting.

replies(5): >>45079011 #>>45079019 #>>45079029 #>>45079186 #>>45079322 #

1. kragen ◴[31 Aug 25 00:42 UTC] No.45079322[source]▶

>>45078963 #

Nobody has any idea what AI is going to look like five years from now. Five years ago we had GPT-2; AI couldn't code at all. Five years from now AI might still break down at even a small bit of complexity, or it might be installing air conditioners, or it might be colonizing Mercury and putting humans in zoos.

Anyone who tells you they know what the future looks like five years from now is lying.

replies(2): >>45079366 #>>45079457 #

2. tmn ◴[31 Aug 25 00:51 UTC] No.45079366[source]▶

>>45079322 (TP) #

There’s a significant difference between predicting what it will specifically look like, and predicting sets of possibilities it won’t look like

replies(1): >>45079424 #

3. kragen ◴[31 Aug 25 01:04 UTC] No.45079424[source]▶

>>45079366 #

No, there isn't. When speaking of logically consistent possibilities, the two problems are precisely isomorphic under Boolean negation.

replies(1): >>45079980 #

4. noosphr ◴[31 Aug 25 01:13 UTC] No.45079457[source]▶

>>45079322 (TP) #

Unless we have another breakthrough like attention we do know that AI will keep struggling with context and costs will grow quadratically with context.

On a codebase of 10,000 lines any action will cost 100,000,000 AI units. One with 1,000,000 it will cost 1,000,000,000,000 AI units.

I work on these things for a living and no one else seems to ever think two steps ahead on what the mathematical limitations of the transformer architecture mean for transformer based applications.

replies(1): >>45079562 #

5. kragen ◴[31 Aug 25 01:37 UTC] No.45079562[source]▶

>>45079457 #

It's only been 8 years since the attention breakthrough. Since then we've had "sparsely-gated MoE", RLHF, BERT, "Scaling Laws", Dall-E, LoRA, CoT, AlphaFold 2, "Parameter-Efficient Fine-Tuning", and DeepSeek's training cost breakthrough. AI researchers rather than physicists or chemists won the Nobel Prizes in physics and (for AlphaFold) chemistry last year. Agentic software development, MCP, and video generation are more or less new this year.

Humans also keep struggling with context, so while large contexts may limit AI performance, they won't necessarily prevent them from being strongly superhuman.

replies(2): >>45080001 #>>45080114 #

6. bryanrasmussen ◴[31 Aug 25 03:00 UTC] No.45079980{3}[source]▶

>>45079424 #

good point, someone recently said

> Five years from now AI might still break down at even a small bit of complexity, or it might be installing air conditioners, or it might be colonizing Mercury and putting humans in zoos.

do all these seem logically consistent possibilities to you?

replies(1): >>45080025 #

7. lossolo ◴[31 Aug 25 03:04 UTC] No.45080001{3}[source]▶

>>45079562 #

> Since then we've had "sparsely-gated MoE", RLHF, BERT, "Scaling Laws", Dall-E, LoRA, CoT, AlphaFold 2, "Parameter-Efficient Fine-Tuning", and DeepSeek's training cost breakthrough.

OK, I will bite.

So "Sparsely-gated MoE" isn’t some new intelligence, it's a sharding trick. You trade parameter count for FLOPs/latency with a router. And MoE predates transformers anyway.

RLHF is packaging. Supervised finetune on instructions, learn a reward model, then nudge the policy. That’s a training objective swap plus preference data. It's useful, but not breakthrough.

CoT is a prompting hack to force the same model to externalize intermediate tokens. The capability was there, you’re just sampling a longer trajectory. It’s UX for sampling.

Scaling laws are an empirical fit telling you "buy more compute and data" That’s a budgeting guideline, not new math or architecture. https://www.reddit.com/r/ProgrammerHumor/comments/8c1i45/sta...

LoRA is linear algebra 101, low rank adapters to cut training cost and avoid touching the full weights. The base capability still comes from the giant pretrained transformer.

AlphaFold 2’s magic is mostly attention + A LOT of domain data/priors (MSAs, structures, evolutionary signal). Again attention core + data engineering.

"DeepSeek’s cost breakthrough" is systems engineering.

Agentic software dev/MCP is orchestration, that’s middleware and protocols, it helps use the model, it doesn’t make the model smarter.

Video generation? Diffusion with temporal conditioning and better consistency losses. It’s DALL-E style tech stretched across time with tons of data curation and filtering.

Most headline "wins" are compiler and kernel wins: FlashAttention, paged KV-cache, speculative decoding, distillation, quantization (8/4 bit), ZeRO/FSDP/TP/PP... These only move the cost curve, not the intelligence.

The biggest single driver the last few years has been the data so de dup, document quality scores, aggressive filtration, mixture balancing (web/code/math), synthetic bootstrapping, eval driven rewrites etc etc. You can swap half a dozen training "tricks" and get similar results if your data mix and scale are right.

For me a real post attention "breakthrough", would be something like: training that learns abstractions with sample efficiency far beyond scaling laws, reliable formal reasoning, causal/world-model learning that transfers out of distribution. None of the things you listed do that.

Almost everything since attention is optimization, ops, and data curation. I mean give me exact pretrain mix, filtering heuristics, and finetuning datasets for Claude/GPT-5 and without peeking at the secret sauce architecture I can get close just by matching tokens, quality filters and training schedule. The "breakthroughs" are mostly better ways to spend compute and clean data, not new ways to think.

replies(3): >>45080042 #>>45080140 #>>45080408 #

8. kragen ◴[31 Aug 25 03:07 UTC] No.45080025{4}[source]▶

>>45079980 #

Yes, obviously. You presumably don't know what "consistent" means in logic, and your untutored intuition is misleading you into guessing that possibilities like those could conceivably be inconsistent.

https://en.m.wikipedia.org/wiki/Consistency

replies(1): >>45080085 #

9. kragen ◴[31 Aug 25 03:12 UTC] No.45080042{4}[source]▶

>>45080001 #

I don't disagree with any of this, though it sounds like you know more about it than I do.

10. bryanrasmussen ◴[31 Aug 25 03:21 UTC] No.45080085{5}[source]▶

>>45080025 #

or I just wanted to make sure that you were adamant that the list of those three possibilities were equally probable, to reiterate

> AI might still break down at even a small bit of complexity, or it might be installing air conditioners, or it might be colonizing Mercury and putting humans in zoos.

that each of these things, being logically consistent, have equal chances of being the case 5 years from now?

replies(1): >>45080090 #

11. kragen ◴[31 Aug 25 03:22 UTC] No.45080090{6}[source]▶

>>45080085 #

No. Fuck off. There's no uniform probability distribution over the reals, so stop trying to put bullshit in my mouth.

replies(1): >>45080962 #

12. BobbyTables2 ◴[31 Aug 25 03:27 UTC] No.45080114{3}[source]▶

>>45079562 #

I think it’s currently too easy to get drunk on easy success cases for AI.

It’s like asking a college student 4th grade math questions and then being impressed they knew the answer.

I’ve use copilot a lot. Faster then google, gives great results.

Today I asked it for the name of a French restaurant that closed in my area a few years ago. The first answer was a Chinese fusion place… all the others were off too.

Sure, keep questions confined to something it was heavily trained on, answers will be great.

But yeah, AI going to get rid of a lot of low skilled labor.

replies(3): >>45080159 #>>45080590 #>>45081156 #

13. BobbyTables2 ◴[31 Aug 25 03:32 UTC] No.45080140{4}[source]▶

>>45080001 #

Indeed. I’m shocked that we train “AI” pretty much as one would build a fancy auto-complete.

Not necessarily a bad approach but feels like something is missing for it to be “intelligent”.

Should really be called “artificial knowledge” instead.

replies(2): >>45080164 #>>45080325 #

14. kragen ◴[31 Aug 25 03:36 UTC] No.45080159{4}[source]▶

>>45080114 #

Sure, we might have hit a wall in some important sense, where further progress on some kinds of abilities is blocked until we try something totally different. But we might not. Nobody has any clue.

15. kragen ◴[31 Aug 25 03:38 UTC] No.45080164{5}[source]▶

>>45080140 #

"What do you mean, they talk?"

"They talk by flapping their meat at each other!"

16. jofla_net ◴[31 Aug 25 04:13 UTC] No.45080325{5}[source]▶

>>45080140 #

This and parent are both approaching toward what I see as the main obstacle, that we as a species don't know how in its entirety a human mind thinks (and it varies among people), so trying to "model" it and reproduce it is reduced to a game of black-boxing. We black box the mind in terms of what situations its been seen to be in and how it has performed, the millions of correlative inputs/outputs are the training data. Yet, since we don't know the fullness of the interior we can only see its outputs it becomes somewhat of a Plato's cave situation. We believe it 'thinks' this way but again we cannot empirically say it performed a task a certain way, so unlike most other engineering problems, we are grasping at straws while trying to reconstruct it. This doesn't not mean that a human mind's inner-workings can't ever be %100 reproduced, but not until we know it further.

replies(2): >>45080593 #>>45123194 #

17. kianN ◴[31 Aug 25 04:36 UTC] No.45080408{4}[source]▶

>>45080001 #

This is a great summary of why despite so much progress/tricks being discovered, so little progress to the core limitations to LLMs are made.

18. CamperBob2 ◴[31 Aug 25 05:26 UTC] No.45080590{4}[source]▶

>>45080114 #

It’s like asking a college student 4th grade math questions and then being impressed they knew the answer.

No, it's more like asking a 4th-grader college math questions, and then desperately looking for ways to not be impressed when they get it right.

Today I asked it for the name of a French restaurant that closed in my area a few years ago. The first answer was a Chinese fusion place… all the others were off too.

What would have been impressive is if the model had replied, "WTF, do I look like Google? Look it up there, dumbass."

19. tempodox ◴[31 Aug 25 05:27 UTC] No.45080593{6}[source]▶

>>45080325 #

And there is another important difference: Our environments have oodles of details that inform us, while LLM training data is just “everything humans have ever written”. Those are completely different things. And LLMs have no concept of facts, only statements about facts in their training data that may or may not be true.

20. bryanrasmussen ◴[31 Aug 25 06:53 UTC] No.45080962{7}[source]▶

>>45080090 #

OK well you obviously seem to be having some bad time about something in your life right now so I won't continue, other than to note the comment that started this said

>There’s a significant difference between predicting what it will specifically look like, and predicting sets of possibilities it won’t look like

which I took to mean there are probability distributions around what things will happen, and it seemed to be your assertion that there wasn't, that a number of things only one of which seemed especially probable, were equally probable. I'm glad to learn you don't think this as it seems totally crazy, especially for someone praising LLMs which after all spend their time making millions of little choices based on probability.

21. BoiledCabbage ◴[31 Aug 25 07:30 UTC] No.45081156{4}[source]▶

>>45080114 #

> Today I asked it for the name of a French restaurant that closed in my area a few years ago. The first answer was a Chinese fusion place… all the others were off too.

What's the point of this anecdote? That it's not omniscient? Nobody is should be thinking that it is.

I can ask it how many coins I have in my pocket and I bet you it won't know that either.

22. BobbyTables2 ◴[04 Sep 25 03:24 UTC] No.45123194{6}[source]▶

>>45080325 #

It’s kinda interesting that a simple generator that probabilistically generates a word based on the previous one (less than 100 lines of simple python code), when trained on a book or two, will fairly consistently capitalize the first word after a period and form somewhat reasonable sentences.

It’s not that it knows grammar, it just was trained on a dataset that applied proper capitalization.

Humans learn from seeing patterns. I suspect AI only repeats them, more like a parrot.

↑