Ask HN: How are Markov chains so different from tiny LLMs?

I polished a Markov chain generator and trained it on an article by Uri Alon et al. (https://pmc.ncbi.nlm.nih.gov/articles/PMC7963340/).

It generates text that seems to me at least on par with tiny LLMs, such as those demonstrated by NanoGPT. Here is an example:

  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM10b_train UriAlon.txt 3
  Training model with order 3...
  Skip-gram detection: DISABLED (order < 5)
  Pruning is disabled
  Calculating model size for JSON export...
  Will export 29832 model entries
  Exporting vocabulary (1727 entries)...
  Vocabulary export complete.
  Exporting model entries...
    Processed 12000 contexts, written 28765 entries (96.4%)...
  JSON export complete: 29832 entries written to model.json
  Model trained and saved to model.json
  Vocabulary size: 1727

  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM9_gen model.json

Aging cell model requires comprehensive incidence data. To obtain such a large medical database of the joints are risk factors. Therefore, the theory might be extended to describe the evolution of atherosclerosis and metabolic syndrome. For example, late‐stage type 2 diabetes is associated with collapse of beta‐cell function. This collapse has two parameters: the fraction of the senescent cells are predicted to affect disease threshold . For each individual, one simulates senescent‐cell abundance using the SR model has an approximately exponential incidence curve with a decline at old ages In this section, we simulated a wide range of age‐related incidence curves. The next sections provide examples of classes of diseases, which show improvement upon senolytic treatment tends to qualitatively support such a prediction. model different disease thresholds as values of the disease occurs when a physiological parameter ϕ increases due to the disease. Increasing susceptibility parameter s, which varies about 3‐fold between BMI below 25 (male) and 54 (female) are at least mildly age‐related and 25 (male) and 28 (female) are strongly age‐related, as defined above. Of these, we find that 66 are well described by the model as a wide range of feedback mechanisms that can provide homeostasis to a half‐life of days in young mice, but their removal rate slows down in old mice to a given type of cancer have strong risk factors should increase the removal rates of the joint that bears the most common biological process of aging that governs the onset of pathology in the records of at least 104 people, totaling 877 disease category codes (See SI section 9), increasing the range of 6–8% per year. The two‐parameter model describes well the strongly age‐related ICD9 codes: 90% of the codes show R 2 > 0.9) (Figure 4c). 
This agreement is similar to that of the previously proposed IMII model for cancer, major fibrotic diseases, and hundreds of other age‐related disease states obtained from 10−4 to lower cancer incidence. A better fit is achieved when allowing to exceed its threshold mechanism for classes of disease, providing putative etiologies for diseases with unknown origin, such as bone marrow and skin. Thus, the sudden collapse of the alveoli at the outer parts of the immune removal capacity of cancer. For example, NK cells remove senescent cells also to other forms of age‐related damage and decline contribute (De Bourcy et al., 2017). There may be described as a first‐passage‐time problem, asking when mutated, impair particle removal by the bronchi and increase damage to alveolar cells (Yang et al., 2019; Xu et al., 2018), and immune therapy that causes T cells to target senescent cells (Amor et al., 2020). Since these treatments are predicted to have an exponential incidence curve that slows at very old ages. Interestingly, the main effects are opposite to the case of cancer growth rate to removal rate We next consider the case of frontline tissues discussed above.


Sohcahtoa82 ◴[20 Nov 25 18:26 UTC] No.45995897[source]▶

>>45958004 (OP) #

A Markov Chain trained on only a single article of text will very likely just regurgitate entire sentences straight from the source material. There just isn't enough variation in sentences.

But then, Markov Chains fall apart when the source material is very large. Try training a chain based on Wikipedia. You'll find that the resulting output becomes incoherent garbage. Increasing the context length may increase coherence, but at the cost of turning into just simple regurgitation.
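The regurgitation half of this trade-off can be illustrated with a minimal sketch. It assumes a word-level chain and a made-up toy sentence (neither comes from the thread), and measures the fraction of contexts that have exactly one possible successor. When that fraction reaches 1.0, generation can only replay the source. It does not attempt to show the incoherence claim for very large corpora.

```python
from collections import defaultdict

def deterministic_fraction(tokens, order):
    """Fraction of order-length contexts that have exactly one distinct successor."""
    successors = defaultdict(set)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        successors[context].add(tokens[i + order])
    forced = sum(1 for nxt in successors.values() if len(nxt) == 1)
    return forced / len(successors)

# Hypothetical toy text, chosen only so that a few words repeat.
TEXT = "the cat sat on the mat and the cat ate the rat on the mat"
tokens = TEXT.split()

# Map each context length (order) to its share of forced continuations.
fractions = {k: deterministic_fraction(tokens, k) for k in (1, 2, 3)}

for order, fraction in fractions.items():
    print(f"order {order}: {fraction:.2f} of contexts have a single successor")
```

On this toy text the share of forced continuations rises with the order until every context has a single successor, which is the "simple regurgitation" regime described above.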

In addition to the "attention" mechanism that another commenter mentioned, it's important to note that Markov Chains are discrete in their next token prediction while an LLM is more fuzzy. LLMs have latent space where the meaning of a word basically exists as a vector. LLMs will generate token sequences that didn't exist in the source material, whereas Markov Chains will ONLY generate sequences that existed in the source.

This is why it's impossible to create a digital assistant, or really anything useful, via Markov Chain. The fact that they only generate sequences that existed in the source means that they will never come up with anything creative.

replies(12): >>45995946 #>>45996109 #>>45996662 #>>45996887 #>>45996937 #>>45998252 #>>45999650 #>>46000705 #>>46002052 #>>46002754 #>>46004144 #>>46021459 #

johnisgood ◴[20 Nov 25 18:30 UTC] No.45995946[source]▶

>>45995897 #

> The fact that they only generate sequences that existed in the source means that they will never come up with anything creative.

I have seen the argument that an LLM can only give you what it has been trained on, i.e. it will not be "creative" or "revolutionary", that it will not output anything "new", but "only what is in its corpus".

I am quite confused right now. Could you please help me with this?

Somewhat related: I like the work of David Hume, and he explains quite well how we can imagine various creatures, say, a pig with a dragon head, even if we have not seen one ANYWHERE. It is because we can take multiple ideas and combine them together. We know what dragons typically look like, and we know what a pig looks like, and so, we can imagine (through our creativity and combination of these two ideas) what a pig with a dragon head would look like. I wonder how this applies to LLMs, if it even applies.

Edit: to clarify further as to what I want to know: people have been telling me that LLMs cannot solve problems that are not in their training data already. Is this really true or not?

replies(16): >>45996256 #>>45996266 #>>45996274 #>>45996313 #>>45996484 #>>45996757 #>>45997088 #>>45997100 #>>45997291 #>>45997366 #>>45999327 #>>45999540 #>>46001856 #>>46001954 #>>46007347 #>>46017836 #

koliber ◴[20 Nov 25 18:57 UTC] No.45996274[source]▶

>>45995946 #

Here's how I see it, but I'm not sure how valid my mental model is.

Imagine a source corpus that consists of:

Cows are big. Big animals are happy. Some other big animals include pigs, horses, and whales.

A Markov chain can only return verbatim combinations. So it might return "Cows are big animals" or "Are big animals happy".
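A minimal sketch of this mental model, assuming a word-level chain of order 1 and a simplified tokenizer that lowercases words and drops punctuation (so sentence boundaries are ignored). The function names are illustrative and not taken from any tool mentioned in the thread. Strictly speaking the chain can emit sentences that never appeared as a whole, but every adjacent word pair it emits was seen in the corpus.

```python
import random
from collections import defaultdict

CORPUS = ("Cows are big. Big animals are happy. "
          "Some other big animals include pigs, horses, and whales.")

def tokenize(text):
    # Simplification: lowercase and strip punctuation, ignoring sentence boundaries.
    return [word.strip(".,").lower() for word in text.split()]

def build_chain(tokens, order=1):
    """Map each context of `order` words to the list of words that followed it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        chain[tuple(tokens[i:i + order])].append(tokens[i + order])
    return chain

def generate(chain, start, length, rng):
    """Extend `start` by up to `length` words, sampling only observed successors."""
    out = list(start)
    order = len(start)
    for _ in range(length):
        successors = chain.get(tuple(out[-order:]))
        if not successors:  # dead end: this context never had a successor
            break
        out.append(rng.choice(successors))
    return out

tokens = tokenize(CORPUS)
chain = build_chain(tokens, order=1)
seen_bigrams = set(zip(tokens, tokens[1:]))

sample = generate(chain, ("cows",), 6, random.Random(0))
print(" ".join(sample))
```

In this chain "pigs" is only ever followed by "horses", and "horses" only by "and", so a sentence like "pigs and horses are happy" is unreachable: producing it needs a generalization step that a lookup table of observed successors does not have.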

An LLM can get a sense of meaning in these words and can return ideas expressed in the input corpus. So in this case it might say "Pigs and horses are happy". It's not limited to responding with verbatim sequences. It can be seen as a bit more creative.

However, LLMs will not be able to represent ideas that it has not encountered before. It won't be able to come up with truly novel concepts, or even ask questions about them. Humans (some at least) have that unbounded creativity that LLMs do not.

replies(3): >>45996596 #>>45996749 #>>45997780 #

vidarh ◴[20 Nov 25 19:41 UTC] No.45996749[source]▶

>>45996274 #

> However, LLMs will not be able to represent ideas that it has not encountered before. It won't be able to come up with truly novel concepts, or even ask questions about them. Humans (some at least) have that unbounded creativity that LLMs do not.

There's absolutely no evidence to support this claim. It'd require humans to exceed the Turing computable, and we have no evidence that is possible.

replies(3): >>45996979 #>>46001605 #>>46002996 #

1. koliber ◴[20 Nov 25 20:04 UTC] No.45996979[source]▶

>>45996749 #

If you tell me that trees are big, and trees are made of hard wood, I as a human am capable of asking whether trees feel pain. I don't think what you said is false and I am not familiar enough with computational theory to be able to debate it. People occasionally have novel creative insights that do not derive from past experience or knowledge, and that is what I think of when I think of creativity.

Humans created novel concepts like writing literally out of thin air. I like how the book "Guns, Germs, and Steel" describes that novel creative process and contrasts it with a disseminative derivation process.

replies(2): >>45999495 #>>45999976 #

2. vidarh ◴[20 Nov 25 23:45 UTC] No.45999495[source]▶

>>45996979 (TP) #

> People occasionally have novel creative insights that do not derive from past experience or knowledge, and that is what I think of when I think of creativity.

If they are not derived from past experience or knowledge, then unless humans exceed the Turing computable, they would need to be the result of randomness in one form or other. There's absolutely no reason why an LLM can not do that. The only reason a far "dumber" pure random number generator based string generator "can't" do that is because it would take too long to chance on something coherent, but it most certainly would keep spitting out novel things. The only difference is how coherent the novel things are.
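To put a rough number on "too long", here is a back-of-the-envelope sketch. The 27-symbol alphabet (26 letters plus space), the 20-character target and the rate of one billion guesses per second are all illustrative assumptions, not figures from the thread.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def expected_attempts(alphabet_size, length):
    """Expected number of uniform random strings drawn before hitting one specific target.

    Each draw matches with probability 1 / alphabet_size**length, so the
    number of draws is geometric with mean alphabet_size**length.
    """
    return alphabet_size ** length

# Assumed setup: 26 letters plus space, one specific 20-character string.
attempts = expected_attempts(27, 20)

# Assumed throughput of one billion guesses per second.
years = attempts / 1e9 / SECONDS_PER_YEAR

print(f"expected attempts: {attempts:.3e}")
print(f"expected years at 1e9 guesses/s: {years:.3e}")
```

The generator does emit novel strings on almost every draw; the cost is entirely in how rarely one of them is coherent.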

replies(1): >>46000898 #

3. c22 ◴[21 Nov 25 00:53 UTC] No.45999976[source]▶

>>45996979 (TP) #

Wouldn't this insight derive from many past experiences of feeling pain yourself and the knowledge that others feel it too?

4. Jensson ◴[21 Nov 25 03:26 UTC] No.46000898[source]▶

>>45999495 #

> If they are not derived from past experience or knowledge

Every animal is born with intuition, you missed that part.

replies(1): >>46002480 #

5. vidarh ◴[21 Nov 25 08:32 UTC] No.46002480{3}[source]▶

>>46000898 #

So knowledge encoded in the physical structure of the brain.

You're missing the part where unless there is unknown physics going on in the brain that breaks maths as we know it, there is no mechanism for a brain to exceed the Turing computable, in which case any Turing complete system is computationally equivalent to it.

replies(3): >>46002997 #>>46003452 #>>46003870 #

6. ◴[21 Nov 25 10:02 UTC] No.46002997{4}[source]▶

>>46002480 #

7. arowthway ◴[21 Nov 25 11:23 UTC] No.46003452{4}[source]▶

>>46002480 #

Turing machines are deterministic; the brain might not be, because of quantum mechanics. Of course there is no proof that this is related to creativity.

replies(1): >>46003875 #

8. johnisgood ◴[21 Nov 25 12:25 UTC] No.46003870{4}[source]▶

>>46002480 #

This Turing completeness equivalence is misleading. While all Turing-complete systems can theoretically compute the same class of functions, this says nothing about computational complexity, physical constraints, practical achievability in finite time, or the actual algorithms required. A Turing machine that can theoretically simulate a brain does not mean we know how to do it or that it is even feasible. This is like arguing that because weather systems and computers both follow physical laws, you should be able to perfectly simulate weather on your laptop.

Additionally, "No mechanism to exceed Turing computable" is a non-sequitur. Even granting that brains do not perform hypercomputation, this does not support your conclusion that artificial systems are "computationally equivalent" to brains in any practical sense. We would need: (1) complete understanding of brain algorithms, (2) the actual data/weights encoded in neural structures, (3) sufficient computational resources, and (4) correct implementation. None of these follow from Turing completeness alone, I believe.

More importantly, you completely dodged the actual point about intuition. Jensson's point is about evolutionary encoding vs. learned knowledge. Intuition represents millions of years of evolved optimization encoded in brain structure and chemistry. You acknowledge this ("knowledge encoded in physical structure") but then pivot to an irrelevant theoretical CS argument rather than addressing whether we can actually replicate such evolutionary knowledge in artificial systems.

Your original claim was "If they are not derived from past experience or knowledge" which creates a false dichotomy. Animals are born with innate knowledge encoded through evolutionary optimization. This is not learned from individual experience, yet it is still knowledge, specifically, it is millions of years of selection pressure encoded in neural architecture, reflexes, instincts, and cognitive biases.

So, for example: a newborn animal has never experienced a predator but knows to freeze or flee from certain stimuli. It has built-in heuristics for threat assessment, social behavior, spatial reasoning, and countless other domains that cost generations to develop through survival pressure.

Current AI systems lack this evolutionary substrate. They are trained on human data over weeks or months, not evolved over millions of years. We do not even know how to encode this type of knowledge artificially or even fully understand what knowledge is encoded in biological systems. Turing completeness does not bridge this gap any more than it bridges the gap between a Turing machine and actual weather.

Correct me if I'm misinterpreting your argument.

replies(2): >>46009943 #>>46015393 #

9. vidarh ◴[21 Nov 25 12:26 UTC] No.46003875{5}[source]▶

>>46003452 #

Turing machines are deterministic if all their inputs are deterministic, which they do not need to be, and if we allow them to be. Indeed, LLMs are by default not deterministic because we intentionally inject randomness.
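A minimal sketch of where that randomness enters, assuming the common temperature-sampling scheme over next-token scores. The logits below are made up and `sample_token` is a hypothetical helper, not the API of any particular LLM library.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(score - peak) for score in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Made-up scores for a three-token vocabulary.
logits = [2.0, 1.0, 0.5]

# Greedy decoding ignores the random source entirely.
greedy = [sample_token(logits, 0, random.Random(seed)) for seed in range(5)]

# With temperature > 0 the output depends on the random source.
rng = random.Random(42)
sampled = [sample_token(logits, 1.0, rng) for _ in range(10)]

print("greedy: ", greedy)
print("sampled:", sampled)
```

With a seeded pseudo-random generator the whole process is still a deterministic function of its inputs; feeding it a true random source instead changes only the input, not the machine.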

replies(1): >>46004197 #

10. arowthway ◴[21 Nov 25 13:13 UTC] No.46004197{6}[source]▶

>>46003875 #

It doesn't mean we can accurately simulate the brain by swapping its source of nondeterminism with any other PRNG or TRNG. It might just so happen that to simulate ingenuity you have to simulate the universe first.

replies(1): >>46015351 #

11. alansammarone ◴[21 Nov 25 22:46 UTC] No.46009943{5}[source]▶

>>46003870 #

I...I am very interested in this subject. There's a lot to unpack in your comment, but I think it's really pretty simple.

> this does not support your conclusion that artificial systems are "computationally equivalent" to brains in any practical sense.

You're making a point about engineering or practicality, and in that sense, you are absolutely correct.

That's not the most interesting part of the question, however.

> This is like arguing that because weather systems and computers both follow physical laws, you should be able to perfectly simulate weather on your laptop.

Yes, that's exactly what I'd argue, and...hm.. yes, I think that's clearly true. Whether it takes 10 minutes or 10^100 minutes, ~1 or 10^100 human lifetimes to do so is irrelevant. Units (including human lifetimes) are arbitrary, and I think fundamental truths probably won't depend on such arbitrary things as how long a particular collection of atoms in a particular corner of the universe (i.e. humans) happens to be stable for. Ratios are closer to being fundamental, but I digress.

To put it a different way - we think we know what the speed of light is. Traveling at v = 0.1c or at v = (1 - 10^(-100))c are equivalent in a fundamental sense, it's an engineering problem. Now, traveling at v = c...that's very different. That's interesting.

replies(1): >>46016993 #

12. vidarh ◴[22 Nov 25 15:09 UTC] No.46015351{7}[source]▶

>>46004197 #

If the brain does not exceed the Turing computable, then it does mean it is possible to accurately simulate the brain. Not only that, but in that case the brain itself is existence proof that doing so efficiently is possible.

If the brain exceeds the Turing computable, then all bets are off, but we have no evidence to suggest it does, nor that doing so is possible. This was in fact my original argument.

The only viable counter to my argument is demonstrating that there are computable functions outside the Turing computable, and that humans can compute them.

13. vidarh ◴[22 Nov 25 15:16 UTC] No.46015393{5}[source]▶

>>46003870 #

> While all Turing-complete systems can theoretically compute the same class of functions, this says nothing about computational complexity, physical constraints, practical achievability in finite time, or the actual algorithms required.

True. But if the brain is limited to the Turing computable, then the brain itself is existence proof it is possible to do so efficiently. It might require a different architecture, but that is a detail.

Personally I think the fact that we have gotten this far this quickly with brute force suggests that the problem is fairly tractable, but it may in fact turn out to be much harder than we think.

The point is that when people dismiss it as impossible, that is a belief not backed up by any evidence.

> Additionally, "No mechanism to exceed Turing computable" is a non-sequitur. Even granting that brains do not perform hypercomputation, this does not support your conclusion that artificial systems are "computationally equivalent" to brains in any practical sense. We would need: (1) complete understanding of brain algorithms, (2) the actual data/weights encoded in neural structures, (3) sufficient computational resources, and (4) correct implementation. None of these follow from Turing completeness alone, I believe.

Computationally equivalent here refers to any two Turing complete systems being able to compute all functions that the other can, and so on that basis all four of your points are irrelevant to the question I addressed.

> yet it is still knowledge

You claim my statement creates a false dichotomy, but here you concede it is not.

> Current AI systems lack this evolutionary substrate.

That is irrelevant to the question of whether it is possible. That's an engineering problem, not a fundamental limitation.

> Correct me if I'm misinterpreting your argument.

It seems you're arguing difficulty and complexity, while I argued over possibility. Your argument is mostly not relevant to mine for that reason. Most of it is not unreasonable, it just does not say anything about the possibility.

replies(1): >>46015643 #

14. johnisgood ◴[22 Nov 25 15:54 UTC] No.46015643{6}[source]▶

>>46015393 #

You write (as a response to someone else in this thread): "If the brain is limited to the Turing computable, then the brain itself is existence proof it is possible to do so efficiently."

No. The brain is existence proof that that particular physical substrate can achieve intelligence efficiently. A bird is existence proof that flight is possible efficiently, but not that elephants can fly. You are claiming "computational equivalence" means any Turing-complete system can efficiently replicate any other, but this does not follow from Turing's thesis at all.

You say: "Computationally equivalent here refers to any two Turing complete systems being able to compute all functions that the other can."

But then you make claims about replicating brain capabilities. These are different things. A Python interpreter and raw transistors are Turing-equivalent, but we do not conclude Python can efficiently do what transistors do. The abstraction layers, the architecture, the implementation: these all matter for the actual question at hand.

You dismiss the evolutionary substrate: "That is irrelevant to the question of whether it is possible. That's an engineering problem, not a fundamental limitation."

This concedes the key point. You are now admitting current AI systems lack something the brain has (millions of years of encoded optimization), then handwaving it away as "just engineering". But the original discussion was whether LLMs as currently implemented can represent truly novel ideas. You have retreated to arguing about theoretical possibility with complete knowledge and arbitrary resources.

Finally: "It seems you're arguing difficulty and complexity, while I argued over possibility."

Exactly. Your argument has contracted from making claims about actual LLM capabilities to an unfalsifiable position about theoretical possibility. In the sense you are now defending, it is "possible" that monks with abacuses could run Crysis given infinite time and perfect execution. This tells us nothing interesting about whether current LLMs have unbounded creativity.

Perhaps I am misunderstanding your original argument. Could you clarify what your argument is exactly? I want to make sure we are not talking past each other.

replies(1): >>46016921 #

15. vidarh ◴[22 Nov 25 18:18 UTC] No.46016921{7}[source]▶

>>46015643 #

> No. The brain is existence proof that that particular physical substrate can achieve intelligence efficiently.

So in other words, it is existence proof that it can be done efficiently. You arbitrarily applied your false beliefs about what that statement implied.

If you want to claim that we don't have any evidence that it can be done in an arbitrary substrate, then you'd be right, but that is an entirely separate argument I have no interest in.

> You are claiming "computational equivalence" means any Turing-complete system can efficiently replicate any other, but this does not follow from Turing's thesis at all.

I have never in my life made that claim.

I have at times argued I believe that efficiency is "just" an engineering problem, but I have certainly not ever argued that computational equivalence proves that.

Again you are falsely attributing opinions to me I do not hold, and it's frankly offensive that you keep attributing to me things I not only have not said, but do not agree with.

> The abstraction layers, the architecture, the implementation: these all matter for the actual question at hand.

They do not at all matter for the question of whether one architecture is theoretically capable of computing the same as the other, which is what I have argued it is.

> This concedes the key point.

It concedes nothing. It pointed out that my argument was about whether LLMs can be made to "represent ideas that it has not encountered before" and "come up with truly novel concepts".

Those were the claims I stated have no evidence in favour of them. Nothing of what you have written in any of your responses has any relevance to that.

As you concede:

> Exactly.

Then you go on to make another false assertion about what I have said:

> Your argument has contracted from making claims about actual LLM capabilities to an unfalsifiable position about theoretical possibility.

It has done nothing of the sort. You have repeatedly tried to argue against a position I did not take, by repeatedly misrepresenting what I have claimed, as this quoted statement also does.

There is also nothing unfalsifiable about my claim:

Show that humans can compute even a single function outside the Turing computable, and my argument is proven false.

> In the sense you are now defending, it is "possible" that monks with abacuses could run Crysis given infinite time and perfect execution. This tells us nothing interesting about whether current LLMs have unbounded creativity.

This is the only thing I have been defending. It may not be interesting to you, but to me it matters because without it being possible, there is no point in even arguing over whether it is practical.

If said Crysis-executing monks were fundamentally limited in a way that made it impossible for them to execute the steps, then it would be irrelevant whether or not there were ways for them to speed it up (say, by building computers...).

Since I was arguing against someone who denied the possibility that is the only argument I had any reason to make.

> Perhaps I am misunderstanding your original argument. Could you clarify what your argument is exactly? I want to make sure we are not talking past each other.

I told you how you misunderstood my original argument: I've argued over possibility. I've not made any argument about difficulty or complexity.

You've gone on to falsely and rudely claim that my argument has shifted, but it has not.

Here is my first comment in this sub-thread, where I state there is no evidence to support a claim that LLMs "will not be able to represent ideas that it has not encountered before" and won't be able to "come up with truly novel concepts". My original claim didn't even extend to claiming full computational equivalence, because it was not necessary.

https://news.ycombinator.com/item?id=45996749

replies(1): >>46017454 #

16. vidarh ◴[22 Nov 25 18:26 UTC] No.46016993{6}[source]▶

>>46009943 #

Exactly this. I would argue that I believe doing it efficiently is "just engineering", but I would not claim we know that to any reasonable amount of certainty.

I hold beliefs about what LLMs may be capable of that are far stronger than what I argued, but stated only what can be supported by facts for a reason:

That absent evidence we can exceed the Turing computable, we have no reason to believe LLMs can't be trained to "represent ideas that it has not encountered before" or "come up with truly novel concepts".

17. johnisgood ◴[22 Nov 25 19:20 UTC] No.46017454{8}[source]▶

>>46016921 #

I will get back to this later, but I literally quoted you and I replied to what I quoted you said, so you cannot say that I made it up myself when I quoted you verbatim and then responded to that.

In one instance you did say "If the brain is limited to the Turing computable, then the brain itself is existence proof it is possible to do so efficiently.", for example, and I explained why it is not the proof you thought it was.

In any case, no hard feelings. I will get back to you in a minute.
