Ask HN: How are Markov chains so different from tiny LLMs?

I polished a Markov chain generator and trained it on an article by Uri Alon et al. (https://pmc.ncbi.nlm.nih.gov/articles/PMC7963340/).

It generates text that seems to me at least on par with that of tiny LLMs, such as those demonstrated by NanoGPT. Here is an example:

  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM10b_train UriAlon.txt 3
  Training model with order 3...
  Skip-gram detection: DISABLED (order < 5)
  Pruning is disabled
  Calculating model size for JSON export...
  Will export 29832 model entries
  Exporting vocabulary (1727 entries)...
  Vocabulary export complete.
  Exporting model entries...
    Processed 12000 contexts, written 28765 entries (96.4%)...
  JSON export complete: 29832 entries written to model.json
  Model trained and saved to model.json
  Vocabulary size: 1727
  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM9_gen model.json

Aging cell model requires comprehensive incidence data. To obtain such a large medical database of the joints are risk factors. Therefore, the theory might be extended to describe the evolution of atherosclerosis and metabolic syndrome. For example, late‐stage type 2 diabetes is associated with collapse of beta‐cell function. This collapse has two parameters: the fraction of the senescent cells are predicted to affect disease threshold . For each individual, one simulates senescent‐cell abundance using the SR model has an approximately exponential incidence curve with a decline at old ages In this section, we simulated a wide range of age‐related incidence curves. The next sections provide examples of classes of diseases, which show improvement upon senolytic treatment tends to qualitatively support such a prediction. model different disease thresholds as values of the disease occurs when a physiological parameter ϕ increases due to the disease. Increasing susceptibility parameter s, which varies about 3‐fold between BMI below 25 (male) and 54 (female) are at least mildly age‐related and 25 (male) and 28 (female) are strongly age‐related, as defined above. Of these, we find that 66 are well described by the model as a wide range of feedback mechanisms that can provide homeostasis to a half‐life of days in young mice, but their removal rate slows down in old mice to a given type of cancer have strong risk factors should increase the removal rates of the joint that bears the most common biological process of aging that governs the onset of pathology in the records of at least 104 people, totaling 877 disease category codes (See SI section 9), increasing the range of 6–8% per year. The two‐parameter model describes well the strongly age‐related ICD9 codes: 90% of the codes show R 2 > 0.9) (Figure 4c). 
This agreement is similar to that of the previously proposed IMII model for cancer, major fibrotic diseases, and hundreds of other age‐related disease states obtained from 10−4 to lower cancer incidence. A better fit is achieved when allowing to exceed its threshold mechanism for classes of disease, providing putative etiologies for diseases with unknown origin, such as bone marrow and skin. Thus, the sudden collapse of the alveoli at the outer parts of the immune removal capacity of cancer. For example, NK cells remove senescent cells also to other forms of age‐related damage and decline contribute (De Bourcy et al., 2017). There may be described as a first‐passage‐time problem, asking when mutated, impair particle removal by the bronchi and increase damage to alveolar cells (Yang et al., 2019; Xu et al., 2018), and immune therapy that causes T cells to target senescent cells (Amor et al., 2020). Since these treatments are predicted to have an exponential incidence curve that slows at very old ages. Interestingly, the main effects are opposite to the case of cancer growth rate to removal rate We next consider the case of frontline tissues discussed above.

Sohcahtoa82 ◴[20 Nov 25 18:26 UTC] No.45995897[source]▶

>>45958004 (OP) #

A Markov Chain trained on only a single article of text will very likely just regurgitate entire sentences straight from the source material. There just isn't enough variation in sentences.

But then, Markov Chains fall apart when the source material is very large. Try training a chain based on Wikipedia. You'll find that the resulting output becomes incoherent garbage. Increasing the context length may increase coherence, but at the cost of turning into just simple regurgitation.

In addition to the "attention" mechanism that another commenter mentioned, it's important to note that Markov Chains are discrete in their next token prediction while an LLM is more fuzzy. LLMs have latent space where the meaning of a word basically exists as a vector. LLMs will generate token sequences that didn't exist in the source material, whereas Markov Chains will ONLY generate sequences that existed in the source.

This is why it's impossible to create a digital assistant, or really anything useful, via Markov Chain. The fact that they only generate sequences that existed in the source means that they will never come up with anything creative.
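The trade-off described above between context length and regurgitation can be measured on a toy example. This is a minimal sketch, not the OP's generator: the word-level tokenization, the sample sentence, and the helper names `train`, `forced_fraction`, and `generate` are all assumptions made for illustration. It counts how many contexts leave the chain no choice at all, and it relies on the fact that every window of `order + 1` tokens in the output is copied from the source, even though longer stretches can be new combinations.

```python
import random
from collections import defaultdict

def train(tokens, order):
    """Map each context (tuple of `order` tokens) to the tokens that followed it."""
    model = defaultdict(list)
    for i in range(len(tokens) - order):
        model[tuple(tokens[i:i + order])].append(tokens[i + order])
    return model

def forced_fraction(model):
    """Fraction of contexts that have exactly one distinct continuation."""
    forced = sum(1 for nxt in model.values() if len(set(nxt)) == 1)
    return forced / len(model)

def generate(model, order, length, rng):
    """Random walk over the chain; stops early at a context with no continuation."""
    out = list(rng.choice(list(model)))
    while len(out) < length:
        nxt = model.get(tuple(out[-order:]))
        if not nxt:
            break
        out.append(rng.choice(nxt))
    return out

# Hypothetical toy source text, split on whitespace.
tokens = "the cat sat on the mat and the cat ate the rat".split()
fractions = {k: forced_fraction(train(tokens, k)) for k in (1, 2, 3)}
print(fractions)
```

On this toy text the forced fraction rises from 5/7 at order 1 to 1.0 at order 3, at which point the chain can only replay the source. A larger corpus pushes that point to higher orders but does not remove it.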

replies(12): >>45995946 #>>45996109 #>>45996662 #>>45996887 #>>45996937 #>>45998252 #>>45999650 #>>46000705 #>>46002052 #>>46002754 #>>46004144 #>>46021459 #

johnisgood ◴[20 Nov 25 18:30 UTC] No.45995946[source]▶

>>45995897 #

> The fact that they only generate sequences that existed in the source means that they will never come up with anything creative.

I have seen the argument that LLMs can only give you what they have been trained on, i.e. that they will not be "creative" or "revolutionary", that they will not output anything "new", but "only what is in their corpus".

I am quite confused right now. Could you please help me with this?

Somewhat related: I like the work of David Hume, and he explains quite well how we can imagine various creatures, say, a pig with a dragon head, even if we have not seen one ANYWHERE. It is because we can take multiple ideas and combine them. We know what dragons typically look like, and we know what a pig looks like, and so we can imagine (through our creativity and the combination of these two ideas) what a pig with a dragon head would look like. I wonder how this applies to LLMs, if it applies at all.

Edit: to clarify further what I want to know: people have been telling me that LLMs cannot solve problems that are not already in their training data. Is this really true or not?

replies(16): >>45996256 #>>45996266 #>>45996274 #>>45996313 #>>45996484 #>>45996757 #>>45997088 #>>45997100 #>>45997291 #>>45997366 #>>45999327 #>>45999540 #>>46001856 #>>46001954 #>>46007347 #>>46017836 #

koliber ◴[20 Nov 25 18:57 UTC] No.45996274[source]▶

>>45995946 #

Here's how I see it, but I'm not sure how valid my mental model is.

Imagine a source corpus that consists of:

Cows are big. Big animals are happy. Some other big animals include pigs, horses, and whales.

A Markov chain can only return verbatim combinations. So it might return "Cows are big animals" or "Some other big animals are happy".

An LLM can get a sense of meaning in these words and can return ideas expressed in the input corpus. So in this case it might say "Pigs and horses are happy". It's not limited to responding with verbatim sequences. It can be seen as a bit more creative.

However, LLMs will not be able to represent ideas that they have not encountered before. They won't be able to come up with truly novel concepts, or even ask questions about them. Humans (some at least) have that unbounded creativity that LLMs do not.
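The Markov half of this mental model can be checked directly on the three-sentence corpus. A rough sketch, assuming an order-1 word-level chain, lowercased words with punctuation dropped, and transitions allowed across sentence boundaries; the helper names `words` and `reachable` are made up for the example.

```python
import re

corpus = ("Cows are big. Big animals are happy. "
          "Some other big animals include pigs, horses, and whales.")

def words(text):
    """Lowercase and keep only alphabetic words (an assumed tokenization)."""
    return re.findall(r"[a-z]+", text.lower())

tokens = words(corpus)
# Every adjacent word pair seen in the corpus; an order-1 chain can only follow these.
bigrams = set(zip(tokens, tokens[1:]))

def reachable(sentence):
    """True if every adjacent word pair of `sentence` occurs in the corpus."""
    w = words(sentence)
    return all(pair in bigrams for pair in zip(w, w[1:]))

print(reachable("Cows are big animals"))       # built only from seen word pairs
print(reachable("Pigs and horses are happy"))  # needs the unseen pair ("pigs", "and")
```

Under these assumptions the first sentence is reachable and the second is not: the chain can splice seen word pairs into a sentence that never appeared, but it has no way to carry "happy" over to pigs and horses.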

replies(3): >>45996596 #>>45996749 #>>45997780 #

vidarh ◴[20 Nov 25 19:41 UTC] No.45996749[source]▶

>>45996274 #

> However, LLMs will not be able to represent ideas that they have not encountered before. They won't be able to come up with truly novel concepts, or even ask questions about them. Humans (some at least) have that unbounded creativity that LLMs do not.

There's absolutely no evidence to support this claim. It'd require humans to exceed the Turing computable, and we have no evidence that is possible.

replies(3): >>45996979 #>>46001605 #>>46002996 #

1. somenameforme ◴[21 Nov 25 05:47 UTC] No.46001605[source]▶

>>45996749 #

Turing computability is tangential to his claim, as LLMs are obviously not carrying out the breadth of all computable concepts. His claim can be trivially proven by considering the history of humanity. We went from a starting point of having literally no language whatsoever, and technology that would not have expanded much beyond an understanding of 'poke him with the pointy side'. And from there we would go on to discover the secrets of the atom, put a man on the Moon, and more. To say nothing of inventing language itself.

An LLM trained on this starting state of humanity is never going to do anything except remix basically nothing. It's never going to discover the secrets of the atom, or how to put a man on the Moon. Now whether any artificial device could achieve what humans did is where the question of computability comes into play, and that's a much more interesting one. But if we limit ourselves to LLMs, then this is very straightforward to answer.

replies(1): >>46002500 #

2. vidarh ◴[21 Nov 25 08:35 UTC] No.46002500[source]▶

>>46001605 (TP) #

> Turing computability is tangential to his claim, as LLMs are obviously not carrying out the breadth of all computable concepts

They don't need to. To be Turing complete a system including an LLM needs to be able to simulate a 2-state 3-symbol Turing machine (or the inverse). Any LLM with a loop can satisfy that.
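The shape of the "with a loop" construction can be sketched as an outer driver that repeatedly asks a fixed step function what to write, where to move, and which state comes next. In the sketch below a lookup table stands in for the model. The table is a hypothetical binary incrementer chosen for readability, not the 2-state 3-symbol machine, and the sketch says nothing about how reliably a real LLM would reproduce such a table.

```python
from collections import defaultdict

# Hypothetical transition table standing in for the model's next-step function:
# (state, symbol) -> (symbol to write, head move, next state).
# This machine adds 1 to a binary number; the head starts on the rightmost bit.
RULES = {
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", "_"): ("1", 0, "halt"),
}

def run(bits, max_steps=1000):
    """Drive the step function in a loop over a tape that grows on demand."""
    tape = defaultdict(lambda: "_", enumerate(bits))  # "_" is the blank symbol
    head, state = len(bits) - 1, "carry"
    for _ in range(max_steps):
        if state == "halt":
            break
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += move
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape[i] for i in cells).strip("_")

print(run("1011"))
```

The step function is finite and fixed. The tape lives outside it and grows as needed, so the loop and the external memory are what carry the computation forward.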

If you think Turing computability is tangential to this claim, you don't understand the implications of Turing computability.

> His claim can be trivially proven by considering the history of humanity.

Then show me a single example of humans demonstrably exceeding the Turing computable.

We don't even know any way for that to be possible.

replies(2): >>46002701 #>>46070304 #

3. somenameforme ◴[21 Nov 25 09:11 UTC] No.46002701[source]▶

>>46002500 #

This is akin to claiming that a tic-tac-toe game is turing complete since after all we could simply just modify it to make it not a tic tac toe game. It's not exactly a clever argument.

And again there are endless things that seem to reasonably defy turing computability except when you assume your own conclusion. Going from nothing, not even language, to richly communicating, inventing things with no logical basis for such, and so on is difficult to even conceive as a computable process unless again you simply assume that it must be computable. For a more common example that rapidly enters into the domain of philosophy - there is the nature of consciousness.

It's impossible to prove that such is Turing computable because you can't even prove consciousness exists. The only way I know it exists is because I'm most certainly conscious, and I assume you are too, but you can never prove that to me, any more than I could ever prove I'm conscious to you. And so now we enter into the domain of trying to computationally imagine something which you can't even prove exists; it's all just a complete nonstarter.

-----

I'd also add here that I think the current consensus among those in AI is implicit agreement with this issue. If we genuinely wanted AGI it would make vastly more sense to start from as little as possible because it'd ostensibly reduce computational and other requirements by many orders of magnitude, and we could likely also help create a more controllable and less biased model by starting from a bare minimum of first principles. And there's potentially trillions of dollars for anybody that could achieve this. Instead, we get everything dumped into token prediction algorithms which are inherently limited in potential.

replies(1): >>46002903 #

4. vidarh ◴[21 Nov 25 09:45 UTC] No.46002903{3}[source]▶

>>46002701 #

> This is akin to claiming that a tic-tac-toe game is turing complete since after all we could simply just modify it to make it not a tic tac toe game. It's not exactly a clever argument.

No, it is nowhere remotely like that. It is claiming that a machine capable of running a Turing machine is in fact capable of running any other Turing machine. In other words, it is pointing out the principle of Turing equivalence.

> And again there are endless things that seem to reasonably defy turing computability

Show us one. We have no evidence of any single one.

> It's impossible to prove that such is Turing computable because you can't even prove consciousness exists.

Unless you can show that humans exceed the Turing computable, "consciousness" however you define it is either possible purely with a Turing complete system or cannot affect the outputs of such a system. In either case this argument is irrelevant unless you can show evidence we exceed the Turing computable.

> I'd also add here that I think the current consensus among those in AI is implicit agreement with this issue. If we genuinely wanted AGI it would make vastly more sense to start from as little as possible because it'd ostensibly reduce computational and other requirements by many orders of magnitude, and we could likely also help create a more controllable and less biased model by starting from a bare minimum of first principles. And there's potentially trillions of dollars for anybody that could achieve this. Instead, we get everything dumped into token prediction algorithms which are inherently limited in potential.

This is fundamentally failing to engage with the argument. There is nothing in the argument that tells us anything about the complexity of a solution to AGI.

replies(1): >>46003796 #

5. somenameforme ◴[21 Nov 25 12:15 UTC] No.46003796{4}[source]▶

>>46002903 #

LLMs are not capable of simulating turing machines - their output is inherently and inescapably probabilistic. You would need to fundamentally rewrite one to make this possible, at which point it is no longer an LLM.

And as I stated, you are assuming your own conclusion to debate the issue. You believe that nothing is incomputable, and are tying that assumption into your argument as an assumption. It's not on me to prove your assumption is wrong, it's on you to prove that it's correct - proving a negative is impossible. E.g. - I'm going to assume that there is an invisible green massless goblin on your shoulder named Kyzirgurankl. Prove me wrong. Can you give me even the slightest bit of evidence against it? Of course you cannot, yet absence of evidence is not evidence of absence, so the burden of my claim rests on me.

And so now feel free to prove that consciousness is computable, or even replicating humanity's successes from a comparable baseline. Without that proof you must understand that you're not making some falsifiable claim of fact, but simply appealing to your own personal ideology or philosophy, which is of course completely fine (and even a good thing), but also a completely subjective opinion on matters.

replies(2): >>46006424 #>>46015493 #

6. johnisgood ◴[21 Nov 25 17:13 UTC] No.46006424{5}[source]▶

>>46003796 #

After having read your comment, I feel I should have left my comment under this thread. I will just refer to it instead: https://news.ycombinator.com/item?id=46003870. This was my reply to your parent. I agree with you.

7. vidarh ◴[22 Nov 25 15:31 UTC] No.46015493{5}[source]▶

>>46003796 #

> LLMs are not capable of simulating turing machines - their output is inherently and inescapably probabilistic.

This is fundamentally not true. Inference code written to be numerically stable and temperature set to 0 is all you need for an LLM to be entirely deterministic.

> And as I stated, you are assuming your own conclusion to debate the issue. You believe that nothing is incomputable, and are tying that assumption into your argument as an assumption.

This is also categorically false. Please do not attribute to me a position that I have at no point in my life claimed. I believe plenty of things are incomputable. That is provably the case. What I have repeatedly said is that we have no evidence to show 1) that there are computable functions that exceed the Turing computable, or 2) that the brain is capable of computing such functions.

If you have evidence of either of those two, please do feel free to provide it - it would be earth-shattering news. It'd revolutionise physics, as it'd involve unknown interactions, it'd revolutionise maths and computer science by forcing us to throw out areas of theory of computation.

> It's not on me to prove your assumption is wrong, it's on you to prove that it's correct - proving a negative is impossible.

That the Turing computable set of functions is the totality of computable functions is not a claim I've come up with.

If you want to make the extraordinary claim that there are computable functions outside that, despite no extant evidence, then, since you've invoked a weird version of Russell's teapot, that requires extraordinary proof.

And it is not impossible: A single example of a computable function outside the Turing computable would falsify the underlying claim. A single example of humans being able to compute such a function would falsify the claim that Turing equivalence has relevance here.

I've been very careful throughout to make clear that my argument hinges on humans being unable to exceed the Turing computable.

I don't believe we should talk in absolutes when we can't prove it, hence my challenge to the people here who are so absolutely certain about the limitations of LLMs to show just a single example of humans exceeding the Turing computable.

Because you are so certain, then surely there lies something behind that certainty other than blind faith?

> And so now feel free to prove that consciousness is computable

At no point have I made claims about "consciousness". Before I'd do that, you'd need to define in an objective way what you mean. It is an entirely separate question from the ones I've addressed.

> Without that proof you must understand that you're not making some falsifiable claim of fact

As noted, my claims are falsifiable: Show a single example of a function that exceeds the Turing computable, that humans can compute.

replies(1): >>46018270 #

8. somenameforme ◴[22 Nov 25 21:07 UTC] No.46018270{6}[source]▶

>>46015493 #

Setting the temperature to 0 doesn't make an LLM non-probabilistic. Once again, LLMs are inherently probabilistic. All setting the temperature to 0 does is make it always choose the highest probability token instead of using a weighted randomization. You'll still get endless hallucinations and the same inherent limitations, including the inability to reliably simulate a turing machine.
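The mechanism both sides are describing fits in a few lines. This is a minimal sketch with made-up scores for a three-token vocabulary, not any particular library's sampler; the function name `next_token` is an assumption. The model still produces a score for every token, and temperature only changes how one of them is picked.

```python
import math
import random

def next_token(logits, temperature, rng):
    """Pick a token index from raw scores; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for a 3-token vocabulary
rng = random.Random(0)
greedy = {next_token(logits, 0, rng) for _ in range(100)}
sampled = {next_token(logits, 1.0, rng) for _ in range(1000)}
print(greedy, sampled)
```

At temperature 0 the same input always yields the same token, so the selection step is deterministic. Whether the highest-scoring token is also the correct one is a separate question, and the sketch does not address it.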

As for the rest of your post, you are again, consciously or not, trying to say 'give me a calculable function that isn't a calculable function.' I obviously agree that the idea of trying to 'calculate' consciousness is essentially a non-starter. That's precisely the point.

9. vrighter ◴[27 Nov 25 15:45 UTC] No.46070304[source]▶

>>46002500 #

"To be Turing complete a system including an LLM need to be able to simulate a 2-state 3-symbol Turing machine (or the inverse)."

And infinite memory. You forgot the infinite memory. And LLMs are extremely inefficient with memory. I'm not talking about the memory needed in the GPU to store the weights, but rather the ability of an LLM to remember whatever it's working on at the moment.

What could be stored as a couple of bits in a temporary variable is usually output as "Step 3: In the previous step we frobbed the junxer and got junx, and if you do junx + flibbity you get floopity"

And remember that this takes up a bunch of tokens. Without doing this (whether the LLM provider decides to let you see it or not, but still bill you for it), an LLM can't possibly execute an algorithm that requires iteration in the general case. For a more rigorous example, check Apple's paper where an LLM failed to solve a Tower of Hanoi problem even when it had the exact algorithm to do so in context (apart from small instances of the problem for which the solution is available countless times).
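For a sense of scale, here is the textbook recursive Tower of Hanoi solver. It is offered as an illustration only and is not necessarily the algorithm given in the paper's prompt; the peg names and the function name `hanoi` are assumptions. The point is the move count: solving n disks takes 2^n - 1 moves, so a model that has to write every move out as tokens needs output that grows exponentially with the number of disks.

```python
def hanoi(n, src="A", dst="C", via="B"):
    """Yield the moves that transfer n disks from src to dst."""
    if n == 0:
        return
    yield from hanoi(n - 1, src, via, dst)  # clear the n-1 smaller disks out of the way
    yield (src, dst)                        # move the largest disk
    yield from hanoi(n - 1, via, dst, src)  # stack the smaller disks back on top

for n in (3, 10, 15):
    moves = sum(1 for _ in hanoi(n))
    print(n, moves, moves == 2**n - 1)
```

A conventional program tracks its position in this recursion with a call stack only n frames deep. A model working purely in text has to keep the equivalent state in its context window instead.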
