204 points JPLeRouzic | 1 comments | 17 Nov 25 20:36 UTC | HN request time: 0s | source

I polished a Markov chain generator and trained it on an article by Uri Alon and al (https://pmc.ncbi.nlm.nih.gov/articles/PMC7963340/).

It generates text that seems to me at least on par with tiny LLMs, such as demonstrated by NanoGPT. Here is an example:

  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$
  ./SLM10b_train UriAlon.txt 3
  
  Training model with order 3...
  
  Skip-gram detection: DISABLED (order < 5)
  
  Pruning is disabled
  
  Calculating model size for JSON export...
  
  Will export 29832 model entries
  
  Exporting vocabulary (1727 entries)...
  
  Vocabulary export complete.
  
  Exporting model entries...
  
    Processed 12000 contexts, written 28765 entries (96.4%)...
  
  JSON export complete: 29832 entries written to model.json
  
  Model trained and saved to model.json
  
  Vocabulary size: 1727
  
  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM9_gen model.json

Aging cell model requires comprehensive incidence data. To obtain such a large medical database of the joints are risk factors. Therefore, the theory might be extended to describe the evolution of atherosclerosis and metabolic syndrome. For example, late‐stage type 2 diabetes is associated with collapse of beta‐cell function. This collapse has two parameters: the fraction of the senescent cells are predicted to affect disease threshold . For each individual, one simulates senescent‐cell abundance using the SR model has an approximately exponential incidence curve with a decline at old ages In this section, we simulated a wide range of age‐related incidence curves. The next sections provide examples of classes of diseases, which show improvement upon senolytic treatment tends to qualitatively support such a prediction. model different disease thresholds as values of the disease occurs when a physiological parameter ϕ increases due to the disease. Increasing susceptibility parameter s, which varies about 3‐fold between BMI below 25 (male) and 54 (female) are at least mildly age‐related and 25 (male) and 28 (female) are strongly age‐related, as defined above. Of these, we find that 66 are well described by the model as a wide range of feedback mechanisms that can provide homeostasis to a half‐life of days in young mice, but their removal rate slows down in old mice to a given type of cancer have strong risk factors should increase the removal rates of the joint that bears the most common biological process of aging that governs the onset of pathology in the records of at least 104 people, totaling 877 disease category codes (See SI section 9), increasing the range of 6–8% per year. The two‐parameter model describes well the strongly age‐related ICD9 codes: 90% of the codes show R 2 > 0.9) (Figure 4c). This agreement is similar to that of the previously proposed IMII model for cancer, major fibrotic diseases, and hundreds of other age‐related disease states obtained from 10−4 to lower cancer incidence. A better fit is achieved when allowing to exceed its threshold mechanism for classes of disease, providing putative etiologies for diseases with unknown origin, such as bone marrow and skin. Thus, the sudden collapse of the alveoli at the outer parts of the immune removal capacity of cancer. For example, NK cells remove senescent cells also to other forms of age‐related damage and decline contribute (De Bourcy et al., 2017). There may be described as a first‐passage‐time problem, asking when mutated, impair particle removal by the bronchi and increase damage to alveolar cells (Yang et al., 2019; Xu et al., 2018), and immune therapy that causes T cells to target senescent cells (Amor et al., 2020). Since these treatments are predicted to have an exponential incidence curve that slows at very old ages. Interestingly, the main effects are opposite to the case of cancer growth rate to removal rate We next consider the case of frontline tissues discussed above.

Show context

dragonwriter ◴[20 Nov 25 23:05 UTC] No.45999104[source]▶

>>45958004 (OP) #

I think, strictly speaking, autoregressive LLMs are markov chains of a very high order.

The trick (aside from the order) is the training process by which they are derived from their source data. Simply enumerating the states and transitions in the source data and the probability of each transition from each state in the source doesn’t get you an LLM.

replies(2): >>46001522 #>>46002393 #

krackers ◴[21 Nov 25 05:29 UTC] No.46001522[source]▶

>>45999104 #

I always like to think LLMs are markov models in the way that real-world computers are finite state machines. It's technically true, but not a useful abstraction at which to analyze them.

Both LLMs and n-gram models satisfy the markov property, and you could in principle go through and compute explicit transition matrices (something on the size of vocab_size*context_size I think). But LLMs aren't trained as n-gram models, so besides giving you autoregressive-ness, there's not really much you can learn by viewing it as a markov model

replies(1): >>46006177 #

dragonwriter ◴[21 Nov 25 16:52 UTC] No.46006177[source]▶

>>46001522 #

> Both LLMs and n-gram models satisfy the markov property, and you could in principle go through and compute explicit transition matrices (something on the size of vocab_size*context_size I think).

Isn’t it actually (vocab_size)^(context_size)?

replies(1): >>46008420 #

1. krackers ◴[21 Nov 25 20:10 UTC] No.46008420{3}[source]▶

>>46006177 #

Yes, you're right. I typed "**" (exponentiation) but HN ate the second star since I forgot to escape.

↑

Ask HN: How are Markov chains so different from tiny LLMs?