
279 points bookofjoe | 38 comments
1. biotechbio ◴[] No.44609723[source]
Some thoughts on this as someone working on circulating-tumor DNA for the last decade or so:

- Sure, cancer can develop years before diagnosis. Pre-cancerous clones harboring somatic mutations can exist for decades before transformation into malignant disease.

- The eternal challenge in ctDNA is achieving a "useful" sensitivity and specificity. For example, imagine you take some of your blood, extract the DNA floating in the plasma, hybrid-capture enrich for DNA in cancer driver genes, sequence super deep, call variants, do some filtering to remove noise and whatnot, and then you find some low allelic fraction mutations in TP53. What can you do about this? I don't know. Many of us have background somatic mutations speckled throughout our body as we age. Over age ~50, most of us are liable to have some kind of pre-cancerous clones in the esophagus, prostate, or blood (due to CHIP). Many of the popular MCED tests (e.g. Grail's Galleri) use signals other than mutations (e.g. methylation status) to improve this sensitivity / specificity profile, but I'm not convinced it's actually good enough to be useful at the population level.

- The cost-effectiveness of most follow-on screening is not viable for the given sensitivity-specificity profile of MCED assays (Grail would disagree). To make it viable, downstream screening would need to be drastically cheaper, or we'd need a tiered non-invasive screening strategy with increasing specificity (e.g. Harbinger Health).

replies(10): >>44610108 #>>44610191 #>>44610539 #>>44610565 #>>44610758 #>>44611096 #>>44611258 #>>44612058 #>>44612114 #>>44612576 #
2. tptacek ◴[] No.44610108[source]
This seems like yet another place where the base rate is going to fuck us: intuitively (and you've actually thought about this problem and I haven't) I'd expect that even with remarkably good tests, most people who come up positive will not go on to develop related disease.
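The base-rate worry can be made concrete with a back-of-the-envelope positive predictive value calculation (all numbers below are illustrative, not from any real assay):

```python
# Positive predictive value of a screening test via Bayes' rule.
# Even an excellent test mostly flags healthy people when the
# condition is rare in the screened population.
def ppv(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# A 99%-sensitive, 99%-specific test with 0.5% prevalence:
# roughly two out of three positives are false alarms.
print(round(ppv(0.99, 0.99, 0.005), 3))  # → 0.332
```

So even a "remarkably good" test, applied to a general population, produces mostly false positives; the same test in a high-prevalence group (say, symptomatic patients) performs far better.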
replies(2): >>44610187 #>>44610628 #
3. rscho ◴[] No.44610187[source]
Ideally, you'd want a test (or two sequential ones) that's both very sensitive (rule candidates in) and specific (rule healthy peeps out). But that's only the first step, because there's no point knowing you're sick (from the populational and economic pov) if you can't do something useful about it. So you also have to include downstream tests and treatments in your assessment and all this suddenly becomes a very intricate probability network needing lots of data and thinking before decisions are made. And then, there's politics...
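The two-sequential-tests idea falls straight out of Bayes: the posterior after a positive on the first (cheap, sensitive) test becomes the prior for the second (expensive, specific) one. A toy sketch with made-up operating characteristics:

```python
def posterior(prior, sensitivity, specificity):
    """Probability of disease after one positive result (Bayes' rule)."""
    p_pos = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_pos

p = 0.005                       # population prevalence
p = posterior(p, 0.95, 0.95)    # cheap rule-in screen → ~0.09
p = posterior(p, 0.90, 0.99)    # specific confirmatory test → ~0.9
```

Two mediocre tests chained this way take you from 0.5% to roughly 90% probability, which is why tiered screening strategies are attractive on paper; the hard part, as noted above, is whether anything useful can be done downstream.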
4. ada1981 ◴[] No.44610191[source]
It could motivate to shift to plant based diet; start meditating; stop drinking; begin regular 5-7 day fasts; etc.
5. zaptheimpaler ◴[] No.44610539[source]
I guess the problem is a mismatch between detection capability and treatment capability? We seem to be getting increasingly good at detecting precancerous states but we don't have corresponding precancer treatments, just the regular cancer treatments like chemo or surgery which are a big hit to quality of life, expensive, harmful etc.

Like if we had some kind of prophylactic cancer treatment that was easy/cheap/safe enough to recommend to people even on mild suspicion of cancer with false positives, we could offer it to positive tests. Maybe even just lifestyle interventions if those are proven to work. That's probably very difficult though, just dreaming out loud.

replies(4): >>44611466 #>>44611649 #>>44611948 #>>44612413 #
6. eps ◴[] No.44610565[source]
> due to CHIP

What is CHIP?

replies(2): >>44610701 #>>44610710 #
7. Spooky23 ◴[] No.44610628[source]
You might be able to target and preemptively treat some aggressive cancers!

I lost my wife to melanoma that metastasized to her brain after cancerous mole and margin was removed 4 years earlier. They did due diligence and by all signs there was no evidence of recurrence, until there was. They think that the tumor appeared 2-3 months before symptoms (headaches) appeared, so it was unlikely that you’d discover it otherwise.

With something like this, maybe you could get lower dose immunotherapy that would help your body eradicate the cancer?

replies(1): >>44611094 #
8. biotechbio ◴[] No.44610701[source]
https://en.wikipedia.org/wiki/Clonal_hematopoiesis
9. bglazer ◴[] No.44610710[source]
Clonal hematopoiesis of indeterminate potential.

It’s when bone marrow cells acquire mutations and expand to take up a noticeable proportion of all your bone marrow cells, but they’re not fully malignant, expanding out of control.

10. ajb ◴[] No.44610758[source]
Here's what may seem like an unrelated question in response: how can we get 10^7+ bits of information out of the human body every day?

There are a lot of companies right now trying to apply AI to health, but what they are ignoring is that there are orders of magnitude less health data per person than there are cat pictures. (My phone probably contains 10^10 bits of cat pictures and my health record probably 10^3 bits, if that). But it's not wrong to try to apply AI, because we know that all processes leak information, including biological ones; and ML is a generic tool for extracting signal from noise, given sufficient data.

But our health information gathering systems are engineered to deal with individual very specific hypotheses generated by experts, which require high quality measurements of specific individual metrics that some expert, such as yourself, has figured may be relevant. So we get high quality data, in very small quantities - a few bits per measurement.

Suppose you invent a new cheap sensor for extracting large (10^7+ bits/day) quantities of information about human biochemistry, perhaps from excretions, or blood. You run a longitudinal study collecting this information from a cohort and start training a model to predict every health outcome.

What are the properties of the bits collected by such a sensor that would make such a process likely to work out? The bits need to be "sufficiently heterogeneous" (but not necessarily independent) and their indexes need to be sufficiently stable (in some sense). What is not required is for specific individual data items to be measured with high quality, because some information about the signal we're interested in (even though we don't know exactly what it is) will leak into the other measurements.

I predict that designs for such sensors, which cheaply perform large numbers of low-quality measurements, would result in breakthroughs in detection and treatment, by allowing ML to be applied to the problem effectively.
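The statistical intuition behind "many cheap, noisy measurements beat one careful one" is just that averaging N independent noisy readings shrinks the noise by sqrt(N). A toy simulation (not a model of any real biosensor; the effect size and threshold are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = 0.1          # tiny biological effect buried in unit-variance noise
n_trials = 2000

def detect(n_measurements):
    """Fraction of trials where averaging noisy readings reveals the signal."""
    x = signal + rng.normal(0.0, 1.0, size=(n_trials, n_measurements))
    return float(np.mean(x.mean(axis=1) > 0.05))  # crude threshold detector

# One high-noise measurement is barely better than a coin flip;
# ten thousand cheap ones detect the effect essentially every time.
print(detect(1), detect(10_000))
```

This is the sense in which individual measurement quality doesn't matter much, provided the noise is not perfectly correlated across measurements.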

replies(5): >>44610815 #>>44610880 #>>44611051 #>>44612179 #>>44612833 #
11. gleenn ◴[] No.44610815[source]
Someone should add a sensor to all those diabetes sensors people have in their arms all day and collect general info. It would obviously bias towards diabetics but that's like half the US population anyways so maybe it wouldn't matter that much.
12. rscho ◴[] No.44610880[source]
Last time someone tried to inject chips into the bloodstream, public opinion didn't handle it too well. It's like how we would learn a lot by being more cruel to research animals, but most people have other priorities. Good or bad? Who knows? Research meets social constructs.
replies(2): >>44610914 #>>44610920 #
13. ◴[] No.44610914{3}[source]
14. ajb ◴[] No.44610920{3}[source]
I am not proposing injecting chips.
replies(1): >>44611015 #
15. rscho ◴[] No.44611015{4}[source]
Apart from the likely technical infeasibility of your idea in today's society, this would require a humongous and diversified population sample to be meaningful (your 'heterogeneous bits'). This follows directly from the complexity of metabolic pathways you wish to analyze. Socially, you'll only be able to achieve that by not asking your sample for consent. Otherwise you'll have a highly biased sample, which could still be useful but for severely restricted research questions.
replies(1): >>44611222 #
16. standingca ◴[] No.44611051[source]
Or perhaps even routine bloodwork could incorporate some form of sequencing and longitudinal data banking. Deep sequencing, which may still be too expensive, generates tons of data that can be useful for things we don't even know to look for today. Capturing this data could let us retroactively identify meaningful biomarkers or early signals when we have better techniques. That way, each time models/methods improve, prior data becomes newly valuable. Perhaps the same could be said of raw data/readings from instruments running standard tests as well (as opposed to just the final results).

I'd be really curious to see how longitudinal results of sequencing + data banking, plus other routine bloodwork, could lead to early detection and better health outcomes.

17. tptacek ◴[] No.44611094{3}[source]
I'm so sorry about your wife.

Literally anything that reduces cancer deaths is a win. I'm certainly not campaigning against early detection tests like this! Just talking about a challenge that comes up operationalizing them.

18. ethan_smith ◴[] No.44611096[source]
The sensitivity challenge is compounded by the signal-to-noise ratio problem at ultra-low allelic fractions (<0.1%), where technical artifacts from library preparation and sequencing can mask true variants.
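To see why ultra-low allelic fractions are so hard, compare expected read counts under a simple binomial model (the error rates below are illustrative round numbers, not benchmarks of any platform):

```python
from math import comb

def p_ge_k(depth, p, k):
    """Binomial tail: P(at least k supporting reads at one locus)."""
    return 1.0 - sum(comb(depth, i) * p**i * (1 - p)**(depth - i)
                     for i in range(k))

depth = 10_000  # ultra-deep sequencing at a single position

# With a 0.1% raw error rate, noise alone readily produces the
# ~10 supporting reads you'd expect from a true 0.1% AF variant:
print(round(p_ge_k(depth, 0.001, 10), 2))   # roughly a coin flip

# Error-corrected consensus calling (e.g. duplex-style, ~1e-6)
# makes the same observation essentially impossible by chance:
print(p_ge_k(depth, 1e-6, 10) < 1e-10)      # True
```

In other words, at 0.1% AF the raw signal and the raw noise have the same expected read count, which is why error suppression (UMIs, duplex consensus, background filtering) matters more than depth alone.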
19. ajb ◴[] No.44611222{5}[source]
There are some pretty big longitudinal studies with consent ( "45 and up" are a quarter of a million people, for example - that's big enough that working predictions within the cohort would be a worthwhile health outcome).

There are nevertheless privacy issues, which I did not address as my first comment was already very long, especially for a tangent. Most obviously, people would be consenting to the collection of data whose significance they cannot reasonably forsee.

I do agree that most current AI companies are unlikely to be good stewards of such data, and the current rush to give away health records needs to stop. In a way it's a good thing that health records are currently so limited, since the costs will so obviously outweigh the benefits.

20. edwardog ◴[] No.44611258[source]
Would you say ctDNA tools are sensitive and specific enough now to be able to make a decision about post op adjuvant therapies? “Now that I’ve had surgery, did the R0 resection get it all, or do I need to do chemo and challenging medication like mitotane?”
replies(2): >>44611690 #>>44612280 #
21. pas ◴[] No.44611466[source]
do the chemo medications not do anything useful at low(er) doses in these precancerous situations?
replies(1): >>44612054 #
22. amy_petrik ◴[] No.44611649[source]
>I guess the problem is a mismatch between detection capability and treatment capability?

the problem is you do the test for 7 billion people, say, 30 times over their lives... 210 billion tests. imagine how many false negatives and false positives: the cost of follow-up testing only to find... false positive. the cost of telling someone they have cancer when they don't. the anger of telling someone they are free of cancer, only to find out they had it all along

this tech isn't that good, nowhere near it, more like a 1 in 100 or 10 in 100 rate of "being wrong". those numbers can get cheesed towards more false positives or false negatives.

as for grail, they tried to achieve this and printed OK numbers... but their test set was their training set, so the performance metrics went to shit when they rolled it out to production
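The scale problem above is worth putting numbers on. Even granting a very optimistic 99% specificity (the "1 in 100 wrong" figure from this comment; all numbers illustrative):

```python
tests = 7_000_000_000 * 30   # everyone on Earth, screened 30 times each
specificity = 0.99           # optimistic; i.e. a 1-in-100 false alarm rate
cancer_rate = 0.005          # rough per-screen prevalence, illustrative

false_positives = tests * (1 - cancer_rate) * (1 - specificity)
print(f"{false_positives:.2e}")  # ~2.1 billion false alarms to work up
```

Each of those false positives triggers follow-up imaging, biopsies, and anxiety, which is why the cost-effectiveness argument in the top comment bites even when the per-test accuracy sounds impressive.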

23. mbreese ◴[] No.44611690[source]
I’ve seen it most commonly thought of as using ctDNA to detect relapse earlier.

So, more like — did the tumor come back? And if that does happen, with ctDNA, can you detect that there is a relapse before you would otherwise find it with standard imaging. Most studies I’ve seen have shown that this happens and ctDNA is a good biomarker for early detection of relapse.

The case for proactively looking for circulating tumor DNA without an initial diagnosis or underlying genetic condition is a bit dicier IMHO. For example, what I'd really like to know (I haven't read this article, but I'm pretty familiar with the field) is how many people had a detectable cancer in their plasma (ctDNA) but didn't receive a cancer diagnosis. It's been known for a while that you can detect precancerous lesions well before a formal cancer diagnosis. But what's still an open question AFAIK is how many people have precancerous lesions or positive ctDNA hits that don't form a tumor?

(I’ve done a little work in this area)

24. m463 ◴[] No.44611948[source]
People are routinely notified they are pre-diabetes.

It gives people the agency to alter their lifestyle trajectory.

I personally suspect that people get and cure cancer all the time.

I wonder if cancer is just damage to your body - either a lot of direct damage or interfering with the body's ability to manage/heal itself.

if someone was pre-cancer, would it help to exercise, cut out sugar, use the sauna, stop overeating? I'll bet it might make a difference

25. voiprodrigo ◴[] No.44612054{3}[source]
I think chemo in general kills rapidly dividing cells, which is a characteristic of cancer cells and, unfortunately, many types of regular cells as well, hence many of the side effects, like hair loss. If it is precancerous, then probably it’s not yet dividing in that way, so probably wouldn’t make much of a difference, unless you’d actually catch the moment when the switch to full fledged malignant happens.
26. mapt ◴[] No.44612058[source]
This sort of thing is exactly like preventative whole body MRI scans. It's very noisy, very overwhelming data that is only statistically useful in cases we're not even sure about yet. To use it in a treatment program is witchcraft at this moment, probably doing more harm than good.

It COULD be used to craft a pipeline that dramatically improved everyone's health. It would take probably a decade or two of testing (an annual MRI, an annual sequencing effort, an annual very wide blood panel) in a longitudinal study with >10^6 people to start to show significant reductions in overall cancer mortality and improvements in diagnostics of serious illnesses. The diagnostic merit is almost certainly hiding in the data at high N.

The odds are that most of the useful things we would find from this are serendipitous - we wouldn't even know what we were looking at right now, first we need tons of training data thrown into a machine learning algorithm. We need to watch somebody who's going to be diagnosed with cancer 14 years from now, and see what their markers and imaging are like right now, and form a predictive model that differentiates between them and other people who don't end up with cancer 14 years from now. We [now] have the technology for picking through complex multidimensional data looking for signals exactly like this.

In the meantime, though, you have to deal with the fact that the system is set up exclusively for profitable care of well-progressed illnesses. It would be very expensive to run such a trial, over a long period of time, and the administrators would feel ethically bound to unblind and then report on every tiny incidentaloma, which completely fucks the training process.

The US is institutionally unable to run this study. The UK or China might, though.

replies(2): >>44612844 #>>44616092 #
27. im3w1l ◴[] No.44612114[source]
Long term the goal should be to find a treatment that is safe enough and with so small side effects that it can be used for any suspicious mutations even though it may be decades away from killing you.
replies(1): >>44612486 #
28. im3w1l ◴[] No.44612179[source]
I think it's a very interesting approach and I highly support such an initiative. The easiest way to get a lot of data out of the body is probably to tap the body's own monitoring system - the sensory nerves.

A chemosensor also sounds like a useful thing; it should give concentration over time. The minimally invasive option would be to monitor breath; blood would give a better signal.

29. refurb ◴[] No.44612280[source]
It seems like adjuvant treatment is rather routine at this point?

And the question would be “do I believe the test when it tells me the cancer is gone?” When you know it’s not 100% accurate?

Or do you always do the adjuvant treatment considering the very small chance the test is wrong has a very high cost (death)?

30. marcosdumay ◴[] No.44612413[source]
Weren't the mRNA vaccines created exactly for that?
31. aetherspawn ◴[] No.44612486[source]
Yes.. as I read the OP post I was thinking about how many weak natural poisons (e.g. bloodroot) have been shown to be effective at dispersing through the body, and how they might be a good treatment, e.g. a 1-2 month course of pills.
32. psadri ◴[] No.44612576[source]
How about tracking deltas between blood draws, starting at a youngish age when things are on average presumed to be in a good state? When a new feature turns up in a subsequent blood draw, could it then be something more concerning?
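The appeal of per-person deltas is that an individual's baseline is usually much tighter than the population reference range, so a value can be "normal" yet anomalous for that person. A minimal sketch of the idea (the marker values and the 3-sigma threshold are invented for illustration):

```python
import statistics

def flag_change(history, new_value, z_threshold=3.0):
    """Flag a new lab value that deviates from this person's own baseline."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return abs(new_value - mean) > z_threshold * sd

# Annual values for some marker, all well inside the population range:
baseline = [11.2, 11.5, 10.9, 11.3, 11.1]
print(flag_change(baseline, 11.4))  # → False (ordinary fluctuation)
print(flag_change(baseline, 13.0))  # → True  (a new feature worth a look)
```

The catch, echoing the base-rate discussion upthread, is that running many markers over many years multiplies the chances of a spurious flag, so the threshold has to account for the number of comparisons.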
33. melagonster ◴[] No.44612833[source]
If you can make the detector that cheap, doctors will love you!
34. aquafox ◴[] No.44612844[source]
> This sort of thing is exactly like preventative whole body MRI scans. It's very noisy, very overwhelming data that is only statistically useful in cases we're not even sure about yet. To use it in a treatment program is witchcraft at this moment, probably doing more harm than good.

The child of a friend of mine has PTEN hamartoma tumor syndrome, a tendency to develop tumors throughout life due to a mutation in the PTEN gene. The poor child gets whole body MRIs and other check-ups every half year. As someone in biological data science, I always tell the parents how difficult it will be to prevent false positives, because we don't have a lot of data on routine full body check-ups of healthy people. We barely know the huge spectrum of what healthy/ok tissue can look like.

replies(1): >>44613341 #
35. lokrian ◴[] No.44613341{3}[source]
Hopefully gene therapy can fix this sort of problem.
replies(1): >>44613747 #
36. LoganDark ◴[] No.44613747{4}[source]
is it even possible for gene therapy to just rewrite all the existing DNA in a body? can't you only do that to cells that are dividing or whatever?
replies(1): >>44615063 #
37. tim333 ◴[] No.44615063{5}[source]
They've managed to treat sickle cell.

>CRISPR/Cas9 can be directed to cut DNA in targeted areas, enabling the ability to accurately edit (remove, add, or replace) DNA where it was cut. The modified blood stem cells are transplanted back into the patient where they engraft (attach and multiply) within the bone marrow...

https://www.fda.gov/news-events/press-announcements/fda-appr...

38. spease ◴[] No.44616092[source]
> It would be very expensive to run such a trial, over a long period of time, and the administrators would feel ethically bound to unblind and then report on every tiny incidentaloma, which completely fucks the training process.

I wonder if our current research process is only considered the gold standard because doing things in a probabilistic way is the only way we can manage the complexity of the human body to date.

It’s like me running an application many, many times with many different configurations and datasets, while scanning some memory addresses at runtime before and after the test runs, to figure out whether a specific bug exists in a specific feature.

Wouldn’t it be a lot easier if I could look at the relevant function in the source code and understand its implementation to determine whether it was logically possible based on the implementation?

We currently don’t have the ability to decompile the human body, or understand the way it’s “implemented”, but that is something that tech is rapidly developing tools that could be used for such a thing. Either a way to corroborate enough information aggregated about the human body “in mind” than any person can in one lifetime and reason about it, or a way to simulate it with enough granularity to be meaningful.

Alternatively, the double-blindedness of a study might not be as necessary if you can continually objectively quantify the agreement of the results with the hypothesis.

Ie if your AI model is reporting low agreement while the researchers are reporting high agreement, that could be a signal that external investigation is warranted, or prompt the researchers to question their own biases where they would’ve previously succumbed to confidence bias.

All of this is fuzzy anyway - we likely will not ever understand everything at 100% or have perfect outcomes, but if you can cut the overhead of each study down by an order of magnitude, you can run more studies to fine-tune the results.

Alternatively, you can have an AI passively running studies to verify reproducibility and flag cases where it fails, whereas now the way the system values contributions makes it far less useful for a human author to invest the time, effort, and money. Ie improve recovery from a bad study a lot quicker rather than improve the accuracy.

EDIT: These are probably all ideas other people have had before, so sorry to anyone who reaches the end of my brainstorming and didn’t come out with anything new. :)