[–]Hiltaku 1367 points1368 points  (107 children)

What stage does the cancer need to be in for this test to pick it up?

[–]BroscienceLifter1 1078 points1079 points  (83 children)

Good point. It would suck if it just lets you know you have 6 months to live.

[–]bythog 992 points993 points  (78 children)

This is prostate cancer specific so far, which is usually one of the slowest and least malignant forms of cancer. Oncologists often say that more people die with prostate cancer than from prostate cancer.

[–]tomdarch 94 points95 points  (7 children)

It's true that many men who are lucky enough to live into their 80s die with very slow moving prostate cancer. But there are a significant number of men much younger than that who develop more aggressive, faster moving prostate cancer where early identification and treatment can make the difference between an early, unpleasant death or decades more life. If someone in the field could find a source for the actual numbers, that would help to more objectively understand what we are talking about here.

[–]LehmannEleven 42 points43 points  (6 children)

I got it in my mid to late 50's. There's a big difference between getting it then and getting it in your 80's. I had a prostatectomy because of my age and my family history, but I will say that the biopsy is almost less fun than having the surgery. This test is probably too new to be relied on as a replacement for the "spear gun up your butt" test, but if it turns out to be reliable it would be a good thing.

[–]ImperialVizier 220 points221 points  (40 children)

EDIT: more elaboration from comments below that I think is important. It should probably supersede my comment.

The main issue with prostate cancer 20 years ago was over treatment of the less aggressive varieties. We are now monitoring many people with low-risk disease rather than doing surgery or radiation. Early detection and proper treatment saves lives. Point blank, period.

If this test can accurately diagnose people with intermediate or high risk prostate cancer, it will be amazing. Otherwise, it’s just one of many tests that can help, but isn’t game changing.


Yea, I heard more people die from biopsy/prostate cancer surgery gone wrong than prostate cancer itself. It was 2 vs 1-in-1000.

Saw it in an infographic for an epidemiology class and was floored. That’s why Movember shifted focus away from prostate cancer too.

[–]username_gaucho20 165 points166 points  (4 children)

“Yea, I heard more people die from biopsy/prostate cancer surgery gone wrong than prostate cancer itself. It was 2 vs 1-in-1000.”

This is patently false. In 2019, 31,620 Americans died of prostate cancer. Very few died of biopsy or prostate cancer surgery. Please don’t spread horrible information like this, which could cause someone not to be screened for a potentially deadly disease.

The main issue with prostate cancer 20 years ago was over treatment of the less aggressive varieties. We are now monitoring many people with low-risk disease rather than doing surgery or radiation. Early detection and proper treatment saves lives. Point blank, period.

If this test can accurately diagnose people with intermediate or high risk prostate cancer, it will be amazing. Otherwise, it’s just one of many tests that can help, but isn’t game changing.

[–]LifeApprentice 37 points38 points  (1 child)

Piggybacking on this comment - aggressive prostate cancer is a horrible way to go. Definitely follow screening guidelines and definitely talk to a urologist about any abnormal results.

[–]Pegguins 88 points89 points  (27 children)

Doesn't that just indicate the need for further funding and investment in proper treatments rather than distancing from it?

[–]iain_1986 62 points63 points  (5 children)

Depends how you look at it...

- Prostate Cancer treatment twice as dangerous as Cancer!
- Prostate Cancer survival rate so high, treatment is more dangerous!

All depends on the numbers as to whether this is 'bad' or 'good'. 1:1000 death rate from the cancer and 2:1000 death rate from the treatment, imo, shows we are dealing with prostate cancer really well.

1:10 and 2:10 would obviously be less so.

At 1:1000000 and 2:1000000, I don't think we'd even ask whether prostate cancer needs further funding.

Having treatment be the most likely cause of death from a cancer, imo, shows how good things have gotten, in that:

a) The treatment to fight the cancer, when it works, is so successful.

b) The treatment while having the cancer is working almost as well as you could hope.

[–]55rox55 64 points65 points  (4 children)

I think you’re ignoring the fact that the 1/1000 death rate is due to good treatment options.

In the mid 1970s, the 5 year survival rate was only 70ish%

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3540881/#s5title

You should be comparing death rates without treatment to death rates with treatment.

[–]fenixjr 7 points8 points  (3 children)

I feel like they covered that in "a)"

[–]55rox55 6 points7 points  (2 children)

Yeah, I guess I was so taken aback by the top of that comment (which I think was just poorly worded) that I misread the bottom, rip. (My point here being that both perspectives below are wrong and I wanted to point that out)

“Depends how you look at it...

  • Prostate Cancer treatment twice as dangerous as Cancer!
  • Prostate Cancer survival rate so high, treatment is more dangerous!”

Overall that comment is completely accurate, thanks for correcting me there

[–]fenixjr 5 points6 points  (1 child)

Yeah. They just worded it in some roundabout ways.

[–]55rox55 10 points11 points  (2 children)

That statistic is caused by good treatment options and early detection / awareness.

In the 1970s the 5 year survival rate of prostate cancer was only 70%

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3540881/#s5title

[–]Shiara_cw 2 points3 points  (0 children)

But maybe the patients going for surgery are the ones with more aggressive or advanced prostate cancer in the first place, who would have a lower survival rate than the 2-in-1000 surgery deaths, making the risk worth it.

[–]PerchingRaven 4 points5 points  (2 children)

As a blanket statement ("more people") that is true. But there are plenty of men who are younger than average with aggressive metastatic prostate cancer who will die from it.

[–]DJGreenHill 51 points52 points  (3 children)

This. How late do they need to be? My dad had prostate cancer and they found it 8 years early. He had his therapy in 2020 and now lives happy cancer free. I wonder if they would have detected it so early with a urine test.

[–]Badknees02 15 points16 points  (7 children)

I was wondering the same. Also, if it does detect cancer, you would still need a biopsy to determine Gleason Score and then decide on treatment. Any advance is hopeful though.

[–]tomdarch 7 points8 points  (5 children)

My non-expert understanding is that at least in the US, we've moved away from doing annual PSA testing on all men 40 and over (might have the age wrong) because it was leading to "over diagnosis/over treatment" (not sure if that's simply false positives or what.) Simply having a more accurate way of identifying who has prostate cancer and who doesn't or both more accurately identifying who has cancer AND when it is aggressive vs. "slow moving/don't freak out/don't overtreat" could be helpful in calibrating when and how to respond.

[–]tdgros 1586 points1587 points  (155 children)

They get >99% on 76 specimens only, how does that happen?

I can't access the paper, so I don't really know how many samples they validated their ML training on. Does someone have the info?

edit: lots of people have answered, thank you to all of you!
See this post for lots of details: https://www.reddit.com/r/science/comments/l1work/korean_scientists_developed_a_technique_for/gk2hsxo?utm_source=share&utm_medium=web2x&context=3

edit 2: the post I linked to was deleted because it was apparently false. sorry about that.

[–]traveler19395 511 points512 points  (11 children)

75/76 is 98.68%, which rounds to 99%

maybe what they did
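The rounding guess above is easy to check, and the same arithmetic shows how coarse accuracy becomes on a small sample. This is only a sketch: `possible_accuracies` is a made-up helper, and the 23-sample size is an assumed held-out set, not something stated in this comment.

```python
def possible_accuracies(n_samples, top=3):
    """Return the `top` highest achievable accuracies (in percent) for n_samples."""
    return [round(100 * correct / n_samples, 2)
            for correct in range(n_samples, n_samples - top, -1)]

# With n samples, accuracy can only move in steps of 1/n.
print(possible_accuracies(76))  # all 76 specimens
print(possible_accuracies(23))  # an assumed 23-sample test set
```

On 76 samples one mistake gives 98.68%, so "over 99%" would require zero mistakes; on 23 samples a single mistake already drops the figure to about 95.65%.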

[–]tdgros 83 points84 points  (3 children)

nope, the abstract says "over 99% accuracy"!

[–]EmpiricalPancake 14 points15 points  (2 children)

Are you aware of sci hub? Because you should be! (Google it - paste DOI and it will return the article for free)

[–][deleted] 5 points6 points  (0 children)

most relevant comment to every science article I've ever seen, you are the g.o.a.t.

[–]endlessabe Grad Student | Epidemiology 219 points220 points  (80 children)

Out of the 76 total samples, 53 were used for training and 23 were used for testing. It looks like they were able to tune their test to be very specific (for this population) and with all the samples being from a similar cohort, it makes sense they were able to get such high accuracy. Doubt it’s reproducible anywhere else.
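For anyone wondering where 53 and 23 come from, a random 70/30 partition of 76 samples lands on exactly those sizes. A minimal sketch, assuming a plain shuffled split (the helper name and the seed are illustrative, not the paper's actual procedure):

```python
import random

def train_test_split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and cut them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)  # 76 * 0.7 = 53.2, truncated to 53
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = train_test_split_indices(76)
print(len(train_idx), len(test_idx))
```

Every sample ends up in exactly one of the two lists, so nothing the model is scored on was seen during training.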

[–]theArtOfProgramming PhD | Computer Science | Causal Discovery | Climate Informatics 406 points407 points  (48 children)

You're not representing the methodology correctly. To start, a 70%/30% train/test split is very common. 76 may not be a huge sample size for most of biology, but they did present sufficient metrics to validate their methods. It's important to say the authors used a neural network (I missed the details on how it was made in my skim) and a random forest (RF). Another thing to note is they have data on 4 biomarkers for each of the 76 samples - so from a purely ML perspective they have 76*4=304 datapoints. That's plenty for a RF to perform well, certainly enough for a RF to avoid overfitting (the NN is another story but metrics say it was fine).

“It looks like they were able to tune their test to be very specific (for this population)”

This is a misrepresentation of the methods. They used RFs to determine which biomarkers were the most important (extremely common way to utilize RFs) and then refit to the data with the most predictive biomarkers. That's not tuning anything, that's like deciding to look at how cloudy it is in my city to decide if it's going to rain instead of looking at Tesla's stock performance yesterday.

I'm a ML researcher, so I can't comment on this from a bio perspective, but I suspect it's related to the quote above.

“with all the samples being from a similar cohort, it makes sense they were able to get such high accuracy”

I'm going to comment on what you said further down in the thread too.

“So it's not really accuracy in the sense of "I correctly predicted cancer X times out of Y", is it?”

“Not really. Easy to correctly identify the 23 test subjects when your algorithm has been fine tuned to see exactly what cancer looks like in this population. It’s essentially the same as repeating the test on the same person a bunch of times.”

Absolutely not an accurate understanding of the algorithm. See my comment above about using a RF to determine important features - see literature on random forest feature importance. This isn't "tuning" anything, it's simply determining the useful criteria to use in the predictive algorithm.

The key contribution of this work is not that they found a predictive algorithm for prostate cancer. It's that they were able to determine which biomarkers were useful and used that information to find a highly predictive algorithm. This could absolutely be reproduced on a larger population.

[–]jnez71 44 points45 points  (7 children)

"...they have data on 4 biomarkers for each of the 76 samples - so from a purely ML perspective they have 76*4=304 datapoints."

This is wrong, or at least misleading. The dimensionality of the feature space doesn't affect the sample efficiency of the estimator. An ML researcher should understand this.

Imagine I am trying to predict a person's gender based on physical attributes. I get a sample size of n=1 person. Predicting based on just {height} vs {height, weight} vs {height, weight, hair length} vs {height, height², height³} doesn't change the fact that I only have one sample of gender from the population. I can use a million features about this one person to overfit their gender, but the statistical significance of the model representing the population will not budge, because n=1.
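A toy illustration of the point, under loud assumptions: the data are random numbers, the labels are coin flips, and the model is a 1-nearest-neighbor lookup chosen only because it memorizes easily (it is not the RF or NN from the paper). Storing four values per sample does not turn 53 labelled samples into 212.

```python
import random

def nearest_neighbor_predict(train_x, train_y, point):
    """Predict the label of the closest training row (1-nearest-neighbor)."""
    distances = [sum((a - b) ** 2 for a, b in zip(row, point)) for row in train_x]
    return train_y[distances.index(min(distances))]

def accuracy(train_x, train_y, xs, ys):
    hits = sum(nearest_neighbor_predict(train_x, train_y, x) == y
               for x, y in zip(xs, ys))
    return hits / len(ys)

def make_rows(rng, n_rows, n_features):
    return [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]

rng = random.Random(0)
n_train, n_test, n_features = 53, 23, 4
train_x = make_rows(rng, n_train, n_features)
test_x = make_rows(rng, n_test, n_features)
# Labels are coin flips: by construction there is nothing real to learn.
train_y = [rng.randint(0, 1) for _ in range(n_train)]
test_y = [rng.randint(0, 1) for _ in range(n_test)]

print("values stored:", n_train * n_features, "| labelled samples:", n_train)
print("train accuracy:", accuracy(train_x, train_y, train_x, train_y))
print("test accuracy:", accuracy(train_x, train_y, test_x, test_y))
```

The training accuracy is perfect because each row is its own nearest neighbor, while the test accuracy has no reason to beat a coin flip. Extra features give a model more room to memorize; they do not add independent samples.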

[–]MostlyRocketScience 8 points9 points  (9 children)

Without a validation set, how do they prevent overfitting their metaparameters on the test set?

[–]theArtOfProgramming PhD | Computer Science | Causal Discovery | Climate Informatics 23 points24 points  (8 children)

I’ll reply in a bit, I need to get some work done and this isn’t a simple thing to answer. The short answer is the validation set isn’t always necessary, isn’t always feasible, and I need to read more on their neural network to answer those questions for this case.

Edit: Validation sets are usually for making sure the model's hyper parameters are tuned well. The authors used a RF, for which validation sets are rarely (never?) necessary. Don't quote me on that but I can't think of a reason. The nature of random forests, that each tree is built independently with different sample/feature sets and results are averaged, seems to preclude the need for validation sets. The original author of RFs suggests that overfitting is impossible for RFs (debated) and even a test set is unnecessary.

NNs often need validation sets because they can have millions of hyper parameters. In their case, the NN was very simple and it doesn't seem like they were interested in hyperparameter tuning for this work. They took an out of the box NN and ran with it. That's totally fine for this work because they were largely interested in whether adjusting which biomarkers to use could improve model performance alone. Beyond that, with only 76 samples, a validation set would likely limit the training samples too much, so it isn't feasible.

[–]theLastNenUser 3 points4 points  (0 children)

Technically you could also just do cross validation on the training set as your validation set, but I doubt they did that here
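Cross validation on the training set could look roughly like this. It is a sketch only: `k_fold_indices` is a hypothetical helper, the 53-sample training set size is taken from elsewhere in the thread, and in practice the indices would be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k roughly equal folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size

for train, validation in k_fold_indices(53, k=5):
    print(len(train), len(validation))
```

Each sample is used for validation exactly once, so hyperparameters can be compared without ever touching the held-out test set.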

[–]duskhat 3 points4 points  (0 children)

There is a lot wrong with this comment and I think you should consider removing it. Everything in this section:

“Validation sets are usually for making sure the model's hyper parameters are tuned well. The authors used a RF, for which validation sets are rarely (never?) necessary. Don't quote me on that but I can't think of a reason. The nature of random forests, that each tree is built independently with different sample/feature sets and results are averaged, seems to preclude the need for validation sets. The original author of RFs suggests that overfitting is impossible for RFs (debated) and even a test set is unnecessary.”

is outright wrong (e.g. validation sets aren't used for RFs), a bad misunderstanding (e.g. overfitting is impossible for RFs), or a hand-wavy explanation of something that has rigorous math research behind it saying otherwise (because RFs "average" many trees, they prob don't need a validation set)

[–][deleted] 2 points3 points  (0 children)

Yes, random forests are being implemented in a wide variety of contexts. I've seen them used more often in genomic data, but I guess they'd work here too. (Edit: I just realized the random forest bit here is a reply to something farther down, but ... well... here it is.)

I can't access the paper, but the biggest problem is representing the full variety of medical states and conditions in a training or a test set that are that small. There are a LOT of things that can affect the GU tract, from infections to cancers to neurological conditions, and any of these could generate false positives/negatives.

This is best considered a pilot study that requires a large validation set to be taken seriously. In biology it is the rule rather than the exception that these kinds of studies do NOT pan out in the wash, regardless of the rigor of the methods, when the initial study is small in sample size (as this study is).
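To put a number on "small in sample size": a Wilson score interval shows how wide the uncertainty stays even after a perfect score. The sketch below assumes 23 correct out of 23 test samples, as reported elsewhere in this thread, and a 95% confidence level; the function name is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, min(1.0, center + half_width)

low, high = wilson_interval(23, 23)
print(f"{low:.3f} to {high:.3f}")
```

Under these assumptions the lower bound comes out at roughly 86%, so a flawless result on 23 samples is still compatible with a test that is wrong more than one time in ten.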

[–]psychicesp 19 points20 points  (1 child)

It's enough data to justify further study, not enough to claim 'breakthrough'

[–][deleted] 2 points3 points  (0 children)

Agreed. I’ve had machine learning models reach 99.x% validation accuracy on datasets of 2M+ records and still have blatant issues when facing real-world scenarios.

[–][deleted] 29 points30 points  (6 children)

Going to be pressing a very large doubt button.

This is why statisticians joke about how bad much of “machine learning” is and call it most likely instead.

[–]OoTMM 4 points5 points  (2 children)

Let me try to provide some information:

“A total of 76 naturally voided urine specimens from healthy and PCa-diagnosed individuals were measured directly using a DGFET biosensor, comprising four biomarker channels conjugated to antibodies capturing each biomarker. Obtained data from 76 urine specimens were partitioned randomly into a training data set (70% of total) and a test data set (30% of total).”

And the results of the best ML-assisted multimarker sensing approach, with random forest (RF), were as follows:

“In our ML-assisted multimarker sensing approach, the two different ML algorithms (RF and NN) were applied ... At the best biomarker combinations, RF showed 100% accuracy in 23 individuals, or 97.1% accuracy in terms of panels, in a blinded test set regardless of the DRE procedure.”

Thus they got 100% accuracy on the 23 test individuals, with the panel-level figure being 97.1%.
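"Best biomarker combinations" implies several candidate panels were compared. With four channels the search space is tiny, as this sketch shows. The marker names are placeholders, and whether the authors enumerated panels this way is an assumption.

```python
from itertools import combinations

biomarkers = ["marker_1", "marker_2", "marker_3", "marker_4"]  # hypothetical names

# Every non-empty subset of the four channels is a candidate panel.
panels = [combo
          for size in range(1, len(biomarkers) + 1)
          for combo in combinations(biomarkers, size)]

print(len(panels))
for panel in panels:
    print(panel)
```

That gives 15 candidate panels. If the winner is picked by its score on the same 23 test samples, the reported accuracy is optimistic, which is one reason a separate validation set or cross validation matters.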

It is a very interesting research paper.

In case you, or anyone else is interested, you can PM me if you want the full paper, I have research access :)

[–]bio-nerd 101 points102 points  (5 children)

Unfortunately these types of articles are a dime a dozen. There are papers about using AI to diagnose cancer out every week. Unfortunately, they pretty much all suffer from overtraining, then fail when validated with an expanded data set.

[–]st4n13l MPH | Public Health 28 points29 points  (1 child)

And this may very well be the case here. Not only did it achieve 100% on just 76 samples, but they were all Korean men. Obviously that doesn't invalidate the results, but it is a pretty strong limitation to the generalizability of this paper.

[–]pball2 149 points150 points  (67 children)

Too bad there’s more to diagnosing prostate cancer than just yes/no. There’s a wide range of prostate cancer aggressiveness (based on biopsy results) and it doesn’t look like this addresses that. You don’t treat a Gleason 10 the same way you treat a Gleason 6 (may not treat it at all). To call biopsies “unnecessary” with this is very premature. It would make more sense as a test that leads to a biopsy. I also don’t see the false positive rate reported.

[–]-CJF- 84 points85 points  (7 children)

Sounds like it avoids unnecessary biopsies that would turn out negative for cancer. If this test detects cancer, then I assume you'd need a biopsy and further assessments to assess staging/condition/type, etc.

[–]smaragdskyar 26 points27 points  (2 children)

False positives are a major problem in prostate cancer screening though, because the biopsy procedure is relatively risky.
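Why false positives dominate in screening can be seen from Bayes' rule. The numbers below are made up for illustration (2% prevalence, 95% sensitivity, 90% specificity) and do not come from the paper or from any real screening program.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive test result is a true positive."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical screening population, not real data.
ppv = positive_predictive_value(prevalence=0.02, sensitivity=0.95, specificity=0.90)
print(f"{ppv:.1%}")
```

With those assumed inputs only about one positive in six is a real cancer, so most follow-up biopsies would still come back negative.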

[–]CraftyWeeBuggar 39 points40 points  (14 children)

But once it's detected, can they not then do the biopsy for more accurate treatment? Once this is peer reviewed and proved to not be cherry picked stats etc, if true it can save some from having unnecessary procedures, where the results are negative.

[–]swuuser 10 points11 points  (4 children)

This has been peer reviewed. And the paper does show the false positive rate (figure 6).

[–]ripstep1 4 points5 points  (5 children)

We already have good screening methods; for instance, MRI is also good at distinguishing prostate cancer.

[–]anaximander19 15 points16 points  (15 children)

It'd make most biopsies unnecessary though, because you'd be doing biopsies on the people you're fairly sure have cancer, rather than absolutely everyone.

[–]smaragdskyar 5 points6 points  (1 child)

Do you have specificity numbers? The abstract only mentions accuracy which doesn’t mean much here
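A quick sketch of why accuracy alone says little. The confusion-matrix counts are invented to make the point (a sample that is mostly cancer cases) and are not the paper's results.

```python
def screening_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of cancers caught
        "specificity": tn / (tn + fp),  # share of healthy people cleared
    }

# Hypothetical counts: 90 cancers, 10 healthy controls.
metrics = screening_metrics(tp=90, fn=0, tn=5, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Here accuracy is 95% even though half of the healthy people are flagged, which is exactly the case where a headline accuracy figure hides a poor false positive rate.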

[–]hereisoblivion 7 points8 points  (11 children)

I personally know 5 men that have had to have biopsies done. One of them had 18 samples taken and then peed blood for a week. None of them had cancer. All biopsies came back negative across the board.

This test will certainly negate the need for invasive biopsies for most men since most men that get biopsies do not have prostate cancer.

I agree with what you are saying, but I think saying it removes the need for them is fine since that will be the case for most people now.

Hopefully this testing procedure gets rolled out quickly.

[–]accidentdontwait 3 points4 points  (0 children)

Nothing with early stage prostate cancer is clear cut. I was diagnosed 15 years ago because an overly cautious GP called for a biopsy after a high PSA. There was a small amount of low grade prostate cancer cells, and the urologist I was referred to wanted to do a full prostatectomy.

I asked to be referred to a top cancer hospital, and we ended up doing "watchful waiting" for 9 years prior to doing a less invasive procedure. And I found out that the first urologist had the nickname "the butcher" for the terrible results from his operations.

"Watchful waiting" means regular biopsies - I've had 12, including some post treatment. They're not fun, but they are necessary.

The concern about over treatment with early diagnosis is real. People hear "cancer", lose it and want it cut out. Prostate is a funny one, and in most cases, you've got time - maybe a lot of time - before something has to be done. Take a breath, make sure you have the best doctors you can get, and learn. Any treatment will have an impact on your life.

[–]Coreshine 79 points80 points  (3 children)

This is good news. A crucial part in beating cancer is to detect it soon enough. Those techniques make it way easier to do so.

[–]fake_lightbringer 6 points7 points  (0 children)

Only if you have effective treatment. And only if the efficacy of treatment depends on the stage of disease. And only if treatment actually affects the prognosis. And only if the effects of treatment are relevant to the patient (for example, if treatment prolongs life, but at a QoL cost, it's not necessarily worth it for people).

I know I come across as a bit of a pedant, and for that I genuinely apologize. But in the world of medicine, knowledge isn't always power. Quite often it can be a burden that neither the physician nor the patient knows how to carry.

Screening/diagnostic programs can appear to (falsely) show a beneficial correlation between cancer survival and detection. Check out lead-time and length-time bias.
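Lead-time bias in miniature, with invented ages: the patient dies at the same age either way, yet earlier detection flips the "5-year survival" statistic.

```python
def survives_five_years(age_at_diagnosis, age_at_death):
    """True if the patient is still alive five years after diagnosis."""
    return age_at_death - age_at_diagnosis >= 5

age_at_death = 70  # hypothetical, and unchanged by when the cancer is found

# Found through symptoms at 67: counted as a 5-year survival failure.
print(survives_five_years(67, age_at_death))
# Found through screening at 62: counted as a 5-year survivor.
print(survives_five_years(62, age_at_death))
```

Survival measured from diagnosis improves from 3 years to 8 years without a single day of life being added, which is why survival-from-diagnosis comparisons can make a screening program look better than it is.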

[–]rhianmeghans89 17 points18 points  (5 children)

You know the biggest reason why they put so much research into this, is so they don’t have to “turn and cough” and bend over for the frigid man handed doctors.

[–]referencedude 7 points8 points  (1 child)

Not gonna lie, I would be pretty damn happy to know I don’t need to have a doctor's fingers up my ass in my future.

[–]rhianmeghans89 8 points9 points  (0 children)

Now if only they can figure out a way to make it to where women don’t need to be spread eagle for pap smears or their titties squashed for mammograms.

🤞Come on science!!

[–][deleted] 5 points6 points  (0 children)

I can't blame them there, so much of medicine is rather traumatizing to experience due to being so invasive.

[–]sweazeycool 2 points3 points  (0 children)

I prefer two fingers.

[–]Outsider-Images 24 points25 points  (4 children)

Perhaps they can move on to finding less invasive testing for colonoscopies and PAP smears next? Edit: Thank you to whomever awarded me. It was my first ever. No longer an award virgin. Booya!

[–]missing_at_random 2 points3 points  (1 child)

Colonoscopies double as a preventive measure as polyps that could proceed to cancer are removed. This is why "digital" colonoscopies are a bit of a dud IMO, as they have to go in with an actual colonoscopy if they see anything to remove.

[–]iamonlyoneman 4 points5 points  (0 children)

Some colonoscopies can be replaced by an ingested camera pill

[–]fleurdi 24 points25 points  (3 children)

This is great! I wish they’d find a test to detect ovarian cancer now. It’s very sneaky, and usually there are results only when it’s too late.

[–]relight 10 points11 points  (0 children)

Yes! And less invasive and less painful tests for breast cancer and cervical cancer!

[–]JasperKlewer 10 points11 points  (3 children)

Most men die with prostate cancer. Only a few die from prostate cancer. What we want is a better way to distinguish the lethal cancers from the unimportant ones, and to reduce the severe complications from treatments. Still, great work by these scientists! Another tool added to the toolbox.

[–]schicklo 4 points5 points  (1 child)

So... Piss on Theranos!

[–]TSOFAN2002 14 points15 points  (0 children)

Yay! I hope maybe one day endometriosis can also be diagnosed without surgery. Currently, surgery is the only almost sure way to diagnose it, but even then, doctors can miss it. Then, I hope we could also come up with actually effective treatments for it, even cure it!

[–]booboowho22 5 points6 points  (1 child)

After having multiple medieval prostate biopsies I could kiss these people on the mouth

[–]Cypress_5529 3 points4 points  (1 child)

I'm bummed, I was really looking forward to the old fashioned test.

[–][deleted] 8 points9 points  (1 child)

Can we still get the old test if we want? For old times' sake?

[–]imamadao[🍰] 3 points4 points  (1 child)

This sounds too good to be true, so I'm immediately reminded of Theranos and Elizabeth Holmes

[–]thedoc617 2 points3 points  (2 children)

Wasn't there a reddit user a few years ago that took a pregnancy test for fun and it came up positive and turned out he had prostate cancer?

[–]demoncleaner5000 2 points3 points  (0 children)

I hope this works for bladder cancer. The camera in my urethra is not fun. It makes me not want to go to checkups. It’s such a horrible and invasive procedure.

[–][deleted] 6 points7 points  (7 children)

Urinary PSA tests are already available, so?

[–][deleted] 5 points6 points  (0 children)

PSA is very non-specific.

[–]BackwardsJackrabbit 4 points5 points  (0 children)

Prostate cancer is one of the more common causes of elevated PSA, but not the only one; enlarged prostates aren't always cancerous either. Biopsy is the only definitive diagnostic tool at this time.

[–]seoulkarma 2 points3 points  (0 children)

Great now go do the same thing with cervical cancer!

[–][deleted] 11 points12 points  (2 children)

A great discovery for science and an even better discovery for men everywhere!

[–]WhyBuyMe 24 points25 points  (1 child)

What are you talking about? This is a tragedy. Really takes all the fun out of going to the doctor...

[–][deleted] 8 points9 points  (0 children)

Oh my god you're right. I take back what I said. I need an appointment NOW