
Cancer Microbiome Drama

The idea that microbial DNA within human tumors could help diagnose cancer gained a lot of attention with the 2020 Nature paper by Gregory Poore and colleagues. By analyzing unmapped reads from cancer sequencing data in The Cancer Genome Atlas (TCGA), they identified microbial patterns that they claimed could classify cancer types with very high accuracy. This study, considered “groundbreaking,” quickly became a big point of interest and inspired further research, including a follow-up study by Hermida et al., which linked the microbiome to cancer treatment outcomes. Recent critiques from Steven Salzberg’s lab challenge the validity of Poore et al.’s findings, stating that they greatly overestimated microbial signals due to a flawed approach.

As someone who has been working in cancer microbiome research myself, I was intrigued by Poore et al.’s approach and tried to recreate it, this time addressing the methodological concerns raised by critics, such as potential contamination and incomplete removal of human reads.

When I explained the back-and-forth between labs in the cancer microbiome field to someone recently, they couldn’t help but laugh, saying it sounded like a soap opera with all its drama. In this blog post, I’ll provide an overview of the ongoing controversies and conflicts, along with some background information.


Main Drama

Poore et al.

Once upon a time, there were Gregory Poore and his colleagues from the lab of Rob Knight.

In 2020, they published a paper called “Microbiome analyses of blood and tissues suggest cancer diagnostic approach” in Nature. In this paper, they state that the microbiome found by metagenomic sequencing of various tumor types can be used to distinguish cancer types. They used data from The Cancer Genome Atlas (TCGA), a large project that characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types. A quick overview of their methods:

  1. Download, for multiple cancer types, the reads that TCGA could not map to a human reference genome (GRCh37 or GRCh38, depending on sequencing date) with BWA or Bowtie2 (depending on analysis date)
  2. Perform taxonomic classification with Kraken2, based on a database with all known (including drafts) bacterial, archaeal and viral genomes
  3. Decontaminate with four different approaches, resulting in four decontaminated datasets
  4. Perform supervised normalization with Voom
  5. Train stochastic gradient-boosting ML models based on the classified normalized data
  6. Use the models to predict the cancer type of a sample based on its classified, normalized data
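
To make the hand-off between steps 2 and 5 a bit more concrete, here is a minimal sketch of how per-sample classifier reports could be turned into a feature table for an ML model. It assumes the standard six-column, tab-separated Kraken2 report layout and assumes that counts are summarized at genus level. The function names and the example report are hypothetical and written by hand for illustration; this is not the code Poore et al. used.

```python
def genus_counts(report_text):
    """Return {genus_name: clade_read_count} from one Kraken2-style report.

    Assumed columns: percent, clade reads, direct reads, rank code,
    NCBI taxid, indented scientific name.
    """
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads, rank, name = fields[1], fields[3], fields[5]
        if rank == "G":  # keep genus-level rows only
            counts[name.strip()] = int(clade_reads)
    return counts


def build_feature_table(reports):
    """Combine reports into (sorted genera, {sample: [count per genus]})."""
    per_sample = {s: genus_counts(text) for s, text in reports.items()}
    genera = sorted({g for c in per_sample.values() for g in c})
    # Genera absent from a sample get a count of 0.
    table = {s: [c.get(g, 0) for g in genera] for s, c in per_sample.items()}
    return genera, table


# Hypothetical, hand-written report (not real TCGA output).
EXAMPLE_REPORT = "\n".join([
    "90.00\t900\t900\tU\t0\tunclassified",
    "10.00\t100\t0\tR\t1\troot",
    "10.00\t100\t5\tD\t2\t    Bacteria",
    "6.00\t60\t10\tG\t816\t          Bacteroides",
    "5.00\t50\t50\tS\t817\t            Bacteroides fragilis",
    "3.50\t35\t35\tG\t1279\t          Staphylococcus",
])

if __name__ == "__main__":
    genera, table = build_feature_table({"sample_A": EXAMPLE_REPORT})
    print(genera)             # ['Bacteroides', 'Staphylococcus']
    print(table["sample_A"])  # [60, 35]
```

Each row of such a table is one sample and each column one genus. After decontamination and normalization, a matrix of this kind is what the models in step 5 would be trained on.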

This paper was seen as “groundbreaking”, and Poore and a few of his colleagues even started a biotech company called “Micronoma” based on the results.

Hermida et al.

In 2022, Hermida et al. published a paper named “Predicting cancer prognosis and drug response from the tumor microbiome” in Nature Communications. This paper was a follow-up to the paper by Poore et al.: the authors used the results of Poore et al. to build machine learning (ML) models that could link the microbiome to treatment outcomes. These ML models could predict the treatment outcome with almost the same accuracy as ML models based on gene expression data.

Gihawi et al.

In 2023, two pre-prints by Gihawi et al., members of the lab of Steven Salzberg, appeared on bioRxiv: “Caution Regarding the Specificities of Pan-Cancer Microbial Structure” and “Major data analysis errors invalidate cancer microbiome findings”. The first of the two pre-prints was published in January, and a more extended version was published in July. Both pre-prints were eventually published, the first one in Microbial Genomics and the second one in mBio. Both papers state that the results found by Poore et al. are invalid because they rest on seriously flawed methods, and that the same holds for the papers that build on those results (such as Hermida et al.). In brief, the main errors, according to them, are:

  1. Not removing human reads from the TCGA data properly
  2. Using a database that did not have a human reference genome included during taxonomic classification
  3. Using a database with draft bacterial genomes (which are contaminated with human DNA) during taxonomic classification
  4. Training the ML models on normalized data into which hidden signatures had been introduced

In my opinion, the paper is a really interesting read, so I would recommend reading it, but I also provide a summary below:

Summary: “Major data analysis errors invalidate cancer microbiome findings”
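
Errors 2 and 3 from the list above are easiest to see in a toy example. The sketch below is my own illustration, not code from Gihawi et al. and not how Kraken2 works internally: it uses a tiny k-mer size, made-up sequences, and a simplified voting rule in which k-mers shared between taxa cast no vote. It only shows the general mechanism: a human read can be assigned to a bacterium when the draft bacterial genome contains human DNA and no human genome is in the database.

```python
from collections import Counter

K = 5  # toy k-mer size; real classifiers use much longer k-mers


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(genomes):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in genomes.items():
        for km in kmers(seq):
            index.setdefault(km, set()).add(taxon)
    return index


def classify(read, index):
    """Assign the read to the taxon with the most unambiguous k-mer hits."""
    votes = Counter()
    for km in kmers(read):
        taxa = index.get(km, set())
        if len(taxa) == 1:  # k-mers shared between taxa cast no vote
            votes[next(iter(taxa))] += 1
    if not votes:
        return "unclassified"
    return votes.most_common(1)[0][0]


# Made-up sequences for illustration only.
HUMAN = "ACGTTGCAAGGCTATCCG"
CLEAN_BACTERIUM = "GAGAGACCACAC"
# A "draft" genome that accidentally contains a stretch of human DNA.
DRAFT_BACTERIUM = CLEAN_BACTERIUM + HUMAN[3:13]
READ = HUMAN[1:15]  # a read that really comes from the human genome

if __name__ == "__main__":
    # Contaminated draft genome, no human reference: false bacterial hit.
    print(classify(READ, build_index({"bacterium": DRAFT_BACTERIUM})))
    # -> bacterium

    # Same draft genome, but with a human reference in the database.
    print(classify(READ, build_index({"bacterium": DRAFT_BACTERIUM,
                                      "human": HUMAN})))
    # -> human

    # Clean bacterial genome, no human reference: the read matches nothing.
    print(classify(READ, build_index({"bacterium": CLEAN_BACTERIUM})))
    # -> unclassified
```

In this toy setup, either fix on its own (adding a human reference or using a clean bacterial genome) removes the false bacterial hit, which is why the two errors reinforce each other when they occur together.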

Rebuttal of Poore et al.

After the first pre-print of Gihawi et al. appeared, Poore et al. wrote an extensive rebuttal in which they replicated the main findings of their initial study, but using only genera with proven biological relevance. In addition, they created a GitHub repo with the code they used to find cancer type-specific signals in the data used by Gihawi et al. A newer version of this pre-print was published in February 2024: “Robustness of cancer microbiome signals over a broad range of methodological variation”. In their response, they reference another, more recent paper from their lab, “Pan-cancer analyses reveal cancer-type-specific fungal ecologies and bacteriome interactions”, in which they performed more or less the same analysis but with a focus on fungi instead of bacteria and viruses.

Retraction of Poore et al.

On 07-02-2024, an editor’s note was added to the Poore et al. paper that stated: “Readers are alerted that concerns have been raised about the data and conclusions presented in this article. Further editorial action will be taken once this matter has been resolved.”

On 26-06-2024, the paper was officially retracted. The retraction note states that the Gihawi et al. paper brought concerns about the robustness of the microbial signatures to the attention of the Editors. After peer review of the raised issues and of the authors’ responses, it was concluded that some of the findings of the article are affected and that the conclusions are no longer supported. The note also references the rebuttal by Poore et al.

On 04-07-2024, an editor’s note was also added to the Hermida et al. paper that stated: “Readers are alerted that the authors of this study have informed the Editors that the starting input for data processing was downloaded from the online repository referenced in an article that has been recently retracted ( https://doi.org/10.1038/s41586-024-07656-x ). The Editors are working with the authors to address this issue. Further editorial action will be taken once this matter has been resolved.”

Chronological overview of papers

Pre-prints not included

  1. Microbiome analyses of blood and tissues suggest cancer diagnostic approach
  2. Predicting cancer prognosis and drug response from the tumor microbiome
  3. Major data analysis errors invalidate cancer microbiome findings
  4. Robustness of cancer microbiome signals over a broad range of methodological variation
  5. Retraction Note: Microbiome analyses of blood and tissues suggest cancer diagnostic approach

Background information

Gregory Poore

Gregory Poore is a researcher in the lab of Rob Knight. Because of his work in the cancer microbiome field, including a paper with results that were seen as groundbreaking, he made the Forbes 30 Under 30 list in 2023. He has published 21 papers, most of which are related to microbiology. (Google Scholar)

Micronoma

A San Diego-based biotech startup co-founded by Gregory Poore and colleagues, focusing on developing early cancer detection methods through microbiome-based biomarkers. They raised almost 18 million dollars to fund the company. The scientific foundation of this company is the Poore et al. paper “Microbiome analyses of blood and tissues suggest cancer diagnostic approach”. Since the paper was retracted, the Micronoma website has no longer been accessible.

Rob Knight

Rob Knight is a computational microbiologist based at the University of California, San Diego. He is among the top researchers in microbiome research and computational biology. He was involved in developing QIIME(2), UniFrac and UCHIME, and has over 430K citations. (Google Scholar)

Steven Salzberg

Steven Salzberg, a professor at Johns Hopkins University, holds positions in both the Department of Biomedical Engineering and the Department of Computer Science. Salzberg is a pioneer in bioinformatics and genomics, with contributions to tools like Bowtie, the Tuxedo Suite for RNA-Seq analysis and Kraken(2). He has over 360K citations. (Google Scholar)

© 2025 Birgit Rijvers