Precision for Medicine

Reflect on 20 Years of Cancer Research with MIRROR Biospecimens

Matched Oncology Biospecimens Reflecting the TCGA Legacy

MIRROR (Matched & Integrated Repository for Rediscovered Oncology Research) is a novel oncology cohort derived from the matched “twin” tissue blocks of The Cancer Genome Atlas (TCGA) and the Expression Project for Oncology (expO).

The MIRROR cohort bridges 20 years of biomedical innovation, linking digital data to physical biospecimens, and enables researchers to re-examine the tumors that defined precision oncology with next-generation tools such as spatial biology, proteomics, and AI-powered pathology.

Reflecting on 20 Years of Cancer Genomics

Over the last two decades, biomedical innovation has transformed how we study, diagnose, and treat cancer. Precision medicine begins with the individuals whose donated samples continue to drive discovery. Many breakthroughs in cancer research begin with donor generosity and the datasets those contributions made possible, capturing not only disease but the full diversity of cancer biology.

expO Is A Public Dataset That Laid the Groundwork for TCGA and Precision Oncology

Starting in 2005, expO set out to do something revolutionary. Spearheaded by the International Genomics Consortium (IGC), expO aimed to capture the complexity of cancer in data form, building one of the first large-scale, clinically annotated gene expression repositories for oncology research. Long before RNA sequencing became the norm, scientists were meticulously collecting tumor tissue from thousands of patients, running each sample through the same Affymetrix microarray platform to measure which genes were turned “on” or “off.”

What made expO groundbreaking wasn’t just its size (2,000+ tumor samples spanning more than a dozen cancer types), but its openness. Every data point and every clinical annotation was released into the public domain without restriction. Anyone, anywhere, could access it. This simple but radical act of transparency made expO one of the most influential datasets in cancer genomics, empowering researchers to discover new biomarkers, define molecular subtypes, and reimagine how we study the disease.

In many ways, expO became the spark that lit the fire for projects like TCGA, which expanded the model to include genomic sequencing, proteomics, and beyond.

From the completion of the human genome to the rise of AI and spatial biology, oncology has advanced through a series of defining milestones:
  • 2005 | expO: One of the first large-scale multi-cancer expression datasets launched by the IGC.
  • 2006–2011 | TCGA: Sequenced 11,000+ tumors across 33 cancer types, creating the data backbone of precision oncology.
  • 2011–2015 | The Immunotherapy Revolution: Checkpoint inhibitors (CTLA-4, PD-1/PD-L1) transformed cancer treatment.
  • 2014–2020 | Precision and Spatial Biology: Next-generation sequencing, liquid biopsy, and single-cell omics revealed tumor architecture in unprecedented detail.
  • 2022 | The Telomere-to-Telomere Human Genome: New sequencing technologies allowed researchers at the National Human Genome Research Institute and their collaborators to fill in almost all of the “gaps” left in the reference genome produced by the Human Genome Project.
  • 2020–Present | AI and Multi-Omics Integration: Deep learning models now integrate genomics, proteomics, and histopathology to uncover new biological insights.

True scientific progress is built upon growth. As Isaac Newton once said, “If I have seen further, it is by standing on the shoulders of giants.” In cancer research, one of the tallest giants is TCGA.

The Evolution of Modern Cancer Research

The Immunotherapy Revolution

Checkpoint inhibitors targeting CTLA-4 and PD-1/PD-L1 redefine cancer care, unlocking the immune system to fight tumors.

  • First major proof that modulating immune checkpoints can produce durable responses.

Precision & Spatial Biology

Advances in next-generation sequencing (NGS), liquid biopsy, and single-cell omics illuminate the tumor microenvironment and spatial heterogeneity.

  • Tumors viewed not as static masses but as dynamic ecosystems.

The Telomere-to-Telomere Human Genome

Researchers at the National Human Genome Research Institute and their collaborators complete the once-unfinished human genome, closing gaps left since 2003.

  • A complete genomic reference opens new doors for rare disease and cancer genomics.

AI & Multi-Omics Integration

Deep learning unites genomics, proteomics, and histopathology, accelerating biomarker discovery and translational insight.

  • AI reveals complex biological patterns once invisible to human eyes. 

How TCGA Transformed Cancer Research into Big Data

Launched in 2006 as a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the NIH, TCGA established the genomic blueprint for cancer biology. By generating and publicly sharing terabytes of multi-omic data (genomic, transcriptomic, epigenomic, and clinical), TCGA reshaped oncology research, and it is no overstatement to say so. To date, TCGA has fueled tens of thousands of publications, serving as the reference point for biomarker discovery and precision medicine development.

While the project formally concluded sample collection in 2017, the data it generated continue to power discovery across the world. TCGA data are currently hosted through the Genomic Data Commons (GDC), a publicly accessible repository maintained by the National Cancer Institute. Within it are multi-omic datasets from over 11,000 tumors across 33 cancer types, including DNA and RNA sequencing, methylation, proteomics, and detailed clinical annotations. These datasets are continuously harmonized and reprocessed with new pipelines, ensuring researchers can analyze legacy data with modern bioinformatic tools.

Today, TCGA is a digital ecosystem without a physical counterpart. The original tissue blocks that generated its data were distributed across dozens of institutions, and many are now fully exhausted or unavailable for experimental reuse. Scientists can analyze the data endlessly, but they can’t go back to the materials themselves: the slides, sections, and tumor microenvironments.

Introducing MIRROR: The Biological Reflection of TCGA & expO

MIRROR restores access to the biology behind TCGA’s data. Derived from the matched “twin” tissue and biofluids of the original expO & TCGA tissues, MIRROR enables researchers to revisit these same tumors with modern tools, like AI pathology, spatial biology, and proteomics.

Where TCGA provided data, MIRROR provides material, including tissue, blood, and derivatives annotated with clinical and molecular context. It bridges the historic and the modern, transforming archived tissue into a platform for discovery.

Asking New Questions of Known Tumors

Two decades later, researchers can finally return to the same tumors that shaped modern oncology and ask questions TCGA’s era couldn’t answer:

  • Can new multiplex or spatial assays identify predictive biomarkers that correlate with historical biomarkers or genomic signatures?
  • Can we validate novel technologies using the original benchmark?
  • Will new preparation methods reveal new data?
  • Can spatial and AI models reclassify legacy samples with new biomarkers?
  • What patterns emerge when tissue morphology and multi-omics data are reconnected?

MIRROR makes these questions actionable by combining matched biospecimens with the full context of TCGA data, enabling reanalysis and rediscovery.

What’s Inside the MIRROR Cohort?

Each MIRROR case bridges historic data with modern discovery:

  • Matched tissues & biofluids from TCGA & expO donors
  • Linked clinical, histologic, and molecular annotations through associated platforms
  • Plasma, serum, and blood derivatives for liquid biopsy research
  • IRB-approved and ethically sourced material
  • Ready-to-ship inventory

MIRROR turns the static datasets of TCGA and expO into a living, research-ready resource. It extends the reach of true multi-omics, supports advanced diagnostic research with matched biopsies and derivatives, and can be used to develop the biotechnology tools of tomorrow.

The Science MIRROR Enables

Transform archived tissue into active discovery with the potential to reveal patterns, pathways, and possibilities invisible twenty years ago.

AI Pathology & Spatial Analysis

Train and validate deep learning models using histology linked to multi-omic and clinical data.

Biomarker & Drug Development

Discover and validate new predictive and mechanistic biomarkers to inform trial design and patient selection.

Liquid Biopsy Correlation

Connect circulating signals with tissue-level biology to improve early detection and monitoring.

Rare & Hard-to-Access Cancers

Investigate underrepresented indications mirrored from TCGA, expanding translational reach. 

A Rare Opportunity to Leverage the Power of Reflective Research

Science advances by returning to its roots. Just as the human genome was revisited and completed with new technologies in 2022, it’s time to revisit the datasets that transformed what we know about cancer today and view them through different lenses.

In honor of expO and TCGA’s legacy, MIRROR invites the scientific community to look deeper and to revisit the past with sharper tools, richer modalities, and renewed purpose. Because the clearest view of the future might just come from looking in the mirror.

Go where even the pioneers couldn’t with MIRROR matched biospecimens.