TCGA Gene-Combination Survival Analysis

Comparing single-gene and combinatorial signatures across cancer endpoints

March 15, 2025 | Computational Biology | Completed | Lead Researcher
Python | R | TCGA | Cox regression | Kaplan-Meier

Context

Cancer biomarker discovery often begins with single-gene associations against survival endpoints. Yet biological systems are rarely single-gene phenomena — pathway interactions, compensatory mechanisms, and context-dependent expression all suggest that combinatorial signatures may capture prognostic signal that individual genes miss.

The Cancer Genome Atlas (TCGA) provides transcriptomic and clinical data across dozens of cancer types, making it a natural testbed for comparing these approaches systematically.

Question

Do combinatorial gene-expression signatures outperform single-gene markers for survival prediction across TCGA cancer types — and if so, under what conditions?

Method

We developed a pipeline that curates TCGA datasets, fits a Cox proportional hazards model for each single gene, builds combinatorial signatures by greedy forward selection, and compares the two approaches using the concordance index, log-rank tests, and cross-validation stability.
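
The selection and scoring steps can be illustrated with a minimal sketch. It is not the project's pipeline: the signature score here is an unweighted sum of expression values standing in for a fitted Cox linear predictor, tied survival times are simply skipped, and the gene names, toy data, `max_genes`, and `min_gain` are all hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs in which the sample that
    fails first also has the higher risk score (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier sample is censored
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def signature_score(expr, genes):
    """Assumed scoring rule: unweighted sum of the genes' expression."""
    n_samples = len(next(iter(expr.values())))
    return [sum(expr[g][k] for g in genes) for k in range(n_samples)]


def greedy_forward_selection(expr, times, events, max_genes=3, min_gain=0.01):
    """Add the gene that most improves C until the gain falls below min_gain."""
    selected, best = [], 0.5  # 0.5 is the C-index of a random score
    while len(selected) < max_genes:
        candidates = sorted(g for g in expr if g not in selected)
        if not candidates:
            break
        scored = [
            (concordance_index(times, events,
                               signature_score(expr, selected + [g])), g)
            for g in candidates
        ]
        c, gene = max(scored, key=lambda item: item[0])
        if c - best < min_gain:
            break
        selected.append(gene)
        best = c
    return selected, best


# Hypothetical toy cohort: six samples, one censored (event = 0).
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 0, 1, 1]
expr = {
    "GENE_A": [6, 6, 4, 4, 2, 2],
    "GENE_B": [1, 0, 1, 0, 1, 0],
    "GENE_C": [1, 3, 2, 3, 1, 2],
}

selected, best = greedy_forward_selection(expr, times, events)
print(selected, round(best, 3))
```

In this toy cohort GENE_A alone leaves some pairs tied, and adding GENE_B breaks those ties, so the two-gene signature scores higher than either gene by itself. The C-index is computed on the same samples used for selection, so it is optimistic by construction.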

Result

Combinatorial signatures showed modest improvement over single-gene markers in a subset of cancer types, with the largest gains in transcriptionally heterogeneous cancers. However, many combinatorial signatures failed to replicate across train/test splits, which highlights the gap between statistical significance and biological meaning.
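
One way to quantify this kind of instability is to compare the gene sets selected in different folds. The sketch below uses mean pairwise Jaccard similarity. That metric is an assumption, since the source does not say how stability was measured, and the fold signatures are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two gene sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(signatures):
    """Mean pairwise Jaccard similarity across per-fold signatures."""
    pairs = list(combinations(signatures, 2))
    if not pairs:
        raise ValueError("need at least two signatures to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical signatures selected in three cross-validation folds.
fold_signatures = [
    {"GENE_A", "GENE_B"},
    {"GENE_A", "GENE_C"},
    {"GENE_A", "GENE_B"},
]

stability = selection_stability(fold_signatures)
print(round(stability, 3))
```

A value near 1 means the same genes are selected regardless of the split. A low value suggests the signature depends on which samples happened to land in the training fold.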

Reflection

The more interesting question may not be "which genes predict survival?" but "under what biological conditions does a combinatorial signal reflect real pathway biology rather than statistical artifact?"

Links