[Nature Biomedical Engineering] Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

Introduction

In pathology, the “gold standard” for cancer diagnosis, AI models have long been held back by narrow specialization. A team led by Professor Hao Chen of HKUST’s Smart Lab, working with Southern Medical University, Shanghai AI Lab, and other institutions, has developed GPFM, which the team presents as the first highly generalizable foundation model in pathology. This “all-round champion” outperforms competing models overall across 72 clinical tasks spanning six major categories (diagnosis, prognosis, etc.). The work was recently accepted by Nature Biomedical Engineering, a top-tier biomedical engineering journal.

I. The “Achilles’ Heel” of Pathology AI

Current pathology AI models suffer from “narrow expertise”: some excel at tissue classification but fail to interpret pathology reports, while others predict survival accurately yet cannot localize lesions. The team established the first evaluation framework covering 6 categories and 72 clinical tasks. On this benchmark, the best existing model averaged a rank of only 3.7 and placed first in just 6 tasks (see Figure 2). Such “specialist AIs” fall short in complex clinical scenarios, which slows the adoption of smart healthcare.

II. GPFM’s Three Breakthrough Innovations

  1. Pioneering a “Decathlon Arena” for Pathology Evaluation
    Covering 6 task types (whole-slide classification, survival prediction, pathology QA, etc.), this first-of-its-kind framework tests true generalizability.
  2. Knowledge Distillation: Harnessing Collective Wisdom
    A unified framework with two distillation components (Figure 1.d):
    • Expert Distillation: Integrates strengths of top models (UNI, Phikon, etc.)
    • Self-Distillation: Achieves cross-scale alignment of tissue features
  3. Unprecedented-Scale Pretraining
    Trained on 190M image-level samples from 95K+ slides across 34 tissue types.
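
As a rough illustration of how the two distillation signals in item 2 might be combined into a single training objective, the sketch below adds an expert term (cosine distance between the student's embedding and each expert model's embedding) to a self-distillation term (cross-entropy between a teacher's output on a large view and the student's output on a smaller crop, in the style of DINO). All function names, the temperatures, the loss weights, and the assumption that expert embeddings are already projected to the student's dimension are illustrative choices, not details taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity (0 when the vectors point the same way)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def expert_distillation_term(student_embedding, expert_embeddings):
    """Mean cosine distance from the student to each (hypothetical) expert.

    Assumes every expert embedding already has the student's dimension.
    """
    distances = [cosine_distance(student_embedding, e) for e in expert_embeddings]
    return sum(distances) / len(distances)


def self_distillation_term(teacher_logits, student_logits,
                           teacher_temp=0.04, student_temp=0.1):
    """Cross-entropy from a sharpened teacher distribution to the student.

    Assumption: the teacher saw a large (global) view and the student a
    smaller crop of the same tissue, so minimizing this aligns the scales.
    """
    teacher_probs = [math.exp(x) for x in log_softmax(teacher_logits, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def unified_distillation_loss(student_embedding, expert_embeddings,
                              teacher_logits, student_logits,
                              expert_weight=1.0, self_weight=1.0):
    """Weighted sum of the two terms; the weights are illustrative."""
    expert = expert_distillation_term(student_embedding, expert_embeddings)
    own = self_distillation_term(teacher_logits, student_logits)
    return expert_weight * expert + self_weight * own
```

In a real training loop these terms would be computed on batched tensors in a deep learning framework and averaged over class and patch tokens; the scalar version here only shows the shape of the objective.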

III. Clinical Validation: Outperforming All Existing Models

1. Whole-Slide Image Classification: New Benchmark in Diagnostic Accuracy

Across 36 WSI classification tasks, GPFM achieved an average rank of 1.22 (lower is better), well ahead of the runner-up UNI (3.60). Measured by AUC, GPFM reached the highest average of 0.891, 1.6% higher than UNI (P < 0.001) (Figure 3).
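
The average rank used throughout these comparisons can be sketched as follows: within each task, models are ranked by their metric (1 is best), and each model's ranks are then averaged over tasks. The function name, the tie handling (tied models share the mean of the positions they occupy), and the toy scores are assumptions for illustration; the paper's exact ranking procedure may differ.

```python
from statistics import mean


def average_ranks(scores):
    """Map each model to its mean rank over tasks (higher metric = better).

    scores: {task_name: {model_name: metric_value}}
    """
    ranks_by_model = {}
    for results in scores.values():
        ordered = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
        i = 0
        while i < len(ordered):
            # Extend j over the run of models tied with position i.
            j = i
            while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
                j += 1
            tied_rank = ((i + 1) + (j + 1)) / 2  # mean of occupied positions
            for model, _ in ordered[i:j + 1]:
                ranks_by_model.setdefault(model, []).append(tied_rank)
            i = j + 1
    return {model: mean(ranks) for model, ranks in ranks_by_model.items()}


# Made-up scores for three hypothetical models on two hypothetical tasks.
toy_scores = {
    "task_a": {"A": 0.90, "B": 0.85, "C": 0.80},
    "task_b": {"A": 0.70, "B": 0.75, "C": 0.75},
}
toy_ranks = average_ranks(toy_scores)  # {"A": 2.0, "B": 1.75, "C": 2.25}
```

Because ranks discard the size of each gap, an average rank is best read alongside the raw metric, which is why the article reports both rank and AUC.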

2. Survival Analysis & Prognosis: A New Compass for Clinical Decisions

Across all 15 survival analysis tasks, GPFM achieved an average rank of 2.1 and placed in the top two in 13 tasks, while UNI averaged a rank of 4.6 (best in only 4 tasks). Measured by C-Index, GPFM again led with 0.665 (a 3.4% gain over UNI; P < 0.001), demonstrating superior generalizability (Figure 4).
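
For readers unfamiliar with the C-Index, the sketch below computes Harrell's concordance index in its simplest form: among patient pairs where the earlier time is an observed event, it measures how often the model assigns the higher risk to the patient who experienced the event first. The function name and the toy data are illustrative, pairs with tied times are ignored for simplicity, and this is not the evaluation code used in the paper.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ordered correctly.

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  predicted risk score (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: four patients, the third one censored at time 6.
toy_c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.3, 0.5, 0.1],
)  # 4 of 5 comparable pairs are ordered correctly, so 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported average of 0.665.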

3. ROI Classification

Across 16 ROI classification tasks, GPFM achieved the best average rank (1.88 vs. 3.09 for Prov-GigaPath) and the highest mean AUC (0.946, 0.2% higher than Prov-GigaPath; P < 0.001) (Figure 5).

4. Additional Results

See the original paper for results on pathology image retrieval, visual question answering (VQA), report generation, and more.

IV. Boundless Clinical Translation Potential

Building on GPFM and mSTAR, the team developed SmartPath, a next-generation diagnostic system with intraoperative modules for five high-incidence cancers (lung, breast, GI, etc.). The system is currently being deployed and is expected to accelerate the adoption of digital pathology.

Full paper: [2407.18449] Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

Affiliations:
HKUST, Southern Medical University, Shanghai AI Lab, Sun Yat-sen University, CUHK, Kunming Medical University, Huazhong University of Science and Technology

Citation:
Ma, J., Guo, Z., Zhou, F., et al. (2024). Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation. arXiv:2407.18449.