SCI
3 October 2024
A pathology foundation model for cancer diagnosis and prognosis prediction
(Nature, IF: 50.5)
Xiyue Wang, Junhan Zhao, Eliana Marostica, Wei Yuan, Jietian Jin, Jiayu Zhang, Ruijiang Li, Hongping Tang, Kanran Wang, Yu Li, Fang Wang, Yulong Peng, Junyou Zhu, Jing Zhang, Christopher R. Jackson, Jun Zhang, Deborah Dillon, Nancy U. Lin, Lynette Sholl, Thomas Denize, David Meredith, Keith L. Ligon, Sabina Signoretti, Shuji Ogino, Jeffrey A. Golden, MacLean P. Nasrallah, Xiao Han, Sen Yang & Kun-Hsing Yu
CORRESPONDENCE TO: sen.yang.scu@gmail.com; Kun-Hsing_Yu@hms.harvard.edu
Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.
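The abstract describes combining tile-level features into whole-slide representations under weak (slide-level) supervision, but it does not state how tiles are aggregated. As an illustration only, the sketch below shows attention-based pooling, a common choice in weakly supervised whole-slide analysis. It is not presented as CHIEF's implementation: the function names, the toy feature vectors, and the `attention_vector` parameter (standing in for learned weights) are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    tile_features: Sequence[Sequence[float]],
    attention_vector: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Aggregate tile-level feature vectors into one slide-level vector.

    Assumption: each tile is scored by a dot product with a single
    (hypothetical) learned vector. Real models typically use a small
    neural network to produce these scores.
    """
    # One scalar relevance score per tile.
    scores = [
        sum(f * w for f, w in zip(tile, attention_vector))
        for tile in tile_features
    ]
    # Normalise the scores so the tile weights sum to 1.
    weights = softmax(scores)
    # Slide embedding = attention-weighted sum of the tile features.
    dim = len(tile_features[0])
    slide_embedding = [
        sum(wt * tile[d] for wt, tile in zip(weights, tile_features))
        for d in range(dim)
    ]
    return slide_embedding, weights


# Toy example: three tiles with two-dimensional features.
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform_embedding, uniform_weights = attention_pool(tiles, [0.0, 0.0])
focused_embedding, focused_weights = attention_pool(tiles, [10.0, 0.0])
```

With an all-zero attention vector every tile receives the same weight, so the slide embedding reduces to the mean of the tile features. A non-zero vector shifts weight toward the tiles it scores highly, which is how a slide-level label can guide which tiles matter without tile-level annotations.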