SCI

3 October 2024

A pathology foundation model for cancer diagnosis and prognosis prediction

(Nature, IF: 50.5)

  • Xiyue Wang, Junhan Zhao, Eliana Marostica, Wei Yuan, Jietian Jin, Jiayu Zhang, Ruijiang Li, Hongping Tang, Kanran Wang, Yu Li, Fang Wang, Yulong Peng, Junyou Zhu, Jing Zhang, Christopher R. Jackson, Jun Zhang, Deborah Dillon, Nancy U. Lin, Lynette Sholl, Thomas Denize, David Meredith, Keith L. Ligon, Sabina Signoretti, Shuji Ogino, Jeffrey A. Golden, MacLean P. Nasrallah, Xiao Han, Sen Yang & Kun-Hsing Yu

  • CORRESPONDENCE TO: sen.yang.scu@gmail.com; Kun-Hsing_Yu@hms.harvard.edu

Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.

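Summary
This article presents CHIEF (Clinical Histopathology Imaging Evaluation Foundation), a pathology foundation model for cancer diagnosis and prognosis prediction. Built on a weakly supervised machine learning framework, the model combines unsupervised and weakly supervised pretraining to extract cancer-relevant features from pathology images. These features are used to detect cancer cells, identify tumour origin, predict genetic mutations and assess patient survival prognosis.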
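
As a rough illustration of the weakly supervised, whole-slide stage, the sketch below pools tile-level feature vectors into a single slide-level vector using attention weights, then scores the slide with a logistic classifier. This is generic attention-based multiple-instance pooling written in plain Python, not CHIEF's actual architecture; the function names, weight vectors and toy tile features are all hypothetical and chosen only to show the idea that a slide-level label can supervise tile-level features.

```python
import math


def attention_pool(tile_features, attn_weights):
    """Aggregate tile-level feature vectors into one slide-level vector.

    tile_features: list of equal-length feature vectors, one per tile
                   (assumed to come from a pretrained tile encoder).
    attn_weights:  hypothetical learned vector used to score each tile.
    """
    # Score each tile with a dot product against the attention vector.
    scores = [sum(f * w for f, w in zip(feat, attn_weights))
              for feat in tile_features]
    # Softmax over tiles, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Slide embedding = attention-weighted average of the tile features.
    dim = len(tile_features[0])
    slide_vec = [sum(a * feat[d] for a, feat in zip(attention, tile_features))
                 for d in range(dim)]
    return slide_vec, attention


def slide_probability(slide_vec, clf_weights, bias=0.0):
    """Logistic classifier on the slide embedding.

    In a weakly supervised setup only the slide-level label is needed
    to train these weights; no tile is annotated individually.
    """
    logit = sum(x * w for x, w in zip(slide_vec, clf_weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy slide with three tiles and two features per tile (made-up numbers).
tiles = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
slide_vec, attention = attention_pool(tiles, attn_weights=[2.0, 2.0])
prob = slide_probability(slide_vec, clf_weights=[1.5, 1.5], bias=-1.0)
```

In this toy run the second tile has the largest feature values and therefore receives the highest attention weight, so it dominates the slide-level vector. The attention weights also give a per-tile importance score that can be inspected after prediction.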

 

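Background:

Conventional artificial intelligence methods are usually optimized for one specific task, so they generalize poorly across pathology image digitization protocols and patient populations. CHIEF aims to solve this problem by using large-scale pretraining to improve generalization across pathology datasets.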

 

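Key points:

CHIEF addresses the poor generalization of existing AI methods. On 19,491 whole-slide images from 32 independent slide sets, it outperformed state-of-the-art deep learning methods by up to 36.1%.

Beyond cancer diagnosis, it can also be used to predict genetic mutations and assess patient survival prognosis, which gives it broad application prospects.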