If you use any of the resources (e.g., datasets, source code repositories, AI models, benchmarks, outcomes) associated with the PI-CAI challenge, please cite the following article:
A. Saha, J. S. Bosma, J. J. Twilt, B. van Ginneken, A. Bjartell, A. R. Padhani, D. Bonekamp, G. Villeirs, G. Salomon, G. Giannarini, J. Kalpathy-Cramer, J. Barentsz, K. H. Maier-Hein, M. Rusu, O. Rouvière, R. van den Bergh, V. Panebianco, V. Kasivisvanathan, N. A. Obuchowski, D. Yakar, M. Elschot, J. Veltman, J. J. Fütterer, M. de Rooij, H. Huisman, and the PI-CAI consortium. “Artificial Intelligence and Radiologists in Prostate Cancer Detection on MRI (PI-CAI): An International, Paired, Non-Inferiority, Confirmatory Study”. The Lancet Oncology 2024; 25(7): 879-887.
For more, check out the editorials and perspective pieces discussing the outcomes of the PI-CAI challenge, published in Nature Reviews Urology, The Lancet Oncology, European Urology, American Journal of Roentgenology, Radiology, Radiology: Imaging Cancer and Radiology: Artificial Intelligence.
Clinical Problem
Diagnosing Prostate Cancer is Difficult (Even for Radiologists)
Prostate cancer (PCa) is one of the most prevalent cancers in men. Worldwide, one million men receive a diagnosis and 300,000 die from clinically significant PCa (csPCa) (defined as ISUP ≥ 2 cancer) each year. Multiparametric magnetic resonance imaging (mpMRI) is playing an increasingly important role in the early diagnosis of prostate cancer, and has been recommended prior to biopsies by the 2019 European Association of Urology (EAU) guidelines and the 2019 UK National Institute for Health and Care Excellence (NICE) guidelines (Mottet et al., 2021). However, current guidelines for reading prostate mpMRI (i.e. PI-RADS v2.1) follow a semi-quantitative assessment, which demands substantial expertise to apply properly. Moreover, prostate cancer can exhibit a broad range of clinical behavior and highly heterogeneous morphology in MRI. As such, assessments are susceptible to low inter-reader agreement (<50%), sub-optimal interpretation and overdiagnosis (Rosenkrantz et al., 2016, Westphalen et al., 2020).
Unlike the mpMRI protocol, biparametric MRI (bpMRI) does not include dynamic contrast-enhanced imaging, thereby reducing costs, eliminating any risk of adverse effects from the use of contrast agents, and shortening examination times (Turkbey et al., 2019). Thus, despite providing less diagnostic information than mpMRI (de Rooij et al., 2020), bpMRI is more suitable within the scope of high-volume, population-based screening (Eklund et al., 2021).
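To make the inter-reader agreement figure cited above more concrete, the following is a minimal sketch of two common ways to quantify agreement between two readers: raw percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The text does not say which statistic underlies the "<50%" figure, so this is illustrative only. The function names and the PI-RADS scores are hypothetical and are not taken from the PI-CAI data.

```python
from collections import Counter


def percent_agreement(scores_a, scores_b):
    """Fraction of cases on which two readers assign the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both readers must score the same, non-empty set of cases")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(scores_a)
    p_observed = percent_agreement(scores_a, scores_b)
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    # Chance agreement: probability that both readers pick the same category
    # if each scored independently according to their own score frequencies.
    categories = counts_a.keys() | counts_b.keys()
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both readers always use the same single category
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical PI-RADS scores from two readers on the same ten cases.
reader_1 = [2, 3, 4, 4, 5, 3, 2, 4, 3, 5]
reader_2 = [2, 4, 4, 3, 5, 3, 3, 4, 2, 4]

print(percent_agreement(reader_1, reader_2))  # 5 of 10 cases match -> 0.5
print(round(cohens_kappa(reader_1, reader_2), 3))
```

In this toy example the readers match on half of the cases, yet kappa is noticeably lower than 0.5 because some of those matches would be expected by chance alone.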
Figure. The challenge of discriminating csPCa due to its morphological heterogeneity. (a-b) T2-weighted imaging (T2W), (c-d) high b-value diffusion-weighted imaging (DWI) and (e-f) apparent diffusion coefficient (ADC) maps constituting prostate bpMRI scans for two different patients are shown above, where yellow contours indicate csPCa lesions. While one of the patients had large, severe csPCa developing from both ends (top row), the other was afflicted by a single, relatively focal csPCa lesion surrounded by perceptually similar nodules of benign prostatic hyperplasia (BPH) (bottom row). Probability density functions (right) reveal a large overlap between the distributions of malignant and non-malignant prostatic tissue intensities across all three MRI channels. (excerpt from Saha et al., 2021)
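The overlap between intensity distributions described in the figure caption can be quantified in several ways. As a rough sketch, the snippet below estimates a histogram-based overlap coefficient (the sum of bin-wise minima of two normalized histograms), where 0 means fully separable and 1 means identical distributions. This is not necessarily the method used by Saha et al.; the function names, bin count and synthetic intensity values are all assumptions made for illustration.

```python
import random


def normalized_histogram(values, bins, lo, hi):
    """Histogram of values over [lo, hi], normalized to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the top edge into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def overlap_coefficient(sample_a, sample_b, bins=10):
    """Sum of bin-wise minima of two normalized histograms (range 0 to 1)."""
    lo = min(min(sample_a), min(sample_b))
    hi = max(max(sample_a), max(sample_b))
    if lo == hi:
        return 1.0  # every value is identical, so the samples fully overlap
    hist_a = normalized_histogram(sample_a, bins, lo, hi)
    hist_b = normalized_histogram(sample_b, bins, lo, hi)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# Synthetic, hypothetical voxel intensities (not real MRI data).
rng = random.Random(0)
benign = [rng.gauss(100.0, 20.0) for _ in range(5000)]
malignant = [rng.gauss(85.0, 20.0) for _ in range(5000)]

print(f"overlap: {overlap_coefficient(benign, malignant, bins=30):.2f}")
```

A high overlap value for a single channel suggests that simple intensity thresholding cannot separate the two tissue classes, which is consistent with the caption's point that csPCa is hard to discriminate from intensity alone.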
Need for Larger Datasets and Adequate Benchmarks
Modern artificial intelligence (AI) algorithms have paved the way for powerful computer-aided detection and diagnosis (CAD) systems that rival human performance in medical image analysis (Esteva et al., 2017, McKinney et al., 2020). Clinical trials are the gold standard for assessing new medications and interventions in a controlled and comparative manner, and their equivalent for AI algorithms is the international competition, or “grand challenge”. Grand challenges can address the lack of trust, scientific evidence and adequate validation among AI solutions (Leeuwen et al., 2021) by providing the means to compare algorithms against each other in a bias-free manner, using common training and testing data. Prior to the release of this challenge, the only public benchmark for csPCa detection/diagnosis was the ProstateX Challenge, which used a testing set of 140 mpMRI exams to evaluate and compare AI algorithms. However, its small sample size, limited diversity (all cases from the same center and MRI vendor) and weak evaluation format (with publicly available, as opposed to truly “unseen”, testing images) limited the ability to draw definitive conclusions.
The PI-CAI Challenge
PI-CAI (Prostate Imaging: Cancer AI) is an all-new grand challenge, with over 10,000 carefully curated prostate MRI exams to validate modern AI algorithms and estimate radiologists’ performance at csPCa detection and diagnosis. Key aspects of the study design have been established in conjunction with an international, multi-disciplinary scientific advisory board (16 experts in prostate AI, radiology and urology), to unify and standardize present-day guidelines, and to ensure meaningful validation of prostate-AI towards clinical translation (Reinke et al., 2022).
The 2022 edition of PI-CAI will focus on validating AI at automated 3D detection and diagnosis of csPCa in bpMRI.
PI-CAI primarily consists of two sub-studies:
- AI Study (or Grand Challenge): An annotated multi-center, multi-vendor dataset of 1500 bpMRI exams (including their basic clinical and acquisition variables) is made publicly available for all participating teams and the research community at large. Teams can use this dataset to develop AI models, and submit their trained algorithms (in Docker containers) for evaluation. At the end of this open development phase, all algorithms are ranked based on their performance on a hidden testing cohort of 1000 unseen scans. In the closed testing phase, organizers retrain the five top-ranking AI algorithms using a larger dataset of 9107 bpMRI scans (including additional training scans from a private dataset). Finally, their performance is re-evaluated on the hidden testing cohort (with rigorous statistical analyses).
- Reader Study: More than 50 international prostate radiologists perform a reader study using a subset of 400 scans from the hidden testing cohort. For each case, radiologists complete their assessments in two rounds. In the first round, they use basic clinical and acquisition variables plus bpMRI sequences only, enabling head-to-head comparisons against AI trained on the same inputs. In the second round, they use basic clinical and acquisition variables plus full mpMRI sequences, enabling comparisons between AI and current clinical practice (PI-RADS v2.1). Overall, the goal of this study is to estimate the performance of the average radiologist at detection and diagnosis of csPCa in MRI.
In the end, PI-CAI aims to benchmark state-of-the-art AI algorithms developed in the grand challenge against prostate radiologists participating in the reader study, to evaluate the clinical viability of modern prostate-AI solutions at csPCa detection and diagnosis in MRI.
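The selection step between the two phases of the AI study (rank all submissions on the hidden testing cohort, then carry the top five forward for retraining) can be sketched as follows. This is only an illustration of the described procedure: the `Submission` type, the single `score` field, the tie-breaking rule and the team names are hypothetical, and this section does not state which metric the organizers use for ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    team: str
    score: float  # hypothetical metric on the hidden cohort; higher is better


def rank_submissions(submissions):
    """Order submissions from best to worst score (ties broken by team name)."""
    return sorted(submissions, key=lambda s: (-s.score, s.team))


def select_finalists(submissions, k=5):
    """Return the k top-ranked submissions that advance to the closed testing phase."""
    return rank_submissions(submissions)[:k]


# Made-up teams and scores, for illustration only.
open_phase = [
    Submission("team-a", 0.71),
    Submission("team-b", 0.83),
    Submission("team-c", 0.64),
    Submission("team-d", 0.79),
    Submission("team-e", 0.77),
    Submission("team-f", 0.80),
    Submission("team-g", 0.69),
]

for place, sub in enumerate(select_finalists(open_phase), start=1):
    print(place, sub.team, sub.score)
```

Sorting on a tuple of negated score and team name keeps the ordering deterministic when two teams have the same score; a real leaderboard may resolve ties differently.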
Prizes
The top 5 prostate-AI teams (participating in both phases of the challenge) will be invited to join the PI-CAI consortium and, in turn, will be listed as consortium authors on an upcoming high-impact journal paper summarizing the findings of this challenge. Furthermore, they will receive the following cash prizes, as per their ranking:
🥇 1st place: €1000
🥈 2nd place: €500
🥉 3rd place: €250
🏵️ 4th place: €150
🏵️ 5th place: €100