Data Splits 🗃️


Data is sampled into four splits, with the following use-cases:

  • Private/Sequestered Training Dataset (7607 cases):
    Used exclusively by the organizers to retrain the five top-ranking AI algorithms on large-scale data during the Closed Testing Phase.

  • Hidden Testing Cohort (1000 cases):
    Used to determine the top 5 AI algorithms at the end of the Open Development Phase. Also used to benchmark AI algorithms and radiologists, and to test all hypotheses, at the end of the Closed Testing Phase. Includes internal testing data (unseen cases from seen centers) and external testing data (unseen cases from an unseen center). A subset of 400 cases from this cohort is used for the PI-CAI: Reader Study.

Imaging Data 🏥


The complete dataset used for the PI-CAI challenge comprises a cohort of 9000–11,000 prostate MRI exams, curated from three Dutch centers {Radboud University Medical Center (RUMC), Ziekenhuis Groep Twente (ZGT), University Medical Center Groningen (UMCG)} and one Norwegian center {Norwegian University of Science and Technology (NTNU)}. Institutional review boards of all four centers have waived the need for informed patient consent, with respect to the retrospective scientific use of anonymized clinical data in this challenge.
All patient exams are of men suspected of harboring csPCa (e.g. due to elevated levels of PSA, abnormal DRE findings). Patients are included only if they do not have a history of treatment or prior ISUP ≥ 2 findings.
All patient exams include basic clinical variables {patient age, prostate volume, PSA level, PSA density} as reported in their diagnostic reports, basic acquisition variables {scanner manufacturer, scanner model name, diffusion b-value}, and bpMRI scans, acquired using Siemens Healthineers or Philips Medical Systems-based scanners with surface coils. Imaging consists of the following sequences:
  • Axial, sagittal and coronal T2-weighted imaging (T2W).
  • Axial high b-value (≥ 1000 s/mm²) diffusion-weighted imaging (DWI).
  • Axial apparent diffusion coefficient maps (ADC).
⚠️ Absolute intensity values of ADC scans used in the PI-CAI challenge are not universal or clinically meaningful on their own, due to non-standardized acquisition protocols across centers and/or inconsistent image scaling (T.L. Chenevert et al., 2014). This is unlike Hounsfield units (HU) in CT scans, where -1000 HU will always indicate air. Furthermore, PI-RADS v2 recommends that absolute ADC values should be used with caution, as these can vary substantially depending on the value and number of b-values selected, the magnet strength, the vendor, and inter-patient variability (T. Barrett et al., 2015).

For the Public Training and Development Dataset and the Private/Sequestered Training Dataset:
  • Every patient case will have at least three imaging sequences: axial T2W, axial DWI and axial ADC scans (i.e. files ending in _t2w.mha, _hbv.mha, _adc.mha). Additionally, a case can have either, both or neither of these optional imaging sequences: sagittal and coronal T2W scans (i.e. files ending in _sag.mha, _cor.mha). No patient case will include dynamic contrast-enhanced (DCE) sequences.
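
The file-suffix convention above lends itself to a quick completeness check per case. The following is a minimal sketch, not part of the official tooling: the function name `summarize_case` and the example file names are hypothetical, and only the five suffixes come from the text above.

```python
from pathlib import Path

# Suffixes taken from the dataset description.
REQUIRED_SUFFIXES = ("_t2w.mha", "_hbv.mha", "_adc.mha")  # axial T2W, high b-value DWI, ADC
OPTIONAL_SUFFIXES = ("_sag.mha", "_cor.mha")              # sagittal and coronal T2W


def summarize_case(filenames):
    """Report which required and optional sequences are present for one case."""
    names = [Path(f).name for f in filenames]
    present = {
        suffix
        for suffix in REQUIRED_SUFFIXES + OPTIONAL_SUFFIXES
        if any(name.endswith(suffix) for name in names)
    }
    missing_required = [s for s in REQUIRED_SUFFIXES if s not in present]
    optional_present = [s for s in OPTIONAL_SUFFIXES if s in present]
    return {
        "complete": not missing_required,
        "missing_required": missing_required,
        "optional_present": optional_present,
    }


# Hypothetical file names, for illustration only.
example = [
    "case_0001_t2w.mha",
    "case_0001_hbv.mha",
    "case_0001_adc.mha",
    "case_0001_sag.mha",
]
print(summarize_case(example))
```

A data loader built on this check can treat the sagittal and coronal scans as optional inputs rather than assuming they exist for every training case.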

For the Hidden Tuning Cohort and the Hidden Testing Cohort:

  • Every patient case will have exactly five imaging sequences: axial, sagittal and coronal T2W; axial DWI and axial ADC scans (i.e. files ending in _t2w.mha, _sag.mha, _cor.mha, _hbv.mha, _adc.mha). For part of the Hidden Testing Cohort, DCE sequences will be available only to radiologists participating in the PI-CAI: Reader Study. They will not be available to AI algorithms at any stage of this grand challenge.

To dive deeper into the clinical significance of different prostate MRI sequences, and why they are useful for csPCa detection/diagnosis, feel free to have a look at:


Clinical and Scanner Information 🧪


For the Public Training and Development Dataset and the Private/Sequestered Training Dataset:
  • PSA⁰, prostate volume⁰, PSA density⁰, patient age^, MRI scanner manufacturer^, MRI scanner model name^ and diffusion b-value of the high b-value DWI/HBV scan^ will be available to every AI algorithm per case.

For the Hidden Tuning Cohort and the Hidden Testing Cohort:

  • PSA^, prostate volume^¹, PSA density^², patient age^, MRI scanner manufacturer^, MRI scanner model name^ and diffusion b-value of the high b-value DWI/HBV scan^ will be available to every AI algorithm per case.

⁰ available, if value is reported during clinical routine
¹ if value is not reported during clinical routine, it is retrospectively calculated by an expert radiologist
² if value is not reported during clinical routine, it is retrospectively calculated from the PSA and prostate volume
^ always available
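
For training cases where PSA density is missing but PSA and prostate volume are reported, the same retrospective calculation can be reproduced. The sketch below assumes the conventional definition (PSA in ng/mL divided by prostate volume in mL); the units and the function name `psa_density` are assumptions, as the text above does not state them.

```python
def psa_density(psa_ng_per_ml, prostate_volume_ml):
    """Return PSA density (ng/mL per mL), or None if either input is missing.

    Assumes the conventional definition: PSA divided by prostate volume.
    """
    if psa_ng_per_ml is None or prostate_volume_ml is None:
        return None  # cannot be derived; leave the value missing
    if prostate_volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_per_ml / prostate_volume_ml


# Illustrative values only, not taken from the dataset.
print(psa_density(7.5, 50.0))
print(psa_density(None, 50.0))
```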


Image Registration 🎚️


Imaging sequences (T2W, DWI, ADC) for each case in the Public Training and Development Dataset and the Private/Sequestered Training Dataset are not co-registered. Although the vast majority are reasonably well aligned, several cases show substantial deviations. We expect participants to handle this in their algorithm design as they see fit (if deemed necessary). This is meant to incentivize the development of automatic co-registration methods or AI models that are robust to training on misaligned sequences. We believe that this is the only way of developing AI models using thousands of cases, as manual registration is too labour-intensive at scale.
However, we can confirm that all sequences for each case in the Hidden Tuning Cohort and the Hidden Testing Cohort will be co-registered by the organizers (given that we only want to evaluate diagnostic performance, and thereby try to minimize the effects of external factors). Manual registration, when deemed necessary, is performed using ITK-SNAP v3.80 (rigid transformation with six degrees of freedom for 3D translation and rotation).
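
To make the "six degrees of freedom" concrete, a rigid transformation can be written as a 4x4 homogeneous matrix built from three rotation angles and three translations. The sketch below is illustrative only. The rotation order (about x, then y, then z) and the function names are assumptions, and ITK-SNAP's internal parameterization may differ.

```python
import math


def rigid_transform(rx, ry, rz, tx, ty, tz):
    """Build a 4x4 homogeneous rigid transform.

    Rotation angles are in radians and applied about x, then y, then z
    (R = Rz * Ry * Rx), followed by the translation (tx, ty, tz).
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx, tx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx, ty],
        [-sy, cy * sx, cy * cx, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, point):
    """Apply the transform to a 3D point given as (x, y, z)."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix[:3])


# A quarter turn about z followed by a shift of 1 along x.
m = rigid_transform(0.0, 0.0, math.pi / 2, 1.0, 0.0, 0.0)
print(apply(m, (1.0, 0.0, 0.0)))
```

Because the transform contains only rotation and translation, distances between points are preserved, which is what distinguishes it from affine or deformable registration.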


Annotations ✍️


Annotations for the Private/Sequestered Training Dataset, Hidden Tuning Cohort and Hidden Testing Cohort will not be released publicly. Annotations for the Public Training and Development Dataset have been released and are maintained at: github.com/DIAGNijmegen/picai_labels

Human Expert-Derived Annotations
Voxel-level csPCa lesion annotations are delineated and/or patient-level csPCa outcomes are recorded, by one of 10 trained investigators or 1 radiology resident, under supervision of one of 3 expert radiologists, at RUMC, UMCG or NTNU. Each annotation is derived using all available MRI scans, diagnostic reports (radiology, pathology) and whole-mount prostatectomy specimens (if applicable). Lesion delineations are created using ITK-SNAP v3.80.
Out of the 1500 cases shared in the Public Training and Development Dataset, 1075 cases have benign tissue or indolent PCa (i.e. their labels should be empty or full of 0s) and 425 cases have csPCa (i.e. their labels should have lesion blobs of value 2, 3, 4 or 5). Out of these 425 positive cases, only 220 carry an annotation derived by a human expert. The remaining 205 positive cases have not been annotated. In other words, only 17% (220/1295) of the annotations provided in picai_labels/csPCa_lesion_delineations/human_expert should contain csPCa lesion annotations, while the remaining 83% (1075/1295) should be empty. This is intentional, because it is practically infeasible to annotate all lesions at the scale of the Private/Sequestered Training Dataset (7607 cases). Hence, we encourage participants to develop methods that can make use of the non-annotated cases in the Public Training and Development Dataset as well.
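
The case counts above can be cross-checked with a few lines of arithmetic, together with a simple test for whether a label contains a lesion. This is a minimal sketch. The helper `has_cspca_annotation` is hypothetical and works on a flat sequence of voxel values rather than on an actual image file.

```python
LESION_VALUES = (2, 3, 4, 5)  # voxel values marking csPCa lesions, per the description


def has_cspca_annotation(voxel_values):
    """Return True if any voxel carries a csPCa lesion value."""
    return any(v in LESION_VALUES for v in voxel_values)


total_cases = 1500
negative_cases = 1075
positive_cases = total_cases - negative_cases                    # 425
positive_annotated = 220
positive_unannotated = positive_cases - positive_annotated       # 205
human_expert_labels = negative_cases + positive_annotated        # 1295

share_with_lesions = positive_annotated / human_expert_labels
share_empty = negative_cases / human_expert_labels

print(f"{share_with_lesions:.1%} with lesions, {share_empty:.1%} empty")
print(has_cspca_annotation([0, 0, 3, 3, 0]))
```

Note that an empty human-expert label is consistent with a negative case, whereas the 205 non-annotated positive cases are not among the 1295 human-expert annotations counted above.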
Human expert-derived csPCa annotations have been provided for the Public Training and Development Dataset via the picai_labels repo in two formats:


AI-Derived Annotations
At RUMC, we handle non-annotated training cases with a semi-supervised learning strategy (Bosma et al., 2022). Using this method, we have released AI-derived csPCa lesion annotations for all 1500 cases in the Public Training and Development Dataset (picai_labels/csPCa_lesion_delineations/AI). Participants can choose to use these AI-derived annotations for non-annotated training cases, or apply their own methodology instead. In a similar manner (see algorithm), we have also released AI-derived whole-gland segmentations of the prostate for all 1500 cases in the Public Training and Development Dataset:
picai_labels/anatomical_delineations/whole_gland/AI


Reference Standard 🧬


Hidden Tuning and Testing Cohorts
For accurate validation of AI and human-reader performance, and in turn, to substantiate any conclusions derived from PI-CAI, a strong reference standard for csPCa is crucial. The PI-CAI reference standard aims to utilize the best possible evidence to define the ground-truth for every case in the validation and testing cohorts, i.e. histologically-confirmed (ISUP ≥ 2) positives, and histopathology (ISUP ≤ 1) or MRI (PI-RADS ≤ 2) negatives, with follow-up (≥ 3 years), as detailed below:
  • Patients with negative MRI (i.e. benign or carrying PI-RADS 1–2 lesions) generally do not undergo biopsies or RP and lack histologically-confirmed evidence for the absence of csPCa. It is likely that they do not harbor csPCa, but a small percentage (<1% at RUMC; Venderink et al., 2019) can still be missed. To alleviate this, up to 40% of the validation and testing cohorts is composed of multi-center patient data from the 4M cohort (van der Leest et al., 2019), where all patients with negative MRI had received systematic biopsies and subsequent grading was supervised by an expert uropathologist (> 25 years of experience). In other words, by using data from the 4M cohort, we are able to acquire histopathology evidence for a large fraction of the patient population that is encountered, but typically not histologically confirmed, during clinical routine.
  • Biopsies alone can still be prone to undersampling csPCa, especially in the case of smaller lesions (Srivastava et al., 2019). Hence, all negative cases (negative MRI and/or histopathology) in the validation and testing cohorts are confirmed with follow-up data (e.g. using the national Dutch Pathology Registry (PALGA) for centers based in The Netherlands). Negative patient exams found to be positive (via MRI or histopathology) in ≥ 3 years of follow-up were inspected by an expert radiologist for retrospective signs of potentially missed csPCa. If the presence of csPCa can be definitively confirmed, they are included as positive cases; otherwise, they are excluded. Negative patient exams with 100% csPCa diagnosis-free survival (DFS) after at least 3 years are included.

Training Datasets
Patient cases used for the training datasets of PI-CAI are annotated with the same reference standard as used for the ProstateX challenge, i.e. histopathology (ISUP ≥ 2) positives, and histopathology (ISUP ≤ 1) or MRI (PI-RADS ≤ 2) negatives, without follow-up.
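
The training reference standard can be read as a small decision rule. The sketch below is one interpretation, not the organizers' implementation. It assumes that histopathology, when available, takes precedence over MRI, and that cases matching neither criterion are left undetermined (`None`). The function name `training_label` is hypothetical.

```python
def training_label(isup=None, pirads=None):
    """Derive a case-level csPCa label under the stated training criteria.

    isup:   histopathology ISUP grade, or None if no histopathology exists
    pirads: highest PI-RADS score on MRI, or None if not reported
    Returns True (csPCa), False (no csPCa) or None (not covered by the criteria).
    """
    if isup is not None:
        # Histopathology available: ISUP >= 2 is positive, ISUP <= 1 is negative.
        return isup >= 2
    if pirads is not None and pirads <= 2:
        # Negative MRI without histopathology counts as negative.
        return False
    return None  # assumption: such cases fall outside the stated criteria


print(training_label(isup=3, pirads=4))
print(training_label(isup=1, pirads=4))
print(training_label(pirads=2))
```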

Figure. Typical workflow used to establish the ground-truth for each lesion in the hidden validation and testing cohorts. If systematic biopsies (SysBx) were performed in addition to MRI-targeted biopsies (MRBx), then SysBx findings are only used to upgrade the ISUP score, not to downgrade it. If RP is performed, its findings supersede those of any prior histopathology or radiology findings. Cases for which pathology findings cannot be localized on MRI (e.g. MRI-invisible lesions without prostatectomy specimen, SysBx diagnostic reports with ambiguous or missing location information) are excluded.
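
The precedence rules in the caption can be expressed as a short function. This is a hedged sketch of one reading of those rules. The function name `resolve_isup`, the use of `None` for procedures that were not performed, and the fallback to SysBx when no MRBx exists are all assumptions.

```python
def resolve_isup(mrbx=None, sysbx=None, rp=None):
    """Combine per-lesion ISUP findings following the caption's precedence rules.

    mrbx, sysbx, rp: ISUP grade from MRI-targeted biopsy, systematic biopsy and
    radical prostatectomy, or None if that procedure was not performed.
    """
    if rp is not None:
        # Prostatectomy findings supersede any prior histopathology.
        return rp
    grades = [g for g in (mrbx, sysbx) if g is not None]
    # Taking the maximum means SysBx can upgrade the MRBx grade but never downgrade it.
    return max(grades) if grades else None


print(resolve_isup(mrbx=2, sysbx=1))
print(resolve_isup(mrbx=1, sysbx=3))
print(resolve_isup(mrbx=3, sysbx=3, rp=1))
```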