MRI Datasets — Brain & Body Magnetic Resonance Data

MRI scan datasets use magnetic resonance imaging to produce high-resolution, multi-parametric images of soft tissue without ionizing radiation, making them essential for neuroimaging, musculoskeletal, abdominal, and oncologic AI research. Unlike CT, MRI acquires multiple contrasts from the same anatomy, and a complete MRI dataset typically includes several sequences: T1-weighted, T2-weighted, FLAIR, diffusion-weighted imaging (DWI) with apparent diffusion coefficient maps, gradient-echo and susceptibility-weighted imaging, and contrast-enhanced T1 with gadolinium. Functional and advanced techniques, including functional MRI (fMRI), diffusion tensor imaging (DTI), MR angiography, and spectroscopy, extend the modality further.

Data is stored as DICOM or research formats such as NIfTI, with sequence parameters (TR, TE, flip angle, field strength, voxel spacing) recorded in metadata, and is frequently organized following the Brain Imaging Data Structure (BIDS) convention. Brain MRI is the most common focus, supporting models for glioma and metastasis detection and segmentation, multiple sclerosis lesion quantification, stroke characterization, and neurodegenerative disease assessment; body MRI covers prostate, breast, liver, and musculoskeletal applications. Clinically valuable MRI datasets include expert voxel-level segmentation of tumors, lesions, and anatomical structures, radiologist-confirmed diagnoses, and standardized scoring such as PI-RADS for prostate or BI-RADS for breast.

Because signal intensity is not standardized across scanners, high-quality datasets document acquisition parameters and span multiple vendors and field strengths (1.5T and 3T) so models remain robust to domain shift. Rigorous de-identification strips PHI from headers and defaces or skull-strips brain volumes while preserving diagnostic detail. On GetDATA, researchers and medtech companies post MRI requests specifying anatomy, required sequences, annotation type (segmentation, bounding box, or study-level label), label taxonomy, field strength, and minimum case counts, and verified providers fulfill them with compliant, quality-scored MRI data.

Harmonization techniques such as intensity normalization and ComBat are frequently applied so that multi-site cohorts can be pooled without scanner-specific bias, and synthetic or accelerated-acquisition data is increasingly used to augment under-represented sequences and pathologies. Browse the open MRI requests below, or explore related cross-sectional imaging categories.
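As a minimal sketch of the intensity normalization mentioned above, the following shows per-volume z-score scaling in plain Python. The function name `zscore_normalize` and the optional `mask` argument are illustrative assumptions, not part of any GetDATA tooling, and real pipelines usually operate on array libraries rather than lists.

```python
from statistics import fmean, pstdev


def zscore_normalize(intensities, mask=None):
    """Rescale voxel intensities to zero mean and unit variance.

    `mask` is an optional sequence of booleans (hypothetical: e.g. a brain
    mask) selecting the voxels used to estimate the mean and standard deviation.
    """
    if mask is None:
        reference = list(intensities)
    else:
        reference = [v for v, keep in zip(intensities, mask) if keep]
    if not reference:
        raise ValueError("no voxels available to estimate statistics")
    mean = fmean(reference)
    std = pstdev(reference)
    if std == 0:
        raise ValueError("constant intensities cannot be normalised")
    return [(v - mean) / std for v in intensities]
```

Because the output depends only on relative intensities, two volumes that differ by a global scanner gain factor map to the same normalized values. Z-scoring does not remove site effects in derived features, which is where methods such as ComBat are typically applied.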

Open MRI requests

500 cardiac MRI studies with cine function analysis and late gadolinium enhancement for cardiomyopathy classification

Open

We are seeking a high-quality cardiac MRI (CMR) dataset for training and validating deep-learning models for automated biventricular segmentation, ejection fraction estimation, and myocardial fibrosis and scar burden quantification targeting clinical deployment in heart failure and cardiomyopathy care pathways. Required studies must originate from patients with confirmed or clinically suspected cardiomyopathy including dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), ischaemic cardiomyopathy (ICM), arrhythmogenic right ventricular cardiomyopathy (ARVC), or cardiac amyloidosis, as well as age-matched healthy volunteer controls, scanned on 1.5T or 3T scanners using ECG-triggered breath-hold or navigator-gated respiratory-compensated acquisition. Mandatory sequences per study: a short-axis cine balanced steady-state free precession (bSSFP) stack covering the full left and right ventricle from the atrioventricular plane to the apex in contiguous slices (slice thickness 8–10 mm, inter-slice gap 0–2 mm, temporal resolution ≤45 ms per cardiac phase with ≥25 cardiac phases reconstructed per slice, in-plane resolution ≤1.5 × 1.5 mm); long-axis cine views in two-chamber, three-chamber, and four-chamber orientations using identical bSSFP parameters; and phase-sensitive inversion recovery (PSIR) late gadolinium enhancement (LGE) sequences acquired 10–15 minutes following intravenous gadolinium injection at 0.1–0.2 mmol/kg standard extracellular agent, with inversion time (TI) determined individually per patient using a TI scout sequence to null normal myocardium (typically 240–320 ms at 1.5T and 280–360 ms at 3T), co-localised to the short-axis cine stack with matching slice positions. T1 mapping using MOLLI (5(3)3 scheme) or ShMOLLI acquisition both natively pre-contrast and post-contrast at 15 minutes for extracellular volume (ECV) fraction computation is optional but highly desirable and compensated at a 15% per-study premium. 
T2 mapping sequences using T2-prepared bSSFP with at least three echo times are also optional and add myocardial oedema characterisation capability. Delivery must be in DICOM format with complete sequence metadata including inversion time, flip angle, TR, TE, and temporal resolution intact in the DICOM header; NIfTI-2 with JSON sidecars is accepted as an additional format and is required when the institution uses a BIDS-compatible CMR archive. Annotation requirements include endocardial and epicardial contours of the left ventricle (LV) and right ventricle (RV) at end-diastole and end-systole on all short-axis cine slices from base to apex, generated by a trained cardiac sonographer or CMR technologist and verified by a board-certified cardiac radiologist or cardiologist with ≥3 years of CMR reading experience, sufficient to compute LV ejection fraction (LVEF), LV end-diastolic volume (LVEDV), LV end-systolic volume (LVESV), LV myocardial mass, RV end-diastolic volume (RVEDV), and RV ejection fraction (RVEF). Contours must be delivered as segmentation masks in NIfTI format or as polygon vertex coordinates in JSON. LGE burden must be annotated as a binary segmentation mask of LGE-positive scar versus healthy non-enhancing myocardium on the short-axis LGE stack, generated using a combination of semi-automated thresholding (6-SD above remote myocardium) and manual adjudication. LGE spatial pattern must be classified per study as ischaemic (subendocardial or transmural, coronary territory distribution) or non-ischaemic (midwall, epicardial, insertion point RV, or diffuse) to support cardiomyopathy aetiology classification downstream. 
Required clinical metadata: patient age, sex, body surface area, primary diagnosis code, NYHA functional class, LVEF from the clinical CMR report, NT-proBNP or BNP level if available, and presence of implantable cardiac device (pacemaker or ICD); cases with severe susceptibility artefact from device leads obscuring ≥20% of myocardial segments must be excluded from the training split. All imaging data must be fully de-identified per DICOM PS3.15 confidentiality profile with removal of patient name, birth date, accession number, and institution name; any incidentally acquired head MRI slices at the top of the short-axis stack that reveal facial structure must be defaced prior to delivery. This dataset will underpin a regulatory-grade CMR analysis platform targeting cardiologist adoption across European heart failure and cardiac imaging centres, with a planned CE marking submission under MDR 2017/745 as Class IIa medical device software and subsequent FDA 510(k) clearance for the North American market. Scanner vendor diversity across Siemens Magnetom, GE SIGNA, and Philips Ingenia platforms at both 1.5T and 3T is required with no single vendor exceeding 40% of the total study count.
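The volumetric indices named in this request follow from the short-axis contours by slice summation. The sketch below assumes each contour has already been converted to an enclosed area in mm² and that slice spacing equals slice thickness plus inter-slice gap; the function names are illustrative, not a prescribed implementation.

```python
def stack_volume_ml(areas_mm2, slice_thickness_mm, gap_mm=0.0):
    """Sum-of-discs volume: total contour area times slice spacing, in mL."""
    spacing_mm = slice_thickness_mm + gap_mm
    return sum(areas_mm2) * spacing_mm / 1000.0  # 1 mL = 1000 mm^3


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Example: ten slices, 8 mm thick with a 2 mm gap (values are made up).
lvedv = stack_volume_ml([1000.0] * 10, 8.0, 2.0)
lvesv = stack_volume_ml([400.0] * 10, 8.0, 2.0)
lvef = ejection_fraction(lvedv, lvesv)
```

The same two functions apply to the right ventricle when given RV endocardial areas.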

Medical imaging, MRI, DICOM, JSON, NIfTI-2
0 / 500 scans (0%)

1,000 lumbar spine MRI studies annotated for disc herniation, foraminal stenosis, and Modic changes

Open

We are assembling a comprehensive lumbar spine MRI dataset to train radiomics and deep-learning models for automated grading of degenerative disc disease, disc herniation morphology classification, and neural foraminal stenosis severity assessment. Required cases include adult patients aged 25–80 years presenting with low back pain, radiculopathy, neurogenic claudication, or pre-operative surgical evaluation, scanned on 1.5T or 3T systems using a posterior phased-array spine coil. Field strength of 3T is preferred for its superior soft tissue contrast and improved nerve root visualisation within the neural foramen, but well-performed 1.5T studies with adequate SNR are fully acceptable. Mandatory sequences per study: sagittal T1-weighted with TR 400–700 ms, TE 10–15 ms, slice thickness ≤4 mm, FOV 250–300 mm, and matrix ≥256×256; sagittal T2-weighted with TR 3000–5000 ms, TE 80–120 ms and matching slice geometry to T1 for direct co-registration; and axial T2-weighted images acquired at L3-4, L4-5, and L5-S1 disc levels with slice thickness ≤4 mm and in-plane resolution ≤0.6 × 0.6 mm. Optional sequences valued for enhanced scientific utility include sagittal STIR for bone marrow oedema and inflammatory endplate disease; sagittal T2 fat-suppressed for epidural and ligamentous pathology; and post-gadolinium T1-weighted sagittal and axial sequences for post-operative epidural fibrosis assessment in revision surgery cases. DICOM is the required primary delivery format with all sequence parameters preserved in the DICOM header; NIfTI with JSON acquisition sidecars containing TR, TE, flip angle, slice thickness, and field strength is accepted as an equivalent alternative for institutions using BIDS-structured archives. 
Structured annotations must be provided per intervertebral disc level from L1-2 through L5-S1 covering the following elements: Pfirrmann degeneration grade I through V on sagittal T2 based on nucleus pulposus signal intensity and disc height; disc herniation morphology classified as none, annular bulge, focal protrusion, broad-based protrusion, extrusion with or without cranial or caudal migration, or sequestration using the 2014 NASS nomenclature; herniation zone classified as central, right or left paracentral, right or left foraminal, or far lateral; Modic endplate change type 0, I, II, or III bilaterally at each level; and neural foraminal stenosis grade 0 through 3 bilaterally at each level based on obliteration of the perineural fat signal. All per-level structured annotations must be delivered as JSON objects keyed by disc level and patient study identifier. Segmentation masks of herniated disc material on the axial T2 slices at the most affected level are optional and compensated at a 20% premium per annotated level. Bounding boxes around the primary herniation site on the most diagnostically informative axial slice are required for all extrusion and sequestration cases. Coronal reformations for scoliosis measurement are optional. De-identification must encompass removal of all 18 HIPAA-defined identifiers including patient name, date of birth, medical record number, accession number, device serial number, and geographic subdivisions smaller than state; scan dates must be shifted by a uniform per-patient random integer offset in the range of 1–365 days to prevent date-based triangulation. The intended clinical application is a decision support tool for spine surgeons, pain management physicians, and physiatrists that automatically generates structured disc-level reports from raw lumbar MRI input, reducing inter-radiologist reporting variability and turnaround time in high-volume practices. 
Post-operative cases with metallic fusion hardware may be included only if the artefact does not obscure the annotated disc levels and must be flagged with implant type in metadata.
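One way to implement the per-patient date shift described in this request is sketched below. It assumes offsets are drawn once per patient from 1–365 days and cached so every study for that patient moves by the same amount; shifting dates earlier rather than later is an arbitrary choice here, and the class name `DateShifter` is hypothetical.

```python
import secrets
from datetime import date, timedelta


class DateShifter:
    """Apply one random day offset per patient to all of that patient's dates."""

    def __init__(self, min_days=1, max_days=365):
        self._min = min_days
        self._max = max_days
        self._offsets = {}  # patient_id -> offset in days; keep this table private

    def offset_for(self, patient_id):
        if patient_id not in self._offsets:
            span = self._max - self._min + 1
            self._offsets[patient_id] = self._min + secrets.randbelow(span)
        return self._offsets[patient_id]

    def shift(self, patient_id, scan_date):
        # Assumption: dates are moved into the past by the patient's offset.
        return scan_date - timedelta(days=self.offset_for(patient_id))


shifter = DateShifter()
first = shifter.shift("P001", date(2023, 1, 10))
second = shifter.shift("P001", date(2023, 3, 1))
```

Because both dates move by the same offset, the interval between a patient's studies is preserved, which matters for longitudinal or post-operative follow-up cases.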

Medical imaging, MRI, DICOM, JSON, NIfTI
0 / 1000 scans (0%)

600 prostate multiparametric MRI studies with PI-RADS v2.1 scores and lesion segmentation

Open

We are requesting a curated cohort of prostate multiparametric MRI (mpMRI) examinations to develop an AI-assisted detection and characterisation tool for clinically significant prostate cancer (csPCa, defined as Gleason grade group ≥2, equivalent to Gleason score ≥3+4=7). Each study must be acquired on a 3T scanner using a pelvic phased-array surface coil with ≥16 elements, or alternatively an endorectal coil with external pelvic array, and must include all three standard mpMRI components as specified in the PI-RADS v2.1 technical guidelines. T2-weighted imaging (T2WI) must be acquired in axial, coronal, and sagittal planes with slice thickness ≤3 mm, in-plane resolution ≤0.4 × 0.4 mm, TR ≥3000 ms, and TE 100–120 ms. Diffusion-weighted imaging (DWI) must include a minimum of b-values 0, 500, and 1000 s/mm², plus a computed or directly acquired high-b image at b=1400–2000 s/mm², and the corresponding apparent diffusion coefficient (ADC) map generated from the b0 and b1000 images. ADC maps are the dominant DWI parameter for peripheral zone scoring under PI-RADS v2.1 and must therefore be computed without signal-to-noise smoothing artefacts. Dynamic contrast-enhanced (DCE) imaging with intravenous gadolinium-based contrast agent (standard extracellular agent at 0.1 mmol/kg) must have temporal resolution ≤15 seconds per volume and total acquisition duration ≥5 minutes post-injection. MR spectroscopy (MRS) is optional and will be included as bonus data. Primary delivery format is DICOM with unmodified pixel data and sequence headers intact; NIfTI-2 with JSON metadata sidecars is also accepted and preferred for institutions already running BIDS-compatible research workflows. Each case must be annotated by a radiologist who reads ≥100 prostate mpMRI studies per year and is trained in PI-RADS v2.1 scoring. A PI-RADS score from 1 to 5 must be assigned per lesion with the dominant sequence scoring stated explicitly. 
For all PI-RADS 3–5 index lesions, a 3D segmentation mask delineated on the axial T2WI is required; a co-registered segmentation mask on the ADC map is strongly preferred to enable model training on both sequences simultaneously. Whole-gland prostate segmentation and zonal anatomy segmentation delineating the transition zone and peripheral zone are optional but will be purchased at a premium per case. Where available, MRI-TRUS fusion biopsy results including Gleason grade group and biopsy core location mapped to the sector model, or radical prostatectomy whole-mount pathology with sector correlation, should be provided as structured JSON or CSV metadata linked to the MRI lesion annotation. Required clinical fields include patient age, PSA level at time of MRI, PSA density (PSA divided by prostate volume), prostate volume on MRI, and prior prostate biopsy history. All data must be fully de-identified per GDPR Article 89 research exemption requirements and the DICOM PS3.15 confidentiality profile. Quality exclusion criteria include severe motion artefact on DWI, endorectal coil failure causing anterior gland signal loss, and b-value miscalculation producing erroneous ADC values outside the physiological range of 500–2000 µm²/s. This dataset will serve as training and validation data for a prostate cancer detection algorithm targeting radiologist workflow integration across European urology and radiology centres, with intended CE marking under MDR 2017/745 as Class IIa medical device software. Cases from patients with prior treatment including external beam radiotherapy, brachytherapy, HIFU, or focal laser ablation should be flagged; they remain eligible and are scientifically valuable for post-treatment recurrence detection model development.
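The two derived quantities in this request, ADC from the b0 and b1000 images and PSA density, can be sketched as follows. This assumes the standard two-point mono-exponential model S(b) = S0·exp(−b·ADC); the function names and the per-value range check are illustrative and not a required QA procedure.

```python
import math


def adc_um2_per_s(signal_b_low, signal_b_high, b_low=0.0, b_high=1000.0):
    """Two-point ADC estimate in µm²/s from signals at two b-values (s/mm²)."""
    if signal_b_low <= 0 or signal_b_high <= 0:
        raise ValueError("signals must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    adc_mm2_per_s = math.log(signal_b_low / signal_b_high) / (b_high - b_low)
    return adc_mm2_per_s * 1e6  # 1 mm²/s = 1e6 µm²/s


def adc_in_expected_range(adc, low=500.0, high=2000.0):
    """Check an ADC value against the 500–2000 µm²/s range quoted in the request."""
    return low <= adc <= high


def psa_density(psa_ng_per_ml, prostate_volume_ml):
    """PSA density: serum PSA divided by MRI-derived prostate volume."""
    if prostate_volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_per_ml / prostate_volume_ml
```

A value far outside the expected range, for example from a mislabelled b-value, is a cue to re-check the DWI headers before the ADC map is used.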

Medical imaging, MRI, DICOM, JSON, NIfTI-2
0 / 600 scans (0%)

800 knee MRI studies with radiologist annotations for ACL, meniscus, and cartilage pathology

Open

We are compiling a musculoskeletal MRI dataset focused on the knee joint to train and validate computer-aided detection models for common sports-medicine and orthopaedic pathologies. Required cases must include patients aged 16–70 presenting with acute or chronic knee symptoms such as instability, locking, or persistent pain, scanned on 1.5T or 3T systems using a dedicated transmit-receive or receive-only knee coil with ≥8 elements. Mandatory sequences per study: sagittal proton density fat-suppressed (PDFS) with slice thickness ≤3 mm, in-plane resolution ≤0.4 × 0.4 mm, TR 2500–4000 ms, and TE 30–40 ms; coronal PDFS with matching in-plane resolution; and axial PDFS or T2 fat-saturated (fat-sat) with slice thickness ≤3 mm. Preferred field strength is 3T, which provides superior cartilage signal-to-noise ratio and spatial resolution compared with 1.5T systems. Optional but highly valued sequences include 3D DESS (dual echo steady state) or 3D MEDIC for quantitative cartilage morphometry and T2 relaxation mapping, and sagittal T2-weighted without fat saturation for bone marrow oedema assessment and subchondral bone characterisation. T2 mapping with multi-echo spin echo acquisition (echo times 10, 20, 30, 40, 50, 60 ms) is desirable for cartilage matrix assessment and will be purchased at a 15% premium per study. Preferred delivery format is DICOM series with original pixel data and unmodified DICOM headers; PAR/REC format from Philips scanners with corresponding XML headers is also accepted. JSON sidecars with sequence parameters including TR, TE, flip angle, bandwidth, and reconstruction matrix are requested for all cases to enable automated sequence classification. 
Annotation requirements: each case must carry a structured radiology report or structured JSON label file documenting the status of the following anatomical structures — anterior cruciate ligament (ACL: intact, partial tear, or complete tear with retraction measurement in mm), posterior cruciate ligament (PCL: intact or torn), medial meniscus body, anterior horn, and posterior horn (each graded 0–III by signal intensity and tear morphology), lateral meniscus with equivalent grading, medial and lateral compartment articular cartilage (MOAKS score optional but encouraged), and presence of joint effusion with volume estimate where available. Bounding boxes around the primary lesion site on the most diagnostically informative slice are required for all ACL tear-positive and meniscal tear-positive cases. Segmentation masks of the ACL, medial meniscus, and lateral meniscus are optional but will be compensated at a 25% premium over the base case rate. All data must be de-identified following HIPAA Safe Harbor standard with removal of patient name, date of birth, accession number, device serial number, and all other of the 18 specified identifiers; scan acquisition date must be shifted by a consistent per-patient random offset to prevent re-identification via date triangulation. The resulting dataset will be used to develop and independently validate a deep-learning second-reader tool for knee MRI interpretation intended for deployment in community and teleradiology practices where subspecialty musculoskeletal radiologist access is limited. Cases with post-operative hardware artefacts from prior ACL reconstruction or partial meniscectomy must be excluded unless explicitly flagged in metadata. Mixed pathology cases combining ACL tear with concurrent meniscal tear or chondral defect are particularly desirable as training examples for multi-label pathology detection models.
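For the optional multi-echo T2 mapping series, a per-voxel T2 value is commonly estimated by fitting a mono-exponential decay. The sketch below uses a log-linear least-squares fit of S(TE) = S0·exp(−TE/T2); it assumes noise-free positive signals, and the function name `fit_t2_ms` is hypothetical. Production pipelines often use non-linear or noise-aware fitting instead.

```python
import math


def fit_t2_ms(echo_times_ms, signals):
    """Return (T2 in ms, S0) from a log-linear fit of a mono-exponential decay."""
    if len(echo_times_ms) != len(signals) or len(signals) < 2:
        raise ValueError("need at least two matching echo times and signals")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take logarithms")
    n = len(signals)
    logs = [math.log(s) for s in signals]
    mean_te = sum(echo_times_ms) / n
    mean_log = sum(logs) / n
    sxx = sum((te - mean_te) ** 2 for te in echo_times_ms)
    if sxx == 0:
        raise ValueError("echo times must not all be equal")
    sxy = sum((te - mean_te) * (y - mean_log)
              for te, y in zip(echo_times_ms, logs))
    slope = sxy / sxx  # slope of ln(S) against TE equals -1/T2
    if slope >= 0:
        raise ValueError("signal does not decay with echo time")
    t2_ms = -1.0 / slope
    s0 = math.exp(mean_log - slope * mean_te)
    return t2_ms, s0


# Synthetic check with the echo times listed in the request (T2 = 40 ms is made up).
tes = [10, 20, 30, 40, 50, 60]
synthetic = [1200.0 * math.exp(-te / 40.0) for te in tes]
t2_est, s0_est = fit_t2_ms(tes, synthetic)
```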

Medical imaging, MRI, DICOM, JSON, PAR/REC
0 / 800 scans (0%)

2,000 brain FLAIR MRI scans with expert MS lesion segmentation for relapsing-remitting multiple sclerosis

Open

Our research group is building a large-scale benchmark dataset for automated multiple sclerosis (MS) lesion detection, segmentation, and longitudinal volume tracking. We require brain MRI examinations from patients with clinically confirmed relapsing-remitting MS (RRMS) or clinically isolated syndrome (CIS), acquired on either 1.5T or 3T scanners using a standardised protocol wherever site-specific constraints allow. Each study must include at minimum a 3D FLAIR sequence with slice thickness ≤1.5 mm, TR 9000–11000 ms, TE 120–140 ms, and TI 2500 ms, and a 3D T1-weighted MPRAGE or SPGR sequence with TI 900 ms and 1 mm isotropic resolution. Additional sequences such as proton density-weighted (PDw), T2-weighted, and double inversion recovery (DIR) are strongly encouraged and will receive positive weighting during case selection. Magnetisation transfer ratio (MTR) sequences and diffusion tensor imaging (DTI) with fractional anisotropy maps are optional but add significant scientific value and will be purchased at a per-sequence premium. Raw DICOM files are the preferred primary format; NIfTI conversion with corresponding JSON sidecar files containing acquisition metadata including field strength, TR, TE, TI, flip angle, and scanner manufacturer is equally acceptable and facilitates automated quality assurance pipelines. All data must be fully anonymised per the DICOM PS3.15 confidentiality profile and additionally defaced using PyDeface or FreeSurfer mri_deface to eliminate any residual facial surface reconstruction risk. Segmentation masks of T2 white matter lesions must be provided as binary or multi-label NIfTI volumes, delineated on the FLAIR sequence by a neuroradiologist with ≥3 years of dedicated MS-specific reading experience. Lesion-level topographic attributes including periventricular, juxtacortical, infratentorial, and spinal cord location should be encoded in accompanying JSON metadata per the 2017 McDonald criteria framework. 
Whole-brain and lesion-filling T1-hypointense black hole masks are optional but compensated separately. Baseline clinical metadata required per patient: age at scan date, sex, disease duration in years, EDSS score, current disease-modifying therapy (DMT) class, and whether gadolinium-enhancing lesions were identified on the corresponding post-contrast T1 sequence. QA exclusion criteria include severe motion artefact (ghosting grade ≥2), Gibbs ringing affecting lesion boundaries, field-of-view cropping cutting the cortex, and signal dropout in the posterior fossa that impairs infratentorial lesion detection. Scanner vendor diversity across Siemens, Philips, and GE platforms is required with no single vendor exceeding 50% of the total cohort. The dataset will power supervised and semi-supervised learning models for automated lesion segmentation pipelines integrated into MRI post-processing platforms used by academic MS centres globally. Longitudinal acquisition pairs from the same patient captured ≥6 months apart are particularly valuable for training change-detection and lesion evolution algorithms and will command a 30% per-case premium over cross-sectional studies.
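The vendor-diversity constraint in this request (no single vendor above 50% of the cohort) is straightforward to check before submission. The sketch below assumes one manufacturer string per study, for example taken from the DICOM header or JSON sidecar; the function names are illustrative.

```python
from collections import Counter


def vendor_shares(vendors):
    """Fraction of studies contributed by each scanner vendor."""
    counts = Counter(vendors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cohort is empty")
    return {vendor: count / total for vendor, count in counts.items()}


def vendors_over_cap(vendors, cap=0.50):
    """Vendors whose share strictly exceeds the cap, sorted by name."""
    shares = vendor_shares(vendors)
    return sorted(v for v, share in shares.items() if share > cap)


# Made-up cohort: Siemens contributes 60% and therefore breaches a 50% cap.
cohort = ["Siemens"] * 6 + ["Philips"] * 3 + ["GE"] * 1
flagged = vendors_over_cap(cohort)
```

Passing a different `cap` reuses the same check for requests with other limits.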

Medical imaging, MRI, DICOM, JSON, NIfTI
0 / 2000 scans (0%)

1,500 multiparametric brain MRI studies with BraTS-style glioma segmentation masks

Open

We are seeking a large cohort of multiparametric brain MRI examinations from adult patients (18+) with histologically confirmed glioma (WHO grades II–IV), including glioblastoma multiforme. Each study must include at minimum four MRI sequences acquired on a 3T scanner: pre-contrast T1-weighted (T1w), post-contrast T1-weighted with gadolinium enhancement (T1Gd), T2-weighted (T2w), and T2-FLAIR. Preferred acquisition parameters include slice thickness ≤1.5 mm for volumetric sequences, TR 2000–2500 ms and TE 20–30 ms for T2w, and isotropic or near-isotropic 1 mm voxel resolution for T1w MPRAGE. Flip angle for T1w is typically 9°; inversion time (TI) for MPRAGE is 900–1100 ms. All sequences must be acquired on the same scanner in a single session where possible to minimise inter-sequence registration error. Data must be delivered in NIfTI format after DICOM-to-NIfTI conversion, accompanied by the original volumetric DICOM series with intact headers so that acquisition metadata including manufacturer, model, field strength, and sequence name can be reviewed retrospectively. All volumes must be skull-stripped and defaced using tools such as FreeSurfer mri_deface or FSL BET to prevent facial reconstruction and patient re-identification, in full compliance with HIPAA Safe Harbor de-identification and GDPR pseudonymisation standards under Article 89 research exemptions. Annotation requirements follow the BraTS challenge protocol: three sub-region segmentation masks per case, namely enhancing tumor (ET), tumor core (TC), and whole tumor (WT), provided as integer-labeled NIfTI files co-registered to the T1Gd reference volume using affine or non-linear registration. Masks must be generated or verified by a board-certified neuroradiologist or neurosurgeon with at least five years of neuro-oncology reading experience. 
Inter-annotator agreement scores measured by Dice coefficient must reach ≥0.80 on at least a random 10% holdout subset, with cases below threshold re-adjudicated by a senior radiologist. Clinical metadata should include age, sex, WHO tumor grade, IDH mutation status if available, MGMT promoter methylation status, and ECOG performance score where recorded in the patient record. This dataset will be used to train and benchmark deep-learning segmentation models for surgical planning support tools and treatment response and progression monitoring applications. We particularly need representation from scanner vendors including Siemens Healthineers, GE Healthcare, Philips, and Canon to maximise model generalisability across heterogeneous clinical environments. Longitudinal studies capturing pre-operative, post-operative, and follow-up timepoints are highly desirable and will be compensated at a premium rate per case. Institutions contributing multi-site data from geographically diverse centres outside North America are especially encouraged to apply, as global demographic and scanner diversity is a key requirement for regulatory-grade AI model development.
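The Dice-based agreement check in this request can be sketched as below. It assumes the two annotators' masks for one sub-region have been flattened to equal-length sequences of 0/1 values; treating two empty masks as perfect agreement is a convention chosen here, not something the request specifies.

```python
def dice(mask_a, mask_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumption: two empty masks count as full agreement
    return 2.0 * overlap / (size_a + size_b)


def needs_readjudication(mask_a, mask_b, threshold=0.80):
    """True when agreement falls below the threshold quoted in the request."""
    return dice(mask_a, mask_b) < threshold
```

Running the check separately for ET, TC, and WT shows which sub-region drives a disagreement, since small enhancing regions tend to score lower than the whole-tumor mask.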

Medical imaging, MRI, DICOM, NIfTI
0 / 1500 scans (0%)

Related categories