Open marketplace

Browse Open Medical Data Requests

32 open requests from researchers and companies. No account needed to browse; sign in to fulfill.

500 cardiac MRI studies with cine function analysis and late gadolinium enhancement for cardiomyopathy classification

We are seeking a high-quality cardiac MRI (CMR) dataset for training and validating deep-learning models for automated biventricular segmentation, ejection fraction estimation, and myocardial fibrosis and scar burden quantification targeting clinical deployment in heart failure and cardiomyopathy care pathways. Required studies must originate from patients with confirmed or clinically suspected cardiomyopathy including dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), ischaemic cardiomyopathy (ICM), arrhythmogenic right ventricular cardiomyopathy (ARVC), or cardiac amyloidosis, as well as age-matched healthy volunteer controls, scanned on 1.5T or 3T scanners using ECG-triggered breath-hold or navigator-gated respiratory-compensated acquisition. Mandatory sequences per study: a short-axis cine balanced steady-state free precession (bSSFP) stack covering the full left and right ventricle from the atrioventricular plane to the apex in contiguous slices (slice thickness 8–10 mm, inter-slice gap 0–2 mm, temporal resolution ≤45 ms per cardiac phase with ≥25 cardiac phases reconstructed per slice, in-plane resolution ≤1.5 × 1.5 mm); long-axis cine views in two-chamber, three-chamber, and four-chamber orientations using identical bSSFP parameters; and phase-sensitive inversion recovery (PSIR) late gadolinium enhancement (LGE) sequences acquired 10–15 minutes following intravenous gadolinium injection at 0.1–0.2 mmol/kg standard extracellular agent, with inversion time (TI) determined individually per patient using a TI scout sequence to null normal myocardium (typically 240–320 ms at 1.5T and 280–360 ms at 3T), co-localised to the short-axis cine stack with matching slice positions. T1 mapping using MOLLI (5(3)3 scheme) or ShMOLLI acquisition both natively pre-contrast and post-contrast at 15 minutes for extracellular volume (ECV) fraction computation is optional but highly desirable and compensated at a 15% per-study premium. 
T2 mapping sequences using T2-prepared bSSFP with at least three echo times are also optional and add myocardial oedema characterisation capability. Delivery must be in DICOM format with complete sequence metadata including inversion time, flip angle, TR, TE, and temporal resolution intact in the DICOM header; NIfTI-2 with JSON sidecars is accepted as an additional format and is required when the institution uses a BIDS-compatible CMR archive. Annotation requirements include endocardial and epicardial contours of the left ventricle (LV) and right ventricle (RV) at end-diastole and end-systole on all short-axis cine slices from base to apex, generated by a trained cardiac sonographer or CMR technologist and verified by a board-certified cardiac radiologist or cardiologist with ≥3 years of CMR reading experience, sufficient to compute LV ejection fraction (LVEF), LV end-diastolic volume (LVEDV), LV end-systolic volume (LVESV), LV myocardial mass, RV end-diastolic volume (RVEDV), and RV ejection fraction (RVEF). Contours must be delivered as segmentation masks in NIfTI format or as polygon vertex coordinates in JSON. LGE burden must be annotated as a binary segmentation mask of LGE-positive scar versus healthy non-enhancing myocardium on the short-axis LGE stack, generated using a combination of semi-automated thresholding (6-SD above remote myocardium) and manual adjudication. LGE spatial pattern must be classified per study as ischaemic (subendocardial or transmural, coronary territory distribution) or non-ischaemic (midwall, epicardial, insertion point RV, or diffuse) to support cardiomyopathy aetiology classification downstream. 
Required clinical metadata: patient age, sex, body surface area, primary diagnosis code, NYHA functional class, LVEF from the clinical CMR report, NT-proBNP or BNP level if available, and presence of implantable cardiac device (pacemaker or ICD); cases with severe susceptibility artefact from device leads obscuring ≥20% of myocardial segments must be excluded from the training split. All imaging data must be fully de-identified per DICOM PS3.15 confidentiality profile with removal of patient name, birth date, accession number, and institution name; any incidentally acquired head MRI slices at the top of the short-axis stack that reveal facial structure must be defaced prior to delivery. This dataset will underpin a regulatory-grade CMR analysis platform targeting cardiologist adoption across European heart failure and cardiac imaging centres, with a planned CE marking submission under MDR 2017/745 as Class IIa medical device software and subsequent FDA 510(k) clearance for the North American market. Scanner vendor diversity across Siemens Magnetom, GE SIGNA, and Philips Ingenia platforms at both 1.5T and 3T is required with no single vendor exceeding 40% of the total study count.
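As an illustration only, the 40% vendor cap stated above could be checked before delivery with a sketch like the following. The function names and the one-vendor-label-per-study input are assumptions made for this example, not part of the request.

```python
from collections import Counter

def vendor_shares(vendors):
    """Return each vendor's fraction of the total study count."""
    counts = Counter(vendors)
    total = sum(counts.values())
    return {vendor: n / total for vendor, n in counts.items()}

def vendors_over_cap(vendors, cap=0.40):
    """List vendors whose share of studies strictly exceeds the cap."""
    return sorted(v for v, share in vendor_shares(vendors).items() if share > cap)

# Hypothetical 500-study cohort: one vendor label per study.
cohort = ["Siemens"] * 200 + ["GE"] * 180 + ["Philips"] * 120
print(vendors_over_cap(cohort))                     # []
print(vendors_over_cap(cohort + ["Siemens"] * 50))  # ['Siemens']
```

A share of exactly 40% passes here because the request says "exceeding"; a stricter reading would change `>` to `>=`.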

Medical imaging · MRI · DICOM · JSON · NIfTI-2

0 / 500 scans (0%)

1,000 lumbar spine MRI studies annotated for disc herniation, foraminal stenosis, and Modic changes

Open

We are assembling a comprehensive lumbar spine MRI dataset to train radiomics and deep-learning models for automated grading of degenerative disc disease, disc herniation morphology classification, and neural foraminal stenosis severity assessment. Required cases include adult patients aged 25–80 years presenting with low back pain, radiculopathy, neurogenic claudication, or pre-operative surgical evaluation, scanned on 1.5T or 3T systems using a posterior phased-array spine coil. Field strength of 3T is preferred for its superior soft tissue contrast and improved nerve root visualisation within the neural foramen, but well-performed 1.5T studies with adequate SNR are fully acceptable. Mandatory sequences per study: sagittal T1-weighted with TR 400–700 ms, TE 10–15 ms, slice thickness ≤4 mm, FOV 250–300 mm, and matrix ≥256×256; sagittal T2-weighted with TR 3000–5000 ms, TE 80–120 ms and matching slice geometry to T1 for direct co-registration; and axial T2-weighted images acquired at L3-4, L4-5, and L5-S1 disc levels with slice thickness ≤4 mm and in-plane resolution ≤0.6 × 0.6 mm. Optional sequences valued for enhanced scientific utility include sagittal STIR for bone marrow oedema and inflammatory endplate disease; sagittal T2 fat-suppressed for epidural and ligamentous pathology; and post-gadolinium T1-weighted sagittal and axial sequences for post-operative epidural fibrosis assessment in revision surgery cases. DICOM is the required primary delivery format with all sequence parameters preserved in the DICOM header; NIfTI with JSON acquisition sidecars containing TR, TE, flip angle, slice thickness, and field strength is accepted as an equivalent alternative for institutions using BIDS-structured archives. 
Structured annotations must be provided per intervertebral disc level from L1-2 through L5-S1 covering the following elements: Pfirrmann degeneration grade I through V on sagittal T2 based on nucleus pulposus signal intensity and disc height; disc herniation morphology classified as none, annular bulge, focal protrusion, broad-based protrusion, extrusion with or without cranial or caudal migration, or sequestration using the 2014 NASS nomenclature; herniation zone classified as central, right or left paracentral, right or left foraminal, or far lateral; Modic endplate change type 0, I, II, or III bilaterally at each level; and neural foraminal stenosis grade 0 through 3 bilaterally at each level based on obliteration of the perineural fat signal. All per-level structured annotations must be delivered as JSON objects keyed by disc level and patient study identifier. Segmentation masks of herniated disc material on the axial T2 slices at the most affected level are optional and compensated at a 20% premium per annotated level. Bounding boxes around the primary herniation site on the most diagnostically informative axial slice are required for all extrusion and sequestration cases. Coronal reformations for scoliosis measurement are optional. De-identification must encompass removal of all 18 HIPAA-defined identifiers including patient name, date of birth, medical record number, accession number, device serial number, and geographic subdivisions smaller than state; scan dates must be shifted by a uniform per-patient random integer offset in the range of 1–365 days to prevent date-based triangulation. The intended clinical application is a decision support tool for spine surgeons, pain management physicians, and physiatrists that automatically generates structured disc-level reports from raw lumbar MRI input, reducing inter-radiologist reporting variability and turnaround time in high-volume practices. 
Post-operative cases with metallic fusion hardware may be included only if the artefact does not obscure the annotated disc levels and must be flagged with implant type in metadata.
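The per-patient date shift described above can be sketched as follows. This is a minimal example under stated assumptions: the helper names are hypothetical, the shift direction (backwards) is not specified in the request, and a fixed seed is used only to make the example repeatable.

```python
import random
from datetime import date, timedelta

def assign_offsets(patient_ids, seed=None):
    """Draw one integer offset in [1, 365] days per patient."""
    rng = random.Random(seed)
    return {pid: rng.randint(1, 365) for pid in patient_ids}

def shift_scan_date(scan_date, offset_days):
    """Shift a scan date backwards by the patient's offset (direction is an assumption)."""
    return scan_date - timedelta(days=offset_days)

offsets = assign_offsets(["P001", "P002"], seed=7)
baseline = shift_scan_date(date(2023, 3, 1), offsets["P001"])
follow_up = shift_scan_date(date(2023, 9, 1), offsets["P001"])
print((follow_up - baseline).days)  # 184, same as the unshifted interval
```

Because every scan from one patient uses the same offset, intervals between that patient's studies are preserved while absolute dates are not.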

Medical imaging · MRI · DICOM · JSON · NIfTI

0 / 1000 scans (0%)

600 prostate multiparametric MRI studies with PI-RADS v2.1 scores and lesion segmentation

Open

We are requesting a curated cohort of prostate multiparametric MRI (mpMRI) examinations to develop an AI-assisted detection and characterisation tool for clinically significant prostate cancer (csPCa, defined as Gleason grade group ≥2, equivalent to Gleason score ≥3+4=7). Each study must be acquired on a 3T scanner using a pelvic phased-array surface coil with ≥16 elements, or alternatively an endorectal coil with external pelvic array, and must include all three standard mpMRI components as specified in the PI-RADS v2.1 technical guidelines. T2-weighted imaging (T2WI) must be acquired in axial, coronal, and sagittal planes with slice thickness ≤3 mm, in-plane resolution ≤0.4 × 0.4 mm, TR ≥3000 ms, and TE 100–120 ms. Diffusion-weighted imaging (DWI) must include a minimum of b-values 0, 500, and 1000 s/mm², plus a computed or directly acquired high-b image at b=1400–2000 s/mm², and the corresponding apparent diffusion coefficient (ADC) map generated from the b0 and b1000 images. ADC maps are the dominant DWI parameter for peripheral zone scoring under PI-RADS v2.1 and must therefore be computed without signal-to-noise smoothing artefacts. Dynamic contrast-enhanced (DCE) imaging with intravenous gadolinium-based contrast agent (standard extracellular agent at 0.1 mmol/kg) must have temporal resolution ≤15 seconds per volume and total acquisition duration ≥5 minutes post-injection. MR spectroscopy (MRS) is optional and will be included as bonus data. Primary delivery format is DICOM with unmodified pixel data and sequence headers intact; NIfTI-2 with JSON metadata sidecars is also accepted and preferred for institutions already running BIDS-compatible research workflows. Each case must be annotated by a radiologist who reads ≥100 prostate mpMRI studies per year and is trained in PI-RADS v2.1 scoring. A PI-RADS score from 1 to 5 must be assigned per lesion with the dominant sequence scoring stated explicitly. 
For all PI-RADS 3–5 index lesions, a 3D segmentation mask delineated on the axial T2WI is required; a co-registered segmentation mask on the ADC map is strongly preferred to enable model training on both sequences simultaneously. Whole-gland prostate segmentation and zonal anatomy segmentation delineating the transition zone and peripheral zone are optional but will be purchased at a premium per case. Where available, MRI-TRUS fusion biopsy results including Gleason grade group and biopsy core location mapped to the sector model, or radical prostatectomy whole-mount pathology with sector correlation, should be provided as structured JSON or CSV metadata linked to the MRI lesion annotation. Required clinical fields include patient age, PSA level at time of MRI, PSA density (PSA divided by prostate volume), prostate volume on MRI, and prior prostate biopsy history. All data must be fully de-identified and defaced per GDPR Article 89 research exemption requirements and the DICOM PS3.15 confidentiality profile. Quality exclusion criteria include severe motion artefact on DWI, endorectal coil failure causing anterior gland signal loss, and b-value miscalculation producing erroneous ADC values outside the physiological range of 500–2000 µm²/s. This dataset will serve as training and validation data for a prostate cancer detection algorithm targeting radiologist workflow integration across European urology and radiology centres, with intended CE marking under MDR 2017/745 as a Class IIa medical device software. Cases from patients with prior treatment including external beam radiotherapy, brachytherapy, HIFU, or focal laser ablation should be flagged but are still included and scientifically valuable for post-treatment recurrence detection model development.
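For reference, a two-point ADC calculation from the b0 and b1000 signals, with the 500–2000 µm²/s range check from the exclusion criteria, might look like the sketch below. It assumes a mono-exponential signal model and per-voxel scalar inputs; the function names are hypothetical and this is not any scanner vendor's implementation.

```python
import math

def adc_um2_per_s(s_low, s_high, b_low=0.0, b_high=1000.0):
    """Two-point mono-exponential ADC in µm²/s; b-values are in s/mm²."""
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    adc_mm2_per_s = math.log(s_low / s_high) / (b_high - b_low)
    return adc_mm2_per_s * 1e6  # 1 mm²/s equals 1e6 µm²/s

def in_expected_range(adc, low=500.0, high=2000.0):
    """Apply the 500-2000 µm²/s range named in the exclusion criteria."""
    return low <= adc <= high

# Hypothetical voxel whose signal decays by a factor of e between b0 and b1000.
adc = adc_um2_per_s(1000.0, 1000.0 * math.exp(-1.0))
print(round(adc), in_expected_range(adc))  # 1000 True
```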

Medical imaging · MRI · DICOM · JSON · NIfTI-2

0 / 600 scans (0%)

800 knee MRI studies with radiologist annotations for ACL, meniscus, and cartilage pathology

Open

We are compiling a musculoskeletal MRI dataset focused on the knee joint to train and validate computer-aided detection models for common sports-medicine and orthopaedic pathologies. Required cases must include patients aged 16–70 presenting with acute or chronic knee symptoms such as instability, locking, or persistent pain, scanned on 1.5T or 3T systems using a dedicated transmit-receive or receive-only knee coil with ≥8 elements. Mandatory sequences per study: sagittal proton density fat-suppressed (PDFS) with slice thickness ≤3 mm, in-plane resolution ≤0.4 × 0.4 mm, TR 2500–4000 ms, and TE 30–40 ms; coronal PDFS with matching in-plane resolution; and axial PDFS or T2 fat-saturated (fat-sat) with slice thickness ≤3 mm. Preferred field strength is 3T, which provides superior cartilage signal-to-noise ratio and spatial resolution compared with 1.5T systems. Optional but highly valued sequences include 3D DESS (dual echo steady state) or 3D MEDIC for quantitative cartilage morphometry and T2 relaxation mapping, and sagittal T2-weighted without fat saturation for bone marrow oedema assessment and subchondral bone characterisation. T2 mapping with multi-echo spin echo acquisition (echo times 10, 20, 30, 40, 50, 60 ms) is desirable for cartilage matrix assessment and will be purchased at a 15% premium per study. Preferred delivery format is DICOM series with original pixel data and unmodified DICOM headers; PAR/REC format from Philips scanners with corresponding XML headers is also accepted. JSON sidecars with sequence parameters including TR, TE, flip angle, bandwidth, and reconstruction matrix are requested for all cases to enable automated sequence classification. 
Annotation requirements: each case must carry a structured radiology report or structured JSON label file documenting the status of the following anatomical structures — anterior cruciate ligament (ACL: intact, partial tear, or complete tear with retraction measurement in mm), posterior cruciate ligament (PCL: intact or torn), medial meniscus body, anterior horn, and posterior horn (each graded 0–III by signal intensity and tear morphology), lateral meniscus with equivalent grading, medial and lateral compartment articular cartilage (MOAKS score optional but encouraged), and presence of joint effusion with volume estimate where available. Bounding boxes around the primary lesion site on the most diagnostically informative slice are required for all ACL tear-positive and meniscal tear-positive cases. Segmentation masks of the ACL, medial meniscus, and lateral meniscus are optional but will be compensated at a 25% premium over the base case rate. All data must be de-identified following HIPAA Safe Harbor standard with removal of patient name, date of birth, accession number, device serial number, and all other of the 18 specified identifiers; scan acquisition date must be shifted by a consistent per-patient random offset to prevent re-identification via date triangulation. The resulting dataset will be used to develop and independently validate a deep-learning second-reader tool for knee MRI interpretation intended for deployment in community and teleradiology practices where subspecialty musculoskeletal radiologist access is limited. Cases with post-operative hardware artefacts from prior ACL reconstruction or partial meniscectomy must be excluded unless explicitly flagged in metadata. Mixed pathology cases combining ACL tear with concurrent meniscal tear or chondral defect are particularly desirable as training examples for multi-label pathology detection models.
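One possible way to check a structured JSON label file against the annotation rules above is sketched here. The key names (`acl`, `medial_meniscus`, `bounding_box`, and so on) are a hypothetical schema invented for this example, and treating meniscal grade III as tear-positive is an assumption.

```python
ACL_STATUSES = {"intact", "partial_tear", "complete_tear"}
MENISCUS_PARTS = ("body", "anterior_horn", "posterior_horn")

def validate_knee_label(label):
    """Return a list of problems found in one per-study label dict."""
    problems = []
    acl_status = label.get("acl", {}).get("status")
    if acl_status not in ACL_STATUSES:
        problems.append("acl.status must be intact, partial_tear, or complete_tear")
    tear_positive = acl_status in {"partial_tear", "complete_tear"}
    for side in ("medial_meniscus", "lateral_meniscus"):
        grades = label.get(side, {})
        for part in MENISCUS_PARTS:
            grade = grades.get(part)
            if grade not in (0, 1, 2, 3):
                problems.append(f"{side}.{part} must be a grade from 0 to 3")
            elif grade == 3:
                tear_positive = True  # assumption: grade III counts as a tear
    if tear_positive and not label.get("bounding_box"):
        problems.append("bounding_box is required for tear-positive cases")
    return problems

label = {
    "acl": {"status": "complete_tear", "retraction_mm": 12},
    "medial_meniscus": {"body": 0, "anterior_horn": 0, "posterior_horn": 3},
    "lateral_meniscus": {"body": 0, "anterior_horn": 0, "posterior_horn": 0},
}
print(validate_knee_label(label))  # ['bounding_box is required for tear-positive cases']
```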

Medical imaging · MRI · DICOM · JSON · PAR/REC

0 / 800 scans (0%)

2,000 brain FLAIR MRI scans with expert MS lesion segmentation for relapsing-remitting multiple sclerosis

Open

Our research group is building a large-scale benchmark dataset for automated multiple sclerosis (MS) lesion detection, segmentation, and longitudinal volume tracking. We require brain MRI examinations from patients with clinically confirmed relapsing-remitting MS (RRMS) or clinically isolated syndrome (CIS), acquired on either 1.5T or 3T scanners using a standardised protocol wherever site-specific constraints allow. Each study must include at minimum a 3D FLAIR sequence with slice thickness ≤1.5 mm, TR 9000–11000 ms, TE 120–140 ms, and TI 2500 ms, and a 3D T1-weighted MPRAGE or SPGR sequence with TI 900 ms and 1 mm isotropic resolution. Additional sequences such as proton density-weighted (PDw), T2-weighted, and double inversion recovery (DIR) are strongly encouraged and will receive positive weighting during case selection. Magnetisation transfer ratio (MTR) sequences and diffusion tensor imaging (DTI) with fractional anisotropy maps are optional but add significant scientific value and will be purchased at a per-sequence premium. Raw DICOM files are the preferred primary format; NIfTI conversion with corresponding JSON sidecar files containing acquisition metadata including field strength, TR, TE, TI, flip angle, and scanner manufacturer is equally acceptable and facilitates automated quality assurance pipelines. All data must be fully anonymised per DICOM standard PS3.15 Profile and additionally defaced using PyDeface or MRI Deface to eliminate any residual facial surface reconstruction risk. Segmentation masks of T2 white matter lesions must be provided as binary or multi-label NIfTI volumes, delineated on the FLAIR sequence by a neuroradiologist with ≥3 years of dedicated MS-specific reading experience. Lesion-level topographic attributes including periventricular, juxtacortical, infratentorial, and spinal cord location should be encoded in accompanying JSON metadata per the 2017 McDonald criteria framework.
Whole-brain and lesion-filling T1-hypointense black hole masks are optional but compensated separately. Baseline clinical metadata required per patient: age at scan date, sex, disease duration in years, EDSS score, current disease-modifying therapy (DMT) class, and whether gadolinium-enhancing lesions were identified on the corresponding post-contrast T1 sequence. QA exclusion criteria include severe motion artefact (ghosting grade ≥2), Gibbs ringing affecting lesion boundaries, field-of-view cropping cutting the cortex, and signal dropout in the posterior fossa that impairs infratentorial lesion detection. Scanner vendor diversity across Siemens, Philips, and GE platforms is required with no single vendor exceeding 50% of the total cohort. The dataset will power supervised and semi-supervised learning models for automated lesion segmentation pipelines integrated into MRI post-processing platforms used by academic MS centres globally. Longitudinal acquisition pairs from the same patient captured ≥6 months apart are particularly valuable for training change-detection and lesion evolution algorithms and will command a 30% per-case premium over cross-sectional studies.
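To illustrate the longitudinal-pair condition, the sketch below lists scan pairs from one patient that are at least six months apart. Approximating "6 months" as 183 days is an assumption, as are the function name and the input of plain date objects.

```python
from datetime import date
from itertools import combinations

MIN_GAP_DAYS = 183  # assumption: "6 months" approximated as 183 days

def longitudinal_pairs(scan_dates, min_gap_days=MIN_GAP_DAYS):
    """Return (earlier, later) date pairs separated by at least min_gap_days."""
    ordered = sorted(scan_dates)
    return [(a, b) for a, b in combinations(ordered, 2)
            if (b - a).days >= min_gap_days]

# Hypothetical scan dates for a single patient.
scans = [date(2022, 1, 10), date(2022, 4, 10), date(2022, 8, 15)]
for earlier, later in longitudinal_pairs(scans):
    print(earlier, later, (later - earlier).days)  # 2022-01-10 2022-08-15 217
```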

Medical imaging · MRI · DICOM · JSON · NIfTI

0 / 2000 scans (0%)

1,500 multiparametric brain MRI studies with BraTS-style glioma segmentation masks

Open

We are seeking a large cohort of multiparametric brain MRI examinations from adult patients (18+) with histologically confirmed glioma (WHO grades II–IV), including glioblastoma multiforme. Each study must include at minimum four MRI sequences acquired on a 3T scanner: pre-contrast T1-weighted (T1w), post-contrast T1-weighted with gadolinium enhancement (T1Gd), T2-weighted (T2w), and T2-FLAIR. Preferred acquisition parameters include slice thickness ≤1.5 mm for volumetric sequences, TR 2000–2500 ms and TE 20–30 ms for T2w, and isotropic or near-isotropic 1 mm voxel resolution for T1w MPRAGE. Flip angle for T1w is typically 9°; inversion time (TI) for MPRAGE is 900–1100 ms. All sequences must be acquired on the same scanner in a single session where possible to minimise inter-sequence registration error. Data must be delivered in NIfTI format after DICOM-to-NIfTI conversion; original DICOM files are also welcomed as a secondary format. Volumetric DICOM series with intact DICOM headers are required to allow retrospective review of acquisition metadata including manufacturer, model, field strength, and sequence name. All volumes must be skull-stripped and defaced using tools such as FreeSurfer mri_deface or FSL BET to prevent facial reconstruction and patient re-identification, in full compliance with HIPAA Safe Harbor de-identification and GDPR pseudonymisation standards under Article 89 research exemptions. Annotation requirements follow the BraTS challenge protocol: three sub-region segmentation masks per case — enhancing tumor (ET), tumor core (TC), and whole tumor (WT) — provided as integer-labeled NIfTI files co-registered to the T1Gd reference volume using affine or non-linear registration. Masks must be generated or verified by a board-certified neuroradiologist or neurosurgeon with at minimum five years of neuro-oncology reading experience. 
Inter-annotator agreement scores measured by Dice coefficient must reach ≥0.80 on at least a random 10% holdout subset, with cases below threshold re-adjudicated by a senior radiologist. Clinical metadata should include age, sex, WHO tumor grade, IDH mutation status if available, MGMT promoter methylation status, and ECOG performance score where recorded in the patient record. This dataset will be used to train and benchmark deep-learning segmentation models for surgical planning support tools and treatment response and progression monitoring applications. We particularly need representation from scanner vendors including Siemens Healthineers, GE Healthcare, Philips, and Canon to maximise model generalisability across heterogeneous clinical environments. Longitudinal studies capturing pre-operative, post-operative, and follow-up timepoints are highly desirable and will be compensated at a premium rate per case. Institutions contributing multi-site data from geographically diverse centres outside North America are especially encouraged to apply, as global demographic and scanner diversity is a key requirement for regulatory-grade AI model development.
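The Dice-based agreement check above can be illustrated with a small sketch. It assumes binary masks flattened to equal-length 0/1 sequences and treats two empty masks as perfect agreement; both choices, and the function names, are assumptions for this example.

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks given as equal-length 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    if size_a + size_b == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2 * overlap / (size_a + size_b)

def cases_to_readjudicate(scores, threshold=0.80):
    """Return case IDs whose inter-annotator Dice falls below the threshold."""
    return sorted(case for case, score in scores.items() if score < threshold)

reader_1 = [0, 1, 1, 1, 0, 0]
reader_2 = [0, 1, 1, 0, 1, 0]
print(round(dice(reader_1, reader_2), 2))                     # 0.67
print(cases_to_readjudicate({"case_01": 0.91, "case_02": 0.74}))  # ['case_02']
```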

Medical imaging · MRI · DICOM · NIfTI

0 / 1500 scans (0%)

10,000 echocardiogram cine loops with view labels and chamber segmentation masks for foundation model pre-training

Open

We are pre-training a large cardiac ultrasound foundation model intended to serve as a general-purpose feature extractor for a broad range of downstream echocardiography AI tasks, including LVEF regression, valvular disease severity grading, diastolic function classification, and structural congenital anomaly detection. Diversity of acquisition views, patient demographics, scanner vendors, image quality levels, and disease states is the primary data requirement; this dataset is explicitly designed to span the full real-world distribution of clinical echo data rather than a curated high-quality subset, ensuring robust representation learning across the full spectrum of clinical practice. Required acquisitions cover all standard transthoracic echocardiography views: apical 4-chamber (A4C), apical 2-chamber (A2C), and apical 3-chamber (A3C); parasternal long-axis (PLAX); parasternal short-axis at aortic valve level (PSAX-AV), mitral valve level (PSAX-MV), and papillary muscle level (PSAX-PM); subcostal 4-chamber (SC4C) and subcostal inferior vena cava (SCIVS); and suprasternal notch (SSN). B-mode cine loops are the primary modality. Color Doppler overlays, pulsed-wave Doppler spectral tracings, continuous-wave Doppler recordings, and M-mode sweeps through the left ventricle and mitral valve are also accepted and must be labelled by modality. Cine loop duration may range from one to ten cardiac cycles; single-frame still images without temporal context are excluded from this request. DICOM format is mandatory to preserve acquisition metadata embedded in standard tags — imaging depth, transducer frequency, mechanical index, scanner manufacturer and model, gain and time-gain compensation settings — all of which will serve as auxiliary conditioning inputs during foundation model pre-training. 
De-identification must comply with the DICOM PS3.15 Annex E Basic Application Level Confidentiality Profile, with explicit removal of patient name, date of birth, institution name, referring physician, device serial number, and any burned-in annotation text overlaying pixel data. A per-batch de-identification certificate confirming the method and software version used is required. GDPR-compliant pseudonymisation is acceptable for European institutions in lieu of full anonymisation, provided a subject pseudonym key is retained securely by the contributing institution and never transferred. Annotation requirements are intentionally lightweight to enable dataset scale: each cine loop requires only a view-classification label drawn from the controlled vocabulary (A4C, A2C, A3C, PLAX, PSAX-AV, PSAX-MV, PSAX-PM, SC4C, SCIVS, SSN, or Other) and a sonographer-assigned image quality score on a three-point scale (1 poor, 2 adequate, 3 good). For a 20% random stratified subsample of 2,000 studies, pixel-level segmentation masks of the left ventricle endocardium and epicardium, right ventricle endocardium, and left atrial endocardium at end-diastole are required to enable supervised fine-tuning experiments in parallel with self-supervised pre-training. Anatomical keypoints — medial and lateral mitral annular hinge points and the LV apex — are requested for the same 2,000-study subsample to facilitate alignment-based data augmentation. Scanner vendor diversity targets: minimum 2,000 studies each from GE, Philips, Siemens, and Canon platforms, with remaining studies from other vendors or mixed sources.
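A 20% stratified subsample of the kind described above could be drawn roughly as follows. The request does not say which variable defines the strata, so stratifying by a per-study label such as vendor is an assumption here, as are the function name and the fixed seed.

```python
import random
from collections import defaultdict

def stratified_subsample(study_strata, fraction=0.20, seed=0):
    """study_strata maps study ID to stratum label; returns the chosen IDs, sorted."""
    by_stratum = defaultdict(list)
    for study_id, stratum in study_strata.items():
        by_stratum[stratum].append(study_id)
    rng = random.Random(seed)
    chosen = []
    for stratum in sorted(by_stratum):
        ids = sorted(by_stratum[stratum])
        k = round(len(ids) * fraction)  # per-stratum quota
        chosen.extend(rng.sample(ids, k))
    return sorted(chosen)

# Hypothetical 100-study batch with three vendor strata (50, 30, and 20 studies).
batch = {f"S{i:03d}": "GE" for i in range(50)}
batch.update({f"S{i:03d}": "Philips" for i in range(50, 80)})
batch.update({f"S{i:03d}": "Canon" for i in range(80, 100)})
subset = stratified_subsample(batch)
print(len(subset))  # 20
```

Per-stratum rounding means the total can differ slightly from exactly 20% when stratum sizes are not multiples of five.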

Medical imaging · Ultrasound · DICOM · JSON · PNG / JPG

0 / 10000 scans (0%)

1,800 stress echocardiography paired studies (rest and peak stress) for ischemia classification

Open

Stress echocardiography remains a cornerstone of non-invasive ischemia assessment, yet visual wall-motion scoring is highly operator-dependent and shows significant inter-reader variability even among experienced cardiologists. We are developing an automated wall-motion abnormality detection system trained on paired rest-and-peak-stress cine-loop acquisitions, targeting sensitivity and specificity benchmarks comparable to Level III expert readers for detecting hemodynamically significant coronary artery disease across all three major coronary territories. Each study pair must include resting and peak-stress cine-loop acquisitions — obtained via exercise treadmill, upright cycle ergometer, or pharmacological dobutamine infusion protocol with or without atropine augmentation — in at least four standard views: apical 4-chamber, apical 2-chamber, parasternal long-axis, and parasternal short-axis at the mid-papillary muscle level. Apical 3-chamber views are requested where acquired. Side-by-side quad-screen DICOM files in the standard stress-echo cine display format — rest and stress loops displayed simultaneously at matched cardiac cycles — are acceptable and preferred, as they reflect the real-world reporting workflow and facilitate direct comparison learning. Frame rates must be sufficient to resolve individual cardiac phases at elevated heart rates, requiring a minimum of 50 fps at peak stress and 25 fps at rest. Second harmonic B-mode imaging is required; studies acquired with ultrasound contrast agent (UCA) — specifically SonoVue/Lumason or Definity/Luminity — are explicitly welcomed alongside non-contrast acquisitions and must be flagged in the metadata with contrast agent name and dose administered. Doppler tissue imaging (DTI) of the mitral annulus at rest is requested as a supplementary acquisition where available, providing diastolic functional context alongside ischemia assessment. 
Mandatory structured clinical labels include: stress protocol type, peak heart rate achieved, percentage of age-predicted maximum heart rate, Duke Treadmill Score where applicable, rate-pressure product at peak stress, wall-motion score index (WMSI) at rest and at peak stress per the ASE 17-segment left ventricular model, and the overall study conclusion classified as normal, inducible ischemia, fixed scar, or non-diagnostic. Per-segment wall-motion labels at rest and peak stress — normokinesis, hypokinesis, akinesis, or dyskinesis — are required for all 1,800 study pairs, structured as JSON arrays indexed to ASE segment numbering. Invasive coronary angiography correlation data, including percentage stenosis per vessel and Syntax score where available within 12 months of the stress study, should be linked pseudonymously to the imaging data to provide ground-truth coronary anatomy labels for ischemia territory mapping. QA exclusion criteria: studies with suboptimal image quality precluding wall-motion assessment in more than two of the 17 segments at peak stress must be excluded or clearly flagged as non-diagnostic to prevent label noise during model training.
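As a rough sketch, the wall-motion score index can be derived from the per-segment labels using the conventional scoring of 1 for normokinesis through 4 for dyskinesis. Representing a non-assessable segment as `None`, returning `None` for a non-diagnostic study, and applying the two-segment QA rule to any stage rather than only peak stress are assumptions of this example.

```python
SEGMENT_SCORES = {"normokinesis": 1, "hypokinesis": 2, "akinesis": 3, "dyskinesis": 4}

def wall_motion_score_index(segments):
    """segments: 17 labels in segment order; None marks a non-assessable segment."""
    if len(segments) != 17:
        raise ValueError("expected 17 segment labels")
    scored = [SEGMENT_SCORES[s] for s in segments if s is not None]
    if len(segments) - len(scored) > 2:
        return None  # more than two segments unreadable: treat as non-diagnostic
    return sum(scored) / len(scored)

rest = ["normokinesis"] * 17
peak = ["normokinesis"] * 13 + ["hypokinesis"] * 3 + ["akinesis"]
print(wall_motion_score_index(rest))            # 1.0
print(round(wall_motion_score_index(peak), 2))  # 1.29
```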

Medical imaging · Ultrasound · DICOM · JSON

0 / 1800 scans (0%)

4,000 pediatric and congenital heart disease echocardiogram studies for structural anomaly detection

Open

Congenital heart disease (CHD) affects approximately 1% of live births and represents one of the most diagnostically challenging domains in clinical ultrasound, where accurate and timely diagnosis is critical for surgical planning and outcomes. We are constructing a multi-label classification model capable of identifying common structural anomalies, namely ventricular septal defect (VSD), atrial septal defect (ASD), tetralogy of Fallot, transposition of the great arteries, hypoplastic left heart syndrome, and coarctation of the aorta, directly from neonatal and pediatric transthoracic echocardiography (TTE) cine loops without requiring manual feature extraction. Required acquisitions span the complete standard pediatric echo protocol: subcostal 4-chamber and short-axis views for septal integrity assessment, apical 4-chamber view, parasternal long-axis and short-axis views at multiple levels including the great vessels, mitral valve annulus, and papillary muscles, and suprasternal notch views for aortic arch and ductal anatomy assessment. Color Doppler overlays demonstrating shunt flow direction and velocity, outflow tract obstruction, or valvular regurgitation jets are strongly requested for each structurally abnormal study and are mandatory for VSD, ASD, and outflow tract lesions. Frame rates of at least 30 fps are required; neonatal studies at higher frame rates of 60–80 fps are welcomed and preferred. All cine loops must be delivered as DICOM files with the original pixel data fully intact and without lossy recompression. Clinical labels must include the confirmed CHD diagnosis or a normal label, age at acquisition in months rather than exact date of birth to protect identity, biological sex, and age category coded as neonate under 1 month, infant 1–12 months, child 1–12 years, or adolescent 13–18 years.
Segmentation masks of all four cardiac chambers and great vessel origins — aortic root, pulmonary trunk — are required for at least 1,000 studies to support anatomical landmark learning and chamber volumetry in small hearts. Studies should be accompanied by structured echocardiographic report summaries in plain text where available, with all personal identifying information stripped prior to transfer. Doppler-derived hemodynamic measurements including peak VSD jet velocity, estimated right ventricular pressure, and pulmonary-to-systemic flow ratio (Qp:Qs) should be included as structured JSON labels where clinically measured. Inclusion of longitudinal follow-up studies from the same patient, pseudonymously linked by a consistent de-identified subject ID, is highly valuable for disease progression and post-operative remodelling research. Studies from multiple institutions across diverse geographic regions are preferred to capture variation in patient ethnicity, altitude-related physiology, and institutional scanning protocols. De-identification must comply with DICOM PS3.15 Annex E and must include removal of all burned-in annotation text overlaying pixel data.
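The four-band coding (neonate, infant, child, adolescent) and the "where clinically measured" rule for hemodynamic labels could be applied along the following lines. This is a sketch only: the listing's bands overlap at 12 months, so the boundaries here (12 months counts as child, adolescent runs up to the 19th birthday) are assumptions, and the function and key names are hypothetical rather than a schema defined by the requester.

```python
def age_category(age_months):
    """Map age at acquisition (months) to the listing's four bands.

    Boundary handling is an assumption; confirm it with the requester.
    """
    if age_months < 0:
        raise ValueError("age cannot be negative")
    if age_months < 1:
        return "neonate"
    if age_months < 12:
        return "infant"
    if age_months < 13 * 12:
        return "child"
    if age_months < 19 * 12:
        return "adolescent"
    raise ValueError("age outside the pediatric range of this request")

def hemodynamic_labels(peak_vsd_jet_velocity_m_s=None,
                       estimated_rv_pressure_mmhg=None,
                       qp_qs=None):
    """Keep only the measurements that were actually taken."""
    candidates = {
        "peak_vsd_jet_velocity_m_s": peak_vsd_jet_velocity_m_s,
        "estimated_rv_pressure_mmhg": estimated_rv_pressure_mmhg,
        "qp_qs": qp_qs,
    }
    return {key: value for key, value in candidates.items() if value is not None}

# Hypothetical 6-month-old with only a shunt ratio measured.
example_label = {
    "age_months": 6,
    "age_category": age_category(6),
    "hemodynamics": hemodynamic_labels(qp_qs=1.8),
}
```

Omitting unmeasured values, rather than writing zeros or placeholders, keeps absent measurements distinguishable from genuine readings during training.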

Medical imaging · Ultrasound · DICOM · JSON

0 / 4000 scans (0%)

2,500 speckle-tracking echocardiography studies with global longitudinal strain values

Open

Global longitudinal strain (GLS) derived from speckle-tracking echocardiography (STE) is an emerging biomarker for subclinical left ventricular dysfunction, cardiotoxicity monitoring in oncology patients, and early cardiomyopathy detection before overt systolic impairment develops. We are building a regression model that predicts GLS directly from standard B-mode apical cine loops, eliminating the dependency on proprietary vendor speckle-tracking software and enabling GLS estimation at sites without dedicated post-processing workstations. We require high-quality B-mode cine-loop acquisitions from the apical 4-chamber, apical 2-chamber, and apical 3-chamber (apical long-axis, A3C) views, captured at a frame rate of at least 50 frames per second (ideally 50–80 fps) to ensure adequate speckle coherence and tracking stability throughout the cardiac cycle. Spatial resolution should be 600×800 pixels or higher. Each study should provide at least five consecutive cardiac cycles free from respiratory motion artefact and with consistent probe position. DICOM files with uncompressed or losslessly compressed pixel data are mandatory; lossy JPEG compression must not be applied, as it irreversibly degrades the high-frequency speckle patterns that are critical for accurate myocardial tracking and strain computation. The mandatory label for each study is the GLS value expressed as a negative percentage (for example −18.5%) computed by the acquiring institution using its validated STE software platform (EchoPAC, TOMTEC 2D Cardiac Performance Analysis, or an equivalent vendor-validated tool), with the software name and version number recorded in the accompanying JSON metadata sidecar file. Segmental longitudinal strain values for all 18 ASE myocardial segments are requested where available to support regional dysfunction mapping.
Segmentation masks of the myocardial wall delineating both the endocardial and epicardial borders at end-diastole are requested for a minimum of 500 studies to support geometric normalisation and wall-thickness estimation experiments. Oncology patients undergoing anthracycline chemotherapy or trastuzumab (Herceptin) therapy represent a particularly valuable subpopulation for cardiotoxicity surveillance applications; institutions are encouraged to flag such cases with a treatment-context label (drug class, cumulative dose, and number of cycles completed) while fully preserving patient anonymity in compliance with HIPAA and GDPR requirements. Baseline and follow-up studies from the same pseudonymised patient are highly sought. Demographic balance across age bands (30–49, 50–69, 70+), biological sex, and underlying cardiomyopathy aetiology (ischaemic, dilated, hypertrophic, normal) should be targeted to ensure model generalisability.
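Before upload, a contributor might sanity-check each JSON metadata sidecar against the labelling rules above. The sketch below assumes a hypothetical sidecar layout; the key names `gls_percent`, `ste_software_name`, `ste_software_version`, and `segmental_strain_percent` are illustrative and not defined by this request.

```python
def validate_gls_sidecar(sidecar):
    """Return a list of problems found in one sidecar dict (empty if none)."""
    problems = []

    gls = sidecar.get("gls_percent")
    if isinstance(gls, bool) or not isinstance(gls, (int, float)):
        problems.append("gls_percent missing or not numeric")
    elif gls >= 0:
        problems.append("gls_percent must be a negative percentage")

    # The listing requires the STE software name and version to be recorded.
    for key in ("ste_software_name", "ste_software_version"):
        if not sidecar.get(key):
            problems.append(f"{key} missing")

    # Segmental values are optional, but if present must cover all 18 segments.
    segmental = sidecar.get("segmental_strain_percent")
    if segmental is not None and len(segmental) != 18:
        problems.append("segmental_strain_percent must have 18 entries")

    return problems

# Hypothetical sidecars for illustration.
good_sidecar = {
    "gls_percent": -18.5,
    "ste_software_name": "EchoPAC",
    "ste_software_version": "unknown-example-version",
}
bad_sidecar = {"gls_percent": 18.5, "ste_software_name": "EchoPAC"}

good_problems = validate_gls_sidecar(good_sidecar)
bad_problems = validate_gls_sidecar(bad_sidecar)
```

The sign check catches the most common labelling slip, where GLS is exported as a magnitude (18.5) rather than the negative percentage the request asks for.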

Medical imaging · Ultrasound · DICOM · JSON

0 / 2500 scans (0%)

3,000 Doppler echocardiography studies for aortic stenosis and mitral regurgitation grading

Open

This data request supports the training and external validation of a multimodal classification network designed to grade the severity of left-sided valvular heart disease — specifically aortic stenosis (AS) and mitral regurgitation (MR) — directly from raw echocardiographic cine loops combined with spectral and color Doppler frames. Accurate automated grading would reduce inter-reader variability and accelerate triage in high-volume cardiology laboratories. Required acquisitions include parasternal long-axis (PLAX) and parasternal short-axis (PSAX) cine loops at the level of the aortic valve, continuous-wave (CW) Doppler tracings across the aortic valve, and color Doppler overlays of the mitral valve from the apical 4-chamber view. Pulsed-wave (PW) Doppler recordings at the left ventricular outflow tract (LVOT) are also required to enable computation of the dimensionless velocity index and aortic valve area by the continuity equation. Frame rate for B-mode cine loops should be at least 30 fps; Doppler sweeps should capture a minimum of three consecutive cardiac cycles at a standard sweep speed of 100 mm/s. DICOM format is mandatory for all modalities so that embedded Doppler velocity scale metadata, depth setting, and Nyquist limit can be extracted programmatically during preprocessing. Each study must be accompanied by a clinical label indicating AS severity categorized as none, mild, moderate, or severe per the 2014 AHA/ACC guideline criteria — specifically mean gradient, peak velocity, and aortic valve area — and MR grade categorized as none, mild, moderate, or severe per effective regurgitant orifice area (EROA) quantification or qualitative color jet area assessment. Studies with concurrent moderate-to-severe tricuspid regurgitation should be flagged to support multi-label classification experiments. Pulmonary artery systolic pressure estimated from peak tricuspid regurgitation velocity is requested as a supplementary hemodynamic label. 
Hospitals are encouraged to include studies spanning the full severity spectrum; a roughly balanced distribution across severity grades is preferred, with a minimum of 150 studies per severity class per valve disease. Inclusion of serial studies from patients following transcatheter aortic valve replacement (TAVR) or surgical mitral valve repair provides longitudinal value and should be pseudonymously linked. Full de-identification compliant with DICOM PS3.15 Annex E Basic Application Level Confidentiality Profile, including removal of institution name, referring physician, and device serial number tags, is required before any transfer.
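The derived quantities mentioned in this request (aortic valve area by the continuity equation, the dimensionless velocity index, and pulmonary artery systolic pressure from peak tricuspid regurgitation velocity) can be sketched as below. The formulas are the standard textbook forms; the function names are hypothetical, the right atrial pressure is an input the contributor must supply, and the velocity-only grading is a deliberate simplification: the guideline criteria also weigh mean gradient and valve area, and discordant cases need clinical adjudication.

```python
import math

def dimensionless_velocity_index(vti_lvot_cm, vti_av_cm):
    """Ratio of LVOT (PW Doppler) to aortic valve (CW Doppler) velocity-time integrals."""
    return vti_lvot_cm / vti_av_cm

def aortic_valve_area_cm2(lvot_diameter_cm, vti_lvot_cm, vti_av_cm):
    """Continuity equation: AVA = LVOT area x VTI_LVOT / VTI_AV."""
    lvot_area = math.pi * (lvot_diameter_cm / 2) ** 2
    return lvot_area * vti_lvot_cm / vti_av_cm

def pasp_mmhg(peak_tr_velocity_m_s, right_atrial_pressure_mmhg):
    """Simplified Bernoulli: 4 * v^2 plus the assumed right atrial pressure."""
    return 4 * peak_tr_velocity_m_s ** 2 + right_atrial_pressure_mmhg

def as_grade_from_peak_velocity(peak_velocity_m_s):
    """Velocity-only grade using commonly quoted cut-offs (2, 3, and 4 m/s)."""
    if peak_velocity_m_s >= 4.0:
        return "severe"
    if peak_velocity_m_s >= 3.0:
        return "moderate"
    if peak_velocity_m_s >= 2.0:
        return "mild"
    return "none"

# Hypothetical measurements for illustration.
dvi = dimensionless_velocity_index(vti_lvot_cm=20.0, vti_av_cm=80.0)
ava = aortic_valve_area_cm2(lvot_diameter_cm=2.0, vti_lvot_cm=20.0, vti_av_cm=80.0)
pasp = pasp_mmhg(peak_tr_velocity_m_s=3.0, right_atrial_pressure_mmhg=5.0)
grade = as_grade_from_peak_velocity(4.2)
```

Because the LVOT diameter is squared in the area term, small measurement errors there dominate the uncertainty in the computed valve area, which is one reason the request also asks for the raw PW and CW Doppler recordings.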

Medical imaging · Ultrasound · DICOM · JSON

0 / 3000 scans (0%)

5,000 transthoracic echocardiogram studies with LVEF measurements for heart-failure screening AI

Open

We are developing a deep-learning model to automate left ventricular ejection fraction (LVEF) estimation from transthoracic echocardiography (TTE) studies acquired in routine clinical care. The primary intended use case is population-level heart-failure screening integrated into existing cardiology workflows, reducing the reporting burden on sonographers and cardiologists while enabling earlier intervention in at-risk patients. We require full cine-loop acquisitions from the apical 4-chamber (A4C) and apical 2-chamber (A2C) views, recorded at a minimum of 25 frames per second with spatial resolution no lower than 224×224 pixels. Each study should include at least three complete cardiac cycles per view. Harmonic B-mode imaging is preferred over fundamental mode to improve endocardial border delineation. Data must be delivered in DICOM format with all protected health information removed or replaced per HIPAA Safe Harbor or equivalent GDPR pseudonymisation protocols, including stripping of DICOM tags 0010,0010 through 0010,0040 and any burned-in annotation text. A de-identification manifest confirming the specific method applied is required alongside each batch delivery. Clinical labels must include a cardiologist-verified LVEF value measured by the biplane Simpson's method of discs, New York Heart Association (NYHA) functional class where available, and a binary heart-failure diagnosis flag. Segmentation masks delineating the left ventricular endocardial border at end-diastole and end-systole in the A4C view are strongly preferred for the full dataset and mandatory for at least 30% of studies. Additional chamber measurements — left ventricular end-diastolic volume (LVEDV), end-systolic volume (LVESV), left atrial volume index, and diastolic function grade per ASE 2016 guidelines — are welcomed as supplementary labels to broaden the model's downstream applicability. 
Patient demographic metadata including age decade, biological sex, body mass index category, and primary diagnosis category should be retained in anonymised DICOM tags or an accompanying JSON sidecar. The primary diagnosis categories are heart failure with reduced ejection fraction (HFrEF), heart failure with preserved ejection fraction (HFpEF), and no heart failure. Studies from multiple scanner vendors (GE Vivid, Philips EPIQ, Siemens Acuson, and Canon Aplio) are explicitly sought to ensure device-agnostic model generalisation. Quality control: studies with greater than 20% of frames degraded by ultrasound dropout, rib shadow, or patient motion artefact should be excluded or flagged with a quality score below threshold. A sonographer-assigned image quality rating (1 poor, 2 adequate, 3 good) is requested for each cine loop.
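As a cross-check on submitted labels, LVEF can be recomputed from the supplementary volumes, and the 20% frame-degradation rule can be applied per cine loop. The sketch below assumes the volumes have already been obtained by the biplane Simpson's method; it only converts them to an ejection fraction. Function names are hypothetical.

```python
def lvef_percent(lvedv_ml, lvesv_ml):
    """Ejection fraction from end-diastolic and end-systolic volumes (mL)."""
    if lvedv_ml <= 0 or not 0 <= lvesv_ml <= lvedv_ml:
        raise ValueError("volumes must satisfy 0 <= LVESV <= LVEDV and LVEDV > 0")
    return 100.0 * (lvedv_ml - lvesv_ml) / lvedv_ml

def fails_quality_control(degraded_frames, total_frames, max_fraction=0.20):
    """True when more than 20% of frames are degraded, per the listing's QC rule."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return degraded_frames / total_frames > max_fraction

# Hypothetical study: LVEDV 120 mL, LVESV 50 mL, 11 of 50 frames degraded.
ef = lvef_percent(120.0, 50.0)
excluded = fails_quality_control(11, 50)
```

A large gap between the recomputed value and the cardiologist-verified LVEF label is worth flagging for review before delivery rather than silently correcting.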

Medical imaging · Ultrasound · DICOM · JSON

0 / 5000 scans (0%)