4,000 pediatric and congenital heart disease echocardiogram studies for structural anomaly detection

Open

Overview

Congenital heart disease (CHD) affects approximately 1% of live births and is one of the most diagnostically challenging domains in clinical ultrasound, where accurate and timely diagnosis is critical for surgical planning and outcomes. We are building a multi-label classification model that identifies common structural anomalies directly from neonatal and pediatric transthoracic echocardiography (TTE) cine loops, without manual feature extraction. The target anomalies are ventricular septal defect (VSD), atrial septal defect (ASD), tetralogy of Fallot, transposition of the great arteries, hypoplastic left heart syndrome, and coarctation of the aorta.

Required acquisitions span the complete standard pediatric echo protocol: subcostal 4-chamber and short-axis views for septal integrity assessment; the apical 4-chamber view; parasternal long-axis and short-axis views at multiple levels, including the great vessels, mitral valve annulus, and papillary muscles; and suprasternal notch views for aortic arch and ductal anatomy assessment. Color Doppler overlays demonstrating shunt flow direction and velocity, outflow tract obstruction, or valvular regurgitation jets are strongly requested for each structurally abnormal study and are mandatory for VSD, ASD, and outflow tract lesions. Frame rates of at least 30 fps are required; for neonatal studies, higher frame rates of 60–80 fps are preferred. All cine loops must be delivered as DICOM files with the original pixel data intact and without lossy recompression.

Clinical labels must include the confirmed CHD diagnosis or a normal label, age at acquisition in months (rather than exact date of birth, to protect identity), biological sex, and age category coded as neonate under 1 month, infant 1–12 months, child 1–12 years, or adolescent 13–18 years.

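
The label and frame-rate requirements above could be checked before submission with something like the following. This is a minimal sketch, not a prescribed format: the field names (`diagnoses`, `age_months`, `frame_rate_fps`, `date_of_birth`), the snake_case label spellings, and the handling of category boundaries (exactly 12 months counts as child; adolescent runs up to the 19th birthday) are all assumptions made for illustration.

```python
from typing import List

# Label vocabulary taken from the anomalies named in the request; the
# snake_case spellings are assumptions, not a mandated coding scheme.
DIAGNOSES = {
    "normal", "vsd", "asd", "tetralogy_of_fallot",
    "transposition_of_the_great_arteries",
    "hypoplastic_left_heart_syndrome", "coarctation_of_the_aorta",
}
MIN_FRAME_RATE_FPS = 30  # minimum frame rate stated in the request


def age_category(age_months: float) -> str:
    """Map age at acquisition (in months) to the four requested categories.

    Boundary handling is an assumption: 12 months is treated as "child",
    and "adolescent" covers ages up to, but not including, 19 years.
    """
    if age_months < 0:
        raise ValueError("age_months must be non-negative")
    if age_months < 1:
        return "neonate"
    if age_months < 12:
        return "infant"
    if age_months < 13 * 12:
        return "child"
    if age_months < 19 * 12:
        return "adolescent"
    raise ValueError("age is outside the 0-18 year range of this request")


def validate_study(record: dict) -> List[str]:
    """Return a list of problems found in one study's label record."""
    problems = []
    labels = set(record.get("diagnoses", []))
    if not labels:
        problems.append("missing diagnosis or normal label")
    unknown = labels - DIAGNOSES
    if unknown:
        problems.append(f"unknown labels: {sorted(unknown)}")
    if "normal" in labels and len(labels) > 1:
        problems.append("'normal' cannot be combined with a CHD diagnosis")
    if "age_months" not in record:
        problems.append("missing age at acquisition in months")
    if "date_of_birth" in record:
        problems.append("exact date of birth must not be included")
    if record.get("frame_rate_fps", 0) < MIN_FRAME_RATE_FPS:
        problems.append("frame rate below 30 fps")
    return problems
```

Because the model is multi-label, a record such as `{"diagnoses": ["vsd", "asd"], "age_months": 0.5, "frame_rate_fps": 60}` passes with an empty problem list, while a record carrying `date_of_birth` or a 25 fps loop is flagged.
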
Segmentation masks of all four cardiac chambers and the great vessel origins (aortic root and pulmonary trunk) are required for at least 1,000 studies to support anatomical landmark learning and chamber volumetry in small hearts. Where available, studies should be accompanied by structured echocardiographic report summaries in plain text, with all personally identifying information stripped prior to transfer. Doppler-derived hemodynamic measurements, including peak VSD jet velocity, estimated right ventricular pressure, and pulmonary-to-systemic flow ratio (Qp:Qs), should be included as structured JSON labels where clinically measured.

Longitudinal follow-up studies from the same patient, pseudonymously linked by a consistent de-identified subject ID, are highly valuable for disease progression and post-operative remodeling research. Studies from multiple institutions across diverse geographic regions are preferred to capture variation in patient ethnicity, altitude-related physiology, and institutional scanning protocols. De-identification must comply with DICOM PS3.15 Annex E and must include removal of all burned-in annotation text overlaying pixel data.
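
As an illustration of the structured JSON labels and the consistent de-identified subject ID described above, one possible approach is sketched below. It is not a required schema: the key names, the units (m/s and mmHg), the `SUBJ-` prefix, the `study_index` field, and the use of a keyed HMAC-SHA256 hash for pseudonymization are assumptions. The secret key is assumed to stay at the contributing site, so the ID cannot be recomputed from a local patient identifier by anyone else.

```python
import hashlib
import hmac
import json
from typing import Optional


def pseudonymous_subject_id(site_secret: bytes, local_patient_id: str) -> str:
    """Derive a stable pseudonym so follow-up studies link to one subject.

    The same (secret, patient ID) pair always yields the same pseudonym.
    The "SUBJ-" prefix and 16-character truncation are illustrative choices.
    """
    digest = hmac.new(
        site_secret, local_patient_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "SUBJ-" + digest[:16]


def hemodynamics_label(
    subject_id: str,
    study_index: int,  # hypothetical ordering of studies within a subject
    peak_vsd_jet_velocity_m_s: Optional[float] = None,
    estimated_rv_pressure_mmhg: Optional[float] = None,
    qp_qs: Optional[float] = None,
) -> str:
    """Serialize Doppler-derived measurements as a JSON label.

    Measurements that were not clinically obtained are omitted rather
    than filled with placeholder values.
    """
    measurements = {
        "peak_vsd_jet_velocity_m_s": peak_vsd_jet_velocity_m_s,
        "estimated_rv_pressure_mmhg": estimated_rv_pressure_mmhg,
        "qp_qs": qp_qs,
    }
    measured = {k: v for k, v in measurements.items() if v is not None}
    record = {
        "subject_id": subject_id,
        "study_index": study_index,
        "hemodynamics": measured,
    }
    return json.dumps(record, sort_keys=True)
```

Calling `hemodynamics_label(sid, 1, qp_qs=1.8)` produces a record whose `hemodynamics` object contains only `qp_qs`, which keeps "not measured" distinguishable from a measured value.
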

Medical imaging, Ultrasound, Cardiac, DICOM, JSON

Progress

0 / 4000 scans (0%)

Data Specifications

Category: Medical imaging
Required quantity: 4,000
Data types: Medical imaging, Ultrasound, Cardiac, DICOM, JSON
Budget: USD 90,000.00
Deadline: 2027-03-29

Use Cases

  • Training and validating medical imaging AI/ML models
  • Benchmarking medical imaging detection and segmentation algorithms
  • Building de-identified medical imaging research datasets for academic studies
  • Augmenting existing medical imaging datasets to reduce class imbalance