6,000 Pediatric 12-Lead ECGs Across Age Groups from Neonates to Adolescents with Diagnostic Labels
Open
Overview
Pediatric cardiology is critically underserved by AI-driven ECG interpretation because pediatric ECG morphology differs substantially from adult norms. Higher resting heart rates, right-ventricular dominance in neonates, an evolving QRS axis, and age-specific QTc reference ranges all mean that models trained on adult datasets perform poorly in children. We are building the first large-scale, age-stratified pediatric ECG AI classifier to screen for congenital heart disease, inherited channelopathies, and acquired conditions including Kawasaki disease and myocarditis.

We require 6,000 resting 12-lead ECG recordings from patients aged 0–17 years, with the following minimum stratification:
- Neonates and infants (0–12 months): 1,000 recordings
- Toddlers and pre-school (1–5 years): 1,000 recordings
- School-age (6–12 years): 2,000 recordings
- Adolescents (13–17 years): 2,000 recordings

Sampling rate must be ≥500 Hz; paper-speed equivalent of 25 mm/s and gain of 10 mm/mV must be documented. Recordings must be supplied in EDF or WFDB format. Recording duration must be ≥10 seconds; longer strips are preferred for rhythm assessment.

Each record must include cardiologist diagnostic labels from the following categories: normal for age, right bundle branch block, left ventricular hypertrophy, Wolff-Parkinson-White pattern, long-QT syndrome, supraventricular tachycardia, complete AV block, and congenital heart disease (specifying anatomy where known). Reports or structured cardiology findings summaries should accompany records where available, as these provide an essential contextual supervision signal.

The labeling protocol must be carried out exclusively by board-certified pediatric cardiologists. Primary interpretation is performed by a pediatric cardiology fellow or attending with electrophysiology training; all abnormal findings must be over-read and confirmed by a senior pediatric cardiologist.
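The stratification and acquisition minimums above lend themselves to an automated pre-submission check. The following is a minimal sketch, not part of the request itself: the manifest field names (`age_months`, `fs_hz`, `duration_s`, `format`) and the stratum labels are hypothetical, and the month boundaries assume that "0–12 months" means under 12 completed months and that the year ranges refer to completed years.

```python
from collections import Counter

# (label, min age in months inclusive, max age in months exclusive, minimum recordings)
# Quotas come from the request text; the exact month boundaries are an assumption.
STRATA = [
    ("neonates_infants", 0, 12, 1000),
    ("toddlers_preschool", 12, 72, 1000),
    ("school_age", 72, 156, 2000),
    ("adolescents", 156, 216, 2000),
]

ALLOWED_FORMATS = {"EDF", "WFDB"}
MIN_FS_HZ = 500
MIN_DURATION_S = 10


def stratum_for(age_months):
    """Return the stratum label for an age in months, or None if out of range."""
    for name, lo, hi, _ in STRATA:
        if lo <= age_months < hi:
            return name
    return None


def check_record(rec):
    """Return a list of human-readable problems for one manifest entry."""
    problems = []
    if stratum_for(rec["age_months"]) is None:
        problems.append("age outside 0-17 years")
    if rec["fs_hz"] < MIN_FS_HZ:
        problems.append("sampling rate below 500 Hz")
    if rec["duration_s"] < MIN_DURATION_S:
        problems.append("duration below 10 seconds")
    if rec["format"].upper() not in ALLOWED_FORMATS:
        problems.append("format is not EDF or WFDB")
    return problems


def stratum_shortfalls(records):
    """Return {stratum: recordings still missing} for strata below their minimum."""
    counts = Counter(stratum_for(r["age_months"]) for r in records)
    return {
        name: minimum - counts[name]
        for name, _, _, minimum in STRATA
        if counts[name] < minimum
    }
```

Running `check_record` over every manifest entry and `stratum_shortfalls` over the entries that pass gives a quick view of how far a candidate submission is from the stated minimums.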
Diagnostic criteria must be referenced to published age-normative tables (Davignon, Rijnbeek, or equivalent peer-reviewed pediatric ECG reference ranges) because QRS duration, QTc limits, and R-wave amplitude thresholds differ substantially between age groups. Inter-annotator agreement must be assessed for a minimum 10% random subsample, with Cohen's kappa reported per diagnostic category and documented in the data release.

Acquisition parameters must be fully documented per record: device manufacturer and model, paper speed setting, gain setting (typically 10 mm/mV for standard leads, 5 mm/mV for high-amplitude neonatal tracings), electrode placement protocol (standard limb positions or pediatric chest electrode spacing), and patient cooperation level (resting/awake, sleeping, or crying), since motion artifact in infants is a major confound. QTc values must be calculated using the Bazett correction for comparison with age-normative ranges, with the raw QT and preceding RR interval also provided.

De-identification is strictly required under HIPAA or equivalent national regulation. Because pediatric patients are a protected class, particular care must be taken to remove any free text that could identify the child or parent. Only the following should be retained as metadata: age in months (for those under two years) or age in completed years, sex assigned at birth, body weight percentile, and relevant metabolic or genetic screening results (e.g., channelopathy gene panel result if available).

We strongly encourage participation from tertiary pediatric cardiac centers, as these institutions concentrate the rare diagnoses most valuable to the classifier.
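Two of the quantitative requirements above, per-category Cohen's kappa and Bazett-corrected QTc, can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed implementation: the function names are hypothetical, each record is assumed to carry a single label per annotator, and "per diagnostic category" is interpreted here as a one-vs-rest kappa for each category.

```python
import math


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same records."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of records with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # formally undefined, and this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def kappa_per_category(labels_a, labels_b, categories):
    """One-vs-rest kappa per category (an assumed reading of 'per category')."""
    return {
        c: cohen_kappa([a == c for a in labels_a], [b == c for b in labels_b])
        for c in categories
    }


def qtc_bazett_ms(qt_ms, rr_ms):
    """Bazett correction: QTc = QT / sqrt(RR), with RR expressed in seconds."""
    if rr_ms <= 0:
        raise ValueError("RR interval must be positive")
    return qt_ms / math.sqrt(rr_ms / 1000.0)
```

For example, `qtc_bazett_ms(300, 250)` corresponds to a QT of 300 ms at an RR of 250 ms (240 bpm) and evaluates to 600 ms. Supplying the raw QT and RR alongside QTc, as the request requires, lets downstream users apply a different correction if they prefer.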
Downstream use cases include deployment as a screening decision-support tool in general pediatric clinics and neonatal intensive care units, integration with wearable infant cardiac monitors, and a federated learning study across multiple pediatric hospitals to address data rarity.
Data Specifications
| Category | Sensor / device data |
|---|---|
| Required quantity | 6000 |
| Data types | Sensor / device data, ECG, Cardiac, EDF, WFDB |
| Budget | EUR 48000.00 |
| Deadline | 2027-01-28 |
Use Cases
- Training and validating AI/ML models on sensor/device data
- Benchmarking detection and segmentation algorithms for sensor/device data
- Building de-identified sensor/device research datasets for academic studies
- Augmenting existing sensor/device datasets to reduce class imbalance