CT Scan Datasets — DICOM Computed Tomography Data

CT scan datasets provide cross-sectional, three-dimensional imaging reconstructed from multiple X-ray projections, offering far greater anatomical detail than plain radiography. Computed tomography is indispensable for training models in oncology, trauma, neuroimaging, and pulmonary disease because it resolves soft tissue, bone, and vasculature in volumetric detail. A CT dataset is typically delivered as DICOM image series: axial slices that stack into a volume (isotropic only when slice spacing matches the in-plane pixel spacing), with slice thickness, reconstruction kernel, contrast phase, kVp, and mAs recorded in the metadata.

Studies span non-contrast and contrast-enhanced acquisitions, multiphase protocols (arterial, venous, delayed), and specialized techniques such as CT angiography (CTA), low-dose lung-cancer screening, and dual-energy CT. Clinically valuable CT datasets cover a wide range of regions and findings: intracranial hemorrhage and ischemic stroke on head CT; pulmonary nodules, emphysema, and interstitial lung disease on chest CT; liver, kidney, and pancreatic lesions on abdominal CT; pulmonary embolism on CTA; and fractures and internal injuries in trauma protocols. The most useful datasets include voxel-level segmentation masks of organs, lesions, and abnormalities, along with radiologist-confirmed labels, lesion measurements following RECIST, and Hounsfield-unit calibration.
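Hounsfield-unit calibration in practice depends on the rescale parameters stored with each DICOM CT image: stored pixel values map to HU as slope × value + intercept, using the Rescale Slope (0028,1053) and Rescale Intercept (0028,1052) attributes. The sketch below illustrates that mapping on a plain list of values. The function name and the default slope and intercept are illustrative assumptions; real values must be read from each series header.

```python
def to_hounsfield(stored_values, rescale_slope=1.0, rescale_intercept=-1024.0):
    """Map stored pixel values to Hounsfield units: HU = slope * value + intercept.

    The defaults are placeholders for illustration only; read the actual
    Rescale Slope and Rescale Intercept from the DICOM header of each series.
    """
    return [rescale_slope * value + rescale_intercept for value in stored_values]


# Hypothetical stored values from one row of a slice.
row = [0, 24, 1024, 1084, 2024]
print(to_hounsfield(row))  # [-1024.0, -1000.0, 0.0, 60.0, 1000.0]
```

Applying the rescale before any thresholding or windowing keeps HU-based criteria comparable across scanners that store pixel data with different offsets.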

High-quality cohorts document acquisition parameters, balance pathology prevalence, and span multiple scanner vendors and reconstruction settings so models generalize beyond a single site. Rigorous de-identification removes PHI from DICOM headers and defaces or skull-strips head CT volumes where required, while preserving diagnostic fidelity. On GetDATA, clients post CT requests specifying body region, contrast phase, slice thickness, annotation type (volumetric segmentation, bounding box, or study-level label), label taxonomy, and minimum case counts, and verified providers fulfill them with compliant, quality-scored CT data in DICOM.

Beyond diagnosis, CT datasets power radiomics pipelines that extract quantitative texture and shape features, support automated organ-at-risk contouring for radiotherapy planning, and enable opportunistic screening for osteoporosis, coronary calcium, and body composition from scans acquired for unrelated indications. Browse the open CT scan requests below, or explore related imaging categories.

Open CT Scan requests

2,500 abdomen-pelvis CT volumes for kidney stone detection, stone composition classification, and urolithiasis burden scoring

Open

Our urology AI research group is assembling a comprehensive urolithiasis dataset to train models that automatically detect, localize, size, and classify renal and ureteral stones on non-contrast CT of the abdomen and pelvis (NCCT-KUB protocol). Kidney stones typically present as hyperdense foci (200–1000+ HU depending on composition; uric acid stones appear at lower density 200–400 HU, while calcium oxalate stones exceed 800 HU), making NCCT the gold-standard imaging modality for urolithiasis evaluation. Accurate Hounsfield unit measurement is essential for predicting stone composition and guiding treatment selection (shock-wave lithotripsy vs. ureteroscopy vs. percutaneous nephrolithotomy). Imaging protocol: NCCT acquisitions at 120 kVp (or low-dose 80–100 kVp protocols acceptable), slice thickness ≤2.5 mm, reconstructed in both soft-tissue window (WL 40 HU, WW 400 HU) and bone window (WL 400 HU, WW 1800 HU) for stone conspicuity. Coronal and sagittal MPR series are strongly encouraged. Each study must cover from the superior poles of the kidneys through the urinary bladder (ureterovesical junction). Dual-energy CT studies with stone composition maps are particularly valuable and should be flagged separately. Volumetric DICOM series are required; thin-section reconstructions at ≤1.25 mm for dual-energy cases enable virtual monochromatic image generation at 40–70 keV for optimal stone conspicuity and composition discrimination. Annotation requirements: bounding-box localization for each stone with anatomical location label (upper/mid/lower pole calyx, renal pelvis, proximal/mid/distal ureter, bladder), maximum axial stone diameter in millimeters per RECIST convention, mean HU value, and a stone composition prediction label (calcium oxalate, calcium phosphate, uric acid, struvite, cystine, or mixed) where dual-energy or prior metabolic workup data are available. 
Total stone burden (number and cumulative volume in cubic millimeters) should be recorded per patient in the JSON sidecar. Negative studies (no stone) must account for at least 25% of the dataset. Inter-rater agreement for stone localization (bounding-box IoU ≥ 0.60) and mean HU measurement (within ±50 HU) must be reported. QA exclusion criteria include scans with severe streak artifact from bilateral hip prostheses obscuring the ureters, incomplete coverage of the KUB field, or absence of acquisition kVp in DICOM metadata. De-identification per HIPAA Safe Harbor; ureteral stent or nephrostomy tube presence must be flagged in the JSON sidecar as these hardware items directly affect stone visibility and HU measurement accuracy. Acceptable formats are DICOM and NIfTI. The trained model targets integration into automated radiology reporting pipelines to generate structured urolithiasis reports, reducing radiologist workload while improving measurement reproducibility. Scanner diversity across GE, Siemens, Philips, and United Imaging platforms is required; contributions from institutions in multiple geographic regions are welcome to capture dietary and demographic variation in stone epidemiology, as stone composition prevalence differs markedly between Western and East Asian populations.
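The inter-rater thresholds in this request (bounding-box IoU ≥ 0.60 and mean HU within ±50 HU) can be checked per stone with a few lines of code. The following is a minimal sketch, assuming 2D axial boxes given as (x1, y1, x2, y2) in pixel coordinates; the function names and the box format are illustrative, not part of the request.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def readers_agree(box_a, hu_a, box_b, hu_b, min_iou=0.60, max_hu_diff=50.0):
    """True when two reads of the same stone meet both agreement thresholds."""
    return box_iou(box_a, box_b) >= min_iou and abs(hu_a - hu_b) <= max_hu_diff


# Hypothetical reads of one stone by two annotators.
reader_1 = ((10, 10, 20, 20), 850.0)
reader_2 = ((12, 10, 22, 20), 880.0)
print(round(box_iou(reader_1[0], reader_2[0]), 3))  # 0.667
print(readers_agree(*reader_1, *reader_2))          # True
```

Matching stones between readers before computing agreement is a separate step that this sketch does not cover.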

Medical imaging, CT, DICOM, JSON, NIfTI
0 / 2500 scans (0%)

800 CTPA studies with pulmonary embolism segmentation masks and clot-burden scoring for AI detection

Open

We are developing a deep-learning system for automated detection and clot-burden quantification of acute pulmonary embolism (PE) on CT pulmonary angiography (CTPA). PE manifests as intraluminal filling defects within pulmonary arteries, typically appearing as low-attenuation regions (–20 to +80 HU) surrounded by contrast-enhanced blood (200–400 HU at peak enhancement). Accurate segmentation of emboli from the main, lobar, segmental, and subsegmental pulmonary arteries is the central annotation task. Required imaging protocol: CTPA acquired with bolus-tracked contrast injection (iodinated contrast, 100–120 mL at 4–5 mL/s), slice thickness ≤1.25 mm, reconstructed in mediastinal window (WL 40 HU, WW 400 HU) and lung window (WL –600 HU, WW 1500 HU). Each study must include a complete volumetric DICOM series with full coverage from the lung apices to the costophrenic angles. Incidental findings such as pleural effusion, right heart strain (RV/LV ratio ≥0.9), and pulmonary infarct should be flagged in the JSON metadata but do not require pixel-level annotation. Tube voltage should be 100–120 kVp with automated tube-current modulation; studies acquired at non-standard voltages must include CTDIvol and DLP values in DICOM metadata for dose normalization. Volumetric series must be isotropic or near-isotropic (≤1.25 mm reconstructed slice thickness) to enable accurate three-dimensional vessel-tree segmentation and embolus localization. Annotation requirements: 3D segmentation masks of all emboli, centreline labeling of affected vessel segments, and a computed modified Miller index (mMI) or Qanadli clot-burden score. Negative CTPA studies (no PE) should constitute at least 30% of the dataset to enable specificity optimization. Cases confirmed by ventilation-perfusion (V/Q) scan or catheter angiography are particularly valuable and should be flagged accordingly.
Inter-rater agreement for embolus segmentation must reach a Dice coefficient of ≥0.70 on lobar and segmental arteries; subsegmental PE cases may be annotated by consensus read given known inter-observer variability. Scanner balance across GE, Siemens, and Philips CTPA protocols is requested, and at least 15% of studies should originate from institutions in different countries to capture contrast injection protocol variations. QA exclusion criteria include studies with inadequate arterial opacification (main pulmonary artery attenuation below 200 HU), respiratory motion artifact degrading vessel conspicuity, or missing coverage of the pulmonary arterial trunk. De-identification per HIPAA Safe Harbor and DICOM PS3.15, with consistent pseudonymization for patients with follow-up imaging. Delivery in NIfTI-2 format with DICOM originals included is preferred. JSON sidecars must encode PE acuity (acute vs. chronic), Wells score, D-dimer value, and outcome (30-day mortality, need for thrombolysis) where available. The algorithm will be validated for integration into emergency radiology AI triage workflows and submitted for regulatory clearance. A formal data-use agreement and institutional ethics approval are prerequisites before data transfer commences.
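The Dice coefficient used for the embolus agreement threshold is twice the overlap of two masks divided by the sum of their sizes. A minimal sketch follows, assuming each mask has been flattened to an equal-length sequence of 0/1 values; the function name and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice(mask_a, mask_b):
    """Dice coefficient of two equal-length binary masks (flattened voxels)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    total = size_a + size_b
    # Assumed convention: two empty masks count as perfect agreement.
    return 2 * overlap / total if total else 1.0


# Hypothetical flattened embolus masks from two readers.
reader_1 = [0, 1, 1, 1, 0, 0]
reader_2 = [0, 0, 1, 1, 1, 0]
score = dice(reader_1, reader_2)
print(round(score, 3))   # 0.667
print(score >= 0.70)     # False: this pair would fall below the threshold
```

How empty-versus-empty pairs are scored matters for datasets with many negative studies, so it is worth stating the convention alongside any reported agreement figure.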

Medical imaging, CT, DICOM, JSON, NIfTI-2
0 / 800 scans (0%)

5,000 chest CT scans with COVID-19 and viral pneumonia ground-glass opacity segmentation for AI triage research

Open

This request seeks a large, diverse, multi-site chest CT dataset to support research into AI-assisted diagnosis and severity scoring of COVID-19 pneumonia and other viral lower respiratory tract infections. The hallmark finding of interest is ground-glass opacity (GGO), typically manifesting as hazy areas of increased attenuation that do not obscure underlying bronchovascular structures, with Hounsfield unit range approximately –600 to –200 HU in infected regions compared to normal lung parenchyma near –850 HU. Consolidation, crazy-paving pattern, and subpleural distribution are additional features to be captured. Imaging requirements: axial thin-section CT (slice thickness 0.625–1.5 mm), lung window reconstruction (WL –600 HU, WW 1500 HU), both non-contrast and low-dose protocols accepted. Each volume must carry at least one of the following annotation tiers: (a) lobe-level GGO and consolidation segmentation masks in NIfTI format, (b) whole-lung segmentation mask, (c) per-scan severity score (CT severity index 0–25 or equivalent), or (d) RT-PCR confirmed diagnosis label (COVID-19 positive, influenza, other viral, bacterial, non-infectious). We strongly prefer scans with all four annotation tiers but will accept partial annotation with appropriate metadata flags. De-identification must comply with GDPR Recital 26 for European institutions and HIPAA Safe Harbor for US contributors. Longitudinal series from the same patient (admission, day-5, discharge) are highly valuable and should be pseudonymized with a consistent patient key so temporal progression can be modeled. JSON sidecars should include acquisition date relative to symptom onset, vaccination status if available, ICU admission flag, and oxygen saturation at time of scan. Volumetric DICOM series delivered as complete studies with all reconstructed series (lung kernel, soft-tissue kernel) are preferred; NIfTI-converted volumes are also acceptable. 
Tube voltage is typically 100–120 kVp for standard chest CT; low-dose screening protocols at 80 kVp are acceptable provided noise characteristics are documented. Scanner diversity is essential: contributions from GE, Siemens, Philips, and Canon sites are all welcome, and geographic diversity spanning Europe, North America, and Asia is prioritized to capture population-level variation in disease presentation. Annotation inter-rater agreement for GGO percentage (intraclass correlation coefficient ≥ 0.85) must be reported. QA exclusion criteria include scans with greater than 20% motion-corrupted slices, incomplete lung coverage, or absence of confirmed microbiological diagnosis. The resulting model will support real-time triage scoring integrated into PACS worklist systems, enabling prioritization of deteriorating patients in high-volume pandemic or endemic disease scenarios. Data will not be used for any commercial purpose beyond the stated AI research scope, and results will be published with appropriate attribution to contributing institutions.
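The approximate GGO range quoted in this request (–600 to –200 HU) suggests a crude way to estimate the GGO percentage of a lung volume: count lung voxels whose HU falls inside that range. The sketch below is only a rough screening estimate under that assumption and is not a substitute for radiologist-confirmed masks; the function name and the flattened-list inputs are illustrative.

```python
def ggo_fraction(hu_values, lung_mask, low=-600.0, high=-200.0):
    """Fraction of lung voxels whose HU lies in [low, high].

    hu_values and lung_mask are equal-length flattened sequences; lung_mask
    holds 1 for voxels inside the lung and 0 elsewhere.
    """
    lung = [hu for hu, inside in zip(hu_values, lung_mask) if inside]
    if not lung:
        return 0.0
    in_range = sum(1 for hu in lung if low <= hu <= high)
    return in_range / len(lung)


# Hypothetical voxels: the 40 HU voxel lies outside the lung mask.
hu = [-850, -820, -500, -300, 40, -870]
mask = [1, 1, 1, 1, 0, 1]
print(ggo_fraction(hu, mask))  # 0.4
```

A threshold-based figure like this can serve as a sanity check against annotated GGO percentages, where large disagreements may point to mask or calibration problems.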

Medical imaging, CT, DICOM, JSON, NIfTI
0 / 5000 scans (0%)

1,200 contrast-enhanced abdominal CT volumes for liver lesion segmentation and RECIST measurement

Open

We are constructing a benchmark dataset for automated liver lesion detection and volumetric segmentation on contrast-enhanced CT of the abdomen, targeting hepatocellular carcinoma (HCC), colorectal liver metastases (CRLM), and benign focal liver lesions (hemangioma, cysts, FNH). Imaging protocol must include arterial-phase and portal-venous-phase acquisitions, with slice thickness ≤2 mm and pixel spacing ≤0.8 mm in-plane. The Hounsfield unit dynamic range of interest is –200 to +300 HU (liver parenchyma typically 50–70 HU in portal phase; hypervascular HCC peaks at 80–120 HU in arterial phase). Multiplanar reconstruction (MPR) in coronal and sagittal planes is welcome but not mandatory. Annotation requirements are demanding: each lesion must have a 3D segmentation mask generated or confirmed by an abdominal radiologist with at least 5 years of subspecialty experience. Masks should be provided in NIfTI or NIfTI-2 format, with a JSON metadata file encoding lesion type, RECIST 1.1 longest axial diameter in millimeters, lesion number, LI-RADS category for HCC cases, and whether the patient had prior locoregional therapy (TACE, ablation). Liver parenchyma whole-organ masks are strongly encouraged as an additional annotation layer to facilitate liver-volume normalization during training. De-identification must satisfy HIPAA Safe Harbor (Method 1) with removal of all 18 identifiers and re-mapping of DICOM UIDs. Cases with prior abdominal surgery or transplant should be flagged in metadata. We require a minimum of 20% negative cases (no focal lesion) to anchor the model's specificity. Tube voltage should be documented for each acquisition phase, with standard portal-venous-phase protocols at 100–120 kVp using automated tube-current modulation. 
Contrast agent type (iodinated, concentration in mg/mL), injection rate (mL/s), and delay time (seconds from injection to scan start) must be recorded in the JSON sidecar for each phase, as these parameters directly affect lesion-to-liver contrast and Hounsfield unit values at the time of acquisition. Scanner heterogeneity across multiple vendors (GE, Siemens, Philips) and field-site geographic diversity are required to prevent model overfitting to a single institution's acquisition style. QA exclusion criteria include studies with gross motion artifact, incomplete hepatic coverage, or absence of a portal-venous phase. Inter-rater segmentation agreement (Dice ≥ 0.80 on lesions ≥10 mm) must be documented per contributing site. The dataset will be used to develop a clinical decision-support tool for oncology multidisciplinary tumor boards, enabling automated lesion tracking across treatment cycles in compliance with RECIST 1.1 response assessment criteria. Data will be processed within an ISO 27001–certified cloud environment under a fully executed DUA.
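For the longest axial diameter requested in the JSON metadata, one simple cross-check is to derive it from the segmentation mask itself. The sketch below assumes the lesion mask is available as a list of axial slices (each a list of rows of 0/1 values) and that in-plane pixel spacing is known; it takes the largest distance between any two lesion pixel centres within a single slice. The function name and input layout are assumptions, and the brute-force search is only suitable for small lesions.

```python
from itertools import combinations
from math import hypot


def longest_axial_diameter_mm(mask_slices, spacing_mm):
    """Largest in-plane distance (mm) between two lesion pixel centres.

    mask_slices: list of axial slices, each a list of rows of 0/1 values.
    spacing_mm: (row_spacing, col_spacing) in millimeters.
    Centre-to-centre distance slightly underestimates the true extent.
    """
    row_mm, col_mm = spacing_mm
    best = 0.0
    for axial in mask_slices:
        points = [
            (r * row_mm, c * col_mm)
            for r, row in enumerate(axial)
            for c, value in enumerate(row)
            if value
        ]
        for (r1, c1), (r2, c2) in combinations(points, 2):
            best = max(best, hypot(r1 - r2, c1 - c2))
    return best


# Hypothetical slice: two lesion pixels 3 rows and 4 columns apart.
axial = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
print(round(longest_axial_diameter_mm([axial], (0.8, 0.8)), 3))  # 4.0
```

Comparing the mask-derived value with the radiologist's recorded diameter can flag entries where the mask and the sidecar refer to different lesions.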

Medical imaging, CT, JSON, NIfTI, NIfTI-2
0 / 1200 scans (0%)

3,500 non-contrast head CT scans with intracranial hemorrhage labels and hemorrhage-subtype segmentation

Open

Our neuroradiology AI team is building a real-time triage algorithm for the emergency detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (NCCT) of the brain. Acute blood appears hyperdense on NCCT (50–80 HU), making CT the first-line modality in stroke and trauma settings. We require axial NCCT series acquired at standard emergency-room protocols: tube voltage 120–140 kVp, slice thickness ≤5 mm (preferably 2.5 mm or thinner for posterior-fossa coverage), reconstructed with both brain window (WL 40 HU, WW 80 HU) and subdural window (WL 75 HU, WW 200 HU). Scout localizer images should be excluded from the de-identified package. Each scan must carry one or more of the following hemorrhage subtype labels: epidural hematoma (EDH), subdural hematoma (SDH), subarachnoid hemorrhage (SAH), intraparenchymal hemorrhage (IPH), or intraventricular hemorrhage (IVH). Pixel-level 2D or 3D segmentation masks indicating the hemorrhagic region are required for at least 60% of positive cases; the remainder may carry bounding-box annotations. Negative (no hemorrhage) cases should constitute 35–40% of the total dataset to reflect realistic emergency-room case mix and to enable balanced training. All DICOM files must be de-identified in compliance with GDPR Article 89 and HIPAA Safe Harbor, with UIDs re-mapped using a consistent pseudonymization scheme so longitudinal cases (admission plus follow-up) can be linked internally. Accompanying JSON sidecar files should encode subtype, hemorrhage volume estimate in milliliters, midline shift in millimeters, GCS score where available, and scan acquisition timestamp relative to symptom onset. NIfTI conversion is acceptable in addition to or instead of DICOM. Head CT volumes must undergo defacing or skull-stripping-based defacing prior to delivery to eliminate re-identification risk through facial reconstruction. 
Scanner diversity across GE Discovery, Siemens SOMATOM, and Philips Brilliance platforms is desirable. Volumetric DICOM series reconstructed at isotropic or near-isotropic resolution (≤2.5 mm) enable multiplanar reformatting for 3D lesion characterization. Annotation inter-rater reliability must achieve a minimum Dice similarity coefficient of 0.75 on hemorrhage masks across independent neuroradiologist reads, with adjudication by a third reader for discordant cases. QA exclusion criteria include scans with severe beam-hardening artifact from dental implants obscuring supratentorial structures, incomplete brain coverage, or imaging performed more than 24 hours after initial ictus without temporal metadata. The trained model will be deployed as a CE-marked and FDA 510(k)-pathway medical device for acute ICH flagging in radiology worklist prioritization systems. No patient data will leave the secure processing environment; a signed data-use agreement will be provided to every contributing institution.
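The brain and subdural windows named in this request are display mappings: a window with level WL and width WW shows HU values from WL − WW/2 to WL + WW/2 and clips everything outside. The sketch below illustrates that mapping to 8-bit grey levels; the function name and the 0–255 output range are illustrative choices rather than anything the request prescribes.

```python
def apply_window(hu_values, level, width):
    """Clip HU values to a display window and scale them to 0-255."""
    if width <= 0:
        raise ValueError("window width must be positive")
    low = level - width / 2.0
    high = level + width / 2.0
    scaled = []
    for hu in hu_values:
        clipped = min(max(hu, low), high)
        scaled.append(round(255 * (clipped - low) / (high - low)))
    return scaled


# Hypothetical HU samples: air, water, acute blood, bone.
samples = [-1000, 0, 65, 1000]
brain = apply_window(samples, level=40, width=80)      # shows 0 to 80 HU
subdural = apply_window(samples, level=75, width=200)  # shows -25 to 175 HU
print(brain)     # [0, 0, 207, 255]
print(subdural)
```

The narrow brain window spreads the 50–80 HU range of acute blood over many grey levels, while the wider subdural window keeps thin collections next to bone from saturating.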

Medical imaging, CT, DICOM, JSON, NIfTI
0 / 3500 scans (0%)

2,000 contrast-enhanced chest CT volumes with lung-nodule 3D segmentation masks (LIDC-IDRI style)

Open

We are developing a deep-learning pipeline for automated pulmonary nodule detection, characterization, and malignancy risk stratification, and we require a large, well-annotated chest CT dataset to train and validate our models. Specifically, we need volumetric, thin-slice chest CT series acquired with standard clinical protocols (slice thickness 0.625–1.25 mm, reconstructed in both lung window [WL –600 HU, WW 1500 HU] and soft-tissue window [WL 40 HU, WW 400 HU]). Both contrast-enhanced and non-contrast acquisitions are acceptable, though we prefer a mix to improve model generalizability. Each series must be accompanied by radiologist-confirmed 3D segmentation masks delineating all nodules ≥4 mm in longest diameter, including subsolid and ground-glass opacity (GGO) nodules. Annotation should follow LIDC-IDRI conventions: at least two independent radiologist reads per scan with consensus or majority-vote mask, and per-nodule attributes (subtlety, calcification, spiculation, malignancy suspicion on a 1–5 Likert scale). DICOM series must be fully de-identified per HIPAA Safe Harbor and DICOM PS3.15 guidelines, with all burned-in patient text removed from pixel data. NIfTI-converted volumes and JSON sidecar files containing nodule attributes are strongly preferred for ease of ingestion into our training infrastructure. We will also accept raw DICOM with accompanying NIfTI masks. Scans should span a diverse patient population (age, sex, smoking history where feasible) and include cases with benign nodules confirmed by at least 2-year follow-up stability as well as biopsy-confirmed malignant nodules to create a clinically representative label distribution. Primary use cases include training a 3D U-Net nodule segmentor, a false-positive reduction classifier, and a volumetric growth-rate tracker intended for integration into a lung-cancer screening workflow compliant with Lung-RADS 2022. 
Secondary use cases include RECIST 1.1 longest-diameter measurement automation and multiplanar reconstruction (MPR) visualization research. Tube voltage should be documented in DICOM metadata, with standard protocols at 120 kVp and low-dose acquisitions between 80–100 kVp both acceptable; effective dose should be below 3 mSv for screening-protocol cases. Scanner manufacturer balance across GE, Siemens, Philips, and Canon platforms is requested to reduce scanner-specific bias. Inter-rater agreement metrics (Dice coefficient, Cohen's kappa for malignancy rating) must be reported per contributing site. Scans with motion artifact, severe streak artifact from metallic implants, or incomplete coverage of both lung apices through the posterior costophrenic angles should be excluded by the contributing site's radiologist before submission. Data will be processed on air-gapped GPU clusters and will not be redistributed. An IRB waiver or equivalent ethics approval documentation must accompany each contributing site.
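Cohen's kappa for the malignancy ratings compares observed agreement between two readers with the agreement expected by chance from each reader's rating frequencies. The following is a minimal sketch of the unweighted form, assuming one 1–5 rating per nodule from each of two readers; the function name is illustrative, and a weighted variant may suit ordinal scales better.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of ratings."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical malignancy ratings for eight nodules.
reader_1 = [1, 2, 3, 4, 5, 3, 3, 2]
reader_2 = [1, 2, 3, 4, 4, 3, 2, 2]
print(round(cohens_kappa(reader_1, reader_2), 3))  # 0.673
```

Because unweighted kappa treats a 4-versus-5 disagreement the same as a 1-versus-5 disagreement, sites reporting it should state which variant they used.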

Medical imaging, CT, DICOM, JSON, NIfTI
0 / 2000 scans (0%)

Related categories