Multisite, multi-diagnosis benchmarks for evaluating frontier radiology AI models and vision-language models.
RSNA Benchmarks is a community-driven initiative to establish standardized, reproducible evaluation frameworks for frontier radiology AI models. As vision-language models rapidly advance, the field needs rigorous, multi-center benchmarks that reflect real-world clinical complexity.
Our benchmarks are designed to be multisite and multi-diagnosis, drawing data and expertise from institutions worldwide. Each benchmark targets a specific clinical domain with carefully curated cases, consensus ground truth, and transparent evaluation metrics. Critically, our datasets are assembled to be representative of real-world clinical populations — capturing the diversity of pathology, patient demographics, and imaging conditions that practitioners encounter in daily practice.
By providing an open, community-governed resource grounded in clinical realism, we aim to accelerate responsible development and deployment of AI in radiology.
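As a concrete illustration of what "consensus ground truth" and "transparent evaluation metrics" can mean in practice, the sketch below derives a majority-vote consensus label from multiple readers and scores a model's per-diagnosis sensitivity against it. All case IDs, labels, and function names are hypothetical assumptions for illustration; the actual RSNA protocols and adjudication rules may differ.

```python
from collections import Counter

# Hypothetical multi-reader labels per case; names and values are
# illustrative, not drawn from any actual RSNA Benchmarks release.
reader_labels = {
    "case_001": ["appendicitis", "appendicitis", "normal"],
    "case_002": ["diverticulitis", "diverticulitis", "diverticulitis"],
    "case_003": ["normal", "cholecystitis", "normal"],
}

def majority_consensus(labels, min_agreement=2):
    """Return the modal label when enough readers agree, else None
    (no-consensus cases would typically go to adjudication)."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None

consensus = {case: majority_consensus(lbls) for case, lbls in reader_labels.items()}

def sensitivity(predictions, truth, diagnosis):
    """Fraction of consensus-positive cases for `diagnosis` that the
    model also called `diagnosis` (a simple, transparent metric)."""
    positives = [c for c, t in truth.items() if t == diagnosis]
    hits = sum(1 for c in positives if predictions.get(c) == diagnosis)
    return hits / len(positives) if positives else float("nan")

# Hypothetical model output, again purely for illustration.
predictions = {"case_001": "appendicitis", "case_002": "normal", "case_003": "normal"}
print(sensitivity(predictions, consensus, "appendicitis"))  # -> 1.0
```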
Each benchmark is a structured evaluation covering specific clinical domains, modalities, and diagnostic tasks; a sketch of what a single case record might look like follows the benchmark descriptions below.
The inaugural RSNA Benchmark: a comprehensive evaluation framework for AI models interpreting emergency abdominal CT cases. It focuses on acute diagnoses encountered in clinical practice, such as appendicitis, diverticulitis, and cholecystitis, spanning pathologies across liver, kidney, pancreas, bowel, and vascular structures, with multi-reader consensus ground truth.
Multi-center evaluation of AI performance on frontal and lateral chest radiographs across a spectrum of thoracic pathology.
Structured assessment of AI interpretation across neuro MRI sequences for common and critical neurological diagnoses.
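To make "structured evaluation" concrete, here is a minimal sketch of how one benchmark case might be recorded, with fields for clinical domain, modality, contributing site, and consensus diagnosis. The field names and values are assumptions for illustration, not the published RSNA schema.

```python
from dataclasses import dataclass, field

# A hypothetical record for one benchmark case, showing how clinical
# domain, modality, and diagnostic task might be encoded. Field names
# are assumptions for illustration, not the published RSNA schema.
@dataclass
class BenchmarkCase:
    case_id: str
    domain: str                 # e.g. "emergency abdominal imaging"
    modality: str               # e.g. "CT", "radiograph", "MRI"
    site: str                   # anonymized contributing institution
    consensus_diagnosis: str    # multi-reader consensus ground truth
    differential: list = field(default_factory=list)

case = BenchmarkCase(
    case_id="abd-ct-0001",
    domain="emergency abdominal imaging",
    modality="CT",
    site="site-07",
    consensus_diagnosis="acute appendicitis",
    differential=["mesenteric adenitis", "right-sided diverticulitis"],
)
print(case.consensus_diagnosis)
```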
RSNA Benchmarks is an open initiative. We welcome contributions from radiologists, AI researchers, radiology AI vendors, regulators, and institutions worldwide.
Share anonymized cases from your institution to strengthen benchmark diversity and clinical representativeness.
Help design evaluation frameworks, define ground truth protocols, and build the technical infrastructure.
Run your models against our benchmarks and contribute results to the growing body of evaluation data (coming soon).