Stage: Clinical Development
5 benchmarks, ranked by composite score.
| # | Benchmark | Description | Modalities | Score | Flags |
|---|---|---|---|---|---|
| 1 | MIMIC-IV Benchmark Tasks | Standardized ICU benchmarks on MIMIC-IV — mortality, LOS, sepsis, AKI, drug dosing. | cross-modality | 89.4 | |
| 2 | ClinBench Quarterly (Insilico) | Quarterly-refreshed clinical-trial outcome benchmark on ScienceAIBench / InsilicoBench / DDB (25 tasks). | cross-modality | 81.5 | |
| 3 | HINT / TrialBench | Clinical trial outcome prediction benchmarks built on ClinicalTrials.gov (17-21k trials). | cross-modality | 76.5 | |
| 4 | Trial Outcome Prediction (TOP) | Benchmarks for predicting Phase 1-3 trial outcomes from structured + text features. | cross-modality | 76.5 | |
| 5 | ClawBio Skill Correctness Bench | Third-party (Biostochastics LLC) benchmark of bio-analysis skills on safety / correctness / honesty. 10 skills × 182 tests. | cross-modality | 74.2 | |