Stage: IND-enabling
12 benchmarks, ranked by composite score.
| # | Benchmark | Description | Modalities | Score | Flags |
|---|---|---|---|---|---|
| 1 | ProteinGym | 217 DMS substitution assays plus indel and clinical-variant sets — de facto standard for variant effect predictors (VEPs). | protein-general | 97.5 | |
| 2 | ToxCast | EPA's in vitro toxicity screening dataset — ~700 assays × ~9k chemicals. | small-molecule | 85.6 | |
| 3 | ISM Benchmarks: ADMET (Insilico) | 28-endpoint ADMET benchmark suite on ScienceAIBench/InsilicoBench/DDB. | small-molecule | 84.6 | |
| 4 | Open Systems Pharmacology / PK-Sim | OSP Suite — open PBPK/QSP models and validation sets. | small-molecule, biologic-mab | 80.3 | |
| 5 | AMES (mutagenicity) | AMES bacterial mutagenicity benchmark — standard gentox endpoint. | small-molecule | 79.5 | |
| 6 | Tox21 | US Tox21 program HTS data on 12 nuclear receptor / stress response assays. | small-molecule | 77.5 | |
| 7 | Obach PK Dataset | Obach human PK dataset (t1/2, VDss, CL) — standard human-PK ML benchmark. | small-molecule | 77.0 | |
| 8 | SIDER | Drug-side effect associations mined from FDA labels. | small-molecule | 74.9 | |
| 9 | Simcyp Validation Sets | PBPK validation datasets used by Simcyp/Certara community (DDI, pediatric, renal impairment). | small-molecule | 74.4 | license-gated-commercial |
| 10 | hERG (cardio-tox) TDC | Cardiac tox benchmark (hERG inhibition) — standardized from Wang et al. | small-molecule | 73.9 | |
| 11 | DILI / LD50 Zhu | Drug-induced liver injury + rat LD50 (Zhu) — standard acute tox benchmarks. | small-molecule | 73.9 | |
| 12 | ClinTox | Binary classification of FDA-approved vs. trial-failed-for-toxicity compounds. | small-molecule | 65.6 | data-leakage-known |