
This paper presents Chematria’s machine learning platform for predicting key pharmacokinetic (ADME: absorption, distribution, metabolism, and excretion) and toxicological properties of drug candidates in silico (that is, computationally). Late-stage failures due to toxicity are a major contributor to development cost and time. The system uses deep neural networks trained on toxicogenomic and clinical trial data to forecast human outcomes. In deployment, the model has been used to triage compounds and flag potential safety liabilities, with a projected $40\%$ reduction in preclinical failure rates and faster progress toward regulatory approval.
Drug candidates often proceed through synthesis only to fail months later during in vivo studies because of poor absorption, rapid metabolism, or unforeseen organ toxicity. These failures create large sunk costs and development delays. Chematria’s approach addresses this inefficiency by using predictive models to screen virtually for safety profiles concurrently with efficacy, so that only optimized, "ADME-friendly" compounds advance to the wet lab.
Our models are trained on curated datasets that integrate publicly available chemical structure data with proprietary, high-quality internal assay results covering human liver microsome (HLM) stability, Caco-2 permeability, and CYP450 inhibition. Molecular structure descriptors serve as inputs to multi-layer perceptrons (MLPs) and recurrent neural networks (RNNs).
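As a rough illustration of how a descriptor vector might be mapped to a probabilistic endpoint prediction, the following sketch runs a forward pass through a small MLP in plain Python. The layer sizes, the randomly initialized weights, the placeholder descriptor values, and the function names (`init_layer`, `predict_proba`) are all assumptions made for illustration; they do not describe the platform's actual architecture or trained parameters.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, bias):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Hypothetical initializer: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_proba(descriptors, layers):
    """Forward pass: ReLU on hidden layers, sigmoid on the single output unit."""
    hidden = descriptors
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    return sigmoid(dense(hidden, weights, bias)[0])


rng = random.Random(0)
layers = [init_layer(8, 4, rng), init_layer(4, 1, rng)]

# Placeholder descriptor vector (assumed, not real assay or structure data).
descriptors = [0.2, 1.0, 0.0, 0.5, 0.0, 1.0, 0.3, 0.7]
p = predict_proba(descriptors, layers)
print(f"predicted probability: {p:.3f}")
```

In practice the weights would come from training on the assay data described above, and the descriptor vector would be computed from the molecular structure rather than typed in by hand.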
The platform generates probabilistic predictions for critical ADME metrics, including:
The dedicated Tox-Net module focuses on predicting serious liabilities such as cardiotoxicity (hERG inhibition), hepatotoxicity, and genotoxicity, utilizing established quantitative structure-activity relationship (QSAR) models that are continually updated via active learning feedback loops.
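One common way to drive an active learning loop is uncertainty sampling: the compounds whose predictions are least certain are sent for experimental testing first, and the new labels are fed back into training. The sketch below assumes that strategy, which the text does not confirm; the compound IDs, probabilities, and the `select_for_assay` helper are hypothetical.

```python
def select_for_assay(predictions, batch_size):
    """Return the compound IDs whose predicted probability is closest to 0.5.

    `predictions` maps compound ID -> predicted probability of a liability.
    A probability near 0.5 means the model is least certain about that compound.
    """
    ranked = sorted(predictions, key=lambda cid: abs(predictions[cid] - 0.5))
    return ranked[:batch_size]


# Hypothetical model outputs for five untested compounds.
predictions = {
    "cmpd-A": 0.97,
    "cmpd-B": 0.52,
    "cmpd-C": 0.08,
    "cmpd-D": 0.41,
    "cmpd-E": 0.75,
}

queue = select_for_assay(predictions, batch_size=2)
print(queue)
```

The selected compounds would then be assayed and their measured outcomes added to the training set before the next model update.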
Our models achieved an area under the curve (AUC) exceeding $0.90$ for major toxicity endpoints, specifically liver and renal toxicity, and maintained this performance across diverse scaffold sets.
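For readers less familiar with the metric, the sketch below computes AUC under the assumption that it refers to the area under the ROC curve. In that reading, AUC equals the probability that a randomly chosen toxic compound receives a higher score than a randomly chosen non-toxic one, with ties counted as half. The labels and scores are made-up toy values, not results from the platform.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for toxic, 0 for non-toxic.
    scores: predicted probability of toxicity, in the same order.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy data: three toxic and four non-toxic compounds.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Here 11 of the 12 toxic/non-toxic pairs are ranked correctly, so the toy AUC is 11/12. The pairwise loop is quadratic and is meant for clarity only.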
Filtering out $60\%$ of high-risk compounds at the initial hit-to-lead stage reduced the overall preclinical development timeline for the remaining, optimized candidates by an average of six months.
The platform provides explainable AI (XAI) insights into why a compound is predicted to be toxic, for example by highlighting specific chemical moieties or bond rotations, so that chemists can modify the lead structure to improve its safety profile.
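The text does not say which attribution method the platform uses. One simple possibility is occlusion: remove one structural feature at a time and record how much the predicted toxicity probability drops. The sketch below assumes that method, a toy logistic model over fragment-presence flags, and hypothetical fragment names and weights; none of these come from the platform itself.

```python
import math


def toxicity_probability(features, weights, bias):
    """Toy logistic model over binary fragment-presence features."""
    z = bias + sum(weights[name] for name, present in features.items() if present)
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_attributions(features, weights, bias):
    """For each present fragment, measure the drop in predicted probability
    when that fragment alone is switched off."""
    baseline = toxicity_probability(features, weights, bias)
    attributions = {}
    for name, present in features.items():
        if not present:
            continue
        occluded = dict(features)
        occluded[name] = False
        attributions[name] = baseline - toxicity_probability(occluded, weights, bias)
    return baseline, attributions


# Hypothetical weights and fragment flags (placeholders, not fitted values).
weights = {"fragment_A": 2.0, "fragment_B": 0.3, "fragment_C": -0.5}
bias = -1.0
features = {"fragment_A": True, "fragment_B": True, "fragment_C": False}

baseline, attributions = occlusion_attributions(features, weights, bias)
top = max(attributions, key=attributions.get)
print(f"baseline = {baseline:.3f}, largest contributor = {top}")
```

A chemist reading this output would look first at the fragment with the largest attribution when deciding which part of the lead structure to modify.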
Chematria’s in silico toxicology and ADME platform shifts risk mitigation to the earliest stages of drug discovery, saving valuable resources and focusing laboratory efforts only on candidates with the highest probability of success. This computational foresight is a necessary capability for accelerating the delivery of life-saving therapeutics.