
A comprehensive validation study confirming the high correlation between our in silico potency predictions and subsequent in vitro assay results across a panel of cancer targets.
Kinases are among the most important classes of drug targets in oncology, yet developing selective and potent inhibitors remains a significant challenge because of the high structural homology among kinase families. This study details the validation of Chematria’s machine learning platform, which is designed to predict the half-maximal inhibitory concentration (IC50) and selectivity of novel kinase inhibitors in silico. The model was tested against a blinded dataset of 500 structurally diverse compounds targeting the EGFR and c-Src kinases. The results show a high Pearson correlation (R = 0.94) between predicted and observed IC50 values for the primary EGFR target, supporting the platform as an accurate and scalable method for prioritizing potent and selective oncology candidates.
The human kinome contains over 500 protein kinases, many of which share highly similar ATP-binding sites. This structural overlap makes high selectivity a major hurdle in drug design and often leads to off-target toxicity in clinical trials. Traditional screening methods are slow to pinpoint these selectivity issues. Chematria’s computational platform addresses this by predicting potency and selectivity concurrently, using molecular descriptors and machine learning.
The predictive model is based on an ensemble of Support Vector Machine (SVM) and Random Forest models, trained on a proprietary library of over 15 million kinase-ligand binding pairs. Training focused on features derived from 3D molecular geometry and electronic properties, which are key determinants of binding affinity.
The primary output of the model is the predicted IC50 value. A secondary output is the Selectivity Index (SI), the ratio of a compound's IC50 against an anti-target to its IC50 against the target kinase (e.g., EGFR), evaluated across a panel of 20 known anti-targets (e.g., related kinases), so that a higher SI indicates greater selectivity. This dual-prediction approach allows potency and selectivity to be assessed together.
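The Selectivity Index can be illustrated with a minimal sketch. It assumes the SI is the off-target IC50 divided by the target IC50 (the convention under which a low SI signals poor selectivity) and that the panel is summarized by its worst case, the most potently inhibited anti-target. Both choices, the function name `selectivity_index`, and the panel values are illustrative assumptions, not details of the Chematria platform.

```python
def selectivity_index(target_ic50_nm, off_target_ic50s_nm):
    """Worst-case SI: smallest off-target IC50 divided by the target IC50.

    Assumption: the panel is reduced to its most potently inhibited
    anti-target; the source does not say how the panel is aggregated.
    """
    values = list(off_target_ic50s_nm.values())
    if not values:
        raise ValueError("at least one off-target IC50 is required")
    if target_ic50_nm <= 0 or any(v <= 0 for v in values):
        raise ValueError("IC50 values must be positive")
    return min(values) / target_ic50_nm


# Hypothetical IC50 values in nM for a single compound.
panel = {"c-Src": 850.0, "HER2": 120.0, "ABL1": 2400.0}
si = selectivity_index(5.0, panel)
print(si)  # 24.0 (120.0 / 5.0)
```

Taking the minimum is a conservative choice: a compound is only as selective as its closest off-target.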
A test set of 500 compounds, none of which were included in the training data, was generated. The compounds were screened using the Chematria platform to generate predicted IC50 and SI scores. These predictions were then benchmarked against observed values obtained from standard, low-throughput in vitro enzymatic assays performed by a third-party laboratory.
The model achieved a Pearson correlation coefficient (R) of 0.94 between predicted and experimentally observed IC50 values for the primary EGFR target. This high correlation supports the platform's reliability in forecasting the biological activity of structurally novel inhibitors.
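A correlation of this kind can be computed as in the sketch below. It assumes IC50 values are converted to pIC50 (the negative base-10 logarithm of the molar IC50) before correlating, a common practice because IC50 values span orders of magnitude. The study does not state whether a log transform was applied, and the sample values are made up for illustration.

```python
import math


def to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted and observed IC50 values in nM.
predicted_nm = [3.2, 45.0, 510.0, 12.0, 1800.0]
observed_nm = [4.1, 38.0, 640.0, 20.0, 1500.0]
r = pearson_r([to_pic50(v) for v in predicted_nm],
              [to_pic50(v) for v in observed_nm])
print(f"Pearson R on pIC50: {r:.3f}")
```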
Using the predicted Selectivity Index, the platform correctly flagged 90% of the compounds that failed the in vitro selectivity panel (SI $\le 10$). It did so with a high negative predictive value, meaning that compounds predicted to be selective rarely failed the panel. This demonstrates its effectiveness as a computational filter for high-risk, non-selective compounds.
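The two filter metrics mentioned here can be made concrete with a short sketch. It treats a compound as a "fail" when its SI is at or below the threshold of 10, computes sensitivity as the fraction of observed failures that were predicted to fail, and computes negative predictive value (NPV) as the fraction of predicted passes that actually passed. The function name and the SI values are hypothetical and are not data from the study.

```python
def filter_metrics(predicted_si, observed_si, threshold=10.0):
    """Sensitivity and NPV for flagging non-selective compounds.

    A "positive" is a compound with SI <= threshold (non-selective).
    """
    tp = fp = tn = fn = 0
    for pred, obs in zip(predicted_si, observed_si):
        pred_fail = pred <= threshold
        obs_fail = obs <= threshold
        if pred_fail and obs_fail:
            tp += 1  # correctly flagged as non-selective
        elif pred_fail:
            fp += 1  # flagged, but passed the panel
        elif obs_fail:
            fn += 1  # missed: predicted selective, failed the panel
        else:
            tn += 1  # correctly predicted selective
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {"sensitivity": sensitivity, "npv": npv}


# Hypothetical SI values for six compounds.
predicted = [4.0, 12.0, 8.0, 30.0, 15.0, 2.0]
observed = [6.0, 20.0, 9.0, 25.0, 7.0, 3.0]
metrics = filter_metrics(predicted, observed)
print(metrics)  # sensitivity 0.75 (3 of 4 failures flagged), NPV 2/3
```

Reporting both numbers matters because a filter can catch most failures while still letting through a meaningful share of non-selective compounds among its predicted passes.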
By eliminating 75% of the test set before physical synthesis and in vitro testing, the platform would have reduced the operational cost of the preliminary screening by an estimated 45%.
The validation of Chematria’s predictive model for kinase inhibitor potency and selectivity supports its utility in the oncology drug discovery pipeline. By providing accurate forecasts of potency and potential off-target effects, the platform enables researchers to advance only the most promising candidates, accelerating the development of next-generation, highly targeted cancer therapies.