H.3.2.10. Medicine and science
Ali Ghanbari; Mohaddeseh Keyhanian; Jamshid Pirgazi
Abstract
Accurate prediction of drug–target interactions (DTIs) is essential for advancing drug discovery and drug repositioning. This study introduces a framework that addresses two key challenges in DTI prediction: dataset imbalance and high-dimensional feature representations. The approach combines multiple protein descriptors (nine statistical and sequence-based features) with drug molecular fingerprints computed using the Morgan algorithm; the best feature combinations are selected through validation so that the representation captures diverse biological and chemical information. To mitigate dataset imbalance, a one-class SVM-based undersampling method (One-SVM-US) models the distribution of positive interactions and uses it to guide selective reduction of the majority class, balancing positive and negative samples. A supervised, classification-oriented variational autoencoder then compresses the high-dimensional features into a lower-dimensional space while preserving the class-discriminative information relevant to interaction prediction. The reduced features are classified using machine learning models to predict potential drug–target pairs. Experimental evaluations on benchmark datasets show AUC-ROC scores of 1.00 on the EN, GPCR, and NR datasets and 0.9731 on the IC dataset, an improvement over existing methods. These findings support the robustness of the approach and its potential as a reliable tool for drug–target interaction prediction.
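The one-class SVM undersampling step can be illustrated with a minimal sketch. The abstract does not give the selection rule, the kernel, or any hyperparameters, so everything here is an assumption: the hypothetical function `one_svm_undersample`, the RBF kernel, the `nu` value, the choice to keep the negatives that look least like the positive class, and the synthetic feature vectors standing in for real drug–target pair features. It uses scikit-learn's `OneClassSVM` and is not the authors' implementation.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def one_svm_undersample(X_pos, X_neg, n_keep=None, nu=0.1, gamma="scale"):
    """Reduce the negative (majority) class using a one-class SVM fit on positives.

    Assumed rule: keep the n_keep negatives with the lowest decision scores,
    i.e. those least similar to the known positive interactions.
    """
    n_keep = len(X_pos) if n_keep is None else n_keep
    n_keep = min(n_keep, len(X_neg))

    # Model the distribution of positive interactions only.
    model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_pos)

    # Higher score = more similar to the positive class.
    scores = model.decision_function(X_neg)

    # Indices of the negatives with the lowest scores.
    keep = np.argsort(scores)[:n_keep]
    return X_neg[keep], keep


# Synthetic stand-in data: 20 positive and 200 negative pairs, 8 features each.
rng = np.random.default_rng(0)
X_pos = rng.normal(0.0, 1.0, size=(20, 8))
X_neg = rng.normal(0.5, 1.5, size=(200, 8))

X_neg_kept, kept_idx = one_svm_undersample(X_pos, X_neg)

# Balanced training set: equal numbers of positive and negative samples.
X_bal = np.vstack([X_pos, X_neg_kept])
y_bal = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg_kept))])
```

Keeping the lowest-scoring negatives treats them as the most reliable non-interactions; other rules, such as sampling across the score range, would be equally consistent with the abstract's description.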