Using Machine Learning Methods to Predict Experimental High Throughput Screening Data

Mballo, Cherif and Makarenkov, Vladimir (2010). "Using Machine Learning Methods to Predict Experimental High Throughput Screening Data". Combinatorial Chemistry & High Throughput Screening, 13(5), pp. 430-441.


High-throughput screening (HTS) remains a very costly process despite many recent technological advances in biotechnology. In this study we consider the application of machine learning methods to predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compound libraries and similar drug targets. Following this approach, we analyzed the Test assay from the McMaster University Data Mining and Docking Competition [1] using six methods: binary decision trees, neural networks, support vector machines (SVM), linear discriminant analysis, k-nearest neighbors and partial least squares. First, we studied the sets of molecular and atomic descriptors separately in order to establish which of them provides the better prediction. Then, we compared the six machine learning methods in terms of false positives, false negatives, sensitivity and enrichment factor. Finally, a variable selection procedure that improves the method's sensitivity was implemented and applied in the framework of polynomial SVM.
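The comparison criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `screening_metrics`, the 0/1 label encoding and the toy data are assumptions made for illustration, and the enrichment factor follows one common definition (hit rate among the compounds predicted as hits divided by the hit rate in the whole library), which may differ from the definition used in the article.

```python
from typing import Dict, Sequence


def screening_metrics(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Compare predicted labels with measured ones (1 = hit, 0 = non-hit).

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # hits correctly predicted
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives (missed hits)

    total = len(pairs)
    hits = tp + fn       # measured hits in the library
    selected = tp + fp   # compounds the model predicts as hits

    # Sensitivity: fraction of measured hits that the model recovers.
    sensitivity = tp / hits if hits else float("nan")

    # Enrichment factor (assumed definition): hit rate in the selected
    # subset relative to the hit rate in the whole library.
    if hits and selected:
        enrichment = (tp / selected) / (hits / total)
    else:
        enrichment = float("nan")

    return {
        "false_positives": float(fp),
        "false_negatives": float(fn),
        "sensitivity": sensitivity,
        "enrichment_factor": enrichment,
    }


# Toy data: 10 compounds, 2 measured hits, 2 compounds predicted as hits.
toy_actual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_predicted = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_metrics = screening_metrics(toy_actual, toy_predicted)
```

On this toy data the model recovers one of the two hits, and half of its selected compounds are hits against a 20% hit rate in the whole set. An enrichment factor above 1 means the selection is richer in hits than random picking would be.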

Type: Journal article
Keywords or Subjects: CART, decision trees, drug target, hit, k-nearest neighbors (kNN), linear discriminant analysis (LDA), neural networks (NN), partial least squares (PLS), ROC curve, sampling, support vector machines (SVM), virtual high-throughput screening.
Affiliated unit: Faculty of Science > Department of Computer Science
Deposited by: Vladimir Makarenkov
Deposit date: 23 March 2016 13:13
Last modified: 20 Apr. 2016 20:01