Feature selection methods in QSAR studies.
A quantitative structure-activity relationship (QSAR) relates quantitative chemical structure attributes (molecular descriptors) to a biological activity. QSAR studies have now become attractive in drug discovery and development because their application can save substantial time and human resources. Several parameters are important in the prediction ability of a QSAR model. On the one hand, different statistical methods may be applied to check the linear or nonlinear behavior of a data set. On the other hand, feature selection techniques are applied to decrease the model complexity, to decrease the overfitting/overtraining risk, and to select the most important descriptors from the often more than 1000 calculated. The selected descriptors are then linked to a biological activity of the corresponding compound by means of a mathematical model. Different modeling techniques can be applied, some of which explicitly require a feature selection. A QSAR model can be useful in the design of new compounds with improved potency in the class under study. Only molecules with a predicted interesting activity will be synthesized. In the feature selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus attention, while ignoring the rest. Up to now, many feature selection techniques, such as genetic algorithms, forward selection, backward elimination, stepwise regression, and simulated annealing have been used extensively. Swarm intelligence optimizations, such as ant colony optimization and partial swarm optimization, which are feature selection techniques usually simulated based on animal and insect life behavior to find the shortest path between a food source and their nests, recently are also involved in QSAR studies. This review paper provides an overview of different feature selection techniques applied in QSAR modeling.
Vrije Universiteit Brussel, Department of Analytical Chemistry and Pharmaceutical Technology, Center for Pharmaceutical Research, Brussels, Belgium.
SourceJournal of AOAC International 95:3 pg 636-51
Quantitative Structure-Activity Relationship
Pub Type(s)Journal Article
Research Support, Non-U.S. Gov't