Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Stylometry and DNA microarray analysis are two cases where feature selection is used. It should be distinguished from feature extraction.[1]
Feature selection techniques are used for several reasons:
simplification of models to make them easier for researchers and users to interpret,[2]
shorter training times,[3]
avoidance of the curse of dimensionality,[4]
improved compatibility of the data with a given class of learning models,[5]
encoding of inherent symmetries present in the input space.[6][7][8][9]
The central premise when using a feature selection technique is that the data contains some features that are either redundant or irrelevant, and can thus be removed without incurring much loss of information.[10] Redundant and irrelevant are two distinct notions, since one relevant feature may be redundant in the presence of another relevant feature with which it is strongly correlated.[11]
Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points).
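The filter-style approach described above — scoring each feature individually against the target and keeping only a subset — can be sketched as follows. This is a minimal illustration, not any specific published algorithm: the function names (`pearson`, `select_top_k`) and the toy data are hypothetical, and real applications would use more robust relevance criteria (e.g. mutual information) and handle redundancy between features.

```python
# Minimal sketch of filter-style feature selection: rank features by
# absolute Pearson correlation with the target and keep the top k.
import math

def pearson(xs, ys):
    # Sample Pearson correlation; returns 0.0 for constant features.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_top_k(X, y, k):
    """Return indices of the k features most correlated with y.

    X is a list of samples, each a list of feature values; the result
    is a subset of the original feature indices (no new features are
    constructed, in contrast to feature extraction).
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scores.append((abs(pearson(column, y)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])

# Toy data: feature 0 tracks y exactly, feature 1 is constant
# (irrelevant), feature 2 is weakly related noise.
X = [[1, 5, 9], [2, 5, 1], [3, 5, 8], [4, 5, 2]]
y = [1, 2, 3, 4]
print(select_top_k(X, y, 1))  # → [0]
```

Note that such univariate scoring cannot detect redundancy: two strongly correlated relevant features would both receive high scores, which is why the distinction between irrelevant and redundant features matters in practice.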
^ Sarangi, Susanta; Sahidullah, Md; Saha, Goutam (September 2020). "Optimization of data-driven filterbank for automatic speaker verification". Digital Signal Processing. 104: 102795. arXiv:2007.10729. doi:10.1016/j.dsp.2020.102795. S2CID 220665533.
^ Gareth James; Daniela Witten; Trevor Hastie; Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer. p. 204.
^ Brank, Janez; Mladenić, Dunja; Grobelnik, Marko; Liu, Huan; Flach, Peter A.; Garriga, Gemma C.; Toivonen, Hannu (2011), "Feature Selection", in Sammut, Claude; Webb, Geoffrey I. (eds.), Encyclopedia of Machine Learning, Boston, MA: Springer US, pp. 402–406, doi:10.1007/978-0-387-30164-8_306, ISBN 978-0-387-30768-8, retrieved 2021-07-13.
^ Kramer, Mark A. (1991). "Nonlinear principal component analysis using autoassociative neural networks". AIChE Journal. 37 (2): 233–243. doi:10.1002/aic.690370209. ISSN 1547-5905.
^ Kratsios, Anastasis; Hyndman, Cody (2021). "NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation". Journal of Machine Learning Research. 22 (92): 1–51. ISSN 1533-7928.
^ Persello, Claudio; Bruzzone, Lorenzo (July 2014). "Relevant and invariant feature selection of hyperspectral images for domain generalization". 2014 IEEE Geoscience and Remote Sensing Symposium. IEEE. pp. 3562–3565. doi:10.1109/igarss.2014.6947252. ISBN 978-1-4799-5775-0. S2CID 8368258.
^ Hinkle, Jacob; Muralidharan, Prasanna; Fletcher, P. Thomas; Joshi, Sarang (2012). "Polynomial Regression on Riemannian Manifolds". In Fitzgibbon, Andrew; Lazebnik, Svetlana; Perona, Pietro; Sato, Yoichi; Schmid, Cordelia (eds.). Computer Vision – ECCV 2012. Lecture Notes in Computer Science. Vol. 7574. Berlin, Heidelberg: Springer. pp. 1–14. arXiv:1201.2395. doi:10.1007/978-3-642-33712-3_1. ISBN 978-3-642-33712-3. S2CID 8849753.