Feature selection


Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Stylometry and DNA microarray analysis are two cases where feature selection is used. It should be distinguished from feature extraction.[1]

Feature selection techniques are used for several reasons:

  • simplifying models to make them easier for researchers and users to interpret,[2]
  • shortening training times,[3]
  • avoiding the curse of dimensionality,[4]
  • improving the data's compatibility with a given class of learning models,[5]
  • encoding inherent symmetries present in the input space.[6][7][8][9]

The central premise when using a feature selection technique is that the data contains some features that are either redundant or irrelevant, and can thus be removed without much loss of information.[10] Redundant and irrelevant are two distinct notions: a relevant feature may be redundant in the presence of another relevant feature with which it is strongly correlated.[11]
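The redundancy notion can be illustrated with a small filter-style sketch in pure Python (the data and the 0.95 threshold are hypothetical, not from any cited method): a feature is dropped when its Pearson correlation with an already-kept feature is very high, even though both features are individually relevant.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def drop_redundant(features, threshold=0.95):
    """Greedy correlation filter: keep each feature only if its |r| with
    every already-kept feature stays below the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# x2 is a near-copy of x1 (redundant); x3 is only weakly correlated with x1.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [1.1, 2.0, 3.1, 4.0, 5.1],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_redundant(data))  # ['x1', 'x3'] — x2 removed as redundant
```

Greedy filtering of this kind is order-dependent: which of two strongly correlated features survives depends on which is examined first.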

Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points).
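The distinction between the two can be made concrete with a toy sketch (the three-feature rows and the derived functions are hypothetical): selection indexes into the original columns, while extraction computes entirely new columns as functions of them.

```python
# Hypothetical rows with three original features each.
rows = [(1.0, 4.0, 0.5), (2.0, 5.0, 0.25)]

# Feature selection: return a subset of the original columns, unchanged.
selected_idx = [0, 2]
selected = [tuple(r[i] for i in selected_idx) for r in rows]

# Feature extraction: build new features as functions of the originals
# (here a sum and a product); the original columns are no longer present.
extracted = [(r[0] + r[1], r[0] * r[2]) for r in rows]

print(selected)   # [(1.0, 0.5), (2.0, 0.25)]
print(extracted)  # [(5.0, 0.5), (7.0, 0.5)]
```

Because selected features keep their original meaning, a selected subset remains directly interpretable, whereas extracted features generally are not.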

  1. ^ Sarangi, Susanta; Sahidullah, Md; Saha, Goutam (September 2020). "Optimization of data-driven filterbank for automatic speaker verification". Digital Signal Processing. 104: 102795. arXiv:2007.10729. doi:10.1016/j.dsp.2020.102795. S2CID 220665533.
  2. ^ Gareth James; Daniela Witten; Trevor Hastie; Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer. p. 204.
  3. ^ Brank, Janez; Mladenić, Dunja; Grobelnik, Marko; Liu, Huan; Mladenić, Dunja; Flach, Peter A.; Garriga, Gemma C.; Toivonen, Hannu; Toivonen, Hannu (2011), "Feature Selection", in Sammut, Claude; Webb, Geoffrey I. (eds.), Encyclopedia of Machine Learning, Boston, MA: Springer US, pp. 402–406, doi:10.1007/978-0-387-30164-8_306, ISBN 978-0-387-30768-8, retrieved 2021-07-13
  4. ^ Kramer, Mark A. (1991). "Nonlinear principal component analysis using autoassociative neural networks". AIChE Journal. 37 (2): 233–243. doi:10.1002/aic.690370209. ISSN 1547-5905.
  5. ^ Kratsios, Anastasis; Hyndman, Cody (2021). "NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation". Journal of Machine Learning Research. 22 (92): 1–51. ISSN 1533-7928.
  6. ^ Persello, Claudio; Bruzzone, Lorenzo (July 2014). "Relevant and invariant feature selection of hyperspectral images for domain generalization". 2014 IEEE Geoscience and Remote Sensing Symposium (PDF). IEEE. pp. 3562–3565. doi:10.1109/igarss.2014.6947252. ISBN 978-1-4799-5775-0. S2CID 8368258.
  7. ^ Hinkle, Jacob; Muralidharan, Prasanna; Fletcher, P. Thomas; Joshi, Sarang (2012). "Polynomial Regression on Riemannian Manifolds". In Fitzgibbon, Andrew; Lazebnik, Svetlana; Perona, Pietro; Sato, Yoichi; Schmid, Cordelia (eds.). Computer Vision – ECCV 2012. Lecture Notes in Computer Science. Vol. 7574. Berlin, Heidelberg: Springer. pp. 1–14. arXiv:1201.2395. doi:10.1007/978-3-642-33712-3_1. ISBN 978-3-642-33712-3. S2CID 8849753.
  8. ^ Yarotsky, Dmitry (2021-04-30). "Universal Approximations of Invariant Maps by Neural Networks". Constructive Approximation. 55: 407–474. arXiv:1804.10306. doi:10.1007/s00365-021-09546-1. ISSN 1432-0940. S2CID 13745401.
  9. ^ Hauberg, Søren; Lauze, François; Pedersen, Kim Steenstrup (2013-05-01). "Unscented Kalman Filtering on Riemannian Manifolds". Journal of Mathematical Imaging and Vision. 46 (1): 103–120. doi:10.1007/s10851-012-0372-9. ISSN 1573-7683. S2CID 8501814.
  10. ^ Kratsios, Anastasis; Hyndman, Cody (2021). "NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation". Journal of Machine Learning Research. 22 (92): 1–51. ISSN 1533-7928.
  11. ^ Guyon, Isabelle; Elisseeff, André (2003). "An Introduction to Variable and Feature Selection". Journal of Machine Learning Research. 3: 1157–1182.

26 related articles for "Feature selection":

  • Feature selection
  • Minimum redundancy feature selection
  • Feature engineering
  • Feature Selection Toolbox
  • Dimensionality reduction
  • Model selection
  • Pattern recognition
  • Automated machine learning
  • Data preprocessing
  • Curse of dimensionality
  • Tag SNP
  • Ensemble learning
  • Outline of machine learning
  • Social media mining
  • Vestigiality
  • Decision tree learning
  • Genetic algorithm
  • NFL draft
  • Bootstrap aggregating
  • Supervised learning
  • Bayesian structural time series
  • Submodular set function
  • Inductive bias
  • Structural health monitoring
  • ABC analysis
  • Cartographic generalization
PDF Search Engine © AllGlobal.net