In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient (sex, blood pressure, presence or absence of certain symptoms, etc.).
Often, the individual observations are analyzed into a set of quantifiable properties, known variously as explanatory variables or features. These properties may variously be categorical (e.g. "A", "B", "AB" or "O", for blood type), ordinal (e.g. "large", "medium" or "small"), integer-valued (e.g. the number of occurrences of a particular word in an email) or real-valued (e.g. a measurement of blood pressure). Other classifiers work by comparing observations to previous observations by means of a similarity or distance function.
An algorithm that implements classification, especially in a concrete implementation, is known as a classifier. The term "classifier" sometimes also refers to the mathematical function, implemented by a classification algorithm, that maps input data to a category.
Terminology across fields is quite varied. In statistics, where classification is often done with logistic regression or a similar procedure, the properties of observations are termed explanatory variables (or independent variables, regressors, etc.), and the categories to be predicted are known as outcomes, which are considered to be possible values of the dependent variable. In machine learning, the observations are often known as instances, the explanatory variables are termed features (grouped into a feature vector), and the possible categories to be predicted are classes. Other fields may use different terminology: e.g. in community ecology, the term "classification" normally refers to cluster analysis.
and 29 Related for: Statistical classification information
sentence; etc. A common subclass of classification is probabilistic classification. Algorithms of this nature use statistical inference to find the best class...
International StatisticalClassification of Diseases and Related Health Problems, although the original title, International Classification of Diseases...
Classification is a broad concept that comprises the process of classifying, the set of groups resulting from classifying, and the assignment of elements...
The StatisticalClassification of Economic Activities in the European Community, commonly referred to as NACE (for the French term "nomenclature statistique...
In machine learning and statisticalclassification, multiclass classification or multinomial classification is the problem of classifying instances into...
at any point by the Diagnostic and Statistical Manual of Mental Disorders (DSM) or the International Classification of Diseases (ICD). A mental disorder...
not detecting a disease when it is present (a false negative). Statisticalclassification is a problem studied in machine learning. It is a type of supervised...
financial markets. National and international statistical agencies use various industry-classification schemes to summarize economic conditions. Securities...
initial impetus for developing a classification of mental disorders in the United States was the need to collect statistical information. The first official...
In statisticalclassification, the Bayes classifier is the classifier having the smallest probability of misclassification of all classifiers using the...
or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups...
A medical classification is used to transform descriptions of medical diagnoses or procedures into standardized statistical code in a process known as...
Still, a comprehensive comparison with other classification algorithms in 2006 showed that Bayes classification is outperformed by other approaches, such...
Implementation details are not shared but they fall in the field of statisticalclassification or expert systems. Before beginning the questionnaire, the players...
in previously pleasurable activities. While the International StatisticalClassification of Diseases and Related Health Problems, Tenth Revision (ICD-10)...
of the International Classification of Diseases (ICD) and in the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders...
involves training a classifier (the key difference to many other statisticalclassification problems is the inherently unbalanced nature of outlier detection)...
(SIC) North American Product Classification System (NAPCS) Global Industry Classification Standard StatisticalClassification of Economic Activities in the...
making it a difficult method to prevent. In the International StatisticalClassification of Diseases and Related Health Problems, suicides by hanging are...
the observation should belong to. Probabilistic classifiers provide classification that can be useful in its own right or when combining classifiers into...
when the clinician does not give a reason. The International StatisticalClassification of Diseases and Related Health Problems (ICD-10) refers to the...
Organization (WHO) publishes a classification of known diseases and injuries, the International StatisticalClassification of Diseases and Related Health...
The International Standard Classification of Occupations (ISCO) is an International Labour Organization (ILO) classification structure for organizing information...
In statisticalclassification, two main approaches are called the generative approach and the discriminative approach. These compute classifiers by different...
false negative) diagnosis, and in statisticalclassification as a false positive (or false negative) error. In statistical hypothesis testing, the analogous...
The International Standard Classification of Education (ISCED) is a statistical framework for organizing information on education maintained by the United...
statistical learning frameworks of VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974). In addition to performing linear classification...
In the traditional language of statistical hypothesis testing, the sensitivity of a test is called the statistical power of the test, although the word...
probability of output in terms of input and does not perform statisticalclassification (it is not a classifier), though it can be used to make a classifier...