List of datasets in computer vision and image processing
Outline of machine learning
v
t
e
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some specific sense defined by the analyst) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
Cluster analysis refers to a family of algorithms and tasks rather than one specific algorithm. It can be achieved by various algorithms that differ significantly in their understanding of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances between cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including parameters such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
Besides the term clustering, there is a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek: βότρυς'grape'), typological analysis, and community detection. The subtle differences are often in the use of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.
Cluster analysis was originated in anthropology by Driver and Kroeber in 1932[1] and introduced to psychology by Joseph Zubin in 1938[2] and Robert Tryon in 1939[3] and famously used by Cattell beginning in 1943[4] for trait theory classification in personality psychology.
^Driver and Kroeber (1932). "Quantitative Expression of Cultural Relationships". University of California Publications in American Archaeology and Ethnology. Quantitative Expression of Cultural Relationships. Berkeley, CA: University of California Press: 211–256. Archived from the original on 2020-12-06. Retrieved 2019-02-18.
^Zubin, Joseph (1938). "A technique for measuring like-mindedness". The Journal of Abnormal and Social Psychology. 33 (4): 508–516. doi:10.1037/h0055441. ISSN 0096-851X.
^Tryon, Robert C. (1939). Cluster Analysis: Correlation Profile and Orthometric (factor) Analysis for the Isolation of Unities in Mind and Personality. Edwards Brothers.
^Cattell, R. B. (1943). "The description of personality: Basic traits resolved into clusters". Journal of Abnormal and Social Psychology. 38 (4): 476–506. doi:10.1037/h0054116.
Clusteranalysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar...
hierarchical clustering (also called hierarchical clusteranalysis or HCA) is a method of clusteranalysis that seeks to build a hierarchy of clusters. Strategies...
two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as...
some multivariate techniques such as multidimensional scaling and clusteranalysis, the concept of distance between the units in the data is often of...
noise from grayscale images. In clusteranalysis, the k-medians clustering algorithm provides a way of defining clusters, in which the criterion of maximising...
vector space using the rows of V {\displaystyle V} . Now the analysis is reduced to clustering vectors with k {\displaystyle k} components, which may be...
discriminant correspondence analysis. Discriminant analysis is used when groups are known a priori (unlike in clusteranalysis). Each case must have a score...
In statistics, cluster sampling is a sampling plan used when mutually homogeneous yet internally heterogeneous groupings are evident in a statistical...
Look up cluster in Wiktionary, the free dictionary. Cluster(s) may refer to: Cluster (spacecraft), constellation of four European Space Agency spacecraft...
more than one cluster. Clustering or clusteranalysis involves assigning data points to clusters such that items in the same cluster are as similar as possible...
Embeddings for machine learning models include support-vector machines, clustering and probabilistic graphical models. Moreover, due to its close connection...
Boolean analysis – a method to find deterministic dependencies between variables in a sample, mostly used in exploratory data analysisClusteranalysis – techniques...
like a single computer Data cluster, an allocation of contiguous storage in databases and file systems Clusteranalysis, the statistical task of grouping...
the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct...
pattern recognition and machine learning, where time series analysis can be used for clustering, classification, query by content, anomaly detection as well...
other subgroups. In clusteranalysis, the number of clusters to search for K is determined in advance; how distinct the clusters are varies. The results...
Bivariate analysis is one of the simplest forms of quantitative (statistical) analysis. It involves the analysis of two variables (often denoted as X, Y)...
used in unsupervised learning are principal component and clusteranalysis. Clusteranalysis is used in unsupervised learning to group, or segment, datasets...
describing a cluster is not standardized. Individual economic consultants and researchers develop their own methodologies. All clusteranalysis relies on...
Document clustering (or text clustering) is the application of clusteranalysis to textual documents. It has applications in automatic document organization...
automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (clusteranalysis), unusual...
Clusters/Components/Kernels) is an algorithm based on graph connectivity for clusteranalysis. It works by representing the similarity data in a similarity graph...
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg...