Cluster analysis information


The result of a cluster analysis shown as the coloring of the squares into three clusters

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some specific sense defined by the analyst) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and machine learning.

Cluster analysis refers to a family of algorithms and tasks rather than one specific algorithm. It can be achieved by various algorithms that differ significantly in their understanding of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances between cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including parameters such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
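To make the dependence on parameter settings concrete, here is a minimal sketch of one common approach, k-means (Lloyd's algorithm), which uses the "small distances between cluster members" notion of a cluster. The sample points, the function name kmeans, and the choice to initialize from the first k points are assumptions made for illustration only; real analyses usually use a library implementation and a more careful initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Illustrative Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption for this sketch: use the first k points as starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels

# Hypothetical 2-D data with three visually separate groups.
points = [
    (1.0, 1.0), (8.0, 8.0), (1.0, 9.0),
    (1.5, 1.2), (8.2, 7.9), (0.8, 9.1),
    (1.2, 0.8), (7.9, 8.3), (1.1, 8.8),
]

_, labels3 = kmeans(points, k=3)
_, labels2 = kmeans(points, k=2)
print(labels3)  # [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(labels2)  # [0, 1, 1, 0, 1, 1, 0, 1, 1]
```

With k=3 the three groups are recovered, while with k=2 two of them are merged into one cluster. Neither result is wrong in itself; which one is useful depends on the data set and the intended use, which is why the number of expected clusters is a parameter the analyst has to choose and revisit.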

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek: βότρυς 'grape'), typological analysis, and community detection. The subtle differences often lie in how the results are used: in data mining, the resulting groups are the matter of interest, whereas in automatic classification the resulting discriminative power is of interest.

Cluster analysis originated in anthropology with Driver and Kroeber in 1932[1]. It was introduced to psychology by Joseph Zubin in 1938[2] and Robert Tryon in 1939[3], and was famously used by Cattell beginning in 1943[4] for trait theory classification in personality psychology.

  1. Driver and Kroeber (1932). "Quantitative Expression of Cultural Relationships". University of California Publications in American Archaeology and Ethnology. Berkeley, CA: University of California Press: 211–256.
  2. Zubin, Joseph (1938). "A technique for measuring like-mindedness". The Journal of Abnormal and Social Psychology. 33 (4): 508–516. doi:10.1037/h0055441. ISSN 0096-851X.
  3. Tryon, Robert C. (1939). Cluster Analysis: Correlation Profile and Orthometric (factor) Analysis for the Isolation of Unities in Mind and Personality. Edwards Brothers.
  4. Cattell, R. B. (1943). "The description of personality: Basic traits resolved into clusters". Journal of Abnormal and Social Psychology. 38 (4): 476–506. doi:10.1037/h0054116.

23 related articles for: Cluster analysis

Cluster analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar...

Word Count : 8803

Hierarchical clustering

hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies...

Word Count : 2895

Principal component analysis

two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as...

Word Count : 14214

Standard score

some multivariate techniques such as multidimensional scaling and cluster analysis, the concept of distance between the units in the data is often of...

Word Count : 1883

Median

noise from grayscale images. In cluster analysis, the k-medians clustering algorithm provides a way of defining clusters, in which the criterion of maximising...

Word Count : 7641

Spectral clustering

vector space using the rows of V. Now the analysis is reduced to clustering vectors with k components, which may be...

Word Count : 2933

Linear discriminant analysis

discriminant correspondence analysis. Discriminant analysis is used when groups are known a priori (unlike in cluster analysis). Each case must have a score...

Word Count : 5931

Cluster sampling

In statistics, cluster sampling is a sampling plan used when mutually homogeneous yet internally heterogeneous groupings are evident in a statistical...

Word Count : 2205

Cluster

Look up cluster in Wiktionary, the free dictionary. Cluster(s) may refer to: Cluster (spacecraft), constellation of four European Space Agency spacecraft...

Word Count : 575

Fuzzy clustering

more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that items in the same cluster are as similar as possible...

Word Count : 2018

Quadratic unconstrained binary optimization

Embeddings for machine learning models include support-vector machines, clustering and probabilistic graphical models. Moreover, due to its close connection...

Word Count : 2621

Analysis

Boolean analysis – a method to find deterministic dependencies between variables in a sample, mostly used in exploratory data analysis Cluster analysis – techniques...

Word Count : 2509

Clustering

like a single computer Data cluster, an allocation of contiguous storage in databases and file systems Cluster analysis, the statistical task of grouping...

Word Count : 153

Determining the number of clusters in a data set

the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct...

Word Count : 2750

Time series

pattern recognition and machine learning, where time series analysis can be used for clustering, classification, query by content, anomaly detection as well...

Word Count : 4833

Race and genetics

other subgroups. In cluster analysis, the number of clusters to search for K is determined in advance; how distinct the clusters are varies. The results...

Word Count : 11509

Bivariate analysis

Bivariate analysis is one of the simplest forms of quantitative (statistical) analysis. It involves the analysis of two variables (often denoted as X, Y)...

Word Count : 926

Unsupervised learning

used in unsupervised learning are principal component and cluster analysis. Cluster analysis is used in unsupervised learning to group, or segment, datasets...

Word Count : 2378

Business cluster

describing a cluster is not standardized. Individual economic consultants and researchers develop their own methodologies. All cluster analysis relies on...

Word Count : 2975

Document clustering

Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization...

Word Count : 886

Data mining

automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual...

Word Count : 5009

HCS clustering algorithm

Clusters/Components/Kernels) is an algorithm based on graph connectivity for cluster analysis. It works by representing the similarity data in a similarity graph...

Word Count : 1154

DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg...

Word Count : 3488
