"Collinearity (statistics)" redirects here. Not to be confused with Collinearity (geometry).
This article includes a list of general references, but it lacks sufficient corresponding inline citations. Please help to improve this article by introducing more precise citations.(January 2024) (Learn how and when to remove this message)
In statistics, multicollinearity or collinearity is a situation where the predictors in a regression model are linearly dependent.
Perfect multicollinearity refers to a situation where the predictive variables have an exact linear relationship. When there is perfect collinearity, the design matrix has less than full rank, and therefore the moment matrix cannot be inverted. In this situation, the parameter estimates of the regression are not well-defined, as the system of equations has infinitely many solutions.
Imperfect multicollinearity refers to a situation where the predictive variables have a nearly exact linear relationship.
Contrary to popular belief, neither the Gauss–Markov theorem nor the more common maximum likelihood justification for ordinary least squares relies on any kind of correlation structure between dependent predictors[1][2][3] (although perfect collinearity can cause problems with some software).
There is no justification for the practice of removing collinear variables as part of regression analysis,[1][4][5][6][7] and doing so may constitute scientific misconduct. Econometricians and statisticians have facetiously referred to imperfect collinearity as "micronumerosity", noting that it is only a problem when working with an insufficient sample size.[3][4] Including collinear variables does not reduce the predictive power or reliability of the model as a whole,[6] and does not reduce the accuracy of coefficient estimates.[1]
High collinearity indicates that it is exceptionally important to include all collinear variables, as excluding any will cause worse coefficient estimates, strong confounding, and downward-biased estimates of standard errors.[2]
^ abcGujarati, Damodar (2009). "Multicollinearity: what happens if the regressors are correlated?". Basic Econometrics (4th ed.). McGraw−Hill. pp. 363. ISBN 9780073375779.
^ abKalnins, Arturs; Praitis Hill, Kendall (13 December 2023). "The VIF Score. What is it Good For? Absolutely Nothing". Organizational Research Methods. doi:10.1177/10944281231216381. ISSN 1094-4281.
^ abLeamer, Edward E. (1973). "Multicollinearity: A Bayesian Interpretation". The Review of Economics and Statistics. 55 (3): 371–380. doi:10.2307/1927962. ISSN 0034-6535. JSTOR 1927962.
^ abGiles, Dave (15 September 2011). "Econometrics Beat: Dave Giles' Blog: Micronumerosity". Econometrics Beat. Retrieved 3 September 2023.
^Goldberger,(1964), A.S. (1964). Econometric Theory. New York: Wiley.{{cite book}}: CS1 maint: numeric names: authors list (link)
^ abGoldberger, A.S. "Chapter 23.3". A Course in Econometrics. Cambridge MA: Harvard University Press.
^Blanchard, Olivier Jean (October 1987). "Comment". Journal of Business & Economic Statistics. 5 (4): 449–451. doi:10.1080/07350015.1987.10509611. ISSN 0735-0015.
statistics, multicollinearity or collinearity is a situation where the predictors in a regression model are linearly dependent. Perfect multicollinearity refers...
even more important. One major use of PCR lies in overcoming the multicollinearity problem which arises when two or more of the explanatory variables...
reason for the use of sensitivity analysis is to detect multicollinearity. Multicollinearity is the phenomenon where the correlation between two variables...
regression is a general way of handling this issue. Lack of perfect multicollinearity in the predictors. For standard least squares estimation methods,...
In practice, we rarely face perfect multicollinearity in a data set. More commonly, the issue of multicollinearity arises when there is a "strong linear...
predictors has more variables than observations, and when there is multicollinearity among X values. By contrast, standard regression will fail in these...
other X variables) on the right hand side. Analyze the magnitude of multicollinearity by considering the size of the VIF ( α ^ i ) {\displaystyle \operatorname...
in selecting the explanatory variable, however. They do warn about multicollinearity (choosing two or more explanatory variables that perfectly correlate...
tolerances that affect a particular parameter Tolerance, a measure of multicollinearity in statistics Tolerance interval, a type of statistical probability...
Analysis of covariance (ANCOVA) is a general linear model that blends ANOVA and regression. ANCOVA evaluates whether the means of a dependent variable...
estimation can be carried out if, and only if, there is no perfect multicollinearity between the explanatory variables (which would cause the gram matrix...
variance-covariance matrix homogeneity, and linear relationship, no multicollinearity, and each without outliers. Assume n {\textstyle n} q {\textstyle...
ill-posed problems. It is particularly useful to mitigate the problem of multicollinearity in linear regression, which commonly occurs in models with large numbers...
machine-learning methods, include model selection, overfitting, and multicollinearity. Given a data set { y i , x i 1 , … , x i p } i = 1 n {\displaystyle...
used in this situation. Multicollinearity refers to unacceptably high correlations between predictors. As multicollinearity increases, coefficients remain...
However, only a fraction of genes will be differentially expressed. Multicollinearity often occurs in high-throughput biostatistical settings. Due to high...
pp. 100–104. ISBN 0-471-05856-4. Pesaran, M. Hashem (2015). "The Multicollinearity Problem". Time Series and Panel Data Econometrics. New York: Oxford...
group (mathematics), a sporadic simple group Monte Carlo localization Multicollinearity This disambiguation page lists articles associated with the title...
differences; (d) more uniqueness and less likelihood of suffering from multicollinearity; (e) greater cross-cultural validity, because they did not require...
condition number of the correlation matrix is less than 10 and the multicollinearity effects are not expected to be strong. The results of the detailed...
predictive modeling, the performance of OLS estimates can be poor if multicollinearity is present, unless the sample size is large. Weighted least squares...
also published basic research on sparse data in meta-analysis, on multicollinearity, and on the dichotomization of ordinal data, and is one of the namesakes...