Set of statistical processes for estimating the relationships among variables
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis[1]) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
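The ordinary least squares criterion described above can be sketched in a few lines. For simple linear regression the minimizing line has a closed-form solution: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the sample means. The function name `ols_fit` and the example data are illustrative, not from any particular library:

```python
def ols_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered cross-product and sum of squares; the 1/(n-1) factors
    # in covariance and variance cancel in the ratio.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Noiseless data on the line y = 1 + 2x; OLS recovers it exactly.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0, b1 = ols_fit(x, y)
```

With noisy data the fitted line no longer passes through every point, but it still estimates the conditional expectation of y given x, which is the property the paragraph above refers to.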
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.[2][3]
^ Necessary Condition Analysis
^ David A. Freedman (27 April 2009). Statistical Models: Theory and Practice. Cambridge University Press. ISBN 978-1-139-47731-4.
^ R. Dennis Cook; Sanford Weisberg (1982). "Criticism and Influence Analysis in Regression". Sociological Methodology, Vol. 13, pp. 313–361.