In statistics, the predicted residual error sum of squares (PRESS) is a form of cross-validation used in regression analysis to provide a summary measure of the fit of a model to a sample of observations that were not themselves used to estimate the model. It is calculated as the sums of squares of the prediction residuals for those observations.[1][2][3]
A fitted model having been produced, each observation in turn is removed and the model is refitted using the remaining observations. The out-of-sample predicted value is calculated for the omitted observation in each case, and the PRESS statistic is calculated as the sum of the squares of all the resulting prediction errors:[4]
Given this procedure, the PRESS statistic can be calculated for a number of candidate model structures for the same dataset, with the lowest values of PRESS indicating the best structures. Models that are over-parameterised (over-fitted) would tend to give small residuals for observations included in the model-fitting but large residuals for observations that are excluded.
PRESS statistic has been extensively used in Lazy Learning and locally linear learning to speed-up the assessment and the selection of the neighbourhood size.[5][6]
^"Statsoft Electronic Statistics Textbook - Statistics Glossary". Archived from the original on May 10, 2016. Retrieved May 13, 2016.
^Allen, D. M. (1974), "The Relationship Between Variable Selection and Data Augmentation and a Method for Prediction," Technometrics, 16, 125–127
^Tarpey, Thaddeus (2000) "A Note on the Prediction Sum of Squares Statistic for Restricted Least Squares", The American Statistician, Vol. 54, No. 2, May, pp. 116–118
^"R Graphical Manual:Allen's PRESS (Prediction Sum-Of-Squares) statistic, aka P-square". Archived from the original on February 27, 2018. Retrieved February 27, 2018.
^Atkeson, Christopher G.; Moore, Andrew W.; Schaal, Stefan (1 February 1997). "Locally Weighted Learning". Artificial Intelligence Review. 11 (1): 11–73. doi:10.1023/A:1006559212014. ISSN 1573-7462. S2CID 9219592. Archived from the original on 6 May 2021. Retrieved 25 September 2020.
^Bontempi, Gianluca; Birattari, Mauro; Bersini, Hugues (1 January 1999). "Lazy learning for local modelling and control design". International Journal of Control. 72 (7–8): 643–658. doi:10.1080/002071799220830.
observation in each case, and the PRESSstatistic is calculated as the sum of the squares of all the resulting prediction errors: PRESS = ∑ i = 1 n ( y i − y ^...
conscription Press (TV series), a 2018 BBC One/PBS TV series Press (film), a Turkish film Press Holdings, a UK holding company PRESSstatistic, a statistic in regression...
testing. A hypothesis test is typically specified in terms of a test statistic, considered as a numerical summary of a data-set that reduces the data...
statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated...
marketing) Preferential attachment process – see Preferential attachment PRESSstatistic Prevalence Principal component analysis Multilinear principal-component...
A statistical hypothesis test typically involves a calculation of a test statistic. Then a decision is made, either by comparing the test statistic to...
restriction can lead to different values of the test statistic. That is because the Wald statistic is derived from a Taylor expansion, and different ways...
Minimum description length Minimum message length (MML) PRESSstatistic, also known as the PRESS criterion Structural risk minimization Stepwise regression...
samples from this population. A "parameter" is to a population as a "statistic" is to a sample; that is to say, a parameter describes the true value...
or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups...
the score statistic is the same as the t statistic.[clarification needed] When the data consists of binary observations, the score statistic is the same...
In computer science, an order statistic tree is a variant of the binary search tree (or more generally, a B-tree) that supports two additional operations...
Royal Statistical Society is a peer-reviewed scientific journal of statistics. It comprises three series and is published by Oxford University Press for...