Overfitting

In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably".
148 Related Articles

Machine learning

Overfitting and underfitting can occur in machine learning, in particular.
But if the hypothesis is too complex, then the model is subject to overfitting and generalization will be poorer.

Cross-validation (statistics)

To lessen the chance of, or amount of, overfitting, several techniques are available (e.g. model comparison, cross-validation, regularization, early stopping, pruning, Bayesian priors, or dropout).
The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias and to give an insight on how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem).
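As an illustrative sketch (not from the source article), k-fold cross-validation can be written in a few lines of pure Python. The function names and the deliberately trivial "model" (a mean predictor) are invented for the example; the point is the mechanics of fitting on k−1 folds and scoring on the held-out fold:

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_mse_of_mean(ys, k=5):
    """k-fold cross-validated MSE of a mean predictor: for each fold,
    'fit' on the other k-1 folds (take their mean) and score on the
    held-out fold, then average the k held-out errors."""
    folds = k_fold_indices(len(ys), k)
    errors = []
    for i in range(k):
        held_out = folds[i]
        train = [j for m, f in enumerate(folds) if m != i for j in f]
        mean = sum(ys[j] for j in train) / len(train)
        errors.append(sum((ys[j] - mean) ** 2 for j in held_out) / len(held_out))
    return sum(errors) / k
```

Because every score is computed on data the "model" never saw, the averaged error estimates generalization rather than memorization.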

Regularization (mathematics)

In mathematics, statistics, and computer science, particularly in machine learning and inverse problems, regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting.
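A minimal sketch of the "adding information" idea, assuming the common L2 (ridge) penalty on a one-parameter regression through the origin: adding λw² to the squared-error loss and setting the derivative to zero gives a closed form in which the slope shrinks toward zero as λ grows. The function name is invented for the example:

```python
def ridge_slope(xs, ys, lam):
    """Slope w minimizing sum((y - w*x)^2) + lam * w^2.
    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + lam),
    so a larger penalty lam shrinks the fitted slope toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)
```

With λ = 0 this is ordinary least squares; any λ > 0 trades a little bias for less sensitivity to noise in the data.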

Early stopping

In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent.
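The control flow of early stopping can be sketched generically in Python; this is an illustrative skeleton with invented names (`step` performs one optimizer update, `val_loss` scores the held-out set), using the common "patience" heuristic rather than any one library's API:

```python
def train_with_early_stopping(step, val_loss, max_epochs=1000, patience=3):
    """Run step() (one optimizer update) per epoch, tracking val_loss().
    Stop after `patience` consecutive epochs without a new best
    validation loss -- the usual sign that overfitting has begun."""
    best, bad_epochs, epoch = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        step()
        loss = val_loss()
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return epoch, best
```

Training halts near the validation-loss minimum, which acts as an implicit limit on model complexity.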

Shrinkage (statistics)

In particular, the value of the coefficient of determination will shrink relative to the original data.

Training, validation, and test sets

For example, a model might be selected by maximizing its performance on some set of training data, and yet its suitability might be determined by its ability to perform well on unseen data; then overfitting occurs when a model begins to "memorize" training data rather than "learning" to generalize from a trend.
Validation datasets can be used for regularization by early stopping: stop training when the error on the validation dataset increases, as this is a sign of overfitting to the training dataset.

Decision tree pruning

Pruning reduces the complexity of the final classifier, and hence improves predictive accuracy by the reduction of overfitting.

Linear regression

As an extreme example, if there are p variables in a linear regression with p data points, the fitted line can go exactly through every point.
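The p = 2 case makes this concrete: a line has two parameters (slope and intercept), so with exactly two data points least squares has a unique solution that passes through both points, leaving zero residual no matter how noisy the points are. A small illustration (function name invented for the example):

```python
def line_through(p1, p2):
    """With two parameters (slope, intercept) and two data points,
    the least-squares fit is the line through both points exactly."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1
```

A perfect fit to the sample says nothing about fit to new data; the residuals that would normally expose noise have been absorbed into the parameters.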

Dropout (neural networks)

Dropout is a regularization technique patented by Google for reducing overfitting in neural networks by preventing complex co-adaptations on training data.
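The mechanism can be sketched in pure Python using the common "inverted dropout" formulation (an assumption of this example, not a claim about any particular framework); the function name is invented:

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Inverted dropout: during training, zero each unit with
    probability p and scale survivors by 1/(1 - p) so the expected
    activation is unchanged; at inference time, do nothing."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Randomly silencing units on each forward pass prevents any unit from relying on a fixed set of co-adapted partners, which is the sense in which dropout regularizes.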

Bias–variance tradeoff

The bias–variance tradeoff is often used to analyze and counteract overfitting.

Robustness (computer science)

A learning algorithm that can reduce the chance of fitting noise is called "robust."
Robustness can encompass many areas of computer science, such as robust programming, robust machine learning, and Robust Security Network.

Occam's razor

Burnham & Anderson, in their much-cited text on model selection, argue that to avoid overfitting, we should adhere to the "Principle of Parsimony".
In the related concept of overfitting, excessively complex models are affected by statistical noise (a problem also known as the bias–variance tradeoff), whereas simpler models may capture the underlying structure better and may thus have better predictive performance.

One in ten rule

For logistic regression or Cox proportional hazards models, there are a variety of rules of thumb (e.g., 5–9, 10, and 10–15 observations per independent variable); the guideline of 10 observations per independent variable is known as the "one in ten rule".
In statistics, the one in ten rule is a rule of thumb for how many predictor parameters can be estimated from data when doing regression analysis (in particular proportional hazards models in survival analysis and logistic regression) while keeping the risk of overfitting low.
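The arithmetic of the rule is simple; a small sketch (function name invented), assuming the usual reading for logistic regression in which the limiting sample size is the smaller of the two outcome classes:

```python
def max_predictors(n_events, n_nonevents, per_parameter=10):
    """One-in-ten rule of thumb: the limiting sample size is the
    smaller outcome class, and roughly one predictor parameter can
    be estimated per ~10 such observations."""
    return min(n_events, n_nonevents) // per_parameter
```

For example, a study with 50 events and 950 non-events would support only about five predictors under this guideline, however large the total sample.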

Generalization error

Generalization error can be minimized by avoiding overfitting in the learning algorithm.

Vapnik–Chervonenkis dimension

This is due to overfitting.

Statistical model

An overfitted model is a statistical model that contains more parameters than can be justified by the data.

Parameter

An overfitted model is a statistical model that contains more parameters than can be justified by the data.

Fraction of variance unexplained

The essence of overfitting is to have unknowingly extracted some of the residual variation (i.e. the noise) as if that variation represented underlying model structure.
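A toy demonstration of extracting noise as if it were structure (all data and names invented for the example): a pure-memorization model, 1-nearest-neighbour, fits targets that are nothing but noise. Its training error is exactly zero, yet its test error stays near the noise variance, because everything it "learned" was residual variation:

```python
import random

def one_nn_predict(train, x):
    """Predict with the single nearest training point (pure memorization)."""
    return min(train, key=lambda pt: abs(pt[0] - x))[1]

# Toy data: y is pure noise around 0, so the best predictor is the constant 0.
rng = random.Random(1)
train = [(i, rng.gauss(0, 1)) for i in range(20)]
test = [(i + 0.5, rng.gauss(0, 1)) for i in range(20)]

train_mse = sum((one_nn_predict(train, x) - y) ** 2 for x, y in train) / 20
test_mse = sum((one_nn_predict(train, x) - y) ** 2 for x, y in test) / 20
# Training error is exactly 0 (each point is its own nearest neighbour),
# but test error remains large: the "fit" captured only noise.
```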

Model selection

The possibility of overfitting exists because the criterion used for selecting the model is not the same as the criterion used to judge the suitability of a model.

Coefficient of determination

In particular, the value of the coefficient of determination will shrink relative to the original data.

Prior probability

To lessen the chance of, or amount of, overfitting, several techniques are available (e.g. model comparison, cross-validation, regularization, early stopping, pruning, Bayesian priors, or dropout).