# Bootstrap aggregating

**bagging · Bootstrap aggregation · bagged nearest neighbour classifier · Bootstrap aggregated · resulting models averaged**

Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

## Related Articles

### Ensemble learning

**ensembles of classifiers · ensemble · Bayesian model averaging**

Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

This flexibility can, in theory, enable them to over-fit the training data more than a single model would, but in practice, some ensemble techniques (especially bagging) tend to reduce problems related to over-fitting of the training data.

### Decision tree learning

**decision trees · decision tree · Classification and regression tree**

Although bagging is usually applied to decision tree methods, it can be used with any type of method. Bagging leads to "improvements for unstable procedures" (Breiman, 1996), which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression (Breiman, 1994).

### Leo Breiman

**Breiman, Leo · Breiman**

Bagging (Bootstrap aggregating) was proposed by Leo Breiman in 1994 to improve classification by combining classifications of randomly generated training sets.

Bootstrap aggregation was given the name bagging by Breiman.

### Random forest

**random forests · Random multinomial logit · Random naive Bayes**

The random forest algorithm combines Breiman's "bagging" idea with random selection of features, introduced first by Ho and later independently by Amit and Geman, in order to construct a collection of decision trees with controlled variance.

### Random subspace method

**feature bagging**

One way of combining learners is bootstrap aggregating or bagging, which shows each learner a randomly sampled subset of the training points so that the learners will produce different models that can be sensibly averaged.

### Boosting (machine learning)

**boosting · Boosting (meta-algorithm) · boosted**

### Bootstrapping (statistics)

**bootstrap · bootstrapping · bootstrap support**

A sample drawn uniformly with replacement from the original data is known as a bootstrap sample.

Bootstrap aggregating (bagging) is a meta-algorithm based on averaging the results of multiple bootstrap samples.

### Cross-validation (statistics)

**cross-validation · cross validation · Leave-one-out cross-validation**

### Metaheuristic

**metaheuristics · meta-algorithm · heuristics**

Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

### Machine learning

**machine-learning · learning · statistical learning**

Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

### Statistical classification

**classification · classifier · classifiers**

### Regression analysis

**regression · multiple regression · regression model**

### Variance

**sample variance · population variance · variability**

It also reduces variance and helps to avoid overfitting.

### Overfitting

**overfit · over-fit · over-fitted**

It also reduces variance and helps to avoid overfitting.

### Training, validation, and test sets

**training set · training data · test set**

Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n′, by sampling from D uniformly and with replacement.
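The resampling step described above can be sketched in Python; this is an illustrative sketch (the function name `bootstrap_sets` is our own, not from the source):

```python
import random

def bootstrap_sets(D, m, n_prime=None):
    """Generate m new training sets by sampling from D uniformly and
    with replacement, each of size n_prime (defaulting to len(D))."""
    n_prime = len(D) if n_prime is None else n_prime
    return [[random.choice(D) for _ in range(n_prime)] for _ in range(m)]

D = list(range(10))
sets = bootstrap_sets(D, m=3)  # three bootstrap training sets of size 10
```

Because sampling is with replacement, each generated set typically contains duplicates and omits some elements of D.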

### Sampling (statistics)

**sampling · random sample · sample**

Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n′, by sampling from D uniformly and with replacement.

### Probability distribution

**distribution · continuous probability distribution · discrete probability distribution**

Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n′, by sampling from D uniformly and with replacement.

### Prime (symbol)

**prime symbol · prime · Double prime**

If n′=n, then for large n the set D_i is expected to have the fraction (1 - 1/e) (≈63.2%) of the unique examples of D, the rest being duplicates.

### E (mathematical constant)

**e · Euler's number · base of the natural logarithm**

If n′=n, then for large n the set D_i is expected to have the fraction (1 - 1/e) (≈63.2%) of the unique examples of D, the rest being duplicates.
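The 1 − 1/e fraction can be checked empirically with a quick simulation (illustrative only; the variable names are our own):

```python
import math
import random

random.seed(0)
n = 100_000
# Draw a bootstrap sample of size n from n items and count distinct originals.
sample = [random.randrange(n) for _ in range(n)]
frac_unique = len(set(sample)) / n
print(frac_unique, 1 - 1 / math.e)  # both should be close to 0.632
```

Intuitively, each item is missed by a single draw with probability (1 − 1/n), so it is missed by all n draws with probability (1 − 1/n)^n → 1/e.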

### Artificial neural network

**artificial neural networks · neural networks · neural network**

Bagging leads to "improvements for unstable procedures" (Breiman, 1996), which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression (Breiman, 1994).

### Linear regression

**regression coefficient · multiple linear regression · regression**

Bagging leads to "improvements for unstable procedures" (Breiman, 1996), which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression (Breiman, 1994).

### Ozone

**ozonation · O3 · ozone generator**

To illustrate the basic principles of bagging, below is an analysis on the relationship between ozone and temperature (data from Rousseeuw and Leroy (1986), analysis done in R).

### Peter Rousseeuw

**Rousseeuw**

To illustrate the basic principles of bagging, below is an analysis on the relationship between ozone and temperature (data from Rousseeuw and Leroy (1986), analysis done in R).

### R (programming language)

**R · R programming language · CRAN**

To illustrate the basic principles of bagging, below is an analysis on the relationship between ozone and temperature (data from Rousseeuw and Leroy (1986), analysis done in R).

### Local regression

**LOESS · Lowess curve · Loess curve**

To describe this relationship mathematically, LOESS smoothers (with bandwidth 0.5) are used.
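The original analysis fits LOESS smoothers in R; as a self-contained stand-in, the sketch below bags a crude moving-window smoother over synthetic data. Everything here is illustrative: the data are not the Rousseeuw–Leroy ozone dataset, and the windowed average is a simplification of LOESS, not a reimplementation of it.

```python
import random
import statistics

def window_smooth(xs, ys, x0, bandwidth):
    """Average the y-values whose x lies within `bandwidth` of x0
    (a crude stand-in for a LOESS smoother)."""
    near = [y for x, y in zip(xs, ys) if abs(x - x0) <= bandwidth]
    return statistics.mean(near) if near else float("nan")

def bagged_smooth(xs, ys, x0, bandwidth, m=100):
    """Fit the smoother on m bootstrap resamples and average the predictions."""
    n = len(xs)
    preds = []
    for _ in range(m):
        idx = [random.randrange(n) for _ in range(n)]
        preds.append(window_smooth([xs[i] for i in idx],
                                   [ys[i] for i in idx], x0, bandwidth))
    # Drop the (rare) NaNs from resamples whose window was empty.
    return statistics.mean(p for p in preds if p == p)

random.seed(1)
xs = [i / 10 for i in range(100)]                # synthetic "temperature"
ys = [3 * x + random.gauss(0, 1) for x in xs]    # synthetic "ozone"
estimate = bagged_smooth(xs, ys, x0=5.0, bandwidth=0.5)
```

Averaging the smoothers fitted on the bootstrap resamples is exactly the bagging step: each resample produces a slightly different fit, and their mean is more stable than any single fit.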