# Random forest

**random forests · Random multinomial logit · Random naive Bayes · random decision forests · decision forest · Kernel random forest · Random Forest Classification · random forest decision trees**

Random forests or random decision forests are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.
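The mode/mean aggregation step can be sketched in a few lines of plain Python; the function names here are illustrative, not from any particular library:

```python
from collections import Counter

def aggregate_classification(tree_predictions):
    """Majority vote: return the modal class among the trees' predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]

def aggregate_regression(tree_predictions):
    """Mean prediction across the trees."""
    return sum(tree_predictions) / len(tree_predictions)

# Five trees vote on a class label, or emit numeric predictions.
print(aggregate_classification(["cat", "dog", "cat", "cat", "dog"]))  # cat
print(aggregate_regression([2.0, 3.0, 4.0]))                          # 3.0
```

Real implementations fold this aggregation into their prediction routines; the point is only that a forest's output is a simple statistic of its trees' outputs.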

## Related Articles

### Ensemble learning

**ensembles of classifiers · ensemble · Bayesian model averaging**

Fast algorithms such as decision trees are commonly used in ensemble methods (for example, random forests), although slower algorithms can benefit from ensemble techniques as well.

### Adele Cutler

An extension of the algorithm was developed by Leo Breiman and Adele Cutler, who registered "Random Forests" as a trademark (owned by Minitab, Inc.).

Adele Cutler is a statistician known as one of the developers of archetypal analysis and of the random forest technique for ensemble learning.

### Statistical classification

**classification · classifier · classifiers**

For classification tasks, a trained forest outputs the class that receives the most votes (the mode) among its individual trees.

### Tin Kam Ho

The first algorithm for random decision forests was created by Tin Kam Ho using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg.

Ho is noted for introducing random decision forests in 1995, and for her pioneering work in ensemble learning and data complexity analysis.

### Out-of-bag error

An optimal number of trees B can be found using cross-validation, or by observing the out-of-bag error: the mean prediction error on each training sample xᵢ, using only the trees that did not have xᵢ in their bootstrap sample.

Out-of-bag (OOB) error, also called out-of-bag estimate, is a method of measuring the prediction error of random forests, boosted decision trees, and other machine learning models utilizing bootstrap aggregating (bagging) to sub-sample data samples used for training.
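The OOB bookkeeping can be sketched in plain Python. `fit` and `predict` are hypothetical callbacks standing in for a real tree learner; the toy base learner at the bottom is only a placeholder:

```python
import random
from collections import Counter

def oob_error(X, y, fit, predict, n_trees=50, seed=0):
    """Out-of-bag error: each sample x_i is scored only by the trees
    whose bootstrap sample did not contain x_i."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]              # OOB votes per sample
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]    # bootstrap indices
        in_bag = set(boot)
        model = fit([X[i] for i in boot], [y[i] for i in boot])
        for i in range(n):
            if i not in in_bag:                        # x_i is out-of-bag here
                votes[i][predict(model, X[i])] += 1
    scored = [i for i in range(n) if votes[i]]         # OOB at least once
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)

# Toy base learner (a stand-in for a decision tree): always predict the
# majority class of its bootstrap sample, ignoring the features.
fit = lambda Xb, yb: Counter(yb).most_common(1)[0][0]
predict = lambda model, x: model
err = oob_error(list(range(12)), [0] * 9 + [1] * 3, fit, predict)
```

With roughly a third of samples left out of each bootstrap draw, every sample accumulates votes from many trees, so no separate validation split is needed.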

### Random subspace method

**feature bagging**

The random subspace method has been used for decision trees; when combined with "ordinary" bagging of decision trees, the resulting models are called random forests.
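That combination — bagging of rows plus "feature bagging" of columns — can be sketched as follows, assuming a pluggable `fit_tree` callback (hypothetical, standing in for any decision tree learner):

```python
import random

def make_forest(X, y, fit_tree, n_trees=10, n_features=2, seed=0):
    """Random forest construction sketch: bagging (bootstrap rows)
    combined with the random subspace method (a random subset of
    feature columns per tree)."""
    rng = random.Random(seed)
    d = len(X[0])
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(len(X)) for _ in range(len(X))]  # bagging
        cols = rng.sample(range(d), n_features)                # random subspace
        Xb = [[X[r][c] for c in cols] for r in rows]
        yb = [y[r] for r in rows]
        forest.append((cols, fit_tree(Xb, yb)))
    return forest

# Demo with a trivial stand-in learner that just records its sample size.
forest = make_forest([[i, i + 1, 2 * i, 5] for i in range(6)], [0, 1] * 3,
                     fit_tree=lambda Xb, yb: len(Xb), n_trees=5, n_features=2)
```

Each stored `cols` list records which feature subspace the corresponding tree saw, so prediction can project a query point the same way before calling the tree.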

### Leo Breiman

**Breiman, Leo · Breiman**

Another of Breiman's ensemble approaches is the random forest.

### Donald Geman

**Geman · Geman, Donald · Geman brothers**

The extension combines Breiman's "bagging" idea and random selection of features, introduced first by Ho and later independently by Amit and Geman in order to construct a collection of decision trees with controlled variance.

In another milestone paper, in collaboration with Y. Amit, he introduced the notion of randomized decision trees, which have been called random forests and were popularized by Leo Breiman.

### Decision tree

**decision trees · decision rules · Regression trees**

Each base learner in a random forest is a decision tree.

### Decision tree learning

**decision trees · decision tree · Classification and regression tree**

Decision tree learning is the core building block of random forests: a forest is a multitude of decision trees trained on the same task, whose outputs are aggregated by mode (classification) or mean (regression).

### Bootstrap aggregating

**bagging · Bootstrap aggregation · bagged nearest neighbour classifier**

Random forests apply bootstrap aggregating (bagging) to tree learners: each tree is trained on a bootstrap sample of the training set, which, combined with random feature selection, yields a collection of decision trees with controlled variance.

### Naive Bayes classifier

**Naive Bayes · naive Bayes classification · Naïve Bayes**

Instead of decision trees, linear models have been proposed and evaluated as base estimators in random forests, in particular multinomial logistic regression and naive Bayes classifiers.

Still, a comprehensive comparison with other classification algorithms in 2006 showed that Bayes classification is outperformed by other approaches, such as boosted trees or random forests.

### Scikit-learn

**sklearn**

It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
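A typical scikit-learn usage sketch, assuming scikit-learn is installed (the parameter values here are arbitrary illustrative choices, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# oob_score=True asks for the out-of-bag accuracy estimate as a by-product.
clf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
clf.fit(X_tr, y_tr)
print(clf.oob_score_)         # OOB accuracy estimate from the training data
print(clf.score(X_te, y_te))  # accuracy on the held-out test set
```

The OOB estimate and the held-out score typically agree closely, which is why OOB error is often used in place of a separate validation split.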

### Boosting (machine learning)

**boosting · Boosting (meta-algorithm) · boosted**

### Gradient boosting

**boosted decision tree · Boosted trees · boosted decision trees**

### Regression analysis

**regression · multiple regression · regression model**

### Mode (statistics)

**mode · modal · modes**

### Overfitting

**overfit · over-fit · over-fitted**

Random decision forests correct for decision trees' habit of overfitting to their training set.

### Training, validation, and test sets

**training set · training data · test set**

A single decision tree tends to fit its training set too closely; averaging many randomized trees corrects for this, improving performance on held-out test data.

### Trademark

**trademarks · trade mark · trademarked**

Leo Breiman and Adele Cutler registered "Random Forests" as a trademark (owned by Minitab, Inc.).

### Minitab

**Minitab, Inc.**

### Feature (machine learning)

**feature vector · feature space · features**

Ho established that forests of trees splitting with oblique hyperplanes can gain accuracy as they grow without suffering from overtraining, as long as the forests are randomly restricted to be sensitive to only selected feature dimensions.

### Linear subspace

**subspace · subspaces · intersection**

Feature vectors are projected into a randomly chosen subspace before fitting each tree or each node.

### Generalization error

**generalization**

Breiman's analysis takes the form of a bound on the generalization error which depends on the strength of the individual trees in the forest and the correlation between them.

### Correlation and dependence

**correlation · correlated · correlations**

The generalization error of a forest depends on the strength of the individual trees in the forest and the correlation between them.
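In Breiman's 2001 "Random Forests" paper, that relationship is an explicit upper bound (sketched here from that result; ρ̄ denotes the mean correlation between trees and s the strength of the set of tree classifiers):

```latex
% Breiman (2001): upper bound on the generalization error PE* of a
% random forest, in terms of the mean correlation \bar{\rho} between
% trees and the strength s of the set of tree classifiers.
PE^{*} \;\le\; \frac{\bar{\rho}\,\bigl(1 - s^{2}\bigr)}{s^{2}}
```

The bound makes the design goal concrete: randomization should lower ρ̄ (decorrelate the trees) without lowering s (individual tree accuracy) too much.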