# Statistical learning theory

**Learning theory (statistics), statistical machine learning**

Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis.

### Empirical risk minimization

**empirical risk, empirical risk functional, minimize empirical risk**

Choosing a function that minimizes the empirical risk is called empirical risk minimization.

Empirical risk minimization (ERM) is a principle in statistical learning theory which defines a family of learning algorithms and is used to give theoretical bounds on their performance.
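As a minimal sketch of the principle, assuming a finite hypothesis class and the square loss (the toy data, the slope values, and every function name here are illustrative assumptions, not part of the source), ERM can be written as a search for the hypothesis with the smallest average training loss:

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, output y)


def empirical_risk(f: Callable[[float], float],
                   data: Sequence[Sample],
                   loss: Callable[[float, float], float]) -> float:
    """Average loss of hypothesis f over the training set."""
    return sum(loss(f(x), y) for x, y in data) / len(data)


def erm(hypotheses, data, loss):
    """Return the hypothesis with the smallest empirical risk."""
    return min(hypotheses, key=lambda f: empirical_risk(f, data, loss))


def square_loss(prediction: float, target: float) -> float:
    return (prediction - target) ** 2


# Hypothetical toy data generated by y = 2x, and a tiny hypothesis class
# of lines through the origin with slopes 0, 1, 2 and 3.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
slopes = [0.0, 1.0, 2.0, 3.0]
hypothesis_class = [lambda x, a=a: a * x for a in slopes]

best = erm(hypothesis_class, train, square_loss)
best_risk = empirical_risk(best, train, square_loss)
```

With a finite class the minimization is a plain enumeration; richer hypothesis classes would typically replace the `min` call with a numerical optimizer.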

### Proximal gradient methods for learning

**Group LASSO, proximal gradient**

Proximal gradient (forward backward splitting) methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable.
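One common instance of such a problem is least squares with an L1 penalty (the lasso). The sketch below assumes that objective and a fixed step size; the function names and toy data are hypothetical. Each iteration takes a gradient (forward) step on the smooth term, then applies the proximal (backward) map of the non-differentiable penalty, which for the L1 norm is soft-thresholding:

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(X, y, lam, step, iters=500):
    """Proximal gradient for (1/2n) * ||Xw - y||^2 + lam * ||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Forward step: gradient of the smooth least-squares term.
        residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i]
                    for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n
                for j in range(d)]
        # Backward step: proximal map of the L1 penalty.
        w = [soft_threshold(w[j] - step * grad[j], step * lam)
             for j in range(d)]
    return w


# Hypothetical toy problem with two orthogonal features; the second
# feature carries only a weak signal and should be shrunk to exactly zero.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.1, 3.0, 0.1]
w = ista(X, y, lam=0.1, step=1.0)
```

The step size of 1.0 is safe here because the gradient of the smooth term has Lipschitz constant 0.5 for this design matrix; other data would need a step chosen accordingly.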

### Reproducing kernel Hilbert space

**reproducing kernel, Reproducing kernel Hilbert spaces, Bergman spaces**

Reproducing kernel Hilbert spaces are particularly important in the field of statistical learning theory because of the celebrated representer theorem which states that every function in an RKHS that minimises an empirical risk functional can be written as a linear combination of the kernel function evaluated at the training points.
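To illustrate the form of solution the theorem describes, the following sketch assumes a Gaussian kernel and a regularized square-loss objective (kernel ridge regression), for which the coefficients solve the linear system (K + λnI)c = y. The kernel choice, the regularization value, the data, and all function names are assumptions made for the example:

```python
import math


def gaussian_kernel(a, b, width=1.0):
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (M[i][n] - tail) / M[i][i]
    return c


def fit_kernel_ridge(xs, ys, lam):
    """Coefficients c solving (K + lam * n * I) c = y."""
    n = len(xs)
    K = [[gaussian_kernel(xi, xj) for xj in xs] for xi in xs]
    A = [[K[i][j] + (lam * n if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(A, ys)


def predict(x, xs, coeffs):
    """Evaluate f(x) = sum_i c_i * k(x_i, x)."""
    return sum(c * gaussian_kernel(xi, x) for c, xi in zip(coeffs, xs))


# Hypothetical training points; the fitted function is a weighted sum of
# kernel functions centred on them.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 0.0]
coeffs = fit_kernel_ridge(xs, ys, lam=1e-6)
```

The learned function is fully described by one coefficient per training point, so the search over an infinite-dimensional space reduces to a finite linear system.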

### Machine learning

**machine-learning, learning, statistical learning**

Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis.

### Statistics

**statistical, statistical analysis, statistician**

Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis.

### Functional analysis

**functional, functional analytic, algebraic function theory**

Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis.

### Computer vision

**vision, image classification, Image recognition**

Statistical learning theory has led to successful applications in fields such as computer vision, speech recognition, bioinformatics, and baseball.

### Speech recognition

**voice recognition, automatic speech recognition, voice command**

Statistical learning theory has led to successful applications in fields such as computer vision, speech recognition, bioinformatics, and baseball.

### Bioinformatics

**bioinformatic, bioinformatician, bio-informatics**

Statistical learning theory has led to successful applications in fields such as computer vision, speech recognition, bioinformatics, and baseball.

### Baseball

**player, baseball player, baseball team**

### Supervised learning

**supervised, supervised machine learning, supervised classification**

Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning.

### Unsupervised learning

**unsupervised, unsupervised classification, unsupervised machine learning**

Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning.

### Online machine learning

**online learning, on-line learning, online**

Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning.

### Reinforcement learning

**reward function, Inverse reinforcement learning, reinforcement**

### Training, validation, and test sets

**training set, training data, test set**

Supervised learning involves learning from a training set of data.

### Regression analysis

**regression, multiple regression, regression model**

Depending on the type of output, supervised learning problems are either problems of regression or problems of classification.

### Statistical classification

**classification, classifier, classifiers**

Depending on the type of output, supervised learning problems are either problems of regression or problems of classification.

### Ohm's law

**ohmic, Ohm, ohmic losses**

Using Ohm's Law as an example, a regression could be performed with voltage as input and current as an output.
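A minimal sketch of that regression, assuming the model current = g · voltage with no intercept and using made-up, noise-free measurements (the function name and the data are hypothetical):

```python
def fit_through_origin(voltages, currents):
    """Least-squares slope g for the model current = g * voltage."""
    numerator = sum(v * i for v, i in zip(voltages, currents))
    denominator = sum(v * v for v in voltages)
    return numerator / denominator


# Hypothetical noise-free measurements for a 10-ohm resistor.
voltages = [1.0, 2.0, 3.0, 4.0]   # volts (input)
currents = [0.1, 0.2, 0.3, 0.4]   # amperes (output)

conductance = fit_through_origin(voltages, currents)
resistance = 1.0 / conductance
```

The fitted slope is the conductance, so its reciprocal recovers the resistance; with noisy measurements the same formula gives the least-squares estimate.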

### Facial recognition system

**facial recognition, face recognition, facial recognition technology**

In facial recognition, for instance, a picture of a person's face would be the input, and the output label would be that person's name.

### Vector space

**vector, vector spaces, vectors**

Take $X$ to be the vector space of all possible inputs, and $Y$ to be the vector space of all possible outputs.

### Probability distribution

**distribution, continuous probability distribution, discrete probability distribution**

Statistical learning theory takes the perspective that there is some unknown probability distribution over the product space $Z = X \times Y$, i.e. there exists some unknown $p(z) = p(\vec{x}, y)$.

### Loss function

**objective function, cost function, risk function**

Let $V(f(\vec{x}), y)$ be the loss function, a metric for the difference between the predicted value $f(\vec{x})$ and the actual value $y$.

### Norm (mathematics)

**norm, Euclidean norm, seminorm**

The most common loss function for regression is the square loss function (also known as the L2-norm).

### Taxicab geometry

**Manhattan distance, L1 norm, taxicab metric**

The absolute value loss (also known as the L1-norm), $|y - f(\vec{x})|$, is also sometimes used.

### Indicator function

**characteristic function, membership function, indicator**

In some sense the 0-1 indicator function is the most natural loss function for classification.
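The three loss functions mentioned in these excerpts can be sketched as follows. The function names are illustrative, and the 0-1 loss is written for arbitrary labels, which is an assumption rather than the source's own formulation:

```python
def square_loss(prediction, target):
    """Square loss for regression: (y - f(x))^2."""
    return (target - prediction) ** 2


def absolute_loss(prediction, target):
    """Absolute value loss for regression: |y - f(x)|."""
    return abs(target - prediction)


def zero_one_loss(predicted_label, true_label):
    """0-1 loss for classification: 0 if the labels agree, 1 otherwise."""
    return 0 if predicted_label == true_label else 1
```

For a prediction of 3.0 against a target of 1.0, the square loss penalizes the error more heavily than the absolute loss, while the 0-1 loss only records whether a predicted label is right or wrong.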