# Online machine learning


In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques, which generate the best predictor by learning on the entire training data set at once.
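
As an illustrative sketch (not taken from the article), the sequential-update idea can be shown with a linear predictor trained by stochastic gradient descent on one example at a time; the synthetic data stream, learning rate, and target weights below are assumptions chosen for the example:

```python
import numpy as np

# Online learning sketch: each example arrives, is used for one immediate
# update of the predictor, and is then discarded; the full data set is
# never stored.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])  # hidden target weights (for generating labels)

w = np.zeros(2)  # current predictor
lr = 0.1         # learning rate

for _ in range(500):
    x = rng.normal(size=2)    # one example arrives
    y = w_true @ x            # its label is revealed
    grad = (w @ x - y) * x    # gradient of the squared error on this example
    w -= lr * grad            # update the predictor immediately
```

After the stream is consumed, `w` closely approximates the target weights even though no example was ever revisited.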

## Related articles

### Vowpal Wabbit

Vowpal Wabbit is notable as an efficient scalable implementation of online machine learning with support for a number of machine learning reductions, importance weighting, and a selection of different loss functions and optimization algorithms.

### Stochastic gradient descent


Mini-batch techniques are used with repeated passing over the training data to obtain optimized out-of-core versions of machine learning algorithms, for example, stochastic gradient descent.
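
A minimal sketch of the mini-batch variant, assuming a synthetic data set and illustrative hyperparameters (batch size, learning rate, epoch count are not from the source): the data are reshuffled and passed over repeatedly, with each update computed from a small batch rather than a single example.

```python
import numpy as np

# Mini-batch SGD sketch: repeated passes (epochs) over a fixed training set,
# processing small batches so only one batch needs to be in memory at a time.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, 2.0, 3.0])
y = X @ w_true

w = np.zeros(3)
lr, batch_size = 0.05, 20

for epoch in range(50):                  # repeated passes over the data
    order = rng.permutation(len(X))      # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(idx)  # average gradient on the batch
        w -= lr * grad
```

Averaging the gradient over a batch reduces the variance of each step compared with single-example updates, at the cost of fewer updates per pass.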

### Sparse dictionary learning


Such cases lie in the field of study of online learning which essentially suggests iteratively updating the model upon the new data points x becoming available.

### Online algorithm


### Computer science


### Machine learning


### External memory algorithm


Online learning is a common technique in areas of machine learning where it is computationally infeasible to train over the entire dataset, making out-of-core algorithms necessary.

### Stock market prediction


It is also used in situations where it is necessary for the algorithm to dynamically adapt to new patterns in the data, or when the data itself is generated as a function of time, e.g., stock price prediction.

### Catastrophic interference


Online learning algorithms may be prone to catastrophic interference, a problem that can be addressed by incremental learning approaches.

### Incremental learning


### Supervised learning


In the setting of supervised learning, a function $f : X \to Y$ is to be learned, where $X$ is thought of as a space of inputs and $Y$ as a space of outputs, that predicts well on instances drawn from a joint probability distribution $p(x,y)$ on $X \times Y$. In reality, the learner never knows the true distribution $p(x,y)$ over instances.

### Joint probability distribution


### Loss function


In this setting, the loss function is given as $V : Y \times Y \to \mathbb{R}$, such that $V(f(x), y)$ measures the difference between the predicted value $f(x)$ and the true value $y$. The ideal goal is to select a function $f \in \mathcal{H}$, where $\mathcal{H}$ is a space of functions called a hypothesis space, so that some notion of total loss is minimised.
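
As an illustrative sketch (the data, hypotheses, and choice of squared loss below are assumptions, not from the source), the squared error is one common choice of $V$, and summing it over a data set gives a total loss by which candidate hypotheses can be compared:

```python
# Squared-error loss V(f(x), y) and the total loss of two candidate
# hypotheses on a toy data set with y = 2x + 1.
def V(pred, y):
    """Squared-error loss V(f(x), y)."""
    return (pred - y) ** 2

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # (x, y) pairs

def f_good(x):
    return 2 * x + 1   # matches the data-generating rule

def f_bad(x):
    return x           # a poorer hypothesis

loss_good = sum(V(f_good(x), y) for x, y in data)  # 0.0
loss_bad = sum(V(f_bad(x), y) for x, y in data)    # 1 + 4 + 9 = 14.0
```

A learner would prefer `f_good`, since its total loss over the observed instances is smaller.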

### Empirical risk minimization


A common paradigm in this situation is to estimate a function $\hat{f}$ through empirical risk minimization or regularized empirical risk minimization (usually Tikhonov regularization).
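
A sketch of regularized empirical risk minimization with squared loss and a Tikhonov (ridge) penalty, on synthetic data with an illustrative regularization strength (both are assumptions): minimizing $\frac{1}{n}\|Xw - y\|^2 + \lambda \|w\|^2$ has the closed form $\hat{w} = (X^\top X + n\lambda I)^{-1} X^\top y$.

```python
import numpy as np

# Tikhonov-regularized least squares (ridge regression) via its closed form.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=100)  # labels with a little noise

n, d = X.shape
lam = 1e-3  # regularization strength (illustrative choice)

# Solve (X^T X + n*lam*I) w = X^T y rather than forming an explicit inverse.
w_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
```

The penalty term shrinks the estimate slightly toward zero, trading a small bias for stability when $X^\top X$ is ill-conditioned.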

### Tikhonov regularization


### Least squares


The choice of loss function here gives rise to several well-known learning algorithms such as regularized least squares and support vector machines.

### Kernel method


For many formulations, for example nonlinear kernel methods, true online learning is not possible, though a form of hybrid online learning with recursive algorithms can be used, where $f_{t+1}$ is permitted to depend on $f_t$ and all previous data points.

### Backpropagation


When combined with backpropagation, this is currently the de facto method for training artificial neural networks.

### Artificial neural network


### Recursive least squares filter


Recursive least squares (RLS) can also be viewed in the context of adaptive filters.
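
A minimal sketch of the standard RLS recursion (with forgetting factor 1; the synthetic data stream and prior scale are assumptions): the weights and the inverse-covariance matrix $P$ are updated in closed form for each arriving example, so the exact least-squares solution is maintained without refitting from scratch.

```python
import numpy as np

# Recursive least squares: per-example closed-form update of w and P.
rng = np.random.default_rng(3)
w_true = np.array([0.5, -1.5, 2.0])

d = 3
w = np.zeros(d)
P = np.eye(d) * 1e3  # large initial P ~ weak prior on w

for _ in range(200):
    x = rng.normal(size=d)
    y = w_true @ x               # label for the arriving example
    Px = P @ x
    k = Px / (1.0 + x @ Px)      # gain vector k = P x / (1 + x^T P x)
    w = w + k * (y - w @ x)      # correct by the prediction error
    P = P - np.outer(k, Px)      # rank-one downdate: P <- P - k x^T P
```

Each step costs $O(d^2)$, versus refitting batch least squares at $O(d^3)$ per example.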

### Stochastic optimization


This setting is a special case of stochastic optimization, a well known problem in optimization.

### Convex optimization


Online convex optimization (OCO) is a general framework for decision making which leverages convex optimization to allow for efficient algorithms.

### Regret


The goal is to minimize regret, that is, the difference between the cumulative loss and the loss of the best fixed point $u \in S$ in hindsight.
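
As an illustrative sketch (the quadratic loss sequence and step-size schedule are assumptions, not from the source), online gradient descent on losses $f_t(u) = (u - z_t)^2$ makes this concrete; the best fixed point in hindsight is the mean of the $z_t$:

```python
import numpy as np

# Online convex optimization sketch: suffer f_t(u) = (u - z_t)^2, then take
# a gradient step; regret is measured against the best fixed decision.
rng = np.random.default_rng(4)
z = rng.normal(loc=1.0, scale=0.5, size=1000)  # loss parameters revealed one by one

u = 0.0
cum_loss = 0.0
for t, zt in enumerate(z, start=1):
    cum_loss += (u - zt) ** 2             # loss of the current decision
    u -= (1.0 / (2 * t)) * 2 * (u - zt)   # gradient step with decaying rate 1/(2t)

best_u = z.mean()                          # best fixed point in hindsight
best_loss = float(((best_u - z) ** 2).sum())
regret = cum_loss - best_loss              # sublinear in the horizon T
```

With this step-size schedule the iterate equals the running mean of the observed $z_t$, and the regret grows only logarithmically in the horizon, i.e. the average regret per round vanishes.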