The delta training rule is a machine learning algorithm for learning the weights of a single neuron (unit).

This assumes you know what a unit is. If you don't, read the article on units first: at the very least the section on what a unit is, though I recommend reading the whole thing.

Here is the algorithm:

• arbitrarily initialize $\vec{w}$
• set the error function to something like:

  $$E(\vec{w}) = \frac{1}{2} \sum_{d} (A_d - P_d)^2$$

  where

  • $d$ is a particular training example
  • $A_d$ is the actual output for that training example (in this case +1 or -1)
  • $P_d$ is the output predicted by our current weights for that training example
• do gradient descent (or stochastic gradient descent) on this error function to find the $\vec{w}$ that minimizes it
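
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. It assumes the usual half sum-of-squares error, takes $P_d$ to be the unit's unthresholded output $\vec{w} \cdot \vec{x}_d$, and uses batch gradient descent. The helper names (`predict`, `error`, `delta_rule`), the learning rate `eta`, the epoch count, and the toy AND data are all made up for the example.

```python
def predict(w, x):
    """Unthresholded linear output P_d = w . x for one example."""
    return sum(wi * xi for wi, xi in zip(w, x))


def error(w, examples):
    """E(w) = 1/2 * sum over examples of (A_d - P_d)^2 (assumed form)."""
    return 0.5 * sum((a - predict(w, x)) ** 2 for x, a in examples)


def delta_rule(examples, eta=0.05, epochs=1000):
    """Batch gradient descent on E(w); eta and epochs are illustrative."""
    n = len(examples[0][0])
    w = [0.0] * n  # any starting point works; zeros keep the run reproducible
    for _ in range(epochs):
        # dE/dw_i = -sum_d (A_d - P_d) * x_{d,i}, so we step the opposite way
        step = [0.0] * n
        for x, a in examples:
            diff = a - predict(w, x)
            for i in range(n):
                step[i] += eta * diff * x[i]
        w = [wi + si for wi, si in zip(w, step)]
    return w


# Hypothetical toy data: logical AND with targets +1 / -1.
# The leading 1.0 in each input is a constant bias feature.
examples = [
    ([1.0, 0.0, 0.0], -1.0),
    ([1.0, 0.0, 1.0], -1.0),
    ([1.0, 1.0, 0.0], -1.0),
    ([1.0, 1.0, 1.0], +1.0),
]

w = delta_rule(examples)
print("weights:", [round(wi, 3) for wi in w])
print("error:", round(error(w, examples), 3))
```

On this data the weights should settle near the least-squares solution, roughly $(-1.5, 1, 1)$. The error does not reach zero because the rule minimizes squared error on the raw output rather than counting misclassifications. Moving the weight update inside the inner loop, so that $\vec{w}$ changes after every example, gives the stochastic variant.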

# Delta Rule vs Perceptron Rule

Both of these rules can be used to find $\vec{w}$. What are the pros/cons of each?

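| perceptron rule | delta rule |
| --- | --- |
| requires linearly separable data to converge | converges whether or not the data is linearly separable |
| perfectly classifies the training examples in a finite number of steps (if the data is separable) | only asymptotically approaches the minimum-error weights, which need not classify every training example correctly |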
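
To make the comparison concrete, here is a small sketch that runs both rules on two made-up datasets, one linearly separable and one not. Everything in it is an assumption for illustration: the function names, the learning rates, the epoch limits, the datasets, and the choice of the thresholded output `sign(w . x)` for the perceptron rule versus the unthresholded `w . x` for the delta rule.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(v):
    return 1.0 if v > 0 else -1.0


def perceptron_rule(examples, eta=0.5, max_epochs=1000):
    """Update on thresholded output; stop after the first mistake-free pass.

    Returns (weights, epochs_used), with epochs_used = None if no
    mistake-free pass happened within max_epochs.
    """
    w = [0.0] * len(examples[0][0])
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, a in examples:
            o = sign(dot(w, x))
            if o != a:
                mistakes += 1
                w = [wi + eta * (a - o) * xi for wi, xi in zip(w, x)]
        if mistakes == 0:
            return w, epoch
    return w, None


def delta_rule(examples, eta=0.05, epochs=1000):
    """Batch gradient descent on the squared error of the unthresholded output."""
    n = len(examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        step = [0.0] * n
        for x, a in examples:
            diff = a - dot(w, x)
            for i in range(n):
                step[i] += eta * diff * x[i]
        w = [wi + si for wi, si in zip(w, step)]
    return w


# Linearly separable: logical AND (leading 1.0 is a bias feature).
separable = [
    ([1.0, 0.0, 0.0], -1.0),
    ([1.0, 0.0, 1.0], -1.0),
    ([1.0, 1.0, 0.0], -1.0),
    ([1.0, 1.0, 1.0], +1.0),
]

# Not linearly separable: labels alternate along a single input axis.
non_separable = [
    ([1.0, 0.0], -1.0),
    ([1.0, 1.0], +1.0),
    ([1.0, 2.0], -1.0),
    ([1.0, 3.0], +1.0),
]

w_sep, epochs_sep = perceptron_rule(separable)
w_non, epochs_non = perceptron_rule(non_separable)
w_delta = delta_rule(non_separable)

print("perceptron, separable data: converged after", epochs_sep, "epochs")
print("perceptron, non-separable data: converged after", epochs_non, "epochs")
print("delta rule, non-separable data:", [round(wi, 3) for wi in w_delta])
```

On the separable data the perceptron rule should stop after a handful of passes. On the alternating labels it can never complete a mistake-free pass, so it runs until `max_epochs` and reports `None`. The delta rule still settles on that data, at weights that minimize the squared error but do not classify every example correctly.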