The delta training rule is a machine learning algorithm for learning the weights of a single neuron (unit).

Click here to learn what a unit is. At the very least, read the section on what a unit is, but I recommend you read the whole thing.

Here is the algorithm:

  • arbitrarily initialize \(\vec{w}\)
  • set error function to something like:

    \[\textit{error} = \sum_{d} (A_d - P_d)^2\]

    where

    • \(d\) is a particular training example
    • \(A_d\) is the actual output for that training example (in this case +1 or -1)
    • \(P_d\) is the output as predicted by our current weights
  • do gradient descent (or stochastic gradient descent) to find \(\vec{w}\)
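The steps above can be sketched in code. This is a minimal illustration, not a reference implementation: the toy dataset, learning rate `eta`, iteration count, and separate bias term `b` are all assumptions, and the gradient is just the derivative of the \(\sum_d (A_d - P_d)^2\) error defined above.

```python
import numpy as np

# Hypothetical toy data: 4 training examples, 2 features each,
# actual outputs A_d in {-1, +1} (an AND-like labeling).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
A = np.array([-1.0, -1.0, -1.0, 1.0])

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # arbitrarily initialize w
b = 0.0                  # bias term (assumed; often folded into w)
eta = 0.1                # learning rate (assumed)

for _ in range(1000):
    P = X @ w + b                 # predicted outputs P_d (unthresholded)
    grad_w = -2 * X.T @ (A - P)   # d(error)/dw for error = sum_d (A_d - P_d)^2
    grad_b = -2 * np.sum(A - P)
    w -= eta * grad_w             # step downhill
    b -= eta * grad_b

error = np.sum((A - (X @ w + b)) ** 2)
```

Note that the prediction `P` is the raw weighted sum, not a thresholded ±1 value; the delta rule does gradient descent on the unthresholded output, which is what makes the error differentiable. Stochastic gradient descent would instead update `w` after each individual example.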

Delta Rule vs Perceptron Rule

Both of these rules can be used to find \(\vec{w}\). What are the pros/cons of each?

| perceptron rule | delta rule |
| --- | --- |
| requires linearly separable data | does not |
| can perfectly classify training examples in finite steps | asymptotically approaches perfect classification |
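The two update rules differ in one place: the perceptron rule compares the target to the *thresholded* output and updates only on mistakes, while the delta rule compares it to the *unthresholded* output and updates on every example. A side-by-side sketch, with a hypothetical separable (OR-like) dataset and assumed learning rates:

```python
import numpy as np

# Hypothetical linearly separable data, targets in {-1, +1} (OR-like).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t = np.array([-1.0, 1.0, 1.0, 1.0])
Xb = np.hstack([X, np.ones((4, 1))])  # fold the bias into the weight vector

# Perceptron rule: thresholded prediction, update only on misclassification.
# On separable data this stops after a finite number of mistakes.
w = np.zeros(3)
for _ in range(100):
    mistakes = 0
    for x, target in zip(Xb, t):
        pred = 1.0 if x @ w >= 0 else -1.0
        if pred != target:
            w += target * x          # learning rate of 1 (assumed)
            mistakes += 1
    if mistakes == 0:
        break                        # perfect classification reached

# Delta rule (stochastic form): unthresholded prediction, update every step.
# The weights only approach the least-squares solution asymptotically.
v = np.zeros(3)
eta = 0.05                           # learning rate (assumed)
for _ in range(100):
    for x, target in zip(Xb, t):
        v += eta * (target - x @ v) * x
```

Because the perceptron updates stop entirely once every example is classified correctly, it converges exactly in finite steps on separable data but can cycle forever on non-separable data; the delta rule keeps nudging the weights toward the minimum-error solution either way.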