Introduction
The Algorithm
Why Does This Work (aka theory/proof)?
- Why Go Through All This Proving Trouble? :(

Introduction

Let’s say you want to find some mean, say the mean weight of some leaves. You start by setting your mean to 0. You then collect a leaf, weigh it, and “add” the weight to your mean variable. You collect a second leaf, weigh it, and “add” it to your mean variable. You keep doing this, collecting one leaf at a time, and updating your mean variable in response to each weight.

This version of calculating mean is known as “incrementally calculating mean”, sometimes also called “online mean calculation”.

The Algorithm

Here’s how you do it:

set mean = 0

set n = 0 (n is the number of things you are averaging)

for each sample that you encounter

update n

n += 1

update the mean by incorporating the sample into it

mean = (1/n) * sample + ((n-1)/n) * mean

Why Does This Work (aka theory/proof)?

We will start with the mathematical definition of mean and show that it can be manipulated (by following sound mathematical manipulations) to represent what we are doing above.

First, here is a fact (theorem) that I will present to you:

The sum of a set of numbers is equal to the average (expected) of the numbers times the number of numbers in the set

Disclaimer: Assumes uniform probability distribution for the numbers (i.e. all the numbers have the same probability/frequency)

In mathematical notation:

\(\sum_{i=1}^n x_i = E[x] * n\)

\(x_1\), \(x_2\), all the way to \(x_n\) are the numbers in the set

\(n\) is the number of numbers in the set

\(E[x]\) is the average (aka expected) of the numbers

Here is the definition of mean (we’ll use \(u\) to represent the mean): \(u = \frac{1}{n} \sum_{i=1}^n x_i\)

The mean at the next time step (\(t\)) is: \(u_t = \frac{1}{n} \sum_{i=1}^n x_i\)

Let’s “extract” (from the summation) the sample we want to incorporate into the mean: \(u_t = \frac{1}{n} (x_{i} + \underbrace{\sum_{i=1}^n x_{i-1}})\)

We can use our theorem to substitute for th braced portion: \(u_t = \frac{1}{n} (x_{i} + (n-1)E[x])\)

The expected (\(E[x]\)) is simply your previous mean! We now have: \(u_t = \frac{1}{n} (x_{i} + (n-1)u_{t-1})\)

If we rearrange a little: \(u_t = \frac{1}{n}x_{i} + \frac{n-1}{n}u_{t-1}\)

This equation basically tells us that we can do what we did in the algorithm above!

Why Go Through All This Proving Trouble? :(

Well, because, are you foolish enough to just take my word and blindly follow the algorithm I give you? Don’t you want to be sure that it works? “Well, I can just test it with a few different numbers and see if it works” you say? Even if the algorithm was wrong, it could give you the correct answer by chance, just because of the numbers you picked. By seeing a mathematical proof, you know for a damn fact that the algorithm works. No one can or ever will, be able to question its correctness! The proof establishes what conditions need to be satisfied (here we need uniform probability distribution), and as long as you satisfy those conditions, it will always work!

“Well, I just googled if this actually works, and people on stack overflow say it does, so I will take their word for it and move on with my crappy life.” That’s honestly a legit point! If all you wanna do is use an algorithm or method, as long as a reliable source tells you it works, you’re fine. But as I keep saying, the reasoning behind anything is the most fruitful part! You will encounter lots of problems in your life that haven’t already been solved. A vast reasoning library (in your head) will allow you to solve any new problems you encounter. You might even make a few discoveries!

So don’t shy away from analytical math! Get comfortable with how it works, the notation, axioms, theorems, proving techniques, etc, and you’ll have a framework for making sound claims.

Have a great week!