Filtering an image means iterating through each pixel of the image and making the value of that pixel a function of its neighboring pixels (including itself). Usually we consider the neighbors in a box of some size centered on the pixel, and usually we make the new value a linear function of those neighbors: a weighted sum of them. The coefficient, or weight, that we give each neighbor depends on what we are trying to do. Some weights can be 0, or even negative! When the weights are non-negative and sum to 1, the weighted sum is a weighted average.

The weights (coefficients) together with the size of the box are called the kernel, or sometimes the mask or the filter. They all mean the same thing.

Applying a kernel to an image means applying the kernel at each pixel of the image. This process is called filtering, convolving, or cross-correlating.
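To make the sliding-box idea concrete, here is a minimal pure-Python sketch of applying a kernel at every pixel. The function name `cross_correlate`, the list-of-lists image representation, and the choice to treat pixels outside the image as 0 (zero padding) are assumptions made for illustration; real libraries offer several border-handling modes.

```python
def cross_correlate(image, kernel):
    """Apply `kernel` at every pixel of `image` and return the weighted sums.

    Assumptions: `image` and `kernel` are lists of lists of numbers, the
    kernel has odd height and width (so it has a center), and pixels
    outside the image count as 0 (zero padding).
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Image coordinates of the neighbor under kernel[i][j].
                    rr, cc = r + i - r_off, c + j - c_off
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            output[r][c] = total
    return output


# A 3x3 box kernel: every neighbor gets the same weight of 1/9, so each
# output pixel is the mean of its 3x3 neighborhood.
box_kernel = [[1 / 9] * 3 for _ in range(3)]

image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]

blurred = cross_correlate(image, box_kernel)
```

The center pixel keeps its value because all nine of its neighbors are 9, while pixels near the edge of the bright square are pulled toward 0 by their dark neighbors. That is the blurring effect of a box kernel.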

The symbol for cross-correlation is $\otimes$ and the symbol for convolution is $\ast$.
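Written out, the two operations look like this. The notation is an assumption introduced here for illustration: $I$ is the image, $K$ is a kernel of size $(2k+1) \times (2k+1)$ indexed from $-k$ to $k$ with its center at $(0, 0)$, and $G$ is the filtered output.

$$
G(x, y) = (K \otimes I)(x, y) = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K(u, v)\, I(x + u,\, y + v)
$$

$$
G(x, y) = (K \ast I)(x, y) = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K(u, v)\, I(x - u,\, y - v)
$$

The only difference is the sign in front of $u$ and $v$ inside $I$: cross-correlation reads the neighborhood in the same orientation as the kernel, while convolution reads it in the opposite orientation.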

Cross-correlation and convolution are really the same operation, except that convolution flips the kernel (in both directions) before applying it. When your kernel is symmetric under that flip, it doesn’t matter which operation you use. But with an asymmetric kernel the two give different results; for an image containing a single bright pixel, one result is a flipped version (again, in both directions) of the other.
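The flip relationship can be checked with a small sketch. It treats convolution as cross-correlation with the kernel flipped in both directions, which is one common way to define it. The helper names (`cross_correlate`, `flip_both`, `convolve`) and the zero-padding border rule are assumptions for illustration.

```python
def cross_correlate(image, kernel):
    """Weighted sum of each pixel's neighborhood (zero padding, odd kernel)."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - r_off, c + j - c_off
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            output[r][c] = total
    return output


def flip_both(kernel):
    """Flip a kernel top-to-bottom and left-to-right (a 180 degree rotation)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve(image, kernel):
    """Convolution expressed as cross-correlation with the flipped kernel."""
    return cross_correlate(image, flip_both(kernel))


# An image with a single bright pixel in the center (an impulse).
impulse = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]

# An asymmetric kernel: flipping it changes it.
asymmetric = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

correlated = cross_correlate(impulse, asymmetric)  # flipped copy of the kernel
convolved = convolve(impulse, asymmetric)          # unflipped copy of the kernel

# A kernel that is unchanged by the flip gives identical results either way.
symmetric = [
    [0, 1, 0],
    [1, 2, 1],
    [0, 1, 0],
]
picture = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5],
]
same_result = cross_correlate(picture, symmetric) == convolve(picture, symmetric)
```

On the impulse image, cross-correlation stamps out the kernel flipped in both directions, while convolution stamps out the kernel as written. With the symmetric kernel, `same_result` is `True`.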

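Convolution and cross-correlation are linear operations: filtering a weighted sum of images gives the same result as taking the same weighted sum of the filtered images.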
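Linearity can be spot-checked numerically. The sketch below filters a weighted combination of two small images and compares it with the same combination of the two filtered images. The helper names, the example kernel (chosen because it has negative weights), and the zero-padding border rule are assumptions for illustration. A single passing check is a sanity test, not a proof.

```python
def cross_correlate(image, kernel):
    """Weighted sum of each pixel's neighborhood (zero padding, odd kernel)."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - r_off, c + j - c_off
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            output[r][c] = total
    return output


def combine(a, image_1, b, image_2):
    """Pixel-wise a * image_1 + b * image_2 for two same-sized images."""
    return [
        [a * p + b * q for p, q in zip(row_1, row_2)]
        for row_1, row_2 in zip(image_1, image_2)
    ]


# A kernel with negative weights, to show linearity does not depend on
# the weights being an average.
kernel = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

image_1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
image_2 = [[0, 1, 0], [2, 0, 2], [0, 3, 0]]
a, b = 2, -3

# Filter the combination of the images...
filter_of_sum = cross_correlate(combine(a, image_1, b, image_2), kernel)

# ...and combine the individually filtered images.
sum_of_filters = combine(
    a, cross_correlate(image_1, kernel),
    b, cross_correlate(image_2, kernel),
)

# All values here are small integers, so exact comparison is safe.
is_linear = filter_of_sum == sum_of_filters
```

With non-integer weights or pixel values, compare with a small tolerance instead of `==`, since floating-point rounding can differ between the two orders of computation.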