The k-Nearest Neighbours (k-NN) algorithm is based on a simple idea: similar points tend to have similar outcomes.

The algorithm therefore memorises every point in the dataset. To predict the outcome for a new entry, it finds the closest point in the dataset and returns the outcome associated with that point.
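
The single-closest-point rule described above could be sketched as follows. This is a minimal illustration, not a reference implementation: the function name predict_nearest, the use of Euclidean distance, and the toy data are all assumptions made for the example.

```python
import math


def predict_nearest(train_points, train_outcomes, query):
    """Return the outcome of the stored point closest to `query`."""
    # "Training" is just keeping the data; all the work happens at prediction time.
    # Pick the index of the stored point with the smallest Euclidean distance
    # to the query (Euclidean distance is an assumption of this sketch).
    nearest = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_outcomes[nearest]


# Toy data, made up for illustration: 2-D points with a numeric outcome.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
outcomes = [10.0, 12.0, 50.0]

# The query is closest to (1.0, 1.0), so it inherits that point's outcome.
prediction = predict_nearest(points, outcomes, (0.9, 1.2))
```

Any other notion of closeness could be swapped in by replacing math.dist with a different distance function.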

If two points are close enough to each other, their outcomes should be close too.

The name k-NN comes from the fact that you can look for the k closest points and combine their outcomes (e.g. by averaging them) to compute the outcome of the new point.
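
As a rough sketch of the k-point version, the example below averages the outcomes of the k closest stored points. The function name knn_predict, the default k=3, the Euclidean distance, and the toy data are assumptions chosen for illustration; averaging is only one way to combine the outcomes.

```python
import math


def knn_predict(train_points, train_outcomes, query, k=3):
    """Average the outcomes of the k stored points closest to `query`."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of stored points")
    # Order the indices of the stored points by their distance to the query.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest = order[:k]
    # Combine the neighbours' outcomes with a plain average.
    return sum(train_outcomes[i] for i in nearest) / k


# Toy data, made up for illustration: 1-D points with a numeric outcome.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
outcomes = [1.0, 2.0, 3.0, 20.0]

# With k=3 the far-away point at 10.0 is ignored.
estimate = knn_predict(points, outcomes, (1.2,), k=3)
```

Setting k=1 recovers the single-closest-point rule, while a larger k smooths the prediction over more neighbours.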
