-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Machine Learning Algorithms

Let's consider the following dataset:
We define affinity, a metric function of two arguments with the same dimensionality, m. The most common metrics (also supported by scikit-learn) are the following:
The Euclidean distance is normally a good choice, but sometimes it's useful to have a metric whose difference from the Euclidean one gets larger and larger. As discussed in Chapter 9, Clustering Fundamentals, the Manhattan metric has this property. In the following graph, there's a plot representing the distances from the origin of points belonging to the line y = x:
Distances of the point (x, x) from (0, 0) using the Euclidean and Manhattan metrics
The cosine distance is instead useful when we need a distance proportional to the angle between two vectors. If the direction is the same, the distance is null, while it is the maximum when...