Types Of Distance Functions

30.01.2023 — 1 min read

Distance functions play a crucial role in machine learning as they help determine the similarity or difference between two points. In this post, we'll cover three of the most commonly used distance functions: Manhattan Distance, Hamming Distance, and Euclidean Distance.

Manhattan Distance: This is also known as the "taxi-cab" or "L1" distance. It's calculated as the absolute sum of the differences between the corresponding elements of two points. This is used when the data is on a grid, such as an image or a map, and is useful in cases where the cost of moving along one axis is different from the cost of moving along another axis.
Hamming Distance: This is a simple and efficient method of measuring the difference between two strings of equal length. It's used in text classification, image classification, and in various other fields where the data is in the form of binary vectors. The Hamming Distance is calculated by counting the number of mismatches between the corresponding elements of the two strings.
Euclidean Distance: This is the most widely used distance function and is also known as the "L2" distance. It measures the straight-line distance between two points in n-dimensional space. The Euclidean Distance is calculated as the square root of the sum of the squares of the differences between the corresponding elements of the two points. This distance function is commonly used in clustering algorithms, such as k-means clustering, and in dimensionality reduction techniques, such as PCA.

In conclusion, the choice of distance function depends on the specific problem being solved and the nature of the data. Understanding the properties and use cases of these distance functions is important for effective machine learning.