Metric Spaces — A Primer by Jeremy Kun.
The Blessing of Distance
We have often mentioned the idea of a “metric” on this blog, and we briefly described a formal definition for it. Colloquially, a metric is simply the mathematical notion of a distance function, with certain well-behaved properties. Since we’re now starting to cover a few more metrics (and things which are distinctly not metrics) in the context of machine learning algorithms, we find it pertinent to lay out the definition once again, discuss some implications, and explore a few basic examples.
The most important thing to take away from this discussion is that not all spaces have a notion of distance. For a space to have a metric is a strong property with far-reaching mathematical consequences. Essentially, metrics impose a topology on a space, which the reader can think of as the contortionist’s flavor of geometry. We’ll explore this idea after a few examples.
On the other hand, from a practical standpoint one can still do interesting things without a true metric. The downside is that work relying on (the various kinds of) non-metrics doesn’t benefit as greatly from existing mathematics. This can often spiral into empirical evaluation, where justifications and quantitative guarantees are not to be found.
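To make the metric versus non-metric distinction concrete, here is a minimal Python sketch. It assumes the standard definition of a metric: d(x, x) = 0, d(x, y) > 0 for distinct points, d(x, y) = d(y, x), and d(x, z) <= d(x, y) + d(y, z). The function names and the sample points are illustrative choices, not taken from the primer. A check over a finite sample can only expose a violation; passing it does not prove that a function is a metric.

```python
import itertools
import math


def euclidean(p, q):
    """Ordinary straight-line distance between two points."""
    return math.dist(p, q)


def squared_euclidean(p, q):
    """Squared straight-line distance: a common 'distance' that is not a metric."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def first_violated_axiom(d, points, tol=1e-12):
    """Return the name of the first metric axiom that d breaks on this sample,
    or None if no violation is found. `points` must be pairwise distinct."""
    for x in points:
        if abs(d(x, x)) > tol:
            return "identity"
    for x, y in itertools.combinations(points, 2):
        if d(x, y) <= tol:
            return "positivity"
        if abs(d(x, y) - d(y, x)) > tol:
            return "symmetry"
    for x, y, z in itertools.permutations(points, 3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return "triangle inequality"
    return None


# Hypothetical sample: three collinear points plus one off the line.
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]

print(first_violated_axiom(euclidean, sample))          # None
print(first_violated_axiom(squared_euclidean, sample))  # triangle inequality
```

Squared Euclidean distance fails on the collinear points: going from (0, 0) to (2, 0) directly costs 4, while stopping at (1, 0) on the way costs 1 + 1 = 2.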
An enjoyable introduction to metric spaces.
Absolutely necessary for machine learning and computational tasks.
However, I am mindful that the mapping from semantics to a location in a metric space is an arbitrary one. Our evaluation of the metric values assigned to any semantic is wholly dependent upon that mapping.
Not that we can escape that trap, but I urge caution when claims are made on the basis of arbitrarily assigned metric locations. (A small voice should be asking: What if we change the assigned metric locations? What result then?)
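A small Python sketch of that question follows. The three terms and both sets of coordinates are made up purely for illustration and come from no real model or data set. The function name `nearest` is also hypothetical. The metric (Euclidean distance) is the same in both cases; only the assignment of terms to locations changes.

```python
import math


def nearest(term, locations):
    """Return the other term whose assigned point is closest to `term`."""
    origin = locations[term]
    others = (t for t in locations if t != term)
    return min(others, key=lambda t: math.dist(origin, locations[t]))


# Two hypothetical, hand-picked assignments of the same terms to points in the plane.
mapping_a = {"bank": (0.0, 0.0), "river": (1.0, 0.0), "money": (5.0, 0.0)}
mapping_b = {"bank": (0.0, 0.0), "river": (5.0, 0.0), "money": (1.0, 0.0)}

print(nearest("bank", mapping_a))  # river
print(nearest("bank", mapping_b))  # money
```

The distance function did its job correctly both times. The answer changed because the mapping changed, which is why a claim about "closeness" is only as good as the assignment behind it.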