Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 3, 2016

Encyclopedia of Distances

Filed under: Distance,Edit Distance,Mathematics,Metric Spaces — Patrick Durusau @ 9:37 am

Encyclopedia of Distances (4th edition) by Michel Marie Deza and Elena Deza.

Springer description:

This 4-th edition of the leading reference volume on distance metrics is characterized by updated and rewritten sections on some items suggested by experts and readers, as well as a general streamlining of content and the addition of essential new topics. Though the structure remains unchanged, the new edition also explores recent advances in the use of distances and metrics for e.g. generalized distances, probability theory, graph theory, coding theory, data analysis.

New topics in the purely mathematical sections include e.g. the Vitanyi multiset-metric, algebraic point-conic distance, triangular ratio metric, Rossi-Hamming metric, Taneja distance, spectral semimetric between graphs, channel metrization, and Maryland bridge distance. The multidisciplinary sections have also been supplemented with new topics, including: dynamic time wrapping distance, memory distance, allometry, atmospheric depth, elliptic orbit distance, VLBI distance measurements, the astronomical system of units, and walkability distance.

Leaving aside the practical questions that arise during the selection of a ‘good’ distance function, this work focuses on providing the research community with an invaluable comprehensive listing of the main available distances.

As well as providing standalone introductions and definitions, the encyclopedia facilitates swift cross-referencing with easily navigable bold-faced textual links to core entries. In addition to distances themselves, the authors have collated numerous fascinating curiosities in their Who’s Who of metrics, including distance-related notions and paradigms that enable applied mathematicians in other sectors to deploy research tools that non-specialists justly view as arcane. In expanding access to these techniques, and in many cases enriching the context of distances themselves, this peerless volume is certain to stimulate fresh research.

Ransomed for $149 (US) per digital copy, this remarkable work should have a broad readership.

From the introduction to the 2009 edition:


Distance metrics and distances have now become an essential tool in many areas of Mathematics and its applications including Geometry, Probability, Statistics, Coding/Graph Theory, Clustering, Data Analysis, Pattern Recognition, Networks, Engineering, Computer Graphics/Vision, Astronomy, Cosmology, Molecular Biology, and many other areas of science. Devising the most suitable distance metrics and similarities, to quantify the proximity between objects, has become a standard task for many researchers. Especially intense ongoing search for such distances occurs, for example, in Computational Biology, Image Analysis, Speech Recognition, and Information Retrieval.

Often the same distance metric appears independently in several different areas; for example, the edit distance between words, the evolutionary distance in Biology, the Levenstein distance in Coding Theory, and the Hamming+Gap or shuffle-Hamming distance.

(emphasis added)

I highlighted that last sentence to emphasize that Encyclopedia of Distances is a static and undisclosed topic map.

While readers familiar with the concepts:

edit distance between words, the evolutionary distance in Biology, the Levenstein distance in Coding Theory, and the Hamming+Gap or shuffle-Hamming distance.

could enumerate why those merit being spoken of as being “the same distance metric,” no indexing program can accomplish the same feat.

If each of those concepts had enumerated properties, which could be compared by an indexing program, readers could not only discover those “same distance metrics” but could also discover new rediscoveries of that same metric.
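
As a minimal sketch of what such a comparison might look like, suppose each entry carried a small set of declared properties. The property names and values below are hypothetical illustrations of mine, not taken from the Encyclopedia, and a real effort would need a vocabulary that subject experts agree on:

```python
from collections import defaultdict

# Hypothetical property sets; names and values are illustrative only.
EDIT_OPERATIONS = frozenset({"insert", "delete", "substitute"})

ENTRIES = {
    "edit distance (words)": {
        "operand": "finite sequences",
        "operations": EDIT_OPERATIONS,
        "cost": "minimum number of operations",
    },
    "Levenshtein distance (coding theory)": {
        "operand": "finite sequences",
        "operations": EDIT_OPERATIONS,
        "cost": "minimum number of operations",
    },
    "Hamming distance": {
        "operand": "equal-length sequences",
        "operations": frozenset({"substitute"}),
        "cost": "minimum number of operations",
    },
}


def group_by_properties(entries):
    """Group entry names whose declared properties are identical."""
    groups = defaultdict(list)
    for name, props in entries.items():
        # A frozenset of (property, value) pairs is hashable and order-free.
        groups[frozenset(props.items())].append(name)
    # Only groups with more than one member are "the same" under these properties.
    return [sorted(names) for names in groups.values() if len(names) > 1]


SAME = group_by_properties(ENTRIES)
print(SAME)
```

With these made-up properties, the two edit-distance entries fall into one group and the Hamming entry stays apart. The point is only that the basis for "sameness" is written down where a program, or a later editor, can inspect and dispute it.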

As it stands, readers must rely upon the undisclosed judgments of the Dezas and hope they continue to revise and extend this work.

When they cease to do so, successive editors will be forced to re-acquire the basis for adding new/re-discovered metrics to it.

PS: Suggestions of similar titles that deal with non-metric distances? I’m familiar with works that impose metrics on non-metric distances but that’s not what I have in mind. That’s an arbitrary and opaque mapping from non-metric to metric.

May 5, 2014

…immediate metrical meaning

Filed under: Metric Spaces,Semantics — Patrick Durusau @ 2:49 pm

Topology Fact tweeted today:

‘It’s not so easy to free oneself from the idea that coordinates must have an immediate metrical meaning.’ — Albert Einstein

In searching for that quote I found:

The simple fact is that in general relativity, coordinates are essentially arbitrary systems of markers chosen to distinguish one event from another. This gives us great freedom in how we define coordinates…. The relationship between the coordinate differences separating events and the corresponding intervals of time or distance that would be measured by a specified observer must be worked out using the metric of the spacetime. (Relativity, Gravitation and Cosmology by Robert J. A. Lambourne, page 155)

Let’s re-write the first sentence by Lambourne to read:

The simple fact is that in semantics, terms are essentially arbitrary systems of markers chosen to distinguish one semantic event from another.

I do this to make clear that sets of terms have no external metric of semantic distance or closeness that separates them.

And re-write the second sentence to read:

The relationship between the term separating semantics and the corresponding semantic intervals would be measured by a specified observer.

I have omitted some words and added others to emphasize that “semantic intervals” have no metric other than as assigned and observed by some specified observer.

True, the original quote goes on to say: “…using the metric of the spacetime.” But spacetime has a generally accepted metric that has proven itself both accurate and useful since the early 20th century. So far as I know, despite contentions to the contrary, there is no similar metric for semantics.

In particular there is no general semantic metric that obtains across all observers.

Something to bear in mind when semantic distances are being calculated with great “precision” between terms. Most pocket calculators can be fairly precise. But being precise isn’t the same thing as being correct.

August 26, 2012

Metric Spaces — A Primer [Semantic Metrics?]

Filed under: Distance,Metric Spaces,Semantics — Patrick Durusau @ 1:45 pm

Metric Spaces — A Primer by Jeremy Kun.

The Blessing of Distance

We have often mentioned the idea of a “metric” on this blog, and we briefly described a formal definition for it. Colloquially, a metric is simply the mathematical notion of a distance function, with certain well-behaved properties. Since we’re now starting to cover a few more metrics (and things which are distinctly not metrics) in the context of machine learning algorithms, we find it pertinent to lay out the definition once again, discuss some implications, and explore a few basic examples.

The most important thing to take away from this discussion is that not all spaces have a notion of distance. For a space to have a metric is a strong property with far-reaching mathematical consequences. Essentially, metrics impose a topology on a space, which the reader can think of as the contortionist’s flavor of geometry. We’ll explore this idea after a few examples.

On the other hand, from a practical standpoint one can still do interesting things without a true metric. The downside is that work relying on (the various kinds of) non-metrics doesn’t benefit as greatly from existing mathematics. This can often spiral into empirical evaluation, where justifications and quantitative guarantees are not to be found.
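
The "well-behaved properties" in question are the standard metric axioms: non-negativity, identity of indiscernibles, symmetry, and the triangle inequality. A rough sketch of checking them on a finite sample follows. The function names and the sample are my own assumptions, not from Kun's post, and passing on a sample can only fail to find a violation; it proves nothing about the whole space:

```python
from itertools import product


def metric_violations(points, d, tol=1e-12):
    """Return the names of the metric axioms that d violates on this sample."""
    violated = set()
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:
            violated.add("non-negativity")
        # d(x, y) should be zero exactly when x and y are the same point.
        if (abs(d(x, y)) <= tol) != (x == y):
            violated.add("identity of indiscernibles")
        if abs(d(x, y) - d(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            violated.add("triangle inequality")
    return violated


def absolute_difference(x, y):
    return abs(x - y)


def squared_difference(x, y):
    return (x - y) ** 2


SAMPLE = [0.0, 1.0, 2.0]
print(metric_violations(SAMPLE, absolute_difference))  # set()
print(metric_violations(SAMPLE, squared_difference))   # {'triangle inequality'}
```

The squared difference is a familiar example of a useful non-metric: going from 0 to 2 directly "costs" 4, while going through 1 costs only 2.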

An enjoyable introduction to metric spaces.

Absolutely necessary for machine learning and computational tasks.

However, I am mindful that the mapping from semantics to a location in metric space is an arbitrary one. Our evaluations of metrics assigned to any semantic are wholly dependent upon that mapping.

We cannot escape that trap, but I do urge caution when claims are made on the basis of arbitrarily assigned metric locations. (A small voice should be asking: What if we change the assigned metric locations? What result then?)
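
To make the small voice concrete, here is a toy sketch in which the same three terms are assigned locations in two different ways. The terms, the coordinates, and the `nearest` helper are all hypothetical; nothing here comes from a real embedding or from Kun's post:

```python
import math


def nearest(term, locations):
    """Return the other term closest to `term` under Euclidean distance."""
    others = (t for t in locations if t != term)
    return min(others, key=lambda t: math.dist(locations[term], locations[t]))


# Two hypothetical assignments of the same terms to points in the plane.
MAPPING_A = {"bank": (0.0, 0.0), "river": (1.0, 0.0), "money": (5.0, 0.0)}
MAPPING_B = {"bank": (0.0, 0.0), "river": (5.0, 0.0), "money": (1.0, 0.0)}

print(nearest("bank", MAPPING_A))  # river
print(nearest("bank", MAPPING_B))  # money
```

The distance function is a true metric in both cases and the arithmetic is exact. The answer to "which term is closest to bank?" still changes, because the answer was fixed by whoever assigned the locations.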
