Archive for the ‘Meaning’ Category

The Semasiology of Open Source [How Do You Define Source?]

Wednesday, January 20th, 2016

The Semasiology of Open Source by Robert Lefkowitz (then VP, Enterprise Systems & Architecture, AT&T Wireless), 2004. Audio file.

Robert’s keynote from the Open Source Convention (OSCON) 2004 in Portland, Oregon.

From the description:

Semasiology, n. The science of meanings or sense development (of words); the explanation of the development and changes of the meanings of words. Source: Webster’s Revised Unabridged Dictionary, 1996, 1998 MICRA, Inc. “Open source doesn’t just mean access to the source code.” So begins the Open Source Definition. What then, does access to the source code mean? Seen through the lens of an Enterprise user, what does open source mean? When is (or isn’t) it significant? And a catalogue of open source related arbitrage opportunities.

If you haven’t heard this keynote (I hadn’t), do yourself a favor and make time to listen to it.

I do have one complaint: It’s not long enough. 😉


New York Times Confirms: Meaning Is Important!

Sunday, March 29th, 2015

Kate O’Neill tweeted:

There’s that pesky need for meaning again: “Learning to See Data” via @NYTimes #bigdata

along with this image:


I’m flogging their content on my dime, despite their paywall. 😉 Go figure.

Will the trough of disillusionment with Big Data be mostly due to lack of meaning? Lack of integration (which depends on meaning)? Lack of return (which depends on integration and meaning)?

Or some other cause, such as treating visualization of data as a substitute for the meaning of data? None of the better visualization experts would advise that, but is your visualization vendor one of them?

Linguistic Mapping Reveals How Word Meanings Sometimes Change Overnight

Sunday, November 23rd, 2014

Linguistic Mapping Reveals How Word Meanings Sometimes Change Overnight

Data mining the way we use words is revealing the linguistic earthquakes that constantly change our language.

From the post:

[Image: language change]

In October 2012, Hurricane Sandy approached the eastern coast of the United States. At the same time, the English language was undergoing a small earthquake of its own. Just months before, the word “sandy” was an adjective meaning “covered in or consisting mostly of sand” or “having light yellowish brown colour”. Almost overnight, this word gained an additional meaning as a proper noun for one of the costliest storms in US history.

A similar change occurred to the word “mouse” in the early 1970s when it gained the new meaning of “computer input device”. In the 1980s, the word “apple” became a proper noun synonymous with the computer company. And later, the word “windows” followed a similar course after the release of the Microsoft operating system.

All this serves to show how language constantly evolves, often slowly but at other times almost overnight. Keeping track of these new senses and meanings has always been hard. But not anymore.

Today, Vivek Kulkarni at Stony Brook University in New York and a few pals show how they have tracked these linguistic changes by mining the corpus of words stored in databases such as Google Books, movie reviews from Amazon and of course the microblogging site Twitter.

These guys have developed three ways to spot changes in the language. The first is a simple count of how often words are used, using tools such as Google Trends. For example, in October 2012, the frequency of the words “Sandy” and “hurricane” both spiked in the run-up to the storm. However, only one of these words changed its meaning, something that a frequency count cannot spot.

A very good overview of:

Statistically Significant Detection of Linguistic Change by Vivek Kulkarni, Rami Al-Rfou, Bryan Perozzi, and Steven Skiena.


We propose a new computational approach for tracking and detecting statistically significant linguistic shifts in the meaning and usage of words. Such linguistic shifts are especially prevalent on the Internet, where the rapid exchange of ideas can quickly change a word’s meaning. Our meta-analysis approach constructs property time series of word usage, and then uses statistically sound change point detection algorithms to identify significant linguistic shifts.

We consider and analyze three approaches of increasing complexity to generate such linguistic property time series, the culmination of which uses distributional characteristics inferred from word co-occurrences. Using recently proposed deep neural language models, we first train vector representations of words for each time period. Second, we warp the vector spaces into one unified coordinate system. Finally, we construct a distance-based distributional time series for each word to track its linguistic displacement over time.

We demonstrate that our approach is scalable by tracking linguistic change across years of micro-blogging using Twitter, a decade of product reviews using a corpus of movie reviews from Amazon, and a century of written books using the Google Books Ngram corpus. Our analysis reveals interesting patterns of language usage change commensurate with each medium.
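If you want to experiment before reading the paper, here is a minimal sketch of the simplest of the three approaches: a frequency time series plus a crude change point test. The data and the test are mine (hypothetical), not the authors’; the paper uses statistically sound change point detection algorithms rather than this toy z-score.

```python
import numpy as np

def mean_shift_change_point(series):
    """Return the index with the largest mean shift in a frequency
    series, plus a z-score for the shift. A crude stand-in for the
    statistically sound change point tests the paper actually uses."""
    best_idx, best_z = None, 0.0
    for t in range(2, len(series) - 2):
        before, after = series[:t], series[t:]
        stderr = np.sqrt(np.var(before) / len(before) +
                         np.var(after) / len(after))
        if stderr == 0:
            continue
        z = abs(after.mean() - before.mean()) / stderr
        if z > best_z:
            best_idx, best_z = t, z
    return best_idx, best_z

# Hypothetical monthly frequencies for "sandy", spiking in October 2012
# and staying elevated as the proper-noun sense persists.
freq = np.array([5, 6, 5, 7, 6, 5, 6, 48, 30, 25, 22, 20], dtype=float)
idx, z = mean_shift_change_point(freq)
print(f"change point at month index {idx}, z = {z:.1f}")
```

As the post notes, a frequency spike alone cannot distinguish “Sandy” (new sense) from “hurricane” (same sense, more use), which is why the paper moves on to distributional time series.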

While the authors are concerned with scaling, I would think detecting cracks, crevasses, and minor tremors in the meaning and usage of words, say between a bank and its regulators, or stock traders and the SEC, would be equally important.

Even if auto-detection of the “new” or “changed” meaning is too much to expect, simply detecting dissonance in the usage of terms would be a step in the right direction.

Detecting earthquakes in meaning is a worthy endeavor but there is more tripping on cracks than falling from earthquakes, linguistically speaking.

The Barrier of Meaning

Sunday, October 5th, 2014

The Barrier of Meaning by Gian-Carlo Rota.

The author discusses the “AI-problem” with Stanislaw Ulam. Ulam makes reference to the history of the “AI-problem” and then continues:

Well, said Stan Ulam, let us play a game. Imagine that we write a dictionary of common words. We shall try to write definitions that are unmistakably explicit, as if ready to be programmed. Let us take, for instance, nouns like key, book, passenger, and verbs like waiting, listening, arriving. Let us start with the word “key.” I now take this object out of my pocket and ask you to look at it. No amount of staring at this object will ever tell you that this is a key, unless you already have some previous familiarity with the way keys are used.

Now look at that man passing by in a car. How do you tell that it is not just a man you are seeing, but a passenger?

When you write down precise definitions for these words, you discover that what you are describing is not an object, but a function, a role that is inextricably tied to some context. Take away that context, and the meaning also disappears.

When you perceive intelligently, as you sometimes do, you always perceive a function, never an object in the set-theoretic or physical sense.

Your Cartesian idea of a device in the brain that does the registering is based upon a misleading analogy between vision and photography. Cameras always register objects, but human perception is always the perception of functional roles. The two processes could not be more different.

Your friends in AI are now beginning to trumpet the role of contexts, but they are not practicing their lesson. They still want to build machines that see by imitating cameras, perhaps with some feedback thrown in. Such an approach is bound to fail since it starts out with a logical misunderstanding….

Should someone mention this to the EC Brain project?

BTW, you may be able to access this article at: Physica D: Nonlinear Phenomena, Volume 22, Issues 1–3, Pages 1-402 (October–November 1986), Proceedings of the Fifth Annual International Conference. For some unknown reason, the editorial board pages are $37.95, as are all the other articles, save for this one by Gian-Carlo Rota, which as of today is freely accessible.

The webpages say Physica D supports “open access.” I find that rather doubtful when only three (3) pages out of four hundred and two (402) require no payment, for material published in 1986.


From Frequency to Meaning: Vector Space Models of Semantics

Thursday, September 18th, 2014

From Frequency to Meaning: Vector Space Models of Semantics by Peter D. Turney and Patrick Pantel.


Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.

At forty-eight (48) pages with a thirteen (13) page bibliography, this survey of vector space models (VSMs) of semantics should keep you busy for a while. You will have to fill in VSM developments since 2010, but mastery of this paper will certainly give you the foundation to do so. Impressive work.
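To make the first of the survey’s three classes concrete, here is a minimal sketch of a term–document matrix with cosine similarity over raw term frequencies. The documents and names are hypothetical; real systems weight the counts (tf-idf and friends) and reduce dimensionality, as the survey describes.

```python
import math
from collections import Counter

# Hypothetical toy corpus; three short "documents".
docs = {
    "d1": "open source means access to the source code",
    "d2": "the meaning of words changes over time",
    "d3": "source code access is not the whole meaning of open source",
}

# Term-document matrix: one sparse term-frequency vector per document.
vectors = {name: Counter(text.split()) for name, text in docs.items()}

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# d1 and d3 share the "open source" vocabulary, so they score highest.
for x, y in [("d1", "d2"), ("d1", "d3"), ("d2", "d3")]:
    print(x, y, round(cosine(vectors[x], vectors[y]), 3))
```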

I do disagree with the authors when they say:

Computers understand very little of the meaning of human language.

Truth be told, I would say:

Computers have no understanding of the meaning of human language.

What happens with a VSM of semantics is that we as human readers choose a model we think represents semantics we see in a text. Our computers blindly apply that model to text and report the results. We as human readers choose results that we think are closer to the semantics we see in the text, and adjust the model accordingly. Our computers then blindly apply the adjusted model to the text again and so on. At no time does the computer have any “understanding” of the text or of the model that it is applying to the text. Any “understanding” in such a model is from a human reader who adjusted the model based on their perception of the semantics of a text.

I don’t dispute that VSMs have been incredibly useful and like the authors, I think there is much mileage left in their development for text processing. That is not the same thing as imputing “understanding” of human language to devices that in fact have none at all. (full stop)


I first saw this in a tweet by Christopher Phipps.

PS: You probably recall that VSMs are based on imposing a metric space on semantics, which has no preordained metric space of its own. Transitioning from a non-metric space to a metric space isn’t subject to validation, at least in my view.

New Directions in Vector Space Models of Meaning

Tuesday, September 16th, 2014

New Directions in Vector Space Models of Meaning by Edward Grefenstette, Karl Moritz Hermann, Georgiana Dinu, and Phil Blunsom. (video)

From the description:

This is the video footage, aligned with slides, of the ACL 2014 Tutorial on New Directions in Vector Space Models of Meaning, by Edward Grefenstette (Oxford), Karl Moritz Hermann (Oxford), Georgiana Dinu (Trento) and Phil Blunsom (Oxford).

This tutorial was presented at ACL 2014 in Baltimore by Ed, Karl and Phil.

The slides can be found at

Running time is 2:45:12, so you had better get a cup of coffee before you start.

Includes a review of distributional models of semantics.

The sound isn’t bad but the acoustics are, so you will have to listen closely. Having the slides in front of you helps as well.

The semantics part starts to echo topic map theory with the realization that having a single token isn’t going to help you with semantics. Tokens don’t stand alone but in a context of other tokens, each of which contributes to the meaning of the token in question.

Topic maps function in a similar way with the realization that identifying any subject of necessity involves other subjects, which have their own identifications. For some purposes, we may assume some subjects are sufficiently identified without specifying the subjects that in our view identify them, but that is merely a design choice that others may choose to make differently.
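To make the “tokens in context” point concrete, here is a minimal sketch of the distributional idea: characterize a word by counts of the words around it. The corpus and window size are hypothetical; real models weight the counts (e.g., with PMI) and reduce dimensionality, as the tutorial covers.

```python
from collections import Counter, defaultdict

# Hypothetical corpus and window size.
corpus = ("the bank raised rates the bank cut rates "
          "the river bank flooded").split()
WINDOW = 2  # tokens on each side that count as context

# Word-context matrix: characterize each token by the tokens around it.
contexts = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - WINDOW), min(len(corpus), i + WINDOW + 1)):
        if j != i:
            contexts[word][corpus[j]] += 1

# "bank" is represented by its neighbors, not by the token itself.
print(contexts["bank"].most_common(5))
```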

Working through this tutorial and the cited references (one advantage to the online version) will leave you with a background in vector space models and the contours of the latest research.

I first saw this in a tweet by Kevin Safford.

Underspecifying Meaning

Sunday, July 27th, 2014

Word Meanings Evolve to Selectively Preserve Distinctions on Salient Dimensions by Catriona Silvey, Simon Kirby, and Kenny Smith.


Words refer to objects in the world, but this correspondence is not one-to-one: Each word has a range of referents that share features on some dimensions but differ on others. This property of language is called underspecification. Parts of the lexicon have characteristic patterns of underspecification; for example, artifact nouns tend to specify shape, but not color, whereas substance nouns specify material but not shape. These regularities in the lexicon enable learners to generalize new words appropriately. How does the lexicon come to have these helpful regularities? We test the hypothesis that systematic backgrounding of some dimensions during learning and use causes language to gradually change, over repeated episodes of transmission, to produce a lexicon with strong patterns of underspecification across these less salient dimensions. This offers a cultural evolutionary mechanism linking individual word learning and generalization to the origin of regularities in the lexicon that help learners generalize words appropriately.

I can’t seem to access the article today but the premise is intriguing.

Perhaps people have different “…less salient dimensions…” and therefore generalize words “inappropriately” from the standpoint of another person.

Curious if a test can be devised to identify those “…less salient dimensions…” in some target population? It might lead to faster identification of terms likely to be misunderstood.

Learning the meaning behind words

Thursday, August 15th, 2013

Learning the meaning behind words by Tomas Mikolov, Ilya Sutskever, and Quoc Le, Google Knowledge.

From the post:

Today computers aren’t very good at understanding human language, and that forces people to do a lot of the heavy lifting—for example, speaking “searchese” to find information online, or slogging through lengthy forms to book a trip. Computers should understand natural language better, so people can interact with them more easily and get on with the interesting parts of life.

While state-of-the-art technology is still a ways from this goal, we’re making significant progress using the latest machine learning and natural language processing techniques. Deep learning has markedly improved speech recognition and image classification. For example, we’ve shown that computers can learn to recognize cats (and many other objects) just by observing large amounts of images, without being trained explicitly on what a cat looks like. Now we apply neural networks to understanding words by having them “read” vast quantities of text on the web. We’re scaling this approach to datasets thousands of times larger than what has been possible before, and we’ve seen a dramatic improvement of performance — but we think it could be even better. To promote research on how machine learning can apply to natural language problems, we’re publishing an open source toolkit called word2vec that aims to learn the meaning behind words.

Word2vec uses distributed representations of text to capture similarities among concepts. For example, it understands that Paris and France are related the same way Berlin and Germany are (capital and country), and not the same way Madrid and Italy are. This chart shows how well it can learn the concept of capital cities, just by reading lots of news articles — with no human supervision:

Google has open sourced the code for word2vec.
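If you want to try the analogy example without training anything, here is a sketch using the gensim library (a third-party reimplementation, not Google’s original C tool). It assumes gensim 4.x, its downloader module, and a network connection; the pretrained Google News vectors are a sizeable download (roughly 1.7 GB).

```python
# Requires: pip install gensim
import gensim.downloader as api

# Pretrained word2vec vectors trained on Google News.
vectors = api.load("word2vec-google-news-300")

# The classic analogy: Paris - France + Germany should land near Berlin.
print(vectors.most_similar(positive=["Paris", "Germany"],
                           negative=["France"], topn=1))
```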

I wonder how this would perform on all the RFCs?

Or all of the papers at Citeseer?

Learning Grounded Models of Meaning

Friday, March 29th, 2013

Learning Grounded Models of Meaning

Schedule and readings for a seminar by Katrin Erk and Jason Baldridge:

Natural language processing applications typically need large amounts of information at the lexical level: words that are similar in meaning, idioms and collocations, typical relations between entities, lexical patterns that can be used to draw inferences, and so on. Today such information is mostly collected automatically from large amounts of data, making use of regularities in the co-occurrence of words. But documents often contain more than just co-occurring words, for example illustrations, geographic tags, or a link to a date. Just like co-occurrences between words, these co-occurrences of words and extra-linguistic data can be used to automatically collect information about meaning. The resulting grounded models of meaning link words to visual, geographic, or temporal information. Such models can be used in many ways: to associate documents with geographic locations or points in time, or to automatically find an appropriate image for a given document, or to generate text to accompany a given image.

In this seminar, we discuss different types of extra-linguistic data, and their use for the induction of grounded models of meaning.
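As a toy illustration of the idea (mine, not the seminar’s), here is a sketch that grounds words in geographic tags by simple co-occurrence counting, then uses the counts to guess a document’s location. All data and names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (document text, geographic tag) pairs.
tagged_docs = [
    ("surf sand boardwalk", "coast"),
    ("surf waves beach", "coast"),
    ("snow lift powder", "mountains"),
    ("powder snow trail", "mountains"),
]

# Count co-occurrences of words with geo tags, just as co-occurrences
# between words are counted in purely textual models.
word_tag = defaultdict(Counter)
for text, tag in tagged_docs:
    for word in text.split():
        word_tag[word][tag] += 1

def guess_location(text):
    """Score each tag by summed word/tag co-occurrence counts."""
    scores = Counter()
    for word in text.split():
        scores.update(word_tag.get(word, Counter()))
    return scores.most_common(1)

print(guess_location("fresh powder on the trail"))  # -> mountains
```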

Very interesting reading that should keep you busy for a while! 😉