Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 11, 2010

Google’s Instant And User Expectations

Filed under: Search Interface,Searching,Topic Map Software,Topic Maps,Usability — Patrick Durusau @ 5:35 am

Google’s Instant will change user expectations for search interfaces. Any interface that is less responsive will be viewed as less capable. Quality of results will have a minor impact on user ratings of an interface. (I am projecting the results of future surveys analyzing the failure of less responsive interfaces.)

“Instant” display of the names of topics is certainly one useful response to Google’s Instant.

Or display of relationships to other topics.

Or, displaying merging results as property values are selected.
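As a rough illustration of the first idea, here is a minimal sketch of "instant" topic name display: each keystroke narrows a sorted list of names by prefix. The function names (`build_index`, `instant_matches`) and the sample topic names are my own assumptions for illustration, not any topic map engine's API.

```python
from bisect import bisect_left

def build_index(topic_names):
    """Return (lowercased key, original name) pairs sorted by key."""
    return sorted((name.lower(), name) for name in topic_names)

def instant_matches(index, typed, limit=5):
    """Return up to `limit` topic names whose lowercased form starts with `typed`."""
    prefix = typed.lower()
    # Binary search for the first entry that could start with the prefix.
    start = bisect_left(index, (prefix,))
    matches = []
    for key, name in index[start:]:
        if not key.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(name)
    return matches

# Hypothetical topic names, used only for this example.
index = build_index(["Topic Maps", "TMQL", "Topology", "Subject Identity", "SPARQL"])

# Simulate a user typing one character at a time.
for typed in ("t", "to", "top", "topi"):
    print(typed, "->", instant_matches(index, typed))
```

Because the index is sorted once, each keystroke costs a binary search plus a short scan, which is the kind of responsiveness users will now expect.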

Google’s Instant has raised the bar. Will your topic map interface meet the challenge?

September 10, 2010

LNCS Volume 6263: Data Warehousing and Knowledge Discovery

Filed under: Database,Graphs,Heterogeneous Data,Indexing,Merging — Patrick Durusau @ 8:20 pm

LNCS Volume 6263: Data Warehousing and Knowledge Discovery edited by Torben Bach Pedersen, Mukesh K. Mohania, A Min Tjoa, has a number of articles of interest to the topic map community.

Here are five (5) that caught my eye:

September 9, 2010

Calibrated Leakage?

Filed under: Data Mining,Examples,Subject Identity,Topic Maps — Patrick Durusau @ 6:36 pm

Unlike leaks from a faucet, only some leaks from the Obama White House annoy the administration.

All administrations approve of their “leaks” and dislike unfavorable “leaks.” In either case, it is an information mapping issue.

First, people who have access to particular documents or facts become topics. Their known associates, from FBI background checks, Facebook pages, etc., also become topics. Form associations between them.

Second, phone traffic and visitor/day book log entries become topics and build associations with White House staff and their friends.

Third, documents highly likely to contain “leakable” stories or facts are topics with timed associations as they fan out across the staff.

Fourth, “leaks” in the media, particularly the time of each disclosure, are captured as topics, as is who reported them, etc.

No magic, just automating and making correlations between information and records that already exist in disparate forms.

A topic map enables estimates of how effectively approved “leaks” are propagating or investigation of the sources of unapproved “leaks.”

Topic maps: calibrating leakage.

******
PS: There are defenses to highly correlated data gathering/analysis. Please inquire.

Is Gluttony the Answer to Glut?

Filed under: Subject Identity,Topic Maps — Patrick Durusau @ 6:24 pm

One of the refrains about topic maps is that all the information about a subject can be gathered to a single location.

I am already suffering from information glut! A gluttonous topic maps solution is going to find more information about my subjects? That I didn’t even know existed? No thanks!

But: topic maps can gather all user-defined relevant information about a subject to a single location.

Using subject identity, topic maps can reliably filter out irrelevant, repetitive or even useless information about a subject. Which leaves you with a modest and digestible amount of information.
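To make the filtering idea concrete, here is a minimal sketch under stated assumptions: each piece of information is a pair of a subject identifier and a statement, relevance is whatever test the user supplies, and "repetitive" means the same statement after normalizing case and whitespace. The `digest` function and the example.org identifiers are hypothetical.

```python
def digest(items, subject_id, is_relevant):
    """Collect distinct, user-relevant statements about one subject.

    items: iterable of (subject identifier, statement) pairs.
    is_relevant: user-defined predicate applied to each statement.
    """
    seen = set()
    kept = []
    for identifier, statement in items:
        if identifier != subject_id:
            continue  # about a different subject
        normalized = " ".join(statement.lower().split())
        if normalized in seen:
            continue  # repetitive: already gathered
        seen.add(normalized)
        if is_relevant(statement):
            kept.append(statement)
    return kept

# Hypothetical identifiers and statements, for illustration only.
COUCHDB = "http://example.org/subject/couchdb"
MAIANA = "http://example.org/subject/maiana"
items = [
    (COUCHDB, "CouchDB is written in Erlang."),
    (COUCHDB, "couchdb is written in  erlang."),
    (COUCHDB, "CouchDB mugs are on sale."),
    (MAIANA, "Maiana has a SPARQL query engine."),
]

def not_advertising(statement):
    """One user's idea of relevance: skip sales pitches."""
    return "sale" not in statement.lower()

print(digest(items, COUCHDB, not_advertising))
```

Four items go in and one comes out. The diet depends entirely on identifiers that reliably say which subject an item is about.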

Topic maps: Putting information glut on a diet!

High-Performance Dynamic Pattern Matching over Disordered Streams

Filed under: Data Integration,Data Mining,Pattern Recognition,Subject Identity,Topic Maps — Patrick Durusau @ 4:12 pm

High-Performance Dynamic Pattern Matching over Disordered Streams by Badrish Chandramouli, Jonathan Goldstein, and David Maier came to me by way of Jack Park.

From the abstract:

Current pattern-detection proposals for streaming data recognize the need to move beyond a simple regular-expression model over strictly ordered input. We continue in this direction, relaxing restrictions present in some models, removing the requirement for ordered input, and permitting stream revisions (modification of prior events). Further, recognizing that patterns of interest in modern applications may change frequently over the lifetime of a query, we support updating of a pattern specification without blocking input or restarting the operator.

In case you missed it, this is related to: Experience in Extending Query Engine for Continuous Analytics.

The algorithmic trading use case in this article made me think of Nikita Ogievetsky. For those of you who do not know Nikita, he is an XSLT/topic map maven, currently working in the finance industry.

Do trading interfaces allow user definition of subjects to be identified in data streams? And/or merged with subjects identified in other data streams? Or is that an upgrade from the basic service?

September 8, 2010

Maiana August Release

Filed under: Maiana,SPARQL,Topic Map Software,Topic Maps — Patrick Durusau @ 9:00 am

Maiana August Release covers a number of new and exciting features in Maiana.

Among other things, you will find:

  • Maiana is now running on MajorToM
  • a “history” function for changes to a topic map
  • a SPARQL query engine
  • TMQL queries can be saved for later use
  • other improvements/new features.

I assume they left something to do in September. 😉

Performativity and Topic Maps

Filed under: Mapping,Maps,Subject Identity,Topic Maps — Patrick Durusau @ 8:50 am

Parsing Performativity came to me by way of Sam Hunting.

Read the post, then ask yourself: Is my topic map wearing a path from point A to point B?

Perhaps performativity should be a measure of a topic map’s success?

CouchDB: Sell it to Your Boss – Post

Filed under: Graphs,NoSQL,Software — Patrick Durusau @ 8:49 am


CouchDB: Sell it to Your Boss from Alex Popescu.

CouchDB is one of the many options in the NoSQL world. As a distributed document repository, it is of interest to users of topic maps with document stores. It is written in Erlang, a language for distributed applications, including topic maps.

September 7, 2010

Domains of Discourse, Identification (and Mapping)

Filed under: Subject Identity,Topic Maps — Patrick Durusau @ 6:09 am

Topic maps rhetoric has long maintained that different domains of discourse may have different ways to identify some single subject.

Topic maps provide the means to map between those different identifications, to provide a collocation point for all the information about such a subject.
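A small sketch of what that collocation can look like, assuming a simplified rule that two topics represent the same subject whenever they share an identifier. The `merge_topics` function and the `hr:`/`sales:` identifiers are invented for illustration and are not drawn from any particular topic map engine.

```python
def merge_topics(topics):
    """Merge topics that share at least one identifier.

    Each topic is a dict with 'identifiers' (a set) and 'facts' (a list).
    Returns a new list in which no two topics share an identifier.
    """
    merged = []
    for topic in topics:
        ids = set(topic["identifiers"])
        facts = list(topic["facts"])
        remaining = []
        for other in merged:
            if ids & other["identifiers"]:
                # Same subject: absorb the other topic's identifiers and facts.
                ids |= other["identifiers"]
                facts = other["facts"] + facts
            else:
                remaining.append(other)
        remaining.append({"identifiers": ids, "facts": facts})
        merged = remaining
    return merged

# Two domains identify the same person differently (hypothetical identifiers).
topics = [
    {"identifiers": {"hr:emp-1042"}, "facts": ["hired in 2008"]},
    {"identifiers": {"sales:rep-jsmith"}, "facts": ["covers the northeast region"]},
    # The mapping itself: a topic asserting both identifiers name one subject.
    {"identifiers": {"hr:emp-1042", "sales:rep-jsmith"}, "facts": []},
    {"identifiers": {"hr:emp-2001"}, "facts": ["hired in 2010"]},
]

for topic in merge_topics(topics):
    print(sorted(topic["identifiers"]), topic["facts"])
```

The third topic carries no facts of its own. It only records that the two identifications pick out one subject, and that is enough to bring the HR and sales facts together.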

If different domains can have different ways to identify the same subject, doesn’t it stand to reason that they can also have different ways to map subject identifications from other domains?

Some of them may lack any concept of mapping to/from foreign identifications. Identifications are expressed in a given vocabulary or not at all. Others will have a variety of concepts of mapping, some broader, some narrower.

Understanding subject identifications in various domains, as well as their concepts of mapping between domains, will only improve our promotion of topic maps.

September 6, 2010

The Sixth Australasian Ontology Workshop

Filed under: Conferences,Mapping,Ontology — Patrick Durusau @ 6:56 am

The Sixth Australasian Ontology Workshop will be held in conjunction with the 23rd Australasian Joint Conference on Artificial Intelligence in Adelaide, South Australia.

Important dates:

  • Submission of papers: 24 September 2010
  • Notification of acceptance/rejection: 22 October 2010
  • Final camera ready copies: 12 November 2010
  • Workshop date: 7 December 2010

Ontologies are used in topic maps just like in other knowledge management technologies. An area of special interest for topic maps is mapping between ontologies (some of which don’t admit the existence of other ontologies, 😉 ).

September 5, 2010

Experience in Extending Query Engine for Continuous Analytics

Filed under: Data Integration,Data Mining,SQL,TMQL,Uncategorized — Patrick Durusau @ 4:37 pm

Experience in Extending Query Engine for Continuous Analytics by Qiming Chen and Meichun Hsu has this problem statement:

Streaming analytics is a data-intensive computation chain from event streams to analysis results. In response to the rapidly growing data volume and the increasing need for lower latency, Data Stream Management Systems (DSMSs) provide a paradigm shift from the load-first analyze-later mode of data warehousing….

Moving from load-first analyze-later has implications for topic maps over data warehouses. Particularly when events that are subjects may only have a transient existence in a data stream.

This is on my reading list to prepare to discuss TMQL in Leipzig.

PS: Only five days left to register for TMRA 2010. It is a don’t miss event.

“Linguistic terms do not hold exact meaning….”

Filed under: Data Integration,Fuzzy Sets,Information Retrieval,Subject Identity — Patrick Durusau @ 10:36 am

In some background research I ran across:

One of the most important applications of fuzzy set theory is the concept of linguistic variables. A linguistic variable is a variable whose values are not numbers, but words or sentences in a natural or artificial language. The value of a linguistic variable is defined as an element of its term set: a predefined set of appropriate linguistic terms. Linguistic terms are essentially subjective categories for a linguistic variable.

Linguistic terms do not hold exact meaning, however, and may be understood differently by different people. The boundaries of a given term are rather subjective, and may also depend on the situation. Linguistic terms therefore cannot be expressed by ordinary set theory; rather, each linguistic term is associated with a fuzzy set. (“Soft sets and soft groups,” by Hacı Aktaş and Naim Çağman, Information Sciences, Volume 177, Issue 13, 1 July 2007, Pages 2726-2735)

Fuzzy sets are yet another useful approach that has recognized linguistic uncertainty as an issue and developed mechanisms to address it.
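For readers who have not met fuzzy sets, a minimal sketch may help. It assumes a linguistic variable "temperature" with two terms, "cool" and "warm", each tied to a triangular membership function. The breakpoints (5, 15, 25, 35 degrees) are made up for illustration, and another person could reasonably draw them elsewhere, which is exactly the point of the quoted passage.

```python
def triangular(a, b, c):
    """Return a membership function rising from a to a peak at b, then falling to c."""
    def membership(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return membership

# Term set for the linguistic variable "temperature" (assumed breakpoints, in °C).
temperature_terms = {
    "cool": triangular(5, 15, 25),
    "warm": triangular(15, 25, 35),
}

def describe(x):
    """Degree to which a reading belongs to each linguistic term."""
    return {term: round(f(x), 2) for term, f in temperature_terms.items()}

print(describe(20))  # belongs partly to "cool" and partly to "warm"
print(describe(25))  # fully "warm", not "cool" at all
```

A reading of 20 degrees is half "cool" and half "warm": membership is a matter of degree, unlike ordinary sets, where a value is either in or out.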

What is “linguistic uncertainty” if it isn’t a question of “subject identity?”

Fuzzy sets have developed another way to answer questions about subject identity.

As topic maps mature I want to see the development of equivalences between approaches to subject identity.

Imagine a topic map system consisting of a medical scanning system that is identifying “subjects” in cultures using rough sets, with equivalences to “subjects” identified in published literature using fuzzy sets, that is refined by “subjects” from user contributions and interactions using PSIs or other mechanisms. (Or other mechanisms, past, present or future.)

September 4, 2010

Master Data Management – Successes?

Filed under: Data Integration,Semantics — Patrick Durusau @ 8:23 pm

With email and articles on master data management running neck and neck with Nigerian widows entrusting me with millions of US dollars on deposit in some third country, I decided to take another (brief) look:

While MDM vendors will probably tell you that the high success rate is due to their superior technology, Baseline’s Jill Dyche, who analyzed the survey results, has come to a different conclusion.

Most current MDM projects have focused on just “low-hanging fruit,” Dyche said. They often tackle jobs like reconciling names and addresses, leaving the more challenging work — sorting out product specifications and other data from numerous internal and external sources, for example — for phase 2 and beyond. (“As MDM project deployments grow more complex, ‘drama’ could follow,” by Jeff Kelly, News Editor)

Drama? Well, HR versus Accounting, or Sales versus Production over the unified master record sounds like drama to me. Not to mention the entrenched interests in particular systems.

Topic maps, managing diverse semantics with less drama, what’s there not to like?

September 3, 2010

Making Wikileaks Effective

Filed under: Information Retrieval,Marketing,Subject Identity,Topic Maps — Patrick Durusau @ 7:57 pm

Wikileaks has captured the headlines with the release of Afghan War Diary, 2004-2010.

I haven’t looked at the documents, but all document collections present the same issues for effective use.

First, document semantics vary depending upon whether they are being read by their intended audience, another military command or other audience. For example, locations may be identified by unfamiliar terms.

Second, and nearly as important, what if one analyst bridges the different semantics and identifies a location? How do they map it to their semantic and communicate that fact to others?

Could pass around a sticky note. Put it on a blackboard. Write it up in a multi-page report.

Topic maps are an effective means to navigate data and multiple interpretations of it, not to mention integrating other data you may have on hand.

Topic maps don’t constrain in advance what subjects you can identify or the basis on which you identify them, and they let you quickly share discoveries with others.

Wikileaks can be annoying. Topic maps can make Wikileaks effective. There’s a difference.

September 2, 2010

The Matching Web (semantics supplied by users)?

Filed under: Searching,Semantic Diversity,Semantic Web,Semantics — Patrick Durusau @ 10:15 am

Why do we call it “The Semantic Web”? The web is nothing but a collection of electronic files. Where is the “semantic” in those files? (Even with linking, same question.)

Where was the “semantic” in Egyptian hieroglyphic texts? They had one semantic in the view of Horapollo, Athanasius Kircher, and others. They have a different semantic in the view of later researchers such as Jean-François Champollion (see the Wikipedia article Egyptian Hieroglyphics).

Same text, different semantics. With the later ones viewed as being “correct.” Yet it would be essential to record the hieroglyphic semantics of Kircher to understand the discussions of his contemporaries and those who relied on his work. One text, multiple semantics.

All our search, reasoning, etc., engines can do is to mechanically apply patterns and return content to us. The returned content has no known “semantic” until we assign it one. Different users may supply different semantics to the same content.

Perhaps a better name would be “The Matching Web (semantics supplied by users)”.*

******

*Then we could focus on managing the semantics supplied by users. A different task than the one underway in the “Semantic Web” at present.

September 1, 2010

Structural, Syntactic, and Statistical Pattern Recognition

Filed under: Pattern Recognition,Subject Identity — Patrick Durusau @ 7:22 pm

Structural, Syntactic, and Statistical Pattern Recognition (Joint IAPR International Workshop, SSPR&SPR 2010, Cesme, Izmir, Turkey, August 18-20, 2010. Proceedings) edited by Edwin R. Hancock, Richard C. Wilson, Terry Windeatt, Ilkay Ulusoy, and Francisco Escolano.

Pattern recognition is a first step towards assisting users in the subject recognition process that results in a topic map.

Content-Based Tile Retrieval System by Pavel Vácha and Michal Haindl, surprised me because it was about matching colors/patterns on ceramic tiles.

I was expecting a paper on tiling of an identity plane but it was just as delightful. Anyone who has ever shopped for paint or tile, particularly with one’s spouse, ;-), can understand the importance of color/pattern matching.

This paper is a good illustration of how pattern matching can be used to assist users, albeit not in a topic map context. Its application to the construction of a topic map would be just one step further.

Developers of topic map applications targeting real world data will find a number of insights and techniques in this collection of papers.

OpenOntologyRepository IPR/Discussion

Filed under: Ontology — Patrick Durusau @ 7:19 pm

OpenOntologyRepository IPR/Discussion Ontolog series promises to be an interesting discussion of IPR issues in the context of ontology development.

The wiki-page offers a variety of resources on IPR issues.

ONTOLOG (a.k.a. “Ontolog Forum”) is an open, international, virtual community of practice devoted to advancing the field of ontology, ontological engineering and semantic technology, and advocating their adoption into mainstream applications and international standards.
