Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 31, 2010

One of These Things

One of These Things could be a theme song for topic maps.

It is also a good idea for a topic map authoring interface.

Say you get ten (10) “hits” back from a search. Add a “checkbox” to each “hit.” Unchecked means same as other unchecked “hits.” Checked means different from the unchecked “hits.”

The “same subject” judgment becomes a collective one of all the users of the search interface. Different “hits” are going to be unchecked in any search return.

Semantic input = Human input.
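A minimal sketch of how such checkbox judgments might be pooled across users. The function names, the pair-counting scheme and the document ids are all assumptions made for illustration; nothing here comes from an existing search interface.

```python
from collections import Counter
from itertools import combinations


def record_judgment(same_votes, diff_votes, hits, checked):
    """Fold one user's checkbox state into running pair counts.

    hits    -- ids of the results shown to the user
    checked -- set of ids the user ticked as "different"
    """
    unchecked = [h for h in hits if h not in checked]
    # Every pair of unchecked hits is one vote for "same subject".
    for a, b in combinations(sorted(unchecked), 2):
        same_votes[(a, b)] += 1
    # Each checked hit is one vote for "different" against each unchecked hit.
    for c in checked:
        for u in unchecked:
            diff_votes[tuple(sorted((c, u)))] += 1


def same_subject_score(same_votes, diff_votes, a, b):
    """Fraction of recorded votes saying a and b are the same subject."""
    pair = tuple(sorted((a, b)))
    total = same_votes[pair] + diff_votes[pair]
    return same_votes[pair] / total if total else None


same_votes, diff_votes = Counter(), Counter()
# Hypothetical ids: one user checks d3, a second user checks nothing.
record_judgment(same_votes, diff_votes, ["d1", "d2", "d3"], {"d3"})
record_judgment(same_votes, diff_votes, ["d1", "d2", "d3"], set())
print(same_subject_score(same_votes, diff_votes, "d1", "d2"))  # 1.0
print(same_subject_score(same_votes, diff_votes, "d1", "d3"))  # 0.5
```

No single user settles the question; the score for a pair simply drifts as more people use the interface.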

Voronoi Treemaps

Filed under: Graphs,Maps,Visualization — Patrick Durusau @ 8:56 am

Voronoi Treemaps (Caution: 19 MB file) came to my attention by way of Jack Park.

A Treemap is a visualization of hierarchical data that uses nested rectangles to represent nodes in a tree. The size of a rectangle depends upon a value assigned to it, based on some range of measurement. One drawback of this method is that complex or deep hierarchies are difficult to render for effective use.
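To make the sizing rule concrete, here is a rough sketch of the classic "slice-and-dice" layout for an ordinary rectangular treemap. It is not the Voronoi method from the paper, and the tree format (dicts with "name", "value" and "children" keys) is an assumption chosen for the example.

```python
def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out a tree inside the rectangle (x, y, w, h).

    Children split their parent's rectangle along alternating axes,
    each taking a share proportional to its "value".
    Returns a list of (name, x, y, w, h) tuples.
    """
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(c["value"] for c in children)
    if total <= 0:
        return out  # leaf, or nothing to apportion
    offset = 0.0
    for c in children:
        share = c["value"] / total
        if depth % 2 == 0:
            # Even depth: divide the width among the children.
            slice_and_dice(c, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: divide the height among the children.
            slice_and_dice(c, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical data: "a" holds 6 of 10 units, split evenly between two leaves.
tree = {
    "name": "root",
    "children": [
        {"name": "a", "value": 6, "children": [
            {"name": "a1", "value": 3},
            {"name": "a2", "value": 3},
        ]},
        {"name": "b", "value": 4},
    ],
}
for rect in slice_and_dice(tree, 0, 0, 100, 50):
    print(rect)
```

Each extra level of the hierarchy produces thinner slivers, which is one way to see why deep trees are hard to render usefully.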

The authors provide an excellent introduction to Treemaps and the current state of their use, along with a method that allows Treemap visualizations with arbitrary shapes.

Computationally complex, Voronoi Treemaps may not be appropriate for real-time renderings of topic maps or domains for mapping.

The visualization of data domains as an aid to the creation of topic maps should include Voronoi Treemaps as part of its research agenda.

August 30, 2010

Is search passé?

Filed under: Interface Research/Design,Search Engines,Searching,Topic Maps — Patrick Durusau @ 4:50 pm

Is search passé? is an intriguing question asked at the Montague Institute Review for August, 2010. Unfortunately, not being a member, I can’t summarize their answer for you.

It really isn’t that hard to guess at some of the answers. I blogged about Blair and Maron saying twenty-five years ago:

Stated succinctly, it is impossibly difficult for users to predict the exact words, word combinations, and phrases that are used by all (or most) relevant documents and only (or primarily) by those documents, as can be seen in the following examples.

Documents and texts haven’t changed in the last twenty-five years. If anything, the problem has gotten worse due to the volume and variety of material that is now available for searching.

This is a semantic and therefore human judgment problem. Algorithms and “clever” data structures can assist human users in making those judgments, but can’t replace them in the loop.

Imagine a search engine that seeks the assistance of users on semantic issues. As opposed to the skulking around of current search engines and sites. Why not just ask? Politely.

A user-fed search engine with a topic map backend. That could be very interesting.

August 29, 2010

Journal of Artificial Intelligence Research – Journal

Filed under: Data Integration,Merging,Subject Identity — Patrick Durusau @ 7:23 pm

Journal of Artificial Intelligence Research is one of the oldest electronic journals on the Internet, not to mention that it offers free access to all its contents.

While some of the articles have titles like “The Strategy-Proofness Landscape of Merging”, P. Everaere, S. Konieczny and P. Marquis (2007), Volume 28, pages 49-105, they raise issues that sophisticated topic mappers will need to be able to discuss intelligently with data analysts.

Information Fusion – Journal

Filed under: Data Integration,Merging,Subject Identity — Patrick Durusau @ 6:59 pm

Information Fusion covers a number of areas of direct interest to topic map researchers and developers. An incomplete list includes:

  • Fusion Learning In Imperfect, Imprecise And Incomplete Environments
  • Intelligent Techniques For Fusion Processing
  • Fusion System Design And Algorithmic Issues
  • Fusion System Computational Resources and Demands Optimization
  • Special Purpose Hardware Dedicated To Fusion Applications

If you are considering this as a publication venue, consider their “open access” (quotes are theirs) before making that choice.

August 28, 2010

Annotated Computer Vision Bibliography

Filed under: Merging,Searching,Subject Identity — Patrick Durusau @ 5:33 am

Annotated Computer Vision Bibliography is in its 17th year on the Internet!

Relevant to topic maps, among other reasons:

  1. Users visually distinguishing subjects in topic map use/authoring
  2. Pattern recognition, clustering, related techniques (chapter 14)
  3. Subject recognition of various types

Suggestions of specific articles of interest to topic mappers greatly appreciated!

August 27, 2010

A Comparison of Merging Operators in Possibilistic Logic

Filed under: Mapping,Merging,Subject Identity — Patrick Durusau @ 7:26 am

A Comparison of Merging Operators in Possibilistic Logic by Guilin Qi, Weiru Liu and David Bell has topic maps written all over it, doesn’t it?

The article is not yet available on my university server but I will keep a watch for it and will report back when I have more details. The author links are to their DBLP records.

Try the following searches on “merging operators” in DBLP and CiteSeerX:

******
Update: 28 August 2010

A Comparison of Merging Operators in Possibilistic Logic (another source for the paper) More comments to follow.

******

Update: 28 August 2010

Qi’s PhD thesis (2006) FUSION OF UNCERTAIN INFORMATION IN THE FRAMEWORK OF POSSIBILISTIC LOGIC starts with:

Possibilistic logic provides a good framework for dealing with merging problems when information is pervaded with uncertainty and inconsistency. Many merging operators in possibilistic logic have been proposed. However, there are still some important problems left unsolved.

Makes me curious about the “Many merging operators….” No promises as to when, but it would be interesting to start a list of those both within and without possibilistic logic.

August 26, 2010

all things cataloged

Filed under: Topic Maps — Patrick Durusau @ 7:41 pm

all things cataloged is a new blog on topic maps and related issues by Saskia, a library cataloger based in Vienna, Austria.

I first saw a note about this new blog on Topic Map Snippets.

When you have a moment, step over to Saskia’s blog to welcome a new member of our community.

August 25, 2010

Murray – Presentation History

Filed under: Graphs,Information Retrieval,Subject Identity — Patrick Durusau @ 3:36 pm

Ronald Murray forwarded a Presentation History that clarifies some of the issues raised in Ethnomathematics Doodles.

Please use “Presentation History” instead of “Ethnomathematics Doodles” on its own.

August 24, 2010

Ethnomathematics Doodles

Filed under: Graphs,Information Retrieval,Subject Identity — Patrick Durusau @ 7:29 pm

Ethnomathematics Doodles came by way of Ronald Murray, whose presentation, Moby-Dick to Mashups, was mentioned here not all that long ago.

BTW, Ron has placed the slides from that presentation up on Slideshare.net and is seeking comments on them.

August 23, 2010

KNIME – Professional Open-Source Software

Filed under: Heterogeneous Data,Mapping,Software,Subject Identity — Patrick Durusau @ 7:27 pm

KNIME – Professional Open-Source Software is another effort by the domain bridging folks I mentioned yesterday.

From the homepage:

KNIME (Konstanz Information Miner) is a user-friendly and comprehensive Open-Source data integration, processing, analysis, and exploration platform. From day one, KNIME has been developed using rigorous software engineering practices and is currently being used actively by over 6.000 professionals all over the world, both in industry and academia.

Read the KNIME features page for a very long list of potentially useful subject identity tests.

There is a place for string matching IRIs, but there is a world of subject identity beyond that as well.

August 22, 2010

Domain Bridging Associations Support Creativity

Filed under: Data Integration,Heterogeneous Data,Mapping,Semantics — Patrick Durusau @ 10:21 am

Domain Bridging Associations Support Creativity by Tobias Kötter, Kilian Thiel, and Michael R. Berthold, offers the following abstract:

This paper proposes a new approach to support creativity through assisting the discovery of unexpected associations across different domains. This is achieved by integrating information from heterogeneous domains into a single network, enabling the interactive discovery of links across the corresponding information resources. We discuss three different pattern of domain crossing associations in this context.

Does that sound familiar to anyone?

Part of the continuing irony that semantic integration research suffers from a lack of semantic integration.

I am just at the tip of this particular iceberg of research so please chime in with pointers to conferences, proceedings, articles, books, etc.

I found this paper, and a number of other resources, on the Publications page of the Nycomed Chair for Bioinformatics and Data Mining at the Universität Konstanz.

August 21, 2010

Mappify Adds…CTM 1.0!

Filed under: CTM,Topic Map Software,Topic Maps — Patrick Durusau @ 8:23 pm

Lars Heuer posts, Mappify Topic Maps to Topic Maps Web Service has added support for CTM 1.0!

According to Lars:

Mappify can convert CTM sources into JTM 1.0, XTM 2.1 and even into CTM 1.0 🙂

That last one, CTM to CTM, would that be like comparing your chess moves with those of a computer faced with the same situation?

August 20, 2010

Transparency, *-wingers and Legislation

Filed under: Examples,Marketing,Topic Maps — Patrick Durusau @ 8:14 pm

Transparency for U.S. legislation seems like a big nut to crack.

First there is the legislation itself and to be complete, all the revisions, amendments, etc.

Second, there is analysis of that legislation, from all sides, from the GAO to “Moles-For-President.”

Third, there is the matching of all the analysis to the legislation and doing so in a timely fashion.

Fourth, useful interfaces so everyone from novices to professional researchers can find the information they need.

Fifth, there is the hardware/software support that would be required to power such a solution.

All that adds up to a large investment in people and infrastructure. Not to mention largely duplicating what has already been done by others.

Let’s take a no-local-copy view of topic maps. That is, map to representatives of subjects in place (“in situ” for my archaeology friends).

Offer an interface that allows selection of any part of legislation/regulation and entry of a pointer to commentary on that part.

Capture the enthusiasm of the *-wingers of various persuasions.

Give a preference to linked comments of less than 500 words.

Does that reduce the big nut problem down to a smaller one, one that may be doable?

Suggestions?

August 19, 2010

Human Flesh Search Engine (HFS)

Filed under: Searching,Topic Maps — Patrick Durusau @ 6:02 pm

Computer, August, 2010, has A Study of the Human Flesh Search Engine: Crowd-Powered Expansion of Online Knowledge by Fei-Yue Wang, Daniel Zeng, James A. Hendler, Qingpeng Zhang, Zhuo Feng, Yanqing Gao, Hui Wang, Guanpi Lai.

It is a study of several episodes of mass collaboration in China in which users collected and shared information.

Has to make you wonder about using human communities (as opposed to “experts”) to identify subjects and build topic maps, doesn’t it?

To make community maps explicit is probably the more accurate turn of phrase. Communities already identify subjects of interest to them, just not in the same language as an “expert.”

Identifying subjects in human community languages (as opposed to “expert” languages) won’t enable software agents to “reason” about the temperature of drinks in a soft drink machine.

But I know where to hire some bright engineers if I need that sort of information over a web interface.

Post Early, Post Often

Filed under: Uncategorized — Patrick Durusau @ 4:50 am

Apologies for the lack of a post for August 18, 2010.

I was working on a post late yesterday evening when my ISP lost connectivity to the Net. 🙁

I could not stay up late enough to see if it would be repaired before the end of the day.

Hence, no post for August 18, 2010.

Have a lot of stuff in the queue so will try to get an early post out most days.

August 17, 2010

TM/JSON – Proposal

Filed under: Topic Map Software,Topic Maps — Patrick Durusau @ 3:50 pm

TM/JSON is a work in progress by Robert Cerny.

According to Robert:

The main idea is to have an object representation of a topic map in any programming language that supports JSON without writing or generating mapping code and still being able to access the information with little to no knowledge of Topic Maps.
TM/JSON first draft

Code written with no understanding of the inputs seems problematic to me. (The mother’s programming job in Snow Crash?)

TM/JSON does not appear to require ignorance of topic maps so perhaps programmers knowledgeable about topic maps will find it useful as well.

We all need to give it a close read and Robert the benefit of some feedback.

August 16, 2010

A Bill With No Name

Filed under: Examples,Marketing,TMCL,Topic Maps — Patrick Durusau @ 6:13 am

H.R. 1586 as passed by the U.S. Senate and reported by Thomas (Library of Congress, legislative information), reads:

Short Title
section 1. This Act may be cited as the “______Act of____”.

Ask yourself: How would topic maps lead to a different result? (Ok, that probably wasn’t your first thought, work with me here.)

If bills were treated as subjects, represented by topics, using TMCL, we can specify that every topic of type “House Bill” has to have one and only one name.

We can modify the example in TMCL, 7.6 Topic Name Constraint to read

houseBill isa tmcl:topic-type;
has-name(tmdm:topic-name, 1, 1).

Which says every topic of House Bill type has one and only one name. And we should get an error warning if it is missing.
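For readers who do not have a TMCL validator at hand, the effect of that constraint can be imitated in a few lines of Python. This is only a sketch of the cardinality check: the dict layout, the function name and the second bill are hypothetical, and a real TMCL processor works on a topic map rather than on dicts.

```python
def check_name_cardinality(topics, topic_type, min_names=1, max_names=1):
    """Return (topic id, name count) for every topic of topic_type
    whose number of names falls outside [min_names, max_names]."""
    violations = []
    for topic in topics:
        if topic["type"] != topic_type:
            continue
        count = len(topic.get("names", []))
        if not (min_names <= count <= max_names):
            violations.append((topic["id"], count))
    return violations


bills = [
    # The bill with the blank short title: no name at all.
    {"id": "hr-1586", "type": "houseBill", "names": []},
    # Hypothetical, made-up bill that satisfies the constraint.
    {"id": "hr-example", "type": "houseBill", "names": ["Example Act of 2010"]},
]
print(check_name_cardinality(bills, "houseBill"))  # [('hr-1586', 0)]
```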

If that seems like a lot of trouble to fix a workflow proofing glitch, consider this:

U.S. legislation typically runs hundreds, even thousands of pages with provisions that are relevant to particular constituencies. What if all those provisions and their constituencies were treated as subjects, represented by topics?

Everyone could read those provisions of interest to them or the ones they were interested in opposing (possibly the more popular of the two). Instead of 2,000 pages you might need to read only 3 to 5 pages.

Reading maybe 3 to 5 pages sounds more like transparency to me than dumping 2,000+ pages on my desk and calling it “transparency.”

******
PS: My suggestion to fix the bill title: “Last Opaque Act of 2010.” Whether lobbyists, elected officials and agencies can hear it or not, transparency is coming, to the USA.

August 15, 2010

Index Merging

Filed under: Database,Indexing — Patrick Durusau @ 4:47 pm

Index Merging by Surajit Chaudhuri and Vivek Narasayya caught my eye for obvious reasons!

I must admit to some disappointment when I found it was collecting index columns and placing them together in a single table. I am sure that technique is quite valuable for data warehouses but isn’t what I think of when I use the phrase, “merging indexes.”

The article is well written and was worth reading. As I started to put it to one side, it occurred to me that perhaps I was too hasty in deciding it wasn’t relevant to topic maps.

What if I had a data warehouse with a “merged” index where collectively the columns supported queries based on subject identity? Or if I wanted to use a set of indexes from other applications (say Lucene for example), to query against for similar purposes?

Whether you are into .Net or not, you should add this one to your reading list.

What Is A Map, Really? (2)

Filed under: Mapping,Maps — Patrick Durusau @ 1:37 pm

When you are considering whether a map is a territory, consider the ways in which maps are treated like territories.

Maps are defended like territories. Suggest to one upper ontology that it should consider being more like another upper ontology if you want to see that in real life.

Maps are seen as destinations/territories. Witness the “convert to the latest ….. data model” efforts. A data model is nothing but a map. Advocates of a data map/model will not rest until all data bows to their map/model. (Rest easy, it never happens.)

Maps are seen as destinations/territories (2). The constructs of a map can be seen as subjects in their own right (in addition to its contents). Those subjects are implicitly recognized in conversion. (Topic maps enable those subjects to be made explicit.)

Particular destinations/territories claim that the existence of different destinations/territories impedes interchange and communication, and creates unnecessary expense.

What other characteristics would you ascribe to territories that can also be said about maps?

(Your claim doesn’t need universal acclamation, but you should have a good argument for it.)

*****
I reformed “covert” to “convert,” based on a comment from Kirk. Although, from what I read in the papers, “covert” might have been accurate as well! 😉

Thanks for the catch Kirk!

August 14, 2010

What Is A Map, Really?

Filed under: Mapping,Maps — Patrick Durusau @ 3:58 pm

Hayakawa’s dictum “…the map is NOT the territory it stands for.” (Language in Thought and Action, 1949) opens the question of what is the nature of its “NOT” being the territory it stands for?

That question has many aspects but the one for today is that the map “…stands for…” the territory. That implies that it points to or in some way represents the territory.

If a map points to a territory, can there be more than one map of the same territory?

Think of at least 2 examples of where there are different maps of physical territories.

How do those maps point to the territory in question?

How would you point to the maps in your example from another map?

Would that make the maps in your example into territories?
If not, why not?

August 13, 2010

Prescriptive vs. Adaptive Information Retrieval?

Filed under: Concept Hierarchies,Indexing,Information Retrieval,Thesaurus — Patrick Durusau @ 8:47 pm

Gary W. Strong and M. Carl Drott contend, in A Thesaurus for End-User Indexing and Retrieval, Information Processing & Management, Vol. 22, No. 6, pp. 487-492, 1986, that:

A low-cost, practical information retrieval system, if it were to be designed, would require a thesaurus, but one in which end-users would be able to browse research topics by means of an organization that is concept-based rather than term-based as is the typical thesaurus.

…. (while elsewhere)

It is our hypothesis that, when the thesaurus can be envisioned by users as a simple, yet meaningful, organization of concepts, the entire information system is much more likely to be useable in an efficient manner by novice users. (emphasis added)

It puzzles me that experts are building a system of concepts for novices to use. Do you suspect experts have different views of the domains in question than novices? And approach their search for information with different assumptions?

Any concept system designed by an expert is a prescriptive information retrieval system. It represents their view of the domain and not that of a novice. Or rather it represents how the expert thinks a novice should navigate the field.

While the expert’s view may be useful for some purposes, such as socializing a novice into a particular view of the domain, it may be more useful for novices to use a novice’s view of the domain. To build that we would need to turn to novices in a domain. Perhaps through the use of adaptive information retrieval, IR that adapts to its user, rather than the other way around.

Adaptive information retrieval systems, I like that, ones that grow to be more like their users and less like their builders with every use.

August 12, 2010

CXTM-tests Release!

Filed under: CXTM,Topic Map Software,Topic Maps — Patrick Durusau @ 6:39 pm

CXTM-tests 0.3 has been released!

Oh, I guess I had better say what that means. 😉 Or, better yet, let that silver-tongued devil Lars Heuer say it for me:

It is a suite of tests for Topic Maps implementations, based around the various Topic Maps syntaxes. The intention is to help developers of Topic Maps implementations verify that their implementations are actually correct according to the specifications.

Each test consists of (at least) one input file with a corresponding CXTM file. If a Topic Maps implementation works correctly, it has to generate the same canonical output as specified by the reference CXTM file.

Or see his post, ANN: CXTM tests 0.3 released
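The pass/fail rule Lars describes boils down to a byte-for-byte comparison, which might be sketched as follows. The `to_cxtm` callable stands in for whatever implementation is under test and is purely hypothetical, as are the file names; the actual layout of the CXTM-tests suite is not assumed here.

```python
import tempfile
from pathlib import Path


def run_cxtm_test(input_path, reference_path, to_cxtm):
    """Return True if the canonical output produced for input_path is
    identical, byte for byte, to the reference CXTM file.

    to_cxtm -- hypothetical callable supplied by the implementation under
               test: takes the input file's bytes, returns CXTM bytes.
    """
    produced = to_cxtm(Path(input_path).read_bytes())
    expected = Path(reference_path).read_bytes()
    return produced == expected


# Stand-in "implementation" so the sketch runs; it is NOT a real canonicalizer.
def fake_to_cxtm(data):
    return b"<" + data + b"/>"


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "example.in")
    reference = Path(tmp, "example.cxtm")
    source.write_bytes(b"topic")
    reference.write_bytes(b"<topic/>")
    passed = run_cxtm_test(source, reference, fake_to_cxtm)
print(passed)  # True
```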

August 11, 2010

Mappify Topic Maps To Topic Maps web service

Filed under: Topic Map Software,Topic Maps — Patrick Durusau @ 5:44 am

Mappify Topic Maps to Topic Maps web service is an awkwardly titled but interesting announcement from Lars Heuer.

From his post:

I added a web service to Mappify which translates different Topic Maps syntaxes to XTM 2.1, CTM 1.0 and JTM 1.0 (reasons for this limitation can be found in [1]).

Supported input formats:
* JSON Topic Maps (JTM) 1.0
* Linear Topic Maps (LTM) 1.3
* XML Topic Maps 1.0, 2.0, 2.1
* TM/XML Topic Maps 1.0, 1.1

See his post for the full details.

As a community, not to pick on Lars, we need to find better titles for our papers/posts, etc. Take this post for example, why not: “Crossdressing Topic Maps, A Web Service.”? I think that would get a lot more hits than its present title.

August 10, 2010

Master Data Management (MDM)

Filed under: Marketing — Patrick Durusau @ 2:40 pm

Master Data Management (MDM) is a revival of an old idea.

The basic idea is that an organization should have one uniform way to talk about its non-transactional entities. In topic map land we would say subjects.

OK, but here comes the payoff question: How does the organization deal with heterogeneous data from others?

Ah, yes, well, hmmm, …..that wasn’t part of our MDM contract.

You can be an island of pure data (ghetto?) in a heterogeneous world (MDM) or you can play well with others (topic maps). Which do you think offers the most commercial advantage?

August 9, 2010

TMRA – WG 3 Meeting!

Filed under: Conferences,TMQL,Topic Maps — Patrick Durusau @ 6:30 pm

JTC 1/SC 34 WG 3 (ok, Topic Maps) working group will be meeting two days before TMRA starts in Leipzig, Germany! That is 27-28 September 2010. (Location details forthcoming.)

The main focus of the meeting will be TMQL.

Make it a week in Leipzig!

August 8, 2010

Gephi – The Open Graph Viz Platform

Filed under: Gephi,Graphs,Information Retrieval,Interface Research/Design,Maps,Software — Patrick Durusau @ 3:51 pm

Gephi is an “interactive visualization and exploration platform” for graphs.

From the site:

  • Exploratory Data Analysis: intuition-oriented analysis by networks manipulations in real time.
  • Link Analysis: revealing the underlying structures of associations between objects, in particular in scale-free networks.
  • Social Network Analysis: easy creation of social data connectors to map community organizations and small-world networks.
  • Biological Network analysis: representing patterns of biological data.
  • Poster creation: scientific work promotion with hi-quality printable maps.

I find the notion of interaction with a graph, or in our case a topic map represented as a graph, quite fascinating.

Imagine selecting or even adding properties as the basis for merging and then examining those results in an interactive rather than batch process.

Can “drag-n-drop” topic map authoring be that far away?
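As a rough illustration of choosing merge properties on the fly, the sketch below merges topics that agree on whichever properties are selected. It is a deliberately naive stand-in: topics are plain dicts with made-up values, equality of values is the only identity test, and it has nothing to do with Gephi's own API.

```python
def merge_on(topics, keys):
    """Merge topics that agree on every property named in keys.

    Topics missing any of the keys are passed through unmerged.
    Merged topics keep the union of all values seen, as sets.
    """
    merged, untouched = {}, []
    for topic in topics:
        if not all(k in topic for k in keys):
            untouched.append(topic)
            continue
        identity = tuple(topic[k] for k in keys)
        target = merged.setdefault(identity, {})
        for prop, value in topic.items():
            target.setdefault(prop, set()).add(value)
    return list(merged.values()) + untouched


# Hypothetical topics with made-up names and URLs.
topics = [
    {"name": "A. Author", "homepage": "http://example.org/a"},
    {"name": "Alex Author", "homepage": "http://example.org/a"},
    {"name": "A. Author", "homepage": "http://example.org/other"},
]
print(len(merge_on(topics, ["homepage"])))          # 2
print(len(merge_on(topics, ["name"])))              # 2
print(len(merge_on(topics, ["name", "homepage"])))  # 3
```

Rerunning with a different `keys` list is the batch-mode version of what an interactive tool would let you do by clicking.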

August 7, 2010

QR Codes & Topic Maps?

Filed under: Marketing — Patrick Durusau @ 8:39 pm

Guided by Barcodes by Meredith Farkas is a great introduction to the growing use of QR codes.

QR codes are 2D bar codes that any mobile phone with a camera can read.

Like this one for the URL of this blog:

QR Code for this blog

Meredith says Google will be pushing these for storefronts, to take people to a “favorite place” listing.

I think a QR code that takes you to information about the “favorite place” like customer reviews and health inspections would be a lot more useful.

The article covers other places (museums, special collections) where topic maps as the target of the QR code would be a real value add.

If enough QR codes start appearing for topic maps I may just have to buy a cell phone.

August 6, 2010

Topic Maps Data Model (TMDM) Turns 10 (next year)

Filed under: TMDM — Patrick Durusau @ 6:07 am

The Topic Maps Data Model (TMDM) first appeared in the SC 34 document registry on 11 August 2001. (That’s SC 34 N0242 for ISO insiders.)

What better way to celebrate its “birthday” than a two day, 2 hours per day, series of presentations on what we have learned in the past ten years and where we would like to go?

I am proposing teleconferences on the 11th and 12th of August, 2011, say from 10 AM UTC/GMT (12 PM Norway, 7 PM Japan, 6 AM Eastern US) until 12 PM UTC/GMT.

General format being 20 minute presentations with 10 minutes Q/A. That should accommodate a maximum of 4 presentations each day.

Comments/suggestions? Volunteers to make presentations?

******

Correction/Clarification:

TMDM – Current TMDM Page: the link that appears above.

http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0242.htm: The version of the TMDM that will be 10 next year.

Thanks to Lars Heuer for catching the confusion. I wanted to point everyone to the current TMDM page.

*****
Note that I corrected the date for the appearance of the TMDM from 11 August 2011 to 11 August 2001 (its actual date of appearance).

Thanks to Lars Marius Garshol for the correction!

August 5, 2010

PGS – Pretty Good Semantics

Filed under: Uncategorized — Patrick Durusau @ 12:06 pm

PGS – Pretty Good Semantics is the result of months of conversation with Sam Hunting.

Our starting premise: Users want to say things of interest to them, as simply as possible, for them.

Note the focus on users. Not on description logic. Not on formal ontologies. Not on reasoning, artificial or otherwise. Not even on complex mappings between identifications. But on users.

All of those other things are worthwhile enterprises, some of them anyway, which you can pursue at your own leisure.

The question is how to empower users to say things about what interests them. And if possible, how to do so without re-writing the WWW to deal with 303 clouds, etc.?

Our answer to those questions: PGS – Pretty Good Semantics. It asks very little of users yet can annotate any identifier on the WWW to say whatever a user likes.

It uses existing HTML techniques and works with existing web servers and search engines.

Enjoy!
