Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 21, 2011

Topic Maps in < 5 Minutes

Filed under: Marketing,Topic Maps — Patrick Durusau @ 6:37 am

Get a cup of coffee and set your timer!

*****

The best way to explain topic maps is to say why we want them in the first place.

I have a set of food/recipe/FDA documents.

Some use the term salt.

Some use the term sodium chloride.

Some use the term NaCl.

Walmart wants to find information on salt/sodium chloride/NaCl, no matter how it is identified.

Not a hypothetical use case: Walmart Launches Major Initiative to Make Food Healthier and Healthier Food More Affordable

Topic maps would enable Walmart to use any one identification for salt, and to retrieve all the information about it, recorded using any of its identifications.*

How?

Topics represent subjects and can have multiple, independent identifications of the same subject.
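A minimal sketch of that idea in Python, not any particular topic map engine. The class names and the recorded notes are made up for illustration; the only point is that one topic carries several identifications and any of them retrieves the same information.

```python
class Topic:
    """A topic stands for one subject and keeps every identification seen for it."""

    def __init__(self, *identifiers):
        self.identifiers = {i.lower() for i in identifiers}
        self.notes = []  # information recorded about the subject


class TopicMap:
    """Hypothetical container that indexes topics by each of their identifiers."""

    def __init__(self):
        self._by_identifier = {}

    def add(self, topic):
        for identifier in topic.identifiers:
            self._by_identifier[identifier] = topic

    def record(self, identifier, note):
        """Record a note under whichever identification the source used."""
        self._by_identifier[identifier.lower()].notes.append(note)

    def lookup(self, identifier):
        """Return everything recorded about the subject, however it was identified."""
        return list(self._by_identifier[identifier.lower()].notes)


tm = TopicMap()
tm.add(Topic("salt", "sodium chloride", "NaCl"))

# Made-up notes, each recorded under a different identification.
tm.record("salt", "recipe: 1 tsp salt")
tm.record("NaCl", "lab sheet: NaCl purity")
tm.record("sodium chloride", "label: sodium chloride content")

print(tm.lookup("sodium chloride"))
```

Asking for "salt", "NaCl" or "sodium chloride" returns the same three notes, because all three identifications lead to the same topic.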

That sounds cool!

Can I talk about relationships too? Like they do in the tabloids? 😉

Well, sure, but we call those associations.

Let’s keep with the salt theme.

What about: Mark Kurlansky wrote “Salt: A World History”.

How many subjects do you see?

Mark Kurlansky (subject) wrote “Salt: A World History” (subject).

Two, right?

But Mark is probably in a lot of associations, both professional and personal. How do we keep those straight?

Topic maps identify another three subjects:

Mark Kurlansky (subject) (role – author) wrote (subject – written-by) “Salt: A World History” (subject) (role – work).

Mark is said to “play” the role of author in the association. “Salt: A World History” plays the role of a work in the association.

That means we can find all the places where Mark plays the author role, as opposed to the husband role, speaker role, etc.
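A rough sketch of an association with roles, in Python. The tuple layout and the function name are my own shorthand, and the second association is hypothetical, added only so there is something to filter out.

```python
from collections import namedtuple

# An association has a type and a tuple of (role, player) pairs.
Association = namedtuple("Association", ["type", "roles"])

associations = [
    Association("written-by", (("author", "Mark Kurlansky"),
                               ("work", "Salt: A World History"))),
    # Hypothetical association, here only to show filtering by role.
    Association("spoke-at", (("speaker", "Mark Kurlansky"),
                             ("event", "a hypothetical book festival"))),
]


def played_as(player, role):
    """Return the associations in which `player` plays `role`."""
    return [a for a in associations if (role, player) in a.roles]


author_of = played_as("Mark Kurlansky", "author")
print([a.type for a in author_of])
```

Only the written-by association comes back. The speaker association stays out of the way because Mark plays a different role in it.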

One more thing rounds out the typical topic map: occurrences.

Occurrences are used to write down places where we have seen a subject discussed.

That is, we say that Mark Kurlansky (subject) occurs at:

http://www.bookbrowse.com/reviews/index.cfm?book_number=960

(How am I doing for time?)

That’s it.

To review:

  1. Topics represent subjects that can have multiple identifications.
  2. Use of any identifier should return information recorded using any identifiers for the same subject.
  3. Associations are relationships between subjects (role-players) playing roles.
  4. Occurrences are pointers to where subjects are discussed.

Exercise:

Choose any subject that you like to talk about. Now answer the following questions about your subject:

  1. Name at least three ways it is identified when people talk about it.
  2. Name at least one association and the roles in it for your subject.
  3. Name at least three occurrences (discussions) of your subject that you would like to find again.

Congratulations! Except for the syntax, you have just gathered all the information you need for your first topic map.

*****
* The salt example is only one of literally hundreds of thousands of multiple identifier type issues that confront any consumer of information.

Walmart could use topic maps to collate information of interest to its 140 million customers every week and to deliver that as a service both to its customers as well as other commercial consumers of information.

A contrast to the U-Sort-It model of Google, which delivers dumpster loads of information, over and over again, for the same request.

PS: There are more complex issues and nuances of the syntaxes but this is the essence of topic maps.

February 20, 2011

Detecting Defense Fraud With Topic Maps

Filed under: Marketing,Topic Maps — Patrick Durusau @ 4:29 pm

The New York Times reported Sunday, in Hiding Details of Dubious Deal, U.S. Invokes National Security, on extraordinary fraud in defense contracting.

Gen. Victor E. Renuart, Jr. of the Air Force, former commander of the Northern Command, is quoted as saying:

We’ve seen so many folks with a really great idea, who truly believe their technology is a breakthrough, but it turns out not to be.

OK, but the technology in question was alleged to detect messages in broadcasts from Al Jazeera. (I suppose something more than playing them backwards and hearing, “number 9, number 9, number 9, ….”)

The fact that some nut-job wraps up in a flag and proclaims they want to save American lives should not short circuit routine sanity checks.

Here’s one where topic maps would be handy:

Build a topic map interface to the Internet Movie Database that has extracted all the technologies used in all the movies listed.

Give each of those technologies a set of attributes so a contracting officer can check them off while reading contract proposals.

For example, in this case:

  • hidden messages
  • TV broadcasts
  • by enemy

Which would return (along with possibly others): Independence Day.
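The check-off idea can be sketched as a set comparison in Python. The attribute sets here are typed in by hand and the two "Hypothetical Movie" entries are invented. A real map would have to extract attributes from the movie data, which is the hard part and is not shown.

```python
# Hand-entered, illustrative attribute sets; not extracted from any real database.
movie_technologies = {
    "Independence Day": {"hidden messages", "TV broadcasts", "by enemy"},
    "Hypothetical Movie A": {"hidden messages", "radio broadcasts"},
    "Hypothetical Movie B": {"TV broadcasts", "time travel"},
}


def movies_matching(checked):
    """Return titles whose technology attributes include every checked item."""
    checked = set(checked)
    return sorted(title for title, attrs in movie_technologies.items()
                  if checked <= attrs)


print(movies_matching(["hidden messages", "TV broadcasts", "by enemy"]))
```

The contracting officer checks boxes, the map answers with the movies that already used the idea.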

That should terminate the funding review with a referral to the U.S. Attorney for attempted fraud or the local district attorney for competency hearings.

New technologies are developed all the time but non-fraudulent proposals based upon them can be independently verified before funding.

If it works only for the inventor or in their lab, pass on the opportunity.

BTW, a topic map of who was being funded for what efforts could have made the Air Force aware other departments were terminating funding with this applicant.

Caution: It is always possible to construct topic maps (or other analysis) in hindsight, that avoid problems that have already happened. That said, topic maps can be used to navigate existing information systems, providing a low impact/risk way to evaluate the utility of topic maps in addressing particular problems.

A thought on Hard vs Soft – Post
(nonIdentification vs. multiIdentification?)

Filed under: Marketing,Subject Identity,Topic Maps — Patrick Durusau @ 10:39 am

A thought on Hard vs Soft by Dru Sellers starts off with:

With the move from RDBMS to NoSQL are we seeing the same shift that we saw when we moved from Hardware to Software. Are we seeing a shift from Harddata to Softdata? (emphasis in original)

See his post for the rest and the replies.

Do topic maps address a similar hardIdentification vs. softIdentification?

By hardIdentification I mean a single identification.

But it goes further than that, doesn’t it?

There isn’t even a single identification in most information systems.

Think about it. You and I both see the same column names and have different ideas of what they mean.

I remember reading in Doan’s dissertation (see Auditable Reconciliation) that a schema reconciliation project would have taken 12 person years but for the original authors being available.

We don’t have any idea what has been identified in most systems and no way to compare it to other “identifications.”

What is this? Write once, Wonder Many Times (WOWMT)?

So, topic maps really are a leap from nonIdentification to multiIdentification.

No wonder it is such a hard sell!

People aren’t accustomed to avoiding the cost of nonIdentification and here we are pitching the advantages of multiIdentification.

Pull two tables at random from your database and have a contest to see who outside the IT department can successfully identify what the column headers represent. No data, just the column headers.*

What other ways can we illustrate the issue of nonIdentification?

Interested in hearing your suggestions.

*****
*I will be posting column headers from public data sets and asking you to guess their identifications.

BTW, some will argue that documentation exists for at least some of these columns.

True enough, but from a processing standpoint it may as well be on a one way mission to Mars.

If the system doesn’t have access to it, it doesn’t exist. (full stop)

Gives you an idea of how impoverished our systems truly are.

IBM’s Watson (the computer, not IBM’s founder, who was also soulless) has been described as deaf and blind. Not only that, but it has no more information than it is given. It cannot ask for more. The life of a pocket calculator, if it had emotions, is sad.

February 18, 2011

Large Scale Packet Dump Analysis with MongoDB

Filed under: Marketing,MongoDB,Subject Identity — Patrick Durusau @ 6:51 am

Large Scale Packet Dump Analysis with MongoDB

I mention this because it occurs to me that distributed topic maps could be a way to track elusive web traffic that passes through any number of servers from one location to another.

I will have to pull Stevens’ TCP/IP Illustrated off the shelf to look up the details.

Thinking that subject identity in this case would be packet content and not the usual identifiers.

And that with distributed topic maps, no one map would have to process all the load.

Instead, each map would, upon request, deliver up proxies to be merged with other proxies, which could then be displayed as partial paths through the networks, with the servers where changes took place being marked.

The upper level topic maps being responsible for processing summaries of summaries of data, but with the ability to drill back down into the actual data.
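A very rough Python sketch of the merging step, under assumptions I have not checked against Stevens: identity is a hash of the payload bytes, each server hands over a small proxy on request, and timestamps are comparable across servers. The function names, server names and payloads are all invented. A payload that changes in transit would get a new hash, so marking the servers where changes took place would need something more than this.

```python
import hashlib
from collections import defaultdict


def subject_id(payload: bytes) -> str:
    """Identify a packet by its content rather than by addresses or ports."""
    return hashlib.sha256(payload).hexdigest()


def make_proxy(server, payload, seen_at):
    """What one server's local map might hand over on request (hypothetical shape)."""
    return {"id": subject_id(payload), "sightings": [(seen_at, server)]}


def merge(proxies):
    """Merge proxies with the same content identity into one time-ordered path."""
    paths = defaultdict(list)
    for proxy in proxies:
        paths[proxy["id"]].extend(proxy["sightings"])
    return {pid: [server for _, server in sorted(sightings)]
            for pid, sightings in paths.items()}


payload = b"example payload"
proxies = [
    make_proxy("server-b", payload, seen_at=2),
    make_proxy("server-a", payload, seen_at=1),
    make_proxy("server-c", b"unrelated payload", seen_at=3),
]

paths = merge(proxies)
print(paths[subject_id(payload)])
```

No single map sees all the traffic here. Each contributes its sightings and the merge produces the partial path.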

True, there is a lot of traffic, but simply by dumping all the porn, that reduces the problem by a considerable percentage. I am sure there are other data collection improvements that could be made.

NECOBELAC

Filed under: Biomedical,Marketing,Medical Informatics — Patrick Durusau @ 5:37 am

NECOBELAC

From the webpage:

NECOBELAC is a Network of Collaboration Between Europe & Latin American-Caribbean countries. The project works in the field of public health. NECOBELAC aims to improve scientific writing, promote open access publication models, and foster technical and scientific cooperation between Europe & Latin American Caribbean (LAC) countries.

NECOBELAC acts through training activities in scientific writing and open access by organizing courses for trainers in European and LAC institutions.

Topic maps get mentioned in the faqs for the project: NECOBELAC Project FAQs

Is there any material (i.e. introductory manuals) explaining how the topic maps have been generated as knowledge representation and how can be optimally used?

Yes, a reliable tool introducing the scope and use of the topic maps is represented by the “TAO of topic maps” by Steve Pepper. This document clearly describes the characteristics of this model, and provides useful examples to understand how it actually works.

Well,…, but this is 2011 and NECOBELAC represents a specific project focused on public health.

Perhaps using the “TAO of topic maps” as a touchstone, but we surely can produce more project specific guidance. Yes?

Please post a link about your efforts or a comment here if you decide to help out.

Linked Data-a-thon – ISWC 2011

Filed under: Conferences,Linked Data,Marketing,Semantic Web — Patrick Durusau @ 5:17 am

Linked Data-a-thon http://iswc2011.semanticweb.org/calls/linked-data-a-thon/

I looked at the requirements for the Linked Data-a-thon, which include:

  • make use of Linked Data consumed from multiple data sources
  • be able to make use of additional data from other Linked Data sources
  • be accessible from the Web
  • satisfy the special requirement which will be announced on October 1, 2011.

It would not be hard to fashion a topic map application that consumed Linked Data, made use of additional data from other Linked Data sources and was accessible from the Web.

What would be interesting would be to reliably integrate other information sources, that were not Linked Data with Linked Data sources.

Don’t know about the special requirement.

One person in a team of people would actually have to be attending the conference to enter.

Anyone interested in discussing such an entry?

Suggested Team title: Linked Data Cake (1 Tsp Linked Data, 8 Cups Non-Linked Data, TM Oven – Set to Merge)

Kinda long and pushy but why not?

What better marketing pitch for topic maps than to leverage present investments in Linked Data into a meaningful result with non-linked data.

It isn’t like there is a shortage of non-linked data to choose from. 😉

February 17, 2011

Uncertain Future of Topic Maps?

Filed under: Marketing,Topic Maps — Patrick Durusau @ 7:08 am

I was puzzled by a recent comment concerning the “..uncertain future of topic maps…?”

Some possible sources of an “uncertain future” for topic maps:

  1. Unchanging one world language and semantic adopted universally, plus conversion of all existing data into that language/semantic.
  2. Universal agreement to ignore semantic differences. Increase in death rates stabilizes the world’s population at 17th century levels.*
  3. Extinction of human race.

I hate to be disagreeable, ;-), but I really don’t think any of those three causes are likely to imperil the future of topic maps.

To be honest there could be a fourth cause:

4. As a community we promote topic maps so poorly or create such inane interfaces that users prefer to suffer from semantic impedance rather than use topic maps.

And there is a fifth cause that I don’t recall anyone addressing:

5. There are people who benefit from semantic impedance and non-sharing.

I think we can address #4 and #5 as a matter of marketing efforts.

Starting by asking yourself: Who benefits from a reduction (or maintenance) of semantic impedance?

*****

*No need for alarm. That is just a wild guess on my part. Has the same validity as software and music piracy figures. The difference is I am willing to admit that I made it up to sound good.

Cablemap

Filed under: Marketing,Wikileaks — Patrick Durusau @ 6:42 am

Cablemap

Just in case you have been in a coma for the last 6 months or in solitary confinement, Wikileaks is publishing a set of diplomatic cables it describes as follows:

Wikileaks began on Sunday November 28th publishing 251,287 leaked United States embassy cables, the largest set of confidential documents ever to be released into the public domain. The documents will give people around the world an unprecedented insight into US Government foreign activities.

The cables, which date from 1966 up until the end of February this year, contain confidential communications between 274 embassies in countries throughout the world and the State Department in Washington DC. 15,652 of the cables are classified Secret.

….

The cables show the extent of US spying on its allies and the UN; turning a blind eye to corruption and human rights abuse in “client states”; backroom deals with supposedly neutral countries; lobbying for US corporations; and the measures US diplomats take to advance those who have access to them.

This document release reveals the contradictions between the US’s public persona and what it says behind closed doors – and shows that if citizens in a democracy want their governments to reflect their wishes, they should ask to see what’s going on behind the scenes.

The online treatments I have seen by the Guardian and the New York Times are more annoying than the parade of horrors suggested by US government sources.

True, the cables show diplomats to be venal and dishonest creatures in the service of even more venal and dishonest creatures but everyone outside of an asylum and over 12 years of age knew that already.

Just as everyone knew that US foreign policy benefits friends and benefactors of elected US officials, not the general U.S. population.

Here is the test: Look over all the diplomatic cables since 1966 and find one where the result benefited you personally. Now pick one at random and identify the person or group who benefited from the activity or policy discussed in the cable.

A topic map that matched up individuals or groups who benefited from the activities or policies discussed in the cables would be a step towards being more than annoying.

Topic mapping in Google map locations for those individuals or representatives of those groups, would be more than annoying still.

Add the ability to seamlessly integrate leaked information into another intelligence system, you are edging towards the potential of topic maps.

Cablemap is a step towards the production of a Cablegate resource that is more than simply annoying.

February 14, 2011

Technology vs. Teaching?

Filed under: Examples,Marketing,Topic Maps — Patrick Durusau @ 11:25 am

The reported experiences with technology at the “Data Bootcamp” tutorial at O’Reilly’s Strata Conference 2011, installation and other woes, made me think about technology and teaching for topic maps.

Is it a question of technology vs. teaching?

If technology gets in the way of teaching, does the same happen for users?

I don’t know of any user studies where users are presented with an interface and a list of questions to answer or tasks to perform, in connection with a particular topic maps interface.

Has anyone done such studies?

It would be really good to have a public archive of videos of such sessions (with permission of the participants).

****
PS: For topic map presentations/workshops, it would be good to record comments so tests of the presentation/workshop could be done in advance.

I have done presentations where the slides were perfectly clear to me when I wrote them. At presentation time I had to temporize to remember what I was trying to say. You can imagine the difficulty the audience was having. 😉

February 11, 2011

Dealing with Data

Filed under: Data Analysis,Data Mining,Marketing — Patrick Durusau @ 12:45 pm

Dealing with Data

From the website:

In the 11 February 2011 issue, Science joins with colleagues from Science Signaling, Science Translational Medicine, and Science Careers to provide a broad look at the issues surrounding the increasingly huge influx of research data. This collection of articles highlights both the challenges posed by the data deluge and the opportunities that can be realized if we can better organize and access the data.

Science is making access to this entire collection FREE (simple registration is required for non-subscribers).

The growing concern over the influx of data represents a golden marketing opportunity for topic maps!

First, the predictions about increasing amounts of data are coming true.

That means impressive numbers to cite and even more impressive predictions about the future.

Second, the coming data deluge represents a range of commercial opportunities.

Opportunities for reuse, comparison, and mining such data abound. And, only increase as more data comes online.

Are you going to be the Facebook of some data area?

Third, and the reason unique to topic maps:

The format that contains data is recognized as composed of subjects.

Subjects that can be identified, placed in associations, and have properties added to them.

That one insight is critical to re-use, combination and comparison of data in the data deluge.

If you identify the subjects that compose those structures, as well as the subject thought to be recognized by those data structures, you can then create maps between diverse data sets.

It is the identification of subjects that enables the creation and interchange of maps of where to swim in this vast sea of data.
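One way to picture such a map between data sets, sketched in Python. The data set names, column headers and subject identifiers are all made up. The assumption is that someone has already done the work of saying which subject each column header identifies.

```python
# Hypothetical column headers from two data sets, each mapped to the subject it identifies.
COLUMN_SUBJECTS = {
    "lab_results": {"NaCl_mg": "sodium-chloride-mass", "samp_id": "sample"},
    "nutrition": {"salt (mg)": "sodium-chloride-mass", "item": "sample"},
}


def to_subjects(dataset, row):
    """Re-key a row by subject identifier instead of by local column header."""
    mapping = COLUMN_SUBJECTS[dataset]
    return {mapping[col]: value for col, value in row.items() if col in mapping}


a = to_subjects("lab_results", {"samp_id": "S1", "NaCl_mg": 120})
b = to_subjects("nutrition", {"item": "S1", "salt (mg)": 120})

# Once both rows are keyed by subject, they can be compared or combined directly.
print(a == b)
```

The rows never shared a column name, but once the subjects behind the columns are identified they line up.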

*****
PS: I am going to take a slow walk through these articles and will be posting about opportunities that I see for topic maps. Your comments/feedback welcome!

February 9, 2011

Another Word For It – #1,000

Filed under: Marketing,Topic Maps — Patrick Durusau @ 2:17 pm

This makes my 1,000th post to Another Word For It.

I wanted to take a moment to think about where I would like to go next with the next 1,000 posts.

First, I want to become more systematic with the academic literature that is relevant to topic maps. As you have seen, it is spread from bioinformatics and computer science to library science and semiotics.

One of the things that makes articles/books/presentations slow to add is that I read/view all of them before actually posting them to the blog.

I suppose I could go just on titles/abstracts but then you would have to duplicate my wading through stuff that never makes it onto the blog. That doesn’t seem like a value-add.

Second, along with that, I want to provide more in the way of assistance in that jump between, “ok, topic maps sound great,” and having an operational topic map that provides a meaningful result.

Being more systematic about the literature isn’t going to be easy and providing generalized assistance for topic map authoring is going to be even harder.

My proposal, subject to your comments and suggestions, is to create what I am calling starter maps that have a lot of the basic infrastructure topics, types, etc., plus topics, associations, etc., for a particular domain.

For example, I might want to offer a starter map for say NASCAR racing, that has all the usual structural topics but also all the racetracks, races and competitors for the last decade. Plus relevant associations. Not everything someone would want but enough that getting visible results would not be all that hard.

A boost over the topic map fence as it were.

At least initially, those are mostly going to be in topic map format type data resources. Doesn’t really scale for semantically diverse resources but everyone has to start using topic maps somewhere.

Third, in addition to bare bones starter maps, I would like to create outlines of data sources that look particularly interesting.

Not nearly as easy to use as the starter maps but easier for me to author.

The sort of thing that points out subjects and relationships with subjects in other data sets that may not be readily apparent.

Fourth, I want to continue to discover interesting approaches and resources to bring to your attention.

Those will range from new technologies, such as NoSQL and graph databases, to new algorithms for data processing, to new ways to think about subjects and their identifications, etc. Some of them will prove to be very useful in connection with topic maps and others will prove to be less so, if useful at all.

The key criterion for that last item is that it is interesting material. It is impossible (IMHO) to say what information will or will not spark the next great idea about topic maps in my readers.

Finally, in order to devote the cycles necessary to make all of the foregoing happen, I need donations/sponsorships for these activities.

If you like what you see here on a daily basis and this sounds like a good plan, please use the donate button.





Sponsors welcome as well, please inquire: patrick@durusau.net

PS: Topic map consulting/teaching/training also available.

February 8, 2011

Topic Mapping BoingBoing Data?

Filed under: Dataset,Examples,Marketing — Patrick Durusau @ 6:15 am

A recent entry on Simon Willison’s blog, How we made an API for BoingBoing in an evening caught my eye.

It was based on the release of eleven years’ worth of posts from BoingBoing, which you can download at: Eleven years’ worth of Boing Boing posts in one file!

Curious what subjects you would choose first for creating a topic map of this data set?

And having chosen them, how would you manage their identity to make it easier on yourself to incorporate other blog content?

I am mindful of Robert Barta’s approach of no data before its time for incorporation into a topic map.

Would that make a difference in your design of the topic map or only in the infrastructure that supports it?

Try Redis – Try Topic Maps?

Filed under: Examples,Marketing,NoSQL,Redis — Patrick Durusau @ 5:36 am

Try Redis is a clever introduction to Redis.

I recommend it to you as an introduction to Redis and NoSQL in general.

It also makes me wonder if it would be possible to create a similar resource for topic maps?

Granting that it would have to make prior choices about subject matter, data sets, etc. but still, it could be an effective marketing tool for topic maps.

I suspect so even if the range of choices to be made to effect merging were limited.

If I were a left-wing political blogger in the US I would create a topic map that includes donations to Republican PACs and white collar crime convictions by family members.

Or for the right-wing, a mapping between the provisions of ObamaCare and various specific groups and agencies.

Such that users could choose additional information and it shows up in some visually pleasing way to make the case that the user already thinks is true.

Will have to give this some thought in terms of a framework with a smallish number of framework topics and the ability to quickly add in additional topics for a particular demonstration.

Such that it would be possible to quickly create a topic map demo for some breaking news story.

Could provide useful additional content but the main purpose being a demonstration of the technology.

Useful content is fairly rare so no need to tax a topic map demo with always providing useful content. Sometimes, content is simply content. 😉

Digital Diplomatics 2011 – Conference

Filed under: Conferences,Examples,Marketing,Topic Maps — Patrick Durusau @ 4:41 am

Digital Diplomatics 2011: Tools for the Digital Diplomatist

From the website:

Scholars of diplomatics never had a fundamental opposition on using modern technology to support their research. Nevertheless no technology since the introduction of photography had such an impact on questions and methods of diplomatics as the computer had: Digital imaging gives us cheap reproductions at high quality, so nowadays large corpora of documents are to be found online. Digital imaging allows manipulations to make apparently invisible traces visible. Modern information technology gives us access to huge text corpora in which single words and phrases can be found thus helping to indicate relationships, to retrieve parallel texts for comparison or plot geographical and temporal distributions.

The conference aims at presenting projects which working to enlarge the digitised charter corpus on the one hand and on the other hand will put a particular focus on research applying information technology on medieval and early modern charters aiming at pure diplomatic questions as well as historic or philologic research.

An excellent opportunity for topic maps to illustrate how all the fruits of modern and ancient commentary can be brought to bear, using a text (or at least the idea of a text) as the focal or binding point for information.

Biblical scholarship, for example, becomes less sweat of the brow in terms of travel/access and more a question of seeking answers to interesting questions.

Proposals due: May 15, 2011

Conference: Naples, 29th September – 1st October 2011

February 7, 2011

Client-side Metaservices?

Filed under: Marketing,Metaservices — Patrick Durusau @ 8:14 am

File Under: Metaservices, Rise of?

John Battelle writes:

Let me step back and describe the problem. In short, heavy users of the web depend on scores – sometimes hundreds – of services, all of which work wonderfully for their particular purpose (eBay for auctions, Google for search, OpenTable for restaurant reservations, etc). But these services simply don’t communicate with each other, nor collaborate in a fashion that creates a robust or evolving ecosystem.

The rise of the app economy exacerbates the problem – most apps live in their own closed world, sharing data sparingly, if at all. And while many have suggested that Facebook’s open social graph can help untangle the problem, in fact it only makes it worse, as Fred put it in a recent post (which sparked this Thinking Out Loud session for me):

The people I want to follow on Etsy are not the same people I want to follow on Twitter. The people I want to follow on Svpply are not my Facebook friends. I don’t want to share my Foursquare checkins with everyone on Twitter and Facebook.

It is a very interesting take but I disagree with the implication that metaservices need to be server side.

With a client side metaservice, say one based upon a topic map, I could share (or not) information as I choose.

Granted that puts more of a burden for maintenance of privacy on the user, but anyone who trusts others to manage privacy for them is already living in a fish bowl, they just don’t know it.

I think breaching silos with metaservices on the client-side is an excellent opportunity for demonstrating the information management capabilities of topic maps.

Not to mention being an opportunity for commercialization of a client-side metaservice, which should include mapping for the various online silos and their changing arrangements on a subscription basis.

February 6, 2011

“Astonish Us” – Post

Filed under: Marketing,Topic Maps — Patrick Durusau @ 7:08 am

Astonish Us has to be one of the more persuasive pieces of any genre I have read in a very long time.

The advice you find there is applicable to funding for topic maps research, marketing topic maps to potential customers, promoting topic maps among fellow researchers, in short, every aspect of spreading the word about topic maps.

Go read the post and then come back here (using the back button) to post your comments on what astonishes you with a topic map on some particular subject? How do you want to convey that astonishment to others?

February 4, 2011

TopicView

Filed under: Conferences,Examples,Marketing,Topic Maps — Patrick Durusau @ 3:04 pm

TopicView

TopicView is a project by Morpheus on behalf of the Amsterdam police to bridge the practical and semantic boundaries between their information systems.

That is to say it is a solution that allows existing systems to remain in place, but creating bridges between them to enable the police to make more effective use of the information they do have and to share information across systems.

Do be aware that I used Google’s translate feature to read the homepage of this project so some of my appreciation of it is based on surmises based on my knowledge of topic maps.

I did stumble in some places, such as where the translation reports: Bandages stay hidden for Verbanden blijven verborgen.*

Perhaps fuller information will appear in the future.
*****
*I suspect I am way off base, but since it is a police topic map, I would assume that sources of information can remain hidden, even as the information they provide is shown.

Unix – The Hole Hawg

Filed under: Humor,Marketing — Patrick Durusau @ 5:29 am

Unix – The Hole Hawg by Neal Stephenson.

I assume everyone on the Net has seen and enjoyed this item but it’s Friday and I could not resist repeating it.

I particularly liked the line: “….when I got ready to use the Hole Hawg my heart actually began to pound with atavistic terror.”

Don’t know that I want people to view using topic maps with atavistic terror but I would not mind if the people who topic maps were being used against had that feeling. 😉

Nothing markets a product better than fear someone else has a better version (and is about to do something to you with the better version).

Admittedly I don’t have or know of a demonstration of topic maps that would strike fear in anyone, but I suspect that is only a matter of time.

TweetDeck and Topic Maps

Filed under: Authoring Topic Maps,Marketing — Patrick Durusau @ 4:51 am

If you don’t know TweetDeck.com you need to slide by to take a look.

As an admittedly slow and still uncertain adopter of all this social software, I would appreciate any feedback you have on this or other alternatives.

But, onto the topic map relevant part of this post!

I noticed that TweetDeck 0.37 has a feature: Hide repeated retweets.

I think they should go one better than that and scan tweets for the same shortened URL and offer an option to display what we would call a topic with multiple occurrences.

That is, there would be the one shortened URL, which you could follow if you like, with occurrences under that one tweet that list all the various tweets that contain it.

Would certainly shorten up my tweet windows in TweetDeck a good bit. Most of the repeats aren’t marked as retweets so the software isn’t catching them.

Now, if TweetDeck or equivalent software wanted to be really clever, they could make associations with the senders of those tweets so I could see a list of all the users who sent that resource.
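To make that concrete, here is a small Python sketch. It has nothing to do with how TweetDeck actually works. The tweets, user names and URLs are invented, and finding URLs with a simple regular expression is an assumption that real tweets would strain.

```python
import re
from collections import OrderedDict

URL_PATTERN = re.compile(r"https?://\S+")

# Invented tweets; the example.com links stand in for any shortened URL.
tweets = [
    ("alice", "Great intro to topic maps http://example.com/abc"),
    ("bob", "RT worth a read: http://example.com/abc"),
    ("carol", "Unrelated link http://example.com/xyz"),
    ("dave", "http://example.com/abc is making the rounds"),
]


def group_by_url(tweets):
    """Build one 'topic' per URL: tweets become occurrences, senders are collected once each."""
    topics = OrderedDict()
    for sender, text in tweets:
        for url in URL_PATTERN.findall(text):
            topic = topics.setdefault(url, {"occurrences": [], "senders": []})
            topic["occurrences"].append(text)
            if sender not in topic["senders"]:
                topic["senders"].append(sender)
    return topics


topics = group_by_url(tweets)
print(topics["http://example.com/abc"]["senders"])
```

Three tweets collapse into one entry for the shared URL, whether or not they were marked as retweets, and the senders come along as a list.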

*****
PS: This would be a case where TweetDeck need not offer the generic in your face topic map interface but could offer some of the advantages of topic maps (de-duping content and gathering up all the authors of the same content).

Topic Map Competition

Filed under: Authoring Topic Maps,Interface Research/Design,Marketing,News,Topic Maps — Patrick Durusau @ 4:34 am

The idea of a topic map competition seems like a good one to me.

We need to demonstrate that topic map development isn’t like a trip to the ontological dentist or proctologist.

Just some random thoughts that hopefully can firm up in the near future.

Suggest starting off with two contests, with two different data sets.

24-Hour Topic Map

A 24-hour contest, with points, in part, for inclusion of participants in different time zones, to encourage the spread of topic maps around the globe.

Each team would be encouraged (required?) to keep a blog while developing the topic map so that the progress of the map, interaction with others, etc., could be documented.

Points to be awarded for participants in different time zones (up to 24 points), up to 25 points for extraction of subjects/creation of topic map structures, up to 25 points for the interface/delivery, and up to 26 points for generality of the scripts/software used in generating the map.

The greatest number of points being for generality of scripts/software so we can encourage others to try these techniques on their own data sets.

7-Day Topic Map

Not unlike the 24-Hour Topic Map (24HTM) contest except that with a much longer time period, the expectations for the results are much higher.

Points should still be awarded for participants in different time zones but should drop to 12 points, subject extraction/topic map structures should remain at up to 25 points, interfaces/delivery should go up to 31 points and scripts/software, up to 32 points.

Since the teams will be composed of multiple individuals, I suspect prizes are going to be limited to award certificates, listing on public websites as the winners, etc.

Any number of governments are mandating a transition to digital records (including XML) as though that will solve their access problems. For those seeking contracts, being recognized for work with a data set from a particular government could not hurt.

I suppose that may depend on whether the government views you as having permission to work with the data set. 😉

This is a very rough draft and needs a lot more details before being something practical.

PS: Should either one or both or some other variation of this suggestion prove popular, contests could be run on a monthly basis.

February 2, 2011

Data Governance, Data Architecture and Metadata Essentials – Webinar

Filed under: Data Governance,Data Integration,Marketing — Patrick Durusau @ 9:20 am

Data Governance, Data Architecture and Metadata Essentials

Date: February 24, 2011 Time: 9:00AM PT

Speaker: David Loshin

From the website:

The absence of data governance standards is a critical failure point for enterprise data repurposing. As the rates of data volume grows, you want to make sure you are employing the correct practices and standards to make the most of this volume of information. Data can be your company’s best or worst asset. Join David Loshin, industry expert on data governance for this informative webcast.

I suppose it goes without saying that an absence of data governance means that a topic map effort to use outside data is going to be even more expensive. Or perhaps not.

People have been urging documentation of data practices since before the advent of the digital computer. That is still the starting point for any data governance.

What you don’t know about you can’t govern. It’s just that simple. (Can’t merge it with outside data either. But if your internal systems are toast, topic maps aren’t going to save you.)

January 28, 2011

Next Generation Data Integration – Webinar

Filed under: Data Integration,Marketing — Patrick Durusau @ 9:41 am

Next Generation Data Integration

Date: April 12, 2011 Time: 9:00AM PT

Speaker: Philip Russom

From the website:

Data integration (DI) has undergone an impressive evolution in recent years. Today, DI is a rich set of powerful techniques, including ETL (extract, transform, and load), data federation, replication, synchronization, change data capture, natural language processing, business-to-business data exchange, and more. Furthermore, vendor products for DI have achieved maturity, users have grown their DI teams to epic proportions, competency centers regularly staff DI work, new best practices continue to arise (like collaborative DI and agile DI), and DI as a discipline has earned its autonomy from related practices like data warehousing and database administration.

Given these and the many other generational changes data integration has gone through recently, it’s natural that many people aren’t quite up-to-date with the full potential of modern data integration. Based on a recent TDWI Best Practices report this webinar seeks to cure that malady by redefining data integration in modern terms, plus showing where it’s going with its next generation. This information will help user organizations make more enlightened decisions, as they upgrade, modernize, and expand existing data integration solutions, plus plan infrastructure for next generation data integration.

Every group (tribe as Jack Park would call them) has its own terminology when it comes to data and managing data.

As you can tell from the description of the webinar, data integration is concerned with many of the same issues as topic maps. Albeit under different names.

Regard this as an opportunity to visit another tribe and learn some new terminology.

And some new ideas you can use with topic maps.

January 27, 2011

Baltimore – Semi-Transparent or Semi-Opaque?

Filed under: Data Source,Marketing — Patrick Durusau @ 10:09 am

Open Baltimore is leading the way towards semi-transparent or semi-opaque government.

You be the judge.

The City of Baltimore is leading in placing hundreds of data sets online.

But is that being semi-transparent or semi-opaque?

Data sets I would like to see:

  • City contracts, their amounts and who was successful at bidding on them?
  • Successful bidders identified not by corporate names but by who owns them? Who works there? What lawyers represent them?
  • What are the relationships, personal, business, etc., between staff, elected officials and anyone who does business with the city?
  • Same questions for school, fire, police and other departments.
  • Code violations, what are they, which inspectors write them, for what locations?
  • Arrests: of whom, by which officers, for what crimes, at what locations and times?
  • etc. (these are illustrations and not an exhaustive list)

Make no mistake, I am grateful for the information the city has already provided.

What they have provided took a lot of work and will be useful for a number of purposes.

But I don’t want people to think that a large number of data sets means transparency.

Transparency involves questions of relevant data and meaningful ways to evaluate it and to connect it to other data.

Infochimps

Filed under: Data Source,Marketing,Mashups,Topic Maps — Patrick Durusau @ 8:17 am

Infochimps.com

Another free data source. (Commercial plans also available.)

Large number of data sources and what looks like a friendly number of free API calls while you are building an application.

Observation: Finding one data source or project seems to lead to several others in the same area.

Definitely worth a visit.

*****
PS: The abundance of online data sources opens the door to semantic mappings (can you say topic maps?) that enhance the value of these data sets.

Such as resolving the semantic impedance between the data sets.

Topic map artifacts as commercial products.

The trick is going to be discovering (and resolving) semantic impedances that people are willing to pay to avoid.

Mapping Domains to Domainers

Filed under: Examples,Marketing,Topic Maps — Patrick Durusau @ 6:28 am

Epik Has Epic Semantic Web Plans For Its Domains and Domainers

Unfortunate article about how people who park domains to extort money from others can use semantic technologies to supply content to their sites.

I was thinking last night of a much different use of semantic technologies with regard to domainers.

Wouldn’t it be interesting to have a topic map that traces all the parked and frivolous domains?

That is, one that creates topics to represent them so Google and other search services can simply exclude those sites from search results?

There’s one useful result right there.

Another useful result would be to associate the individuals who work for or own such companies with those companies.

They are certainly free to generate domain names and snap them up by the thousands, while junking up search results for all the rest of us.

But, then we are also free to choose who we will associate with.

Topic maps can help us bring honor and shame to the WWW. Has worked for centuries, no reason it should not work now.

*****
PS: Maybe we could have contests, Find that Domainer, how many minutes, seconds will it take you to identify a domainer from a domain name? Or to locate their photo? Or place of business/residence on Google maps?

January 26, 2011

DataMarket – Drill Down/Annotate?

Filed under: Data Source,Marketing,Topic Maps — Patrick Durusau @ 6:38 am

DataMarket

From the website:

Find and understand data.

Visualize the world’s economy, societies, nature, and industries, and gain new insights.

100 million time series from the most important data providers, such as the UN, World Bank and Eurostat.

I have just registered for the free account and have started poking about.

This looks deeply awesome!

In addition to being a source of data for analytical tools, I see an opportunity for topic maps to enable a drill-down capacity for such displays.

After all, any point in a time series is data from somewhere, and for most such data, at least, it should be traceable back to a file, report or questionnaire.

And from that file, report, questionnaire, it should be further traceable back to the author of the file or report and even further back, to the persons reported upon or questioned.

This site definitely has potential for real growth, particularly if they offer tools that enable drill down into data sets to source materials as well as to annotate points in a data set with other materials. Topic maps would excel at both.
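To make the drill-down idea a little more concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `Source` and `DataPoint` classes, the field names and the sample values are invented, and none of it reflects how DataMarket actually stores or exposes its data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Source:
    """One step in the trail behind a data point (file, report, author, ...)."""
    kind: str
    label: str
    derived_from: Optional["Source"] = None


@dataclass
class DataPoint:
    """A single point in a time series, with a source trail and annotations."""
    series: str
    period: str
    value: float
    source: Optional[Source] = None
    annotations: List[str] = field(default_factory=list)


def drill_down(point):
    """Walk from the data point back through each recorded source."""
    trail = []
    node = point.source
    while node is not None:
        trail.append(f"{node.kind}: {node.label}")
        node = node.derived_from
    return trail


# Hypothetical example data; names and numbers are placeholders.
author = Source("author", "Example Statistics Office")
survey = Source("questionnaire", "Example household survey", derived_from=author)
data_file = Source("file", "example-series.csv", derived_from=survey)

point = DataPoint("example-series", "2010", 42.0, source=data_file)
point.annotations.append("See related report (placeholder note)")

print(drill_down(point))
```

In a topic map the steps in that trail would be topics in their own right, so the same author or questionnaire could be reached from every data point that depends on it.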

Questions:

  1. Register for a free account.
  2. Choose any two data sets and create two visualizations (use screen capture to capture the graphic).
  3. What information would you want to drill down to find or that you would want to use to annotate data points in either visualization? (3-5 pages, no citations)

I’m just a bill…

Filed under: Examples,Marketing,Topic Maps — Patrick Durusau @ 5:20 am

Remember the Schoolhouse Rock song about how a bill becomes a law in the US?

If you don’t, see: Schoolhouse Rock- How a Bill Becomes a Law.

That level of understanding of the legislative process is found in: Stream Congress: A real-time data stream for Congress

From the website:

Once Congress gets back to work, Stream Congress will serve as a good example of what the Real Time Congress API provides: floor updates, bill status, floor votes, committee hearing notices, and much more.

Don’t get me wrong, I like Sunlight Labs.

They have the potential to alter the political landscape.

But not with this understanding of how laws are made in the US.

Members of Congress write bills? Really? You really think that?

Have you ever met a member of Congress? Either house?

Let’s start by naming, when a bill is proposed, the staffers, lobbyists and administration representatives who wrote the bill.

The actual bill authors.

They have goals, friends, etc., that are being furthered by the bills they write (which are passed unread by most members of Congress).

Include who is paying the actual bill authors as well and their sources of funding.

Run that backwards into other legislative sessions. So we can follow patterns of money and ideology that shape legislation before it ever gets proposed.

Then match up people interested in the bill with financial contributions to members of Congress. And the financial or other interest they have in the bill’s outcome.

We have the capacity to name names and make government truly transparent.

But only if we shine light on the actual process.

Topic maps can help with that.

*****
PS: Transparency would require far more than these off-hand suggestions and would not be cheap. Inquiries welcome.

January 15, 2011

Should I Work For Free?

Filed under: Marketing — Patrick Durusau @ 10:38 am

If you have ever wondered about a request to work for free, designer Jessica Hische has a graphic to help you make that decision, Should I Work For Free?

I think topic map folks get asked for free work as often as anyone else and when I saw this at Flowingdata.com I had to mention it.

I started to say it was ironic that Jessica has this elaborate matrix to avoid working for free but then contributes her chart for all of us to enjoy and profit from.

But that’s not the same, is it?

Jessica decided she felt strongly enough to create this chart and then to broadcast it for all to see.

It was done on her terms and while it benefits the rest of us, it surely benefits her as well.

Something to think about.

January 14, 2011

The Tin Man

Filed under: Marketing,TMDM,Topic Maps — Patrick Durusau @ 5:08 pm

One of the reasons I suggested having a podcast based topic maps conference is that watching presentations by others always (or nearly so) inspires me with new ideas.

Take Lars Marius Garshol’s presentation this morning, Topic Maps – Human Oriented Semantics?

While musing over the presentation, I was reminded of the line from the Tin Man, …Oz never did give nothing to the Tin Man / That he didn’t, didn’t already have….

If you don’t remember the story, the Wizard of Oz gives the Tin Man a heart, which he obviously had throughout the story.

Anyway, I think one takeaway from Lars’ presentation is that users don’t need to go looking for experts in order to have semantics.

Users already have semantics and topic maps are a particularly clever way for users to express their semantics using their understanding of those semantics.

May or may not fit into classical, neo-classical, rough or fuzzy logic.

What matters is that a topic map represents subjects and their relationships as understood by the users of the topic map.

Users already have semantics, they just need topic maps in order to express them!

Topic Maps – Human-oriented Semantics? – A Quibble

Filed under: Marketing,TMDM,Topic Maps — Patrick Durusau @ 4:09 pm

Topic Maps – Human-oriented Semantics?

As promised, I have a quibble about the presentation that Lars made this morning. 😉

When talking about topic maps as semantic technology, Lars suggested or at least I heard him suggest, that topic maps help the person inside the Chinese room in John Searle’s famous example.

Lars then proceeded to use an example of a topic map, where the content was written in Japanese.

To show that you could know something about the content or at least relationships between the content, whether you could read it or not.

All of which is true, but my quibble is that such an understanding is on the part of the audience to the presentation and not of the machine/person inside the Chinese room.

Even with a topic map as input, we still don’t know what, if anything, is understood by a person or machine inside the Chinese room.

All we ever know is that we got the correct response to our input.

The presentation elided the transition from the Chinese room to the audience for the presentation. Quite different, at least in my view.

I did not allow that to distract me from an otherwise excellent presentation but I thought I should mention it. 😉
