Almost a Topic Map? Or Just a Mashup?

Thursday, April 9th, 2015

WikipeDPLA by Eric Phetteplace.

See relevant results from the Digital Public Library of America on any Wikipedia article. This extension queries the DPLA each time you visit a Wikipedia article, using the article’s title, redirects, and categories to find relevant items. If you click a link at the top of the article, it loads in a series of links to the items. The original code behind WikipeDPLA was written at LibHack, a hackathon at the American Library Association’s 2014 Midwinter Meeting in Philadelphia:

Google Chrome App Home Page

GitHub page

Wikipedia:The Wikipedia Library/WikipeDPLA

How you resolve the topic map versus mashup question depends on how much precision you expect from a topic map. While knowing additional places to search is useful, I never have a problem with assembling more materials than can be read in the time allowed. On the other hand, some people may need more prompting than others, so I can’t say that general references are out of bounds.

Assuming you were maintaining data sets with locally unique identifiers, using a modification of this script to query an index of all local scripts (say Pig scripts) to discover other scripts using the same data could be quite useful.
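
A minimal sketch of that idea, assuming each script is plain text that mentions the identifiers of the data sets it uses. The identifier pattern (`ds:` followed by digits), the script names, and the functions `build_index` and `scripts_sharing_data` are all hypothetical, invented for illustration, and have nothing to do with the WikipeDPLA code itself.

```python
import re
from collections import defaultdict

# Hypothetical convention: local data set identifiers look like "ds:1234".
DATASET_ID = re.compile(r"\bds:\d+\b")


def build_index(scripts):
    """Map each data set identifier to the set of script names that mention it."""
    index = defaultdict(set)
    for name, text in scripts.items():
        for dataset_id in DATASET_ID.findall(text):
            index[dataset_id].add(name)
    return index


def scripts_sharing_data(index, script_name):
    """Return other scripts that use at least one data set used by script_name."""
    related = set()
    for users in index.values():
        if script_name in users:
            related |= users
    related.discard(script_name)
    return related


# Made-up Pig-like scripts, for illustration only.
scripts = {
    "clean.pig": "raw = LOAD 'ds:1001'; STORE raw INTO 'ds:1002';",
    "report.pig": "clean = LOAD 'ds:1002';",
    "other.pig": "logs = LOAD 'ds:2000';",
}

index = build_index(scripts)
print(sorted(scripts_sharing_data(index, "clean.pig")))
```

The whole thing depends on the identifiers being locally unique. Without that, the index would report scripts as related when they only happen to share a string.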

BTW, you need to have a Wikipedia account and be logged in for the extension to work. Or at least that was my experience.


Mapping Mashups using Google Maps, Facebook and Twitter

Monday, January 21st, 2013

Mapping Mashups using Google Maps, Facebook and Twitter by Wendell Santos.

Over one-third of our mashup directory is made up of mapping mashups and their popularity shows no signs of slowing down. We have taken two looks at mapping mashups in the past. With it being a year since our last review, now is a good time to look at the newest mashups taking advantage of mapping APIs. Read below for more information on each.

Covers four (4) mashup APIs:

Should we be marketing topic maps as “re-usable” mashups?

Or as possessing the ability to “recycle” mashups?

Connecting the Dots with Data Mashups (Webinar – 15th Jan. 2013)

Monday, January 14th, 2013

Connecting the Dots with Data Mashups (Webinar – 15th Jan. 2013)

The Briefing Room with Lyndsay Wise and Tableau Software

While Big Data continues to grab headlines, most information managers know there are many more “small” data sets that are becoming more valuable for gaining insights. That’s partly because business users are getting savvier at mixing and matching all kinds of data, big and small. One key success factor is the ability to create compelling visualizations that clearly show patterns in the data.

Register for this episode of The Briefing Room to hear Analyst Lyndsay Wise share insights about best practices for designing data visualization mashups. She’ll be briefed by Ellie Fields of Tableau Software who will demonstrate several different business use cases in which such mashups have proven critical for generating significant business value.

I am particularly interested in the use cases part of the presentation.

Topic maps, after all, are re-usable and reliable mashups.

Finding places that like mashups+ (aka, topic maps) is a good marketing move.

PS: It took several minutes to discover a link for the webinar that did not have lots of tracking garbage attached to it. I am considering not listing events without clean URLs to registration materials. What do you think?

ACLU maps cost of marijuana enforcement [Comparison]

Wednesday, August 29th, 2012

ACLU maps cost of marijuana enforcement

Washington spent more than $200 million on enforcing and prosecuting marijuana laws and incarcerating the folks that violated them, the American Civil Liberties Union of Washington estimates.

The organization released an interactive map today of what it estimates each county spent on marijuana law enforcement. Although not specifically tied to Initiative 502, which gives voters a chance to legalize marijuana use for adults under some circumstances, ACLU is a supporter of the ballot measure.

I have always wondered what motivation, other than fear of others having a good time, could drive something as inane as an anti-marijuana policy.

I think I may have a partial answer.

That old American standby – keeping down competition.

In describing the $425.7 million taken in by the Washington State Liquor Control Board, a map was given to show where the money went:

In Fiscal Year 2011, $345 million was sent to the General Fund, $71 million to cities and counties, $8.2 million to education and prevention, and $1.5 million to research. To see how much revenue your city or county received from the WSLCB in Fiscal Year 2011, visit [All the “where-your-liquor-dollars-go” links appear to be broken. They point to an FAQ and not the documentation.].

Consider Pierce County: Spending on anti-marijuana enforcement – $21,138,797.

If you can guess the direct URL to the county-by-county liquor proceeds (for Pierce County), you will find that in 2011 the entire county got $7,489,073.

I’m just a standards editor and semantic integration enthusiast and by no means a captain of industry.

But spending nearly three times the revenue from competitors to marijuana on anti-marijuana activities makes no business sense.
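
For anyone who wants to check the comparison, here is the arithmetic as a short sketch. It takes the two Pierce County figures quoted above at face value and assumes they are comparable; the variable names are mine.

```python
# Figures quoted above for Pierce County.
enforcement_spending = 21_138_797  # ACLU estimate of anti-marijuana spending
liquor_revenue = 7_489_073         # 2011 liquor proceeds received by the county

ratio = enforcement_spending / liquor_revenue
print(f"Enforcement spending is {ratio:.2f} times liquor revenue")
```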

If you can find the liquor revenue numbers for 2011, what other comparisons would you draw?

MyFCC Platform Enables Government Data Mashups

Thursday, December 22nd, 2011

MyFCC Platform Enables Government Data Mashups by Kin Lane.

The FCC just launched a new tool that allows any user to custom build a dashboard from a variety of FCC released data, tools and services, built on the FCC API. The tool, called MyFCC, lets you create a customized FCC online experience for quick access to the tools and information you feel are most important. MyFCC makes it possible to easily create, save and manage a customized page, choosing from a menu of 22 “widgets” such as latest headlines and official documents, the daily digest, FCC forms and online filings.

Once you have built your customized MyFCC page, you can share your work using popular social network platforms, or embed on any other website. The platform allows for each widget to independently be shared, or embed the entire dashboard into another site.

Modulo my usual comments about subject identification and reuse of identifications, this is at least a step in the right direction.

ProgrammableWeb – New APIs

Thursday, December 22nd, 2011

70 New APIs: Google Affiliate Network, Visual Search and Mobile App Sales Tracking by Wendell Santos.

This week we had 70 new APIs added to our API directory including an audio fingerprinting service, sentiment analysis and analytics service, affiliate marketing network, mobile app sales tracking service, visual search service and an eCommerce service. In addition we covered a “mobile engagement” platform adding revenue analytics to their service. Below are more details on each of these new APIs.

I have a question: ProgrammableWeb lists 4657 APIs (as of 22 December 2011, about 6:30 PM East Coast time) with seven (7) filters: Keywords, Category, Company, Protocols/Styles, Data Format, Date, Managed By. How easy/hard is that to use? Care to guess where the break point will come in terms of ease of use?

For example, choosing “government” as a category results in 154 APIs, a very uneven listing that runs from Leipzig city data to Brazilian election candidate information to words used in the U.S. Congress. Minimal organization by country would be nice.

Who’s Your Daddy?

Sunday, July 3rd, 2011

Who’s Your Daddy? (Genealogy and Corruption, American Style)

NPR (National Public Radio) News broadcast the opinion this morning that Brits are marginally less corrupt than Americans. Interesting question. Was Bonnie less corrupt than Clyde? Debate at your leisure but the story did prompt me to think of an excellent resource for tracking both U.S. and British style corruption.

It is probably all the talk of lineage in the news lately, but why not use the genealogy records that are gathered so obsessively to track the soft corruption of influence?

Just another data set to overlay on elected, appointed, and hired positions, lobbyists, disclosure statements, contributions, known sightings, congressional legislation and administrative regulations, etc. Could lead to a “Who’s Your Daddy?” segment on NPR where employment or contracts are questioned naming names. That would be interesting.

It also seems more likely to be effective than the “disclose your corruption” sunlight approach. Corruption is never confessed, it has to be rooted out.

Mashups using IBM Mashup Center 3.0

Monday, April 18th, 2011

Tutorial: Introduction to creating mashups using IBM Mashup Center 3.0

Since I am attending the One Mashboard to Rule Them All webinar this week (20th of April, 2011), I thought I would look at some other mashup solutions.

The tutorial appears to be complete and a good introduction to the IBM Mashup Center 3.0 tool.

I am particularly interested in Module H: Publishing and sharing your mashup.

In the lesson you are told:

In the Tags field, type customer, tutorial, and sales. Tagging helps ensure that others will be able to find your mashup when using these keywords in searches.

Correction: The last sentence should read:

In the Tags field, type customer, tutorial, and sales. Tagging helps ensure that others will be able to find your mashup when using only these keywords in searches.

That’s the mashup problem in a nutshell, isn’t it?

You are limited to the imagination of the mashup author in creating keywords for finding the mashup.

I may happen to use the same keywords but we all know that is a hit or miss proposition.

IBM should use its experience with Watson to supplement keyword lists with likely alternatives.

And the conditions when those alternatives are likely to be seen. (Dare we say properties of a subject?)
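
A minimal sketch of what supplementing the author's keywords might look like. The table of alternatives, the contexts attached to them, and the function names are all made up for illustration. This is not how IBM Mashup Center or Watson works, only one way to picture alternatives together with the conditions under which they are likely to be seen.

```python
# Hypothetical table of alternative keywords, each with the context in
# which the alternative is likely to be used. All entries are made up.
ALTERNATIVES = {
    "customer": [("client", "consulting"), ("account", "sales")],
    "sales": [("revenue", "finance")],
}


def expand_tags(tags, context=None):
    """Return the author's tags plus alternatives that fit the given context.

    With context=None every known alternative is included.
    """
    expanded = set(tags)
    for tag in tags:
        for alternative, condition in ALTERNATIVES.get(tag, []):
            if context is None or condition == context:
                expanded.add(alternative)
    return expanded


def matches(query, tags, context=None):
    """True if the query term hits the mashup's tags or their alternatives."""
    return query in expand_tags(tags, context)


author_tags = ["customer", "tutorial", "sales"]
print(matches("client", author_tags))             # found via an alternative
print(matches("client", author_tags, "finance"))  # not likely in this context
```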

One Mashboard to Rule Them All

Wednesday, April 13th, 2011

One Mashboard to Rule Them All

Webinar Overview: We’ll be showcasing real-world examples of Jaspersoft dashboards, adding to your already extensive technical knowledge. Dashboards, with their instant answers for executives and business users, and mashboards, ideal for integrating multiple data sources for improved organizational decision-making are among the most frequently requested BI deliverables. Join us for everything you wanted to know about Jaspersoft Platforms.

April 20, 2011 1:00 pm, Eastern Daylight Time (New York, GMT-04:00)
April 20, 2011 10:00 am, Pacific Daylight Time (San Francisco, GMT-07:00)
April 20, 2011 6:00 pm, Western European Summer Time (London, GMT+01:00)

There is an open source side to Jaspersoft.

Stats from the site:

206224 members
163 today
1707 last 7 days
6643 last 30 days
255 public projects
182 private projects
85193 forum entries

A community where I would like to pose the question: “How do you re-use a mashup created by someone else?”

And given that it has an open source side, a place to pose topic maps as an answer.


Tuesday, March 15th, 2011


Courtesy of, the WeatherSpark site is a graphic and historical representation of weather conditions.

WeatherSpark is a new type of weather website, with interactive weather graphs that allow you to pan and zoom through the entire history of any weather station on earth.

Get multiple forecasts for the current location, overlaid on records and averages to put it all in context.

Unlike some mashups, it is fairly apparent what is being used as a binding point, which would make re-use of this data easier.

For example, if I were looking for weak points in a transportation system, I would take the traffic accident/delay records and then map them against the weather records from this site.

Thereby enabling predictions of when and where disruptive activity would have the greatest multiplier effect from natural weather conditions, time of day, etc.
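
A rough sketch of that kind of mapping, assuming the binding point is a weather station identifier plus a date and that each delay record has already been tagged with its nearest station. The records, field names, and function below are invented for illustration and are not WeatherSpark data or its API.

```python
from collections import defaultdict

# Made-up records; (station, date) is assumed to be the binding point.
weather = [
    {"station": "KATL", "date": "2011-01-10", "condition": "ice"},
    {"station": "KATL", "date": "2011-01-11", "condition": "clear"},
]
delays = [
    {"station": "KATL", "date": "2011-01-10", "delay_minutes": 95},
    {"station": "KATL", "date": "2011-01-10", "delay_minutes": 40},
    {"station": "KATL", "date": "2011-01-11", "delay_minutes": 5},
]


def delays_by_condition(weather, delays):
    """Total delay minutes grouped by the weather condition on that day."""
    condition_at = {(w["station"], w["date"]): w["condition"] for w in weather}
    totals = defaultdict(int)
    for d in delays:
        key = (d["station"], d["date"])
        if key in condition_at:  # skip delays with no matching weather record
            totals[condition_at[key]] += d["delay_minutes"]
    return dict(totals)


print(delays_by_condition(weather, delays))
```

The join only works because both data sets agree on what identifies a station and a day, which is the point about an apparent binding point.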

AI Mashup Challenge 2011

Wednesday, February 23rd, 2011

AI Mashup Challenge 2011

Due date: 1 April 2011

The AI mashup challenge accepts and awards mashups that use AI technology, including but not restricted to machine learning and data mining, machine vision, natural language processing, reasoning, ontologies and the semantic web.
Imagine for example:

  • Information extraction or automatic text summarization to create a task-oriented overview mashup for mobile devices.
  • Semantic Web technology and data sources adapting to user and task-specific configurations.
  • Semantic background knowledge (such as ontologies, WordNet or Cyc) to improve search and content combination.
  • Machine translation for mashups that cross language borders.
  • Machine vision technology for novel ways of aggregating images, for instance mixing real and virtual environments.
  • Intelligent agents taking over simple household planning tasks.
  • Text-to-speech technology creating a voice mashup with intelligent and emotional intonation.
  • The display of PubMed articles on a map based on geographic entity detection referring to diseases or health centers.

The emphasis is not on providing and consuming semantic markup, but rather on using intelligence to mashup these resources in a more powerful way.

This looks like an opportunity for an application that assists users in explicit identification or confirmation of identification of subjects.

Rather than auto-correcting, human-correcting.

Assuming we can capture the corrections, wouldn’t that mean that our apps would incrementally get “smarter?” Rather than starting off from ground zero with each request? (True, a lot of analysis goes on with logs, etc. Why not just ask?)
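
A minimal sketch of capturing corrections, under the simplifying assumption that a confirmed identification can be keyed on the term alone. The class, its methods, and the example subjects are hypothetical.

```python
class IdentificationMemory:
    """Remember which subject a user confirmed for an ambiguous term."""

    def __init__(self):
        self._confirmed = {}  # term -> subject identifier chosen by a user

    def suggest(self, term, candidates):
        """Return a previously confirmed subject, or None if we must ask."""
        subject = self._confirmed.get(term.lower())
        return subject if subject in candidates else None

    def confirm(self, term, subject):
        """Record the user's answer so the next request starts ahead."""
        self._confirmed[term.lower()] = subject


memory = IdentificationMemory()
candidates = ["Paris, France", "Paris, Texas"]  # made-up candidate subjects

first = memory.suggest("Paris", candidates)    # nothing known yet, so ask
memory.confirm("Paris", "Paris, Texas")        # the user's correction
second = memory.suggest("paris", candidates)   # answered from memory
print(first, second)
```

Keying on the term alone is too crude for real use, since the same term can identify different subjects in different contexts. A fuller version would record the properties that led the user to that identification.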

DSPL: Dataset Publishing Language

Friday, February 18th, 2011

DSPL: Dataset Publishing Language

DSPL is the Dataset Publishing Language, a representation language for the data and metadata of datasets. Datasets described in this format can be processed by Google and visualized in the Google Public Data Explorer.


  • Use existing data: Just add an XML metadata file to your existing CSV data files
  • Powerful visualizations: Unleash the full capabilities of the Google Public Data Explorer, including the animated bar chart, motion chart, and map visualization
  • Linkable concepts: Link to concepts in other datasets or create your own that others can use
  • Multi-language: Create datasets with metadata in any combination of languages
  • Geo-enabled: Make your data mappable by adding latitude and longitude data to your concept definitions. For even easier mapping, link to Google’s canonical geographic concepts.
  • Fully open: Freely use the DSPL format in your own applications

For the details:

A couple quick observations:

It is geared towards data that can be captured in CSV files, which cover considerable and interesting data sets, but only a slice of all data.

It did not appear, on a quick scan of the tutorial or developer guide, to provide a way to specify properties for topics.

It did not appear to provide a way to specify when (or why) topics could be merged with one another.

Plus marks for enabling navigation by topics, but that is like complimenting a nautical map for having the compass directions isn’t it?

I think this could be a very good tool for investigating data or even, if you had a topic map, for showing illustrations of a sort to clients.

It is moving up in the stack, both virtual and actual, of reading materials on my desk.

Software for Non-Human Users?

Sunday, February 13th, 2011

The description of: Emerging Intelligent Data and Web Technologies (EIDWT-2011) is a call for software designed for non-human users.

The Social Life of Information by John Seely Brown and Paul Duguid makes it clear that human users don’t want to share data because sharing data represents a loss of power/status.

A poll of the readers of CACM or Computer would report a universal experience of working in an office where information is hoarded up by individuals in order to increase their own status or power.

9/11 was preceded, and has been followed to this day, by a non-sharing of intelligence data. Even national peril cannot overcome the non-sharing reflex with regard to data.

EIDWT-2011, and conferences like it, are predicated on a sharing of data known to not exist, at least among human users.

Hence, I suspect the call must be directed at software for non-human users.

Emerging Intelligent Data and Web Technologies (EIDWT-2011)

Sunday, February 13th, 2011

2nd International Conference on Emerging Intelligent Data and Web Technologies (EIDWT-2011)

The 2nd International Conference on Emerging Intelligent Data and Web Technologies (EIDWT-2011) is dedicated to the dissemination of original contributions that are related to the theories, practices and concepts of emerging data technologies and, most importantly, to their applicability in business and academia towards a collective intelligence approach. In particular, EIDWT-2011 will discuss advances in utilizing and exploiting data generated from emerging data technologies such as Data Centers, Data Grids, Clouds, Crowds, Mashups, Social Networks and/or other Web 2.0 implementations towards a collaborative and collective intelligence approach leading to advancements of virtual organizations and their user communities. This is because current and future Web and Web 2.0 implementations will store and continuously produce a vast amount of data, which, if combined and analyzed in a collective intelligence manner, will make a difference in organizational settings and their user communities. Thus, the scope of EIDWT-2011 is to discuss methods and practices (including P2P) which bring various emerging data technologies together to capture, integrate, analyze, mine, annotate and visualize data – made available from various community users – in a manner that is meaningful and collaborative for the organization. Finally, EIDWT-2011 aims to provide a forum for original discussion and prompt future directions in the area.

Important Dates:

Submission Deadline: March 10, 2011
Authors Notification: May 10, 2011
Author Registration: June 10, 2011
Final Manuscript: July 1, 2011
Conference Dates: September 7 – 9, 2011

The Silent “a” In Mashup

Wednesday, February 9th, 2011

The “a” in mashup is silent because mashups are missing information that is represented in a topic map by associations.

That isn’t necessarily a criticism of mashups. How much or how little information you represent in any data set or application is up to you.

It is helpful to have a framework for understanding what information you have included or excluded by explicit choice. Why you made those choices or on what basis is entirely up to you.

As of 08-02-2010, there are fifteen definitions of mashup in English reported by define:Mashup in Google.

Most of the definitions of mashup do not exclude (necessarily) what is defined as an association in a topic map, but the general theme is one of juxtaposition of data from different resources.

That juxtaposition leaves the following subjects undefined (at least explicitly):

  1. role players in an association (who play the roles in #2)
  2. roles in an association
  3. type of an association

Not to mention any TMCL (Topic Maps Constraint Language) constraints on those associations. (Something we will cover on another day.)

You can choose to leave subjects undefined, which is easier than defining them (witness the popularity of mashups), but there is a cost to leaving them undefined.

Defining or leaving subjects undefined is a decision that needs to take into account factors such as ease of authoring versus the very real cost of leaving subjects undefined, as well as other factors, such as your particular project’s requirements for maintenance, semantic integration and interchange.

For example, if the role players (#1 above) are left undefined in a mashup, what are the consequences?

From a topic map perspective, that means the role player subjects are not represented by topics, which means you cannot:

  1. attach other information about those subjects, such as variant names
  2. judge whether those are the same subjects as found in other associations
  3. find all the associations where those subjects are role players (since they are not explicitly identified)
  4. …among other things.

As I said, you can make that choice, but while it is easier (that is, less work), you also get less return from your mashup.

Another choice in a mashup, assuming that you identified the role players as topics, would be to simply not identify the roles they play in the mashup (read association).

If you don’t identify the roles as subjects (represented by topics), you can’t:

  1. compare those roles to roles in other mashups
  2. compare the roles being played by role players to roles they play in other associations
  3. discover associations with the same role players playing the same roles, but identified differently
  4. …among other things.

Assuming you have defined the role players and the roles they play, there remains the type of the association (read mashup), which could help you locate other associations (mashups) that would be of interest to you.

Even if you defined a type for a mashup, I am not real sure where you would put it. That’s not an issue with a topic map association. It has an explicit type.
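
To make the three pieces concrete, here is a small sketch of an association with an explicit type, explicit roles, and explicit role players. It is a deliberately simplified illustration, not the Topic Maps Data Model or any topic map engine's API; the class names, field names, and sample data are all mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    role_type: str  # the role being played, e.g. "employer"
    player: str     # identifier of the topic playing the role


@dataclass(frozen=True)
class Association:
    assoc_type: str  # explicit type of the association
    roles: tuple     # tuple of Role instances


def associations_with_player(associations, player):
    """Find every association in which the given topic plays some role."""
    return [a for a in associations if any(r.player == player for r in a.roles)]


def associations_of_type(associations, assoc_type):
    """Find associations by their explicit type."""
    return [a for a in associations if a.assoc_type == assoc_type]


# Made-up topics and associations, for illustration only.
associations = [
    Association("employment", (Role("employer", "acme"), Role("employee", "alice"))),
    Association("employment", (Role("employer", "acme"), Role("employee", "bob"))),
    Association("authorship", (Role("author", "alice"), Role("work", "report-7"))),
]

print(len(associations_with_player(associations, "alice")))
print(len(associations_of_type(associations, "employment")))
```

Neither lookup is possible against a mashup that only juxtaposes the data, because there is nothing explicit to search on.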

Mashups are easier to author than associations because they carry less information.

Which is a legitimate choice on your part.

What if after creating mashups we decide that it would be nice to add some more information?

Can topic maps help with that task?

We will take up the answer to that question tomorrow.


Thursday, January 27th, 2011

Another free data source. (Commercial plans also available.)

Large number of data sources and what looks like a friendly number of free API calls while you are building an application.

Observation: Finding one data source or project seems to lead to several others in the same area.

Definitely worth a visit.

PS: The abundance of online data sources opens the door to semantic mappings (can you say topic maps?) that enhance the value of these data sets.

Such as resolving the semantic impedance between the data sets.

Topic map artifacts as commercial products.

The trick is going to be discovering (and resolving) semantic impedances that people are willing to pay to avoid.

Big Data: Millionfold Mashups and the Shape of Data

Thursday, January 27th, 2011

Big Data: Millionfold Mashups and the Shape of Data

Philip (flip) Kromer talks about data, including the ninefold path to data enlightenment.

It is hard to pick the most interesting part of the presentation: when Philip said that human experts would need to do the heavy lifting to get to the semantic level, or when he said Infochimps is working on an “everything about” API.

I don’t think those are entirely consistent but it was an impressive presentation! Definitely worth the time to watch.

Found at MyNoSQL by Alex Popescu.