Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 26, 2010

Another Take on the Semantic Web?

Filed under: OWL,RDF,Semantic Web — Patrick Durusau @ 10:56 am

Bob Ferris constructs a take on the SW at: On Resources, Information Resources and Documents.

Whatever you think of Bob’s vision of the SW, the fundamental problem is one of requiring universal use of a flat identifier (URI).

Which leaves us with string comparison. Different string, different thing being identified.
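A minimal sketch of what identity by string comparison amounts to. Both URIs are hypothetical, invented for illustration, and the function name is mine, not anything from Bob's post.

```python
# Hypothetical URIs: two publishers who may well be naming the same person.
uri_a = "http://example.org/people/john_smith"
uri_b = "http://example.com/staff/jsmith"


def same_thing(left: str, right: str) -> bool:
    """Identity as a flat identifier sees it: equal strings, nothing more."""
    return left == right


print(same_thing(uri_a, uri_a))  # True
print(same_thing(uri_a, uri_b))  # False, even if both name the same person
```

Nothing in the comparison can tell us whether the second result is right or wrong about the world.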

Some of the better SW software now evaluates RDF graphs for identification of entities.

Not all that different from how we identify entities.

Departs from the URI = Identifier basis of the SW, but to be useful, that was inevitable.

Two more challenges face the SW (where topic maps can help, there are others):

1) How to communicate to other users what parts of an RDF graph to match for identity purposes? (including matching on subparts)

2) How to communicate to other users when non-Isomorphic RDF graphs are semantically equivalent?

More on those issues anon.

November 25, 2010

Sig.ma – Live views on the Web of Data

Filed under: Indexing,Information Retrieval,Lucene,Mapping,RDF,Search Engines,Semantic Web — Patrick Durusau @ 10:27 am

Sig.ma – Live views on the Web of Data

From the website:

In Sig.ma, elements such as large scale semantic web indexing, logic reasoning, data aggregation heuristics, pragmatic ontology alignments and, last but not least, user interaction and refinement, all play together to provide entity descriptions which become live, embeddable data mash ups.

Read one of various versions of an article on Sig.ma for the technical details.

From the Web Technologies article cited on the homepage:

Sig.ma revolves around the creation of Entity Profiles. An entity profile – which in the Sig.ma dataflow is represented by the “data cache” storage (Fig. 3) – is a summary of an entity that is presented to the user in a visual interface, or which can be returned by the API as a rich JSON object or a RDF document. Entity profiles usually include information that is aggregated from more than one source. The basic structure of an entity profile is a set of key-value pairs that describe the entity. Entity profiles often refer to other entities, for example the profile of a person might refer to their publications.
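To make the "set of key-value pairs aggregated from more than one source" idea concrete, here is a rough sketch. It is not Sig.ma's API or its JSON format; the source URLs, the keys, and the `build_profile` function are all assumptions made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical source data: each source contributes some key-value pairs.
sources = {
    "http://example.org/a": [("name", "Jane Doe"), ("affiliation", "Example University")],
    "http://example.org/b": [("name", "Jane Doe"), ("publication", "A Paper Title")],
}


def build_profile(sources):
    """Aggregate key-value pairs from several sources, keeping provenance."""
    profile = defaultdict(lambda: defaultdict(list))
    for source, pairs in sources.items():
        for key, value in pairs:
            # Record which source asserted this value for this key.
            profile[key][value].append(source)
    # Convert nested defaultdicts to plain dicts so the result serializes cleanly.
    return {key: dict(values) for key, values in profile.items()}


profile = build_profile(sources)
print(json.dumps(profile, indent=2))
```

Keeping the source next to each value is one way to let a user later reject a source, which is the kind of refinement the quoted passage hints at.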

No, this isn’t an implementation of the TMRM.

This is an implementation of one way to view entities for a particular type of data. A very exciting one but still limited to a particular data set.

This is a big step forward.

For example, it isn’t hard to imagine entity profiles against particular websites or data sets. Entity profiles that are maintained and leased for use with search engines like Sig.ma.

Or going a bit further and declaring a basis for identification of subjects, such as the existence of properties a…n in an RDF graph.

Questions:

  1. Spend a couple of hours with Sig.ma researching library related questions. (Discussion)
  2. What did you like, dislike or find surprising about Sig.ma? (3-5 pages, no citations)
  3. Entity profiles for library science (Class project)

Sig.ma: Live Views on the web of data – bibliography issues

I normally start with a DOI here so you can see the article in question.

Not here.

Here’s why:

Sig.ma: Live views on the Web of Data Journal of Web Semantics. (10 pages)

Sig.ma: Live Views on the Web of Data WWW ’10 Proceedings(demo, 4 pages)

Sig.ma: Live Views on the Web of Data (8 pages) http://richard.cyganiak.de/2008/papers/sigma-semwebchallenge2009.pdf

Sig.ma: Live Views on the Web of Data (4 pages) http://richard.cyganiak.de/2008/papers/sigma-demo-www2010.pdf

Sig.ma: Live Views on the Web of Data (25 pages) http://fooshed.net/paper/JWS2010.pdf

Before saying anything ugly, ;-), this is some of the most exciting research I have seen in a long time. I will cover that part of it in a following post. But, to the matter at hand, bibliographic control.

Five (5) different articles, two published in recognized journals, that all have the same name? (The demo articles are the same but have different headers/footers and page numbers, and so would likely be indexed as different articles.)
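A small sketch of why the shared title is a problem for any index keyed on it. The titles and page counts are the ones listed above; the venue labels for the three PDF-only listings and the `group_by` helper are my own shorthand.

```python
from collections import defaultdict

# The five listings from this post: (title, venue or note, page count).
listings = [
    ("Sig.ma: Live views on the Web of Data", "Journal of Web Semantics", 10),
    ("Sig.ma: Live Views on the Web of Data", "WWW '10 Proceedings (demo)", 4),
    ("Sig.ma: Live Views on the Web of Data", "richard.cyganiak.de PDF", 8),
    ("Sig.ma: Live Views on the Web of Data", "richard.cyganiak.de PDF", 4),
    ("Sig.ma: Live Views on the Web of Data", "fooshed.net PDF", 25),
]


def group_by(records, key):
    """Group records under whatever the key function returns."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return dict(groups)


by_title = group_by(listings, lambda r: r[0].casefold())
by_title_and_pages = group_by(listings, lambda r: (r[0].casefold(), r[2]))

print(len(by_title))            # 1: every listing collapses into one entry
print(len(by_title_and_pages))  # 4: the two 4-page demo copies still collide
```

Adding page count as a second property separates most of the listings, and happens to merge the two demo copies, but only by accident of this data.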

I will be able to resolve any confusion by obtaining the article in question.

But that isn’t an excuse.

I, along with everyone else interested in this research, will waste a small part of our time resolving the confusion. Confusion that could have been avoided for everyone.

Not unlike everyone who does the same search having to tread the same google glut.

With no way to pass on what we have resolved, for the benefit of others.

Questions:

  1. Help these authors out. How would you suggest they avoid this in the future? Use of the name is important. (3-5 pages, no citations)
  2. Help the library out. How will you deal with multiple papers with the same title, authors, pub year? (this isn’t uncommon) (3-5 pages, citations optional)
  3. How would you use topic maps to resolve this issue? (3-5 pages, no citations)

Virtuoso Open-Source Edition

Filed under: Linked Data,RDF,Semantic Web,Software — Patrick Durusau @ 7:06 am

Virtuoso Open-Source Edition

I ran across Virtuoso while running down the references in the article on SIREn. (Yes, I check references, not all of them, just the most interesting ones, as time permits.)

Has partial support for a variety of “Semantic Web” technologies.

Is the basis for OpenLink Data Spaces.

A named structured data cluster within a distributed data network where each item of data (each “datum”) has a unique identifier. Fundamental characteristics of data spaces include:

  • Each Data Item (or Entity) is endowed with a unique HTTP-based Identifier
  • Entity Identity, Access, and Representation are each distinct from the others
  • Entities are interlinked via attributes and relationship properties
  • Creation, Update, and Deletion privileges are controlled by the space owner

I can think of lots of “data spaces” that don’t fit this description: Large Hadron Collider data, radio and optical astronomy data dumps, TCP/IP data streams, bioinformatics data, commercial transaction databases. Please submit your own.

Still, if you want to learn the ins and outs as well as the limitations of this approach, it costs nothing more than the time to download the software.

A Node Indexing Scheme for Web Entity Retrieval

Filed under: Entity Resolution,Full-Text Search,Indexing,Lucene,RDF,Topic Maps — Patrick Durusau @ 6:15 am

A Node Indexing Scheme for Web Entity Retrieval Author(s): Renaud Delbru, Nickolai Toupikov, Michele Catasta, Giovanni Tummarello Keywords: entity, entity search, full-text search, semi-structured queries, top-k query, node indexing, incremental index updates, entity retrieval system, RDF, RDFa, Microformats

Abstract:

Now motivated also by the partial support of major search engines, hundreds of millions of documents are being published on the web embedding semi-structured data in RDF, RDFa and Microformats. This scenario calls for novel information search systems which provide effective means of retrieving relevant semi-structured information. In this paper, we present an “entity retrieval system” designed to provide entity search capabilities over datasets as large as the entire Web of Data. Our system supports full-text search, semi-structural queries and top-k query results while exhibiting a concise index and efficient incremental updates. We advocate the use of a node indexing scheme and show that it offers a good compromise between query expressiveness, query processing time and update complexity in comparison to three other indexing techniques. We then demonstrate how such system can effectively answer queries over 10 billion triples on a single commodity machine.

Consider the requirements for this project:

  1. Support for the multiple formats which are used on the Web of Data;
  2. Support for searching an entity description given its characteristics (entity centric search);
  3. Support for context (provenance) of information: entity descriptions are given in the context of a website or a dataset;
  4. Support for semi-structural queries with full-text search, top-k query results, scalability over shard clusters of commodity machines, efficient caching strategy and incremental index maintenance.
(emphasis added)

SIREn { Semantic Information Retrieval Engine }

Definitely a package to download, install and start to evaluate. More comments forthcoming.

Questions (more for topic map researchers)

  1. To what extent can “entity description” = properties of topics, associations, occurrences?
  2. Can XTM, etc., be regarded as “microformats” for the purposes of SIREn?
  3. To what extent does SIREn meet or exceed query requirements for XTM/TMDM based topic maps?
  4. Reports on use of SIREn by topic mappers?

November 23, 2010

Querying the British National Bibliography

Filed under: British National Bibliography,Dataset,RDF,Semantic Web,SPARQL — Patrick Durusau @ 9:40 am

Querying the British National Bibliography

From the webpage:

Following up on the earlier announcement that the British Library has made the British National Bibliography available under a public domain dedication, the JISC Open Bibliography project has worked to make this data more useable.

The data has been loaded into a Virtuoso store that is queriable through the SPARQL Endpoint and the URIs that we have assigned each record use the ORDF software to make them dereferencable, supporting perform content auto-negotiation as well as embedding RDFa in the HTML representation.

The data contains some 3 million individual records and some 173 million triples. …

The data is also available for local processing but it isn’t much of a “web” if the first step is to always download a local copy of the data.

It should be interesting to watch for projects that combine the results of queries against this data with the results of other queries against other data sets. Particularly if those other data sets follow different metadata regimes.

Isn’t that the indexing problem all over again?

Questions:

  1. What data set would you want to combine with British National Bibliography (BNB)?
  2. What issues do you see arising from combining the BNB with your data set? (3-5 pages, no citations)
  3. Combining the BNB with another data set. (project)

November 18, 2010

A Direct Mapping of Relational Data to RDF

Filed under: Ambiguity,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 7:15 pm

A Direct Mapping of Relational Data to RDF

A major step towards putting relational data “on the web.”

Identifying what that data means and providing a basis for reconciling it with other data remains to be addressed.
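To show how mechanical such a mapping is, here is a simplified sketch in the spirit of a direct mapping: one subject per row, one predicate per column. It is not the W3C algorithm itself (no percent-encoding, no foreign keys, no datatypes), and the base IRI, table, and column names are hypothetical.

```python
def row_to_triples(base, table, primary_key, row):
    """Turn one relational row (a dict) into (subject, predicate, object) triples."""
    subject = f"{base}{table}/{primary_key}={row[primary_key]}"
    triples = [(subject, "rdf:type", f"{base}{table}")]
    for column, value in row.items():
        if value is None:
            # A NULL column yields no triple at all.
            continue
        triples.append((subject, f"{base}{table}#{column}", value))
    return triples


# Hypothetical table "People" with primary key "ID".
row = {"ID": 7, "fname": "Bob", "addr": None}
triples = row_to_triples("http://example.org/db/", "People", "ID", row)
for triple in triples:
    print(triple)
```

The output says that row 7 has an `fname` of "Bob". It says nothing about what `fname` means or how it relates to a `first_name` column in someone else's database.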

URIs and Identity

Filed under: Ambiguity,RDF,Semantic Web,Subject Identity,Topic Maps — Patrick Durusau @ 6:55 pm

If I read Halpin and others correctly, URIs identify the subjects they identify, except when they identify some other subject and it isn’t possible to know which of any number of subjects is being identified.

That is what I (and others) take as “ambiguity.”

Some readers have taken my comments on URIs to be critical of RDF, which wasn’t my intent.

What I object to is the sentiment that everyone should use only URIs and then cherry pick any RDF graph that may result for identity purposes.

For example, in a family tree, there may be an entry: John Smith.

For which we can create: http://myfamilytree.smith.com/john_smith

That may resolve to an RDF graph but what properties in that graph identify a particular John Smith?

A “uniform” syntax for that “identifier” isn’t helpful if we all reach various conclusions about what properties in the graph to use for identification.

Or if we have different tests to evaluate the values of those properties.

Even with an RDF graph and rules for which properties to evaluate, we may still have ambiguity.

But rules for evaluation of RDF graphs for identity lessen the ambiguity.

All within the context, format, data model of RDF.

It does detract from URIs as identifiers but URIs as identifiers are no more viable than any single token as an identifier.

Sets of key/value pairs, which are made up of tokens, have the potential to lessen ambiguity, but not banish it.
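A rough sketch of how the choice of properties, and the choice of test, changes the answer. The two descriptions, the property names, and the `same_subject` function are hypothetical; plain dictionaries stand in for RDF graphs.

```python
# Two hypothetical descriptions, as key/value pairs, behind two different URIs.
smith_a = {"name": "John Smith", "born": "1850", "birthplace": "Leeds"}
smith_b = {"name": "John Smith", "born": "1882", "birthplace": "Leeds"}


def same_subject(left, right, properties, test=lambda a, b: a == b):
    """Apply an identity rule: each listed property must be present on both
    sides and its two values must pass the test."""
    return all(
        p in left and p in right and test(left[p], right[p])
        for p in properties
    )


def same_century(a, b):
    """A looser test on year strings: compare only the first two digits."""
    return a[:2] == b[:2]


print(same_subject(smith_a, smith_b, ["name"]))                # True
print(same_subject(smith_a, smith_b, ["name", "born"]))        # False
print(same_subject(smith_a, smith_b, ["born"], same_century))  # True
```

Same data, three rules, two different conclusions. Publishing the rule along with the graph does not remove the ambiguity, but it tells the next reader which conclusion was intended.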

November 16, 2010

In Defense of Ambiguity

Filed under: OWL,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 5:49 pm

In Defense of Ambiguity by Patrick J. Hayes and Harry Halpin was cited in David Booth’s article, so like any academic, I had to go read the cited paper. 😉

Highly recommended.

The authors conclude:

Regardless of the details, the use of any technology in Web architecture to distinguish between access and reference, including our proposed ex:refersTo and ex:describedBy, does nothing more than allow the author of a URI to explain how they would like the URI to be used. Ultimately, there is nothing that Web architecture can do to prevent a URI from being used to refer to some thing non-accessible. However, at least having a clear and coherent device, such as a few RDF predicates, would allow the distinction to be made so the author could give guidance on what they believe best practice for their URI would be. This would vastly improve the situation from where it is today, where this distinction is impossible. The philosophical case for the distinction between reference and access is clear. The main advantage of Web architecture is that there is now a de facto universal identification scheme for accessing networked resources. With the Semantic Web, we can now extend this scheme to the wide world outside the Web by use of reference. By keeping the distinction between reference and access clear, the lemons of ambiguity can be turned into lemonade. Reference is inherently ambiguous, and ambiguity is not an error of communication, but fundamental to the success of communication both on and off the Web.

Sounds like the distinction between subject locators and identifiers that topic maps made long before this paper was written.

Resource Identity and Semantic Extensions: Making Sense of Ambiguity

Filed under: OWL,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 5:29 pm

Resource Identity and Semantic Extensions: Making Sense of Ambiguity. David Booth’s paper was cited by Bernard Vatant, so I had to go take a look.

Bernard says: “The best analysis of the issue I’ve read so far.” I have to agree.

From the paper’s conclusion:

In general, a URI’s resource identity will necessarily be ambiguous. But this is not the end of the world. Rather, it means that while it may be unambiguous enough for one application, another application may require finer distinctions and thus consider it ambiguous. However, this ambiguity of resource identity can be precisely constrained by the use of URI declarations. Finally, a standard process is proposed for determining a URI’s resource identity.

Ambiguity is part and parcel of any system and the real question is how much can you tolerate?

For some systems that is quite a bit; for others (air traffic control comes to mind), as little as possible.

Other identifiers are ambiguous as well.

Successful integration of data across systems depends on how well we deal with that ambiguity.

November 8, 2010

BibBase and Beyond

Filed under: BibTeX,OWL,RDF,Semantic Web — Patrick Durusau @ 8:38 am

BibBase is an effort to store BibTeX information as RDF triples. For the data, see: BibBase data.

As of 8 November 2010, there are 6178 publications.

Interesting I suppose but the real question is how to enable researchers using BibTeX to disambiguate their terminology as part of their BibTeX entry?

Has to be as easy as BibTeX and consistent with usage patterns in the communities that use it. If you hope for adoption.

Not hard to imagine a helper application that runs through a set of BibTeX entries and suggests 1998 ACM Computing Classification System or 2010 Mathematics Subject Classification entries. Entries which the author could accept or reject.

Not the fine grained, concept by concept (read subject by subject) analysis of a document that I would like to see, but it’s a start.
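A very rough sketch of such a helper, assuming the BibTeX entries have already been parsed into dictionaries. The keyword table is invented for illustration (a real helper would need far better matching), and the function names are hypothetical. The category labels are meant to be 1998 ACM CCS entries.

```python
# Illustrative keyword table: the keyword choices are my own guesses.
SUGGESTIONS = {
    "retrieval": "H.3.3 Information Search and Retrieval",
    "ontology": "I.2.4 Knowledge Representation Formalisms and Methods",
    "data mining": "H.2.8 Database Applications",
}


def suggest_classes(entry):
    """Suggest classification labels whose keyword appears in the entry text."""
    text = " ".join(
        entry.get(field, "") for field in ("title", "keywords", "abstract")
    ).lower()
    return sorted({label for keyword, label in SUGGESTIONS.items() if keyword in text})


def review(suggestions, accept):
    """Keep only the suggestions the author accepts."""
    return [s for s in suggestions if accept(s)]


entry = {"title": "Entity Retrieval over an Ontology of Places", "keywords": "search"}
suggested = suggest_classes(entry)
kept = review(suggested, accept=lambda label: label.startswith("H."))
print(suggested)
print(kept)
```

The `accept` callback stands in for the author saying yes or no; in practice that would be a prompt in whatever editor the author already uses for BibTeX.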

ISWC 2010 Data and Demos

Filed under: Linked Data,RDF,Semantic Web,SPARQL — Patrick Durusau @ 6:27 am

ISWC 2010 Data and Demos.

Data and demos from the International Semantic Web Conference 2010. Includes links to prior data sets and browsers that work with the data sets.

Data sets are always important, as is being able to gauge the current state of semantic software.

Ambiguity and Linked Data URIs

Filed under: Ambiguity,Linked Data,Marketing,RDF,Semantic Web,Topic Maps — Patrick Durusau @ 6:14 am

I like the proposal by Ian Davis to avoid the 303 cloud while trying to fix the mistake of confusing identifiers with addresses in an address space.

Linked data URIs are already known to be subject to the same issues of ambiguity as any other naming convention.

All naming conventions are subject to ambiguity and “expanded” naming conventions, such as a list of properties in a topic map, may make the ambiguity a bit more manageable.

That depends on a presumption that if more information is added and a user advised of it, the risk of ambiguity will be reduced.

But the user needs to be able to use the additional information. What if the additional information is to distinguish two concepts in calculus and the reader is innocent of even basic algebra?

That is to say, ambiguity can be overcome only in particular contexts.

But overcoming ambiguity in a particular context may be enough. Such as:

  • Interchange between intelligence agencies
  • Interchange between audited entities and their auditors (GAO, SEC, Federal Reserve (or their foreign equivalents))
  • Interchange between manufacturers and distributors

None of those are the golden age of seamless knowledge sharing and universal democratization of decision making or even scheduling tennis matches sort of applications.

They are applications that can reduce incremental costs, improve overall efficiency and perhaps contribute to achievement of organizational goals.

Perhaps that is enough.

A Guide to Publishing Linked Data Without Redirects – Post

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 5:34 am

A Guide to Publishing Linked Data Without Redirects is a proposal by Ian Davis to avoid the 303 while distinguishing between “things” and their descriptions.

A step in the right direction.

November 4, 2010

Is 303 Really Necessary? – Blog Post

Filed under: Linked Data,RDF,Semantic Web,Uncategorized — Patrick Durusau @ 9:46 am

Is 303 Really Necessary?.

Ian Davis details at length why 303’s are unnecessary and offers an interesting alternative.

Read the comments as well.

November 3, 2010

The Semantic Web Garden of Eden

Filed under: Marketing,RDF,Semantic Web,Topic Maps — Patrick Durusau @ 6:48 pm

The Garden of Eden:

[2:19] And out of the ground the LORD God formed every beast of the field, and every fowl of the air; and brought them unto Adam to see what he would call them: and whatsoever Adam called every living creature, that was the name thereof….[1]

As the number of Adams and Eves multiplied, so did the names of things.

Multiple names for the same things, different things with the same names.

Ambiguity had entered the world.

The Semantic Web Garden of Eden sought to banish ambiguity:

…by an RDF statement having…URIrefs are used to identify not only the subject of the original statement, but also the predicate and object, instead of using the words “creator” and “John Smith” [2]

As the number of URIs multiplied, so did the URIs of things.

Multiple URIs for the same things, different things with the same URIs.

Ambiguity remains in the world.

******
[1] Genesis 2:19
[2] RDF Primer, 2.2 RDF Model, http://www.w3.org/TR/rdf-primer/

Weaknesses In Linked Data

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 6:47 pm

A Partnership between Structured Data and Ontotext to address weaknesses in linked data framed it this way:

Volumes of linked data on the Web are growing. This growth is exposing three key weaknesses:

  1. inadequate semantics for how to link disparate information together that recognizes inherently different contexts and viewpoints and (often) approximate mappings
  2. misapplication of many linking predicates, such as owl:sameAs, and
  3. a lack of coherent reference concepts by which to aggregate and organize this linkable content.

The amount of linked data is trivial compared to the total volume of digital data.

Makes me wonder about the “only the web will scale” argument.

Questions:

  1. How do these three “key weaknesses” compare to current barriers to semantic integration? (3-5 pages, no citations)
  2. “inadequate semantics?” What’s wrong with the semantics we have now? Or is the point that formal semantics are inadequate? (discussion)
  3. “coherent reference concepts?” How would you recognize one if you saw it? (3-5 pages, no citations)

November 1, 2010

Rule Markup Initiative

Filed under: RDF,RuleML,Semantic Web — Patrick Durusau @ 4:48 pm

Rule Markup Initiative

From the website:

The RuleML Initiative is an international non-profit organization covering all aspects of Web rules and their interoperation, with a Structure and Technical Groups that center on RuleML specification, tool, and application development. Around RuleML, an open network of individuals and groups from both industry and academia has emerged, having a shared interest in modern rule topics, including the interoperation of Semantic Web rules. The RuleML Initiative has been collaborating with OASIS on Legal XML, Policy RuleML, and related efforts since 2004. The Initiative has further been interacting with the developers of ISO Common Logic (CL), which became an International Standard, First edition, in October 2007. RuleML is also a member of OMG, contributing to its Semantics of Business Vocabulary and Business Rules (SBVR), which went into Version 1.0 in January 2008, and to its Production Rule Representation (PRR), which went into Version 1.0 in December 2009. Moreover, participants of the RuleML Initiative have supported the development of the W3C Rule Interchange Format (RIF), which attained Recommendation status in June 2010. The annual RuleML Symposium has taken the lead in bringing together delegates from industry and academia who share this interest focus in Web rules.

Questions:

  1. Does the use of ISO Common Logic ensure interoperability? Why/Why not? (discussion)
  2. How would you define interoperability? (3-5 pages, no citations)
  3. Can rules ensure your definition of interoperability? (discussion)
  4. Are rules subject to the same semantic drift as data? Why/Why not?(3-5 pages, no citations)

October 31, 2010

R2RML: RDB to RDF Mapping Language

Filed under: RDF,Semantic Web — Patrick Durusau @ 8:13 pm

R2RML: RDB to RDF Mapping Language

Abstract:

This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author’s choice. R2RML mappings are themselves RDF graphs and written down in Turtle syntax. R2RML enables different types of mapping implementations: processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.

First draft from the RDB2RDF working group.

Questions:

  1. Select a table from two (or three) databases in a common area with different schemas.
  2. Convert the tables using the latest version of this proposal to RDF datasets.
  3. On what basis would you integrate the resulting RDF datasets into a single RDF dataset?

October 13, 2010

Semantic Drift: What Are Linked Data/RDF and TMDM Topic Maps Missing?

Filed under: Linked Data,RDF,Subject Identifiers,Subject Identity,Topic Maps — Patrick Durusau @ 9:38 am

One RDF approach to semantic drift is to situate a vocabulary among other terms.

TMDM topic maps enable a user to gather up information that they consider as identifying the subject in question.

Additional information helps to identify a particular subject. (RDF/TMDM approaches)

Isn’t that the opposite of semantic drift?

What’s happening in both cases?

The RDF approach is guessing that it has the sense of the word as used by the author (if the right word at all).

Kleb reports approximately 48% precision.

So in 1 out of 2 emergency room situations we get the right term? (Not to knock Kleb’s work. It is an important approach that needs further development.)

Topic maps are guessing as well.

We don’t know what information in a subject identifier identifies a subject. Some of it? All of it? Under what circumstances?

Question: What information identifies a subject, at least to its author?

Answer: Ask the Author.

Asking authors what information identifies their subject(s) seems like an overlooked approach.

Domain specific vocabularies with additional information about subjects that indicates the information that identifies a subject versus merely supplemental information would be a good start.

That avoids inline syntax difficulties and enables authors to easily and quickly associate subject identification information with their documents.

Both RDF and TMDM Topic Maps could use the same vocabularies to improve their handling of associated document content.

October 11, 2010

Semantic Drift: An RDF Answer (sort-of)

Filed under: RDF,Semantic Web,Subject Identity — Patrick Durusau @ 7:27 am

As promised last week, there are RDF researchers working on issues related to semantic drift.

An interesting approach can be found in: Entity Reference Resolution via Spreading Activation on RDF-Graphs Author(s): Joachim Kleb, Andreas Abecker

Abstract:

The use of natural language identifiers as reference for ontology elements—in addition to the URIs required by the Semantic Web standards—is of utmost importance because of their predominance in the human everyday life, i.e.speech or print media. Depending on the context, different names can be chosen for one and the same element, and the same element can be referenced by different names. Here homonymy and synonymy are the main cause of ambiguity in perceiving which concrete unique ontology element ought to be referenced by a specific natural language identifier describing an entity. We propose a novel method to resolve entity references under the aspect of ambiguity which explores only formal background knowledge represented in RDF graph structures. The key idea of our domain independent approach is to build an entity network with the most likely referenced ontology elements by constructing steiner graphs based on spreading activation. In addition to exploiting complex graph structures, we devise a new ranking technique that characterises the likelihood of entities in this network, i.e. interpretation contexts. Experiments in a highly polysemic domain show the ability of the algorithm to retrieve the correct ontology elements in almost all cases.

It is the situating of a concept in a context (not assignment of a URI) that enables the correct result in a polysemic domain.
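For readers who have not met spreading activation, here is a generic sketch of the idea. It is not Kleb and Abecker's algorithm (no Steiner graphs, no ranking technique); the graph, the decay value, and the function name are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical graph: candidate entities and their neighbours.
graph = {
    "Paris (France)": ["France", "Seine"],
    "Paris (Texas)": ["Texas"],
    "France": ["Paris (France)", "Seine"],
    "Seine": ["Paris (France)", "France"],
    "Texas": ["Paris (Texas)"],
}


def spread(graph, seeds, decay=0.5, steps=2):
    """Push activation outward from the seed nodes, damped by `decay` per hop
    and split evenly among each node's neighbours."""
    activation = defaultdict(float)
    frontier = {seed: 1.0 for seed in seeds}
    for node, energy in frontier.items():
        activation[node] += energy
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbours = graph.get(node, [])
            for neighbour in neighbours:
                next_frontier[neighbour] += energy * decay / len(neighbours)
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# "Seine" plays the role of a context term found near the ambiguous name "Paris".
scores = spread(graph, seeds=["Seine"])
candidates = ["Paris (France)", "Paris (Texas)"]
best = max(candidates, key=lambda c: scores.get(c, 0.0))
print(best)
```

The candidate that the context reaches wins. No URI was consulted to decide which Paris was meant, only the neighbourhood of each candidate.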

This doesn’t directly model semantic drift but does represent anchoring a term in a particular context.

The questions that divide semantic technologies are:

  • Who throws the anchor?
  • Who governs the anchors?
  • Can there be more than one anchor?
  • What about “my” anchor?
  • …and others

More on those anon.

October 6, 2010

The RelFinder user interface: interactive exploration of relationships between objects of interest

Filed under: Associations,Interface Research/Design,RDF,Semantic Web,Software — Patrick Durusau @ 7:00 am

The RelFinder user interface: interactive exploration of relationships between objects of interest Authors: Steffen Lohmann, Philipp Heim, Timo Stegemann, Jürgen Ziegler Keywords: dbpedia, decision support, graph visualization, linked data, relationship discovery, relationship web, semantic user interfaces, semantic web, sparql, visual exploration

Abstract:

Being aware of the relationships that exist between objects of interest is crucial in many situations. The RelFinder user interface helps to get an overview: Even large amounts of relationships can be visualized, filtered, and analyzed by the user. Common concepts of knowledge representation are exploited in order to support interactive exploration both on the level of global filters and single relationships. The RelFinder is easy-to-use and works on every RDF knowledge base that provides standardized SPARQL access

Software: RelFinder

RelFinder presents a way to leverage data already in RDF for the creation of associations in topic maps.

Or to explore data already available in RDF.

Exploration of relationships is important for “data” but even more important for the syntaxes that contain data.

Such as equivalence between subjects represented by syntax tokens.

October 5, 2010

Grist For Topic Map Mills: German National Library – Authority Files

Filed under: Dataset,Linked Data,RDF,Semantic Web — Patrick Durusau @ 6:09 am

German National Library – Authority Files (linked data)

A post from Lars Svensson announced the release of authority files from the German National Library:

The German National Library (DNB) has published the German library authority files as linked data. The dataset consists of 1.8 Mill differentiated persons from the PND (Personennamendatei, Name authority file), 187.000 subject headings from the SWD (Schlagwortnormdatei, Subject headings authority file), 1.3 Mill corporate bodies from the GKD (Gemeinsame Körperschaftsdatei, Corporate Body Authority file), and 51,000 classes from the German translation of the Dewey Decimal Classification (DDC).

Library students should take particular note of the subject heading and Dewey Decimal Classification materials.

For topic mappers, another set of identifiers that can be mapped between the data sets shown by the data cloud as well as those that don’t use URIs as identifiers (the vast majority of data).

This will also be of interest to the linked data community.

September 23, 2010

KP-Lab Knowledge Practices Lab

Filed under: Interface Research/Design,RDF,Semantic Web,Software — Patrick Durusau @ 7:06 am

KP-Lab Knowledge Practices Lab.

KP-Lab project design and implement a modular, flexible, and extensible ICT system that supports pedagogical methods to foster knowledge creation in educational and workplace settings. The system provides tools for collaborative work around shared objects, and for knowledge practices in the various settings addressed by the project.

It offers the following tools:

  • Knowledge Practices Environment (KPE)
  • The Visual Modeling (Language) Editor
  • Activity System Design Tools (ASDT)
  • Semantic Multimedia Annotation tool (SMAT)
  • Map-It and M2T (meeting practices)
  • The CASS-Query tool
  • The CASS-Memo tool
  • Awareness Services
  • RDF Suite
  • KMS-Persistence API
  • Text Mining Services

Pick any one of these tools and name five (5) things you like about it and five (5) things you dislike about it. How would you change the things you dislike? (General prose description is sufficient.)

September 19, 2010

Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data

Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data

Destined to be a deeply influential resource.

Read the paper, use the Chem2Bio2RDF application for a week, then answer these questions:

  1. Choose three (3) subjects that are identified in this framework.
  2. For each subject, how is it identified in this framework?
  3. For each subject, have you seen it in another framework or system?
  4. For each subject seen in another framework/system, how was it identified there?

Extra credit: What one thing would you change about any of the identifications in this system? Why?

July 22, 2010

Queries and Linked Data

Filed under: Linked Data,RDF — Patrick Durusau @ 6:30 pm

Federated Data Management and Query Optimization for Linked Open Data by Olaf Görlitz and Steffen Staab and,

A Database Perspective on Consuming Linked Data on the Web by Olaf Hartig and Andreas Langegger,

are two recent publications on querying linked data that will repay close study as we prepare to discuss TMQL in Leipzig.

Linked data is a way of organizing subjects. A way that topic maps will encounter in the (still) heterogeneous world.

July 8, 2010

Keeping Up With The “Competition”

Filed under: RDF,Semantic Web — Patrick Durusau @ 8:29 pm

New opportunities for linked data nose-following is a blog post from the W3C about three (3) new IETF RFCs.

Well, or at least two of them. As of 8:55 PM my local time, 2010-07-08, “Defining Well-Known URIs” has the following URI, http://www.ietf.org/html/draft-nottingham-site-meta-05. Err, that doesn’t look right.

When it didn’t resolve I thought perhaps it was a redirect.

Nothing that complicated, just a bad URI. I got the IETF “404: Page Not Found” page.

Oh, the correct URI? Defining Well-Known URIs, http://www.rfc-editor.org/rfc/rfc5785.txt.

So, what is a well-known URI?

A well-known URI is a URI [RFC3986] whose path component begins with the characters “/.well-known/”, and whose scheme is “HTTP”, “HTTPS”, or another scheme that has explicitly been specified to use well-known URIs.

Applications that wish to mint new well-known URIs MUST register them, following the procedures in Section 5.1.

Wait for it….

5.1. The Well-Known URI Registry

This document establishes the well-known URI registry.

Well-known URIs are registered on the advice of one or more Designated Experts (appointed by the IESG or their delegate), with a Specification Required (using terminology from [RFC5226]). However, to allow for the allocation of values prior to publication, the Designated Expert(s) may approve registration once they are satisfied that such a specification will be published.

Well, that’s a relief! We are going to have Designated Expert(s) sitting in judgment over “well-known” URIs.

We just narrowly escaped being able to judge for ourselves which URIs are worth treating as “well-known”.

Good thing we have TBL, the W3C and Designated Experts to keep us safe.
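For anyone who wants to see how little the definition quoted above actually asks of a URI, here is a minimal sketch in Python. It only tests the two syntactic conditions from the quoted text (an HTTP or HTTPS scheme and a path that begins with “/.well-known/”). The function name `is_well_known` and the example.org URIs are my own illustrations, not anything from the RFC, and the sketch makes no attempt to consult the registry or to model other schemes that might opt in.

```python
from urllib.parse import urlsplit

# Schemes named explicitly in the quoted definition. Other schemes could be
# specified to use well-known URIs; this sketch does not model that.
WELL_KNOWN_SCHEMES = {"http", "https"}


def is_well_known(uri: str) -> bool:
    """Return True if the URI meets the syntactic definition quoted above.

    Hypothetical helper: it checks scheme and path prefix only, and says
    nothing about whether the URI has actually been registered.
    """
    parts = urlsplit(uri)
    return (
        parts.scheme.lower() in WELL_KNOWN_SCHEMES
        and parts.path.startswith("/.well-known/")
    )


if __name__ == "__main__":
    for candidate in (
        "http://example.org/.well-known/host-meta",
        "http://example.org/docs/.well-known/host-meta",
        "ftp://example.org/.well-known/host-meta",
    ):
        print(candidate, is_well_known(candidate))
```

Note what the check cannot tell you: whether a Designated Expert has approved the name. That part lives in the registry, not in the string.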

*******
Update: 2010-07-09

I was worried that, since the “Defining Well-Known URIs” RFC was dated in April, this was some complicated spoof or joke. I even checked the cross-linking in the RFC but finally erred on the side of saying it was real.

That judgment was confirmed this morning: I learned that the page “went dark” briefly last night, and when I checked, the incorrect URL that I reported above had been corrected, silently.

W3C blog, goes dark, comes back with correct information, all signs that this must be genuine. Or at least it is being reported as such.

July 7, 2010

Second Verse, Same As The First

Filed under: Marketing,RDF,Semantic Diversity,Semantic Web,Semantics — Patrick Durusau @ 2:44 pm

Unraveling Algol: US, Europe, and the Creation of a Programming Language by David Nofre, University of Amsterdam, is an interesting account of the early history of Algol.

The conventional wisdom that what evolved was Algol vs. Fortran is deeply questionable.

The underlying difficulty, a familiar one in semantic integration circles, was a universal programming language versus a diversity of programming languages.

Can you guess who won?

Can you guess where I would put my money in a repeat of a universal solution vs. diverse solutions?

Where is your money riding?

July 5, 2010

Closed World vs. Open World: the First Semantic Web Battle – From Stefano’s Linotype

Filed under: OWL,RDF,Semantic Web — Patrick Durusau @ 7:20 pm

Closed World vs. Open World: the First Semantic Web Battle from Stefano’s Linotype is well worth your time.

See also Stack or Two Towers. Seems like one universal world view may not be as robust as previously thought.

Interesting that non-universal treatment of “doubt” may split the Semantic Web into incompatible parts. Can you say fragile?

June 30, 2010

Scientists Develop World’s Fastest Program to Find Patterns in Social Networks – News

Filed under: RDF,Search Engines,Searching — Patrick Durusau @ 6:56 pm

Scientists Develop World’s Fastest Program to Find Patterns in Social Networks.

Actually the paper title is: COSI: Cloud Oriented Subgraph Identification in Massive Social Networks

Either way, this looks important for topic map fans.

How important?

The authors:

show our framework works efficiently, answering many complex queries over a 778M edge real-world SN dataset derived from Flickr, LiveJournal, and Orkut in under one second.

That important!

If you think about topic maps less as hand-curated XML syntax artifacts and more as interactively and probabilistically created mappings into complex subject spaces, then the importance of this research becomes even clearer.
