Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

April 28, 2013

4th Open PHACTS Community Workshop (slides) [Operational Equivalence]

Filed under: Bioinformatics,Biomedical,Drug Discovery,Linked Data,Medical Informatics — Patrick Durusau @ 12:24 pm

4th Open PHACTS Community Workshop : Using the power of Open PHACTS

From the post:

The fourth Open PHACTS Community Workshop was held at Burlington House in London on April 22 and 23, 2013. The Workshop focussed on “Using the Power of Open PHACTS” and featured the public release of the Open PHACTS application programming interface (API) and the first Open PHACTS example app, ChemBioNavigator.

The first day featured talks describing the data accessible via the Open PHACTS Discovery Platform and technical aspects of the API. The use of the API by example applications ChemBioNavigator and PharmaTrek was outlined, and the results of the Accelrys Pipeline Pilot Hackathon discussed.

The second day involved discussion of Open PHACTS sustainability and plans for the successor organisation, the Open PHACTS Foundation. The afternoon was attended by those keen to further discuss the potential of the Open PHACTS API and the future of Open PHACTS.

During talks, especially those detailing the Open PHACTS API, a good number of signup requests to the API via dev.openphacts.org were received. The hashtag #opslaunch was used to follow reactions to the workshop on Twitter (see storify), and showed the response amongst attendees to be overwhelmingly positive.

This summary is followed by slides from the two days of presentations.

Not like being there but still quite useful.

As a matter of fact, I found a lead on “operational equivalence” with this data set. More to follow in a separate post.
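In the meantime, for readers who sign up at dev.openphacts.org, a first call against the API takes only a few lines. A minimal sketch, assuming the 1.x endpoint shape (the base URL and version, the /compound method, and the app_id/app_key parameters are my recollection of the docs; verify against the current API documentation and substitute your own credentials):

import requests

# Assumptions: base URL/version and parameter names follow the Open PHACTS
# 1.x docs as I recall them; app_id/app_key come from dev.openphacts.org.
APP_ID = "your_app_id"
APP_KEY = "your_app_key"

params = {
    "uri": "http://www.conceptwiki.org/concept/example",  # placeholder compound URI
    "app_id": APP_ID,
    "app_key": APP_KEY,
    "_format": "json",
}
resp = requests.get("https://beta.openphacts.org/1.3/compound", params=params)
resp.raise_for_status()
print(resp.json())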

April 13, 2013

Apache Marmotta (incubator)

Filed under: Apache Marmotta,Linked Data — Patrick Durusau @ 6:19 pm

Apache Marmotta (incubator)

From the webpage:

Apache Marmotta (incubator) is an Open Platform for Linked Data.

The goal of Apache Marmotta is to provide an open implementation of a Linked Data Platform that can be used, extended and deployed easily by organizations who want to publish Linked Data or build custom applications on Linked Data.

Right now the project is being set up in the Apache Software Foundation infrastructure. The team is working to make the first incubator release available for download in the upcoming weeks. Check the development section for further details on how we work, or subscribe to our mailing lists to follow the project day to day.

Features

  • Read-Write Linked Data
  • RDF triple store with transactions, versioning and rule-based reasoning
  • SPARQL and LDPath query languages
  • Transparent Linked Data Caching
  • Integrated security mechanisms

Background

Marmotta comes as a continuation of the work in the Linked Media Framework project. LMF is an easy-to-setup server application that bundles some technologies such as Apache Stanbol or Apache Solr to offer some advanced services. After release 2.6, the Read-Write Linked Data server code and some related libraries have been set aside to incubate Marmotta within the Apache Software Foundation. LMF still keeps exactly the same functionality, but now bundling Marmotta too.

If a client wants a Linked Data Platform, the least you can do is recommend one from Apache.
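When a release does appear, the quickest smoke test will be its SPARQL endpoint. A minimal sketch, assuming a local install and the /sparql/select path from the project documentation (adjust host and path for your deployment):

import requests

# Assumption: default local Marmotta deployment; the /sparql/select path
# follows the project docs and may differ by version.
ENDPOINT = "http://localhost:8080/marmotta/sparql/select"

query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

resp = requests.get(
    ENDPOINT,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])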

Linked Data and Law

Filed under: Law,Linked Data — Patrick Durusau @ 4:48 am

Linked Data and Law

A listing of linked data and law resources maintained by the Legal Informatics Blog.

Most recently updated to reflect the availability of the Library of Congress K Class – Law Classification – as linked data.

Law Classification Added to Library of Congress Linked Data Service

Filed under: Classification,Law,Linked Data — Patrick Durusau @ 4:39 am

Law Classification Added to Library of Congress Linked Data Service by Kevin Ford.

From the post:

The Library of Congress is pleased to make the K Class – Law Classification – and all its subclasses available as linked data from the LC Linked Data Service, ID.LOC.GOV. K Class joins the B, N, M, and Z Classes, which have been in beta release since June 2012. With about 2.2 million new resources added to ID.LOC.GOV, K Class is nearly eight times larger than the B, M, N, and Z Classes combined. It is four times larger than the Library of Congress Subject Headings (LCSH). If it is not the largest class, it is second only to the P Class (Literature) in the Library of Congress Classification (LCC) system.

We have also taken the opportunity to re-compute and reload the B, M, N, and Z classes in response to a few reported errors. Our gratitude to Caroline Arms for her work crawling through B, M, N, and Z and identifying a number of these issues.

Please explore the K Class for yourself at http://id.loc.gov/authorities/classification/K or all of the classes at http://id.loc.gov/authorities/classification.

The classification section of ID.LOC.GOV remains a beta offering. More work is needed not only to add the additional classes to the system but also to continue to work out issues with the data.

As always, your feedback is important and welcomed. Your contributions directly inform service enhancements. We are interested in all forms of constructive commentary on all topics related to ID. But we are particularly interested in how the data available from ID.LOC.GOV is used and continue to encourage the submission of use cases describing how the community would like to apply or repurpose the LCC data.

You can send comments or report any problems via the ID feedback form or ID listserv.

Not leisure reading for everyone but if you are interested, this is fascinating source material.

And an important source of information for potential associations between subjects.
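If you prefer to explore programmatically, ID.LOC.GOV serves machine-readable forms of each resource. A minimal sketch using content negotiation (that RDF/XML is offered for classification resources is an assumption to check against the service documentation):

import requests

uri = "http://id.loc.gov/authorities/classification/K"

# Assumption: ID.LOC.GOV honors content negotiation for RDF/XML on
# classification resources, as it does for its other authority data.
resp = requests.get(uri, headers={"Accept": "application/rdf+xml"})
resp.raise_for_status()
print(resp.headers.get("Content-Type"))
print(resp.text[:500])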

I first saw this at: Ford: Law Classification Added to Library of Congress Linked Data Service.

April 11, 2013

April 4, 2013

The Project With No Name

Filed under: Linked Data,LOD,Open Data — Patrick Durusau @ 4:53 am

Fujitsu Labs And DERI To Offer Free, Cloud-Based Platform To Store And Query Linked Open Data by Jennifer Zaino.

From the post:

The Semantic Web Blog reported last year about a relationship formed between the Digital Enterprise Research Institute (DERI) and Fujitsu Laboratories Ltd. in Japan, focused on a project to build a large-scale RDF store in the cloud capable of processing hundreds of billions of triples. At the time, Dr. Michael Hausenblas, who was then a DERI research fellow, discussed Fujitsu Lab’s research efforts related to the cloud, its huge cloud infrastructure, and its identification of Big Data as an important trend, noting that “Linked Data is involved with answering at least two of the three Big Data questions” – that is, how to deal with volume and variety (velocity is the third).

This week, the DERI and Fujitsu Lab partners have announced a new data storage technology that stores and queries interconnected Linked Open Data, to be available this year, free of charge, on a cloud-based platform. According to a press release about the announcement, the data store technology collects and stores Linked Open Data that is published across the globe, and facilitates search processing through the development of a caching structure that is specifically adapted to LOD.

Typically, search performance deteriorates when searching for common elements that are linked together within data because of requirements around cross-referencing of massive data sets, the release says. The algorithm it has developed — which takes advantage of links in LOD link structures typically being concentrated in only a portion of server nodes, and of past usage frequency — caches only the data that is heavily accessed in cross-referencing to reduce disk accesses, and so accelerate searching.
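The frequency half of that idea is easy to sketch, although what follows is my own toy construction, not Fujitsu's algorithm (which also exploits the concentration of links on a few server nodes): count accesses per node and keep only the hottest descriptions in memory.

from collections import Counter

class FrequencyCache:
    """Toy cache: keep descriptions of the most frequently accessed nodes."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity  # maximum number of nodes held in memory
        self.fetch = fetch        # fallback loader (disk or remote store)
        self.hits = Counter()     # access counts per node URI
        self.store = {}           # cached node URI -> description

    def get(self, uri):
        self.hits[uri] += 1
        if uri not in self.store:
            if len(self.store) >= self.capacity:
                # Evict the cached node with the lowest access count.
                coldest = min(self.store, key=lambda u: self.hits[u])
                del self.store[coldest]
            self.store[uri] = self.fetch(uri)
        return self.store[uri]

# Usage: cache = FrequencyCache(10000, fetch=load_from_triple_store)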

Not sure what it means for the project between DERI and Fujitsu to have no name. Or at least no name in the press releases.

Until that changes, may I suggest: DERI and Fujitsu Project With No Name (DFPWNN)? 😉

With or without a name I was glad for DERI because, well, I like research and they do it quite well.

DFPWNN’s better query technology for LOD will demonstrate, in my opinion, the same semantic diversity found at Swoogle.

Linking up semantically diverse content means just that, a lot of semantically diverse content, linked up.

The bill for leaving semantic diversity as a problem to address “later” is about to come due.

March 23, 2013

Tensors and Their Applications…

Filed under: Linked Data,Machine Learning,Mathematics,RDF,Tensors — Patrick Durusau @ 6:36 pm

Tensors and Their Applications in Graph-Structured Domains by Maximilian Nickel and Volker Tresp. (Slides.)

Along with the slides, you will like the abstract and bibliography found at: Machine Learning on Linked Data: Tensors and their Applications in Graph-Structured Domains.

Abstract:

Machine learning has become increasingly important in the context of Linked Data as it is an enabling technology for many important tasks such as link prediction, information retrieval or group detection. The fundamental data structure of Linked Data is a graph. Graphs are also ubiquitous in many other fields of application, such as social networks, bioinformatics or the World Wide Web. Recently, tensor factorizations have emerged as a highly promising approach to machine learning on graph-structured data, showing both scalability and excellent results on benchmark data sets, while matching perfectly to the triple structure of RDF. This tutorial will provide an introduction to tensor factorizations and their applications for machine learning on graphs. By means of concrete tasks such as link prediction we will discuss several factorization methods in-depth and also provide necessary theoretical background on tensors in general. Emphasis is put on tensor models that are of interest to Linked Data, which will include models that are able to factorize large-scale graphs with millions of entities and known facts or models that can handle the open-world assumption of Linked Data. Furthermore, we will discuss tensor models for temporal and sequential graph data, e.g. to analyze social networks over time.

Devising a system to deal with the heterogeneous nature of linked data.

Just skimming the slides I could see, this looks very promising.

I first saw this in a tweet by Stefano Bertolo.


Update: I just got an email from Maximilian Nickel and he has altered the transition between slides. Working now!

From slide 53 forward is pure gold for topic map purposes.

Heavy sledding but let me give you one statement from the slides that should capture your interest:

Instance matching: Ranking of entities by their similarity in the entity-latent-component space.

Although written about linked data, not limited to linked data.

What is more, Maximilian offers proof that the technique scales!

Complex, configurable, scalable determination of subject identity!

[Update: deleted note about issues with slides, which read: (Slides for ISWC 2012 tutorial, Chrome is your best bet. Even better bet, Chrome on Windows. Chrome on Ubuntu crashed every time I tried to go to slide #15. Windows gets to slide #46 before failing to respond. I have written to inquire about the slides.)]
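To make "entity-latent-component space" concrete, here is a sketch of the instance-matching step. RESCAL itself fits each relation slice as X_k ≈ A R_k A^T with alternating least squares; as a stand-in I factor the unfolded tensor with a truncated SVD, which is not Nickel's algorithm but produces the same kind of entity-latent matrix A. Entities with similar rows of A are candidates for matching:

import numpy as np

# Toy RDF tensor: X[k, i, j] = 1 if triple (entity_i, relation_k, entity_j) holds.
entities = ["alice", "a_smith", "bob", "carol"]
n = len(entities)
X = np.zeros((2, n, n))
X[0, 0, 2] = X[0, 1, 2] = 1   # relation 0: alice and a_smith both know bob
X[1, 0, 3] = X[1, 1, 3] = 1   # relation 1: alice and a_smith both cite carol

# Stand-in factorization: truncated SVD of the mode-1 unfolding
# (RESCAL instead uses ALS updates for A and per-relation cores R_k).
unfolded = X.transpose(1, 0, 2).reshape(n, -1)
U, s, _ = np.linalg.svd(unfolded, full_matrices=False)
r = 2
A = U[:, :r] * s[:r]   # entity-latent-component matrix

# Instance matching: rank entities by cosine similarity in the latent space.
norms = np.linalg.norm(A, axis=1, keepdims=True)
norms[norms == 0] = 1.0
sim = (A / norms) @ (A / norms).T
i = entities.index("alice")
for j in np.argsort(-sim[i]):
    if j != i:
        print(entities[j], round(float(sim[i, j]), 3))

On this toy data "alice" and "a_smith" share identical link patterns, so they land on the same point in the latent space: similarity 1.0, the matching candidate.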

March 17, 2013

Beacons of Availability

Filed under: Linked Data,LOD,RDF,Semantic Web — Patrick Durusau @ 10:39 am

From Records to a Web of Library Data – Pt3 Beacons of Availability by Richard Wallis.

Beacons of Availability

As I indicated in the first of this series, there are descriptions of a broader collection of entities than just books, articles and other creative works locked up in the MARC and other records that populate our current library systems. By mining those records it is possible to identify those entities, such as people, places, organisations, formats and locations, and model & describe them independently of their source records.

As I discussed in the post that followed, the library domain has often led in the creation and sharing of authoritative datasets for the description of many of these entity types. Bringing these two together, using URIs published by the Hubs of Authority, to identify individual relationships within bibliographic metadata published as RDF by individual library collections (for example the British National Bibliography, and WorldCat) is creating Library Linked Data openly available on the Web.

Why do we catalogue? is a question I often ask, with an obvious answer – so that people can find our stuff. How does this entification, sharing of authorities, and creation of a web of library linked data help us in that goal? In simple terms, the more libraries can understand what resources each other hold, describe, and reference, the more able they are to guide people to those resources. Sounds like a great benefit and mission statement for libraries of the world but unfortunately not one that will nudge the needle on making library resources more discoverable for the vast majority of those that can benefit from them.

I have lost count of the number of presentations and reports I have seen telling us that upwards of 80% of visitors to library search interfaces start in Google. A similar weight of opinion can be found that complains how bad Google, and the other search engines, are at representing library resources. You will get some balancing opinion, supporting how good Google Book Search and Google Scholar are at directing students and others to our resources. Yet I am willing to bet that again we have another 80-20 equation or worse about how few of the users that libraries want to reach even know those specialist Google services exist. A bit of a sorry state of affairs when the major source of searching for our target audience is also acknowledged to be one of the least capable at describing and linking to the resources we want them to find!

Library linked data helps solve both the problem of better description and findability of library resources in the major search engines. Plus it can help with the problem of identifying where a user can gain access to that resource to loan, download, view via a suitable license, or purchase, etc.

I am an ardent sympathizer with helping people to find “our stuff.”

I don’t disagree with the description of Google as: “…the major source of searching for our target audience, is also acknowledged to be one of the least capable at describing and linking to the resources we want them to find!”

But in all fairness to Google, I would remind you of Drabenstott’s research that found for the Library of Congress subject headings:

Overall percentages of correct meanings for subject headings in the original order of subdivisions were as follows:

children 32%
adults 40%
reference 53%
technical services librarians 56%

The Library of Congress subject classification has been around for more than a century and just over half of the librarians can use it correctly.

Let’s not wait more than a century to test the claim:*

Library linked data helps solve both the problem of better description and findability of library resources in the major search engines.


* By “test” I don’t mean the sort of study, “…we recruited twelve LIS students but one had to leave before the study was complete….”

I am using “test” in the sense of a well designed and organized social science project with professional assistance from social scientists, UI test designers and the like.

I think OCLC is quite sincere in its promotion of linked data, but effectiveness is an empirical question, not one of sincerity.

March 16, 2013

From Records to a Web of Library Data – Pt2 Hubs of Authority

Filed under: Library,Linked Data,LOD,RDF — Patrick Durusau @ 4:00 pm

From Records to a Web of Library Data – Pt2 Hubs of Authority by Richard Wallis.

From the post:

Hubs of Authority

Libraries, probably because of their natural inclination towards cooperation, were ahead of the game in data sharing for many years. The moment computing technology became practical, in the late sixties, cooperative cataloguing initiatives started all over the world either in national libraries or cooperative organisations. Two from personal experience come to mind: BLCMP, started in Birmingham, UK in 1969, eventually evolved into the leading Semantic Web organisation Talis, and in 1967 Dublin, Ohio saw the creation of OCLC. Both in their own way have had significant impact on the worlds of libraries, metadata, and the web (and me!).

One of the obvious impacts of inter-library cooperation over the years has been the authorities, those sources of authoritative names for key elements of bibliographic records. A large number of national libraries have such lists of agreed formats for author and organisational names. The Library of Congress has, in addition to its name authorities, subjects, classifications, languages, countries, etc. Another obvious success in this area is VIAF, the Virtual International Authority File, which currently aggregates over thirty authority files from all over the world – well used and recognised in library land, and increasingly across the web in general as a source of identifiers for people & organisations.

These Linked Data enabled sources of information are developing importance in their own right, as a natural place to link to when asserting the thing, person, or concept you are identifying in your data. As Sir Tim Berners-Lee’s fourth principle of Linked Data tells us: “Include links to other URIs, so that they can discover more things.” VIAF in particular is becoming such a trusted, authoritative source of URIs that there is now a VIAFbot responsible for interconnecting Wikipedia and VIAF to surface hundreds of thousands of relevant links to each other. A great hat-tip to Max Klein, OCLC Wikipedian in Residence, for his work in this area.

I don’t deny that VIAF is a very useful tool but if you search for the personal name “Marilyn Monroe,” it returns:

1. Miller, Arthur, 1915-2005
National Library of Australia National Library of the Czech Republic National Diet Library (Japan) Deutsche Nationalbibliothek RERO (Switzerland) SUDOC (France) Library and Archives Canada National Library of Israel (Latin) National Library of Sweden NUKAT Center (Poland) Bibliothèque nationale de France Biblioteca Nacional de España Library of Congress/NACO

Miller, Arthur (Arthur Asher), 1915-2005
National Library of the Netherlands-test

Miller, Arthur, 1915-
Vatican Library Biblioteca Nacional de Portugal

ميلر، ارثر، 1915-2005 م.
Bibliotheca Alexandrina (Egypt)

Miller, Arthur
Wikipedia (en)-test

מילר, ארתור, 1915-2005
National Library of Israel (Hebrew)

2. Monroe, Marilyn, 1926-1962
National Library of Israel (Latin) National Library of the Czech Republic National Diet Library (Japan) Deutsche Nationalbibliothek SUDOC (France) Library and Archives Canada National Library of Australia National Library of Sweden NUKAT Center (Poland) Bibliothèque nationale de France Biblioteca Nacional de España Library of Congress/NACO

Monroe, Marilyn
National Library of the Netherlands-test Wikipedia (en)-test RERO (Switzerland)

Monroe, Marilyn American actress, model, and singer, 1926-1962
Getty Union List of Artist Names

Monroe, Marilyn, pseud.
Biblioteca Nacional de Portugal

3. DiMaggio, Joe, 1914-1999
Library of Congress/NACO Bibliothèque nationale de France

Di Maggio, Joe 1914-1999
Deutsche Nationalbibliothek

Di Maggio, Joseph Paul, 1914-1999
National Diet Library (Japan)

DiMaggio, Joe, 1914-
National Library of Australia

Dimaggio, Joseph Paul, 1914-1999
SUDOC (France)

DiMaggio, Joe (Joseph Paul), 1914-1999
National Library of the Netherlands-test

Dimaggio, Joe
Wikipedia (en)-test

4. Monroe, Marilyn
Deutsche Nationalbibliothek

5. Hurst-Monroe, Marlene
Library of Congress/NACO

6. Wolf, Marilyn Monroe
Deutsche Nationalbibliothek

Maybe Sir Tim is right, users “…can discover more things.”

Some of them are related, some of them are not.
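As an aside, you can script that search. VIAF has an AutoSuggest endpoint that returns JSON (the path is one I have used; the response field names are assumptions to verify against current VIAF documentation):

import requests

# Assumptions: the /viaf/AutoSuggest path and the JSON field names
# ("result", "term", "viafid", "nametype") follow older VIAF responses.
resp = requests.get("https://viaf.org/viaf/AutoSuggest",
                    params={"query": "Marilyn Monroe"})
resp.raise_for_status()
for hit in resp.json().get("result") or []:
    print(hit.get("viafid"), hit.get("nametype"), hit.get("term"))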

From Records to a Web of Library Data – Pt1 Entification

Filed under: Entities,Library,Linked Data — Patrick Durusau @ 3:10 pm

From Records to a Web of Library Data – Pt1 Entification by Richard Wallis.

From the post:

Entification

Entification – a bit of an ugly word, but in my day to day existence one I am hearing more and more. What an exciting life I lead…

What is it, and why should I care, you may be asking.

I spend much of my time convincing people of the benefits of Linked Data to the library domain, both as a way to publish and share our rich resources with the wider world, and also as a potential stimulator of significant efficiencies in the creation and management of information about those resources. Taking those benefits as being accepted, for the purposes of this post, brings me into discussion with those concerned with the process of getting library data into a linked data form.

As you know, I am far from convinced about the “benefits” of Linked Data, at least with its current definition.

Who knows what definition “Linked Data” may have in some future vision of the W3C? (URL Homonym Problem: A Topic Map Solution, a tale of how the W3C decided to redefine URL.)

But Richard’s point about the ugliness and utility of “entification” is well taken.

So long as you remember that every term can be described “in terms of other things.”

There are no primitive terms, not one.

March 7, 2013

Linked Data Platform 1.0 (W3C)

Filed under: Linked Data,W3C — Patrick Durusau @ 2:13 pm

Linked Data Platform 1.0 (W3C)

Abstract:

A set of best practices and simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using the RDF data model.

Just in case you ever encounter such a platform.
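Should you encounter one, the interaction model is plain HTTP. A minimal sketch of creating and reading a resource in an LDP container (the container URL is hypothetical; POSTing Turtle to a container and the Slug naming hint follow the spec):

import requests

container = "http://example.org/ldp/container/"   # hypothetical LDP container

turtle = """
@prefix dcterms: <http://purl.org/dc/terms/> .
<> dcterms:title "A new LDP resource" .
"""

# POST Turtle to a container to create a contained resource;
# the Slug header suggests a name for the new resource.
resp = requests.post(container, data=turtle.encode("utf-8"),
                     headers={"Content-Type": "text/turtle",
                              "Slug": "example"})
print(resp.status_code, resp.headers.get("Location"))

# Read it back, asking for Turtle.
created = resp.headers.get("Location")
if created:
    print(requests.get(created, headers={"Accept": "text/turtle"}).text)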

February 25, 2013

Linked Data for Holdings and Cataloging

Filed under: Cataloging,Library,Linked Data — Patrick Durusau @ 5:45 am

From the ALA Midwinter Meeting:

Linked Data for Holdings and Cataloging: The First Step Is Always the Hardest! by Eric Miller (Zepheira) and Richard Wallis (OCLC). (Video + Slides)

Linked Data for Holdings and Cataloging: Interactive Session. (Audio)

Since linked data wasn’t designed for human users, the advantage for library catalogs isn’t clear.

Most users can’t use LCSH so perhaps the lack of utility will go unnoticed. (Subject Headings and the Semantic Web)

I first saw this at: Linked Data for Holdings and Cataloging – recordings now available!

February 16, 2013

NBA Stats Like Never Before [No RDF/Linked Data/Topic Maps In Sight]

Filed under: Design,Interface Research/Design,Linked Data,RDF,Statistics,Topic Maps — Patrick Durusau @ 4:47 pm

NBA Stats Like Never Before by Timo Elliott.

From the post:

The National Basketball Association today unveiled a new site for fans of game statistics: NBA.com/stats, powered by SAP Analytics technology. The multi-year marketing partnership between SAP and the NBA was announced six months ago:

“We are constantly researching new and emerging technologies in an effort to provide our fans with new ways to connect with our game,” said NBA Commissioner David Stern. “SAP is a leader in providing innovative software solutions and an ideal partner to provide a dynamic and comprehensive statistical offering as fans interact with NBA basketball on a global basis.”

“SAP is honored to partner with the NBA, one of the world’s most respected sports organizations,” said Bill McDermott, co-CEO, SAP. “Through SAP HANA, fans will be able to experience the NBA as never before. This is a slam dunk for SAP, the NBA and the many fans who will now have access to unprecedented insight and analysis.”

The free database contains every box score of every game played since the league’s inception in 1946, including graphical displays of players’ shooting tendencies.

To the average fan NBA.com/Stats delivers information that is of immediate interest to them, not their computers.

Another way to think about it:

Computers don’t make purchasing decisions, users do.

Something to think about when deciding on your next semantic technology.

February 11, 2013

Saving the “Semantic” Web (part 2) [NOTLogic]

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 5:45 pm

Expressing Your Semantics: NOTLogic

Saving the “Semantic” Web (part 1) ended by concluding that authors of data/content should be asked about the semantics of their content.

I asked if there were compelling reasons to ask someone else and got no takers.

The acronym, NOTLogic may not be familiar. It expands to: Not Only Their Logic.

Users should express their semantics in the “logic” of their domain.

After all, it is their semantics, knowledge and domain that are being captured.

Their “logic” may not square up with FOL (first order logic) but where’s the beef?

Unless one of the project requirements is to maintain consistency with FOL, why bother?

The goal in most BI projects is ROI on capturing semantics, not adhering to FOL for its own sake.

Some people want to teach calculators how to mimic “reasoning” by using that subset known as “logic.”

However much I liked the Friden rotary calculator of my youth:

[image: a Friden rotary calculator]

teaching it to mimic “reasoning” isn’t going to happen on my dime.

What about yours?

There are cases where machine learning techniques are very productive and fully justified.

The question you need to ask yourself (after discovering if you should be using RDF at all, The Semantic Web Is Failing — But Why? (Part 2)) is whether “their” logic works for your use case.

I suspect you will find that you can express your semantics, including relationships, without resort to FOL.

Which may lead you to wonder: Why would anyone want you to use a technique they know, but you don’t?

I don’t know for sure but have some speculations on that score I will share with you tomorrow.

In the meantime, remember:

  1. As the author of content or data, you are the person to ask about its semantics.
  2. You should express your semantics in a way comfortable for you.

AGROVOC 2013 edition released

Filed under: AGROVOC,Linked Data,SKOS,Vocabularies — Patrick Durusau @ 2:08 pm

AGROVOC 2013 edition released

From the post:

The AGROVOC Team is pleased to announce the release of the AGROVOC 2013 edition.

The updated version contains 32,188 concepts in up to 22 languages, resulting in a total of 626,211 terms (in 2012: 32,061 concepts, 625,096 terms).

Please explore AGROVOC by searching terms, or browsing hierarchies.

AGROVOC 2013 is available for download, and accessible via web services.

From the “about” page:

The AGROVOC thesaurus contains 32,188 concepts in up to 22 languages covering topics related to food, nutrition, agriculture, fisheries, forestry, environment and other related domains.

A global community of editors consisting of librarians, terminologists, information managers and software developers, maintain AGROVOC using VocBench, an open-source multilingual, web-based vocabulary editor and workflow management tool that allows simultaneous, distributed editing. AGROVOC is expressed in Simple Knowledge Organization System (SKOS) and published as Linked Data.

Need some seeds for your topic map in “…food, nutrition, agriculture, fisheries, forestry, environment and other related domains”?
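As a concrete starting point, each AGROVOC concept dereferences to SKOS. A sketch that pulls the multilingual preferred labels for one concept (the URI pattern is AGROVOC’s; that c_12332 is “maize” and that the server returns RDF to rdflib are assumptions to verify):

from rdflib import Graph, URIRef
from rdflib.namespace import SKOS

# Assumptions: c_12332 is "maize" and the server returns RDF for this URI.
uri = URIRef("http://aims.fao.org/aos/agrovoc/c_12332")

g = Graph()
g.parse(uri)  # dereference the concept URI via content negotiation
for label in g.objects(uri, SKOS.prefLabel):
    print(label.language, str(label))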

February 7, 2013

The Semantic Web Is Failing — But Why? (Part 3)

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 4:30 pm

Is Linked Data the Answer?

Leaving the failure of users to understand RDF semantics to one side, there is also the issue of the complexity of its various representations.

Consider Kingsley Idehen’s “simple” example Turtle document, which he posted in: Simple Linked Data Deployment via Turtle Docs using various Storage Services:

##### Starts Here #####
# Note: the hash is a comment character in Turtle
# Content start
# You can save this to a local file. In my case I use Local File Name: kingsley.ttl .
# Actual Content:

# Prefix declarations that enable the use of compact identifiers instead of
# fully expanded HTTP URIs.
# (The URIs between angle brackets were lost when this example was copied;
# <…> marks each missing URI. Well-known namespace URIs are shown in full.)

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wdrs: <http://www.w3.org/2007/05/powder-s#> .
@prefix opl: <…> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix : <#> .

# Profile Doc Stuff

<> a foaf:Document .
<> rdfs:label "DIY Linked Data Doc About: kidehen" .
<> rdfs:comment "Simple Turtle File That Describes Entity: kidehen " .

# Entity Me Stuff

<> foaf:primaryTopic :this .
<> foaf:maker :this .
:this a foaf:Person .
:this wdrs:describedby <> .
:this foaf:name "Kingsley Uyi Idehen" .
:this foaf:firstName "Kingsley" .
:this foaf:familyName "Idehen" .
:this foaf:nick "kidehen" .
:this owl:sameAs <…> .
:this owl:sameAs <…> .
:this owl:sameAs <…> .
:this owl:sameAs <…> .
:this foaf:page <…> .
:this foaf:page <…> .
:this foaf:page <…> .
:this foaf:page <…> .
:this foaf:knows <…> , <…> , <…> , <…> , <…> , <…> .

# Entity Me: Identity & WebID Stuff

#:this cert:key :pubKey .
#:pubKey a cert:RSAPublicKey;
# Public Key Exponent
# :pubkey cert:exponent "65537" ^^ xsd:integer;
# Public Key Modulus
# :pubkey cert:modulus "d5d64dfe93ab7a95b29b1ebe21f3cd8a6651816c9c39b87ec51bf393e4177e6fc
2ee712d92caf9d9f1423f5e65f127274529a2e6cc53f1e452c6736e8db8732f919c4160eaa9b6f327c8617c
40036301b547abfc4c5de610780461b269e3d8f8e427237da6152ac2047d88ff837cddae793d15427fa7ce
067467834663737332be467eb353be678bffa7141e78ce3052597eae3523c6a2c414c2ae9f8d7be807bb3
fc0d516b8ecd2fafee4f20ff3550919601a0ad5d29126fb687c2e8c156f04918a92c4fc09f136473f3303814e1
83185edf0046e124e856ca7ada027345e614f8d665f5d7172d880497005ff4626c2b0f2206f7dce717e4f279
dd2a0ddf04b" ^^ xsd:hexBinary .

# :this opl:hasCertificate :cert .
# :cert opl:fingerprint "640F9DD4CFB6DD6361CBAD12C408601E2479CC4A" ^^ xsd:hexBinary;
#:cert opl:hasPublicKey "d5d64dfe93ab7a95b29b1ebe21f3cd8a6651816c9c39b87ec51bf393e4177e6fc2
ee712d92caf9d9f1423f5e65f127274529a2e6cc53f1e452c6736e8db8732f919c4160eaa9b6f327c8617c400
36301b547abfc4c5de610780461b269e3d8f8e427237da6152ac2047d88ff837cddae793d15427fa7ce06746
7834663737332be467eb353be678bffa7141e78ce3052597eae3523c6a2c414c2ae9f8d7be807bb3fc0d516b
8ecd2fafee4f20ff3550919601a0ad5d29126fb687c2e8c156f04918a92c4fc09f136473f3303814e183185edf00
46e124e856ca7ada027345e614f8d665f5d7172d880497005ff4626c2b0f2206f7dce717e4f279dd2a0ddf04b"
^^ xsd:hexBinary .

### Ends Here ###

Try handing that “simple” example and Idehen’s article to some non-technical person in your office to gauge its “simplicity.”

For that matter, hand it to some of your technical but non-Semantic Web folks as well.

Your experience with that exercise will speak louder than anything I can say.


The next series starts with Saving the “Semantic” Web (Part 1)

February 1, 2013

BBC …To Explore Linked Data Technology [Instead of hand-curated content management]

Filed under: Linked Data,LOD,News — Patrick Durusau @ 8:07 pm

BBC News Lab to Explore Linked Data Technology by Angela Guess.

From the post:

Matt Shearer of the BBC recently reported that the BBC’s News Lab team will begin exploring linked data technologies. He writes, “Hi I’m Matt Shearer, delivery manager for Future Media News. I manage the delivery of the News Product and I also lead on BBC News Labs. BBC News Labs is an innovation project which was started during 2012 to help us harness the BBC’s wider expertise to explore future opportunities. Generally speaking BBC News believes in allowing creative technologists to innovate and influence the direction of the News product. For example the delivery of BBC News’ responsive design mobile service started in 2011 when we made space for a multidiscipline project to explore responsive design opportunities for BBC News. With this in mind the BBC News team set up News Labs to explore linked data technologies.”

Shearer goes on, “The BBC has been making use of linked data technologies in its internal content production systems since 2011. As explained by Jem Rayfield this enabled the publishing of news aggregation pages ‘per athlete’, ‘per sport’ and ‘per event’ for the 2012 Olympics – something that would not have been possible with hand-curated content management. Linked data is being rolled out on BBC News from early 2013 to enrich the connections between BBC News stories, content assets, the wider BBC website and the World Wide Web. We framed each challenge/opportunity for the News Lab in terms of a clear ‘problem space’ (as opposed to a set of requirements that may limit options) supported by research findings, audience needs, market needs, technology opportunities and framed with the BBC News Strategy.”

Read more here.

(emphasis added)

Apologies for the long quote but I wanted to capture the BBC’s comparison of using linked data to hand-curated content management in context.

I never dreamed the BBC was still using “hand-curated content management” as a measure of modern IT systems.

Quite remarkable.

On the other hand, perhaps they were being kind to the linked data experiment by using a measure that enables it to excel.

If you know which one, please comment.

Thanks!

January 15, 2013

Is Linked Data the future of data integration in the enterprise?

Filed under: Linked Data,LOD — Patrick Durusau @ 8:31 pm

Is Linked Data the future of data integration in the enterprise? by John Walker.

From the post:

Following the basic Linked Data principles we have assigned HTTP URIs as names for things (resources) providing an unambiguous identifier. Next up we have converted data from a variety of sources (XML, CSV, RDBMS) into RDF.

One of the key features of RDF is the ability to easily merge data about a single resource from multiple sources into a single “supergraph” providing a more complete description of the resource. By loading the RDF into a graph database, it is possible to make an endpoint available which can be queried using the SPARQL query language. We are currently using Dydra as their cloud-based database-as-a-service model provides an easy entry route to using RDF without requiring a steep learning curve (basically load your RDF and you’re away), but there are plenty of other options like Apache Jena and OpenRDF Sesame. This has made it very easy for us to answer complex questions requiring data from multiple sources, moreover we can stand up APIs providing access to this data in minutes.

By using a Linked Data Platform such as Graphity we can make our identifiers (HTTP URIs) dereferencable. In layman’s terms when someone plugs the URI into a browser, we provide a description of the resource in HTML. Using content negotiation we are able to provide this data in one of the standard machine-readable XML, JSON or Turtle formats. Graphity uses Java and XSLT 2.0 which our developers already have loads of experience with and provides powerful mechanisms with which we will be able to develop some great web apps.

What do you make of:

One of the key features of RDF is the ability to easily merge data about a single resource from multiple sources into a single “supergraph” providing a more complete description of the resource.

???

I suppose if by some accident we all use the same URI as an identifier, that would be the case. But that hardly requires URIs, Linked Data or RDF.

Scientific conferences on digital retrieval in the 1950s worried about diversity of nomenclature being a barrier to discovery of resources. If we haven’t addressed the semantic diversity issue in sixty (60) years of talking about it, it isn’t clear how creating another set of diverse names is going to help.

There may be other reasons for using URIs but seamless merging doesn’t appear to be one of them.

Moreover, how do I know what you have identified with a URI?

You can return one or more properties for a URI, but which ones matter for the identity of the subject it identifies?
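The mechanics of the “supergraph” merge show exactly where the magic lives: in both sources minting the same URI. A sketch with rdflib (all names and URIs invented for illustration):

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF

person = URIRef("http://example.org/person/42")  # the shared identifier

source_a = Graph()
source_a.add((person, FOAF.name, Literal("J. Walker")))

source_b = Graph()
source_b.add((person, FOAF.mbox, URIRef("mailto:jw@example.org")))

# "Merging" is just graph union; it works only because both sources
# happened to mint the identical URI for the person.
supergraph = source_a + source_b
for triple in supergraph:
    print(triple)

Change one character of the URI in either source and the “supergraph” quietly becomes two unconnected descriptions. That is the semantic diversity problem, untouched.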

I first saw this at Linked Data: The Future of Data Integration by Angela Guess.

January 9, 2013

@AMS Webinars on Linked Data

Filed under: Linked Data,LOD,Semantic Web — Patrick Durusau @ 12:01 pm

@AMS Webinars on Linked Data

From the website:

The traditional approach of sharing data within silos seems to have reached its end. From governments and international organizations to local cities and institutions, there is a widespread effort of opening up and interlinking their data. Linked Data, a term coined by Tim Berners-Lee in his design note regarding the Semantic Web architecture, refers to a set of best practices for publishing, sharing, and interlinking structured data on the Web.

Linked Open Data (LOD), a concept that has leapt onto the scene in the last few years, is Linked Data distributed under an open license that allows its reuse for free. Linked Open Data becomes a key element to achieve interoperability and accessibility of data, harmonisation of metadata and multilinguality.

There are four remaining seminars in this series:

Webinar in French | 22nd January 2013 – 11:00am Rome time
Clarify the Meaning of Your Public Data with the Web of Data
Christophe Guéret, Royal Netherlands Academy of Arts and Sciences, Data Archiving and Networked Services (DANS)

Webinar in Chinese | 29th January 2013 – 02:00am Rome time
Web-based seminar – “Understanding and Using Linked Data: Libraries, Archives and Museums (LAM) as Providers and Users of Linked Data”
Marcia Zeng, School of Library and Information Science, Kent State University

Webinar in Russian | 5th February 2013 – 11:00am Rome time
An Introduction to the Concept of Linked Open Data
Irina Radchenko, Centre of Semantic Technologies, Higher School of Economics

Webinar in Arabic | 12th February 2013 – 11:00am Rome time
Ibrahim Elbadawi, UAE Federal eGovernment

Mark your agenda! See New Free Webinars @ AIMS on Linked Open Data for registration and more details.

January 4, 2013

Callimachus Version 1.0

Filed under: Linked Data,LOD — Patrick Durusau @ 7:43 pm

Callimachus Version 1.0 by Eric Franzon.

From the post:

The Callimachus Project has announced that the latest release of the Open Source version of Callimachus is available for immediate download.

Callimachus began as a linked data management system in 2009 and is an Open Source system for navigating, managing, visualizing and building applications on Linked Data.

Version 1.0 introduces several new features, including:

  • Built-in support for most types of Persistent URLs (PURLs), including Active PURLs.
  • Scripted HTTP content type conversions via XProc pipelines.
  • Ability to access remote Linked Data via SPARQL SERVICE keyword and XProc pipelines.
  • Named Queries can now have a custom view page. The view page can be a template for the resources in the query result.
  • Authorization can now be performed based on IP addresses or the DNS domain of the client.

December 24, 2012

10 Rules for Persistent URIs [Actually only one] Present of Persistent URIs

Filed under: Linked Data,Semantic Web,WWW — Patrick Durusau @ 2:11 pm

Interoperability Solutions for European Public Administrations got into the egg nog early:

D7.1.3 – Study on persistent URIs, with identification of best practices and recommendations on the topic for the MSs and the EC (PDF) (I’m not kidding, go see for yourself.)

Five (5) positive rules:

  1. Follow the pattern: http://(domain)/(type)/(concept)/(reference)
  2. Re-use existing identifiers
  3. Link multiple representations
  4. Implement 303 redirects for real-world objects
  5. Use a dedicated service

Five (5) negative rules:

  1. Avoid stating ownership
  2. Avoid version numbers
  3. Avoid using auto-increment
  4. Avoid query strings
  5. Avoid file extensions

If the goal is “persistent” URIs, only “Use a dedicated service” has any relationship to making a URI “persistent.”

That is, five (5) or ten (10) years from now, a URI used as an identifier will return the same value as today.

The other nine rules have no relationship to persistence. Good arguments can be made for some of them, but persistence isn’t one of them.

Why the report hides behind the rhetoric of persistence I cannot say.

But you can satisfy yourself that only a “dedicated service” can persist a URI, whatever its form.

W3C confusion over identifiers and locators for web resources continues to plague this area.

There isn’t anything particularly remarkable about using a URI as an identifier. So long as it is understood that URI identifiers are just like any other identifier.

That is they can be indexed, annotated, searched for and returned to users with data about the object of the identification.

Viewed that way, the fact that once upon a time there was a resource at the location specified by a URI has little or nothing to do with the persistence of that URI.

So long as we have indexed the URI, that index can serve as a resolution of that URI/identifier for as long as the index persists, with additional information should we choose to create and provide it.
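A toy version of such an index makes the point: resolving a URI-as-identifier requires nothing from the URI’s original host. (The record fields below are invented for illustration.)

# Index URIs as plain identifiers, with whatever we know about them.
index = {
    "http://example.org/old-host/resource/1": {
        "label": "Some resource",
        "last_seen": "2010-06-01",
        "notes": "Original host no longer responds.",
    },
}

def resolve(uri):
    """Resolve a URI against our index, not against the Web."""
    record = index.get(uri)
    if record is None:
        return {"uri": uri, "status": "unknown identifier"}
    return {"uri": uri, **record}

print(resolve("http://example.org/old-host/resource/1"))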

The EU document concedes as much when it says:

Without exception, all the use cases discussed in section 3 where a policy of URI persistence has been adopted, have used a dedicated service that is independent of the data originator. The Australian National Data Service uses a handle resolver, Dublin Core uses purl.org, and services data.gov.uk and publications.europa.eu are also independent of a specific government department and could readily be transferred and run by someone else if necessary. This does not imply that a single service should be adopted for multiple data providers. On the contrary – distribution is a key advantage of the Web. It simply means that the provision of persistent URIs should be independent of the data originator.

That is, if you read “…independent of the data originator” to mean independent of a particular location on the WWW.

No changes in form, content, protocols, server software, etc., required. And you get persistent URIs.

Merry Christmas to all and to all…, persistent URIs as identifiers (not locators)!

(I first saw this at: New Report: 10 Rules for Persistent URIs)

December 20, 2012

Best Buy Product Catalog via Semantic Endpoints

Filed under: Linked Data,RDF — Patrick Durusau @ 2:31 pm

Announcing BBYOpen Metis Alpha: Best Buy Product Catalog via Semantic Endpoints

From the post:

These days, consumers have a rich variety of products available at their fingertips. A massive product landscape has evolved, but sadly products in this enormous and rich landscape often get flattened to just a price tag. Over time, it seems the product value proposition, variety, descriptions, specifics, and details that make up products have all but disappeared. This presents consumers with a "paradox of choice" where misinformed decisions can lead to poor product selections, and ultimately product returns and customer remorse.

To solve this problem, BBY Open is excited to announce the first phase Alpha release of Metis, our semantically-driven product insight engine. As part of a phased release approach, this first release consists of publishing all 500K+ of our active Best Buy products with reviews as RDF-enabled endpoints for public consumption.

This alpha release is the first phase in solving this product ambiguity. With the publishing of structured product data in RDF format using industry accepted product ontologies like GoodRelations, standards from the Semantic Web group at the W3C, and the NetKernel platform, the Metis Alpha gives developers the ability to consume and query structured data via SPARQL (get up to speed with Learning SPARQL by Bob DuCharme), enabling the discovery of insight hidden deep inside the product catalog.
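Assuming the Metis endpoints behave like any other SPARQL service, a query for offerings would look like this (the endpoint URL is a placeholder, since the announcement does not give one; gr:Offering and gr:name are from the GoodRelations vocabulary):

from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint; the announcement does not publish the SPARQL URL.
sparql = SPARQLWrapper("https://bbyopen.example.org/sparql")
sparql.setQuery("""
PREFIX gr: <http://purl.org/goodrelations/v1#>
SELECT ?offering ?name WHERE {
  ?offering a gr:Offering ;
            gr:name ?name .
} LIMIT 10
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["offering"]["value"], row["name"]["value"])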

Comments?

December 13, 2012

Linked Jazz

Filed under: Linked Data,Music — Patrick Durusau @ 6:55 pm

Linked Jazz

Network display of Jazz artists with a number of display options.

Using Linked Data.

Better network display than I am accustomed to and I know that Lars likes jazz. 😉

I first saw this in a tweet by Christophe Viau.

PS: You may also like the paper: Visualizing Linked Jazz: A web-based tool for social network analysis and exploration.

Optique

Filed under: BigData,Linked Data,Optique — Patrick Durusau @ 6:47 pm

Optique

From the homepage:

Scalable end-user access to Big Data is critical for effective data analysis and value creation. Optique will bring about a paradigm shift for data access:

  • by providing a semantic end-to-end connection between users and data sources;
  • enabling users to rapidly formulate intuitive queries using familiar vocabularies and conceptualisations;
  • seamlessly integrating data spread across multiple distributed data sources, including streaming sources;
  • exploiting massive parallelism for scalability far beyond traditional RDBMSs and thus reducing the turnaround time for information requests to minutes rather than days.

Another new EU data project.

Website reports first software will be available towards the end of 2013.

Not much in the way of specifics but it is very early in the project.

Can anyone point me to a public version of their funding application?

I have been given to understand that funding applications have more detail than may appear in public announcements.

PS: I had trouble downloading a presentation by Peter Haase that is cited on the website so when I obtained it, I uploaded a local copy: On Demand Access to Big Data Through Semantic Technologies. (PDF)

I have seen the Linked Data cloud illustration many times. Have you seen it in comparison with the overall data cloud?

December 11, 2012

Developing CODE for a Research Database

Filed under: Entity Extraction,Entity Resolution,Linked Data — Patrick Durusau @ 8:19 pm

Developing CODE for a Research Database by Ian Armas Foster.

From the post:

The fact that there are a plethora of scientific papers readily available online would seem helpful to researchers. Unfortunately, the truth is that the volume of these articles has grown such that determining which information is relevant to a specific project is becoming increasingly difficult.

Austrian and German researchers are thus developing CODE, or Commercially Empowered Linked Open Data Ecosystems in Research, to properly aggregate research data from its various forms, such as PDFs of academic papers and data tables upon which those papers are based, into a single system. The project is in a prototype stage, with the goal being to integrate all forms into one platform by the project’s second year.

The researchers from the University of Passau in Germany and the Know-Center in Graz, Austria explored the challenges to CODE and how the team intends to deal with those challenges in this paper. The goal is to meliorate the research process by making it easier to not only search for both text and numerical data in the same query but also to use both varieties in concert. The basic architecture for the project is shown below.

Stop me if you have heard this one before: “There was this project that was going to disambiguate entities and create linked data….”

I would be the first one to cheer if such a project were successful. But, a few paragraphs in a paper, given the long history of entity resolution and its difficulties, isn’t enough to raise my hopes.

You?

December 8, 2012

Library Hi Tech Journal seeks papers on LOV & LOD

Filed under: Linked Data,LOD,LOV — Patrick Durusau @ 2:44 pm

Library Hi Tech Journal seeks papers on LOV & LOD

From the post:

Library Hi Tech (LHT) seeks papers about new works, initiatives, trends and research in the field of linking and opening vocabularies. This call for papers is inspired by the 2012 LOV Symposium: Linking and Opening Vocabularies and the SKOS-2-HIVE (Helping Interdisciplinary Vocabulary Engineering) workshop, held at the Universidad Carlos III de Madrid (UC3M).

This Library Hi Tech special issue might include papers delivered at the UC3M-LOV events and other original works related to this subject, not yet published.

Topics: LOV & LOD

Papers specifically addressing research and development activities, implementation challenges and solutions, and educative aspects of Linked Open Vocabularies (LOV) and/or in a broader sense Linked Open Data, are of particular interest.

Those interested in submitting an article should send papers before 30 January 2013. Full articles should be between 4,000 and 8,000 words. References should use the Harvard style. Please submit completed articles via the Scholar One online submission system. All final submissions will be peer reviewed.

On the style for references, you may find the Author Guidelines at LHT useful.

More generally, see Harvard System, posted by the University Library of Anglia Ruskin University.

December 5, 2012

Don’t feed the semantic black holes [Dangers of Semantic Promiscuity]

Filed under: Linked Data,Security,Virus — Patrick Durusau @ 4:42 pm

Don’t feed the semantic black holes by Bernard Vatant.

From the post:

If I remember correctly it was at Knowledge Technologies 2001 that Ann Wrightson explained to us, during the informal RDF-Topic Maps session, how to build a semantic virus for Topic Maps, through abuse of subject indicators. At the time OWL and its now infamous owl:sameAs were not yet around, but the idea was identical: if several “topics” A, B, C, … indicate the same “subject” X, then they should be merged into a single topic. In linked data land ten years after it’s the same story: if RDF descriptions A, B, C … declare an owl:sameAs link to X, then A and B are merged together with the current description of X.

Hence the very simple semantic virus concept:

1. Harvest all the topic identifiers you can grab from distributed topic maps (read today: URIs from distributed linked data).

2. Publish a new topic map adding a common subject indicator to every topic description you have harvested (read today: add owl:sameAs X to all resource descriptions).

Now if you query the resulting database for the description of any topic (resource) in it, you get all elements of description of everything on anything. The whole map is collapsed onto a single heavy and meaningless node. An irreversible semantic collapse.

True but that’s like having unprotected sex with a hooker in the bushes near a truck stop in India.

Reliance on non-verified sources of data is like unprotected sex, except for the lack of enjoyable parts.

As Bernard points out, this can lead to very bad consequences.

I would not wait for Bernard’s provenance indication using named graphs. Do you think people who would create malicious owl:sameAs statements would also create false statements about their graphs? Gasp! 😉

Trusting evil-doers to respect provenance conventions meant to exclude their content is a low percentage bet.

One solution, possibly a commercially viable one, would be to harvest and test linked data, being a canonical and trusted source for that data. Any semantic black holes being detected and blocked from reaching you.

A prophylactic service as it were.
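Bernard’s collapse takes only a few lines to demonstrate. A sketch, with all URIs invented, of a consumer that naively “smushes” resources by owl:sameAs, which is exactly what you should not do with unverified data:

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, OWL

EX = Namespace("http://example.org/")
data = Graph()
data.add((EX.alice, FOAF.name, Literal("Alice")))
data.add((EX.bob, FOAF.name, Literal("Bob")))

# The "virus": assert owl:sameAs from every harvested URI to one node X.
for subject in set(data.subjects()):
    data.add((subject, OWL.sameAs, EX.X))

# A naive consumer smushes sameAs by rewriting nodes to a canonical one.
def smush(graph):
    canon = {}
    for s, _, o in graph.triples((None, OWL.sameAs, None)):
        canon[s] = o
    merged = Graph()
    for s, p, o in graph:
        if p == OWL.sameAs:
            continue
        merged.add((canon.get(s, s), p, canon.get(o, o)))
    return merged

for triple in smush(data):
    print(triple)   # every description now piles onto ex:X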

November 30, 2012

Linking Web Data for Education Project [Persisting Heterogeneity]

Filed under: Education,Linked Data,WWW — Patrick Durusau @ 3:48 pm

Linking Web Data for Education Project

From the about page:

LinkedUp aims to push forward the exploitation of the vast amounts of public, open data available on the Web, in particular by educational institutions and organizations.

This will be achieved by identifying and supporting highly innovative large-scale Web information management applications through an open competition (the LinkedUp Challenge) and dedicated evaluation framework. The vision of the LinkedUp Challenge is to realise personalised university degree-level education of global impact based on open Web data and information. Drawing on the diversity of Web information relevant to education, ranging from Open Educational Resources metadata to the vast body of knowledge offered by the Linked Data approach, this aim requires overcoming substantial challenges related to Web-scale data and information management involving Big Data, such as performance and scalability, interoperability, multilinguality and heterogeneity problems, to offer personalised and accessible education services. Therefore, the LinkedUp Challenge provides a focused scenario to derive challenging requirements, evaluation criteria, benchmarks and thresholds which are reflected in the LinkedUp evaluation framework. Information management solutions have to apply data and learning analytics methods to provide highly personalised and context-aware views on heterogeneous Web data.

Before linked data, we had: “…interoperability, multilinguality and heterogeneity problems….”

After linked data, we have: “…interoperability, multilinguality and heterogeneity problems….” + linked data (with heterogeneity problems).

Not unexpected, but we still need a means of resolution. Topic maps anyone?

November 27, 2012

Linking your resources to the Data Web

Filed under: AGROVOC,Linked Data,RDF — Patrick Durusau @ 4:56 am

First LOD@AIMS Webinar with Tom Baker on “Linking your resources to the Data Web”

4th December 2012 – 16:00 Rome Time

From the post:

The AIMS Metadata Community of Practice is glad to announce the first Linked Open Data @ AIMS webinar entitled Linking your resources to the Data Web. The session will take place on 4th December 2012 – 16:00 Rome Time – and will be presented by Tom Baker, chief information officer (CIO) of the Dublin Core Metadata Initiative (DCMI).

This event is part of the series of webinars Linked Open Data @ AIMS that will take place from December 2012 to February 2013. A total of 6 specialists will talk about Linked Open Data and the Semantic Web to the agricultural information management community. The webinars will be in the 6 languages used on AIMS – English, French, Spanish, Arabic, Chinese and Russian.

The objective of Linked Open Data @ AIMS webinars is to help individuals and organizations to understand better the initiatives related to the Semantic Web that are currently taking place within the AIMS Communities of Practice.


Linking data into the Semantic Web means more than just making data available on a Web server. It means using Web addresses (URIs) in data as names for things; tagging resources using those URIs – for example, URIs for agricultural topics from AGROVOC; and using URIs to point to related resources.

This talk walks through a simple example to show how linking works in practice, illustrating RDF technology with animated graphics. It concludes with a recipe for linking your data: Decide what bits of your data are most important, such as Subject, Author, and Publisher. Use URIs in your data, whenever possible, such as Subject terms from AGROVOC. Then publish your data in RDF on the Web where others can link to it. Simple solutions can be enough to yield good results.

Tom Baker of the Dublin Core Metadata Initiative will be an excellent speaker but when I saw:

Tom Baker on “Linking your resources to the Data Web”

my first thoughts were of another Tom Baker and wondering how he had gotten involved with Linked Data. 😉

In the body of the announcement, a URL identifies the “Tom Baker” in the text as a different “Tom Baker” from the one I was thinking about.

Interesting. It didn’t take Linked Data or RDF to make the distinction, only the <a> element plus an href attribute. Something to think about.

November 26, 2012

UILLD 2013 — User interaction built on library linked data

Filed under: Interface Research/Design,Library,Linked Data,Usability,Users — Patrick Durusau @ 4:48 pm

UILLD 2013: Workshop on User interaction built on library linked data (UILLD), a pre-conference to the 79th World Library and Information Congress, Jurong Regional Library, Singapore.

Important Dates:

Paper submission deadline: February 28, 2013
Acceptance notification: May 15, 2013
Camera-ready versions of accepted papers: June 30, 2013
Workshop date: August 16, 2013

From the webpage:

The quantity of Linked Data published by libraries is increasing dramatically: Following the lead of the National Library of Sweden (2008), several libraries and library networks have begun to publish authority files and bibliographic information as linked (open) data. However, applications that consume this data are not yet widespread. Particularly, there is a lack of methods for integration of Linked Data from multiple sources and its presentation in appropriate end user interfaces. Existing services tend to build on one or two well integrated datasets – often from the same data supplier – and do not actively use the links provided to other datasets within or outside of the library or cultural heritage sector to provide a better user experience.

CALL FOR PAPERS

The main objective of this workshop/pre-conference is to provide a platform for discussion of deployed services, concepts, and approaches for consuming Linked Data from libraries and other cultural heritage institutions. Special attention will be given to papers presenting working end user interfaces using Linked Data from both cultural heritage institutions (including libraries) and other datasets.

For further information about the workshop, please contact the workshops chairs at uilld2013@gmail.com

In connection with this workshop, see also: IFLA World Library and Information Congress 79th IFLA General Conference and Assembly.

I first saw this in a tweet by Ivan Herman.
