Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 15, 2014

Wandora – New Version [TMQL]

Filed under: TMQL,Topic Map Software,Topic Maps,Wandora — Patrick Durusau @ 7:23 pm

Wandora – New Version

From the webpage:

It is over six months since last Wandora release. Now we are finally ready to publish new version with some very interesting new features. Release 2014-04-15 features TMQL support and embedded HTML browser, for example. TMQL is the topic map query language and Wandora allows the user to search, query and modify topics and associations with TMQL scripts. Embedded HTML browser expands Wandora’s internal visualizations repertoire. Wandora embedded HTTP server services are now available inside the Wandora application….

Change Log, Download.

Two of the biggest changes:

Download your copy today!

I will post a review by mid-May, 2014.

Interested to hear your comments, questions and suggestions in the meantime.

BTW, the first suggestion I have is that the download file should NOT be wandora.zip but rather wandora-(date).zip if nothing else. Ditto for the source files and javadocs.

January 16, 2013

Optimizing TM Queries?

Filed under: Query Language,TMQL,XML,XQuery — Patrick Durusau @ 7:56 pm

A recent paper by V. Benzaken, G. Castagna, D. Colazzo, and K. Nguyễn, Optimizing XML querying using type-based document projection, suggests some interesting avenues for optimizing topic map queries.

Abstract:

XML data projection (or pruning) is a natural optimization for main memory query engines: given a query Q over a document D, the subtrees of D that are not necessary to evaluate Q are pruned, thus producing a smaller document D′; the query Q is then executed on D′, hence avoiding to allocate and process nodes that will never be reached by Q.

In this article, we propose a new approach, based on types, that greatly improves current solutions. Besides providing comparable or greater precision and far lesser pruning overhead, our solution ―unlike current approaches― takes into account backward axes, predicates, and can be applied to multiple queries rather than just to single ones. A side contribution is a new type system for XPath able to handle backward axes. The soundness of our approach is formally proved. Furthermore, we prove that the approach is also complete (i.e., yields the best possible type-driven pruning) for a relevant class of queries and Schemas. We further validate our approach using the XMark and XPathMark benchmarks and show that pruning not only improves the main memory query engine’s performances (as expected) but also those of state of the art native XML databases.

Phrased in traditional XML terms but imagine pruning a topic map by topic or association types, for example, before execution of a query.

While it is true enough that a query could include topic type, there remains the matter of examining all the instances of that topic type before proceeding to the rest of the query.

For common query sub-maps as it were, I suspect that to prune once and store the results could be a viable alternative.
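To illustrate the "prune once and store" idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `TOPICS` data, the `prune_by_types` and `names_starting_with` helpers, and the use of an in-memory cache. It is not the type-based projection from the paper, only the simplest form of filtering by topic type before a query runs.

```python
from functools import lru_cache

# Hypothetical toy topic map: (topic id, topic type, name).
TOPICS = (
    ("puccini", "composer", "Giacomo Puccini"),
    ("verdi", "composer", "Giuseppe Verdi"),
    ("tosca", "opera", "Tosca"),
    ("la-scala", "theatre", "Teatro alla Scala"),
)


@lru_cache(maxsize=None)
def prune_by_types(types):
    """Return the sub-map holding only topics of the given types.

    `types` must be hashable (a frozenset) so the result can be cached
    and reused by later queries over the same sub-map.
    """
    return tuple(topic for topic in TOPICS if topic[1] in types)


def names_starting_with(types, prefix):
    """Run a simple query against the pruned sub-map, not the full map."""
    submap = prune_by_types(frozenset(types))
    return [name for _, _, name in submap if name.startswith(prefix)]


result = names_starting_with({"composer"}, "G")
print(result)
```

A second query over the same set of types reuses the stored sub-map instead of scanning every topic again, which is the trade the paragraph above suggests for common query sub-maps.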

Despite the graphic chart enhancement from processing millions or billions of nodes, processing the right set of nodes and producing a useful answer has its supporters.

April 28, 2012

Scalability of Topic Map Systems

Filed under: MaJorToM,TMQL,TMQL4J,Topic Map Software,Topic Maps — Patrick Durusau @ 6:06 pm

Scalability of Topic Map Systems, thesis by Marcel Hoyer.

Abstract:

The purpose of this thesis was to find approaches solving major performance and scalability issues for Topic Maps-related data access and the merging process. Especially regarding the management of multiple, heterogeneous topic maps with different sizes and structures. Hence the scope of the research was mainly focused on the Maiana web application with its underlying MaJorToM and TMQL4J back-end.

In the first instance the actual problems were determined by profiling the application runtime, creating benchmarks and discussing the current architecture of the Maiana stack. By presenting different distribution technologies afterwards the issues around a single-process instance, slow data access and concurrent request handling were investigated to determine possible solutions. Next to technological aspects (i. e. frameworks or applications) this discussion included fundamental reflection of design patterns for distributed environments that indicated requirements for changes in the use of the Topic Maps API and data flow between components. With the development of the JSON Topic Maps Query Result format and simple query-focused interfaces the essential concept for an prototypical implementation was established. To concentrate on scalability for query processing basic principles and benefits of message-oriented middleware were presented. Those were used in combination with previous results to create a distributed Topic Maps query service and to present ideas about optimizing virtual merging of topic maps.

Finally this work gave multiple insights to improve the architecture and performance of Topic Maps-related applications by depicting concrete bottlenecks and providing prototypical implementations that show the feasibility of the approaches. But it also pointed out remaining performance issues in the persisting data layer.

I have just started reading Marcel’s thesis but I am already impressed by the evaluation of Maiana. I am sure this work will be useful in planning options for future topic map stacks.

Commend it to you for reading and discussion, perhaps on the relatively quiet topic map discussion lists?

November 25, 2011

Topic Map Query Language (TMQL) – Last Draft

Filed under: TMQL — Patrick Durusau @ 4:25 pm

Topic Map Query Language (TMQL) – Last Draft

The last draft of TMQL – ISO/IEC 18048, has been posted to the SC 34 repository.

There will be no further drafts of the TMQL standard unless and until WG 3 has sufficient resources to take up work in this area in the future.

September 30, 2011

Extending tolog

Filed under: Query Language,TMQL,tolog — Patrick Durusau @ 7:04 pm

Extending tolog by Lars Marius Garshol.

Abstract:

This paper describes a number of extensions that might be made to the tolog query language for topic maps in order to make it fulfill all the requirements for the ISO-standardized TMQL (Topic Map Query Language). First, the lessons to be learned from the considerable body of research into the Datalog query language are considered. Finally, a number of different extensions to the existing tolog query language are considered and evaluated.

This paper extends and improves on earlier work on tolog, first described in [Garshol01].

As you can see from some recent posts here, Datalog research continues!

July 4, 2011

Translating SPARQL queries into SQL using R2RML

Filed under: R2RML,SPARQL,SQL,TMQL — Patrick Durusau @ 6:04 pm

Translating SPARQL queries into SQL using R2RML

From the post:

The efficient translation of SPARQL into SQL is an active field of research in the academy and in the industry. In fact, a number of triple stores are built as a layer on top of a relational solution. Support for SPARQL in these RDF stores supposes the translation of the SPARQL query to a SQL query that can be executed in a certain relational schema.

Some foundational papers in the field include “A Relational Algebra for SPARQL” by Richard Cyganiak that translates the semantics of SPARQL as they were finally defined by the W3C to the Relational Algebra semantics or “Semantics preserving SPARQL-to-SQL translation” by Chebotko, Lu and Fotouhi, that introduces an algorithm to translate SPARQL queries to SQL queries.

This latter paper is specially interesting because the translation mechanism is parametric on the underlying relational schema. This makes possible to adapt their translation mechanism to any relational database using a couple of mapping functions, alpha and beta, that map a triple pattern of the SPARQL query and a triple pattern and a position in the triple to a table and a column in the database.

Provided that R2RML offers a generic mechanism for the description of relational databases, in order to support SPARQL queries in any R2RML RDF graph, we just need to find an algorithm that receives as an input the R2RML mapping and builds the mapping functions required by Chebotko et alter algorithm.

The straightest way to accomplished that is using the R2RML mapping to generate a virtual table with a single relation with only subject, predicate and object. The mapping for this table is trivial. A possible implementation of this algorithm can be found in the following Clojure code. (I added links to the Cyganiak and Chebotko papers.)
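To make the single-table idea concrete, here is a minimal Python sketch. It is not the Clojure code from the post and not the Chebotko et al. algorithm; it only shows how a basic graph pattern could be turned into SQL over one `triples(s, p, o)` table. In the vocabulary of the quoted post, the table mapping always answers `triples` and the column mapping answers `s`, `p` or `o` by position. The function name `bgp_to_sql`, the table layout and the sample data are all assumptions.

```python
import sqlite3


def bgp_to_sql(patterns):
    """Translate a list of (s, p, o) triple patterns into one SQL query.

    Terms starting with '?' are variables; anything else is a constant.
    Each pattern becomes one alias of the `triples` table.
    """
    selects, wheres, params, seen = [], [], [], {}
    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in seen:
                    # A repeated variable becomes a join condition.
                    wheres.append(f"{ref} = {seen[term]}")
                else:
                    seen[term] = ref
                    selects.append(f"{ref} AS {term[1:]}")
            else:
                # A constant becomes a bound parameter.
                wheres.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(selects) or '1'} FROM {tables}"
    if wheres:
        sql += " WHERE " + " AND ".join(wheres)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:puccini", "rdf:type", "ex:Composer"),
        ("ex:puccini", "ex:name", "Puccini"),
        ("ex:tosca", "rdf:type", "ex:Opera"),
        ("ex:tosca", "ex:name", "Tosca"),
    ],
)

sql, params = bgp_to_sql(
    [("?x", "rdf:type", "ex:Composer"), ("?x", "ex:name", "?name")]
)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The sketch handles only conjunctions of triple patterns. OPTIONAL, FILTER, UNION and the mapping from a real R2RML document are left out.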

I recommend this post, as well as the Cyganiak and Chebotko papers to anyone interested in TMQL as background reading. Other suggestions?

June 8, 2011

Assignment Ops – TempleScript

Filed under: TMQL — Patrick Durusau @ 10:21 am

A posting on assignment ops by Robert Barta to PASTEBIN, which I reproduce here for your reading pleasure:

# templescript assignment ops
"Robert" => name \ rho # unconditional, replaces all names
"rho" => nickname \ rho # unconditional, replaces only nicknames
"der-rho-ist" +> nickname \ rho # unconditional, adds
() => name \ rho # unconditional, removes all names
#--
"Robert" ||> name \ rho # conditional, assigns only if no name existed

This reminds me of a post I need to finish on suggested comparison operations.

May 16, 2011

Semagia Oomap Loomap

Filed under: Oomap Loomap,TMQL — Patrick Durusau @ 3:32 pm

Semagia Oomap Loomap

From Lars Heuer:

GUI for Topic Maps query languages.

Supported languages:

Similar projects:
* TMQL Console <https://github.com/mhoyer/tmql-console>
* Tamana <https://code.google.com/a/eclipselabs.org/p/tamana/>

May 4, 2011

TMQL4J Documentation and Tutorials

Filed under: TMQL — Patrick Durusau @ 12:09 pm

TMQL4J Documentation and Tutorials

With all the activity on TMQL, consoles and the like, thought it would be good to draw attention to the TMQL4J documentation.

If anyone is interested in writing additional beginner articles or tutorials, this would be a good place to start.

April 19, 2011

TMQL4J 3.1 Released

Filed under: TMQL,TMQL4J — Patrick Durusau @ 9:42 am

TMQL4J 3.1 Released

From the release notes:

  • new behavior of roles axis
    • forward navigation return the roles of an association
    • backward navigation returns the association acts as parent of a role
  • new behavior of players axis
    • forward navigation return the players of a role or all roles of the association
    • backward navigation returns the roles played by the topic
  • new roletypes axis
    • forward navigation results in the role types of an association
    • backward navigation results in the associations having a role with this type
  • new datatype axis
  • new update operator ‘REMOVE’ to update clause
    • available for: names, occurrences, characteristics, locators, indicators, item, scope, topics, types, supertypes, subtypes, instances
  • characteristics added as alias for names and occurrences modification
  • add variants anchor as update context
  • new method @IResultSet
    • toTopicMap: If the result set supports this operation, it will return a topic map copy containing only the topics and association contained in
    • toCTM: If the result set supports this operation, it will return a CTM string or stream containing only the topics and association contained in
    • toXTM: If the result set supports this operation, it will return an XTM string or stream containing only the topics and association contained in
    • toJTMQR: If the result set supports this operation, it will return a JTMQR string or stream containing only the topics and association contained in
  • CTMResult
    • rename method resultsAsMergedCTM to toCTM
    • rename method resultsAsTopicMap to toTopicMap
  • XMLResult
    • rename method resultsAsMergedXML to toXML
    • add stream variant of method toXML
  • the arguments on update roles are modified argument before operator is the role type, the value after is the player
  • update-clause allow any value-expression in front of the operator
  • moving some classes
  • new function fn:max expecting two arguments
    • context of count and counts
    • e.g.: fn:max ( // tm:subject , fn:count ( . / tm:name )) to get the maximum number of names a topic instance contains
  • new function fn:min expecting two arguments
    • context of count and counts
    • e.g.: fn:min ( // tm:subject , fn:count ( . / tm:name )) to get the minimum number of names a topic instance contains
  • value-expression supports boolean-expression to return true or false
  • the update handler checks the value of occ and variants according to the datatype (validate the value for the given datatype)
    • a pragma was added to disable this functionality datatype-validation
  • the update of variants and occurrences will reuse the datatype instead of setting string automatically, like TMAPI do
    • a pragma was added to disable this functionality datatype-binding
  • allow prepared argument add new positions:
    • as optional axis argument
    • as part of update value

More usefully, you could install a copy of TMQL4J 3.1 and take a look at: TMQL4J Documentation and Tutorials.
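For readers without a TMQL4J install at hand, here is a rough Python sketch of what the two-argument `fn:max` and `fn:min` examples in the release notes appear to compute: evaluate a count for each item in a context and return the largest or smallest. This is my reading of the examples, not TMQL4J code. The `topic_names` data and the `fn_max` and `fn_min` helpers are assumptions.

```python
# Hypothetical topics mapped to their names.
topic_names = {
    "puccini": ["Giacomo Puccini", "Puccini"],
    "verdi": ["Giuseppe Verdi"],
    "tosca": [],
}


def fn_max(contexts, count):
    """Largest value of count(c) over all context items c."""
    return max(count(c) for c in contexts)


def fn_min(contexts, count):
    """Smallest value of count(c) over all context items c."""
    return min(count(c) for c in contexts)


def name_count(topic):
    """Stand-in for fn:count ( . / tm:name ) on one topic."""
    return len(topic_names[topic])


most = fn_max(topic_names, name_count)
least = fn_min(topic_names, name_count)
print(most, least)
```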

April 15, 2011

TMQL Canonizer

Filed under: TMQL,TMQL4J — Patrick Durusau @ 6:34 am

TMQL Canonizer

This is a new service from the Topic Maps Lab but absent any documentation, it is hard to say what to expect from it.

For example, I took a query from the rather excellent TMQL tutorials by Sven Krosse (also of the Topic Maps Lab):

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF $topic ISA o:composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

Fed it to the canonizer and got this result:

QueryExpression([%prefix, o, http://psi.ontopia.net/music/, FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–EnvironmentClause([%prefix, o, http://psi.ontopia.net/music/])
| |–PrefixDirective([%prefix, o, http://psi.ontopia.net/music/])
|–FlwrExpression([FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–ForClause([FOR, $topic, IN, //, tm:subject])
| |–BindingSet([$topic, IN, //, tm:subject])
| |–VariableAssignment([$topic, IN, //, tm:subject])
| |–Variable([$topic])
| |–Content([//, tm:subject])
| |–QueryExpression([//, tm:subject])
| |–PathExpression([//, tm:subject])
| |–PostfixedExpression([//, tm:subject])
| |–SimpleContent([//, tm:subject])
| |–Anchor([tm:subject])
| |–Navigation([<<, types])
| |–StepDefinition([<<, types])
| |–Step([<<, types])
|–ReturnClause([RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–Content([IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, ISA, o:composer])
| |–ISAExpression([$topic, ISA, o:composer])
| |–SimpleContent([$topic])
| | |–Anchor([$topic])
| |–SimpleContent([o:composer])
| |–Anchor([o:composer])
|–Content([$topic, >>, indicators])
| |–QueryExpression([$topic, >>, indicators])
| |–PathExpression([$topic, >>, indicators])
| |–PostfixedExpression([$topic, >>, indicators])
| |–SimpleContent([$topic, >>, indicators])
| |–Anchor([$topic])
| |–Navigation([>>, indicators])
| |–StepDefinition([>>, indicators])
| |–Step([>>, indicators])
|–Content([$topic, /, tm:name, [, 0, ]])
|–QueryExpression([$topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, /, tm:name, [, 0, ]])
|–PostfixedExpression([$topic, /, tm:name, [, 0, ]])
|–SimpleContent([$topic, /, tm:name, [, 0, ]])
|–Anchor([$topic])
|–Navigation([/, tm:name, [, 0, ]])
|–StepDefinition([>>, characteristics, tm:name])
| |–Step([>>, characteristics, tm:name])
| |–Anchor([tm:name])
|–StepDefinition([>>, atomify, [, 0, ]])
|–Step([>>, atomify])
|–FilterPostfix([[, 0, ]])
|–Anchor([0])

OK, so I omitted the prefix on composer for the following query:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF $topic ISA composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

Then I get:

QueryExpression([%prefix, o, http://psi.ontopia.net/music/, FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–EnvironmentClause([%prefix, o, http://psi.ontopia.net/music/])
| |–PrefixDirective([%prefix, o, http://psi.ontopia.net/music/])
|–FlwrExpression([FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–ForClause([FOR, $topic, IN, //, tm:subject])
| |–BindingSet([$topic, IN, //, tm:subject])
| |–VariableAssignment([$topic, IN, //, tm:subject])
| |–Variable([$topic])
| |–Content([//, tm:subject])
| |–QueryExpression([//, tm:subject])
| |–PathExpression([//, tm:subject])
| |–PostfixedExpression([//, tm:subject])
| |–SimpleContent([//, tm:subject])
| |–Anchor([tm:subject])
| |–Navigation([<<, types])
| |–StepDefinition([<<, types])
| |–Step([<<, types])
|–ReturnClause([RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–Content([IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, ISA, composer])
| |–ISAExpression([$topic, ISA, composer])
| |–SimpleContent([$topic])
| | |–Anchor([$topic])
| |–SimpleContent([composer])
| |–Anchor([composer])
|–Content([$topic, >>, indicators])
| |–QueryExpression([$topic, >>, indicators])
| |–PathExpression([$topic, >>, indicators])
| |–PostfixedExpression([$topic, >>, indicators])
| |–SimpleContent([$topic, >>, indicators])
| |–Anchor([$topic])
| |–Navigation([>>, indicators])
| |–StepDefinition([>>, indicators])
| |–Step([>>, indicators])
|–Content([$topic, /, tm:name, [, 0, ]])
|–QueryExpression([$topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, /, tm:name, [, 0, ]])
|–PostfixedExpression([$topic, /, tm:name, [, 0, ]])
|–SimpleContent([$topic, /, tm:name, [, 0, ]])
|–Anchor([$topic])
|–Navigation([/, tm:name, [, 0, ]])
|–StepDefinition([>>, characteristics, tm:name])
| |–Step([>>, characteristics, tm:name])
| |–Anchor([tm:name])
|–StepDefinition([>>, atomify, [, 0, ]])
|–Step([>>, atomify])
|–FilterPostfix([[, 0, ]])
|–Anchor([0])

So then I enter a query that omits the “$” from the second instance of topic:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF topic ISA o:composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

You can enter that one for yourself. No substantive change in result.

By omitting the “$” from all instances of topic I was finally able to get “an invalid expression” result.

Do note that the following is treated as a valid expression:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF topic ISA o:composer
THEN topic >> indicators
ELSE topic / tm:name [0]

A bit more attention to documentation would go a long way to making this a useful project.

*****
PS: From the 2008 TMQL draft:

Examples for invalid variables are x (sigil missing),
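As a rough illustration of the check I was expecting, here is a small Python heuristic that flags bare uses of a name that was declared elsewhere in the query with a sigil. It is not how the Canonizer works (I have no documentation for that) and it is not a TMQL parser. The function name `missing_sigils` and the regular expressions are assumptions.

```python
import re


def missing_sigils(query):
    """Return (line number, name) pairs where a declared variable is used bare.

    A name counts as declared if it appears anywhere with a '$' sigil.
    Prefixed identifiers such as tm:name or o:composer are skipped.
    """
    declared = sorted(set(re.findall(r"\$(\w+)", query)))
    hits = []
    for line_no, line in enumerate(query.splitlines(), start=1):
        for name in declared:
            # Bare use: not preceded by '$', ':' or a word character,
            # and not followed by a word character or ':'.
            if re.search(rf"(?<![$\w:]){re.escape(name)}(?![\w:])", line):
                hits.append((line_no, name))
    return hits


suspect = """%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF topic ISA o:composer
THEN topic >> indicators
ELSE topic / tm:name [0]"""

clean = suspect.replace("IF topic", "IF $topic") \
               .replace("THEN topic", "THEN $topic") \
               .replace("ELSE topic", "ELSE $topic")

print(missing_sigils(suspect))
print(missing_sigils(clean))
```

Run against the last query above, the heuristic reports lines 4, 5 and 6, which is where I would have expected a complaint.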

April 2, 2011

EuPathDB

Filed under: Bioinformatics,Biomedical,TMQL — Patrick Durusau @ 5:29 pm

EuPathDB

From the website:

EuPathDB Bioinformatics Resource Center for Biodefense and Emerging/Re-emerging Infectious Diseases is a portal for accessing genomic-scale datasets associated with the eukaryotic pathogens (Cryptosporidium, Encephalitozoon, Entamoeba, Enterocytozoon, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma).

OK, other than being cited in the previous post about integration using ontologies, why is this relevant to topic maps?

Check out the web tutorial on search strategies.

Now imagine being able to select/revise/view results for a TMQL query.

It would take some work and no doubt be domain specific, but I thought the example would be worth bringing to your attention.

Not to mention that these are data sets where improved access using topic maps could attract attention.

March 25, 2011

TMQL Slides for Prague (was incorrectly Leipzig)!

Filed under: TMQL — Patrick Durusau @ 4:32 pm

TMQL Slides for Prague are now available!

Rani Pinchuk prepared slides for discussion in Prague next week.

We should all be appreciative and use this opportunity to provide useful feedback to the editors.

That does not imply that anyone will agree with any particular point but it is possible to express disagreement in a polite way.

I will try to remind myself of that as much as anyone else. 😉

******
Apologies for the incorrect title! Thanks Benjamin!

March 1, 2011

String Syntax – Post

Filed under: String Matching,TMQL — Patrick Durusau @ 10:13 am

String Syntax

Since TMQL discussions are starting up, it seems appropriate to point out at least one resource on string syntax.

BTW, I am assuming that the TMQL draft will not have any fewer capabilities than any of the extant implementations?

Is anyone assuming differently?

TMQL Slides for Prague 2011

Filed under: TMQL,TMRM,Topic Maps — Patrick Durusau @ 10:12 am

TMQL Slides for Prague 2011

TMQL slides with discussion points for Prague have been posted!

Please review even if you don’t plan on attending the Prague meeting to offer your comments and questions.

Comments and questions I am sure are always welcome, but are more useful if received prior to weeks if not months of preparing standards prose.

Since I ask, I have several questions (some of which will probably have to be answered post-Prague):

1st Question:

While I understand the utility of the illustrated syntax reflected on the slides, I am more concerned with the underlying formal model for TMQL. Syntax and its explanation for users is very important, but that can take many forms. Can you say a bit more about the formal model that underlies TMQL?

2nd Question:

See my blog post on Indexing by Properties. To what extent is TMQL going to support the use of multiple properties (occurrences) for purposes of identification?

3rd Question:

What datatypes will be supported by TMQL? How are additional datatypes declared?

4th Question:

What comparison operators are supported by TMQL?

February 17, 2011

TMQL4J 3.0 release!

Filed under: TMQL,TMQL4J — Patrick Durusau @ 6:59 am

TMQL4J 3.0 release!

From the website:

The new version 3.0.0 of the tmql4j query suite was released at google code. In this version tmql4j is more flexible and powerful to satisfy every business use case.

As a major modification, the engine architecture and processing model was changed. The new suite contains two different TMQL runtimes, one for each TMQL draft. The drafts are split to avoid ambiguity and conflicts during the querying process. The stack-based processing model is replaced by a more flexible one to enable multi-threaded optimizations.

Each style of the 2008 draft and each part of the topic map modification language ( TMQL-ML ) has been realized in different modules. Because of that, the user can decide which styles and parts of the query language should be supported.

In addition, a new language module was added to enable flexible template definitions, which enables control of the result format of the querying process in the most powerful way. Templates can be used to return results in HTML, XML, JSON or any other format. The results will be embedded automatically by the query processor.

Looking forward to reviewing the documentation. Quite possibly posting some additional exercise material.

January 21, 2011

Feldspar: A System for Finding Information by Association

Filed under: Associations,Query Language,TMQL,Visual Query Language — Patrick Durusau @ 5:28 pm

Feldspar: A System for Finding Information by Association

…use non-specific requirements to find specific things.

Uses associations to build queries.

Associations developed by Google Desktop.

Very cool!

January 7, 2011

Provenance for Aggregate Queries

Filed under: Aggregation,Merging,Query Language,TMQL — Patrick Durusau @ 7:19 am

Provenance for Aggregate Queries Authors: Yael Amsterdamer, Daniel Deutch, Val Tannen

Abstract:

We study in this paper provenance information for queries with aggregation. Provenance information was studied in the context of various query languages that do not allow for aggregation, and recent work has suggested to capture provenance by annotating the different database tuples with elements of a commutative semiring and propagating the annotations through query evaluation. We show that aggregate queries pose novel challenges rendering this approach inapplicable. Consequently, we propose a new approach, where we annotate with provenance information not just tuples but also the individual values within tuples, using provenance to describe the values computation. We realize this approach in a concrete construction, first for “simple” queries where the aggregation operator is the last one applied, and then for arbitrary (positive) relational algebra queries with aggregation; the latter queries are shown to be more challenging in this context. Finally, we use aggregation to encode queries with difference, and study the semantics obtained for such queries on provenance annotated databases.

Not for the faint of heart reading.

But, provenance for merging is one obvious application of this paper.

For that matter, provenance should also be a consideration for TMQL.
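For readers new to the idea, here is a minimal Python sketch of the tuple-annotation approach the abstract describes as earlier work: each tuple carries an annotation, joins multiply annotations and projections add them. It is not the construction proposed in the paper, which annotates individual values to cope with aggregation. The relation contents, the helper names `join` and `project`, and the string form of the annotations are assumptions.

```python
from operator import add, mul


def join(r1, r2, times):
    """Join two annotated relations on their first column."""
    out = {}
    for (k1, *rest1), a1 in r1.items():
        for (k2, *rest2), a2 in r2.items():
            if k1 == k2:
                out[(k1, *rest1, *rest2)] = times(a1, a2)
    return out


def project(r, columns, plus):
    """Keep the given column positions, combining duplicates with plus."""
    out = {}
    for tup, ann in r.items():
        key = tuple(tup[c] for c in columns)
        out[key] = plus(out[key], ann) if key in out else ann
    return out


# Symbolic annotations: every source tuple gets its own token.
emp = {("sales", "alice"): "e1", ("sales", "bob"): "e2", ("hr", "carol"): "e3"}
loc = {("sales", "london"): "d1", ("hr", "paris"): "d2"}


def plus(x, y):
    return f"({x} + {y})"


def times(x, y):
    return f"({x} * {y})"


by_city = project(join(emp, loc, times), [2], plus)
print(by_city)

# The same query over the counting semiring: how many derivations per city.
counts = project(join({t: 1 for t in emp}, {t: 1 for t in loc}, mul), [2], add)
print(counts)
```

Read "merged topic" for "result tuple" and the symbolic annotation becomes a record of which source topics went into a merge.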

December 11, 2010

Cascalog

Filed under: Cascalog,Clojure,Hadoop,TMQL — Patrick Durusau @ 3:23 pm

Cascalog

From the website:

Cascalog is a tool for processing data on Hadoop with Clojure in a concise and expressive manner. Cascalog combines two cutting edge technologies in Clojure and Hadoop and resurrects an old one in Datalog. Cascalog is high performance, flexible, and robust.

Most query languages, like SQL, Pig, and Hive, are custom languages — and this leads to huge amounts of accidental complexity. Constructing queries dynamically by doing string manipulation is an impedance mismatch and makes usual programming techniques like abstraction and composition difficult.

Cascalog queries are first-class within Clojure and are extremely composable. Additionally, the Datalog syntax of Cascalog is simpler and more expressive than SQL-based languages.

Follow the getting started steps, check out the tutorial, and you’ll be running Cascalog queries on your local computer within 5 minutes.

Seems like I have heard the term Datalog in TMQL discussions. 😉

I wonder what it would be like to define TMQL operators in Cascalog so that all the other capabilities of Cascalog are also available?

When the next draft appears that will be an interesting question to explore.

December 8, 2010

Aspects of Topic Maps

Writing about Bobo: Fast Faceted Search With Lucene, made me start to think about the various aspects of topic maps.

Authoring of topic maps is something that was never discussed in the original HyTime based topic map standard and despite several normative syntaxes, mostly even now it is either you have a topic map, or you don’t. Depending upon your legend.

Which is helpful given the unlimited semantics that can be addressed with topic maps but looks awfully hand-wavy to, ahem, outsiders.

Subject Identity or should I say: when two subject representatives are deemed for some purpose to represent the same subject. (That’s clearer. ;-)) This lies at the heart of topic maps and the rest of the paradigm supports or is a consequence of this principle.

There is no one way to identify any subject and users should be free to use the identification that suits them best. Where subjects include the data structures that we build for users. Yes, IT doesn’t get to dictate what subjects can be identified or how. (Probably should have never been the case but that is another issue.)

Merging of subject representatives. Merging is an aspect of recognizing two or more subject representatives represent the same subject. What happens then is implementation, data model and requirement specific.

A user may wish to see separate representatives just prior to merger so merging can be audited or may wish to see only merged representatives for some subset of subjects or may have other requirements.
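As one possible reading of that, here is a minimal Python sketch of merging that keeps an audit trail. It assumes a single identity rule (two representatives merge if they share any subject identifier), which is only one of many rules a user might choose. The dictionary layout, the `example.org` identifiers and the function name `merge_by_identifier` are all hypothetical.

```python
def merge_by_identifier(representatives):
    """Merge representatives that share at least one subject identifier.

    Each merged result keeps a `sources` list so the merge can be audited.
    """
    merged = []
    for rep in representatives:
        ids = set(rep["identifiers"])
        names = set(rep["names"])
        sources = [rep["source"]]
        remaining = []
        for group in merged:
            if group["identifiers"] & ids:
                # Fold the overlapping group into the new one.
                ids |= group["identifiers"]
                names |= group["names"]
                sources = group["sources"] + sources
            else:
                remaining.append(group)
        merged = remaining + [
            {"identifiers": ids, "names": names, "sources": sources}
        ]
    return merged


reps = [
    {
        "source": "map-a",
        "identifiers": {"http://psi.example.org/puccini"},
        "names": {"Puccini"},
    },
    {
        "source": "map-b",
        "identifiers": {
            "http://psi.example.org/puccini",
            "http://example.org/people/giacomo-puccini",
        },
        "names": {"Giacomo Puccini"},
    },
    {
        "source": "map-c",
        "identifiers": {"http://psi.example.org/verdi"},
        "names": {"Verdi"},
    },
]

result = merge_by_identifier(reps)
for group in result:
    print(sorted(group["names"]), group["sources"])
```

Showing `sources` alongside the merged names is one way to let a user audit a merge. Returning the unmerged `reps` for a chosen subset is another.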

Interchange of topic maps. Not exclusively the domain of syntaxes/data models but an important purpose for them. It is entirely possible to have topic maps for which no interchange is intended or desirable. Rumor has it of the topic maps at the Y-12 facility at Oak Ridge for example. Interchange was not their purpose.

Navigation of the topic map. The post that provoked this one is a good example. I don’t need specialized or monolithic software to navigate a topic map. It hampers topic map development to suggest otherwise.

Querying topic maps. Topic maps have been slow to develop a query language and that effort has recently re-started. Graph query languages, that are already fairly mature, may be sufficient for querying topic maps.

Given the diversity of subject identity semantics, I don’t foresee a one size fits all topic maps query language.

Interfaces for topic maps. However one resolves/implements other aspects of topic maps, due regard has to be paid to the issue of interfaces. Efforts thus far range from web portals to “look, it’s a topic map!” type interfaces.

In the defense of current efforts, human-computer interfaces are poorly understood. Not surprising since the human-codex interface isn’t completely understood and we have been working at that one considerably longer.

Questions:

  1. What other aspects to topic maps would you list?
  2. Would you sub-divide any of these aspects? If so, how?
  3. What suggestions do you have for one or more of these aspects?

December 3, 2010

Neo4j 1.2 Milestone 5 – Reference Manual and HA! – Post (Portends for TMQL?)

Filed under: Graphs,Neo4j,Query Language,TMQL — Patrick Durusau @ 9:27 am

Neo4j 1.2 Milestone 5 – Reference Manual and HA!

News of the release of a reference manual for Neo4j and a High Availability option (the HA in the title).

I know it is a reference manual but I was disappointed there was no mention of topic maps.

Surprising I know but it still happens. 😉

Guess I need to try to find the cycles to generate, collaborate on, etc., some documentation that can be posted to the topic maps community for review.

Assuming it passes muster there, it can be passed along to the Neo4j project.

BTW, I found a “related” article listed for Neo4j that starts off:

A multi-relational graph maintains two or more relations over a vertex set. This article defines an algebra for traversing such graphs that is based on an n-ary relational algebra, a concatenative single-relational path algebra, and a tensor-based multi-relational algebra. The presented algebra provides a monoid, automata, and formal language theoretic foundation for the construction of a multi-relational graph traversal engine.

Can’t you just hear Robert saying that with a straight face? 😉

Seriously, if we are going to compete with enterprise grade solutions, that is the level of thinking that needs to underlie TMQL.

It is going to require effort on all our parts but “good enough” solutions aren’t and should not be supported.

December 1, 2010

Semantic Overlay Networks for P2P Systems

Filed under: Semantic Overlay Network,TMQL — Patrick Durusau @ 1:06 pm

Semantic Overlay Networks for P2P Systems Authors: Garcia-Molina, Hector and Crespo, Arturo

Date: 2003

Abstract:

In a peer-to-peer (P2P) system, nodes typically connect to a small set of random nodes (their neighbors), and queries are propagated along these connections. Such query flooding tends to be very expensive. We propose that node connections be influenced by content, so that for example, nodes having many “Jazz” files will connect to other similar nodes. Thus, semantically related nodes form a Semantic Overlay Network (SON). Queries are routed to the appropriate SONs, increasing the chances that matching files will be found quickly, and reducing the search load on nodes that have unrelated content. We have evaluated SONs by using an actual snapshot of music-sharing clients. Our results show that SONs can significantly improve query performance while at the same time allowing users to decide what content to put in their computers and to whom to connect.

The root article for the term Semantic Overlay Network that I mentioned last summer, Semantic Overlay Networks.

The emphasis on query and query efficiency seems particularly relevant for work on TMQL.
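The routing idea in the abstract can be sketched in a few lines. This is only a toy illustration, assuming each node has already been labeled with the content categories it holds; the names `build_sons` and `nodes_to_query` are hypothetical and the paper's actual classification and assignment strategies are not reproduced here.

```python
from collections import defaultdict


def build_sons(node_content):
    """Group nodes into overlay networks, one per content category.

    node_content maps a node id to the set of categories it holds
    (an assumed, pre-computed labeling).
    """
    sons = defaultdict(set)
    for node, categories in node_content.items():
        for category in categories:
            sons[category].add(node)
    return dict(sons)


def nodes_to_query(sons, category):
    """Return only the nodes in the overlay matching the query's category.

    Flooding would contact every node; routing to the matching overlay
    skips nodes whose content is unrelated.
    """
    return set(sons.get(category, set()))


peers = {
    "n1": {"Jazz"},
    "n2": {"Jazz", "Rock"},
    "n3": {"Rock"},
    "n4": {"Country"},
}
overlays = build_sons(peers)
jazz_nodes = nodes_to_query(overlays, "Jazz")
```

With these four peers, a "Jazz" query touches two nodes instead of all four, which is the reduction in search load the abstract describes.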

October 29, 2010

TMQL Notes from Leipzig

Filed under: Information Retrieval,TMQL,Topic Maps — Patrick Durusau @ 4:48 am

TMQL language proposal – apart from Path Language have been posted to the SC 34 document repository for your review and comments!

Deeply appreciate Lars Marius Garshol leading the discussion.

Now is the time for your comments and suggestions.

Even better, trial implementations of present and requested features.

One of the best ways to argue for a feature is to show it in working code.

Or even better, when applied to show results not otherwise available.

September 12, 2010

Cartesian Products and Topic Maps

Filed under: TMQL,TMRM — Patrick Durusau @ 6:50 pm

Using SQL Cross Join – the report writers secret weapon is a very clear explanation of the utility of cross-joins in SQL.

Cross-join = Cartesian product, something you will remember from the Topic Maps Reference Model.

Makes a robust where clause look important, doesn’t it?
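A small sketch of the point, using Python's built-in `sqlite3` module. The table names `sales_region` and `sales_month` and the function `cross_join_demo` are made up for illustration and do not come from the linked article.

```python
import itertools
import sqlite3


def cross_join_demo(regions, months):
    """Return (all_pairs, filtered_pairs) from a cross join of two tiny tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_region (name TEXT)")
    conn.execute("CREATE TABLE sales_month (name TEXT)")
    conn.executemany("INSERT INTO sales_region VALUES (?)", [(r,) for r in regions])
    conn.executemany("INSERT INTO sales_month VALUES (?)", [(m,) for m in months])

    base = (
        "SELECT r.name, m.name "
        "FROM sales_region AS r CROSS JOIN sales_month AS m"
    )
    # No where clause: every region is paired with every month.
    all_pairs = conn.execute(base).fetchall()
    # A where clause trims the product down to the rows of interest.
    filtered = conn.execute(base + " WHERE r.name = ?", (regions[0],)).fetchall()
    conn.close()
    return all_pairs, filtered


regions = ["North", "South"]
months = ["Jan", "Feb", "Mar"]
all_pairs, filtered = cross_join_demo(regions, months)

# The unfiltered cross join is exactly the Cartesian product of the two sets.
same_as_product = set(all_pairs) == set(itertools.product(regions, months))
```

The row count of the unfiltered join is the product of the table sizes, so two large tables without a where clause grow very quickly.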

September 5, 2010

Experience in Extending Query Engine for Continuous Analytics

Filed under: Data Integration,Data Mining,SQL,TMQL,Uncategorized — Patrick Durusau @ 4:37 pm

Experience in Extending Query Engine for Continuous Analytics by Qiming Chen and Meichun Hsu has this problem statement:

Streaming analytics is a data-intensive computation chain from event streams to analysis results. In response to the rapidly growing data volume and the increasing need for lower latency, Data Stream Management Systems (DSMSs) provide a paradigm shift from the load-first analyze-later mode of data warehousing….

Moving from load-first analyze-later has implications for topic maps over data warehouses, particularly when events that are subjects may have only a transient existence in a data stream.

This is on my reading list to prepare to discuss TMQL in Leipzig.

PS: Only five days left to register for TMRA 2010. It is a don’t miss event.

August 9, 2010

TMRA – WG 3 Meeting!

Filed under: Conferences,TMQL,Topic Maps — Patrick Durusau @ 6:30 pm

The JTC 1/SC 34 WG 3 (ok, Topic Maps) working group will be meeting two days before TMRA starts in Leipzig, Germany! That is 27-28 September 2010. (Location details forthcoming.)

The main focus of the meeting will be TMQL.

Make it a week in Leipzig!

July 30, 2010

The Syntactic Web: Syntax and Semantics on the Web

Filed under: TMQL — Patrick Durusau @ 3:55 pm

The Syntactic Web: Syntax and Semantics on the Web by Jonathan Robie describes the use of XQuery to query both RDF and topic maps.

I ran across it while I was getting a Markup Language journal set ready for auction at Balisage.

Given Jonathan’s depth of experience with query languages, it is something to help decide what the community wants from TMQL.

July 29, 2010

Complete TMQL Tutorial!

Filed under: TMQL — Patrick Durusau @ 6:52 pm

Complete TMQL Tutorial!

A second plug for the TMQL tutorials from the Topic Maps Lab.

Work through the tutorials and discuss what you like/don’t like on one of the topic map mailing lists!

Get in shape for the TMQL discussions at TMRA!

If you don’t speak up, others will have no opinions but their own about what the topic map community wants.

The more opinions we have, the richer the result for the community.

******
PS: Please send feedback to Sven Krosse and favorable feedback to his director, Dr. Lutz Maicher.

😉

June 28, 2010

TMQL Tutorials – Announcement

Filed under: Examples,TMQL — Patrick Durusau @ 10:24 am

Topic Maps Lab is releasing a five (5) part series of tutorials on TMQL! [Sorry, 2010-07-07: now reported to be eight (8) parts. I suspect that will change too. 😉]

Will update this list as other parts appear.

If you are logged into Maiana you can do all the exercises there.

The tutorials are in German so either you can improve your technical German, or translate them for yourself and the community.

*****

On a personal note, we have long discussed how somebody ought to do something to better promote topic maps. Well, several people are doing something. A lot of somethings. The question we have to ask ourselves (not others, ourselves), is how we can contribute to those efforts or make other contributions?

June 18, 2010

TMQL4J suite 2.6.3 Released

Filed under: Search Engines,TMQL,Topic Map Software — Patrick Durusau @ 8:31 am

The Topic Maps Lab is becoming a hotbed of topic map software development.

TMQL4J 2.6.3 was released this week with the following features:

  • New query factory – now it is possible to implement your own query types. If the query provides a transformation algorithm, it may be converted to a TMQL query and processed by the tmql4j engine.
  • New language processing – the two core modules ( the lexical scanner and the parser ) were rewritten to become more flexible and stable. The lexical scanner provides new methods to register your own language tokens ( as language extension ) or your own non-canonical tokens.
  • Default prefix – the engine provides the functionality of defining a default prefix in the context of the runtime. The prefix can be used without a specific pattern in the context of a query.
  • New interfaces – the interfaces were reorganized to enable an intuitive usage and understanding of the engine itself.

Plus a plugin architecture with plugins for Tmql4Ontopia, TmqlDraft2010, and TopicMapModificationLanguage. See the announcement for the details.

See also TMQL4J Documentation and Tutorials.

Interested in your experiences with the interfaces which “…enable an intuitive usage and understanding of the engine itself.”
