Archive for the ‘TMQL4J’ Category

Scalability of Topic Map Systems

Saturday, April 28th, 2012

Scalability of Topic Map Systems, thesis by Marcel Hoyer.

Abstract:

The purpose of this thesis was to find approaches solving major performance and scalability issues for Topic Maps-related data access and the merging process. Especially regarding the management of multiple, heterogeneous topic maps with different sizes and structures. Hence the scope of the research was mainly focused on the Maiana web application with its underlying MaJorToM and TMQL4J back-end.

In the first instance the actual problems were determined by profiling the application runtime, creating benchmarks and discussing the current architecture of the Maiana stack. By presenting different distribution technologies afterwards the issues around a single-process instance, slow data access and concurrent request handling were investigated to determine possible solutions. Next to technological aspects (i. e. frameworks or applications) this discussion included fundamental reflection of design patterns for distributed environments that indicated requirements for changes in the use of the Topic Maps API and data flow between components. With the development of the JSON Topic Maps Query Result format and simple query-focused interfaces the essential concept for an prototypical implementation was established. To concentrate on scalability for query processing basic principles and benefits of message-oriented middleware were presented. Those were used in combination with previous results to create a distributed Topic Maps query service and to present ideas about optimizing virtual merging of topic maps.

Finally this work gave multiple insights to improve the architecture and performance of Topic Maps-related applications by depicting concrete bottlenecks and providing prototypical implementations that show the feasibility of the approaches. But it also pointed out remaining performance issues in the persisting data layer.

I have just started reading Marcel’s thesis but I am already impressed by the evaluation of Maiana. I am sure this work will be useful in planning options for future topic map stacks.

Commend it to you for reading and discussion, perhaps on the relatively quiet topic map discussion lists?

TMQL4J 3.1 Released

Tuesday, April 19th, 2011

TMQL4J 3.1 Released

From the release notes:

  • new behavior of roles axis
    • forward navigation return the roles of an association
    • backward navigation returns the association acts as parent of a role
  • new behavior of players axis
    • forward navigation return the players of a role or all roles of the association
    • backward navigation returns the roles played by the topic
  • new roletypes axis
    • forward navigation results in the role types of an association
    • backward navigation results in the associations having a role with this type
  • new datatype axis
  • new update operator ‘REMOVE’ to update clause
    • avaible for: names, occurrences, characteristics, locators, indicators, item, scope, topics, types, supertypes, subtypes, instances
  • characteristics added as alias for names and occurrences modification
  • add variants anchor as update context
  • new method @IResultSet
    • toTopicMap: If the result set supports this operation, it will return a topic map copy containing only the topics and association contained in
    • toCTM: If the result set supports this operation, it will return a CTM string or stream containing only the topics and association contained in
    • toXTM: If the result set supports this operation, it will return an XTM string or stream containing only the topics and association contained in
    • toJTMQR: If the result set supports this operation, it will return a JTMQR string or stream containing only the topics and association contained in
  • CTMResult
    • rename method resultsAsMergedCTM to toCTM
    • rename method resultsAsTopicMap to toTopicMap
  • XMLResult
    • rename method resultsAsMergedXML to toXML
    • add stream variant of method toXML
  • the arguments on update roles are modified argument before operator is the role type, the value after is the player
  • update-clause allow any value-expression in front of the operator
  • moving some classes
  • new function fn:max expecting two arguments
    • context of count and counts
    • e.g.: fn:max ( // tm:subject , fn:count ( . / tm:name )) to get the maximum number of names a topic instance contains
  • new function fn:min expecting two arguments
    • context of count and counts
    • e.g.: fn:min ( // tm:subject , fn:count ( . / tm:name )) to get the minimum number of names a topic instance contains
  • value-expression supports boolean-expression to return true or false
  • the update handler checks the value of occ and variants according to the datatype (validate the value for the given datatype)
    • a pragma was added to disable this functionality datatype-validation
  • the update of variants and occurrences will reuse the datatype instead of setting string automatically, like TMAPI do
    • a pragma was added to disable this functionality datatype-binding
  • allow prepared argument add new positions:
    • as optional axis argument
    • as part of update value

More usefully, you could install a copy of TMQL4J 3.1 and take a look at: TMQL4J Documentation and Tutorials.

TMQL Canonizer

Friday, April 15th, 2011

TMQL Canonizer

This is a new service from the Topic Maps Lab but absent any documentation, it is hard to say what to expect from it.

For example, I took a query from the rather excellent TMQL tutorials by Sven Krosse (also of the Topic Maps Lab):

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF $topic ISA o:composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

Fed it to the canonizer and got this result:

QueryExpression([%prefix, o, http://psi.ontopia.net/music/, FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–EnvironmentClause([%prefix, o, http://psi.ontopia.net/music/])
| |–PrefixDirective([%prefix, o, http://psi.ontopia.net/music/])
|–FlwrExpression([FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–ForClause([FOR, $topic, IN, //, tm:subject])
| |–BindingSet([$topic, IN, //, tm:subject])
| |–VariableAssignment([$topic, IN, //, tm:subject])
| |–Variable([$topic])
| |–Content([//, tm:subject])
| |–QueryExpression([//, tm:subject])
| |–PathExpression([//, tm:subject])
| |–PostfixedExpression([//, tm:subject])
| |–SimpleContent([//, tm:subject])
| |–Anchor([tm:subject])
| |–Navigation([<<, types]) | |--StepDefinition([<<, types]) | |--Step([<<, types]) |--ReturnClause([RETURN, IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–Content([IF, $topic, ISA, o:composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, ISA, o:composer])
| |–ISAExpression([$topic, ISA, o:composer])
| |–SimpleContent([$topic])
| | |–Anchor([$topic])
| |–SimpleContent([o:composer])
| |–Anchor([o:composer])
|–Content([$topic, >>, indicators])
| |–QueryExpression([$topic, >>, indicators])
| |–PathExpression([$topic, >>, indicators])
| |–PostfixedExpression([$topic, >>, indicators])
| |–SimpleContent([$topic, >>, indicators])
| |–Anchor([$topic])
| |–Navigation([>>, indicators])
| |–StepDefinition([>>, indicators])
| |–Step([>>, indicators])
|–Content([$topic, /, tm:name, [, 0, ]])
|–QueryExpression([$topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, /, tm:name, [, 0, ]])
|–PostfixedExpression([$topic, /, tm:name, [, 0, ]])
|–SimpleContent([$topic, /, tm:name, [, 0, ]])
|–Anchor([$topic])
|–Navigation([/, tm:name, [, 0, ]])
|–StepDefinition([>>, characteristics, tm:name])
| |–Step([>>, characteristics, tm:name])
| |–Anchor([tm:name])
|–StepDefinition([>>, atomify, [, 0, ]])
|–Step([>>, atomify])
|–FilterPostfix([[, 0, ]])
|–Anchor([0])

OK, so I omitted the prefix on composer for the following query:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF $topic ISA composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

Then I get:

QueryExpression([%prefix, o, http://psi.ontopia.net/music/, FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–EnvironmentClause([%prefix, o, http://psi.ontopia.net/music/])
| |–PrefixDirective([%prefix, o, http://psi.ontopia.net/music/])
|–FlwrExpression([FOR, $topic, IN, //, tm:subject, RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–ForClause([FOR, $topic, IN, //, tm:subject])
| |–BindingSet([$topic, IN, //, tm:subject])
| |–VariableAssignment([$topic, IN, //, tm:subject])
| |–Variable([$topic])
| |–Content([//, tm:subject])
| |–QueryExpression([//, tm:subject])
| |–PathExpression([//, tm:subject])
| |–PostfixedExpression([//, tm:subject])
| |–SimpleContent([//, tm:subject])
| |–Anchor([tm:subject])
| |–Navigation([<<, types]) | |--StepDefinition([<<, types]) | |--Step([<<, types]) |--ReturnClause([RETURN, IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–Content([IF, $topic, ISA, composer, THEN, $topic, >>, indicators, ELSE, $topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, ISA, composer])
| |–ISAExpression([$topic, ISA, composer])
| |–SimpleContent([$topic])
| | |–Anchor([$topic])
| |–SimpleContent([composer])
| |–Anchor([composer])
|–Content([$topic, >>, indicators])
| |–QueryExpression([$topic, >>, indicators])
| |–PathExpression([$topic, >>, indicators])
| |–PostfixedExpression([$topic, >>, indicators])
| |–SimpleContent([$topic, >>, indicators])
| |–Anchor([$topic])
| |–Navigation([>>, indicators])
| |–StepDefinition([>>, indicators])
| |–Step([>>, indicators])
|–Content([$topic, /, tm:name, [, 0, ]])
|–QueryExpression([$topic, /, tm:name, [, 0, ]])
|–PathExpression([$topic, /, tm:name, [, 0, ]])
|–PostfixedExpression([$topic, /, tm:name, [, 0, ]])
|–SimpleContent([$topic, /, tm:name, [, 0, ]])
|–Anchor([$topic])
|–Navigation([/, tm:name, [, 0, ]])
|–StepDefinition([>>, characteristics, tm:name])
| |–Step([>>, characteristics, tm:name])
| |–Anchor([tm:name])
|–StepDefinition([>>, atomify, [, 0, ]])
|–Step([>>, atomify])
|–FilterPostfix([[, 0, ]])
|–Anchor([0])

So then I enter a query that omits the “$” from the second instance of topic:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF topic ISA o:composer
THEN $topic >> indicators
ELSE $topic / tm:name [0]

You can enter that one for yourself. No substantive change in result.

By omitting the “$” from all instances of topic I was finally able to get “an invalid expression” result.

Do note that the following is treated as a valid expression:

%prefix o http://psi.ontopia.net/music/
FOR $topic IN // tm:subject
RETURN
IF topic ISA o:composer
THEN topic >> indicators
ELSE topic / tm:name [0]

A bit more attention to documentation would go a long way to making this a useful project.

*****
PS: From the 2008 TMQL draft:

Examples for invalid variables are x (sigil missing),

TMQL4J 3.0 release!

Thursday, February 17th, 2011

TMQL4J 3.0 release!

From the website:

The new version 3.0.0 of the tmql4j query suite was released at google code. In this version tmql4j is more flexible and powerful to satisfy every business use case.

The new version 3.0.0 of the tmql4j query suite was released at google code. In this version tmql4j is more flexible and powerful to satisfy every business use case.

As a major modification, the engine architecture and processing model was changed. The new suite contains two different TMQL runtimes, one for each TMQL draft. The drafts are split to avoid ambiguity and conflicts during the querying process. The stack-based processing model is replaced by a more flexible one to enable multi-threaded optimizations.

Each style of the 2008 draft and each part of the topic map modification language ( TMQL-ML ) has been realized in different modules. Because of that, the user can decide which styles and parts of the query language should be supported.

In addition, a new language module was added to enable flexible template definitions, which enables control of the result format of the querying process in the most powerful way. Templates can be used to return results in HTML, XML, JSON or any other format. The results will be embedded automatically by the query processor.

Looking forward to reviewing the documentation. Quite possibly posting some additional exercise material.