Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 14, 2013

Querying rich text with Lux

Filed under: Lucene,Query Language,XML,XQuery — Patrick Durusau @ 11:17 am

Querying rich text with Lux – XQuery for Lucene by Michael Sokolov.

Slide deck that highlights features of Lux, which is billed on its homepage as:

The XML Search Engine

Lux is an open source XML search engine formed by fusing two excellent technologies: the Apache Lucene/Solr search index and the Saxon XQuery/XSLT processor.

Not surprisingly, I am in favor of using XML to provide context for data.

You can get a better feel for Lux by:

Reading Indexing Queries in Lux by Michael Sokolov (Balisage 2013)

Visiting the Lux homepage: http://luxdb.org

Downloading Lux Source: http://github.com/msokolov/lux

BTW, Michael does have experience with XML-based content: safaribooksonline.com, oed.com, degruyter.com, oxfordreference.com and others.

PS: Remember any comments on XQuery 3.0 are due by November 19, 2013.

October 22, 2013

X* 3.0 Proposed Recommendations

Filed under: XML,XPath,XQuery,XSLT — Patrick Durusau @ 8:01 pm

XQuery 3.0, XPath 3.0, Data Model, Functions and Operators and XSLT and XQuery Serialization 3.0

From the post:

The XML Query Working Group and the XSLT Working Group have published five Proposed Recommendations today:

Comments are welcome through 19 November. Learn more about the Extensible Markup Language (XML) Activity.

What’s today? October 22nd?

You almost have 30 days. 😉

Which one or more are you going to read?

I first saw this in a tweet by Jonathan Robie.

August 7, 2013

BaseX 7.7 has been released!

Filed under: BaseX,XML,XPath,XQuery — Patrick Durusau @ 6:27 pm

BaseX 7.7 has been released!

From the webpage:

BaseX is a very light-weight, high-performance and scalable XML Database engine and XPath/XQuery 3.0 Processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into your XML documents.

To maximize your productivity and workflows, we offer professional support, highly customized software solutions and individual trainings on XML, XQuery and BaseX. Our product itself is completely Open Source (BSD-licensed) and platform independent; join our mailing lists to get regular updates!

But most important: BaseX runs out of the box and is easy to use…

This was a fortunate find. I have some XML work coming up and need to look at the latest offerings.

June 18, 2013

Lux (0.9 – New Release)

Filed under: Indexing,Lux,XML,XQuery — Patrick Durusau @ 12:46 pm

Lux – The XML Search Engine

From the webpage:

Lux is an open source XML search engine formed by fusing two excellent technologies: the Apache Lucene/Solr search index and the Saxon XQuery/XSLT processor.

Release notes for 0.9 (released today)

This looks quite promising!

May 28, 2013

Four and Twenty < / > ! Baked in a Pie…

Filed under: Conferences,XML,XML Database,XML Query Rewriting,XML Schema,XQuery,XSLT — Patrick Durusau @ 2:53 pm

Balisage 2013 program is online!

From Tommie Usdin’s email:

Balisage is an annual conference devoted to the theory and practice of descriptive markup and related technologies for structuring and managing information. Participants typically include XML users, librarians, archivists, computer scientists, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, Topic-Map enthusiasts, semantic-Web evangelists, members of the working groups which define the specifications, academics, industrial researchers, representatives of governmental bodies and NGOs, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Discussion is open, candid, and unashamedly technical.

Major features of this year’s program include several challenges to the fundamental infrastructure of XML; case studies from government, academia, and publishing; approaches to overlapping data structures; discussions of XML’s political fortunes; and technical papers on XML, XForms, XQuery, REST, XSLT, RDF, XSL-FO, XSD, the DOM, JSON, and XPath.

Attending Balisage even once will keep you from repeating mistakes in language design.

Attending Balisage twice will mark you as a markup expert.

Attending Balisage three or more times, well, this is an open channel so we can’t go there.

But you should go to Balisage!

Send your pics from Saint Catherine Street!

March 16, 2013

Lux

Filed under: Lucene,Saxon,Solr,XQuery,XSLT — Patrick Durusau @ 7:51 pm

Lux

From the readme:

Lux is an open source XML search engine formed by fusing two excellent technologies: the Apache Lucene/Solr search index and the Saxon XQuery/XSLT processor.

At its core, Lux provides XML-aware indexing, an XQuery 1.0 optimizer that rewrites queries to use the indexes, and a function library for interacting with Lucene via XQuery. These capabilities are tightly integrated with Solr, and leverage its application framework in order to deliver a REST service and application server.

The REST service is accessible to applications written in almost any language, but it will be especially convenient for developers already using Solr, for whom Lux operates as a Solr plugin that provides query services using the same REST APIs as other Solr search plugins, but using a different query language (XQuery). XML documents may be inserted (and updated) using standard Solr REST calls: XML-aware indexing is triggered by the presence of an XML-aware field in a document. This means that existing application frameworks written in many different languages are positioned to use Lux as a drop-in capability for indexing and querying semi-structured content.

The application server is a great way to get started with Lux: it provides the ability to write a complete application in XQuery and XSLT with data storage backed by Lucene.
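To make the readme's point about standard Solr REST calls concrete, here is a minimal sketch in Python of the kind of XML update message Solr accepts (an `<add>` element wrapping a `<doc>` of `<field>` elements). The field names `id` and `xml_text`, the document URI, and the helper `build_solr_add` are assumptions for illustration only; the readme does not name the XML-aware field, so check the Lux documentation for the real one. Posting the message to a Solr update handler is left out.

```python
import xml.etree.ElementTree as ET


def build_solr_add(doc_id, xml_text, id_field="id", xml_field="xml_text"):
    """Build a Solr XML update message for one document.

    id_field and xml_field are hypothetical placeholders, not names
    taken from Lux; substitute the fields your schema defines.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name=id_field).text = doc_id
    # The XML payload travels as escaped text inside a single field.
    ET.SubElement(doc, "field", name=xml_field).text = xml_text
    return ET.tostring(add, encoding="unicode")


message = build_solr_add("/books/1.xml", "<book><title>Lux</title></book>")
print(message)
```

The payload is escaped on the way in and recovered intact when the message is parsed, which is all an XML-aware field would need in order to index it.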

If you are looking for experience with XQuery and Lucene/Solr, look no further!

May be a good excuse for me to look at defining equivalence statements using XQuery.

I first saw this in a tweet by Michael Kay.

February 19, 2013

“…XML User Interfaces” As in Using XML?

Filed under: Conferences,Interface Research/Design,XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 1:00 pm

International Symposium on Native XML user interfaces

This came across the wire this morning and I need your help interpreting it.

Why would you want to have an interface to XML?

All these years I have been writing XML in Emacs because XML wasn’t supposed to have an interface.

Brave hearts, male, female and unknown, struggling with issues too obscure for mere mortals.

Now I find that isn’t supposed to be so? You can imagine my reaction.

I moved my laptop a bit closer to the peat fire to make sure I read it properly. Waiting for the ox cart later this week to take my complaint to the local bishop about this disturbing innovation.

😉

15 March 2013 – Peer review applications due
19 April 2013 – Paper submissions due
19 April 2013 – Applications for student support awards due
21 May 2013 – Speakers notified
12 July 2013 – Final papers due
5 August 2013 – International Symposium on Native XML user interfaces
6–9 August 2013 – Balisage: The Markup Conference

International Symposium on
Native XML user interfaces

Monday August 5, 2013 Hotel Europa, Montréal, Canada

XML is everywhere. It is created, gathered, manipulated, queried, browsed, read, and modified. XML systems need user interfaces to do all of these things. How can we make user interfaces for XML that are powerful, simple to use, quick to develop, and easy to maintain?

How are we building user interfaces today? How can we build them tomorrow? Are we using XML to drive our user interfaces? How?

This one-day symposium is devoted to the theory and practice of user interfaces for XML: the current state of implementations, practical case studies, challenges for users, and the outlook for the future development of the technology.

Relevant topics include:

  • Editors customized for specific purposes or users
  • User interfaces for creation, management, and use of XML documents
  • Uses of XForms
  • Making tools for creation of XML textual documents
  • Using general-purpose user-interface libraries to build XML interfaces
  • Looking at XML, especially looking at masses of XML documents
  • XML, XSLT, and XQuery in the browser
  • Specialized user interfaces for specialized tasks
  • XML vocabularies for user-interface specification

Presentations can take a variety of forms, including technical papers, case studies, and tool demonstrations (technical overviews, not product pitches).

This is the same conference I wrote about in: Markup Olympics (Balisage) [No Drug Testing].

In times of lean funding for conferences, if you go to a conference this year, it really should be Balisage.

You will be the envy of your co-workers and have tales to tell your grandchildren.

Not bad for one conference registration fee.

January 16, 2013

Optimizing TM Queries?

Filed under: Query Language,TMQL,XML,XQuery — Patrick Durusau @ 7:56 pm

A recent paper by V. Benzaken, G. Castagna, D. Colazzo, and K. Nguyễn, Optimizing XML querying using type-based document projection, suggests some interesting avenues for optimizing topic map queries.

Abstract:

XML data projection (or pruning) is a natural optimization for main memory query engines: given a query Q over a document D, the subtrees of D that are not necessary to evaluate Q are pruned, thus producing a smaller document D′; the query Q is then executed on D′, hence avoiding to allocate and process nodes that will never be reached by Q.

In this article, we propose a new approach, based on types, that greatly improves current solutions. Besides providing comparable or greater precision and far lesser pruning overhead, our solution ―unlike current approaches― takes into account backward axes, predicates, and can be applied to multiple queries rather than just to single ones. A side contribution is a new type system for XPath able to handle backward axes. The soundness of our approach is formally proved. Furthermore, we prove that the approach is also complete (i.e., yields the best possible type-driven pruning) for a relevant class of queries and Schemas. We further validate our approach using the XMark and XPathMark benchmarks and show that pruning not only improves the main memory query engine’s performances (as expected) but also those of state of the art native XML databases.

Phrased in traditional XML terms but imagine pruning a topic map by topic or association types, for example, before execution of a query.

While it is true enough that a query could include topic type, there remains the matter of examining all the instances of topic type before proceeding to the rest of the query.

For common query sub-maps as it were, I suspect that to prune once and store the results could be a viable alternative.
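As a rough illustration of pruning before querying, here is a minimal sketch in Python. It is not the type-based algorithm from the paper: it simply keeps elements whose names appear in a "needed" set, plus the ancestors required to reach them. The XML vocabulary, the `project` helper and the `needed` set are all invented for the example and are not XTM.

```python
import xml.etree.ElementTree as ET


def project(elem, needed):
    """Return a pruned copy of elem, or None if nothing in it is needed.

    An element survives if its tag is in `needed` or if any descendant
    survives. Tail text is dropped, which is fine for this sketch.
    """
    kept = [c for c in (project(child, needed) for child in elem) if c is not None]
    if elem.tag not in needed and not kept:
        return None
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text = elem.text
    copy.extend(kept)
    return copy


SOURCE = """
<topicMap>
  <topic id="t1" type="person"><name>Ada</name><occurrence>long text</occurrence></topic>
  <topic id="t2" type="place"><name>Montreal</name></topic>
  <association type="born-in"><member ref="t1"/><member ref="t2"/></association>
</topicMap>
"""

root = ET.fromstring(SOURCE)
# Prune once for queries that only touch topics and their names.
pruned = project(root, needed={"topic", "name"})
names = [n.text for n in pruned.findall("./topic/name")]
print(names)
```

The pruned tree could be serialized with `ET.tostring` and stored, so that later queries over the same sub-map start from the smaller document.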

Despite the graphic chart enhancement from processing millions or billions of nodes, processing the right set of nodes and producing a useful answer has its supporters.

January 15, 2013

XQuery 3.0: An XML Query Language [Subject Identity Equivalence Language?]

Filed under: Identity,XML,XQuery — Patrick Durusau @ 8:32 pm

XQuery 3.0: An XML Query Language – W3C Candidate Recommendation

Abstract:

XML is a versatile markup language, capable of labeling the information content of diverse data sources including structured and semi-structured documents, relational databases, and object repositories. A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware. This specification describes a query language called XQuery, which is designed to be broadly applicable across many types of XML data sources.

Just starting to read the XQuery CR but the thought occurred to me that it could be a basis for a “subject identity equivalence language.”

Rather than duplicating the work on expressions, paths, data types, operators, etc., why not take all that as given?

Suffice it to define a “subject equivalence function,” the variables of which are XQuery statements that identify values (or value expressions) as required, optional or forbidden and the definition of the results of the function.

Reusing a well-tested query language seems preferable to writing an entirely new one from scratch.
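To make the idea of a subject equivalence function a little more concrete, here is a minimal sketch in Python. It uses the limited XPath support in the standard library as a stand-in for XQuery statements. The semantics of "required", "optional" and "forbidden" below are one possible reading, not a definition, and the element names and the `equivalent` helper are invented for the example.

```python
import xml.etree.ElementTree as ET


def values(subject, path):
    """Collect the non-empty text values selected by path."""
    return {e.text.strip() for e in subject.findall(path) if e.text and e.text.strip()}


def equivalent(a, b, required=(), optional=(), forbidden=()):
    """Decide whether two subject representations are equivalent.

    Assumed reading of the three roles:
      forbidden: a value on either side blocks equivalence
      required:  both sides must share at least one value
      optional:  values must overlap only if both sides have them
    """
    for path in forbidden:
        if values(a, path) or values(b, path):
            return False
    for path in required:
        if not (values(a, path) & values(b, path)):
            return False
    for path in optional:
        va, vb = values(a, path), values(b, path)
        if va and vb and not (va & vb):
            return False
    return True


a = ET.fromstring("<person><email>ada@example.org</email><name>Ada</name></person>")
b = ET.fromstring("<person><email>ada@example.org</email><name>A. Lovelace</name></person>")
c = ET.fromstring("<person><email>ada@example.org</email><disputed>yes</disputed></person>")

by_email = equivalent(a, b, required=["email"])
by_email_and_name = equivalent(a, b, required=["email"], optional=["name"])
with_forbidden = equivalent(a, c, required=["email"], forbidden=["disputed"])
print(by_email, by_email_and_name, with_forbidden)
```

In an XQuery-based version the paths would be full XQuery expressions and the result definition would be richer than a boolean, but the required/optional/forbidden split would carry over.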

Suggestions?

I first saw this in a tweet by Michael Kay.

January 10, 2013

Markup Olympics (Balisage) [No Drug Testing]

Filed under: Conferences,XML,XML Database,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 1:46 pm

Markup athletes take heart! Unlike venues that intrude into the personal lives of competitors, there are no, repeat no drug tests for presenters at Balisage!

Fear no trainer betrayals or years of being dogged by second-raters in the press.

Eat, drink, visit, ???, present, in the company of your peers.

The more traditional call for participation, yawn, has the following details:

Dates:

15 March 2013 – Peer review applications due
19 April 2013 – Paper submissions due
19 April 2013 – Applications for student support awards due
21 May 2013 – Speakers notified
12 July 2013 – Final papers due

5 August 2013 – Pre-conference Symposium on XForms
6-9 August 2013 – Balisage: The Markup Conference

From the call:

Balisage is where people interested in descriptive markup meet each year in August for informed technical discussion, occasionally impassioned debate, good coffee, and the incomparable ambience of one of North America’s greatest cities, Montreal. We welcome anyone interested in discussing the use of descriptive markup to build strong, lasting information systems.

Practitioner or theorist, tool-builder or tool-user, student or lecturer — you are invited to submit a paper proposal for Balisage 2013. As always, papers at Balisage can address any aspect of the use of markup and markup languages to represent information and build information systems. Possible topics include but are not limited to:

  • XML and related technologies
  • Non-XML markup languages
  • Big Data and XML
  • Implementation experience with XML parsing, XSLT processors, XQuery processors, XML databases, XProc integrations, or any markup-related technology
  • Semantics, overlap, and other complex fundamental issues for markup languages
  • Case studies of markup design and deployment
  • Quality of information in markup systems
  • JSON and XML
  • Efficiency of Markup Software
  • Markup systems in and for the mobile web
  • The future of XML and of descriptive markup in general
  • Interesting applications of markup

In addition, please consider becoming a Peer Reviewer. Reviewers play a critical role towards the success of Balisage. They review blind submissions — on topics that interest them — for technical merit, interest, and applicability. Your comments and recommendations can assist the Conference Committee in creating the program for Balisage 2013!

How:

More IQ per square foot than any other conference you will attend in 2013!

December 21, 2012

BaseX. The XML Database. [XPath/XQuery]

Filed under: Editor,XML,XQuery — Patrick Durusau @ 11:08 am

BaseX. The XML Database.

From the webpage:

News: BaseX 7.5 has just been released…

BaseX is a very light-weight, high-performance and scalable XML Database engine and XPath/XQuery 3.0 Processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into your XML documents.

Another XML editor but I mention it for its support of XQuery more than as an editor per se.

We continue to lack a standard query language for topic maps and experience with XQuery may prove informative.

Not to mention its possible role in gathering diverse data for presentation in a merged state to users.

November 22, 2012

Teiid (8.2 Final Released!) [Component for TM System]

Filed under: Data Integration,Federation,Information Integration,JDBC,SQL,Teiid,XQuery — Patrick Durusau @ 11:16 am

Teiid

From the homepage:

Teiid is a data virtualization system that allows applications to use data from multiple, heterogenous data stores.

Teiid is comprised of tools, components and services for creating and executing bi-directional data services. Through abstraction and federation, data is accessed and integrated in real-time across distributed data sources without copying or otherwise moving data from its system of record.

Teiid Parts

  • Query Engine: The heart of Teiid is a high-performance query engine that processes relational, XML, XQuery and procedural queries from federated datasources. Features include support for homogenous schemas, heterogenous schemas, transactions, and user defined functions.
  • Embedded: An easy-to-use JDBC Driver that can embed the Query Engine in any Java application. (as of 7.0 this is not supported, but on the roadmap for future releases)
  • Server: An enterprise ready, scalable, manageable runtime for the Query Engine that runs inside JBoss AS and provides additional security, fault-tolerance, and administrative features.
  • Connectors: Teiid includes a rich set of Translators and Resource Adapters that enable access to a variety of sources, including most relational databases, web services, text files, and LDAP. Need data from a different source? Custom translators and resource adapters can easily be developed.
  • Tools:

Teiid 8.2 final was released on November 20, 2012.

Like most integration services, not strong on integration between integration services.

Would make one helluva component for a topic map system.

A system with an inter-integration solution mapping layer in addition to the capabilities of Teiid.

November 20, 2012

Balisage 2013 – Dates/Location

Filed under: Conferences,XML,XML Database,XML Query Rewriting,XML Schema,XPath,XQuery,XSLT,XTM — Patrick Durusau @ 3:19 pm

Tommie Usdin just posted email with the Balisage 2013 dates and location:

Montreal, Hotel Europa, August 5 – 9 , 2013

Hope that works with everything else.

That’s the entire email so I don’t know what was meant by:

Hope that works with everything else.

Short of it being your own funeral, open-heart surgery or giving birth (to your first child), I am not sure what “everything else” there could be?

You get a temporary excuse for the second two cases and a permanent excuse for the first one.

Now’s a good time to hint about plane fare plus hotel and expenses for Balisage as a stocking stuffer.

And to wish a happy holiday to Tommie Usdin and to all the folks at Mulberry Technology who make Balisage possible for all of us. Each and every one.

October 2, 2012

JSONiq

Filed under: JSON,JSONiq,XQuery — Patrick Durusau @ 7:18 pm

JSONiq: The JSON Query Language

From the webpage:

JSONiq extends XQuery, a mature W3C standard, with native JSON support. Like XQuery and SQL, JSONiq is declarative: Expressions can nest with full composability.

Project, Filter, Join, Group… Like SQL, JSONiq can do all that. And it has many more features inherited from XQuery. JSONiq also inherits all XQuery builtin functions: date times, string manipulation, regular expressions, and more.

JSONiq is an expressive and highly optimizable language to query and update NoSQL stores. It enables developers to leverage the same productive high-level language across a variety of NoSQL products.

This came in over the nosql-discuss mailing list a day or so ago.

Sounds promising. Any early comments?

June 25, 2012

Show Me The Money!

Filed under: Conferences,XBRL,XML,XPath,XQuery — Patrick Durusau @ 2:28 pm

I need to talk to Tommie Usdin about marketing the Balisage conference.

The final program came out today and here is what Tommie had to say:

When the regular (peer-reviewed) part of the Balisage 2012 program was scheduled, a few slots were reserved for presentation of “Late breaking” material. These presentations have now been selected and added to the program.

Topics added include:

  • making robust and multi-platform ebooks
  • creating representative documents from large document collections
  • validating RESTful services using XProc, XSLT, and XSD
  • XML for design-based (e.g. magazine) publishing
  • provenance in XSLT transformation (tracking what XSLT does to documents)
  • literate programming
  • managing the many XML-related standards and specifications
  • leveraging XML for web applications

The program already included talks about adding RDF to TEI documents, compression of XML documents, exploring large XML collections, Schematron, relation of XML to JSON, overlap, higher-order functions in XSLT, the balance between XML and non-XML notations, and many other topics. Now it is a real must for anyone who thinks deeply about markup.

Balisage is the XML Geek-fest; the annual gathering of people who design markup and markup-based applications; who develop XML specifications, standards, and tools; the people who read and write books about publishing technologies in general and XML in particular; and super-users of XML and related technologies. You can read about the Balisage 2011 conference at http://www.balisage.net.

Yawn. Are we there yet? 😉

Why you should care about XML and Balisage:

  • The US government and others are publishing laws and regulations, and soon legislative material, in XML
  • Securities are increasingly using XML for required government reports
  • Texts and online data sets are being made available in XML
  • All the major document formats are based in XML

A $billion here, a $billion there and pretty soon you are talking about real business opportunity.

Your un-Balisaged XML developers have $1,000 bills blowing overhead.

Be smart, make your XML developers imaginative and productive.

Send your XML developers to Balisage.

(http://www.balisage.net/registration.html)

June 21, 2012

BaseX 7.3 (The Summer Edition) is now available!

Filed under: BaseX,XML,XML Database,XML Schema,XPath,XQuery — Patrick Durusau @ 7:47 am

BaseX 7.3 (The Summer Edition) is now available!

From the post:

we are glad to announce a great new release of BaseX, our XML database and XPath/XQuery 3.0 processor! Here are the latest features:

  • Many new internal XQuery Modules have been added, and existing ones have been revised to ensure long-term stability of your future XQuery applications
  • A new powerful Command API is provided to specify BaseX commands and scripts as XML
  • The full-text fuzzy index was extended to also support wildcard queries
  • The simple map operator of XQuery 3.0 gives you a compact syntax to process items of sequences
  • BaseX as Web Application can now start its own server instance
  • All command-line options will now be executed in the given order
  • Charles Foster’s latest XQJ Driver supports XQuery 3.0 and the Update and Full Text extensions

For those of you in the Northern Hemisphere, we wish you a nice summer! No worries, we’ll stay busy..

Just in time for the start of summer in the Northern Hemisphere!

Something you can toss onto your laptop before you head to the beach.

Err, huh? Well, even if you don’t take BaseX 7.3 to the beach, it promises to be good fun for the summer and more serious work should the occasion arise.

I count twenty-three (23) modules in addition to the XQuery functions specified by the latest XPath/XQuery 3.0 draft.

Just so you know, the BaseX database server listens to port 1984 by default.

June 1, 2012

Are You Going to Balisage?

Filed under: Conferences,RDF,RDFa,Semantic Web,XML,XML Database,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 2:48 pm

To the tune of “Are You Going to Scarborough Fair:”

Are you going to Balisage?
Parsley, sage, rosemary and thyme.
Remember me to one who is there,
she once was a true love of mine.

Tell her to make me an XML shirt,
Parsley, sage, rosemary, and thyme;
Without any seam or binary code,
Then she shall be a true lover of mine.

….

Oh, sorry! There you will see:

  • higher-order functions in XSLT
  • Schematron to enforce consistency constraints
  • relation of the XML stack (the XDM data model) to JSON
  • integrating JSON support into XDM-based technologies like XPath, XQuery, and XSLT
  • XML and non-XML syntaxes for programming languages and documents
  • type introspection in XQuery
  • using XML to control processing in a document management system
  • standardizing use of XQuery to support RESTful web interfaces
  • RDF to record relations among TEI documents
  • high-performance knowledge management system using an XML database
  • a corpus of overlap samples
  • an XSLT pipeline to translate non-XML markup for overlap into XML
  • comparative entropy of various representations of XML
  • interoperability of XML in web browsers
  • XSLT extension functions to validate OCL constraints in UML models
  • ontological analysis of documents
  • statistical methods for exploring large collections of XML data

Balisage is an annual conference devoted to the theory and practice of descriptive markup and related technologies for structuring and managing information. Participants typically include XML users, librarians, archivists, computer scientists, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, Topic-Map enthusiasts, semantic-Web evangelists, members of the working groups which define the specifications, academics, industrial researchers, representatives of governmental bodies and NGOs, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Discussion is open, candid, and unashamedly technical.

The Balisage 2012 Program is now available at: http://www.balisage.net/2012/Program.html

May 29, 2012

Destination: Montreal!

If you remember the Saturday afternoon sci-fi movies, Destination: …., then you will appreciate the title for this post. 😉

Tommie Usdin and company just posted: Balisage 2012 Call for Late-breaking News, written in torn bodice style:

The peer-reviewed part of the Balisage 2012 program has been scheduled (and will be announced in a few days). A few slots on the Balisage program have been reserved for presentation of “Late-breaking” material.

Proposals for late-breaking slots must be received by June 15, 2012. Selection of late-breaking proposals will be made by the Balisage conference committee, instead of being made in the course of the regular peer-review process.

If you have a presentation that should be part of Balisage, please send a proposal message as plain-text email to info@balisage.net.

In order to be considered for inclusion in the final program, your proposal message must supply the following information:

  • The name(s) and affiliations of all author(s)/speaker(s)
  • The email address of the presenter
  • The title of the presentation
  • An abstract of 100-150 words, suitable for immediate distribution
  • Disclosure of when and where, if some part of this material has already been presented or published
  • An indication as to whether the presenter is comfortable giving a conference presentation and answering questions in English about the material to be presented
  • Your assurance that all authors are willing and able to sign the Balisage Non-exclusive Publication Agreement (http://www.balisage.net/BalisagePublicationAgreement.pdf) with respect to the proposed presentation

In order to be in serious contention for inclusion in the final program, your proposal should probably be either a) really late-breaking (it happened in the last month or two) or b) a paper, an extended paper proposal, or a very long abstract with references. Late-breaking slots are few and the competition is fiercer than for peer-reviewed papers. The more we know about your proposal, the better we can appreciate the quality of your submission.

Please feel encouraged to provide any other information that could aid the conference committee as it considers your proposal, such as a detailed outline, samples, code, and/or graphics. We expect to receive far more proposals than we can accept, so it’s important that you send enough information to make your proposal convincing and exciting. (This material may be attached to the email message, if appropriate.)

The conference committee reserves the right to make editorial changes in your abstract and/or title for the conference program and publicity. (emphasis added to last sentence)

Read that last sentence again!

The conference committee reserves the right to make editorial changes in your abstract and/or title for the conference program and publicity.

The conference committee might change your abstract and/or title to say something …. controversial? ….attention getting? ….CNN / Slashdot worthy?

Bring it on!

Submit late breaking proposals!

Please!

February 14, 2012

Would You Know “Good” XML If It Bit You?

Filed under: Uncategorized,XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 5:16 pm

XML is a pale imitation of a markup language. It has resulted in real horrors across the markup landscape. After years in its service, I don’t have much hope of that changing.

But, the Princess of the Northern Marches has organized a war council to consider how to stem the tide of bad XML. Despite my personal misgivings, I wish them well and invite you to participate as you see fit.

Oh, and I found this message about the council meeting:

International Symposium on Quality Assurance and Quality Control in XML

Monday August 6, 2012
Hotel Europa, Montréal, Canada

Paper submissions due April 20, 2012.

A one-day discussion of issues relating to Quality Control and Quality Assurance in the XML environment.

XML systems and software are complex and constantly changing. XML documents are highly varied, may be large or small, and often have complex life-cycles. In this challenging environment quality is difficult to define, measure, or control, yet the justifications for using XML often include promises or implications relating to quality.

We invite papers on all aspects of quality with respect to XML systems, including but not limited to:

  • Defining, measuring, testing, improving, and documenting quality
  • Quality in documents, document models, software, transformations, or queries
  • Case studies in the control of quality in an XML environment
  • Theoretical or practical approaches to measuring quality in XML
  • Does the presence of XML, XML schemas, and XML tools make quality checking easier, harder, or even different from other computing environments
  • Should XML transforms and schemas be QAed as software? Or configuration files? Or documents? Does it matter?

Paper submissions due April 20, 2012.

Details at: http://www.balisage.net/QA-QC/

You do have to understand the semantics of even imitation markup languages before mapping them with more robust languages. Enjoy!

February 12, 2012

XML Prague 2012 (proceedings)

Filed under: Conferences,XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 5:11 pm

XML Prague 2012 (proceedings) (PDF)

Fourteen papers by the leading lights in the XML world covering everything from XProc and XQuery to NVDL and JSONiq, and places in between.

Put it on your XML reading list.

January 1, 2012

Zorba: The Most Complete XQuery Processor

Filed under: Data Mining,XQuery — Patrick Durusau @ 5:57 pm

Zorba: The Most Complete XQuery Processor

From the homepage:

All Flavors Available

General purpose XQuery processor – written in C++.

Complete W3C family of specifications: XPath, XQuery, Update, Scripting, Full-Text, XSLT, XQueryX, and more.

Pluggable Store

Seamlessly process XML data stored in different places.

Main memory, mobile devices, browsers, disk-based, or cloud-based stores.

Developer Friendly Tools

Benefit from a rich ecosystem of tools.

Eclipse plugins, command-line interface, and debugger.

Rich Module Library

Web mashups, cryptography, image processing, geo projections, emails, data cleaning… there is a module for that.

Runs Everywhere

Available on Windows, Linux, and Mac OS.

Bindings available for 6 Programming Languages: C++, C, PHP, Ruby, Java and Python.

Fun & Productive

XQuery unifies development for all tiers; database, content management, application logic, and presentation.

I started to mention this under the Cutting Edge Data Processing with PHP & XQuery post (which uses Zorba) but XQuery is important enough to list it separately.

In the draft Topic Map Tool Chain, I would put this under mining/analysis, but as was pointed out in comments, the mining/analysis phase can be informed by an ontology.

I would say “explicitly” informed by an ontology since there is always some ontology in play, whether explicit or not. (Formal ontologists, note the small “o” in ontology. An explicit ontology would have a name and be written <NAME> Ontology.)

Cutting Edge Data Processing with PHP & XQuery

Filed under: PHP,XQuery — Patrick Durusau @ 5:57 pm

Cutting Edge Data Processing with PHP & XQuery

From the webpage:

PHP and XQuery have always been a happy couple and we are looking to build on that momentum. Our goal is to contribute a powerful toolkit to harness unstructured data in PHP developments. In this perspective, the first edition of the PHP Tour was a perfect fit to introduce developers to the possible interactions between PHP and XQuery. The aim of the talk was to explore the gain of functionality and productivity that can be achieved by introducing XQuery into PHP applications.

The slide deck by William Candillon, from PHP Tour Lille 2011, will give you an idea of the capabilities of PHP and XQuery. I mention this because PHP is widely used in the library community and XQuery will make that use more productive and powerful.

October 15, 2011

BaseX

Filed under: BaseX,XML Database,XPath,XQuery — Patrick Durusau @ 4:29 pm

BaseX

From the webpage:

BaseX is a very light-weight and high-performance XML database system and XPath/XQuery processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into your XML documents and collections.

To maximize your productivity and workflows, we offer professional support, tailor-made software solutions and individual trainings on XML, XQuery and BaseX. The product itself is completely Open Source (BSD-licensed) and platform independent. Join our mailing lists to get regular updates!

But most important: BaseX runs out of the box and is easy to use…

For those of us who don’t think documents, even XML documents, are all that weird. 😉

September 21, 2011

XQuery Survey

Filed under: Query Language,XQuery — Patrick Durusau @ 7:09 pm

XQuery Survey

From the webpage:

I am looking for feedback on the XQuery programming language. Please answer all questions as completely as possible. This poll and other information are forming the basis of a talk I am giving at GOTO 2011 Aarhus, Denmark (http://lanyrd.com/2011/gotocon-aarhus/shqhc/). I will share all results at the end of October 2011.

Please help Jim Fuller out with his survey on XQuery!

July 30, 2011

XQuery As Semantic Lens?

Filed under: XQuery — Patrick Durusau @ 9:12 pm

With Michael Kay thinking about tuples in XQuery, I started to wonder about XQuery as a semantic lens.

I say that because in discussions of Linked Data for example, there is always the question of getting data sources to release their data as Linked Data and/or complaints about the nature or quality of the Linked Data released.

While it may not be true in all cases, my operating assumption is that a user wants only some small portion of data from any particular data source. If data is obtained and viewed as linked data or with whatever desired annotations or additional properties, why pester the data owner?

Or to take the other side, why should we be limited by the data owner’s imagination or views about the data? Our “view” of data is probably more valuable to us than its source in most (all?) cases.

I am sure there are cases where conversion or annotation of an entire data set makes analytic, economic or performance sense, assuming you have the resources to make the conversion.

But that won’t be the case for small groups or individuals who want to access large data stores. Being able to query for subsets of data that they can use creatively will be a real advantage for them.

Of course, I am interested in using XQuery to produce input for topic map engines and representing declarations of semantic equivalence.

Suggestions for views to use as examples?
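To make the idea concrete, here is a minimal sketch of one such view. It is written in Python rather than XQuery only so that it runs with nothing but a standard library. It pulls a small subset out of a source document and attaches our own annotation, without asking the data owner to change anything. The element names, the `subject` attribute, the mapping, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data, published by its owner with no annotations.
SOURCE = """
<catalog>
  <book id="b1"><title>XQuery Basics</title><topic>xquery</topic></book>
  <book id="b2"><title>Cooking at Home</title><topic>food</topic></book>
  <book id="b3"><title>XPath in Practice</title><topic>xpath</topic></book>
</catalog>
"""

# Our own view of the data: source topics we choose to treat as one subject.
# This mapping is an assumption made for illustration.
SUBJECTS = {"xquery": "xml-query-languages", "xpath": "xml-query-languages"}


def lens(xml_text, subjects):
    """Return a new <view> element holding only the books we care about,
    each annotated with our subject. The source document is not modified."""
    root = ET.fromstring(xml_text)
    view = ET.Element("view")
    for book in root.findall("book"):
        topic = book.findtext("topic")
        if topic in subjects:
            item = ET.SubElement(view, "book", id=book.get("id"))
            item.set("subject", subjects[topic])
            item.text = book.findtext("title")
    return view


view = lens(SOURCE, SUBJECTS)
for item in view:
    print(item.get("id"), item.get("subject"), item.text)
```

The point of the sketch is that the annotation lives in the query, not in the source. Two users with different `SUBJECTS` mappings get different views of the same data.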

BTW, when I posted about the XQuery/XPath drafts, I foolishly used the dated URLs. I should have used the latest version URLs. Unless you are tracing comments back to drafts or the history of the evolution of XQuery, the latest version is the one you would want.

XQuery 3.0 – Latest Version Links

XQuery 3.0: An XML Query Language

XQueryX 3.0

XSLT and XQuery Serialization 3.0

XQuery 3.0 Use Cases

XQuery 3.0 Requirements

Tuples

Filed under: Tuples,XQuery — Patrick Durusau @ 9:11 pm

Tuples

It was interesting to see XQuery 3.0 introduce tuple operations.

It is important to see Michael Kay start to talk about implementing tuple operations in Saxon.

I wonder what it would take to create a profile of XQuery 3.0 that introduces a semantic equivalence operator?

June 23, 2011

Six Drafts Published Related to XSLT, XQuery, XPath (21 June 2011)

Filed under: XPath,XQuery,XSLT — Patrick Durusau @ 1:50 pm

Six Drafts Published Related to XSLT, XQuery, XPath (21 June 2011)

From the post:

Has anyone compared the addressing capabilities of XQuery to HyTime?

June 4, 2011

XQuery Guestbook

Filed under: XQuery — Patrick Durusau @ 7:14 pm

XQuery Guestbook

A guestbook written entirely in XQuery.

Is XQuery a good model for the capabilities people will expect from TMQL? I am thinking that offering less than the “average” set of capabilities is going to make TMQL look lame. Thoughts?

May 18, 2011

Balisage 2011 Preliminary Program

Filed under: Conferences,Data Mining,RDF,SPARQL,XPath,XQuery,XSLT — Patrick Durusau @ 6:40 pm

At-A-Glance

Program (in full)

From the announcement (Tommie Usdin):

Topics this year include:

  • multi-ended hypertext links
  • optimizing XSLT and XQuery processing
  • interchange, interoperability, and packaging of XML documents
  • eBooks and epub
  • overlapping markup and related topics
  • visualization
  • encryption
  • data mining

The acronyms this year include:

XML XSLT XQuery XDML REST XForms JSON OSIS XTemp RDF SPARQL XPath

New this year will be:

Lightning talks: an opportunity for participants to say what they think, simply, clearly, and persuasively.

As I have said before, simply the best conference of the year!

Conference site: http://www.balisage.net/

Registration: http://www.balisage.net/registration.html

April 18, 2011

Classify content with XQuery

Filed under: Classification,Text Analytics,XQuery — Patrick Durusau @ 1:40 pm

Classify content with XQuery by James R. Fuller (jim.fuller@webcomposite.com), Technical Director, Webcomposite.

Summary: With the expanding growth of semi-structured and unstructured data (XML) comes the need to categorize and classify content to make querying easier, faster, and more relevant. In this article, try several techniques using XQuery to automatically tag XML documents with content categorization based on the analysis of their content and structure.

Good article on the use of XQuery for basic text analysis and how to invoke web services while using XQuery for more sophisticated text analysis.
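For a rough feel of what keyword-based tagging looks like, here is a minimal sketch in Python (not the article's XQuery, and not its method). It adds a `<category>` element to a document for each category whose keywords appear in the document's text. The category names, keyword sets, and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical categories and keywords; a real classifier would use a richer model.
CATEGORIES = {
    "databases": {"index", "query", "storage"},
    "markup": {"schema", "element", "attribute"},
}


def classify(doc_xml, categories):
    """Parse the document and append a <category> child for every category
    that has at least one keyword in the document's text content."""
    doc = ET.fromstring(doc_xml)
    # Collect lowercased words, with surrounding punctuation stripped.
    words = {w.strip(".,;:!?").lower() for w in "".join(doc.itertext()).split()}
    for name in sorted(categories):
        if words & categories[name]:
            ET.SubElement(doc, "category").text = name
    return doc


doc = classify("<doc><p>Each query uses an index.</p></doc>", CATEGORIES)
print(ET.tostring(doc, encoding="unicode"))
```

Exact keyword matching is the crudest option. The article's more sophisticated analysis relies on web services, which this sketch does not attempt.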
