## Archive for the ‘XPath’ Category

### Structural Issues in XPath/XQuery/XPath-XQuery F&O Drafts

Friday, January 9th, 2015

Apologies: I thought I would be further along in demonstrating some proofing techniques for XPath 3.1, XQuery 3.1, and XPath and XQuery Functions and Operators 3.1 by today.

Instead, I encountered structural issues common to all three drafts that I didn’t anticipate but that need to be noted before going further with proofing. I will use sample material to illustrate the problems and will not always have a sample from all three drafts, or note every occurrence of an issue. They are too numerous for that treatment and it would be repetition for repetition’s sake.

First, consider these passages from XPath 3.1, 1 Introduction:

[Definition: XPath 3.1 operates on the abstract, logical structure of an XML document, rather than its surface syntax. This logical structure, known as the data model, is defined in [XQuery and XPath Data Model (XDM) 3.1].]

[Definition: An XPath 3.0 Processor processes a query according to the XPath 3.0 specification.] [Definition: An XPath 2.0 Processor processes a query according to the XPath 2.0 specification.] [Definition: An XPath 1.0 Processor processes a query according to the XPath 1.0 specification.]

1. Unnumbered Definitions – Unidentified Cross-References

The first structural issue that you will note with the “[Definition…” material is that all such definitions are unnumbered and appear throughout all three texts. The lack of numbering means that it is difficult to refer with any precision to a particular definition. How would I draw your attention to the third definition of the second grouping? Searching for XPath 1.0 turns up 79 occurrences in XPath 3.1 so that doesn’t sound satisfactory. (FYI, “Definition” turns up 193 instances.)
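Counts like these are easy to reproduce. What follows is only a minimal sketch in Python, assuming you have saved the text of a draft locally; the `sample` string and the `count_occurrences` helper are illustrative names of my own, not anything taken from the drafts.

```python
import re


def count_occurrences(text: str, term: str) -> int:
    """Count non-overlapping, case-sensitive occurrences of term in text."""
    return len(re.findall(re.escape(term), text))


# Illustrative sample only: two of the definitions quoted above.
sample = (
    "[Definition: An XPath 3.0 Processor processes a query according to "
    "the XPath 3.0 specification.] [Definition: An XPath 1.0 Processor "
    "processes a query according to the XPath 1.0 specification.]"
)

print(count_occurrences(sample, "XPath 1.0"))   # 2
print(count_occurrences(sample, "Definition"))  # 2
```

Run against the full text of a draft, the same function shows how little a bare search term narrows things down.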

While the “Definitions” have anchors that allow them to be addressed by cross-references, you should note that the cross-references are text hyperlinks that have no identifier by which a reader can find the definition without using the hyperlink. That is to say when I see:

A lexical QName with a prefix can be converted into an expanded QName by resolving its namespace prefix to a namespace URI, using the statically known namespaces. [These are fake links to draw your attention to the text in question.]

The hyperlinks in the original will take me to various parts of the document where these definitions occur, but if I have printed the document, I have no clue where to look for these definitions.

The better practice is to number all the definitions and since they are all self-contained, to put them in a single location. Additionally, all interlinear references to those definitions (or other internal cross-references) should have a visible reference that enables a reader to find the definition or cross-reference, without use of an internal hyperlink.

Example:

A lexical QName Def-21 with a prefix can be converted into an expanded QName Def-19 by resolving its namespace prefix to a namespace URI, using the statically known namespaces. Def-99 [These are fake links to draw your attention to the text in question. The Def numbers are fictitious in this example. Actual references would have the visible definition numbers assigned to the appropriate definition.]
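The numbering itself is mechanical. The sketch below is in Python rather than a style sheet and assumes the definitions appear in plain text as “[Definition: …]”; the function names and the term-to-number mapping are hypothetical, chosen only to mirror the fictitious Def numbers in the example above.

```python
import re

DEFINITION_OPEN = re.compile(r"\[Definition: ")


def number_definitions(text):
    """Give every '[Definition: ' marker a visible, sequential Def-N label."""
    counter = 0

    def label(match):
        nonlocal counter
        counter += 1
        return f"[Definition Def-{counter}: "

    return DEFINITION_OPEN.sub(label, text), counter


def add_visible_references(prose, term_numbers):
    """Append ' Def-N' after each defined term found in prose.

    term_numbers is a hypothetical mapping from a defined term to its number.
    """
    if not term_numbers:
        return prose
    # Longest terms first, so a longer term wins over a shorter one it contains.
    terms = sorted(term_numbers, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    return pattern.sub(
        lambda m: f"{m.group(0)} Def-{term_numbers[m.group(0)]}", prose
    )


numbered, total = number_definitions("[Definition: A.] text [Definition: B.]")
print(numbered)  # [Definition Def-1: A.] text [Definition Def-2: B.]

sentence = "A lexical QName with a prefix can be converted into an expanded QName."
print(add_visible_references(sentence, {"lexical QName": 21, "expanded QName": 19}))
```

A real implementation would work on the source markup rather than on rendered text, but the point stands: the numbers and the visible references can be generated, not typed.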

2. Vague references – $N versus 5,000 x $N

Another problem I encountered was what I call “vague references,” or less generously, $N versus 5,000 x $N.

For example:

[Definition: An atomic value is a value in the value space of an atomic type, as defined in [XML Schema 1.0] or [XML Schema 1.1].] [Definition: A node is an instance of one of the node kinds defined in [XQuery and XPath Data Model (XDM) 3.1].]

Contrary to popular opinion, standards don’t write themselves and every jot and tittle was placed in a draft at the expense of someone’s time and resources. Let’s call that $N. In the example, you and I both know somewhere in XML Schema 1.0 and XML Schema 1.1 that the “value space of the atomic type” is defined. The same is true for nodes and XQuery and XPath Data Model (XDM) 3.1. But where? The authors of these specifications could insert that information at a cost of $N.

What is the cost of not inserting that information in the current drafts? I estimate the number of people interested in reading these drafts to be 5,000. So each of those people will have to find the same information omitted from these specifications, which is a cost of 5,000 x $N. In terms of convenience to readers and reducing their costs of reading these specifications, references to exact locations in other materials are a necessity.

In full disclosure, I have no more or less reason to think 5,000 people are interested in these drafts than the United States has for positing the existence of approximately 5,000 terrorists in the world. I suspect the number of people interested in XML is actually higher but the number works to make the point. Editors can either convenience themselves or their readers.

Vague references are also problematic in terms of users finding the correct reference. The citation above, [XML Schema 1.0] for “value space of an atomic type,” refers to all three parts of XML Schema 1.0.

Part 1, at 3.14.1 (non-normative) The Simple Type Definition Schema Component, has the only reference to “atomic type.”

Part 2 actually has “0” hits for “atomic type.” True enough, “2.5.1.1 Atomic datatypes” is likely the intended reference but that isn’t what the specification says to look for.

Bottom line is that any external reference needs to include in the inline citation the precise internal reference in the work being cited.

If you want to inconvenience readers by pointing to internal bibliographies rather than online HTML documents, where available, that’s an editorial choice. But in any event, for every external reference, give the internal reference in the work being cited.

Your readers will appreciate it and it could make your work more accurate as well.

3. Normative vs. Non-Normative Text

Another structural issue which is important for proofing is the distinction between normative and non-normative text.

In XPath 3.1, still in the Introduction we read:

This document normatively defines the static and dynamic semantics of XPath 3.1. In this document, examples and material labeled as “Note” are provided for explanatory purposes and are not normative.

OK, and under 2.2.3.1 Static Analysis Phase (XPath 3.1), we find:

Examples of inferred static types might be:

Which is followed by a list so at least we know where the examples end. However, there are numerous cases of:

For example, with the expression substring($a, $b, $c), $a must be of type xs:string (or something that can be converted to xs:string by the function calling rules), while $b and $c must be of type xs:double. [also in 2.2.3.1 Static Analysis Phase (XPath 3.1)]

So, is that a non-normative example? If so, what is the nature of the “must” that occurs in it? Is that normative?

Moreover, the examples (XPath 3.1 has 283 occurrences of that term, XQuery has 455 occurrences of that term, XPath and XQuery Functions and Operators has 537 occurrences of that term) are unnumbered, which makes referencing the examples by other materials very imprecise and wordy. For the use of authors creating secondary literature on these materials, to promote adoption, etc., numbering of all examples should be the default case.

Oh, before anyone protests that XPath and XQuery Functions and Operators has separated its examples into lists, that is true but only partially. There remain 199 occurrences of “for example” which do not occur in lists.

Where lists are used, converting to numbered examples should be trivial. The elimination of “for example” material may be more difficult. Hard to say without a good sampling of the cases.

Conclusion:

As I said at the outset, apologies for not reaching more substantive proofing techniques but structural issues are important for the readability and usability of specifications for readers. Being correct and unreadable isn’t a useful goal.

It may seem like some of the changes I suggest are a big “ask” this late in the processing of these specifications. If this were a hand edited document, I would quickly agree with you. But it’s not. Or at least it shouldn’t be. I don’t know where the source is held but the HTML you read is a generated artifact.

Gathering and numbering the definitions and inserting those numbers into the internal cross-references are a matter of applying a different style sheet to the source. Fixing the vague references and unnumbered example texts would take more editorial work but readers would greatly benefit from precise references and a clear separation of normative from non-normative text.

I will try again over the weekend to reach aids for substantive proofing on these drafts. With luck, I will return to these drafts on Monday of next week (12 January 2015).

### MUST in XPath 3.1/XQuery 3.1/XQueryX 3.1

Wednesday, January 7th, 2015

I mentioned the problems with redefining may and must in XPath and XQuery Functions and Operators 3.1 in Redefining RFC 2119? Danger! Danger! Will Robinson! last Monday.

Requirements language is one of the first things to check for any specification so I thought I should round that issue out by looking at the requirement language in XPath 3.1, XQuery 3.1, and XQueryX 3.1.

XPath 3.1

XPath 3.1 includes RFC 2119 as a normative reference but then never cites RFC 2119 in the document or uses the uppercase MUST.

I suspect that is the case because of Appendix F Conformance:

XPath is intended primarily as a component that can be used by other specifications. Therefore, XPath relies on specifications that use it (such as [XPointer] and [XSL Transformations (XSLT) Version 3.0]) to specify conformance criteria for XPath in their respective environments.
Specifications that set conformance criteria for their use of XPath must not change the syntactic or semantic definitions of XPath as given in this specification, except by subsetting and/or compatible extensions. The specification of such a language may describe it as an extension of XPath provided that every expression that conforms to the XPath grammar behaves as described in this specification.

(Edited to include the actual links to XPointer and XSLT; pointing internally to a bibliography defeats the purpose of hyperlinking.)

Personally I would simply remove the RFC 2119 reference since XPath 3.1 is a set of definitions to which conformance is mandated, or not, by other specifications.

XQuery 3.1 and XQueryX 3.1

XQuery 3.1 5 Conformance reads in part:

This section defines the conformance criteria for an XQuery processor. In this section, the following terms are used to indicate the requirement levels defined in [RFC 2119]. [Definition: MUST means that the item is an absolute requirement of the specification.] [Definition: MUST NOT means that the item is an absolute prohibition of the specification.] [Definition: MAY means that an item is truly optional.] [Definition: SHOULD means that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.] (Emphasis in the original)

XQueryX 3.1 5 Conformance reads in part:

This section defines the conformance criteria for an XQueryX processor (see Figure 1, “Processing Model Overview”, in [XQuery 3.1: An XML Query Language] , Section 2.2 Processing Model XQ31.

In this section, the following terms are used to indicate the requirement levels defined in [RFC 2119]. [Definition: MUST means that the item is an absolute requirement of the specification.] [Definition: SHOULD means that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.] [Definition: MAY means that an item is truly optional.]

First, the better practice is not to repeat definitions found elsewhere (a source of error and misstatement) but to cite RFC 2119 as follows:

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119].

Second, the bolding found in XQuery 3.1 of MUST, etc., is unnecessary, particularly when not then followed by bolding in the use of MUST in the conformance clauses. Best practice is simply to use UPPERCASE in both cases.

Third, and really my principal reason for mentioning XQuery 3.1 and XQueryX 3.1, is to call attention to their use of RFC 2119 keywords. That is to say you will find the keywords in the conformance clauses and not anywhere else in the specification.

Both use the word “must” in their texts but only as would normally appear in prose, and implementers don’t have to pore through a sprinkling of MUST as you see in some drafts, which makes for stilted writing and traps for the unwary.

The usage of RFC 2119 keywords in XQuery 3.1 and XQueryX 3.1 makes the job of writing in declarative prose easier, eliminates the need to distinguish MUST and must in the normative text, and gives clear guidance to implementers as to the requirements to be met for conformance.

I was quick to point out an error in my last post so it is only proper that I be quick to point out a best practice in XQuery 3.1 and XQueryX 3.1 as well.

This coming Friday, 9 January 2015, I will have a post on proofing content proper for this bundle of specifications.
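If you want to check for yourself where the uppercase key words fall in a draft, a tally is a reasonable first pass. This is a minimal sketch in Python, assuming the text of a section has already been extracted; the `tally_keywords` name and the sample sentence are mine, and the match is deliberately case-sensitive so that ordinary prose “must” and “may” are not counted.

```python
import re
from collections import Counter

# The key words listed in RFC 2119. Two-word forms come first so that
# "MUST NOT" is not counted as a bare "MUST".
RFC2119_KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT",
    "MUST", "REQUIRED", "SHALL", "SHOULD",
    "RECOMMENDED", "MAY", "OPTIONAL",
]

KEYWORD_PATTERN = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in RFC2119_KEYWORDS) + r")\b"
)


def tally_keywords(text):
    """Count uppercase RFC 2119 key words in text (case-sensitive)."""
    return Counter(
        " ".join(match.group(1).split())  # normalize internal whitespace
        for match in KEYWORD_PATTERN.finditer(text)
    )


sample = (
    "An implementation MUST do X and MUST NOT do Y. "
    "It may do Z, and it must be documented."
)
print(tally_keywords(sample))
```

Running the tally once per section is enough to see whether the key words stay in the conformance clauses or are sprinkled through the body.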
PS: I am encouraging you to take on this venture into proofing specifications because this particular bundle of W3C specification work is important for pointing into data. If we don’t have reliable and consistent pointing, your topic maps will suffer.

### Redefining RFC 2119? Danger! Danger! Will Robinson!

Monday, January 5th, 2015

I’m lagging behind in reading XQuery 3.1: An XML Query Language, XML Path Language (XPath) 3.1, and XPath and XQuery Functions and Operators 3.1 in order to comment by 13 February 2015.

In order to catch up this past weekend I started trying to tease these candidate recommendations apart to make them easier to proof. One of the things I always do is check for key word conformance language and that means, outside of ISO, RFC 2119.

I was reading XPath and XQuery Functions and Operators 3.1 (herein Functions and Operators) when I saw:

1.1 Conformance

The Functions and Operators specification is intended primarily as a component that can be used by other specifications. Therefore, Functions and Operators relies on specifications that use it (such as [XML Path Language (XPath) 3.1], [XQuery 3.1: An XML Query Language], and potentially future versions of XSLT) to specify conformance criteria for their respective environments.

That works. You have a normative document of definitions, etc., and some other standard cites those definitions and supplies the must, should, may according to RFC 2119. Not common but that works.

But then I started running scripts for usage of key words and I found in Functions and Operators:

1.6.3 Conformance terminology

[Definition] may

Conforming documents and processors are permitted to, but need not, behave as described.

[Definition] must

Conforming documents and processors are required to behave as described; otherwise, they are either non-conformant or else in error.

Thus the title: Redefining RFC 2119? Danger! Danger! Will Robinson!

RFC 2119 reads in part:

1. MUST This word, or the terms “REQUIRED” or “SHALL”, mean that the definition is an absolute requirement of the specification.

5. MAY This word, or the adjective “OPTIONAL”, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)

6. Guidance in the use of these Imperatives

Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm (e.g., limiting retransmisssions) For example, they must not be used to try to impose a particular method on implementors where the method is not required for interoperability.

First, the referencing of RFC 2119 is standard practice at the W3C, at least with regard to XML specifications. I wanted to have more than personal experience to cite so I collected the fifty-nine current XML specifications and summarize them in the list at the end of this post.

Of the fifty-nine (59) current XML specifications (there may be others, the W3C has abandoned simply listing its work without extraneous groupings), fifty-two of the standards cite and follow RFC 2119. Three of the remaining seven (7) fail to cite RFC 2119 due to errors in editing.

The final four (4), as it were, that don’t cite RFC 2119 are a good illustration of how errors get perpetuated from one standard to another.

The first W3C XML specification to not cite RFC 2119 was Extensible Markup Language (XML) 1.0 (Second Edition), where it reads in part:

1.2 Terminology

may

[Definition: Conforming documents and XML processors are permitted to but need not behave as described.]

must

[Definition: Conforming documents and XML processors are required to behave as described; otherwise they are in error. ]

The definitions of must and may were ABANDONED in Extensible Markup Language (XML) 1.0 (Third Edition), which simply dropped those definitions and instead reads in part:

1.2 Terminology

The terminology used to describe XML documents is defined in the body of this specification. The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional, when emphasized, are to be interpreted as described in [IETF RFC 2119].

The exclusive use of RFC 2119 continues through Extensible Markup Language (XML) 1.0 (Fourth Edition) to the current Extensible Markup Language (XML) 1.0 (Fifth Edition).

However, as is often said, whatever good editing we do is interred with us and any errors we make live on.

Before the abandonment of attempts to define may and must appeared in XML 3rd edition, XML Schema Part 1: Structures Second Edition and XML Schema Part 2: Datatypes Second Edition cite XML 2nd edition as their rationale for defining may and must. That error has never been corrected.

Which brings us to W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes, which is the last W3C XML specification to not cite RFC 2119.

XSD 1.1 Part 2 reads in part, under Appendix I Changes since version 1.0, I.4 Other Changes:

The definitions of must, must not, and ·error· have been changed to specify that processors must detect and report errors in schemas and schema documents (although the quality and level of detail in the error report is not constrained).

The problem being that XSD 1.1 Part 2 relies upon XML Schema Part 2: Datatypes Second Edition, which cites Extensible Markup Language (XML) 1.0 (Second Edition) as the reason for redefining the terms may and must.

The redefining of may and must relies upon language in a superseded version of the XML standard. Language that was deleted ten (10) years ago from the XML standard.

If you have read this far, you have a pretty good guess that I am going to suggest that XPath and XQuery Functions and Operators 3.1 drop the attempt to redefine terms that appear in RFC 2119.

First, redefining widely used terms for conformance is clearly a bad idea. Do you mean an RFC 2119 must or do you mean an F&O must? Clearly different. If a requirement has an RFC 2119 must, my application either conforms or fails. If a requirement has an F&O must, my application may simply be in error. All the time. Is that useful?

Second, by redefining must, we lose the interoperability aspects as defined by RFC 2119 for all uses of must. Surely interoperability is a goal of Functions and Operators. Yes?

Third, the history of redefining may and must at the W3C shows (to me) the perpetuation of an error long beyond its correction date. It’s time to put an end to redefining may and must.

PS: Before you decide you “know” the difference in upper and lower case key words from RFC 2119, take a look at: RFC Editorial Guidelines and Procedures, Normative References to RFC 2119. Summary: UPPER CASE is normative, lower case is “a necessary logical relationship.”

PPS: Tracking this error down took longer than expected so it will be later this week before I have anything that may help with proofing the specifications.

XML Standards Consulted in preparation of this post. Y = Cites RFC 2119, N = Does not cite RFC 2119.
### XQuery, XPath, XQuery/XPath Functions and Operators 3.1

Friday, December 19th, 2014

XQuery, XPath, XQuery/XPath Functions and Operators 3.1 were published on 18 December 2014 as a call for implementation of these specifications.

The changes most often noted were the addition of capabilities for maps and arrays. “Support for JSON” means sections 17.4 and 17.5 of XPath and XQuery Functions and Operators 3.1.

XQuery 3.1 and XPath 3.1 depend on XPath and XQuery Functions and Operators 3.1 for JSON support. (Is there no acronym for XPath and XQuery Functions and Operators? Suggest XF&O.)

For your reading pleasure:

XQuery 3.1: An XML Query Language

XML Path Language (XPath) 3.1

XPath and XQuery Functions and Operators 3.1

Hoping that your holiday gifts include a large box of highlighters and/or a box of red pencils!

Oh, these specifications will “…remain as Candidate Recommendation(s) until at least 13 February 2015. (emphasis added)”

Less than two months so read quickly and carefully.

Enjoy!

I first saw this in a tweet by Jonathan Robie.

### Last Call: XQuery 3.1 and XQueryX 3.1; and additional supporting documents

Friday, October 10th, 2014

Last Call: XQuery 3.1 and XQueryX 3.1; and additional supporting documents

From the post:

Today the XQuery Working Group published a Last Call Working Draft of XQuery 3.1 and XQueryX 3.1. Additional supporting documents were published jointly with the XSLT Working Group: a Last Call Working Draft of XPath 3.1, together with XPath Functions and Operators, XQuery and XPath Data Model, and XSLT and XQuery Serialization. XQuery 3.1 and XPath 3.1 introduce improved support for working with JSON data with map and array data structures as well as loading and serializing JSON; additional support for HTML class attributes, HTTP dates, scientific notation, cross-scaling between XSLT and XQuery and more. Comments are welcome through 7 November 2014. Learn more about the XML Activity.

How closely do you read?

To answer that question, read all the mentioned documents by 7 November 2014, keeping a list of errors you spot.

Submit your list to the XQuery Working Group by 7 November 2014 and score your reading based on the number of “errors” accepted by the working group.

What is your W3C Proofing Number? (Average number of accepted “errors” divided by the number of W3C drafts where “errors” were submitted.)

### Sharing key indexes

Wednesday, June 4th, 2014

Sharing key indexes by Michael Kay.

From the post:

For ever and a day, Saxon has tried to ensure that when several transformations are run using the same stylesheet and the same source document(s), any indexes built for those documents are reused across transformations. This has always required some careful juggling of Java weak references to ensure that the indexes are dropped from memory as soon as either the executable stylesheet or the source document are no longer needed.

I’ve now spotted a flaw in this design. It wasn’t reported by a user, and it didn’t arise from a test case, it simply occurred to me as a theoretical possibility, and I have now written a test case that shows it actually happens. The flaw is this: if the definition of the key includes a reference to a global variable or a stylesheet parameter, then the content of the index depends on the values of global variables, and these are potentially different in different transformations using the same stylesheet.

Michael discovers a very obscure bug entirely on his own and yet resolves to fix it. That is so unusual that I thought it merited mentioning. It should give you great confidence in Saxon.

How that impacts your confidence in other software I cannot say.
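The shape of the flaw Michael describes can be shown with a toy cache. To be clear, this is not Saxon’s code and not Michael’s fix: it is a minimal Python sketch under my own assumptions, with hypothetical names (`IndexCache`, `build_index`, `prefix_length`), and it ignores the weak-reference juggling entirely. It only illustrates what happens when a cached index depends on a value that is not part of the cache key.

```python
WORDS = ["alpha", "beta", "alps"]  # stand-in for a source document


def build_index(global_values):
    """Build an index whose content depends on a global parameter."""
    n = global_values["prefix_length"]
    index = {}
    for word in WORDS:
        index.setdefault(word[:n], []).append(word)
    return index


class IndexCache:
    """Toy cache of indexes, keyed by stylesheet and document."""

    def __init__(self, include_globals):
        self.include_globals = include_globals
        self.builds = 0
        self._cache = {}

    def get_index(self, stylesheet_id, document_id, global_values, build):
        key = (stylesheet_id, document_id)
        if self.include_globals:
            # Hypothetical remedy: make the parameter values part of the key.
            key += (tuple(sorted(global_values.items())),)
        if key not in self._cache:
            self._cache[key] = build(global_values)
            self.builds += 1
        return self._cache[key]


flawed = IndexCache(include_globals=False)
flawed.get_index("style.xsl", "doc.xml", {"prefix_length": 1}, build_index)
stale = flawed.get_index("style.xsl", "doc.xml", {"prefix_length": 2}, build_index)
print(stale)  # index built for prefix_length 1, reused for the second run

careful = IndexCache(include_globals=True)
careful.get_index("style.xsl", "doc.xml", {"prefix_length": 1}, build_index)
fresh = careful.get_index("style.xsl", "doc.xml", {"prefix_length": 2}, build_index)
print(fresh)  # index rebuilt for prefix_length 2
```

The second transformation in the flawed cache silently gets an index built for different parameter values, which is exactly the kind of bug no user is likely to report in a recognizable form.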
### 7 First Public Working Drafts of XQuery and XPath 3.1

Friday, April 25th, 2014

7 First Public Working Drafts of XQuery and XPath 3.1

From the post:

Today the XML Query Working Group and the XSLT Working Group have published seven First Public Working Drafts, four of which are jointly developed and three are from the XQuery Working Group.

The joint documents are:

• XML Path Language (XPath) 3.1. XPath is a powerful expression language that allows the processing of values conforming to the data model defined in the XQuery and XPath Data Model. The main features of XPath 3.1 are maps and arrays.
• XPath and XQuery Functions and Operators 3.1. This specification defines a library of functions available for use in XPath, XQuery, XSLT and other languages.
• XQuery and XPath Data Model 3.1. This specification defines the data model on which all operations of XPath 3.1, XQuery 3.1, and XSLT 3.1 operate.
• XSLT and XQuery Serialization 3.1. This document defines serialization of an instance of the XQuery and XPath Data model Data Model into a sequence of octets, such as into XML, text, HTML, JSON.

The three XML Query Working Group documents are:

• XQuery 3.1 Requirements and Use Cases, which describes the reasons for producing XQuery 3.1, and gives examples.
• XQuery 3.1: An XML Query Language. XQuery is a versatile query and application development language, capable of processing the information content of diverse data sources including structured and semi-structured documents, relational databases and tree-bases databases. The XQuery language is designed to support powerful optimizations and pre-compilation leading to very efficient searches over large amounts of data, including over so-called XML-native databases that read and write XML but have an efficient internal storage. The 3.1 version adds support for features such as arrays and maps primarily to facilitate processing of JSON and other structures.
• XQueryX 3.1, which defines an XML syntax for XQuery 3.1.

Learn more about the XML Activity.

To show you how far behind I am on my reading, I haven’t even ordered Michael Kay’s XSLT 3.0 and XPath 3.0 book and the W3C is already working on 3.1 for both. 😉

I am hopeful that Michael will duplicate his success with XSLT 2.0 and XPath 2.0. This time though, I am going to get the Kindle edition. 😉

### The X’s Are In Town

Thursday, April 10th, 2014

XQuery 3.0, XPath 3.0, XQueryX 3.0, XDM 3.0, Serialization 3.0, Functions and Operators 3.0 are now W3C Recommendations

From the post:

The XML Query Working Group published XQuery 3.0: An XML Query Language, along with XQueryX, an XML representation for XQuery, both as W3C Recommendations, as well as the XQuery 3.0 Use Cases and Requirements as final Working Group Notes. XQuery extends the XPath language to provide efficient search and manipulation of information represented as trees from a variety of sources.

The XML Query Working Group and XSLT Working Group also jointly published W3C Recommendations of XML Path Language (XPath) 3.0, a widely-used language for searching and pointing into tree-based structures, together with XQuery and XPath Data Model 3.0 which defines those structures, XPath and XQuery Functions and Operators 3.0 which provides facilities for use in XPath, XQuery, XSLT and a number of other languages, and finally the XSLT and XQuery Serialization 3.0 specification giving a way to turn values and XDM instances into text, HTML or XML.

Read about the XML Activity.

I was wondering what I was going to have to read this coming weekend. 😉

It may just be me but the “…provide efficient search and manipulation of information represented as trees from a variety of sources…” sounds a lot like groves to me.

You?

### Balisage Papers Due 18 April 2014

Tuesday, March 18th, 2014

Unlike the rolling dates for Obamacare, Balisage Papers are due 18 April 2014. (That’s this year for health care wonks.)
From the website:

Balisage is an annual conference devoted to the theory and practice of descriptive markup and related technologies for structuring and managing information.

Are you interested in open information, reusable documents, and vendor and application independence? Then you need descriptive markup, and Balisage is the conference you should attend. Balisage brings together document architects, librarians, archivists, computer scientists, XML wizards, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, Topic-Map enthusiasts, semantic-Web evangelists, standards developers, academics, industrial researchers, government and NGO staff, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Some participants are busy designing replacements for XML while other still use SGML (and know why they do). Discussion is open, candid, and unashamedly technical. Content-free marketing spiels are unwelcome and ineffective.

I can summarize that for you:

There are conferences on the latest IT buzz.

There are conferences on last year’s IT buzz.

Then there are conferences on information as power, which decides who will sup and who will serve.

Balisage is about information as power. How you use it, well, that’s up to you.

### Balisage 2014: Near the Belly of the Beast

Tuesday, January 14th, 2014

Balisage: The Markup Conference 2014

Bethesda North Marriott Hotel & Conference Center, just outside Washington, DC

Key dates:

– 28 March 2014: Peer review applications due
– 18 April 2014: Paper submissions due
– 18 April 2014: Applications for student support awards due
– 20 May 2014: Speakers notified
– 11 July 2014: Final papers due
– 4 August 2014: Pre-conference Symposium
– 5–8 August 2014: Balisage: The Markup Conference

From the call for participation:

Balisage is the premier conference on the theory, practice, design, development, and application of markup. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

• Cutting-edge applications of XML and related technologies
• Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
• Performance issues in parsing, XML database retrieval, or XSLT processing
• Development of angle-bracket-free user interfaces for non-technical users
• Deployment of XML systems for enterprise data
• Design and implementation of XML vocabularies
• Case studies of the use of XML for publishing, interchange, or archving
• Alternatives to XML
• Expressive power and application adequacy of XSD, Relax NG, DTDs, Schematron, and other schema languages

Detailed Call for Participation: http://balisage.net/Call4Participation.html

About Balisage: http://balisage.net/Call4Participation.html

Instructions for authors: http://balisage.net/authorinstructions.html

For more information: info@balisage.net or +1 301 315 9631

I checked: from the conference hotel you are anywhere from 25.6 to 27.9 miles by car from the NSA Visitor Center at Fort Meade. Take appropriate security measures.

When I heard Balisage was going to be in Bethesda, the first song that came to mind was: Back in the U.S.S.R. Followed quickly by Leonard Cohen’s Democracy Is Coming to the U.S.A.

I don’t know where the equivalent of St. Catherine Street of Montreal is in Bethesda. But when I find out, you will be the first to know!

Balisage is simply the best markup technology conference. (full stop) Start working on your manager now to get time to write a paper and to attend Balisage.

When the time comes for “big data” to make sense, markup will be there to answer the call. You should be too.

### xslt3testbed

Friday, January 3rd, 2014

xslt3testbed

From the post:

Testbed for trying out XSLT 3.0 (http://www.w3.org/TR/xslt-30/) techniques.
Since few people yet have much (or any) experience using XSLT 3.0 on more than toy examples, this is a public, medium-sized XSLT 3.0 project where people could try out new XSLT 3.0 features on the transformations to (X)HTML(5) and XSL-FO that are what we do most often and, along the way, maybe come up with new design patterns for doing transformations using the higher-order functions, partial function application, and other goodies that XSLT 3.0 gives us.

If you haven’t been investigating XSLT 3.0 (and related specifications) you need to take corrective action.

As an incentive, read Pearls Of XSLT And XPath 3.0 Design.

If you thought XSLT was useful for data operations, you will be amazed by XSLT 3.0!

### Frameless

Saturday, November 23rd, 2013

Frameless

From the webpage:

Frameless is an XSLT 2 processor running in the browser, directly written in JavaScript. It includes an XPath 2 query engine for simple, powerful querying. It works cross-browser, we have even reached compatibility with IE6 and Firefox 1.

With Frameless you’ll be able to do things the browsers won’t let you, such as using $variables and adding custom functions to XPath. What’s more, XPath 2 introduces if/else and for-loops. We’ll even let you use some XPath 3 functionality! Combine data into a string using the brand new string concatenation operator.

Use way overdue math functions such as sin() and cos(), essential when generating data-powered SVG graphics. And use Frameless.select() to overcome the boundaries between XSLT and JavaScript.

When to use Frameless?

Frameless is created to simplify application development and is, due to its API, great for writing readable code.

It will make application development a lot easier and it’s a good fit for all CRUD applications and applications with tricky DOM manipulation.

Who will benefit by using it?

• Designers and managers will be able to read the code and even fix some bugs.
• Junior developers will get up to speed in no time and write code with a high level of abstraction, and they will be able to create prototypes that’ll be shippable.
• Senior developers will be able to create complicated web applications for all browsers and write them declaratively.

What it’s not

Frameless doesn’t intend to fully replace functional DOM manipulation libraries like jQuery. If you like you can use such libraries and Frameless at the same time.

Frameless doesn’t provide a solution for cross-browser differences in external CSS stylesheets. We add prefixes to some inline style attributes, but you should not write your styles inline only for this purpose. We do not intend to replace any CSS extension language, such as for example Sass.

Frameless is very sparse on documentation but clearly the potential for browser-based applications is growing.

I first saw this in a tweet by Michael Kay.

### X* 3.0 Proposed Recommendations

Tuesday, October 22nd, 2013

XQuery 3.0, XPath 3.0, Data Model, Functions and Operators and XSLT and XQuery Serialization 3.0

From the post:

The XML Query Working Group and the XSLT Working Group have published five Proposed Recommendations today:

What’s today? October 22nd?

You almost have 30 days. 😉

Which one or more are you going to read?

I first saw this in a tweet by Jonathan Robie.

### N1QL – It Makes Cents! [Rediscovery of Paths]

Friday, October 11th, 2013

N1QL – It Makes Cents! by Robin Johnson.

*Ba Dum Tschhh* …See what I did there? Makes cents? Get it? Haha.

So… N1QL (pronounced Nickel)… Couchbase’s new next-generation query language; what is it? Well, it’s a rather genius designed, human readable / writable, extensible language designed for ad-hoc and operational querying within Couchbase. For those already familiar with querying within Couchbase, that blurb will probably make sense to you. If not – well, probably not, so let me clear it up a little more.

But before I do that, I must inform you that this blog article isn’t the best place for you to go if you want to dive in and get started learning N1QL. It is a view into N1QL from a developer’s perspective including why I am so excited about it, and the features I am proud to point out. If you want to get started learning about N1QL, click here. Or alternatively, go and have a go of the Online Tutorial. Anyway, back to clearing up what I mean when I say N1QL…

“N1QL is similar to the standard SQL language for relational databases, but also includes additional features; which are suited for document-oriented databases.” N1QL has been designed as an intuitive Query Language for use on databases structured around Documents instead of tables. To locate and utilise information in a document-oriented database, you need the correct logic and expressions for navigating documents and document structures. N1QL provides a clear, easy-to-understand abstraction layer to query and retrieve information in your document-database.

Before we move on with N1QL, let’s talk quickly about document modeling within Couchbase. As you probably know; within Couchbase we model our documents primarily in JSON. We’re all familiar with JSON, so I won’t go into it in detail, but one thing we need to bear in mind is the fact that: our JSON documents can have complex nested data structures, nested arrays and objects which ordinarily would make querying a problem. Contrary to SQL though, N1QL has the ability to navigate nested data because it supports the concept of paths. This is very cool. We can use paths by using a dot-notation syntax to give us the logical location of an attribute within a document. For example; if we had an e-commerce site with documents containing customers’ orders, we could look inside those documents, to an Nth nested level for attributes. So if we wanted to look for the customer’s shipping street: (emphasis in original)

Paths are “very cool,” but I thought that documents could already be navigated by paths?

Yes?

True, Couchbase uses JSON documents but the notion of paths in data structures isn’t news.

Not having paths into data structures, now, that would be news. 😉
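To make the point concrete, here is a minimal sketch in Python (not N1QL) of the same path idea applied to a nested JSON document and to the equivalent XML. The order document, its field names, and the `get_path` helper are all hypothetical, made up for illustration. The XML side uses the limited XPath subset that ships with Python's standard library.

```python
import json
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical order document; the field names are invented for this sketch.
order_json = json.loads("""
{"orderId": 1,
 "shipping": {"address": {"street": "1 Main St", "city": "Springfield"}}}
""")


def get_path(doc, path):
    """Follow a dot-separated path through nested dicts, e.g. 'a.b.c'."""
    return reduce(lambda node, key: node[key], path.split("."), doc)


# The same (hypothetical) data as XML, navigated with a path expression.
order_xml = ET.fromstring(
    "<order><shipping><address><street>1 Main St</street>"
    "<city>Springfield</city></address></shipping></order>"
)

# Dot-notation path into JSON versus slash-separated path into XML.
json_street = get_path(order_json, "shipping.address.street")
xml_street = order_xml.findtext("shipping/address/street")

print(json_street)
print(xml_street)
```

Both lookups walk from the root of the document down through named children, which is all a path is. The separators differ; the idea does not.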

### BaseX 7.7 has been released!

Wednesday, August 7th, 2013

BaseX 7.7 has been released!

From the webpage:

BaseX is a very light-weight, high-performance and scalable XML Database engine and XPath/XQuery 3.0 Processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into your XML documents.

To maximize your productivity and workflows, we offer professional support, highly customized software solutions and individual trainings on XML, XQuery and BaseX. Our product itself is completely Open Source (BSD-licensed) and platform independent; join our mailing lists to get regular updates!

But most important: BaseX runs out of the box and is easy to use…

This was a fortunate find. I have some XML work coming up and need to look at the latest offerings.

### “…XML User Interfaces” As in Using XML?

Tuesday, February 19th, 2013

International Symposium on Native XML user interfaces

This came across the wire this morning and I need your help interpreting it.

Why would you want to have an interface to XML?

All these years I have been writing XML in Emacs because XML wasn’t supposed to have an interface.

Brave hearts, male, female and unknown, struggling with issues too obscure for mere mortals.

Now I find that isn’t supposed to be so? You can imagine my reaction.

I moved my laptop a bit closer to the peat fire to make sure I read it properly. Waiting for the ox cart later this week to take my complaint to the local bishop about this disturbing innovation.

😉

15 March 2013 — Peer review applications due
19 April 2013 — Paper submissions due
19 April 2013 — Applications for student support awards due
21 May 2013 — Speakers notified
12 July 2013 — Final papers due
5 August 2013 — International Symposium on Native XML user interfaces
6–9 August 2013 — Balisage: The Markup Conference

International Symposium on
Native XML user interfaces

Monday August 5, 2013 Hotel Europa, Montréal, Canada

XML is everywhere. It is created, gathered, manipulated, queried, browsed, read, and modified. XML systems need user interfaces to do all of these things. How can we make user interfaces for XML that are powerful, simple to use, quick to develop, and easy to maintain?

How are we building user interfaces today? How can we build them tomorrow? Are we using XML to drive our user interfaces? How?

This one-day symposium is devoted to the theory and practice of user interfaces for XML: the current state of implementations, practical case studies, challenges for users, and the outlook for the future development of the technology.

Relevant topics include:

• Editors customized for specific purposes or users
• User interfaces for creation, management, and use of XML documents
• Uses of XForms
• Making tools for creation of XML textual documents
• Using general-purpose user-interface libraries to build XML interfaces
• Looking at XML, especially looking at masses of XML documents
• XML, XSLT, and XQuery in the browser
• Specialized user interfaces for specialized tasks
• XML vocabularies for user-interface specification

Presentations can take a variety of forms, including technical papers, case studies, and tool demonstrations (technical overviews, not product pitches).

This is the same conference I wrote about in: Markup Olympics (Balisage) [No Drug Testing].

In times of lean funding for conferences, if you go to a conference this year, it really should be Balisage.

You will be the envy of your co-workers and have tales to tell your grandchildren.

Not bad for one conference registration fee.

### Markup Olympics (Balisage) [No Drug Testing]

Thursday, January 10th, 2013

Markup athletes take heart! Unlike venues that intrude into the personal lives of competitors, there are no, repeat no drug tests for presenters at Balisage!

Fear no trainer betrayals or years of being dogged by second-raters in the press.

Eat, drink, visit, ???, present, in the company of your peers.

The more traditional call for participation, yawn, has the following details:

Dates:

15 March 2013 – Peer review applications due
19 April 2013 – Paper submissions due
19 April 2013 – Applications for student support awards due
21 May 2013 – Speakers notified
12 July 2013 – Final papers due

5 August 2013 – Pre-conference Symposium on XForms
6-9 August 2013 – Balisage: The Markup Conference

From the call:

Balisage is where people interested in descriptive markup meet each year in August for informed technical discussion, occasionally impassioned debate, good coffee, and the incomparable ambience of one of North America’s greatest cities, Montreal. We welcome anyone interested in discussing the use of descriptive markup to build strong, lasting information systems.

Practitioner or theorist, tool-builder or tool-user, student or lecturer — you are invited to submit a paper proposal for Balisage 2013. As always, papers at Balisage can address any aspect of the use of markup and markup languages to represent information and build information systems. Possible topics include but are not limited to:

• XML and related technologies
• Non-XML markup languages
• Big Data and XML
• Implementation experience with XML parsing, XSLT processors, XQuery processors, XML databases, XProc integrations, or any markup-related technology
• Semantics, overlap, and other complex fundamental issues for markup languages
• Case studies of markup design and deployment
• Quality of information in markup systems
• JSON and XML
• Efficiency of Markup Software
• Markup systems in and for the mobile web
• The future of XML and of descriptive markup in general
• Interesting applications of markup

In addition, please consider becoming a Peer Reviewer. Reviewers play a critical role towards the success of Balisage. They review blind submissions — on topics that interest them — for technical merit, interest, and applicability. Your comments and recommendations can assist the Conference Committee in creating the program for Balisage 2013!

How:

More IQ per square foot than any other conference you will attend in 2013!

### Balisage 2013 – Dates/Location

Tuesday, November 20th, 2012

Tommie Usdin just posted email with the Balisage 2013 dates and location:

Montreal, Hotel Europa, August 5 – 9 , 2013

Hope that works with everything else.

That’s the entire email so I don’t know what was meant by:

Hope that works with everything else.

Short of it being your own funeral, open-heart surgery or giving birth (to your first child), I am not sure what “everything else” there could be?

You get a temporary excuse for the second two cases and a permanent excuse for the first one.

Now’s a good time to hint about plane fare plus hotel and expenses for Balisage as a stocking stuffer.

And to wish a happy holiday to Tommie Usdin and to all the folks at Mulberry Technologies who make Balisage possible for all of us. Each and every one.

### Show Me The Money!

Monday, June 25th, 2012

I need to talk to Tommie Usdin about marketing the Balisage conference.

The final program came out today and here is what Tommie had to say:

When the regular (peer-reviewed) part of the Balisage 2012 program was scheduled, a few slots were reserved for presentation of “Late breaking” material. These presentations have now been selected and added to the program.

• making robust and multi-platform ebooks
• creating representative documents from large document collections
• validating RESTful services using XProc, XSLT, and XSD
• XML for design-based (e.g. magazine) publishing
• provenance in XSLT transformation (tracking what XSLT does to documents)
• literate programming
• managing the many XML-related standards and specifications
• leveraging XML for web applications

The program already included talks about adding RDF to TEI documents, compression of XML documents, exploring large XML collections, Schematron, relation of XML to JSON, overlap, higher-order functions in XSLT, the balance between XML and non-XML notations, and many other topics. Now it is a real must for anyone who thinks deeply about markup.

Balisage is the XML Geek-fest; the annual gathering of people who design markup and markup-based applications; who develop XML specifications, standards, and tools; the people who read and write books about publishing technologies in general and XML in particular; and super-users of XML and related technologies. You can read about the Balisage 2011 conference at http://www.balisage.net.

Yawn. Are we there yet? 😉

Why you should care about XML and Balisage:

• US government and others are publishing laws and regulations and soon to be legislative material in XML
• Securities are increasingly using XML for required government reports
• Texts and online data sets are being made available in XML
• All the major document formats are based in XML

A $billion here, a $billion there and pretty soon you are talking about real business opportunity.

Be smart, make your XML developers imaginative and productive.

Send your XML developers to Balisage.

### BaseX 7.3 (The Summer Edition) is now available!

Thursday, June 21st, 2012

BaseX 7.3 (The Summer Edition) is now available!

From the post:

we are glad to announce a great new release of BaseX, our XML database and XPath/XQuery 3.0 processor! Here are the latest features:

• Many new internal XQuery Modules have been added, and existing ones have been revised to ensure long-term stability of your future XQuery applications
• A new powerful Command API is provided to specify BaseX commands and scripts as XML
• The full-text fuzzy index was extended to also support wildcard queries
• The simple map operator of XQuery 3.0 gives you a compact syntax to process items of sequences
• BaseX as Web Application can now start its own server instance
• All command-line options will now be executed in the given order
• Charles Foster’s latest XQJ Driver supports XQuery 3.0 and the Update and Full Text extensions

For those of you in the Northern Hemisphere, we wish you a nice summer! No worries, we’ll stay busy..

Just in time for the start of summer in the Northern Hemisphere!

Something you can toss onto your laptop before you head to the beach.

Err, huh? Well, even if you don’t take BaseX 7.3 to the beach, it promises to be good fun for the summer and more serious work should the occasion arise.

I count twenty-three (23) modules in addition to the XQuery functions specified by the latest XPath/XQuery 3.0 draft.

Just so you know, the BaseX database server listens to port 1984 by default.

### Are You Going to Balisage?

Friday, June 1st, 2012

To the tune of “Are You Going to Scarborough Fair:”

Are you going to Balisage?
Parsley, sage, rosemary and thyme.
Remember me to one who is there,
she once was a true love of mine.

Tell her to make me an XML shirt,
Parsley, sage, rosemary, and thyme;
Without any seam or binary code,
Then she shall be a true lover of mine.

….

Oh, sorry! There you will see:

• higher-order functions in XSLT
• Schematron to enforce consistency constraints
• relation of the XML stack (the XDM data model) to JSON
• integrating JSON support into XDM-based technologies like XPath, XQuery, and XSLT
• XML and non-XML syntaxes for programming languages and documents
• type introspection in XQuery
• using XML to control processing in a document management system
• standardizing use of XQuery to support RESTful web interfaces
• RDF to record relations among TEI documents
• high-performance knowledge management system using an XML database
• a corpus of overlap samples
• an XSLT pipeline to translate non-XML markup for overlap into XML
• comparative entropy of various representations of XML
• interoperability of XML in web browsers
• XSLT extension functions to validate OCL constraints in UML models
• ontological analysis of documents
• statistical methods for exploring large collections of XML data

Balisage is an annual conference devoted to the theory and practice of descriptive markup and related technologies for structuring and managing information. Participants typically include XML users, librarians, archivists, computer scientists, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, Topic-Map enthusiasts, semantic-Web evangelists, members of the working groups which define the specifications, academics, industrial researchers, representatives of governmental bodies and NGOs, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Discussion is open, candid, and unashamedly technical.

The Balisage 2012 Program is now available at: http://www.balisage.net/2012/Program.html

### Destination: Montreal!

Tuesday, May 29th, 2012

If you remember the Saturday afternoon sci-fi movies, Destination: …., then you will appreciate the title for this post. 😉

Tommie Usdin and company just posted: Balisage 2012 Call for Late-breaking News, written in torn bodice style:

The peer-reviewed part of the Balisage 2012 program has been scheduled (and will be announced in a few days). A few slots on the Balisage program have been reserved for presentation of “Late-breaking” material.

Proposals for late-breaking slots must be received by June 15, 2012. Selection of late-breaking proposals will be made by the Balisage conference committee, instead of being made in the course of the regular peer-review process.

If you have a presentation that should be part of Balisage, please send a proposal message as plain-text email to info@balisage.net.

In order to be considered for inclusion in the final program, your proposal message must supply the following information:

• The name(s) and affiliations of all author(s)/speaker(s)
• The email address of the presenter
• The title of the presentation
• An abstract of 100-150 words, suitable for immediate distribution
• Disclosure of when and where, if some part of this material has already been presented or published
• An indication as to whether the presenter is comfortable giving a conference presentation and answering questions in English about the material to be presented
• Your assurance that all authors are willing and able to sign the Balisage Non-exclusive Publication Agreement (http://www.balisage.net/BalisagePublicationAgreement.pdf) with respect to the proposed presentation

In order to be in serious contention for inclusion in the final program, your proposal should probably be either a) really late-breaking (it happened in the last month or two) or b) a paper, an extended paper proposal, or a very long abstract with references. Late-breaking slots are few and the competition is fiercer than for peer-reviewed papers. The more we know about your proposal, the better we can appreciate the quality of your submission.

Please feel encouraged to provide any other information that could aid the conference committee as it considers your proposal, such as a detailed outline, samples, code, and/or graphics. We expect to receive far more proposals than we can accept, so it’s important that you send enough information to make your proposal convincing and exciting. (This material may be attached to the email message, if appropriate.)

The conference committee reserves the right to make editorial changes in your abstract and/or title for the conference program and publicity. (emphasis added to last sentence)

The conference committee reserves the right to make editorial changes in your abstract and/or title for the conference program and publicity.

The conference committee might change your abstract and/or title to say something …. controversial? ….attention getting? ….CNN / Slashdot worthy?

Bring it on!

Submit late breaking proposals!

### Would You Know “Good” XML If It Bit You?

Tuesday, February 14th, 2012

XML is a pale imitation of a markup language. It has resulted in real horrors across the markup landscape. After years in its service, I don’t have much hope of that changing.

But, the Princess of the Northern Marches has organized a war council to consider how to stem the tide of bad XML. Despite my personal misgivings, I wish them well and invite you to participate as you see fit.

Oh, and I found this message about the council meeting:

International Symposium on Quality Assurance and Quality Control in XML

Monday August 6, 2012

Paper submissions due April 20, 2012.

A one-day discussion of issues relating to Quality Control and Quality Assurance in the XML environment.

XML systems and software are complex and constantly changing. XML documents are highly varied, may be large or small, and often have complex life-cycles. In this challenging environment quality is difficult to define, measure, or control, yet the justifications for using XML often include promises or implications relating to quality.

We invite papers on all aspects of quality with respect to XML systems, including but not limited to:

• Defining, measuring, testing, improving, and documenting quality
• Quality in documents, document models, software, transformations, or queries
• Case studies in the control of quality in an XML environment
• Theoretical or practical approaches to measuring quality in XML
• Does the presence of XML, XML schemas, and XML tools make quality checking easier, harder, or even different from other computing environments
• Should XML transforms and schemas be QAed as software? Or configuration files? Or documents? Does it matter?

Paper submissions due April 20, 2012.

Details at: http://www.balisage.net/QA-QC/

You do have to understand the semantics of even imitation markup languages before mapping them with more robust languages. Enjoy!

### XML Prague 2012 (proceedings)

Sunday, February 12th, 2012

Fourteen papers by the leading lights in the XML world covering everything from XProc and XQuery to NVDL and JSONiq, and places in between.

### BaseX

Saturday, October 15th, 2011

BaseX

From the webpage:

BaseX is a very light-weight and high-performance XML database system and XPath/XQuery processor, including full support for the W3C Update and Full Text extensions. An interactive and user-friendly GUI frontend gives you great insight into your XML documents and collections.

But most important: BaseX runs out of the box and is easy to use…

For those of us who don’t think documents, even XML documents, are all that weird. 😉

### Balisage 2011 – Final Program

Friday, July 1st, 2011

A recent post from Tommie Usdin announces the following additions to the Balisage 2011 program:

• XQuery and SparQL
• XQuery and XSLT
• the Logical Form of a Metadata Record
• Why is XML a pain to produce?
• XML Serialization of C# and Java Objects
• testing XSLT in continuous integration
• dealing with markup without using words
• REST for document resource nodes
• tagging journal article supplemental materials
• using 15 year old SGML documents in current software

and then goes on to talk about why markup geeks should be at Balisage.

I’ll make that shorter:

If you see either < or > at work or anyone talks about them, you need to be at Balisage 2011.

If you are not a markup geek, you will be one by the time you leave. Road to Damascus sort of experience. Or you will decide to move to San Francisco. Either way, what do you have to lose?

August 2-5, 2011, Montreal, Canada. Time is running out!

### Six Drafts Published Related to XSLT, XQuery, XPath (21 June 2011)

Thursday, June 23rd, 2011

Six Drafts Published Related to XSLT, XQuery, XPath (21 June 2011)

From the post:

Has anyone compared the addressing capabilities of XQuery to HyTime?

### Balisage 2011 Preliminary Program

Wednesday, May 18th, 2011

At-A-Glance

Program (in full)

From the announcement (Tommie Usdin):

Topics this year include:

• optimizing XSLT and XQuery processing
• interchange, interoperability, and packaging of XML documents
• eBooks and epub
• overlapping markup and related topics
• visualization
• encryption
• data mining

The acronyms this year include:

XML XSLT XQuery XDML REST XForms JSON OSIS XTemp RDF SPARQL XPath

New this year will be:

Lightning talks: an opportunity for participants to say what they think, simply, clearly, and persuasively.

As I have said before, simply the best conference of the year!

Conference site: http://www.balisage.net/

Registration: http://www.balisage.net/registration.html

### Practical Transformation Using XSLT and XPath

Sunday, February 13th, 2011

A new edition of Practical Transformation Using XSLT and XPath by Ken Holman is out.

While not topic map specific, ;-), this is one of the two resources you need for transformations getting to (or from) topic maps using XSLT and XPath. The other one would be: XSLT 2.0 and XPath 2.0: programmer’s reference. (You can also use both of these for non-topic map, XML based work.)

While you’re looking at Ken’s training resources, note his series on UBL (Universal Business Language).

I mention that because the greater the exposure of business systems the greater the need for the mapping of semantics (that means topic maps).