Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 16, 2019

Brzozowski derivatives – Invisible XML – Thinking, Wishing, Saying – Must be … Balisage 2019!

Filed under: Conferences,XML,XQuery,XSLT — Patrick Durusau @ 1:20 pm

Balisage 2019 Program Announced!

An awesome lineup of topics and speakers awaits Balisage 2019 goers. From the expected, standoff markup in browsers (yes, that’s usual fare at Balisage), to the re-invention of markup “seen” when looking at a file with no markup (HyTime) and beyond, you are in for a real treat.

I saw several slots for late-breaking news so if you have something really profound and coherent to say, you’d best be polishing it now. Just looking at the current program gives you an idea of the competition for slots.

Why attend? General Eric Shinseki said it best:

If you dislike change, you’re going to dislike irrelevance even more.

Don’t risk irrelevance! Attend Balisage 2019!

January 13, 2019

Exciting new features in XSLT 3 for book publishers

Filed under: Publishing,XSLT — Patrick Durusau @ 3:28 pm

Exciting new features in XSLT 3 for book publishers by Liam Quin.

From the post:


For e-publishers, the ability of XSLT 3 engines to read from and write to zip archives means you can generate EPUB files directly, or even extract files from ebooks. You can also process binary files, so that it’s possible to work out the size of a bitmap image in pixels, which is useful when embedding graphics into web pages or ebooks. And you can process text files a line at a time with fn:unparsed-text-lines().

Probably the single feature that’s the biggest game-changer for most people in publishing, the most fun, and that gives the largest reduction in costs, is the ability to call XSLT from within XSLT using the new fn:transform() function. This means you can easily build a collection of documents, such as making an EPUB 3 zip file, even if it involves running a separate transformation to create some or all of the components such as the spine or table of contents or index, without resorting to complex batch scripts or other programming languages. This reduces the number of programming or scripting languages you need in a project, reduces the number of components, controls the way the components interlock, and results in something easier to understand and maintain by the same person who works with the underlying XSLT transformations.
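For anyone who hasn’t tried it, here is a minimal sketch of the fn:transform() call Quin describes. The file names (book.xml, toc.xsl, OEBPS/toc.xhtml) are my own placeholders, not from his post, and the sketch assumes a processor that supports the XPath 3.1 function library, where fn:transform() is defined.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: run with the initial template, no source document needed. -->
  <xsl:template name="xsl:initial-template">

    <!-- Hypothetical inputs: book.xml is the source, toc.xsl builds a table of contents. -->
    <xsl:variable name="result" select="
        transform(map {
          'stylesheet-location': 'toc.xsl',
          'source-node': doc('book.xml')
        })"/>

    <!-- fn:transform() returns a map; the principal result sits under the key 'output'. -->
    <xsl:result-document href="OEBPS/toc.xhtml">
      <xsl:sequence select="$result('output')"/>
    </xsl:result-document>

  </xsl:template>

</xsl:stylesheet>
```

Each additional component (spine, index, and so on) would be one more transform() call inside the same stylesheet, which is the point of the feature: no batch script in between.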

Part of a tease for Quin’s presentation at: EBOOKCRAFT March 18-19, 2019 | MaRS Discovery District. Videos from 2018 are available.

I like to think of XQuery and XSLT as ways to liberate and transform data, but I have to concede they have legitimate purposes as well. 😉

If you are an ebook publisher, Quin’s presentation at EBookCraft should be on your must-attend calendar.

January 9, 2019

Summer is Coming! Balisage is Coming! Papers Due April 12, 2019!

Filed under: Conferences,XML,XML Database,XML Query Rewriting,XML Schema,XPath,XProc,XQuery,XSLT — Patrick Durusau @ 7:52 pm

From a recent email about Balisage 2019:

Some “Balisage: The Markup Conference 2019” dates are coming soon:

March 29, 2019 — Peer-review applications due
April 12, 2019 — Paper submissions due
July 30 — August 2, 2019 — Balisage: The Markup Conference
July 29, 2019 — Pre-conference Symposium – Topic to be announced
https://www.balisage.net/

Balisage: where serious markup practitioners and theoreticians meet every August.

A colleague recently asked me to share the program for Balisage 2019 to help support a request to attend. What, I was asked, will we talk about at Balisage 2019? I replied “It will be a variety of topics relating to markup, but we won’t know the specifics until May.” “Why? It seems like you should know that now.” was the response. “Why don’t you just decide who you want to talk about what and assign topics?” “Because that would not be a contributed paper conference, it would be some other sort of event!”

Balisage *is* a contributed paper conference, and the submissions from people who want to speak drive the program, the hallway conversations, and the whole tone of Balisage!

If you want to speak at Balisage 2019, if you want to help shape the conversation, if you have an idea, experience, opinion, or question relating to markup, please submit a paper to Balisage 2019!

We solicit papers on any aspect of markup and its uses; topics include but ARE NOT LIMITED TO:

• Cutting-edge applications of XML and related technologies
• Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
• Performance issues in parsing, XML database retrieval, or XSLT processing
• Development of angle-bracket-free user interfaces for non-technical users
• Deployment of XML systems for enterprise data
• Design and implementation of XML vocabularies
• Case studies of the use of XML for publishing, interchange, or archiving
• Alternatives to XML/JSON/whatever
• Expressive power and application adequacy of XSD, Relax NG, DTDs, Schematron, and other schema languages
• Invisible XML

Detailed Call for Participation: https://www.balisage.net/Call4Participation.html
Call for Peer Reviewers: https://www.balisage.net/peer/ReviewAppForm.html
About Balisage: https://www.balisage.net/

For more information: info@balisage.net or +1 301 315 9631

Papers are due for Balisage in a little more than 90 days.

Anyone doing a topic map paper this year?

“If you can point to it, we can identify it. If we can identify it, we can map it. If we can map it, …,” well, you know how the rest of it goes.

Data silos continue to exist because they are armor. Armor that protects some stakeholders from prying eyes. Up for a little peeping?

May 29, 2018

Balisage Late-Breaking News Deadline – 6 July 2018 – Attract/Spot a Fed!

Filed under: Conferences,XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 7:10 pm

Balisage 2018 Call for Late-breaking News

From the post:


Proposals for late-breaking slots must be received at info@balisage.net by July 6, 2018. Selection of late-breaking proposals will be made by the Balisage conference committee, instead of being made in the course of the regular peer-review process. (emphasis in original)

The Def Con conference attendees play spot the fed.

But spot the fed requires some feds in order to play.

Feds show up at hacker conferences. For content or the company of people with poor personal hygiene.

Let’s assume it’s the content.

What content for a markup paper will attract undercover federal agents?

Success means playing spot the fed at Balisage 2018.

Topics anyone?

May 23, 2018

Balisage 2018 Program!

Filed under: Conferences,XML,XPath,XQuery,XSLT — Patrick Durusau @ 12:40 pm

The Balisage 2018 program has hit the Web!

Among the goodies on the agenda:

  • Implementing and using concurrent document structures
  • White-hat web crawling: Industrial strength web crawling for serious content acquisition
  • Easing the road to declarative programming in XSLT for imperative programmers
  • Fractal information is
  • Scaling XML using a Beowulf cluster

That’s a random sampling from the talks already scheduled!

Even more intriguing are the open spots left for “late-breaking” news.

Perhaps you have some “late-breaking” XML related news to share?

I haven’t seen the 2018 Call for Late-Breaking papers but if the 2017 Call for Late-Breaking papers is any guide, time is running out!

Enjoy!

February 9, 2018

XML Prague 2018 Conference Proceedings – Weekend Reading!

Filed under: Conferences,XML,XML Database,XPath,XQuery,XSLT — Patrick Durusau @ 9:13 pm

XML Prague 2018 Conference Proceedings

Two Hundred and Sixty (260) pages of high quality content on XML!

From the table of contents:

  • Assisted Structured Authoring using Conditional Random Fields – Bert Willems
  • XML Success Story: Creating and Integrating Collaboration Solutions to Improve the Documentation Process – Steven Higgs
  • xqerl: XQuery 3.1 Implementation in Erlang – Zachary N. Dean
  • XML Tree Models for Efficient Copy Operations – Michael Kay
  • Using Maven with XML development projects – Christophe Marchand and Matthieu Ricaud-Dussarget
  • Varieties of XML Merge: Concurrent versus Sequential – Tejas Pradip Barhate and Nigel Whitaker
  • Including XML Markup in the Automated Collation of Literary Text – Elli Bleeker, Bram Buitendijk, Ronald Haentjens Dekker, and Astrid Kulsdom
  • Multi-Layer Content Modelling to the Rescue – Erik Siegel
  • Combining graph and tree – Hans-Juergen Rennau
  • SML – A simpler and shorter representation of XML – Jean-François Larvoire
  • Can we create a real world rich Internet application using Saxon-JS? – Pieter Masereeuw
  • Implementing XForms using interactive XSLT 3.0 – O’Neil Delpratt and Debbie Lockett
  • Life, the Universe, and CSS Tests – Tony Graham
  • Form, and Content – Steven Pemberton
  • tokenized-to-tree – Gerrit Imsieke

I just got a refurbished laptop for reading in bed. Now I have to load XML parsers, etc. on it to use along with reading these proceedings!

Enjoy!

PS: Be sure to thank Jirka Kosek for his tireless efforts promoting XML and XML Prague!

January 9, 2018

Sessions for XML Prague 2018 – January 10th, Early Bird Deadline!

Filed under: Conferences,XML,XQuery,XSLT — Patrick Durusau @ 8:03 pm

List of sessions for XML Prague 2018

The range of great presentations is no surprise.

That early registration is still open, with this list of presentations, well, that is a surprise!

January 10, 2018 is the deadline for early birds!

From the post:

Unconference day

Schematron Users Meetup
XSL-FO, CSS and Paged Output – hosted by Antenna House
Introduction to CSS for Paged Media
XSpec Users Meetup
oXygen Users Meetup
Creating beautiful documents with the speedata Publisher
eXist-db Community Meetup
XML with Emacs workshop

Friday and Saturday sessions

Bert Willems: Assisted Structured Authoring using Conditional Random Fields
Christophe Marchand and Matthieu Ricaud-Dussarget: Using Maven with XML Projects
Elli Bleeker, Bram Buitendijk, Ronald Haentjens Dekker and Astrid Kulsdom: Including XML Markup in the Automated Collation of Literary Texts
Erik Siegel: Multi-layered content modelling to the rescue
Francis Cave: Does the world need more XML standards?
Gerrit Imsieke: tokenized-to-tree – An XProc/XSLT Library For Patching Back Tokenization/Analysis Results Into Marked-up Text
Hans-Juergen Rennau: Combining graph and tree: writing SHAX, obtaining SHACL, XSD and more
James Fuller: Diff with XQuery
Jean-François Larvoire: SML – A simpler and shorter representation of XML
Johannes Kolbe and Manuel Montero: XML periodic table, XML repository and XSLT checker
Michael Kay: XML Tree Models for Efficient Copy Operations
O’Neil Delpratt and Debbie Lockett: Implementing XForms using interactive XSLT 3.0
Pieter Masereeuw: Can we create a real world rich Internet application using Saxon-JS?
Radu Coravu: A short story about XML encoding and opening very large documents in an XML editing application
Steven Higgs: XML Success Story: Creating and Integrating Collaboration Solutions to Improve the Documentation Process
Steven Pemberton: Form, and Content
Tejas Barhate and Nigel Whitaker: Varieties of XML Merge: Concurrent versus Sequential
Tony Graham: Life, the Universe, and CSS Tests
Vasu Chakkera: Effective XSLT Documentation and its separation from XSLT code
Zachary Dean: xqerl: XQuery 3.1 Implementation in Erlang

I’m expecting lots of tweets and posts about these presentations!

November 13, 2017

XML Prague 2017 – 21 Reasons to Attend 2018 – Offensive Use of XQuery

Filed under: Conferences,XML,XPath,XQuery,XSLT — Patrick Durusau @ 8:41 pm

XML Prague 2017 Videos

Need reasons for your attending XML Prague 2018?

The XML Prague 2017 YouTube playlist has twenty-one (21) very good reasons (videos). (You may have to hold the hands of c-suite types if you share the videos with them.)

Two things that I see missing from the presentations: security and offensive use of XQuery.

XML Security

You may have noticed that corporations, governments and others have been hemorrhaging data in 2017 (and before). While legislators wail ineffectually and wish for an 18th-century world, the outlook for cybersecurity looks grim for 2018.

XML and XML applications exist in a law of the jungle security context. But there weren’t any presentations on security related issues at XML Prague in 2017. Are you going to break the ice in 2018?

Offensive use of XQuery

XQuery has the power to extract, enhance and transform data to serve your interests, not those of its authors.

I’ve heard the gospel that technologists should disarm themselves and righteously await a better day. Meanwhile, governments, military forces, banks, and their allies loot and rape the Earth and its peoples.

Are data scientists at the NSA, FSB, MSS, MI6, Mossad, CIA, etc., constrained by your “do no evil” creeds?

Present governments or their successors can move towards more planet and people friendly policies, but they require, ahem, encouragement.

XQuery, which doesn’t depend upon melting data centers, supercomputers, global data vacuuming, etc., can help supply that encouragement.

How would you use XQuery to transform government data to turn it against its originator?

October 12, 2017

XML Prague 2018 – Apology to Procrastinators

Filed under: Conferences,Cybersecurity,Security,XML,XPath,XQuery,XSLT — Patrick Durusau @ 10:49 am

Apology to all procrastinators: I just saw the Call for Proposals for XML Prague 2018.

You only have 50 days (until November 30, 2017) to submit your proposals for XML Prague 2018.

Efficient people don’t realize that 50 days is hardly enough time to put off thinking about a proposal topic, much less fail to write down anything for a proposal. A completely unreasonable demand, but do try to procrastinate quickly and get a proposal done for XML Prague 2018.

The suggestion of doing a “…short video…” seems rife with potential for humor and/or NSFW images. Perhaps XML Prague will post the best “…short videos…” to YouTube?

From the webpage:

XML Prague 2018 now welcomes submissions for presentations on the following topics:

  • Markup and the Extensible Web – HTML5, XHTML, Web Components, JSON and XML sharing the common space
  • Semantic visions and the reality – micro-formats, semantic data in business, linked data
  • Publishing for the 21st century – publishing toolchains, eBooks, EPUB, DITA, DocBook, CSS for print, …
  • XML databases and Big Data – XML storage, indexing, query languages, …
  • State of the XML Union – updates on specs, the XML community news, …
  • XML success stories – real-world use cases of successful XML deployments

There are several different types of slots available during the conference and you can indicate your preferred slot during submission:

30 minutes
15 minutes
These slots are suitable for normal conference talks.
90 minutes (unconference)
Ideal for holding users meeting or workshop during the unconference day (Thursday).

All proposals will be submitted for review by a peer review panel made up of the XML Prague Program Committee. Submissions will be chosen based on interest, applicability, technical merit, and technical correctness.

Submissions should contain original material and fit the topics previously listed. Submissions which can be construed as product or service descriptions (adverts) will likely be deemed inappropriate. Other approaches such as use case studies are welcome but must be clearly related to conference topics.

Proposals can have several forms:

full paper
In our opinion still ideal and classical way of proposing presentation. Full paper gives reviewers enough information to properly assess your proposal.
extended abstract
Concise 1-4 page long description of your topic. If you do not have time to write full paper proposal this is one possible way to go. Try to make your extended abstract concrete and specific. Too short or vague abstract will not convince reviewers that it is worth including into the conference schedule.
short video (max. 5 minutes)
If you are not writing person but you still have something interesting to present. Simply capture short video (no longer than 5 minutes) containing part of your presentation. Video can capture you or it can be screen cast.

I mentioned XSLT security attacks recently; perhaps you could do something similar on XQuery? Or on other ways to use XML and related technologies to breach cybersecurity?

Do submit proposals and enjoy XML Prague 2018!

October 6, 2017

XSLT Server Side Injection Attacks

Filed under: Cybersecurity,Security,XML,XSLT — Patrick Durusau @ 12:02 pm

XSLT Server Side Injection Attacks by David Turco.

From the post:

Extensible Stylesheet Language Transformations (XSLT) vulnerabilities can have serious consequences for the affected applications, often resulting in remote code execution. Examples of XSLT remote code execution vulnerabilities with public exploits are CVE-2012-5357 affecting the .Net Ektron CMS; CVE-2012-1592 affecting Apache Struts 2.0; and CVE-2005-3757 which affected the Google Search Appliance.

From the examples above it is clear that XSLT vulnerabilities have been around for a long time and, although they are less common than other similar vulnerabilities such as XML Injection, we regularly find them in our security assessments. Nonetheless the vulnerability and the exploitation techniques are not widely known.

In this blog post we present a selection of attacks against XSLT to show the risks of using this technology in an insecure way.

We demonstrate how it is possible to execute arbitrary code remotely; exfiltrate data from remote systems; perform network scans; and access resources on the victim’s internal network.

We also make available a simple .NET application vulnerable to the described attacks and provide recommendations on how to mitigate them.

A great post for introducing XML and XSLT to potential hackers!

Equally great potential for a workshop at a markup conference.

Enjoy!

October 4, 2017

Procrastinators – Dates/Location for Balisage: The Markup Conference 2018

Filed under: Conferences,JSON,XML,XPath,XQuery,XSLT — Patrick Durusau @ 12:48 pm

Procrastinators can be caught short, without enough time for proper procrastination on papers and slides.

To ensure ample time for procrastination, Balisage: The Markup Conference 2018 has published its dates and location.

31 July 2018–3 August 2018 … Balisage: The Markup Conference
30 July 2018 … Symposium – topic to be announced
CAMBRiA Hotel & Suites
1 Helen Heneghan Way
Rockville, Maryland 20850
USA

For indecisive procrastinators, Balisage offers suggestions for your procrastination:

The 2017 program included papers discussing XML vocabularies, cutting-edge digital humanities, lossless JSON/XML roundtripping, reflections on concrete syntax and abstract syntax, parsing and generation, web app development using the XML stack, managing test cases, pipelining and micropipelining, electronic health records, rethinking imperative algorithms for XSLT and XQuery, markup and intellectual property, digitizing Ethiopian and Eritrean manuscripts, exploring “shapes” in RDF and their relationship to schema validation, exposing XML data to users of varying technical skill, test-suite management, and use case studies about large conversion applications, DITA, and SaxonJS.

Innovative procrastinators can procrastinate on other related topics, including any they find on the Master Topic List (ideas procrastinated on for prior Balisage conferences).

Take advantage of this opportunity to procrastinate early and long on your Balisage submissions. You and your audience will be glad you did!

PS: Don’t procrastinate on saying thank you to Tommie Usdin and company for another year of Balisage. Balisage improves XML theory and practice every year it is held.

June 8, 2017

XSL Transformations (XSLT) Version 3.0 (That’s a Wrap!)

Filed under: XML,XSLT — Patrick Durusau @ 10:02 am

XSL Transformations (XSLT) Version 3.0 W3C Recommendation 8 June 2017

Abstract:

This specification defines the syntax and semantics of XSLT 3.0, a language designed primarily for transforming XML documents into other XML documents.

XSLT 3.0 is a revised version of the XSLT 2.0 Recommendation [XSLT 2.0] published on 23 January 2007.

The primary purpose of the changes in this version of the language is to enable transformations to be performed in streaming mode, where neither the source document nor the result document is ever held in memory in its entirety. Another important aim is to improve the modularity of large stylesheets, allowing stylesheets to be developed from independently-developed components with a high level of software engineering robustness.

XSLT 3.0 is designed to be used in conjunction with XPath 3.0, which is defined in [XPath 3.0]. XSLT shares the same data model as XPath 3.0, which is defined in [XDM 3.0], and it uses the library of functions and operators defined in [Functions and Operators 3.0]. XPath 3.0 and the underlying function library introduce a number of enhancements, for example the availability of higher-order functions.

As an implementer option, XSLT 3.0 can also be used with XPath 3.1. All XSLT 3.0 processors provide maps, an addition to the data model which is specified (identically) in both XSLT 3.0 and XPath 3.1. Other features from XPath 3.1, such as arrays, and new functions such as random-number-generator (FO31) and sort (FO31), are available in XSLT 3.0 stylesheets only if the implementer chooses to support XPath 3.1.

Some of the functions that were previously defined in the XSLT 2.0 specification, such as the format-date (FO30) and format-number (FO30) functions, are now defined in the standard function library to make them available to other host languages.

XSLT 3.0 also includes optional facilities to serialize the results of a transformation, by means of an interface to the serialization component described in [XSLT and XQuery Serialization]. Again, the new serialization capabilities of [XSLT and XQuery Serialization 3.1] are available at the implementer’s option.

This document contains hyperlinks to specific sections or definitions within other documents in this family of specifications. These links are indicated visually by a superscript identifying the target specification: for example XP30 for XPath 3.0, DM30 for the XDM data model version 3.0, FO30 for Functions and Operators version 3.0.

A special shout out to Michael Kay for, in his words, “Done and dusted: ten years’ work.”

Thanks from an appreciative audience!

May 17, 2017

Balisage: The Markup Conference 2017 Program Now Available

Filed under: Conferences,XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 3:42 pm

Balisage: The Markup Conference 2017 Program Now Available

An email from Tommie Usdin, Chair, Chief Organizer and herder of markup cats for Balisage advises:

Balisage: where serious markup practitioners and theoreticians meet every August.

The 2017 program includes papers discussing XML vocabularies, cutting-edge digital humanities, lossless JSON/XML roundtripping, reflections on concrete syntax and abstract syntax, parsing and generation, web app development using the XML stack, managing test cases, pipelining and micropipelining, electronic health records, rethinking imperative algorithms for XSLT and XQuery, markup and intellectual property, why YOU should use (my favorite XML vocabulary), developing a system to aid in studying manuscripts in the tradition of the Ethiopian and Eritrean Highlands, exploring “shapes” in RDF and their relationship to schema validation, exposing XML data to users of varying technical skill, test-suite management, and use case studies about large conversion applications, DITA, and SaxonJS.

Up-Translation and Up-Transformation: A one-day Symposium on the goals, challenges, solutions, and workflows for significant XML enhancements, including approaches, tools, and techniques that may potentially be used for a variety of other tasks. The symposium will be of value not only to those facing up-translation and transformation but also to general XML practitioners seeking to get the most out of their data.

Are you interested in open information, reusable documents, and vendor and application independence? Then you need descriptive markup, and Balisage is your conference. Balisage brings together document architects, librarians, archivists, computer scientists, XML practitioners, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, semantic-Web evangelists, standards developers, academics, industrial researchers, government and NGO staff, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Some participants are busy designing replacements for XML while others still use SGML (and know why they do).

Discussion is open, candid, and unashamedly technical.

Balisage 2017 Program:
http://www.balisage.net/2017/Program.html

Symposium Program:
https://www.balisage.net/UpTransform

NOTE: Members of the TEI and their employees are eligible for discount Balisage registration.

You need to see the program for yourself but the highlights (for me) include: Ethiopic manuscripts (ok, so I have odd tastes), Earley parsers (of particular interest), English Majors (my wife was an English major), and a number of other high points.

Mark your calendar for July 31 – August 4, 2017 – It’s Balisage!

April 19, 2017

Pure CSS crossword – CSS Grid

Filed under: Crossword Puzzle,Education,XPath,XQuery,XSLT — Patrick Durusau @ 4:22 pm

Pure CSS crossword – CSS Grid by Adrian Roworth.

The UI is slick, although creating the puzzle remains on you.

Certainly suitable for string answers, XQuery/XPath/XSLT expressions, etc.

Enjoy!

April 18, 2017

XSL Transformations (XSLT) Version 3.0 (Proposed Recommendation 18 April 2017)

Filed under: W3C,XML,XSLT — Patrick Durusau @ 6:53 pm

XSL Transformations (XSLT) Version 3.0 (Proposed Recommendation 18 April 2017)

Michael Kay tweeted today:

XSLT 3.0 is a Proposed Recommendation: https://www.w3.org/TR/xslt-30/ It’s taken ten years but we’re nearly there!

Congratulations to Michael and the entire team!

What’s new?

A major focus for enhancements in XSLT 3.0 is the requirement to enable streaming of source documents. This is needed when source documents become too large to hold in main memory, and also for applications where it is important to start delivering results before the entire source document is available.

While implementations of XSLT that use streaming have always been theoretically possible, the nature of the language has made it very difficult to achieve this in practice. The approach adopted in this specification is twofold: it identifies a set of restrictions which, if followed by stylesheet authors, will enable implementations to adopt a streaming mode of operation without placing excessive demands on the optimization capabilities of the processor; and it provides new constructs to indicate that streaming is required, or to express transformations in a way that makes it easier for the processor to adopt a streaming execution plan.

Capabilities provided in this category include:

  • A new xsl:source-document instruction, which reads and processes a source document, optionally in streaming mode;
  • The ability to declare that a mode is a streaming mode, in which case all the template rules using that mode must be streamable;
  • A new xsl:iterate instruction, which iterates over the items in a sequence, allowing parameters for the processing of one item to be set during the processing of the previous item;
  • A new xsl:merge instruction, allowing multiple input streams to be merged into a single output stream;
  • A new xsl:fork instruction, allowing multiple computations to be performed in parallel during a single pass through an input document.
  • Accumulators, which allow a value to be computed progressively during streamed processing of a document, and accessed as a function of a node in the document, without compromise to the functional nature of the XSLT language.
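If the list reads as abstract, a minimal sketch of two of these constructs working together may help: xsl:source-document reads a file in streaming mode and xsl:iterate carries a running total from one item to the next. The input file orders.xml and its order/@amount structure are hypothetical, my invention for illustration, and actual streamed execution assumes a processor that implements the streaming feature.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical input: <orders><order amount="..."/>...</orders> -->
    <xsl:source-document href="orders.xml" streamable="yes">
      <xsl:iterate select="orders/order">
        <!-- Running total, handed from one iteration to the next. -->
        <xsl:param name="total" as="xs:decimal" select="0"/>
        <!-- Evaluated once, after the last order has been processed. -->
        <xsl:on-completion>
          <total><xsl:value-of select="$total"/></total>
        </xsl:on-completion>
        <!-- Only the current order's attribute is read, so no look-back is needed. -->
        <xsl:next-iteration>
          <xsl:with-param name="total" select="$total + xs:decimal(@amount)"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```

The design point is that the total is passed forward as a parameter rather than computed with sum() over the whole document, which is what lets a processor discard each order as soon as it has been read.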

A second focus for enhancements in XSLT 3.0 is the introduction of a new mechanism for stylesheet modularity, called the package. Unlike the stylesheet modules of XSLT 1.0 and 2.0 (which remain available), a package defines an interface that regulates which functions, variables, templates and other components are visible outside the package, and which can be overridden. There are two main goals for this facility: it is designed to deliver software engineering benefits by improving the reusability and maintainability of code, and it is intended to streamline stylesheet deployment by allowing packages to be compiled independently of each other, and compiled instances of packages to be shared between multiple applications.
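A package, in its smallest form, might look like the sketch below. The package name, the u: namespace, and both function names are hypothetical, chosen only to show one public and one private component.

```xslt
<xsl:package name="http://example.com/slug" package-version="1.0" version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:u="http://example.com/slug">

  <!-- Public: part of the package interface, callable by stylesheets that use it. -->
  <xsl:function name="u:slug" as="xs:string" visibility="public">
    <xsl:param name="text" as="xs:string"/>
    <xsl:sequence select="u:squash(lower-case(normalize-space($text)))"/>
  </xsl:function>

  <!-- Private: an implementation detail, not visible outside the package. -->
  <xsl:function name="u:squash" as="xs:string" visibility="private">
    <xsl:param name="text" as="xs:string"/>
    <xsl:sequence select="replace($text, '[^a-z0-9]+', '-')"/>
  </xsl:function>

</xsl:package>
```

A stylesheet would pull this in with an xsl:use-package declaration naming http://example.com/slug. How the processor locates the package from that name is implementation-defined, so check your processor's documentation for that step.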

Other significant features in XSLT 3.0 include:

  • An xsl:evaluate instruction allowing evaluation of XPath expressions that are dynamically constructed as strings, or that are read from a source document;
  • Enhancements to the syntax of patterns, in particular enabling the matching of atomic values as well as nodes;
  • An xsl:try instruction to allow recovery from dynamic errors;
  • The element xsl:global-context-item, used to declare the stylesheet’s expectations of the global context item (notably, its type).
  • A new instruction xsl:assert to assist developers in producing correct and robust code.
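Of these, xsl:try is probably the one most people will reach for first. A minimal sketch, assuming a hypothetical price element whose content may or may not be a valid decimal:

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    exclude-result-prefixes="xs err">

  <!-- Hypothetical input element: <price>12.50</price> or <price>n/a</price> -->
  <xsl:template match="price">
    <xsl:try>
      <!-- The cast raises a dynamic error when the content is not a decimal. -->
      <amount><xsl:value-of select="xs:decimal(.)"/></amount>
      <!-- Recover instead of aborting; $err:description explains what went wrong. -->
      <xsl:catch errors="*">
        <amount status="invalid" reason="{$err:description}"/>
      </xsl:catch>
    </xsl:try>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 2.0 the same bad value would have ended the whole transformation.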

XSLT 3.0 also delivers enhancements made to the XPath language and to the standard function library, including the following:

  • Variables can now be bound in XPath using the let expression.
  • Functions are now first class values, and can be passed as arguments to other (higher-order) functions, making XSLT a fully-fledged functional programming language.
  • A number of new functions are available, for example trigonometric functions, and the functions parse-xml (FO30) and serialize (FO30) to convert between lexical and tree representations of XML.
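A minimal sketch of the first two items in one expression: let binds an inline function to a variable, and the standard for-each function takes that function as an argument. It assumes a processor that supports higher-order functions, which XSLT 3.0 treats as an optional feature.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <!-- $double is a function value; for-each() applies it to each item of 1 to 3. -->
    <doubled>
      <xsl:value-of select="
          let $double := function($x) { 2 * $x }
          return for-each(1 to 3, $double)"/>
    </doubled>
  </xsl:template>

</xsl:stylesheet>
```

Run from the initial template, this produces a doubled element containing 2 4 6.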

XSLT 3.0 also includes support for maps (a data structure consisting of key/value pairs, sometimes referred to in other programming languages as dictionaries, hashes, or associative arrays). This feature extends the data model, provides new syntax in XPath, and adds a number of new functions and operators. Initially developed as XSLT-specific extensions, maps have now been integrated into XPath 3.1 (see [XPath 3.1]). XSLT 3.0 does not require implementations to support XPath 3.1 in its entirety, but it does require support for these specific features.
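A minimal sketch of a map in use, with made-up keys and values. It builds a map with the constructor syntax, lists its keys with map:keys, and looks each one up by calling the map as a function.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <xsl:template name="xsl:initial-template">
    <!-- Illustrative data only: abbreviation to full name. -->
    <xsl:variable name="names" as="map(xs:string, xs:string)" select="
        map {
          'xslt': 'XSL Transformations',
          'xq':   'XQuery'
        }"/>
    <names>
      <!-- Map keys have no defined order, so sort them for stable output. -->
      <xsl:for-each select="map:keys($names)">
        <xsl:sort select="."/>
        <!-- A map is also a function: $names(key) returns the value for that key. -->
        <name key="{.}"><xsl:value-of select="$names(.)"/></name>
      </xsl:for-each>
    </names>
  </xsl:template>

</xsl:stylesheet>
```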

This will remain a proposed recommendation until 1 June 2017.

How close can you read? 😉

Enjoy!

March 16, 2017

Balisage Papers Due in 3 Weeks!

Filed under: Conferences,XML,XQuery,XSLT — Patrick Durusau @ 9:04 pm

Apologies for the sudden lack of posting but I have been working on a rather large data set with XQuery and checking forwards and backwards to make sure it can be replicated. (I hate “it works on my computer.”)

Anyway, Tommie Usdin dropped an email bomb today with a reminder that Balisage papers are due on April 7, 2017.

From her email:

Submissions to “Balisage: The Markup Conference” and pre-conference symposium:
“Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions”
are due on April 7.

It is time to start writing!

Balisage: The Markup Conference 2017
August 1 — 4, 2017, Rockville, MD (a suburb of Washington, DC)
July 31, 2017 — Symposium Up-Translation and Up-Transformation
https://www.balisage.net/

Balisage: where serious markup practitioners and theoreticians meet every August. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

• Web application development with XML
• Informal data models and consensus-based vocabularies
• Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
• Performance issues in parsing, XML database retrieval, or XSLT processing
• Development of angle-bracket-free user interfaces for non-technical users
• Semistructured data and full text search
• Deployment of XML systems for enterprise data
• Design and implementation of XML vocabularies
• Case studies of the use of XML for publishing, interchange, or archiving
• Alternatives to XML
• the role(s) of XML in the application lifecycle
• the role(s) of vocabularies in XML environments

Detailed Call for Participation: http://balisage.net/Call4Participation.html
About Balisage: http://balisage.net/Call4Participation.html

pre-conference symposium:
Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions
Chair: Evan Owens, Cenveo
https://www.balisage.net/UpTransform/index.html

Increasing the granularity and/or specificity of markup is an important task in many content and information workflows. Markup transformations might involve tasks such as high-level structuring, detailed component structuring, or enhancing information by matching or linking to external vocabularies or data. Enhancing markup presents secondary challenges including lack of structure of the inputs or inconsistency of input data down to the level of spelling, punctuation, and vocabulary. Source data for up-translation may be XML, word processing documents, plain text, scanned & OCRed text, or databases; transformation goals may be content suitable for page makeup, search, or repurposing, in XML, JSON, or any other markup language.

The range of approaches to up-transformation is as varied as the variety of specifics of the input and required outputs. Solutions may combine automated processing with human review or could be 100% software implementations. With the potential for requirements to evolve over time, tools may have to be actively maintained and enhanced. This is the place to discuss goals, challenges, solutions, and workflows for significant XML enhancements, including approaches, tools, and techniques that may potentially be used for a variety of other tasks.

For more information: info@balisage.net or +1 301 315 9631

I’m planning on posting tomorrow one way or the other!

While you wait for that, get to work on your Balisage paper!

January 29, 2017

Up-Translation and Up-Transformation … [Balisage Rocks!]

Filed under: Conferences,XML,XPath,XQuery,XSLT — Patrick Durusau @ 8:48 pm

Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions (a Balisage pre-conference symposium)

When & Where:

Monday July 31, 2017
CAMBRiA Hotel, Rockville, MD USA

Chair: Evan Owens, Cenveo

You need more details than that?

Ok, from the webpage:

Increasing the granularity and/or specificity of markup is an important task in many different content and information workflows. Markup transformations might involve tasks such as high-level structuring, detailed component structuring, or enhancing information by matching or linking to external vocabularies or data. Enhancing markup presents numerous secondary challenges including lack of structure of the inputs or inconsistency of input data down to the level of spelling, punctuation, and vocabulary. Source data for up-translation may be XML, word processing documents, plain text, scanned & OCRed text, or databases; transformation goals may be content suitable for page makeup, search, or repurposing, in XML, JSON, or any other markup language.

The range of approaches to up-transformation is as varied as the variety of specifics of the input and required outputs. Solutions may combine automated processing with human review or could be 100% software implementations. With the potential for requirements to evolve over time, tools may have to be actively maintained and enhanced.

The presentations in this pre-conference symposium will include goals, challenges, solutions, and workflows for significant XML enhancements, including approaches, tools, and techniques that may potentially be used for a variety of other tasks. The symposium will be of value not only to those facing up-translation and transformation but also to general XML practitioners seeking to get the most out of their data.

If I didn’t know better, up-translation and up-transformation sound suspiciously like conferred properties of topic maps fame.

Well, modulo that conferred properties could be predicated on explicit subject identity and not hidden in the personal knowledge of the author.

There are two categories of up-translation and up-transformation:

  1. Ones that preserve jobs, like spaghetti COBOL code, and
  2. Ones that support easy long term maintenance.

While writing your paper for the pre-conference, which category fits yours the best?

January 24, 2017

XQuery/XSLT Proposals – Comments by 28 February 2017

Filed under: XML,XPath,XQuery,XSLT — Patrick Durusau @ 4:09 pm

Proposed Recommendations Published for XQuery WG and XSLT WG.

From the webpage:

The XML Query Working Group and XSLT Working Group have published a Proposed Recommendation for four documents:

  • XQuery and XPath Data Model 3.1: This document defines the XQuery and XPath Data Model 3.1, which is the data model of XML Path Language (XPath) 3.1, XSL Transformations (XSLT) Version 3.0, and XQuery 3.1: An XML Query Language. The XQuery and XPath Data Model 3.1 (henceforth “data model”) serves two purposes. First, it defines the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages.
  • XPath and XQuery Functions and Operators 3.1: The purpose of this document is to catalog the functions and operators required for XPath 3.1, XQuery 3.1, and XSLT 3.0. It defines constructor functions, operators, and functions on the datatypes defined in XML Schema Part 2: Datatypes Second Edition and the datatypes defined in XQuery and XPath Data Model (XDM) 3.1. It also defines functions and operators on nodes and node sequences as defined in the XQuery and XPath Data Model (XDM) 3.1.
  • XML Path Language (XPath) 3.1: XPath 3.1 is an expression language that allows the processing of values conforming to the data model defined in XQuery and XPath Data Model (XDM) 3.1. The name of the language derives from its most distinctive feature, the path expression, which provides a means of hierarchic addressing of the nodes in an XML tree. As well as modeling the tree structure of XML, the data model also includes atomic values, function items, and sequences.
  • XSLT and XQuery Serialization 3.1: This document defines serialization of an instance of the data model as defined in XQuery and XPath Data Model (XDM) 3.1 into a sequence of octets. Serialization is designed to be a component that can be used by other specifications such as XSL Transformations (XSLT) Version 3.0 or XQuery 3.1: An XML Query Language.

Comments are welcome through 28 February 2017.

Get your red pen out!

Unlike political flame wars on social media, comments on these proposed recommendations could make a useful difference.

Enjoy!

January 16, 2017

XML.com Relaunch!

Filed under: XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 4:11 pm

XML.com

Lauren Wood posted this note about the relaunch of XML.com recently:

I’ve relaunched XML.com (for some background, Tim Bray wrote an article here: https://www.xml.com/articles/2017/01/01/xmlcom-redux/). I’m hoping it will become part of the community again, somewhere for people to post their news (submit your news here: https://www.xml.com/news/submit-news-item/) and articles (see the guidelines at https://www.xml.com/about/contribute/). I added a job board to the site as well (if you’re in Berlin, Germany, or able to move there, look at the job currently posted; thanks LambdaWerk!); if your employer might want to post XML-related jobs please email me.

The old content should mostly be available but some articles were previously available at two (or more) locations and may now only be at one; try the archive list (https://www.xml.com/pub/a/archive/) if you’re looking for something. Please let me know if something major is missing from the archives.

XML is used in a lot of areas, and there is a wealth of knowledge in this community. If you’d like to write an article, send me your ideas. If you have comments on the site, let me know that as well.

Just in time as President Trump is about to stir, vigorously, that big pot of crazy known as federal data.

Mapping, processing, transformation demands will grow at an exponential rate.

Notice the emphasis on demand.

Taking two weeks to write custom software to sort files (you know the Weiner/Abedin laptop story, yes?) won’t be acceptable quite soon.

How are your on-demand XML chops?

December 11, 2016

4 Days Left – Submission Alert – XML Prague

Filed under: Conferences,XML,XPath,XQuery,XSLT — Patrick Durusau @ 7:53 pm

A tweet by Jirka Kosek reminded me there are only 4 days left for XML Prague submissions!

  • December 15th – End of CFP (full paper or extended abstract)
  • January 8th – Notification of acceptance/rejection of paper to authors
  • January 29th – Final paper

From the call for papers:

XML Prague 2017 now welcomes submissions for presentations on the following topics:

  • Markup and the Extensible Web – HTML5, XHTML, Web Components, JSON and XML sharing the common space
  • Semantic visions and the reality – micro-formats, semantic data in business, linked data
  • Publishing for the 21st century – publishing toolchains, eBooks, EPUB, DITA, DocBook, CSS for print, …
  • XML databases and Big Data – XML storage, indexing, query languages, …
  • State of the XML Union – updates on specs, the XML community news, …

All proposals will be submitted for review by a peer review panel made up of the XML Prague Program Committee. Submissions will be chosen based on interest, applicability, technical merit, and technical correctness.

Accepted papers will be included in published conference proceedings.

I don’t travel but if you need a last-minute co-author or proofer, you know where to find me!

November 16, 2016

XML Prague 2017, February 9-11, 2017 – Registration Opens!

Filed under: Conferences,XML,XQuery,XSLT — Patrick Durusau @ 3:22 pm

XML Prague 2017, February 9-11, 2017

I mentioned XML Prague 2017 last month and now, after the election of Donald Trump as president of the United States, registration for the conference opens!

Coincidence?

Maybe. 😉

Even if you are returning to the U.S. after the conference, XML Prague will be a welcome respite from the tempest of news coverage of what isn’t known about the impending Trump administration.

At 120 Euros for three days, this is a great investment both professionally and emotionally.

Enjoy!

July 28, 2016

Saxon-JS – Beta Release (EE-License)

Filed under: XML,XSLT — Patrick Durusau @ 10:20 am

Saxon-JS

From the webpage:

Saxon-JS is an XSLT 3.0 run-time written in pure JavaScript. It’s designed to execute Stylesheet Export Files compiled by Saxon-EE.

The first beta release is Saxon-JS 0.9 (released 28 July 2016), for use on web browsers. This can be used with Saxon-EE 9.7.0.7 or later.

The beta release has been tested with current versions of Safari, Firefox, and Chrome browsers. It is known not to work under Internet Explorer. Browser support will be extended in future releases. Please let us know of any problems.

Saxon-JS documentation.

Saxon-JS-beta-0.9.zip.

Goodies from the documentation:


Because people want to write rich interactive client-side applications, Saxon-JS does far more than simply converting XML to HTML, in the way that the original client-side XSLT 1.0 engines did. Instead, the stylesheet can contain rules that respond to user input, such as clicking on buttons, filling in form fields, or hovering the mouse. These events trigger template rules in the stylesheet which can be used to read additional data and modify the content of the HTML page.

We’re talking here primarily about running Saxon-JS in the browser. However, it’s also capable of running in server-side JavaScript environments such as Node.js (not yet fully supported in this beta release).

Grab a copy to get ready for discussions at Balisage!

May 23, 2016

Balisage 2016 Program Posted! (Newcomers Welcome!)

Filed under: Conferences,Topic Maps,XML,XML Schema,XPath,XProc,XQuery,XSLT — Patrick Durusau @ 8:03 pm

Tommie Usdin wrote today to say:

Balisage: The Markup Conference
2016 Program Now Available
http://www.balisage.net/2016/Program.html

Balisage: where serious markup practitioners and theoreticians meet every August.

The 2016 program includes papers discussing reducing ambiguity in linked-open-data annotations, the visualization of XSLT execution patterns, automatic recognition of grant- and funding-related information in scientific papers, construction of an interactive interface to assist cybersecurity analysts, rules for graceful extension and customization of standard vocabularies, case studies of agile schema development, a report on XML encoding of subtitles for video, an extension of XPath to file systems, handling soft hyphens in historical texts, an automated validity checker for formatted pages, one no-angle-brackets editing interface for scholars of German family names and another for scholars of Roman legal history, and a survey of non-XML markup such as Markdown.

XML In, Web Out: A one-day Symposium on the sub rosa XML that powers an increasing number of websites will be held on Monday, August 1. http://balisage.net/XML-In-Web-Out/

If you are interested in open information, reusable documents, and vendor and application independence, then you need descriptive markup, and Balisage is the conference you should attend. Balisage brings together document architects, librarians, archivists, computer scientists, XML practitioners, XSLT and XQuery programmers, implementers of XSLT and XQuery engines and other markup-related software, Topic-Map enthusiasts, semantic-Web evangelists, standards developers, academics, industrial researchers, government and NGO staff, industrial developers, practitioners, consultants, and the world’s greatest concentration of markup theorists. Some participants are busy designing replacements for XML while others still use SGML (and know why they do).

Discussion is open, candid, and unashamedly technical.

Balisage 2016 Program: http://www.balisage.net/2016/Program.html

Symposium Program: http://balisage.net/XML-In-Web-Out/symposiumProgram.html

Even if you don’t eat RELAX grammars at snack time, put Balisage on your conference schedule. Even if a bit scruffy looking, the long-time participants like new document/information problems or new ways of looking at old ones. Not to mention that they, on occasion, learn something from newcomers as well.

It is a unique opportunity to meet the people who engineered the tools and specs that you use day to day.

Be forewarned that most of them have difficulty agreeing on what controversial terms like “document” mean, but that to one side, they are as good a crew as you are likely to meet.

Enjoy!

February 12, 2016

XSLT 3.0 Workshop – #XMLPrague

Filed under: XSLT — Patrick Durusau @ 10:56 am

XSLT 3.0 Workshop – submit your questions in advance.

Apologies for the short notice but I saw a tweet by Abel Braaksma reminding everyone of the XSLT 3.0 workshop tomorrow at #XMLPrague and the need to submit questions in advance!

What questions do you want to ask?

February 2, 2016

Balisage 2016, 2–5 August 2016 [XML That Makes A Difference!]

Filed under: Conferences,XLink,XML,XML Data Clustering,XML Schema,XPath,XProc,XQuery,XSLT — Patrick Durusau @ 9:47 pm

Call for Participation

Dates:

  • 25 March 2016 — Peer review applications due
  • 22 April 2016 — Paper submissions due
  • 21 May 2016 — Speakers notified
  • 10 June 2016 — Late-breaking News submissions due
  • 16 June 2016 — Late-breaking News speakers notified
  • 8 July 2016 — Final papers due from presenters of peer reviewed papers
  • 8 July 2016 — Short paper or slide summary due from presenters of late-breaking news
  • 1 August 2016 — Pre-conference Symposium
  • 2–5 August 2016 — Balisage: The Markup Conference

From the call:

Balisage is the premier conference on the theory, practice, design, development, and application of markup. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

  • Web application development with XML
  • Informal data models and consensus-based vocabularies
  • Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
  • Performance issues in parsing, XML database retrieval, or XSLT processing
  • Development of angle-bracket-free user interfaces for non-technical users
  • Semistructured data and full text search
  • Deployment of XML systems for enterprise data
  • Design and implementation of XML vocabularies
  • Case studies of the use of XML for publishing, interchange, or archiving
  • Alternatives to XML
  • the role(s) of XML in the application lifecycle
  • the role(s) of vocabularies in XML environments

Full papers should be submitted by the deadline given below. All papers are peer-reviewed — we pride ourselves that you will seldom get a more thorough, skeptical, or helpful review than the one provided by Balisage reviewers.

Whether in theory or practice, let’s make Balisage 2016 the one people speak of in hushed tones at future markup and information conferences.

Useful semantics continues to flounder about, cf. Vice-President Biden’s interest in “one cancer research language.” Easy enough to say. How hard could it be?

Documents are commonly thought of and processed as if from BOM to EOF is the definition of a document. Much to our impoverishment.

Silo dissing has gotten popular. What if we could have our silos and eat them too?

Let’s set our sights on a Balisage 2016 where non-technicals come away saying “I want that!”

Have your first drafts done well before the end of February, 2016!

January 5, 2016

Congressional Roll Call Vote – Accessibility Issues

Filed under: XML,XQuery,XSLT — Patrick Durusau @ 2:43 pm

I posted a color coded version of a congressional roll call vote in Jazzing a Roll Call Vote – Part 2 (XQuery, well XSLT anyway), using red for Republicans and blue for Democrats. #XQuery points out accessibility issues which depend upon color perception.

Color coding works better for me than the more traditional roman versus italic font face distinction but let’s improve the color coding to remove the accessibility issue.

The first question is what colors should I use for accessibility?

In searching to answer that question I found this thread at Edward Tufte’s site (of course), Choice of colors in print and graphics for color-blind readers, which has a rich list of suggestions and pointers to other resources.

One in particular, Color Universal Design (CUD), posted by Maarten Boers, has this graphic on colors:

[Image: colorblind_palette]

Relying on that palette, I changed the colors for the roll call vote to Republicans in orange; Democrats in sky blue and re-generated the roll call document.

[Image: roll-call-access]

Here is an accessible, but still color-coded, version of: FINAL VOTE RESULTS FOR ROLL CALL 705.

An upside of XML is that changing the presentation of all 429 votes took only a few seconds: change the stylesheet, then re-generate the results.

Thanks to #XQuery for prodding me on the accessibility issue, which resulted in finding the thread at Tufte and the colorblind barrier-free color palette.


Other posts on congressional roll call votes:

1. Jazzing Up Roll Call Votes For Fun and Profit (XQuery)

2. Jazzing a Roll Call Vote – Part 2 (XQuery, well XSLT anyway)

December 17, 2015

My Bad – You Are Not! 747 Edits Away From Using XML Tools

Filed under: XPath,XQuery,XSLT — Patrick Durusau @ 4:11 pm

The original, unedited post is below but, in response to comments, I checked the XQuery, XPath, XSLT and XQuery Serialization 3.1 files in Chrome (CTRL-U) before saving them.

All the empty elements were properly closed.

I then saved the files and re-opened in Emacs, to discover that Chrome had stripped the “/” from the empty elements, which then caused BaseX to complain. It was an accurate complaint but the files I was tossing against BaseX were not the files as published by the W3C.

So now I need to file a bug report on Chrome, Version 47.0.2526.80 (64-bit) on Ubuntu, for mangling closed empty elements.


You could tell in XQuery, XPath, XSLT and XQuery Serialization 3.1, New Candidate Recommendations! that I was really excited to see the new drafts hit the street.

Me and my big mouth.

I grabbed copies of all three and tossed the XQuery draft against an XQuery to create a list of all the paths in it. Simple enough.

The results weren’t.

Here is the first error message:

[FODC0002] “file:/home/patrick/working/w3c/XQuery3.1.html” (Line 68): The element type “link” must be terminated by the matching end-tag “</link>”.

Ouch!

I corrected that and running the query a second time I got:

[FODC0002] “file:/home/patrick/working/w3c/XQuery3.1.html” (Line 68): The element type “meta” must be terminated by the matching end-tag “</meta>”.

The <meta> elements appear on lines three and four.

On the third try:

[FODC0002] “file:/home/patrick/working/w3c/XQuery3.1.html” (Line 69): The element type “img” must be terminated by the matching end-tag “</img>”.

There are 3 <img> elements that are not closed.

I’m getting fairly annoyed at this point.

Fourth try:

[FODC0002] “file:/home/patrick/working/w3c/XQuery3.1.html” (Line 78): The element type “br” must be terminated by the matching end-tag “</br>”.

Of course at this point I revert to grep and discover there are 353 <br> elements that are not closed.

Sigh, nothing to do but correct and soldier on.

Fifth attempt.

[FODC0002] “file:/home/patrick/working/w3c/XQuery3.1.html” (Line 17618): The element type “hr” must be terminated by the matching end-tag “</hr>”.

There are 2 <hr> elements that are not closed.

A total of 361 edits in order to use XML based tools with the most recent XQuery 3.1 Candidate draft.

The most recent XPath 3.1 has 238 empty elements that aren’t closed (same elements as XQuery 3.1).

The XSLT and XQuery Serialization 3.1 draft has 149 empty elements that aren’t closed, the same elements as the others but with the addition of four <col> elements that weren’t closed.

Grand total: 747 edits in order to use XML tools.

Not an editorial but a production problem. A rather severe one it seems to me.

Anyone who wants to use XML tools on these drafts will have to perform the same edits.
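
A minimal sketch of that kind of count, written in Python rather than grep. The element list, the function name count_unclosed and the sample string are assumptions for illustration, not the commands used for this post, and the regular expression is naive (it would be fooled by a ">" inside an attribute value).

```python
import re
from collections import Counter

# Empty (void) elements named in the error messages above, plus <col>.
# This list is an assumption for the sketch, not a complete HTML inventory.
VOID_ELEMENTS = ("link", "meta", "img", "br", "hr", "col")

# Match the opening tag of a void element: capture its name and everything up to ">".
TAG = re.compile(r"<(%s)\b([^>]*)>" % "|".join(VOID_ELEMENTS), re.IGNORECASE)

def count_unclosed(html: str) -> Counter:
    """Count void elements whose tag does not end in "/>"."""
    counts = Counter()
    for match in TAG.finditer(html):
        name, rest = match.group(1).lower(), match.group(2)
        if not rest.rstrip().endswith("/"):
            counts[name] += 1
    return counts

# Hypothetical sample input: one closed <br/> and three unclosed elements.
sample = '<p>one<br>two<br/>three<hr><img src="a.png"></p>'
print(count_unclosed(sample))
```

Run against a saved copy of a draft, a per-element tally like this shows how many edits an XML parser would demand before it accepts the file.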

XQuery, XPath, XSLT and XQuery Serialization 3.1, New Candidate Recommendations!

Filed under: W3C,XPath,XQuery,XSLT — Patrick Durusau @ 11:10 am

As I forecast 😉 earlier this week, new Candidate Recommendations for:

XQuery 3.1: An XML Query Language

XML Path Language (XPath) 3.1

XSLT and XQuery Serialization 3.1

have hit the streets for your review and comments!

Comments due by 2016-01-31.

That’s forty-five days, minus the ones spent with drugs/sex/rock-n-roll over the holidays and recovering from same.

Say something shy of forty-four actual working days (my endurance isn’t what it once was) for the review process.

What tools, techniques are you going to use to review this latest set of candidates?

BTW, some people review software and check only fixes; for standards I start at the beginning, go to the end, then stop. (Or the reverse for backward proofing.)

My estimates on days spent with drugs/sex/rock-n-roll are approximate only and your experience may vary.

December 14, 2015

35 Lines XQuery versus 604 of XSLT: A List of W3C Recommendations

Filed under: BaseX,Saxon,XML,XQuery,XSLT — Patrick Durusau @ 10:16 pm

Use Case

You should be familiar with the W3C Bibliography Generator. You can insert one or more URLs and the generator produces correctly formatted citations for W3C work products.

It’s quite handy but requires a URL to produce a useful response. I need authors to use correctly formatted W3C citations and asking them to find URLs and to generate correct citations was a bridge too far. Simply didn’t happen.

My current attempt is to produce a list of correctly formatted W3C citations in HTML. Authors can use CTRL-F in their browsers to find citations. (Time will tell if this is a successful approach or not.)

Goal: An HTML page of correctly formatted W3C Recommendations, sorted by title (ignoring case because W3C Recommendations are not consistent in their use of case in titles). “Correctly formatted” meaning that it matches the output from the W3C Bibliography Generator.

Resources

As a starting point, I viewed the source of http://www.w3.org/2002/01/tr-automation/tr-biblio.xsl, the XSLT script that generates the XHTML page with its responses.

The first XSLT script imports two more XSLT scripts, http://www.w3.org/2001/08/date-util.xslt and http://www.w3.org/2001/10/str-util.xsl.

I’m not going to reproduce the XSLT here, but can say that starting with <stylesheet> and ending with </stylesheet>, inclusive, I came up with 604 lines.

You will need to download the file used by the W3C Bibliography Generator, tr.rdf.

XQuery Script

I have used the XQuery script successfully with: BaseX 8.3, eXide 2.1.3 and SaxonHE-6-07J.

Here’s the prolog:

declare default element namespace "http://www.w3.org/2001/02pd/rec54#";
declare namespace rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
declare namespace dc = "http://purl.org/dc/elements/1.1/"; 
declare namespace doc = "http://www.w3.org/2000/10/swap/pim/doc#";
declare namespace contact = "http://www.w3.org/2000/10/swap/pim/contact#";
declare namespace functx = "http://www.functx.com";
declare function functx:substring-after-last
($string as xs:string?, $delim as xs:string) as xs:string?
{
if (contains ($string, $delim))
then functx:substring-after-last(substring-after($string, $delim), $delim)
else $string
};

The prolog declares the namespaces and the function functx:substring-after-last, which comes from Priscilla Walmsley’s excellent FunctX XQuery Functions site.

<html>
<head><title>XQuery Generated W3C Recommendation List</title></head>
<body>
<ul class="ul">

Start the HTML page and the unordered list that will contain the list items.

{
for $rec in doc("tr.rdf")//REC
    order by upper-case($rec/dc:title)

If you sort W3C Recommendations by dc:title and don’t specify upper-case, rdf:PlainLiteral: A Datatype for RDF Plain Literals, rdf:PlainLiteral: A Datatype for RDF Plain Literals (Second Edition), and xml:id Version 1.0 appear at the end of the list sorted by title. Dirty data isn’t limited to databases.
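
The same effect is easy to reproduce outside XQuery. A minimal Python sketch, assuming a small hand-picked list of titles (not the contents of tr.rdf):

```python
# Sample titles chosen for illustration; tr.rdf holds many more.
titles = [
    "XQuery 3.1: An XML Query Language",
    "rdf:PlainLiteral: A Datatype for RDF Plain Literals",
    "xml:id Version 1.0",
    "Canonical XML Version 1.1",
]

# A plain sort compares code points, so titles starting in lowercase land after "Z".
case_sensitive = sorted(titles)

# Folding case in the sort key plays the role of "order by upper-case($rec/dc:title)".
case_insensitive = sorted(titles, key=str.upper)

for title in case_insensitive:
    print(title)
```

The case-sensitive list ends with the two lowercase titles; the case-folded list puts them where a reader scanning alphabetically would expect them.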

return <li class="li">
  <a href="{string($rec/@rdf:about)}"> {string($rec/dc:title)} </a>, 
   { for $auth in $rec/editor
   return
   if (contains(string($auth/contact:fullName), "."))
   then (concat(string($auth/contact:fullName), ","))
   else (concat(concat(concat(substring(substring-before(string($auth/\
   contact:fullName), ' '), 0, 2), ". "), (substring-after(string\
   ($auth/contact:fullName), ' '))), ","))}

Watch for the line continuation marker “\”.

We begin by grabbing the URL and title for an entry and then confront dirty author data. The standard author listing by the W3C creates an initial plus a period for the author’s first name and then concatenates the rest of the author’s name to that initial plus period.

Problem: There is one entry for authors that already has initials, T.V. Raman, so I had to account for that one entry (as does the XSLT).
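
The branching is easier to see in isolation. A rough Python analogue of the editor-name rule, assuming names have the form "First Rest"; format_editor is a hypothetical helper name and is not part of the XQuery script:

```python
def format_editor(full_name: str) -> str:
    """Abbreviate the first name to an initial, unless initials are already present."""
    if "." in full_name:
        # Names such as "T.V. Raman" already carry initials; pass them through.
        return full_name + ","
    # Split on the first space: everything before it is treated as the first name.
    first, _, rest = full_name.partition(" ")
    return f"{first[:1]}. {rest},"

print(format_editor("Jun Fujisawa"))
print(format_editor("T.V. Raman"))
```

Like the XQuery, this tests for a period anywhere in the name, so it is a shortcut that works for the current data rather than a general name parser.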

{if (count ($rec/editor) >= 2) then " Editors," else " Editor,"}
W3C Recommendation, 
{fn:format-date(xs:date(string($rec/dc:date)), "[MNn] [D], [Y]") }, 
{string($rec/@rdf:about)}. <a href="{string($rec/doc:versionOf/\
@rdf:resource)}">Latest version</a> \
available at {string($rec/doc:versionOf/@rdf:resource)}.
<br/>[Suggested label: <strong>{functx:substring-after-last(upper-case\
(replace(string($rec/doc:versionOf/@rdf:resource), '/$', '')), "/")}\
</strong>]<br/></li>} </ul></body></html>

Nothing remarkable here, except that I snipped the concluding “/” off of the values from doc:versionOf/@rdf:resource so I could use functx:substring-after-last to create the token for a suggested label.
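
For readers who think more easily in another language, here is a sketch of the same label logic in Python. The function names and the example URL are assumptions for illustration; the XQuery above is what actually runs.

```python
import re

def substring_after_last(text: str, delim: str) -> str:
    """Return what follows the last delim, or text unchanged if delim is absent."""
    # rpartition yields ('', '', text) when delim is missing, matching the XQuery fallback.
    return text.rpartition(delim)[2]

def suggested_label(latest_version_url: str) -> str:
    """Build a citation label from a 'latest version' URL."""
    # Drop a single trailing "/", as replace(..., '/$', '') does in the XQuery.
    trimmed = re.sub(r"/$", "", latest_version_url)
    return substring_after_last(trimmed.upper(), "/")

print(suggested_label("http://www.w3.org/TR/xml-id/"))
```

Without the trailing-slash trim, the text after the last "/" would be empty, which is why the trim has to come first.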

Comments / Omissions

I depart from the XSLT in one case. It calls http://www.w3.org/2002/01/tr-automation/known-tr-editors.rdf here:

<!-- Special casing for when we have the name in Original Script (e.g. in \
Japanese); currently assume that the order is inversed in this case... -->

<xsl:when test="document('http://www.w3.org/2002/01/tr-automation/\
known-tr-editors.rdf')/rdf:RDF/*[contact:lastNameInOriginalScript=\
substring-before(current(),' ')]">

But that refers to only one case:

<REC rdf:about="http://www.w3.org/TR/2003/REC-SVG11-20030114/">
<dc:date>2003-01-14</dc:date>
<dc:title>Scalable Vector Graphics (SVG) 1.1 Specification</dc:title>

Where Jun Fujisawa appears as an editor.

Recalling my criteria for “correctness” being the output of the W3C Bibliography Generator:

[Image: svg-cite-image]

Preparing for this post made me discover at least one bug in the XSLT that was supposed to report the name in original script:

<xsl:when test="document('http://www.w3.org/2002/01/tr-automation/\
known-tr-editors.rdf')/rdf:RDF/*[contact:lastNameInOriginalScript=\
substring-before(current(),' ')]">

Whereas the entry in http://www.w3.org/2002/01/tr-automation/known-tr-editors.rdf reads:

<rdf:Description>
<rdf:type rdf:resource="http://www.w3.org/2000/10/swap/pim/contact#Person"/>
<firstName>Jun</firstName>
<firstNameInOriginalScript>藤沢 淳</firstNameInOriginalScript>
<lastName>Fujisawa</lastName>
<sortName>Fujisawa</sortName>
</rdf:Description>

Since the W3C Bibliography Generator doesn’t produce the name in original script, neither do I. When the W3C fixes its output, I will have to amend this script to pick up that entry.

String

While writing this query I found text(), fn:string() and fn:data() by Dave Cassel. Recommended reading. The weakness of text() is that if markup is inserted inside your target element after you write the query, you will get unexpected results: text() returns only the text nodes that are direct children of the element, so anything wrapped in the new markup silently drops out. The use of fn:string() avoids that sort of surprise.
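
A small illustration of that difference, sketched with Python’s ElementTree rather than XQuery. The two helper functions are rough analogues of text() and fn:string(), not exact equivalents, and the inserted <em> element is a made-up example:

```python
import xml.etree.ElementTree as ET

# The same title before and after someone adds inline markup to it.
before = ET.fromstring("<title>XQuery 3.1: An XML Query Language</title>")
after = ET.fromstring("<title>XQuery <em>3.1</em>: An XML Query Language</title>")

def child_text(elem):
    # Roughly title/text(): only text that sits directly inside the element.
    return "".join([elem.text or ""] + [child.tail or "" for child in elem])

def string_value(elem):
    # Roughly fn:string(): all descendant text, concatenated in document order.
    return "".join(elem.itertext())

print(child_text(after))
print(string_value(after))
```

Both helpers agree on the first document. On the second, only string_value still returns the full title.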

Recommendations Only

Unlike the W3C Bibliography Generator, my script as written only generates entries for Recommendations. It would be trivial to modify the script to include drafts, notes, etc., but I chose to not include material that should not be used as normative citations.

I can see the usefulness of the bibliography generator for works in progress but external to the W3C, citing Recommendations is the better course.

Contra Search

The SpecRef project has a searchable interface to all the W3C documents. If you search for XQuery, the interface returns 385 “hits.”

Contrast that with using CTRL-F on the list of recommendations generated from the XQuery script: controlling for case, XQuery produced only 23 “hits.”

There are reasons for using search, but users repeatedly mining results of searches that could be captured (it was called curation once upon a time) is wasteful.

Reading

I can’t recommend Priscilla Walmsley’s XQuery 2nd Edition strongly enough.

There is one danger to Walmsley’s book. You will be so ready to start using XQuery after the first ten chapters it’s hard to find the time to read the remaining ones. Great stuff!

You can download the XQuery file, tr.rdf and the resulting html file at: 35LinesOfXQuery.zip.

XQuery, XPath, XSLT and XQuery Serialization 3.1 (Back-to-Front) Drafts (soon!)

Filed under: W3C,XPath,XQuery,XSLT — Patrick Durusau @ 4:04 pm

XQuery, XPath, XSLT and XQuery Serialization 3.1 (Back-to-Front) Drafts will be published quite soon so I wanted to give you a heads up on your holiday reading schedule.

This is deep enough in the review cycle that a back-to-front reading is probably your best approach.

You have read the drafts and corrections often enough by this point that you read the first few words of a paragraph and you “know” what it says so you move on. (At the very least I can report that happens to me.)

By back-to-front reading I mean to start at the end of each draft and read the last sentence and then the next to last sentence and so on.

The back-to-front process does two things:

  1. You are forced to read each sentence on its own.
  2. It prevents skimming and filling in errors with silent corrections (unknown to your conscious mind).
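
If you want a machine to hand you the sentences in that order, a naive sketch in Python follows. The sentence splitter is an assumption and a crude one: it breaks on ".", "!" or "?" followed by whitespace, so abbreviations and section numbers in a real draft will trip it.

```python
import re

def back_to_front(text: str) -> list:
    """Split text into sentences and return them last sentence first."""
    # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in reversed(sentences) if s]

# Hypothetical sample text standing in for a paragraph of a draft.
for sentence in back_to_front("First one. Second one? Third one!"):
    print(sentence)
```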

The back-to-front method is quite time-consuming so it’s fortunate these drafts are due to appear just before a series of holidays in a large number of places.

I hesitate to mention it but there is another way to proof these drafts.

If you have XML-experienced visitors, you could take turns reading the drafts to each other. It was a technique used by copyists many years ago, where one person read and two others took down the text. The two versions were then compared to each other and the original.

Even with a great reading voice, I’m not certain many people would be up to that sort of exercise.

PS: I will post on the new drafts as soon as they are published.
