Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 5, 2015

Redefining RFC 2119? Danger! Danger! Will Robinson!

Filed under: Standards,W3C,XML,XPath,XQuery — Patrick Durusau @ 3:43 pm

I’m lagging behind in reading XQuery 3.1: An XML Query Language, XML Path Language (XPath) 3.1, and, XPath and XQuery Functions and Operators 3.1 in order to comment by 13 February 2015.

In order to catch up this past weekend I started trying to tease these candidate recommendations apart to make them easier to proof. One of the things I always do I check for key word conformance language and that means, outside of ISO, RFC 2119.

I was reading XPath and XQuery Functions and Operators 3.1 (herein Functions and Operators) when I saw:

1.1 Conformance

The Functions and Operators specification is intended primarily as a component that can be used by other specifications. Therefore, Functions and Operators relies on specifications that use it (such as [XML Path Language (XPath) 3.1], [XQuery 3.1: An XML Query Language], and potentially future versions of XSLT) to specify conformance criteria for their respective environments.

That works. You have a normative document of definitions, etc., and some other standard cites those definitions and supplies the must,should, may according to RFC 2119. Not common but that works.

But then I started running scripts for usage of key words and I found in Functions and Operators:

1.6.3 Conformance terminology

[Definition] may

Conforming documents and processors are permitted to, but need not, behave as described.

[Definition] must

Conforming documents and processors are required to behave as described; otherwise, they are either non-conformant or else in error.

Thus the title: Redefining RFC 2119? Danger! Danger! Will Robinson!

RFC 2119 reads in part:

1. MUST This word, or the terms “REQUIRED” or “SHALL”, mean that the definition is an absolute requirement of the specification.

5. MAY This word, or the adjective “OPTIONAL”, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)

6. Guidance in the use of these Imperatives

Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm (e.g., limiting retransmisssions) For example, they must not be used to try to impose a particular method on implementors where the method is not required for interoperability.

First, the referencing of RFC 2119 is standard practice at the W3C, at least with regard to XML specifications. I wanted to have more than personal experience to cite so I collected the fifty-eight current XML specifications and summarize them in the list at the end of this post.

Of the fifty-nine (59) current XML specifications (there may be others, the W3C has abandoned simply listing its work without extraneous groupings), fifty-two of the standards cite and follow RFC 2119. Three of the remaining seven (7) fail to cite RFC due to errors in editing.

The final four (4) as it were that don’t cite RFC 2119 are a good illustration of how errors get perpetuated from one standard to another.

The first W3C XML specification to not cite RFC 2119 was: Extensible Markup Language (XML) 1.0 (Second Edition) where it reads in part:

1.2 Terminology

may

[Definition: Conforming documents and XML processors are permitted to but need not behave as described.]

must

[Definition: Conforming documents and XML processors are required to behave as described; otherwise they are in error. ]

The definitions of must and may were ABANDONED in Extensible Markup Language (XML) 1.0 (Third Edition), which simply dropped those definitions and instead reads in part:

1.2 Terminology

The terminology used to describe XML documents is defined in the body of this specification. The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional, when emphasized, are to be interpreted as described in [IETF RFC 2119].

The exclusive use of RFC 2119 continues through Extensible Markup Language (XML) 1.0 (Fourth Edition) to the current Extensible Markup Language (XML) 1.0 (Fifth Edition)

However, as is often said, whatever good editing we do is interred with us and any errors we make live on.

Before the abandonment of attempts to define may and must appeared in XML 3rd edition, XML Schema Part 1: Structures Second Edition and XML Schema Part 2: Datatypes Second Edition cite XML 2nd edition as their rationale for defining may and must. That error has never been corrected.

Which brings us to W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes which is the last W3C XML specification to not cite RFC 2119.

XSD 1.1 Part 2 reads in part, under Appendix I Changes since version 1.0, I.4 Other Changes:

The definitions of must, must not, and ·error· have been changed to specify that processors must detect and report errors in schemas and schema documents (although the quality and level of detail in the error report is not constrained).

The problem being XML Schema Part 2: Datatypes Second Edition

relies upon XML Schema Part 2: Datatypes Second Edition which cites Extensible Markup Language (XML) 1.0 (Second Edition) as the reason for redefining the terms may and must.

The redefining of may and must relies upon language in a superceded version of the XML standard. Language that was deleted ten (10) years ago from the XML standard.

If you have read this far, you have a pretty good guess that I am going to suggest that XPath and XQuery Functions and Operators 3.1 drop the attempt to redefine terms that appear in RFC 2119.

First, redefining widely used terms for conformance is clearly a bad idea. Do you mean RFC2119 must or do you mean and F&O must? Clearly different. If a requirement has an RFC2119 must, my application either conforms or fails. If a requirement has an F&O must, my application may simple be in error. All the time. Is that useful?

Second, by redefining must, we lose the interoperability aspects as define by RFC2119 for all uses of must. Surely interoperability is a goal of Functions and Operators. Yes?

Third, the history of redefining may and must at the W3C shows (to me) the perpetuation of an error long beyond its correction date. It’s time to put an end to redefining may and must.

PS: Before you decide you “know” the difference in upper and lower case key words from RFC 2119, take a look at: RFC Editorial Guidelines and Procedures, Normative References to RFC 2119. Summary, UPPER CASE is normative, lower case is “a necessary logical relationship.”

PPS: Tracking this error down took longer than expected so it will be later this week before I have anything that may help with proofing the specifications.


XML Standards Consulted in preparation of this post. Y = Cites RFC 2119, N = Does not cite RFC 2119.

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress