Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 21, 2015

Redefining “URL” to Invalidate Twenty-One (21) Years of Usage

Filed under: HTML5,WWW — Patrick Durusau @ 3:11 pm

You may be interested to know that efforts are underway to bury the original meaning of URL and to replace it with another meaning.

Our trail starts with the HTML5 draft of 17 December 2012, which reads in part:

2.6 URLs

This specification defines the term URL, and defines various algorithms for dealing with URLs, because for historical reasons the rules defined by the URI and IRI specifications are not a complete description of what HTML user agents need to implement to be compatible with Web content.

The term “URL” in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term “URL” as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]

2.6.1 Terminology

A URL is a string used to identify a resource.

A URL is a valid URL if at least one of the following conditions holds:

  • The URL is a valid URI reference [RFC3986].
  • The URL is a valid IRI reference and it has no query component. [RFC3987]
  • The URL is a valid IRI reference and its query component contains no unescaped non-ASCII characters. [RFC3987]
  • The URL is a valid IRI reference and the character encoding of the URL’s Document is UTF-8 or a UTF-16 encoding. [RFC3987]

You may not like the usurpation of URL and its meaning but at least it is honestly reported.
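To make the URI/IRI distinction in the quoted conditions concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from the HTML5 draft or the RFCs: the function names `has_non_ascii` and `iri_to_uri` are hypothetical, the example address is made up, and the conversion is a simplification of the IRI-to-URI mapping in RFC 3987 (it percent-encodes everything, including the host, rather than handling internationalized domain names separately).

```python
from urllib.parse import quote

# ASCII characters that RFC 3986 reserves as delimiters, plus "%" so that
# existing percent-escapes are left alone. Letters, digits and "-._~" are
# never encoded by quote(), so they need not be listed here.
SAFE_ASCII = ":/?#[]@!$&'()*+,;=%"


def has_non_ascii(text: str) -> bool:
    """True if the string contains characters outside US-ASCII.

    Such a string cannot be a URI reference in the RFC 3986 sense,
    although it may still be an IRI reference in the RFC 3987 sense.
    """
    return any(ord(ch) > 127 for ch in text)


def iri_to_uri(iri: str) -> str:
    """Simplified IRI-to-URI mapping (an assumption, not the full RFC 3987 rules).

    Each non-ASCII character is encoded as UTF-8 and each resulting
    byte is written as a %XX escape.
    """
    return quote(iri, safe=SAFE_ASCII, encoding="utf-8")


if __name__ == "__main__":
    iri = "http://example.org/caf\u00e9?q=\u00fc"  # made-up address
    print(has_non_ascii(iri))  # True
    print(iri_to_uri(iri))     # http://example.org/caf%C3%A9?q=%C3%BC
```

Under the 2012 draft's wording, both strings would count as a "URL", which is precisely the broadening the draft admits to.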

Compare Editor’s Draft 13 November 2014, which reads in part:

2.5 URLs

2.5.1 Terminology

A URL is a valid URL if it conforms to the authoring conformance requirements in the WHATWG URL standard. [URL]

A string is a valid non-empty URL if it is a valid URL but it is not the empty string.

Hmmm, all the references to IRIs and to violating RFC3986 have disappeared.

But there is a reference to the WHATWG URL standard.

If you follow that internal link to the bibliography you will find:

[URL]
URL (URL: http://url.spec.whatwg.org/), A. van Kesteren. WHATWG.

Next stop: URL Living Standard — Last Updated 6 February 2015, which reads in part:

The URL standard takes the following approach towards making URLs fully interoperable:

  • Align RFC 3986 and RFC 3987 with contemporary implementations and obsolete them in the process. (E.g. spaces, other “illegal” code points, query encoding, equality, canonicalization, are all concepts not entirely shared, or defined.) URL parsing needs to become as solid as HTML parsing. [RFC3986] [RFC3987]
  • Standardize on the term URL. URI and IRI are just confusing. In practice a single algorithm is used for both so keeping them distinct is not helping anyone. URL also easily wins the search result popularity contest.

A specification being developed by WHATWG.org.

Not nearly as clear and forthcoming as the HTML5 draft of 17 December 2012. Yes?

RFC3986 and RFC3987 are products of the IETF. If revisions of those RFCs are required, shouldn’t that work be at IETF?

Or at a minimum, why is a foundation for HTML5 not at the W3C, if not at IETF?

The conflation of URLs (RFC3986) and IRIs (RFC3987) is taking place well away from the IETF and W3C processes.

A conflation that invalidates twenty-one (21) years of use of URL in books, papers, presentations, documentation, etc.

BTW, URL was originally defined in 1994 in RFC1738.

Is popularity of an acronym worth that cost?
