Linked Legal Data: A SKOS Vocabulary for the Code of Federal Regulations by Núria Casellas.
Abstract:
This paper describes the application of Semantic Web and Linked Data techniques and principles to regulatory information for the development of a SKOS vocabulary for the Code of Federal Regulations (in particular of Title 21, Food and Drugs). The Code of Federal Regulations is the codification of the general and permanent enacted rules generated by executive departments and agencies of the Federal Government of the United States, a regulatory corpus of large size, varied subject-matter and structural complexity. The CFR SKOS vocabulary is developed using a bottom-up approach for the extraction of terminology from text based on a combination of syntactic analysis and lexico-syntactic pattern matching. Although the preliminary results are promising, several issues (a method for hierarchy cycle control, expert evaluation and control support, named entity reduction, and adjective and prepositional modifier trimming) require improvement and revision before it can be implemented for search and retrieval enhancement of regulatory materials published by the Legal Information Institute. The vocabulary is part of a larger Linked Legal Data project, that aims at using Semantic Web technologies for the representation and management of legal data.
The paper considers the use of nonregulatory vocabularies and the conversion of existing indexing materials, and finally settles on natural language processing (NLP) of the text.
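To make "lexico-syntactic pattern matching" a little more concrete, here is a minimal sketch. It assumes a Hearst-style "X such as Y" pattern, which the abstract does not specify, and it stands in for real syntactic analysis with a crude rule: the broader term is just the one or two words before "such as". The function name and the example sentence are mine, not the paper's.

```python
import re

# Hearst-style pattern: "<broader term> such as <item>, <item>, and <item>".
# Assumption: the broader term is the one or two words right before "such as".
# A real pipeline would use a parser to find the noun phrase instead.
SUCH_AS = re.compile(r"\b([a-z]+(?: [a-z]+)?),? such as ([a-z ,]+)")

# Splits an enumeration on commas and on "and"/"or".
LIST_SEP = re.compile(r",\s*(?:and |or )?|\s+(?:and|or)\s+")


def extract_broader_narrower(sentence):
    """Return (narrower, broader) pairs found by the 'such as' pattern."""
    match = SUCH_AS.search(sentence.lower())
    if match is None:
        return []
    broader = match.group(1)
    items = [item.strip() for item in LIST_SEP.split(match.group(2))]
    return [(item, broader) for item in items if item]


# Made-up sentence, for illustration only (not quoted from the CFR).
example = ("This part applies to color additives such as "
           "annatto extract, caramel, and beet powder.")
for narrower, broader in extract_broader_narrower(example):
    print(f"{narrower} -> skos:broader -> {broader}")
```

Each pair is a candidate skos:broader link. The issues the abstract lists (hierarchy cycles, modifier trimming, named entities) are exactly the sort of thing a pattern this naive would run into.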
Granting that Title 21, Food and Drugs is no walk in the park, take a peek at the regulations for Title 26, Internal Revenue Code. 😉
A difficulty that I didn’t see mentioned is the changing semantics in statutory law and regulations.
The definition of “person,” for example, varies widely depending upon where it appears, both diachronically (over time) and synchronically (between provisions in force at the same time).
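One way to picture the problem: a term cannot be a single vocabulary entry. It needs at least a scope (where the definition applies) and a validity period (when it applies). The sketch below is only an illustration of that shape. The class, the "Part A" and "Part B" labels, the dates and the definition texts are all made up and are not taken from any statute or regulation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ScopedDefinition:
    term: str
    scope: str                 # where the definition applies (hypothetical label)
    valid_from: date           # first day the definition is in force
    valid_to: Optional[date]   # first day it is no longer in force; None = still in force
    text: str


def lookup(definitions, term, scope, on):
    """Return the definition of `term` that applies in `scope` on date `on`, or None."""
    for d in definitions:
        if d.term != term or d.scope != scope:
            continue
        if d.valid_from <= on and (d.valid_to is None or on < d.valid_to):
            return d
    return None


# Made-up entries, only to show the two kinds of variation.
DEFINITIONS = [
    ScopedDefinition("person", "Part A", date(1990, 1, 1), date(2005, 1, 1),
                     "individuals only"),
    ScopedDefinition("person", "Part A", date(2005, 1, 1), None,
                     "individuals and corporations"),
    ScopedDefinition("person", "Part B", date(1990, 1, 1), None,
                     "individuals, partnerships, and trusts"),
]

# Same term, same scope, different dates: variation over time.
print(lookup(DEFINITIONS, "person", "Part A", date(2000, 6, 1)).text)
print(lookup(DEFINITIONS, "person", "Part A", date(2010, 6, 1)).text)
# Same term, same date, different scope: variation across provisions.
print(lookup(DEFINITIONS, "person", "Part B", date(2010, 6, 1)).text)
```

A vocabulary extracted from one snapshot of the text captures only one row of this table per term.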
Moreover, if I have a nonregulatory vocabulary and/or CFR indexes, why shouldn’t they map to the CFR SKOS vocabulary?
I may not have the “correct” index but the one I prefer to use. Shouldn’t that be enabled?
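SKOS already has mapping properties for this (skos:exactMatch, skos:closeMatch, skos:broadMatch, skos:narrowMatch, skos:relatedMatch), so in principle a preferred index could sit alongside the CFR vocabulary rather than be replaced by it. A minimal sketch of what that might look like follows. The helper name and both example.org URIs are hypothetical; I don't know what URIs the CFR SKOS vocabulary actually uses.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"  # the standard SKOS namespace

# The SKOS mapping properties that link concepts across concept schemes.
MAPPING_RELATIONS = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def mapping_triples(mappings, relation="closeMatch"):
    """Render (my_concept_uri, cfr_concept_uri) pairs as N-Triples lines."""
    if relation not in MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping relation: {relation}")
    return [f"<{mine}> <{SKOS}{relation}> <{cfr}> ." for mine, cfr in mappings]


# Hypothetical URIs: one concept from my preferred index, one from the CFR vocabulary.
my_mappings = [
    ("http://example.org/my-index/drug", "http://example.org/cfr/drug"),
]
for line in mapping_triples(my_mappings):
    print(line)
```

The mapping statements live with the person who made them, so the "correct" index and the one I prefer can coexist.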
I first saw this at Legal Informatics.