Congress: More XQuery Fodder

Congress Poised for Leap to Open Up Legislative Data by Daniel Schuman.

From the post:

Following bills in Congress requires three major pieces of information: the text of the bill, a summary of what the bill is about, and the status information associated with the bill. For the last few years, Congress has been publishing the text and summaries for all legislation moving in Congress, but has not published bill status information. This key information is necessary to identify the bill author, where the bill is in the legislative process, who introduced the legislation, and so on.

While it has been in the works for a while, this week Congress confirmed it will make “Bill Statuses in XML format available through the GPO’s Federal Digital System (FDsys) Bulk Data repository starting with the 113th Congress,” (i.e. January 2013). In “early 2016,” bill status information will be published online in bulk– here. This should mean that people who wish to use the legislative information published on Congress.gov and THOMAS will no longer need to scrape those websites for current legislative information, but instead should be able to access it automatically.

Congress isn’t just going to pull the plug without notice, however. Through the good offices of the Bulk Data Task Force, Congress will hold a public meeting with power users of legislative information to review how this will work. Eight sample bill status XML files and draft XML User Guides were published on GPO’s GitHub page this past Monday. Based on past positive experiences with the Task Force, the meeting is a tremendous opportunity for public feedback to make sure the XML files serve their intended purposes. It will take place next Tuesday, Dec. 15, from 1-2:30. RSVP details below.

If all goes as planned, this milestone has great significance.

  • It marks the publication of essential legislative information in a format that supports unlimited public reuse, analysis, and republication. It will be possible to see much of a bill’s life cycle.
  • It illustrates the positive relationship that has grown between Congress and the public on access to legislative information, where there is growing open dialog and conversation about how to best meet our collective needs.
  • It is an example of how different components within the legislative branch are engaging with one another on a range of data-related issues, sometimes for the first time ever, under the aegis of the Bulk Data Task Force.
  • It means the Library of Congress and GPO will no longer be tied to the antiquated THOMAS website and can focus on more rapid technological advancement.
  • It shows how a diverse community of outside organizations and interests came together and built a community to work with Congress for the common good.

To be sure, this is not the end of the story. There is much that Congress needs to do to address its antiquated technological infrastructure. But considering where things were a decade ago, the bulk publication of information about legislation is a real achievement, the culmination of a process that overcame high political barriers and significant inertia to support better public engagement with democracy and smarter congressional processes.

Much credit is due in particular to leadership in both parties in the House who have partnered together to push for public access to legislative information, as well as the staff who worked tireless to make it happen.

If you look at the sample XML files, pay close attention to the <bioguideID> element and its contents. Is is the same value as you will find for roll-call votes, but there the value appears in the name-id attribute of the <legislator> element. See: http://clerk.house.gov/evs/2015/roll643.xml and do view source.

Oddly, the <bioguideID> element does not appear in the documentation on GitHub, you just have to know the correspondence to the name-id attribute of the <legislator> element

As I said in the title, this is going to be XQuery fodder.

Comments are closed.