Finding Parties Named in U.S. Law using Python and NLTK by Gary Sieling.
From the post:
U.S. law periodically names specific institutions; historically, Congress could also write a law naming an individual, although I think that has become less common. I expected the most common entities named in federal law to be groups like Congress. It turns out this is true, but the other most common entities are the law itself and bureaucratic functions like archivists.
To get at this information, we need to read the Code XML and use a natural language processing library to extract the named groups.
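As a rough illustration of the first step, here is a minimal sketch of pulling plain text out of an XML fragment with Python's standard library. The sample markup and the section_text helper are assumptions made for this example; the real Code XML uses its own schema, so the element names here are not taken from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the element names are invented for illustration
# and do not reflect the actual Code XML schema.
SAMPLE = """<section>
  <heading>Sample heading</heading>
  <text>The Archivist of the United States shall report to Congress.</text>
</section>"""


def section_text(xml_string):
    """Return all text inside an XML fragment as one whitespace-normalized string."""
    root = ET.fromstring(xml_string)
    # itertext() yields every text node in document order, whatever the tag;
    # split() followed by join() collapses newlines and indentation.
    return " ".join(" ".join(root.itertext()).split())


print(section_text(SAMPLE))
```

Because itertext() ignores tag names, the same helper should work on differently structured markup, at the cost of mixing headings and body text together.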
NLTK is such an NLP library. It provides interesting features like sentence parsing, part of speech tagging, and named entity recognition. (If you are interested in the subject, see my review of “Natural Language Processing with Python”, a book which covers this library in detail.)
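For a sense of how those features fit together, here is a small sketch, not the code from the post, that chains NLTK's sentence splitter, tokenizer, part of speech tagger, and named entity chunker, then counts the results. The function names, the choice of entity labels, and the default of ten results are assumptions for this example. NLTK also needs its tokenizer, tagger, and chunker data downloaded through nltk.download() before the first function will run, and the resource names vary between NLTK versions.

```python
from collections import Counter


def named_entities(text, labels=("ORGANIZATION", "PERSON", "GPE")):
    """Return the named entities NLTK finds in text, one string per mention."""
    # Imported here so the counting helper below works without NLTK installed.
    import nltk

    entities = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        tree = nltk.ne_chunk(tagged)
        # Each recognized entity is a subtree labeled with its entity type.
        for subtree in tree.subtrees():
            if subtree.label() in labels:
                entities.append(" ".join(word for word, tag in subtree.leaves()))
    return entities


def most_common_entities(entities, n=10):
    """Count entity mentions and return the n most frequent as (name, count) pairs."""
    return Counter(entities).most_common(n)
```

Calling most_common_entities(named_entities(text)) on the text of a section would give a frequency table of the parties it names, subject to whatever mistakes the default chunker makes on legal prose.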
I would rather know who paid for particular laws but that requires information external to the Code XML data set. 😉
A very good exercise to become familiar with both NLTK and the Code XML data set.