Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 24, 2011

Introducing Fech

Filed under: Dataset,Marketing — Patrick Durusau @ 6:58 pm

Introducing Fech by Michael Strickland.

From the post:

Ten years ago, the Federal Election Commission introduced electronic filing for political committees that raise and spend money to influence elections to the House and the White House. The filings contain aggregate information about a committee’s work (what it has spent, what it owes) and more detailed listings of its interactions with the public (who has donated to it, who it has paid for services).

Journalists who work with these filings need to extract their data from complex text files that can reach hundreds of megabytes. Turning a new set into usable data involves using the F.E.C.’s data dictionaries to match all the fields to their positions in the data. But the available fields have changed over time, and subsequent versions don’t always match up. For example, finding a committee’s total operating expenses in version 7 means knowing to look in column 52 of the “F3P” line. It used to be found at column 50 in version 6, and at column 44 in version 5. To make this process faster, my co-intern Evan Carmi and I created a library to do that matching automatically.

Fech (think “F.E.C.h,” say “fetch”), is a Ruby gem that abstracts away any need to map data points to their meanings by hand. When you give Fech a filing, it checks to see which version of the F.E.C.’s software generated it. Then, when you ask for a field like “total operating expenses,” Fech knows how to retrieve the proper value, no matter where in the filing that particular software version stores it.
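To make the matching problem concrete, here is a minimal Ruby sketch of the idea. It is not Fech's actual API: the constant and method names are hypothetical, the column numbers are taken from the example in the quoted post, and the columns are assumed to be 1-based.

```ruby
# Hypothetical lookup table (not part of Fech): filing format version =>
# column of "total operating expenses" on the F3P line.
# Column numbers come from the quoted post and are assumed to be 1-based.
OPERATING_EXPENSES_COLUMN = { 7 => 52, 6 => 50, 5 => 44 }.freeze

# Given the already-split fields of an F3P line and the filing's version,
# return the raw value for total operating expenses.
# Returns nil when the version is not in the table or the row is too short.
def total_operating_expenses(f3p_fields, version)
  column = OPERATING_EXPENSES_COLUMN[version]
  return nil if column.nil?

  f3p_fields[column - 1] # convert the 1-based column to a 0-based index
end

# Usage with a made-up row of 60 placeholder fields.
row = Array.new(60) { |i| "field#{i + 1}" }
puts total_operating_expenses(row, 7)
puts total_operating_expenses(row, 5)
```

The caller asks for a field by meaning and supplies the version; the table absorbs the differences between versions. A real library would need one such mapping per field and per line type, which is the bookkeeping the post says Fech takes over.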

At present, Fech parses only presidential filings, but it can be extended to other filings.

OK, so now it is easier to get campaign finance information. Now what?

So members of Congress live in the pockets of their largest supporters. Is that news?

How would you use a topic map to make that news? Serious question.

Or how would you use topic maps to make that extraction a value-add when combined with other New York Times content?


Update: Fech 1.1 Released.
