Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 23, 2013

U.S. Statutes at Large 1951-2009

Filed under: Government,Government Data,Law,Law - Sources — Patrick Durusau @ 4:28 pm

GPO is Closing Gap on Public Access to Law at JCP’s Direction, But Much Work Remains by Daniel Schuman.

From the post:

The GPO’s recent electronic publication of all legislation enacted by Congress from 1951-2009 is noteworthy for several reasons. It makes available nearly 40 years of lawmaking that wasn’t previously available online from any official source, narrowing part of a much larger information gap. It meets one of three long-standing directives from Congress’s Joint Committee on Printing regarding public access to important legislative information. And it has published the information in a way that provides a platform for third-party providers to cleverly make use of the information. While more work is still needed to make important legislative information available to the public, this online release is a useful step in the right direction.

Narrowing the Gap

In mid-January 2013, GPO published approximately 32,000 individual documents, along with descriptive metadata, including all bills enacted into law, joint concurrent resolutions that passed both chambers of Congress, and presidential proclamations from 1951-2009. The documents have traditionally been published in print in volumes known as the “Statutes at Large,” which commonly contain all the materials issued during a calendar year.

The Statutes at Large are literally an official source for federal laws and concurrent resolutions passed by Congress. The Statutes at Large are compilations of “slip laws,” bills enacted by both chambers of Congress and signed by the President. By contrast, while many people look to the US Code to find the law, many sections of the Code in actuality are not the “official” law. A special office within the House of Representatives reorganizes the contents of the slip laws thematically into the 50 titles that make up the US Code, but unless that reorganized document (the US Code) is itself passed by Congress and signed into law by the President, it remains an incredibly helpful but ultimately unofficial source for US law. (Only half of the titles of the US Code have been enacted by Congress, and thus have become law themselves.) Moreover, if you want to see the intact text of the legislation as originally passed by Congress — before it’s broken up and scattered throughout the US Code — the place to look is the Statutes at Large.

Policy wonks and trivia experts will have a field day, but the value of the Statutes at Large isn’t apparent to me.

I assume there are cases where discrepancies can be found between the U.S.C. (United States Code) and the Statutes at Large. The significance of those discrepancies is unknown.

Like my comments on the SEC Midas program, knowing a law was passed isn’t the same as knowing who benefits from it.

Or who paid for its passage.

Knowing which laws were passed is useful.

Knowing who benefited or who paid, priceless.

Failure By Design

Filed under: BigData,Design,Government,Government Data — Patrick Durusau @ 3:40 pm

Did you know the Securities and Exchange Commission (SEC) is now collecting 400 gigabytes of market data daily?

Midas [Market Information Data Analytics System], which is costing the SEC $2.5 million a year, captures data such as time, price, trade type and order number on every order posted on national stock exchanges, every cancellation and modification, and every trade execution, including some off-exchange trades. Combined it adds up to billions of daily records.

So, what’s my complaint?

Midas won’t be able to fill in all of the current holes in SEC’s vision. For example, the SEC won’t be able to see the identities of entities involved in trades and Midas doesn’t look at, for example, futures trades and trades executed outside the system in what are known as “dark pools.” (emphasis added)

What?

The one piece of information that could reveal patterns of insider trading, churning, and a whole host of other securities crimes is simply not collected.
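To make the gap concrete, here is a hypothetical sketch of what a Midas-style order record might look like. The field names are illustrative, not the SEC’s actual schema; the point is what is absent.

```python
# Hypothetical sketch of a Midas-style order event record. Field names
# are illustrative only; they are not the SEC's actual schema.
from dataclasses import dataclass

@dataclass
class OrderEvent:
    timestamp: str      # when the order was posted, modified, or executed
    exchange: str       # which national exchange reported the event
    order_number: str   # exchange-assigned order identifier
    event_type: str     # post, cancel, modify, or execution
    price: float
    size: int
    # entity_id: str    # the trading entity behind the order -- NOT
    #                   # collected, which is exactly the gap discussed above
```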

I wonder who would benefit from the SEC not being able to track insider trading, churning, etc.?

People engaged in insider trading, churning, etc. would be my guess.

You?

Maybe someone should ask SEC chairman Elisse Walter or Gregg Berman (who oversees MIDAS) if tracking entities would help with SEC enforcement?

If they agree, then ask why not now?

For that matter, why not open up the data + entities so others can help the SEC with analysis of the data?

Obvious questions J. Nicholas Hoover should have asked when writing SEC Makes Big Data Push To Analyze Markets.

February 19, 2013

G-8 International Conference on Open Data for Agriculture

Filed under: Government,Government Data,Open Data — Patrick Durusau @ 6:38 am

G-8 International Conference on Open Data for Agriculture

April 29-30, 2013 Washington, D.C.

Deadline for proposals: Midnight, February 28, 2013.

From the call for ideas:

Are you interested in addressing global challenges, such as food security, by providing open access to information? Would you like the opportunity to present to leaders from around the world?

We are seeking innovative products and ideas that demonstrate the potential of using open data to increase food security. This April 29-30th in Washington, D.C., the G-8 International Conference on Open Data for Agriculture will host policy makers, thought leaders, food security stakeholders, and data experts to build a strategy to share agriculture data and make innovation more accessible. As part of the conference, we are giving innovators a chance to showcase innovative uses of open data for food security in a lightning presentation or in the exhibit hall. This call for ideas is a chance to demonstrate the potential that open data can have in ensuring food security, and can inform an unprecedented global collaboration. Visit data.gov to see what agricultural data is already available and connect to other G-8 open data sites!

We are seeking top innovators to show the world what can be done with open data through:

  • Lightning Presentations: brief (3-5 minute), image rich presentations intended to convey an idea
  • Exhibit Hall: an opportunity to convey an idea through an image-rich exhibit.

Presentations should inspire others to share their data or imagine how open data could be used to increase food security. Presentations may include existing, new, or proposed applications of open data and should meet one or more of the following criteria:

  • Demonstrate the impact of open data on food security.
  • Demonstrate the impact of access to agriculturally-relevant data on developed and/or developing countries.
  • Demonstrate the impact of bringing multiple sources of agriculturally-relevant public and/or private open data together (think about the creation of an agriculture equivalent of weather.com)

For those with a new idea, we invite you to submit your proposal to present it to leading experts in food security, technology and data innovation. Proposals should identify which data is needed that is publicly available, for free, on the internet. Proposals must also include a design of the application including relevance to the target audience and plans for beta testing. A successful prototype will be mobile, interactive, and scalable. Proposals to showcase existing products or pitch new ideas will be reviewed by a global panel of technical experts from the G-8 countries.

Short notice, but judging from the submission form on the website, you only get 75-100 words to summarize your proposal.

Hell, I have trouble identifying myself in 75-100 words. 😉

Still, if you are in D.C. and interested, it could be a good way to meet people in this area.

The nine flags for the G-8 are confusing at first, but it’s not an example of government committee counting: the EU also has a representative at G-8 meetings.

I first saw this at: Open Call to Innovators: Apply to present at G-8 International Conference on Open Data for Agriculture.

February 14, 2013

“Improving Critical Infrastructure Cybersecurity” Executive Order

Filed under: Government,Government Data,Security — Patrick Durusau @ 2:53 pm

Unless you have been asleep for the last couple of days, you have heard about President Obama’s “Improving Critical Infrastructure Cybersecurity” Executive Order.

Wanted to point you to one of the less-discussed provisions of the order:

Section 4 (e) reads:

In order to maximize the utility of cyber threat information sharing with the private sector, the Secretary shall expand the use of programs that bring private sector subject-matter experts into Federal service on a temporary basis. These subject matter experts should provide advice regarding the content, structure, and types of information most useful to critical infrastructure owners and operators in reducing and mitigating cyber risks.

I didn’t know which “…programs that bring private sector subject-matter experts into Federal Service…” he meant.

So, I wrote to the GSA (General Services Administration) and they said to look at schedules 70 and 874 at www.gsaelibrary.gsa.gov.

I won’t try to advise you on the steps to register for government contract work.

But this is an opportunity for building bridges across the semantic divides in any inter-agency effort.

Do remember where you heard the news!

February 13, 2013

datacatalogs.org [San Francisco, for example]

Filed under: Data,Dataset,Government,Government Data — Patrick Durusau @ 2:25 pm

datacatalogs.org

From the homepage:

a comprehensive list of open data catalogs curated by experts from around the world.

Cited in Simon Roger’s post: Competition: visualise open government data and win $2,000.

As of today, 288 registered data catalogs.

The reservation I have about “open” government data is that much of what is released as “open” is not terribly useful.

I am sure there is useful “open” government data but let me give you an example of non-useful “open” government data.

Consider San Francisco, CA and cases of police misconduct against its citizens.

A really interesting data visualization would be to plot those incidents against the neighborhoods of San Francisco, with the neighborhoods colored by economic status.

The maps of San Francisco are available at DataSF, specifically, Planning Neighborhoods.

What about the police data?

I found summaries like: OCC Caseload/Disposition Summary – 1993-2009

Which listed:

  • Opened
  • Closed
  • Pending
  • Sustained

Not exactly what is needed for neighborhood by neighborhood mapping.
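For concreteness, here is roughly what the envisioned map would take, under two assumptions the published data does not support: incident-level misconduct records with a neighborhood field, and a neighborhood layer carrying an economic-status attribute. A sketch with geopandas; all filenames and column names are hypothetical.

```python
# A sketch of the neighborhood choropleth described above. Both inputs
# are assumptions: DataSF publishes the neighborhood boundaries, but
# incident-level complaint data with locations does not exist yet.
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt

hoods = gpd.read_file("planning_neighborhoods.shp")   # from DataSF
incidents = pd.read_csv("occ_complaints.csv")         # hypothetical file

counts = incidents.groupby("neighborhood").size().rename("complaints")
hoods = hoods.merge(counts, left_on="name", right_index=True, how="left")

# Shade neighborhoods by complaint count; economic status could be a
# second layer or a side-by-side panel.
hoods.plot(column="complaints", legend=True)
plt.show()
```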

Note: No police misconduct since 2009 according to these data sets. (I find that rather hard to credit.)

How would you vote on this data set from San Francisco?

Open, Opaque, Semi-Transparent?

February 6, 2013

Need to Pad Your Resume? Innovation Fellows Round 2

Filed under: Government,Government Data — Patrick Durusau @ 11:39 am

White House Seeks Tech Innovation Fellows by Elena Malykhina.

The White House death march farce I covered in A Competent CTO Can Say No continues.

The next group of six- to twelve-month projects:

  • Disaster Response and Recovery: The project will “pre-position” tech tools for disaster readiness in order to diminish economic damage and save lives.
  • Cyber-Physical Systems: A new generation of cyber-physical “smart systems” will be developed to help the economy and job creation. These systems will combine distributed sensing, control and data analytics.
  • 21st Century Financial Systems: The 21st Century Financial Systems initiative will transition agency-specific federal financial accounting systems to a more modular, scalable and cost-effective model.
  • Innovation Toolkit: A suite of tools will be created for federal workers, allowing them to become more responsive and efficient in their jobs.
  • Development Innovation Ventures: The Development Innovation Ventures project will address tough global problems by allowing the U.S. government to identify, test and scale new technologies.

Sound like six- to twelve-month projects? Yes?

I know, I know, I should be lining up to participate in this fraud on the public and be paid for doing it. Looks nice on the resume.

Successful solutions will not be developed on fixed timelines before problems are defined or understood.

Some will say, “So what? So long as you are paid for time, travel, etc., why would you care if the solution is successful?”

That must be why there are no links in the Round 2 announcement to “successes” of the first round of innovation.

Take the first one on the list from round one:

Open Data Initiatives have unleashed data from the vaults of the government as fuel for entrepreneurs and innovators to create new apps, products, and services that benefit the American people in myriad ways and contribute to job growth.

Can you name one? Just one.

Sequestration data (except for my releases) continues to be dead PDF files. And the data in those files is too incomplete for useful analysis.

Is that “…unleash[ing] data from the vaults of the government…?”

Or did the sequestration debate escape their attention?

The number of people willing to defraud the public even in these hard economic times was encouragingly low.

Only 700 people applied for round one. Out of hundreds of thousands of highly qualified IT people who could have applied.

Is defrauding the public becoming unfashionable?

Perhaps there is hope.


Lest there be some misunderstanding, government at all levels is filled with public servants.

But you have to get away from elected/appointed positions to find them.

They mostly don’t appear on Sunday talk shows but tirelessly do the public’s business out of the limelight.

Public servants I would gladly help, public parasites, not so much.

February 5, 2013

Green Book – Semantic and Governmental Failure

Filed under: Government,Government Data — Patrick Durusau @ 5:30 pm

Full Text Reports carried a report today about the House Ways and Means Committee — 2012 Green Book (released November 2012).

I am always looking for data that might be of interest for topic maps, and the quoted blurb:

Since 1981, the Committee on Ways and Means has published the Green Book, which presents background material and statistical data on the major entitlement programs and other activities within the Committee’s jurisdiction. Over the decades, the Green Book has become a valuable resource and standard reference on American social policy. It is widely used by Members of Congress and their staffs, analysts in congressional and administrative agencies, members of the media, scholars, and citizens interested in the Nation’s social policy.

Seemed to fill the bill.

Until I got to: Committee on Ways and Means, U.S. House of Representatives, Green Book: Background Material and Data on the Programs within the Jurisdiction of the Committee on Ways and Means.

I sh*t you not. That is really the title.

No wonder they call it the “Green Book.”

When I got to the book itself (stop laughing, you are ahead of me), all the tables are in PDF files.

No, I’m not going to convert them this time.

Why don’t they share machine-readable files? That is a question you should ask your representative.

Thinking there may be a machine readable copy elsewhere, I searched for the “Green Book.”

Did you know the Department of Defense has a “Green Book?”

Or that Financial Management Services (Treasury) has a Green Book?

Or that the Treasury has another Green Book?

Or the U.S. Army Green Books? (apparently there are later ones than cited here)

Or that Obama has a Green Book?

Counting the one from Congress, that’s six and I suspect there are many more that any search will turn up.

Don’t suppose it ever occurred to anyone in government that distinguishing any of these for search purposes would be useful?

February 2, 2013

Alpha.data.gov: From Open Data Provider to Open Data Hub

Filed under: Government,Government Data,Open Data,Topic Maps — Patrick Durusau @ 3:08 pm

Alpha.data.gov: From Open Data Provider to Open Data Hub by Andrea Di Maio.

From the post:

Those who happen to read my blog know that I am rather cynical about many enthusiastic pronouncements around open data. One of the points I keep banging on is that the most common perspective is that open data is just something that governments ought to publish for businesses and citizens to use it. This perspective misses both the importance of open data created elsewhere – such as by businesses or by people in social networks – and the impact of its use inside government. Also, there is a basic confusion between open and public data: not all open data is public and not all public data may be open (although they should, in the long run).

In this respect the new experimental site alpha.data.gov is a breath of fresh air. Announced in a recent post on the White House blog, it does not contain data, but explains which categories of open data can be used for which sort of purposes.

A step in the right direction.

Simply gathering the relevant data sets for any given project is a project in and of itself.

Followed by documenting the semantics of the relevant data sets.

Data hubs are a precursor to collections of semantic documentation for data found at data hubs.

You know what should follow from collections of semantic documentation. 😉 (Can you say topic maps?)

February 1, 2013

Sunlight Congress API [Shifting the Work for Transparency?]

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 8:10 pm

Sunlight Congress API

From the webpage:

A live JSON API for the people and work of Congress, provided by the Sunlight Foundation.

Features

Lots of features and data for members of Congress:

  • Look up legislators by location or by zip code.
  • Official Twitter, YouTube, and Facebook accounts.
  • Committees and subcommittees in Congress, including memberships and rankings.

We also provide Congress' daily work:

  • All introduced bills in the House and Senate, and what occurs to them (updated daily).
  • Full text search over bills, with powerful Lucene-based query syntax.
  • Real time notice of votes, floor activity, and committee hearings, and when bills are scheduled for debate.

All data is served in JSON, and requires a Sunlight API key. An API key is free to register and has no usage limits.

We have an API mailing list, and can be found on Twitter at @sunlightlabs. Bugs and feature requests can be made on Github Issues.
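Getting started takes only a few lines. A minimal sketch with Python’s requests library; the endpoint path and parameter names follow my reading of the API documentation, so verify them against the docs before relying on this.

```python
# A sketch of calling the Sunlight Congress API. The base URL, endpoint
# path, and parameter names are my reading of the API docs; check the
# current documentation before depending on them.
import requests

API_KEY = "your-sunlight-api-key"  # free to register, no usage limits
BASE = "https://congress.api.sunlightfoundation.com"

resp = requests.get(
    f"{BASE}/legislators/locate",
    params={"zip": "30513", "apikey": API_KEY},
)
resp.raise_for_status()

for legislator in resp.json().get("results", []):
    print(legislator.get("last_name"), legislator.get("chamber"))
```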

Important not to confuse this effort with transparency.

As the late Aaron Swartz remarked in the O’Reilly “Open Government” text:

…When you create a regulatory agency, you put together a group of people whose job is to solve some problem. They’re given the power to investigate who’s breaking the law and the authority to punish them. Transparency, on the other hand, simply shifts the work from the government to the average citizen, who has neither the time nor the ability to investigate these questions in any detail, let alone do anything about it. It’s a farce: a way for Congress to look like it has done something on some pressing issue without actually endangering its corporate sponsors.

Here is an interface that:

…shifts the work from the [Sunlight Foundation] to the average citizen, who has neither the time nor the ability to investigate these questions in any detail, let alone do anything about it. It’s a farce: a way for [Sunlight Foundation] to look like it has done something on some pressing issue without actually endangering its corporate sponsors. (O’Reilly’s Open Government book [“…more equal than others” pigs])

Suggestions for ending the farce?

I first saw this at the Legal Informatics Blog, Mill: Sunlight Foundation releases Congress API.

Docket Wrench: Exposing Trends in Regulatory Comments [Apparent Transparency]

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 8:10 pm

Docket Wrench: Exposing Trends in Regulatory Comments by Nicko Margolies.

From the post:

Today the Sunlight Foundation unveils Docket Wrench, an online research tool to dig into regulatory comments and uncover patterns among millions of documents. Docket Wrench offers a window into the rulemaking process where special interests and individuals can wield their influence without the level of scrutiny traditional lobbying activities receive.

Before an agency finalizes a proposed rule that Congress and the president have mandated that they enforce, there is a period of public commenting where the agency solicits feedback from those affected by the rule. The commenters can vary from company or industry representatives to citizens concerned about laws that impact their environment, schools, finances and much more. These comments and related documents are grouped into “dockets” where you can follow the actions related to each rule. Every rulemaking docket has its own page on Docket Wrench where you can get a graphical overview of the docket, drill down into the rules and notices it contains and read the comments on those rules. We’ve pulled all this information together into one spot so you can more easily research trends and extract interesting stories from the data. Sunlight’s Reporting Group has done just that, looking into regulatory comment trends and specific comments by the Chamber of Commerce and the NRA.

An “apparent” transparency offering from the Sunlight Foundation.

Imagine that you follow their advice and do discover “form letters,” horror, that have been submitted in a rule making process.

What are you going to do? Whistle up the agency’s former assistant director who is on your staff to call his buds at the agency to complain?

Get yourself a cardboard sign and march around your town square? Start a letter writing campaign of your own?

Rules are drafted, debated, and approved in the dark recesses of agencies, by former agency staff, lobbyists, and law firms.

Want transparency? Real transparency?

That would require experts in law and policy who have equal access to the agency as its insiders and an obligation to report to the public who wins and who loses from particular rules.

An office like the public editor of the New York Times.

Might offend donors if you did that.

Best just to expose the public to a tiny part of the quagmire so you can claim people had an opportunity to participate.

Not a meaningful one, but an opportunity nonetheless.

I first saw this at the Legal Informatics Blog, Sunlight Foundation Releases Docket Wrench: Tool for Analyzing Comments to Proposed Regulations

January 21, 2013

No Joy in Vindication

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 7:31 pm

You may have seen the news about the latest GAO report on auditing the U.S. government: U.S. Government’s Fiscal Years 2012 and 2011 Consolidated Financial Statements, GAO-13-271R, Jan 17, 2013, http://www.gao.gov/products/GAO-13-271R.

The reasons why the GAO can’t audit the U.S. government:

(1) serious financial management problems at DOD that have prevented its financial statements from being auditable,

(2) the federal government’s inability to adequately account for and reconcile intragovernmental activity and balances between federal agencies, and

(3) the federal government’s ineffective process for preparing the consolidated financial statements.

Number 2 reminds me of: The 560+ $Billion Shell Game, where I provided data files based on the OMB Sequestration report, detailing that over 560 $billion in agency transfers could not be tracked.

That problem has now been confirmed by the GAO.

I am sure my analysis was not original and has been known to insiders at the GAO and others for years.

But did you know that I mailed that analysis to both of my U.S. Senators and got no response?

I did get a “bug letter” from my representative, Austin Scott:

Washington continues to spend at unsustainable levels. That is why I voted against H.R. 8, the American Taxpayer Relief Act when it passed Congress on January 1, 2013. This plan does not address the real driver of our debt – spending. President Obama’s unwillingness to address this continues to cripple our efforts to find a long-term solution. We cannot tax our way out of this fiscal situation.

The President himself has said on multiple occasions that spending cuts must be part of the solution. In fact, on April 13, 2011 he remarked, “So any serious plan to tackle our deficit will require us to put everything on the table, and take on excess spending wherever it exists in the budget.” However, his words have seldom matched his actions.

We owe it to our children and grandchildren to make the tough choices and devise a long-term solution that gets our economy back on track and reduces our deficits. I remain hopeful that the President will join us in this effort. Thank you for contacting me. It’s an honor to represent the Eighth Congressional District of Georgia.

Non-responsive would be a polite word for it.

My original point has been vindicated by the GAO but that brings no joy.

My request to the officials I have contacted was simple:

All released government financial data must be available in standard spreadsheet formats (Excel, CSV, ODF).

There are a whole host of other issues that will arise from such data but the first step is to get it in a crunchable format.

O’Reilly’s Open Government book [“…more equal than others” pigs]

Filed under: Government,Government Data,Open Data,Open Government,Transparency — Patrick Durusau @ 7:30 pm

We’re releasing the files for O’Reilly’s Open Government book by Laurel Ruma.

From the post:

I’ve read many eloquent eulogies from people who knew Aaron Swartz better than I did, but he was also a Foo and contributor to Open Government. So, we’re doing our part at O’Reilly Media to honor Aaron by posting the Open Government book files for free for anyone to download, read and share.

The files are posted on the O’Reilly Media GitHub account as PDF, Mobi, and EPUB files for now. There is a movement on the Internet (#PDFtribute) to memorialize Aaron by posting research and other material for the world to access, and we’re glad to be able to do this.

You can find the book here: github.com/oreillymedia/open_government

Daniel Lathrop, my co-editor on Open Government, says “I think this is an important way to remember Aaron and everything he has done for the world.” We at O’Reilly echo Daniel’s sentiment.

Be sure to read Chapter 25, “When Is Transparency Useful?”, by the late Aaron Swartz.

It includes this passage:

…When you create a regulatory agency, you put together a group of people whose job is to solve some problem. They’re given the power to investigate who’s breaking the law and the authority to punish them. Transparency, on the other hand, simply shifts the work from the government to the average citizen, who has neither the time nor the ability to investigate these questions in any detail, let alone do anything about it. It’s a farce: a way for Congress to look like it has done something on some pressing issue without actually endangering its corporate sponsors.

As a tribute to Aaron, are you going to dump data on the WWW or enable the calling of “more equal than others” pigs to account?

January 20, 2013

Operation Asymptote – [PlainSite / Aaron Swartz]

Filed under: Government,Government Data,Law,Law - Sources,Legal Informatics,Uncategorized — Patrick Durusau @ 8:06 pm

Operation Asymptote

Operation Asymptote’s goal is to make U.S. federal court data freely available to everyone.

The data is available now, but free only up to $15 worth every quarter.

Serious legal research hits that limit pretty quickly.

The project does not cost you any money, only some of your time.

The result will be another source of data to hold the system accountable.

So, how real is your commitment to doing something effective in memory of Aaron Swartz?

January 18, 2013

Freeing the Plum Book

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 7:15 pm

Freeing the Plum Book by Derek Willis.

From the post:

The federal government produces reams of publications, ranging from the useful to the esoteric. Pick a topic, and in most cases you’ll find a relevant government publication: for example, recent Times articles about presidential appointments draw on the Plum Book. Published annually by either the House or the Senate (the task alternates between committees), the Plum Book is a snapshot of appointments throughout the federal government.

The Plum Book is clearly a useful resource for reporters. But like many products of the Government Printing Office, its two main publication formats are print and PDF. That means the digital version isn’t particularly searchable, unless you count Ctrl-F as a legitimate search mechanism. And that’s a shame, because the Plum Book is basically a long list of names, positions and salary information. It’s data.

Derek describes freeing the Plum Book from less than useful formats.

It is now available in JSON and YAML formats at Github and in Excel.
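Once the JSON file is downloaded from the GitHub repository, filtering it is a few lines of Python. A sketch only: the key names below are hypothetical, so inspect the actual file for the real ones.

```python
# A sketch of filtering the freed Plum Book data. Assumes a top-level
# list of appointment records; the key names ("agency", "position_title",
# "pay_plan") are hypothetical and should be checked against the file.
import json

with open("plum_book.json", encoding="utf-8") as f:
    appointments = json.load(f)

treasury = [a for a in appointments
            if a.get("agency") == "Department of the Treasury"]
for a in treasury[:10]:
    print(a.get("position_title"), a.get("pay_plan"))
```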

Curious, what other public datasets would you want to match up to the Plum Book?

January 13, 2013

U.S. GPO releases House bills in bulk XML

Filed under: Government Data,Law,Law - Sources — Patrick Durusau @ 8:15 pm

U.S. GPO releases House bills in bulk XML

Bills from the current Congress, available for bulk download in XML.

Users guide.

GPO press release.

Bulk House Bills Download.

Another bulk data source from the U.S. Congress.

Integration of the legislative sources will be non-trivial, but it has been done before, manually.

What will be more interesting will be tracking the more complex interpersonal relationships that underlie the surface of legislative sources.

January 10, 2013

Lost: House Floor Record for 1 January 2013. If found please call…

Filed under: Government,Government Data — Patrick Durusau @ 1:49 pm

U.S. House of Representatives floor proceedings for the 109th Congress, 1st Session (2005) to 113th Congress, 1st Session-to-Date (2013) are now available for download in XML. (House Floor Activities Download)

One obvious test of the data, the House vote on the “fiscal cliff” legislation.

In fact, the Clerk of the House for January 01, 2013, has posted a web version of that day.

Question: If you download 112th Congress, 2nd Session (2012), will you find the vote on the “fiscal cliff” legislation?

Answer: No!

The entire legislative day in the House of Representatives is missing from the 112th Congress, 2nd Session (2012) file.

See for yourself: I have uploaded the 112th Congress, 2nd Session (2012) and the Clerk of the House of Representatives file for January 1, 2013, in the file: Missing1January2013.

Search for: “On motion that the House agree to the Senate amendments Agreed to by recorded vote: 257 – 167” in HDoc-112-2-FloorProceedings.xml. (112th Congress, 2nd Session).
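A minimal sketch of that check in Python, assuming both files have been downloaded locally (the second filename is hypothetical):

```python
# Scan each local file for the recorded-vote line. A plain substring
# search is enough; no XML parsing is needed to show the record is
# missing from the Floor Proceedings file.
needle = ("On motion that the House agree to the Senate amendments "
          "Agreed to by recorded vote: 257 – 167")

for path in ("HDoc-112-2-FloorProceedings.xml",
             "clerk-house-2013-01-01.xml"):  # hypothetical local filename
    with open(path, encoding="utf-8") as f:
        print(path, "->", needle in f.read())
```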

Typos and errors happen all the time. To everyone. But missing an entire day is more than just a typo. It indicates a lack of concern for quality control.

January 9, 2013

Center for Effective Government Announces Launch [Name Change]

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 12:00 pm

Center for Effective Government Announces Launch

The former OMB Watch is now the Center for Effective Government (www.foreffectivegov.org).

A change to reflect a broader expertise on government effectiveness in general.

From the post:

The Center for Effective Government will continue to offer expert analysis, in-depth reports, and news updates on the issues it has been known for in the past. Specifically, the organization will:

  • Analyze federal tax and spending choices and advocate for progressive revenue options and transparency in federal spending;
  • Defend and improve national standards and safeguards and the regulatory systems that produce and enforce them;
  • Expose undue special interest influence in federal policymaking and advocate for open government reforms that ensure public officials put the public interest first; and
  • Encourage more active citizen engagement in our democracy by ensuring people have access to easy-to-understand, contextualized, meaningful public information and understand how they can participate in public policy decision making processes.

If you have been running a topic map in this area, reflect the name change to the OMB Watch topic.

Beyond simple semantic impedance, which is always present, government is replete with examples of intentional impedance if not outright deception.

A fertile field for topic map practitioners!

January 1, 2013

The 560+ $Billion Shell Game

Filed under: Government,Government Data — Patrick Durusau @ 8:49 pm

I have completed another round of analysis on the OMB Report Pursuant to the Sequestration Transparency Act of 2012 (P. L. 112–155), which I started with Fiscal Cliff + OMB or Fool Me Once/Twice (Appendix A) and Over The Fiscal Cliff – Blindfolded (Appendix B).

$560,157 million, or $560.157 billion, is hidden in the O (Opaque) MB report. How?

“Hidden” in the sense that money is taken from an unknown government account and transferred to another government account (the one that says “Exempt, 255(g)(1)(A) — intragovernmental”).

In other words, we know where the money went, but not where it came from.

Let’s walk through the first account in Appendix A to illustrate how “Exempt, 255(g)(1)(A) — intragovernmental” would be calculated:

Senate, 001-05-0110, Salaries, Officers and Employees: Sequestrable BA of $176 million × 8.2% sequester percentage = $14 million sequester amount (176 × 0.082 ≈ 14.4, reported as 14).

If $50 million is paid out of the Senate account into another government account, it is “Exempt, 255(g)(1)(A) — intragovernmental” and listed at the “other” account as exempt. (That is how it appears in Appendix A; Appendix B identifies it as exempt under the Balanced Budget and Emergency Deficit Control Act of 1985, as amended (BBEDCA).)

In the O (Opaque) MB report, approximately $560.157 billion was transferred from one government account to another, but it isn’t possible to trace the transfers.

Open the data file, Appendix-A-Exempt-Intragovernmental-With-Appendix-B-Pages. It only contains accounts with “intragovernmental” exemptions.

Filter column J for amounts > 0 and sum the results. (I have included page numbers for Appendix A and Appendix B to assist in your verification of the data.)

Another $1.066 billion is contained in negative exemptions under “Exempt, 255(g)(1)(A) — intragovernmental.”

Filter column J for amounts < 0 and sum the results. What does a negative exemption mean? Your guess is as good as mine.

Twenty-one accounts where Appendix A says exempt (but gives no reason) and Appendix B says not exempt:

  1. 001-15-0100
  2. 001-15-4296
  3. 001-25-0101
  4. 001-25-4325
  5. 001-25-4346
  6. 005-65-1955
  7. 006-48-5583
  8. 006-48-5584
  9. 006-60-0551
  10. 006-60-0552
  11. 006-60-5396
  12. 019-20-0233
  13. 009-38-0118
  14. 024-70-0701
  15. 025-09-0206
  16. 010-22-1700
  17. 010-95-1127
  18. 015-25-4159
  19. 015-45-0947
  20. 026-00-0112
  21. 028-00-4156

Filter on column R = x. I used “x” to denote that Appendix A says exempt but Appendix B disagrees.

Finally, there are four (4) accounts in Appendix A that don’t appear in Appendix B.

  1. 202-00-3112
  2. 026-00-0109
  3. 027-00-0100
  4. 028-00-0100

Filter on column S = x. I used “x” to denote Appendix A has an account number not found in Appendix B (apparent typos in B.)
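For convenience, the three filters above can be reproduced in one pass with pandas. A minimal sketch, assuming the spreadsheet has been exported to CSV; the filename is hypothetical and the column positions may need adjusting to your export.

```python
# A sketch of the filters described above, not the original spreadsheet
# workflow. Assumes spreadsheet columns J, R, and S land at zero-based
# positions 9, 17, and 18; strip "$" and commas first if the amounts
# column is formatted text.
import pandas as pd

df = pd.read_csv("Appendix-A-Exempt-Intragovernmental.csv")  # hypothetical

# Column J (position 9): intragovernmental exemption amounts.
amounts = pd.to_numeric(df.iloc[:, 9], errors="coerce")
print("sum of exemptions > 0:", amounts[amounts > 0].sum())
print("sum of exemptions < 0:", amounts[amounts < 0].sum())

# Column R (position 17): "x" marks Appendix A exempt, Appendix B not.
print("A/B disagreements:", (df.iloc[:, 17] == "x").sum())

# Column S (position 18): "x" marks accounts in Appendix A missing from B.
print("accounts missing from B:", (df.iloc[:, 18] == "x").sum())
```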

Totals are for “Exempt, 255(g)(1)(A) — intragovernmental” accounts only. The actual count on missing accounts, etc., is higher on the full data set.


Additional resources:

Text of 255(g)(1)(A) of the Balanced Budget and Emergency Deficit Control Act of 1985, as amended (BBEDCA):

Intragovernmental funds, including those from which the outlays are derived primarily from resources paid in from other government accounts, except to the extent such funds are augmented by direct appropriations for the fiscal year during which an order is in effect.

Budget “Sequestration” and Selected Program Exemptions and Special Rules (Congressional Research Service) by Karen Spar is heavy reading but very helpful.


Update:

Refried Numbers from the OMB

In its current attempt at sequester obfuscation, the OMB combined the approaches used in Appendices A and B of its earlier report and reduced the percentage of sequestration. See: OMB REPORT TO THE CONGRESS ON THE JOINT COMMITTEE SEQUESTRATION FOR FISCAL YEAR 2013.

December 29, 2012

Over The Fiscal Cliff – Blindfolded

Filed under: Government,Government Data — Patrick Durusau @ 9:12 pm

The United States government is about to go over the “fiscal cliff.”

The really sad part is that the people of the United States are going with it, but they are blindfolded.

Intentionally blindfolded by their own government.

The OMB (“o” stands for opaque) report: OMB Report Pursuant to the Sequestration Transparency Act of 2012 (P. L. 112–155), Appendix B. Preliminary Sequestrable / Exempt Classification, classifies accounts as sequestrable, exempt, etc.

One reason to be “exempt” is funds were already sequestered elsewhere in the budget. Makes sense on the face of it.

But 487 entries out of 2,126 in Appendix B, or 22.9%, are sequestered from some unstated part of the government.

Totally opaque.

Unlike the OMB, I am willing to share an electronic version of the files: OMB-Sequestration-Data-Appendix-B.zip. Satisfy yourself if I am right or wrong.

You can make it the last time the US government puts a blindfold on the American people.

Contact the White House, your Senator or Representative.

December 24, 2012

Political Data Yearbook interactive

Filed under: Government,Government Data — Patrick Durusau @ 3:39 pm

Political Data Yearbook interactive

From the webpage:

Political Data Yearbook captures election results, national referenda, changes in government, and institutional reforms for a range of countries, within and beyond the EU.

Particularly useful if your world consists of the EU + Australia, Canada, Iceland, Israel, Norway, Switzerland and the USA. 😉

To put that into perspective, of the world’s most populous countries only the third-ranking one, the USA, gets listed.

Omitted are (in population order): China, India, Indonesia, Brazil, Pakistan, Bangladesh, Nigeria, Russia and Japan. Or about 60% of the world’s population.

Africa, South America, the Middle East (except for Israel), Mexico and Latin America are omitted as well.

Suggestions for resources covering these rapidly expanding markets?

December 13, 2012

Crowdsourcing campaign spending: …

Filed under: Crowd Sourcing,Government Data,Journalism — Patrick Durusau @ 3:43 pm

Crowdsourcing campaign spending: What ProPublica learned from Free the Files by Amanda Zamora.

From the post:

This fall, ProPublica set out to Free the Files, enlisting our readers to help us review political ad files logged with Federal Communications Commission. Our goal was to take thousands of hard-to-parse documents and make them useful, helping to reveal hidden spending in the election.

Nearly 1,000 people pored over the files, logging detailed ad spending data to create a public database that otherwise wouldn’t exist. We logged as much as $1 billion in political ad buys, and a month after the election, people are still reviewing documents. So what made Free the Files work?

A quick backstory: Free the Files actually began last spring as an effort to enlist volunteers to visit local TV stations and request access to the “public inspection file.” Stations had long been required to keep detailed records of political ad buys, but they were only available on paper and required actually traveling to the station.

In August, the FCC ordered stations in the top 50 markets to begin posting the documents online. Finally, we would be able to access a stream of political ad data based on the files. Right?

Wrong. It turns out the FCC didn’t require stations to submit the data in anything that approaches an organized, standardized format. The result was that stations sent in a jumble of difficult to search PDF files. So we decided if the FCC or stations wouldn’t organize the information, we would.

Enter Free the Files 2.0. Our intention was to build an app to help translate the mishmash of files into structured data about the ad buys, ultimately letting voters sort the files by market, contract amount and candidate or political group (which isn’t possible on the FCC’s web site), and to do it with the help of volunteers.

In the end, Free the Files succeeded in large part because it leveraged data and community tools toward a single goal. We’ve compiled a bit of what we’ve learned about crowdsourcing and a few ideas on how news organizations can adapt a Free the Files model for their own projects.

The team who worked on Free the Files included Amanda Zamora, engagement editor; Justin Elliott, reporter; Scott Klein, news applications editor; Al Shaw, news applications developer, and Jeremy Merrill, also a news applications developer. And thanks to Daniel Victor and Blair Hickman for helping create the building blocks of the Free the Files community.

The entire story is golden but a couple of parts shine brighter for me than the others.

Design consideration:

The success of Free the Files hinged in large part on the design of our app. The easier we made it for people to review and annotate documents, the higher the participation rate, the more data we could make available to everyone. Our maxim was to make the process of reviewing documents like eating a potato chip: “Once you start, you can’t stop.”

Let me re-say that: The easier it is for users to author topic maps, the more topic maps they will author.

Yes?

Semantic Diversity:

But despite all of this, we still can’t get an accurate count of the money spent. The FCC’s data is just too dirty. For example, TV stations can file multiple versions of a single contract with contradictory spending amounts — and multiple ad buys with the same contract number means radically different things to different stations. But the problem goes deeper. Different stations use wildly different contract page designs, structure deals in idiosyncratic ways, and even refer to candidates and groups differently.

All true, but knowing ahead of time that the semantics vary from station to station, why not map the semantics in each market ahead of time?

Granted, I second their request to the FCC for standardized data, but having standardized blocks doesn’t mean the information has the same semantics.

The OMB can’t keep the same semantics for a handful of terms in one document.

What chance is there with dozens and dozens of players in multiple documents?

December 7, 2012

Fiscal Cliff + OMB or Fool Me Once/Twice

Filed under: Government,Government Data,Marketing,Topic Maps — Patrick Durusau @ 12:00 pm

Call it a fiscal “cliff,” “slope,” “curb,” “bump,” or whatever, it is all the rage in U.S. news programming.

Two things are clear:

First, tax and fiscal policy are important for government services, the economy and citizens.

Second, the American people are being kept in near total darkness about what may, could or should be done in tax and fiscal policy.

House Speaker Boehner’s “proposal” to close some tax loopholes, some day by some amount is too vacuous to merit further comment.

President Obama has been clear on wanting an increase in taxes for income over $250,000, but there the clarity from the Obama administration stops.

The Office of Management and Budget issued OMB Report Pursuant to the Sequestration Transparency Act of 2012 (P. L. 112–155) as a PDF file, meaning no one can easily evaluate its contents.

Especially:

Appendix A. Preliminary Estimates of Sequestrable and Exempt Budgetary Resources and Reduction in Sequestrable Budgetary Resources by OMB Account – FY 2013

and,

Appendix B. Preliminary Sequestrable / Exempt Classification by OMB Account and Type of Budgetary Resource

I converted Appendix A into a comma-separated data file, with a short commentary to alert the reader to issues in the data file. (OMB-Sequestration-Data-Appendix-A.zip)

For example:

  • Meaning and application of “offsets” varies throughout Appendix A of the OMB report.
  • The OMB report manages to multiply 0 by 7.6 percent for a result of $91 million.
  • Appendix B has a different ordering of the accounts than Appendix A and uses different identifiers.

Whatever the intent of the report’s authors, it fails to provide meaningful information on the sequestration issue.

Contact the White House, your Senator or Representative.

Demand all proposals be accompanied by machine readable spreadsheets with details.

Demand your favorite news outlet carry no reports without data from any side in this debate. (Being ignored is the most powerful weapon against the White House, Congress and various federal agencies.)

Lobbyists, the OMB, and members of Congress all have those files. The public is the only side without the details.

Topic maps can map points of clarity as well as obscurity, assuming you have the files for mapping.

December 4, 2012

INSA Highlights Increasing Importance of Open Source

Filed under: Government,Government Data,Intelligence — Patrick Durusau @ 12:52 pm

INSA Highlights Increasing Importance of Open Source

From Recorded Future*:

The Intelligence and National Security Alliance (INSA) Rebalance Task Force recently released its new white paper “Expectations of Intelligence in the Information Age.”

We’re obviously big fans of open source analysis, so some of the lead observations reported by the task force really hit home. Here they are, as written by INSA:

  • The heightened expectations of decision makers for timely strategic warning and current intelligence can be addressed in significant ways by the IC through “open sourcing” of information.
  • “Open sourcing” will not replace traditional intelligence; decision makers will continue to expect the IC to extract those secrets others are determined to keep from the United States.
  • However, because decision makers will access open sources as readily as the IC, they will expect the IC to rapidly validate open source information and quickly meld it with that derived from espionage and traditional sources of collection to provide them with the knowledge desired to confidently address national security issues and events.

You can check out an interactive version of the full report here, and take a moment to visit Recorded Future to see how we’re embracing this synthesis of open source and confidential intelligence.

I have confidence that the IC will find ways to make their collection, recording, analysis, and synthesis of open source information incompatible with their traditional intelligence sources.

After all, we are less than five (5) years away from some unknown level of sharing of traditional intelligence data: Read’em and Weep.

Let’s say there is some sort of intelligence sharing by 2017 (2012 + 5). That’s sixteen (16) years after 9/11.

Being mindful that sharing doesn’t mean integrated into the information flow of the respective agencies.

How does that saying go?

Once is happenstance.

Twice is coincidence.

Three times is enemy action?

Where does the continuing failure to share intelligence fall on that list?

(Topic maps can’t provide the incentives to make sharing happen, but they do make sharing possible for people with incentives to share.)


* I listed the entry as originating from Recorded Future. Why some blog authors find it difficult to identify themselves I cannot say.

November 30, 2012

Campaign Finance Data in Splunk [Cui bono?]

Filed under: Government,Government Data,Splunk — Patrick Durusau @ 5:29 pm

Two posts you may find interesting:

SPLUNK’D: Federal Election Commission Campaign Finance Data

and,

Splunk4Good Announces public data project highlighting FEC Campaign Finance Data

Project link.

The project reveals answers to our burning questions:

  • What state gives the most?
  • Which state gives the most per capita? (Bet you won’t guess this one!)
  • What does aggregate giving look like visualized over the election cycle?
  • Is your city more Red or more Blue?
  • What does a map viz with drilldown reveal about giving by zip codes or cities?
  • What occupation gives the most?
  • Are geologists more Red or more Blue? (Hint: think about where geologists live and who they work for!)

Impressive performance but some of my burning questions would be:

  • Closing which tax loopholes would impact particular taxpayers who contributed to X political campaign?
  • Which legislative provisions benefits particular taxpayers or their investments?
  • Which regulations by federal agencies benefit particular taxpayers or their businesses?

The FEC data isn’t all you would need to answer those questions. But the answers are known.

Someone asked for the benefits in all three cases. Someone wrote the laws, regulations, or loopholes with the intent to grant those benefits.

Not all of those are dishonest. Consider the charitable contributions that sustain fine art, music, libraries and research that benefits all of us.

There are other benefits that are less benign.

Identifying the givers, the recipients, the legislation/regulation, and the benefits would require collocating data from disparate domains and vocabularies.
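What would that collocation look like mechanically? A minimal sketch, assuming a hypothetical beneficiary dataset alongside the FEC bulk data; every filename and column name here is illustrative.

```python
# A sketch of the collocation step: joining FEC contribution records to
# a (hypothetical) beneficiary dataset on a normalized name key. Real
# matching needs far more than this -- aliases, employers, subsidiaries --
# which is where a topic map's subject identity machinery earns its keep.
import pandas as pd

def normalize(name: str) -> str:
    return " ".join(str(name).upper().replace(",", " ").split())

donors = pd.read_csv("fec_contributions.csv")           # FEC bulk data
benefits = pd.read_csv("provision_beneficiaries.csv")   # hypothetical

donors["key"] = donors["contributor_name"].map(normalize)
benefits["key"] = benefits["beneficiary_name"].map(normalize)

linked = donors.merge(benefits, on="key")
print(linked[["contributor_name", "provision", "amount"]].head())
```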

Interested?

November 29, 2012

International Aid Transparency Initiative (IATI) Standard

Filed under: Code Lists,Government,Government Data — Patrick Durusau @ 6:22 pm

International Aid Transparency Initiative (IATI) Standard

From the webpage:

The International Aid Transparency Initiative (IATI) is a global transparency standard that makes information about aid spending easier to access, use and understand.

More precisely it is a standard for normalizing financial data in order to provide transparency.

Transparency is desired by donors of international aid so they can judge and control the use of the aid they donate. Non-transparency is desired by the recipients of international aid because they resent the paternalism of and interference in local affairs by donors.

I sense a lack of the common interest that would be required to make this standard truly effective.

Its code lists, on the other hand, could be quite valuable in creating mapping solutions between disparate information systems.

I first saw this standard mentioned in Using Graphs to Analyse Public Spending on International Development by James Hughes.

November 16, 2012

Fech 1.1 Released

Filed under: Government,Government Data — Patrick Durusau @ 4:34 am

Fech 1.1 Released by Derek Willis

From the post:

We’ve released an updated version of Fech, our Ruby gem for parsing Federal Election Commission electronic filings. This update provides some performance improvements when parsing large filings (such as those from the presidential campaign) and several small fixes for parsing miscellaneous reports and other issues. The code is on Github; feel free to fork it and make changes or file an issue.

That is good news!

Now, what other data would you want to map with it?

Suggestions?

October 27, 2012

zip-code-data-hacking

Filed under: Geographic Data,Geographic Information Retrieval,Government Data — Patrick Durusau @ 7:09 pm

zip-code-data-hacking by Neil Kodner.

From the readme file:

sourcing publicly available files, generate useful zip code-county data.

My goal is to be able to map zip codes to county FIPS codes, without paying. So far, I’m able to produce county fips codes for 41456 counties out of a list of 42523 zip codes.

I was able to find a zip code database from unitedstateszipcodes.org, each zip code had a county name but not a county FIPS code. I was able to find County FIPS codes on the census.gov site through some google hacking.

The data files are in the data directory – I’ll eventually add code to make sure the latest data files are retrieved at runtime. I didn’t do this yet because I didn’t want to hammer the sites while I was quickly iterating – a local copy did just fine.

In case you are wondering why this mapping between zip codes to county FIPS codes is important:

Federal information processing standards codes (FIPS codes) are a standardized set of numeric or alphabetic codes issued by the National Institute of Standards and Technology (NIST) to ensure uniform identification of geographic entities through all federal government agencies. The entities covered include: states and statistically equivalent entities, counties and statistically equivalent entities, named populated and related location entities (such as, places and county subdivisions), and American Indian and Alaska Native areas. (From: Federal Information Processing Standard (FIPS))

To use zip code based data against federal agency data (FIPS), requires this mapping.
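A minimal sketch of the join Neil describes, assuming two CSVs shaped like his sources: a zip database with a county name per zip, and a census file mapping county names to FIPS codes. Column names here are hypothetical; match them to the actual files.

```python
# Join a zip-code database (county names, no FIPS) to a census table
# (county names with FIPS). All column names are assumptions.
import pandas as pd

zips = pd.read_csv("zip_code_database.csv",
                   dtype={"zip": str})      # columns: zip, county, state
fips = pd.read_csv("census_county_fips.csv",
                   dtype={"fips": str})     # columns: county, state, fips

merged = zips.merge(fips, on=["county", "state"], how="left")
unmatched = merged[merged["fips"].isna()]
print(f"mapped {len(merged) - len(unmatched)} of {len(merged)} zip codes")
```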

I suspect Neil would appreciate your assistance.

I first saw this at Pete Warden’s Five Short Links.

October 22, 2012

Accountability = “unintended consequences”? [Benghazi Cables]

Filed under: Government,Government Data,Topic Maps,Transparency — Patrick Durusau @ 1:43 pm

House Oversight Committee Chairman Darrell Issa (R-Calif.) is reported by the Huffington Post to have released “sensitive but unclassified” State Department cables that contained the names of Libyans working within the United States. (Benghazi Consulate Attack: Darrell Issa Releases Raw Libya Cables, Obama Administration Cries Foul)

Acrobat Reader says there are 121 pages in:

State Department Cables – Benghazi, Libya (created last Friday morning)

Not sure what that means.

What does the State Department mean by “unintended consequences”?

Do they mean…

  • Libyan or U.S. nationals may be held accountable for crimes in the U.S. or other countries?
  • consequences for Libyans who are working against the interest of their fellow Libyans?
  • consequences for Libyans who are favoring their friends and families in Libya, at the expense of other Libyans?
  • consequences for Libyans currying favor with the U.S. State Department?

If there are “unintended consequences,” it may be they are being held accountable for their actions.

Being held accountable is probably the reason the State Department shuns transparency.

Both for themselves and others.

Would mapping the Benghazi cables bring the House Oversight Committee closer to holding someone accountable for that attack?

October 20, 2012

US presidential election fundraising: help us explore the FEC data

Filed under: FEC,Government,Government Data — Patrick Durusau @ 4:22 pm

US presidential election fundraising: help us explore the FEC data by Simon Rogers.

From the post:

Interactive: Which candidate has raised the most cash? Where do the donors live? Find your way around the latest data from the Federal Election Commission with this interactive graphic by Craig Bloodworth at the Information Lab and Andy Cotgreave of Tableau.

  • What can you find in the data? Let us know in the comments below

Being able to track donations against who gets face time with the president would be more helpful.

Would enable potential donors to gauge how much to donate for X amount of face time.

Until then, practice with this data.

October 18, 2012

Waste Book 2012 [ > 1,000 Footnote Islands ]

Filed under: Government,Government Data,Marketing,Topic Maps — Patrick Durusau @ 10:43 am

Waste Book 2012 by Sen. Tom Coburn, M.D. (PDF file)

Senator Coburn, is a government pork/waste gadfly in the United States Senate.

Often humorous descriptions call attention to many programs or policies that appear to be pure waste.

I say “appear to be pure waste” because Senator Coburn’s reports are islands of commentary, in a sea crowded with such islands.

There is no opportunity to “connect the dots” with additional information, such as rebuttals, changes in agency policy or practices, or even the personnel responsible for the alleged waste.

Imagine a football (U.S. or European) stadium where every fan has a bull horn and is shouting their description of each play. That is the current status of reporting about issues in the U.S. federal government.

Senator Coburn’s latest report may be described in several thousand news publications, but other than its being issued, that group of shouts should be reduced to 1. The rest are just duplicative noise.

The Waste Book tries to do better than conservative talk radio or its imagined “liberal” press foe. The Waste Book cites sources for the claims that it makes. Over 1,000 footnote islands.

“Islands” because, like the Waste Book, they aren’t easy to connect with other information. Or to debate those connections.

Every increase in connection difficulty increases the likelihood of non-verification/validation. That is, you will just take their word for it.

The people who possess information realize that.

Why do you think government reports appear as nearly useless PDF files? Or why media stories, even online, are leaden lumps of information that quickly sink in the sea of media shouting?

Identifiable someones, want you to “take their word” for any number of things.

They are counting on your job, family, and life in general leaving too little time for any other answer.

How would you like to disappoint them?

(More to follow on capturing information traffic between footnote “islands” and how to leverage it for yourself and others.)
