## Archive for the ‘Government Data’ Category

### US rendition map: what it means, and how to use it

Wednesday, May 22nd, 2013

US rendition map: what it means, and how to use it by James Ball.

From the post:

The Rendition Project, a collaboration between UK academics and the NGO Reprieve, has produced one of the most detailed and illuminating research projects shedding light on the CIA’s extraordinary rendition project to date. Here’s how to use it.

Truly remarkable project to date, but could be even more successful with your assistance.

Not likely that any of the principals will wind up in the dock at the Hague.

On the other hand, exposing their crimes may deter others from similar adventures.

### U.S. Senate Panel Discovers Nowhere Man [Apple As Tax Dodger]

Monday, May 20th, 2013

Forty-seven years after Nowhere Man by the Beatles, a U.S. Senate panel discovers several nowhere men.

A Wall Street Journal Technology Alert:

Apple has set up corporate structures that have allowed it to pay little or no corporate tax–in any country–on much of its overseas income, according to the findings of a U.S. Senate examination.

The unusual result is possible because the iPhone maker’s key foreign subsidiaries argue they are residents of nowhere, according to the investigators’ report, which will be discussed at a hearing Tuesday where Apple CEO Tim Cook will testify. The finding comes from a lengthy investigation into the technology giant’s tax practices by the Senate Permanent Subcommittee on Investigations, led by Sens. Carl Levin (D., Mich.) and John McCain (R., Ariz.).

In additional coverage, Apple says:

Apple’s testimony also includes a call to overhaul: “Apple welcomes an objective examination of the US corporate tax system, which has not kept pace with the advent of the digital age and the rapidly changing global economy.”

Tax reform will be useful only if it is “transparent” tax reform.

Transparent tax reform means every provision with more than a $100,000 impact on any taxpayer names all the taxpayers impacted. Whether more or less taxes.

We have the data, we need the will to apply the analysis.

A tax-impact topic map anyone?

### UNESCO Publications and Data (Open Access)

Sunday, May 19th, 2013

UNESCO to make its publications available free of charge as part of a new Open Access policy

From the post:

The United Nations Education Scientific and Cultural Organisation (UNESCO) has announced that it is making available to the public free of charge its digital publications and data. This comes after UNESCO has adopted an Open Access Policy, becoming the first agency within the United Nations to do so.

The new policy implies that anyone can freely download, translate, adapt, and distribute UNESCO’s publications and data. The policy also states that from July 2013, hundreds of downloadable digital UNESCO publications will be available to users through a new Open Access Repository with a multilingual interface. The policy seeks also to apply retroactively to works that have been published.

There’s a treasure trove of information for mapping, say against the New York Times historical archives.

If presidential libraries weren’t concerned with helping former administration officials avoid accountability, digitizing presidential libraries for complete access would be another great treasure trove.

### Open Data and Wishful Thinking

Saturday, May 18th, 2013

BLM Fracking Rule Violates New Executive Order on Open Data by Sofia Plagakis.

From the post:

Today, the U.S. Department of the Interior’s Bureau of Land Management (BLM) released its revised proposed rule for natural gas drilling (commonly referred to as fracking) on federal and tribal lands. The much-anticipated rule violates President Obama’s recently issued executive order that requires new government information to be made available to the public in open, machine-readable formats.

Last week, President Obama signed an executive order requiring that all newly generated public data be pushed out in open, machine-readable formats. Concurrently, the Office of Management and Budget (OMB) and the Office of Science and Technology Policy (OSTP) released an Open Data Policy designed to make previously unavailable government data accessible to entrepreneurs, researchers, and the public.

The executive order and accompanying policy must have been in development for months, and agencies, including BLM, should have been fully aware of the new policy. But instead of establishing a modern example of government information collection and sharing, BLM’s proposed rule would allow drilling companies to report the chemicals used in fracking to a third-party, industry-funded website, called FracFocus.org, which does not provide data in machine-readable formats. FracFocus.org only allows users to download PDF files of reports on fracked wells. Because PDF files are not machine-readable, the site makes it very difficult for the public to use and analyze data on wells and chemicals that the government requires companies to collect and make available.

I wonder if Sofia simply overlooked:

When implementing the Open Data Policy, agencies shall incorporate a full analysis of privacy, confidentiality, and security risks into each stage of the information lifecycle to identify information that should not be released. These review processes should be overseen by the senior agency official for privacy. It is vital that agencies not release information if doing so would violate any law or policy, or jeopardize privacy, confidentiality, or national security. [From “We won’t get fooled again…”]

Or if her “…requires new government information to be made available to the public in open, machine-readable formats” is wishful thinking?

The Obama administration just released the Benghazi emails in PDF format.

So we have an example of the White House violating its own “open data” policy.

We don’t need more “open data.”

What we need are more leakers. A lot more leakers.

Just be sure you leak or pass on leaks in “open, machine-readable formats.”

The foreign adventures, environmental pollution, failures in drug or food safety, etc., avoided by leaks may save your life, the lives of your children or grandchildren.

Leak today!

### Open Government and Benghazi Emails

Thursday, May 16th, 2013

The controversy over the “Benghazi emails” is a good measure of what the Obama Administration means by “open government.”

News of the release of the Benghazi emails broke yesterday, NPR, USA Today, among others.

I saw the news at Benghazi Emails Released, Wall Street Journal.

PDF of the emails

If you go to WhiteHouse.gov and search for “Benghazi emails,” can you find the White House release of the emails?

I thought not.

The emails show congressional concern over the “talking points” on Benghazi to be a tempest in a teapot, as many of us already suspected.

Early release of the emails would have avoided some of the endless discussion rooted in congressional ignorance and bigotry.

But, the Obama administration has so little faith in “open government” that it conceals information that would be to its advantage if revealed.

Now imagine how the Obama administration must view information that puts it at a disadvantage.

Does that help to clarify the commitment of the Obama administration to open government?

It does for me.

### Search Nonprofit Tax Forms

Friday, May 10th, 2013

ProPublica Launches Online Tool to Search Nonprofit Tax Forms by Doug Donovan.

From the post:

The investigative-journalism organization ProPublica started a free online service today for searching the federal tax returns of more than 615,000 nonprofits.

ProPublica began building its Nonprofit Explorer tool on its Web site shortly after the Internal Revenue Service announced in April that it was making nonprofit tax returns available in a digital, searchable format.

ProPublica’s database provides nonprofit Form 990 information free back to 2001, including executive compensation, total revenue, and other critical financial data.

Scott Klein, editor of news applications at ProPublica, said Nonprofit Explorer is not meant to replace GuideStar, the most familiar online service for searching nonprofit tax forms. Many search results on Nonprofit Explorer also offer links to GuideStar data.

“They have a much richer tool set,” Mr. Klein said.

For now, Nonprofit Explorer does not include the tax forms filed by private foundations but is expected to do so in a future update.

I guess copy limitations prevented reporting the URL for ProPublica’s Nonprofit Explorer.

Another place to look for smoke even if you are unlikely to find fire.

### “We won’t get fooled again…”

Friday, May 10th, 2013

Landmark Steps to Liberate Open Data

There is no shortage of discussion of President Obama’s executive order that is alleged to result in greater access to government data.

Except then you read:

Agencies shall implement the requirements of the Open Data Policy and shall adhere to the deadlines for specific actions specified therein. When implementing the Open Data Policy, agencies shall incorporate a full analysis of privacy, confidentiality, and security risks into each stage of the information lifecycle to identify information that should not be released. These review processes should be overseen by the senior agency official for privacy. It is vital that agencies not release information if doing so would violate any law or policy, or jeopardize privacy, confidentiality, or national security.

Gee, I wonder who is going to decide what information gets released?

How would we know when “open data” efforts succeed?

Here’s my test:

When ordinary citizens can mine open data and their complaints result in the arrest and conviction of public officials or government staff.

Unless and until that sort of information is public data, you are being distracted from important data by platitudes and flattery.

### Free Government Data… [Handicapping Congress?]

Wednesday, May 8th, 2013

From the post:

The Sunlight Foundation is expanding its free data services with a new website – http://sunlightfoundation.com/api/ – to access our open government APIs.

We offer APIs (a.k.a. application programming interfaces) for a number of our projects and tools and support a community of developers who create their own projects using this data. Nonprofit organizations, political campaigns and media outlets use our collection of APIs, which cover topics such as the Congressional Record, lobbying records and state legislation.

More than 7,000 people have registered for an API key, resulting in over 735 million API calls to date. Greenpeace uses congressional information available through Sunlight APIs on its activist tools, and the Wikimedia Foundation used Sunlight APIs to help people connect with their lawmakers in Congress during the SOPA debate last year. Those using Sunlight APIs run across the political spectrum, from the Obama-Biden campaign to the Tea Party Patriots.

From the API page:

Capitol Words API

The Capitol Words API is an API allowing access to the word frequency count data powering the Capitol Words project.

Congress API v3 API

A live JSON API for the people and work of Congress. Information on legislators, districts, committees, bills, votes, as well as real-time notice of hearings, floor activity and upcoming bills.

Influence Explorer API

The Influence Explorer API gives programmers and journalists the ability to easily create subsets of large data for their own research and development purposes. The API currently offers campaign contributions and lobbying records with more data sets coming soon.

Open States API

Information on the legislators and activities of all 50 state legislatures, Washington, D.C. and Puerto Rico.

Political Party Time API

Provides access to the underlying, raw data that the Sunlight Foundation creates based on fundraising invitations collected in Party Time. As we enter information on new invitations, the database updates automatically.

Commercial opportunity: The Sunlight Foundation data is a start towards public handicapping of members of Congress for votes on legislation.

### Povcalnet – World Bank Poverty Stats

Sunday, May 5th, 2013

I’m surprised some Republican in the U.S. House or Senate isn’t citing Povcalnet as evidence there is no poverty in the United States.

The trick of course is in how you define “poverty.”

The World Bank uses $1, $1.25 and $2.00 a day as poverty lines.
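
To make the point about definitions concrete, here is a minimal sketch of how the share of people counted as poor moves with the line you pick. The incomes are made up for illustration and are not Povcalnet data; `headcount_ratio` and `SAMPLE_INCOMES` are hypothetical names, and a real poverty measure involves price adjustments this sketch ignores.

```python
def headcount_ratio(daily_incomes, poverty_line):
    """Share of people whose daily income falls below the poverty line."""
    if not daily_incomes:
        raise ValueError("need at least one income")
    poor = sum(1 for income in daily_incomes if income < poverty_line)
    return poor / len(daily_incomes)


# Made-up daily incomes in dollars, for illustration only (not Povcalnet data).
SAMPLE_INCOMES = [0.80, 0.95, 1.10, 1.20, 1.60, 1.90, 2.50, 4.00, 9.00, 30.00]

# The three poverty lines named in the post.
for line in (1.00, 1.25, 2.00):
    ratio = headcount_ratio(SAMPLE_INCOMES, line)
    print(f"${line:.2f}/day line: {ratio:.0%} counted as poor")
```

Same people, same incomes, three different answers. The classification lives in the line, not in the data.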

While there is widespread global hunger and disease, is income sufficient to participate in the global economy really the best measure for poverty?

If the documentaries are to be believed, there are tribes of Indians who live in the rain forests of Brazil, quite healthily, without any form of money at all.

They are not buying iPods with foreign music to replace their own but that isn’t being impoverished. Is it?

There is the related issue that someone else is classifying people as impoverished.

I wonder how they would classify themselves?

Statistics could be made more transparent through the use of topic maps.

### Spring Cleaning Data: 1 of 6… [Federal Reserve]

Tuesday, April 9th, 2013

Spring Cleaning Data: 1 of 6 – Downloading the Data & Opening Excel Files

From the post:

With spring in the air, I thought it would be fun to do a series on (spring) cleaning data. The posts will follow my efforts to download the data, import into R, cleaned it up, merge the different files, add columns of information created, and then a master file exported. During the process I will be offering at times different ways to do things, this is an attempt to show how there is no one way of doing something, but there are several. When appropriate I will demonstrate as many as I can think of, given the data.

This series of posts will be focusing on the Discount Window of the Federal Reserve. I know I seem to be picking on the Feds, but I am genuinely interested in what they have. The fact that there is data on the discount window is, to be blunt, took legislation from congress to get. The first step in this project was to find the data. The data and additional information can be downloaded here.

I don’t have much faith in government data but if you are going to debate on the “data,” such as it is, you will need to clean it up and combine it with other data.

This is a good start in that direction for data from the Federal Reserve.

If you are interested in data from other government agencies, publishing the steps needed to clean/combine their data would move everyone forward.

A topic map of cleaning directions for government data could be a useful tool.

Not that clean data = government transparency but it might make it easier to spot the shadows.

### Splitting a Large CSV File into…

Monday, April 8th, 2013

From the post:

One of the problems with working with data files containing tens of thousands (or more) rows is that they can become unwieldy, if not impossible, to use with “everyday” desktop tools. When I was Revisiting MPs’ Expenses, the expenses data I downloaded from IPSA (the Independent Parliamentary Standards Authority) came in one large CSV file per year containing expense items for all the sitting MPs.

In many cases, however, we might want to look at the expenses for a specific MP. So how can we easily split the large data file containing expense items for all the MPs into separate files containing expense items for each individual MP? Here’s one way using a handy little R script in RStudio

Just because data is “open,” doesn’t mean it will be easy to use. (Leaving the question of whether it is useful to one side.)
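
The quoted post does the split with an R script in RStudio. For comparison, here is a rough Python sketch of the same idea, not the original author's method: group the rows by one column and write a file per value. The function name and the column name you pass in are assumptions; check the header of the actual IPSA file before using it.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_csv_by_column(source, column, out_dir):
    """Write one CSV per distinct value of `column`; return {value: path}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Holds every row in memory, which is fine for tens of thousands of rows.
    groups = defaultdict(list)
    with open(source, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames
        if fieldnames is None or column not in fieldnames:
            raise ValueError(f"column {column!r} not found in {source}")
        for row in reader:
            groups[row[column]].append(row)

    written = {}
    for value, rows in groups.items():
        # Replace anything that is not a letter or digit so the value is a safe
        # file name. Two values that differ only in punctuation would collide.
        safe = "".join(c if c.isalnum() else "_" for c in value) or "blank"
        path = out_dir / f"{safe}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written[value] = path
    return written
```

Calling `split_csv_by_column("expenses.csv", "MP", "by_mp")` would leave one file per MP in `by_mp/`, assuming the file has a column literally named `MP`.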

We have been kicking around ideas for a “killer” topic map application.

What about a plug-in for a browser that recognizes file types and suggests tools for processing them?

I am unlikely to remember this post a year from now when I have a CSV file from some site.

But if a browser plugin recognized the extension, .csv, and suggested a list of tools for exploring it….

Particularly if the plug-in called upon some maintained site of tools, so the list of tools is maintained.

Or for that matter, that it points to other data explorers who have examined the same file (voluntary disclosure).
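
The core of such a plug-in is small. Here is a minimal sketch, assuming a registry keyed by file extension; `TOOL_REGISTRY`, its entries, and `suggest_tools` are all hypothetical, and in the idea above the registry would be pulled from a maintained site rather than hard-coded.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical, hard-coded stand-in for a registry that a maintained site
# would publish. The entries are illustrative suggestions only.
TOOL_REGISTRY = {
    ".csv": ["RStudio (R)", "LibreOffice Calc", "Python csv module"],
    ".json": ["jq", "Python json module"],
}


def suggest_tools(url, registry=TOOL_REGISTRY):
    """Return suggested tools for the file type a URL points at."""
    # Look only at the path, so query strings do not hide the extension.
    path = PurePosixPath(urlparse(url).path)
    return list(registry.get(path.suffix.lower(), []))


print(suggest_tools("https://example.org/data/expenses-2012.CSV?download=1"))
```

Keying on the extension is the crudest possible recognizer. Sniffing the content type, or recording who else has explored the same file, would be layered on top of the same lookup.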

Not the full monty of topic maps but a start towards collectively enhancing our experience with data files.

### USPTO – New Big Data App [Value-Add Opportunity]

Monday, April 1st, 2013

U.S. Patent and Trademark Office Launches New Big Data Application on MarkLogic®

From the post:

Real-Time, Granular, Online Access to Complex Manuals Improves Efficiency and Transparency While Reducing Costs

MarkLogic Corporation, the provider of the MarkLogic® Enterprise NoSQL database, today announced that the U.S. Patent and Trademark Office (USPTO) has launched the Reference Document Management Service (RDMS), which uses MarkLogic for real-time searching of detailed, specific, up-to-date content within patent and trademark manuals. RDMS enables real-time search of the Manual of Patent Examining Procedure (MPEP) and the Trademark Manual of Examination Procedures (TMEP). These manuals provide a vital window into the complexities of U.S. patent and trademark laws for inventors, examiners, businesses, and patent and government attorneys.

The thousands of examiners working for USPTO need to be able to quickly locate relevant instructions and procedures to assist in their examinations. The RDMS is enabling faster, easier searches for these internal users.

Having the most current materials online also means that the government can reduce reliance on printed manuals that quickly go out of date. USPTO can also now create and publish revisions to its manuals more quickly, allowing them to be far more responsive to changes in legislation.

Additionally, for the first time ever, the tool has also been made available to the public increasing the MPEP and TMEP accessibility globally, furthering the federal government’s efforts to promote transparency and accountability to U.S. citizens. Patent creators and their trusted advisors can now search and reference the same content as the USPTO examiners, in real time — instead of having to thumb through a printed reference guide.

The date on this report was March 26, 2013.

I don’t know if the USPTO is just playing games but searching their site for “Reference Document Management Service” produces zero “hits.”

Searching for “RDMS” produces four (4) “hits,” none of which were pointers to an interface.

Maybe it was too transparent?

The value-add proposition I was going to suggest was mapping the results of searching into some coherent presentation, like TaxMap.

And/or linking the results of searches into current literature in rapidly developing fields of technology.

Guess both of those opportunities will have to wait for basic searching to be available.

If you have a status update on this announced but missing project please ping me.

### Open Data for Africa Launched by AfDB

Thursday, March 28th, 2013

Open Data for Africa Launched by AfDB

From the post:

The African Development Bank Group has recently launched the ‘Open Data for Africa’ as part of the bank’s goal to improve data management and dissemination in Africa. The Open Data for Africa is a user friendly tool for extracting data, creating and sharing own customized reports, and visualising data across themes, sectors and countries in tables, charts and maps. The platform currently holds data from 20 African countries: Algeria, Cameroon, Cape Verde, Democratic Republic of Congo, Ethiopia, Malawi, Morocco, Mozambique, Namibia, Nigeria, Ghana, Rwanda, Republic of Congo, Senegal, South Africa, South Sudan, Tanzania, Tunisia, Zambia and Zimbabwe.

Not a lot of resources but a beginning.

One trip to one country isn’t enough to form an accurate opinion of a continent but I must report my impression of South Africa from several years ago.

I was at a conference with mid-level government and academic types for a week.

In a country where “child head of household” is a real demographic category, I came away deeply impressed with the optimism of everyone I met.

You can just imagine the local news in the United States and/or Europe if a quarter of the population was dying.

Vows to “…never let this happen again…,” blah, blah, would choke the channels.

Not in South Africa. They readily admit to having a variety of serious issues but are equally serious about developing ways to meet those challenges.

If you want to see optimism in the face of stunning odds, I would strongly recommend a visit.

### Lobbyists 2012: Out of the Game or Under the Radar?

Sunday, March 24th, 2013

Lobbyists 2012: Out of the Game or Under the Radar?

Executive Summary:

Over the past several years, both spending on lobbying and the number of active lobbyists has declined. A number of factors may be responsible, including the lackluster economy, a gridlocked Congress and changes in lobbying rules.

CRP finds that the biggest players in the influence game — lobbying clients across nearly all sectors — increased spending over the last five years. The top 100 lobbying firms income declined only 6 percent between 2007 and 2012 but the number of registered lobbyists dropped by 25 percent.

The more precipitous drop in the number of lobbyists is likely due to changes in the rules. More than 46 percent of lobbyists who were active in 2011 but not in 2012 continue to work for the same employers, suggesting that many have simply avoided the reporting limits while still contributing to lobbying efforts.

Whatever the cause, it is important to understand whether the same activity continues apace with less disclosure and to strengthen the disclosure regimen to ensure that it is clear, enforceable — and enforced. If there is a general sense that the rules don’t matter, there could be erosion to disclosure and a sense that this is an “honor system” that isn’t being honored any longer. This is important because, if people who are in fact lobbying do not register, citizens will be unable to understand the forces at work in shaping federal policy, and therefore can’t effectively participate in policy debates and counter proposals that are not in their interest. At a minimum, the Center for Responsive Politics will continue to aggregate, publish and scrutinize the data that is being reported, in order to explain trends in disclosure — or its omission.

A caution on relying on public records/disclosure for topic maps of political influence.

You can see the full report here.

My surprise was the discovery that:

[the] “honor system” that isn’t being honored any longer.

Lobbying for private advantage at public expense is contrary to any notion of “honor.”

Why the surprise that lobbyists are dishonorable? (However faithful they may be to their employers. Once bought, they stay bought.)

I first saw this at Full Text Reports.

### Open Data: The World Bank Data Blog

Wednesday, March 20th, 2013

Open Data: The World Bank Data Blog

In case you are following open data/government issues, you will want to add this blog to your RSS feed.

Not a high traffic blog but with twenty-seven contributing authors, you get a diversity of viewpoints.

Not to mention that the World Bank is a great source for general data.

I persist in thinking that transparency means identifying individuals responsible for decisions, expenditures and the beneficiaries of those decisions and expenditures.

That isn’t a popular position among those who make decisions and approve expenditures for unidentified beneficiaries.

You will either have to speculate on your own or ask someone else why that is an unpopular position.

### The Biggest Failure of Open Data in Government

Monday, March 18th, 2013

From the post:

In the past few years we’ve seen a huge shift in the way governments publish information. More and more governments are proactively releasing information as raw open data rather than simply putting out reports or responding to requests for information. This has enabled all sorts of great tools like the ones that help us find transportation or the ones that let us track the spending and performance of our government. Unfortunately, somewhere in this new wave of open data we forgot some of the most fundamental information about our government, the basic “who”, “what”, “when”, and “where”.

Do you know all the different government bodies and districts that you’re a part of? Do you know who all your elected officials are? Do you know where and when to vote or when the next public meeting is? Now perhaps you’re thinking that this information is easy enough to find, so what does this have to do with open data? It’s true, it might not be too hard to learn about the highest office or who runs your city, but it usually doesn’t take long before you get lost down the rabbit hole. Government is complex, particularly in America where there can be a vast multitude of government districts and offices at the local level.

How can we have a functioning democracy when we don’t even know the local government we belong to or who our democratically elected representatives are? It’s not that Americans are simply too ignorant or apathetic to know this information, it’s that the system of government really is complex. With what often seems like chaos on the national stage it can be easy to think of local government as simple, yet that’s rarely the case. There are about 35,000 municipal governments in the US, but when you count all the other local districts there are nearly 90,000 government bodies (US Census 2012) with a total of more than 500,000 elected officials (US Census 1992). The average American might struggle to name their representatives in Washington D.C., but that’s just the tip of the iceberg. They can easily belong to 15 government districts with more than 50 elected officials representing them.

We overlook the fact that it’s genuinely difficult to find information about all our levels of government. We unconsciously assume that this information is published on some government website well enough that we don’t need to include it as part of any kind of open data program.

Yes, the number of subdivisions of government and the number of elected officials are drawn from two different census reports, the first from the 2012 census and the second from the 1992 census, a gap of twenty (20) years.

The Census bureau has the 1992 list, saying:

1992 (latest available) 1992 Census of Governments vol. I no. 2 [PDF, 2.45MB] * Report has been discontinued

Makes me curious why such a report would be discontinued?

A report that did not address the various agencies, offices, etc. that are also part of various levels of government.

Makes me think you need an “insider” and/or a specialist just to navigate the halls of government.

Philip’s post illustrates that “open data” dumps from government are distractions from more effective questions of open government.

Questions such as:

• Which officials have authority over what questions?
• How to effectively contact those officials?
• What actions are under consideration now?
• Rules and deadlines for comments on actions?
• Hearing and decision calendars?
• Comments and submissions by others?
• etc.

It never really is “…the local board of education (substitute your favorite board) decided….” but “…members A, B, D, and F decided that….”

Transparency means not allowing people and their agendas to hide behind the veil of government.

### “Mixed Messages” on Cybersecurity [China ranks #12 among cyber-attackers]

Thursday, March 14th, 2013

Do you remember the “mixed messages” Dilbert cartoon?

Mixed Messages

Where an “honest” answer meant “mixed messages?”

I had that feeling this morning when I read Mark Rockwell’s post, German telecom company provides real-time map of Cyber attacks.

From the post:

In hopes of blunting mounting electronic assaults, a German telecommunications carrier unveiled a free online capability that shows where Cyber attacks are happening around the world in real time.

Deutsche Telekom, parent company of T-Mobile, put up what it calls its “Security dashboard” portal on March 6. The map, said the company, is based on attacks on its purpose-built network of decoy “honeypot” systems at 90 locations worldwide.

Deutsche Telekom said it launched the online portal at the CeBIT telecommunications trade show in Hanover, Germany, to increase the visibility of advancing electronic threats.

“New cyber attacks on companies and institutions are found every day. Deutsche Telekom alone records up to 450,000 attacks per day on its honeypot systems and the number is rising. We need greater transparency about the threat situation. With its security radar, Deutsche Telekom is helping to achieve this,” said Thomas Kremer, board member responsible for Data Privacy, Legal Affairs and Compliance.

Which has a handy chart of the sources of attacks over the last month:

Top 15 of Source Countries (Last month)

| Source of Attack | Number of Attacks |
| --- | --- |
| Russian Federation | 2,402,722 |
| Taiwan, Province of China | 907,102 |
| Germany | 780,425 |
| Ukraine | 566,531 |
| Hungary | 367,966 |
| United States | 355,341 |
| Romania | 350,948 |
| Brazil | 337,977 |
| Italy | 288,607 |
| Australia | 255,777 |
| Argentina | 185,720 |
| China | 168,146 |
| Poland | 162,235 |
| Israel | 143,943 |
| Japan | 133,908 |

By measured “attacks,” the geographic location of China (not the Chinese government) is #12 as an origin of cyber-attacks.
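
If you want to check the ranking rather than take my word for it, the chart is small enough to type in. A minimal sketch follows; the counts are copied from the chart above, while `rank_of` and `share_of` are helper names I made up.

```python
# Attack counts copied from the Deutsche Telekom "Top 15" chart quoted above.
ATTACKS_LAST_MONTH = {
    "Russian Federation": 2_402_722,
    "Taiwan, Province of China": 907_102,
    "Germany": 780_425,
    "Ukraine": 566_531,
    "Hungary": 367_966,
    "United States": 355_341,
    "Romania": 350_948,
    "Brazil": 337_977,
    "Italy": 288_607,
    "Australia": 255_777,
    "Argentina": 185_720,
    "China": 168_146,
    "Poland": 162_235,
    "Israel": 143_943,
    "Japan": 133_908,
}


def rank_of(country, counts):
    """1-based position of `country` when sorted by descending attack count."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(country) + 1


def share_of(country, counts):
    """Fraction of all listed attacks attributed to `country`."""
    return counts[country] / sum(counts.values())


print(rank_of("China", ATTACKS_LAST_MONTH))
print(f"{share_of('China', ATTACKS_LAST_MONTH):.1%} of the top-15 total")
```

The share is computed against the fifteen listed countries only, since that is all the chart reports.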

After Russia, Taiwan (Province of China), Germany, Ukraine, Hungary, United States, and others.

Just in case you missed several recent news cycles, the Chinese government was being singled out as a cyber-attacker for policy or marketing reasons that are not clear.

This service makes the specious nature of those accusations apparent, although the motivations behind the reports remain unclear.

Before you incorporate any government data or report into a topic map, you should verify the information with at least two independent sources.

### Man Bites Dog Story (EU Interest Groups and Legislation)

Friday, March 8th, 2013

Interest groups and the making of legislation

From the post:

How are the activities of interest groups related to the making of legislation? Does mobilization of interest groups lead to more legislation in the future? Alternatively, does the adoption of new policies motivate interest groups to get active? Together with Dave Lowery, Brendan Carroll and Joost Berkhout, we tackle these questions in the case of the European Union. What we find is that there is no discernible signal in the data indicating that the mobilization of interest groups and the volume of legislative production over time are significantly related. Of course, absence of evidence is not the same as evidence of absence, so a link might still exist, as suggested by theory, common wisdom and existing studies of the US (e.g. here). But using quite a comprehensive set of model specifications we can’t find any link in our time-series sample. The abstract of the paper is below and as always you can find at my website the data, the analysis scripts, and the pre-print full text. On a side-note – I am very pleased that we managed to publish what is essentially a negative finding. As everyone seems to agree, discovering which phenomena are not related might be as important as discovering which phenomena are. Still, there are few journals that would apply this principle in their editorial policy. So kudos for the journal of Interest Groups and Advocacy.

Abstract
Different perspectives on the role of organized interests in democratic politics imply different temporal sequences in the relationship between legislative activity and the influence activities of organized interests. Unfortunately, lack of data has greatly limited any kind of detailed examination of this temporal relationship. We address this problem by taking advantage of the chronologically very precise data on lobbying activity provided by the door pass system of the European Parliament and data on EU legislative activity collected from EURLEX. After reviewing the several different theoretical perspectives on the timing of lobbying and legislative activity, we present a time-series analysis of the co-evolution of legislative output and interest groups for the period 2005-2011. Our findings show that, contrary to what pluralist and neo-corporatist theories propose, interest groups neither lead nor lag bursts in legislative activity in the EU.

You can read an earlier version of the paper at: Timing is Everything? Organized Interests and the Timing of Legislative Activity. (I say earlier version because the title is the same but the abstract is slightly different.)

Just a post or so ago, Untangling algorithmic illusions from reality in big data, the point was made that biases in data collection can make a significant difference in results.

The “negative” finding in this paper is an example of that hazard.

From the paper:

The European Parliament maintains a door pass system for lobbyists. Everyone entering the Parliament’s premises as a lobbyist is expected to register on this list ….

Now there’s a serious barrier to any special interest group that wants to influence the EU Parliament!

Certainly no special interest group would be so devious and under-handed as to meet with members of the EU Parliament away from the Parliament’s premises.

Say, in exotic vacation spots/spas? Or at meetings of financial institutions? Or just in the normal course of their day to day affairs?

The U.S. registers lobbyists, but like the EU “hall pass” system, it is the public side of influence.

People with actual influence don’t have to rely on anything as crude as lobbyists to ensure their goals are met.

The data you collect may exclude the most important data.

Unless it is your goal for it to be excluded, then carry on.

### State Sequester Numbers [Is This Transparency?]

Wednesday, March 6th, 2013

A great visualization of the impact of sequestration state by state.

And, a post on the process followed to produce the visualization.

The only caveat being that one person read the numbers from PDF files supplied by the White House and another person typed them into a spreadsheet.

Doable with a small data set such as this one, but why was it necessary at all?

Once you have the data in a machine readable form, putting faces in the local community to the abstract categories should be the next step.
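To make the point about machine readable form concrete, here is a minimal sketch, assuming the figures had been released as CSV rather than PDF. The column names (`state`, `category_x`) and the values are placeholders invented for illustration; they are not the White House numbers.

```python
import csv
import io

def load_state_figures(csv_text):
    """Parse CSV text into {state: {category: integer value}}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    figures = {}
    for row in reader:
        # Every column other than "state" is treated as a numeric category.
        figures[row["state"]] = {
            name: int(value) for name, value in row.items() if name != "state"
        }
    return figures

# Placeholder data, NOT actual sequestration figures.
SAMPLE = "state,category_x\nState A,100\nState B,250\n"

figures = load_state_figures(SAMPLE)
for state, categories in figures.items():
    print(state, categories)
```

No one reads numbers aloud and no one retypes them. That is the whole argument for publishing the spreadsheet in the first place.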

Topic maps anyone?

### Transparency and the Digital Oil Drop

Tuesday, March 5th, 2013

I left off yesterday pointing out three critical failures in the Digital Accountability and Transparency Act (DATA Act).

Those failures were:

• Undefined goals with unrealistic deadlines.
• Lack of incentives for performance.
• Lack of funding for assigned duties.

Digital Accountability and Transparency Act (DATA Act) [DOA]

Make no mistake, I think transparency, particularly in government spending is very important.

Important enough that proposals for transparency should take it seriously.

In broad strokes, here is my alternative to the Digital Accountability and Transparency Act (DATA Act) proposal:

• Ask the GAO, the federal agency with the most experience auditing other federal agencies, to prepare an estimate for:
  • Cost/Time for preparing a program internal to the GAO to produce mappings of agency financial records to a common report form.
  • Cost/Time to train GAO personnel on the mapping protocol.
  • Cost/Time for additional GAO staff for the creation of the mapping protocol and permanent GAO staff as liaisons with particular agencies.
  • Recommendations for incentives to promote assistance from agencies.
• Upon approval and funding of the GAO proposal, which should include at least two federal agencies as test cases, that:
  • Test case agencies are granted additional funding for training and staff to cooperate with the GAO mapping team.
  • Test case agencies are granted additional funding for training and staff to produce reports as specified by the GAO.
  • Staff in test case agencies are granted incentives to assist in the initial mapping effort and maintenance of the same. (Positive incentives.)
• The program of mapping of accounts expands no more often than every two to three years and only if prior agencies have achieved and remain in conformance.
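What “mappings of agency financial records to a common report form” might look like can be sketched in a few lines. This is only an illustration under stated assumptions: the agency names, the agency field names, and the common form fields are all hypothetical, not taken from the GAO or any real agency system.

```python
# Hypothetical common report form (not an actual GAO form).
COMMON_FIELDS = ("award_id", "amount_usd", "recipient")

# Hypothetical per-agency mappings: agency field name -> common field name.
# Each agency keeps its own system; only the mapping is maintained centrally.
AGENCY_MAPPINGS = {
    "agency_a": {"AwardNo": "award_id", "Dollars": "amount_usd", "Payee": "recipient"},
    "agency_b": {"grant_ref": "award_id", "amt": "amount_usd", "vendor_name": "recipient"},
}

def to_common_form(agency, record):
    """Translate one agency record into the common report form."""
    mapping = AGENCY_MAPPINGS[agency]
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = [field for field in COMMON_FIELDS if field not in common]
    if missing:
        # A gap in the mapping is reported, not silently papered over.
        raise ValueError(f"{agency} record lacks common fields: {missing}")
    return common

print(to_common_form("agency_a", {"AwardNo": "X-1", "Dollars": 10, "Payee": "P"}))
```

The design choice worth noticing is that agencies do not change their existing systems. The mapping carries the burden, which is why the proposal puts trained GAO staff and agency liaisons, rather than a one-year format mandate, at the center.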

Some critical differences between my sketch of a proposal and the Digital Accountability and Transparency Act (DATA Act):

1. Additional responsibilities and requirements will be funded for agencies, including additional training and personnel.
2. Agency staff will have incentives to learn the new skills and procedures necessary for exporting their data as required by the GAO.
3. Instead of trying to swallow the Federal whale, the project proceeds incrementally and with demonstrable results.

Topic maps can play an important role in such a project but we should be mindful that projects rarely succeed or fail because of technology.

Projects fail because, like the DATA Act, they ignore basic human needs, experience in similar situations (9/11), and substitute abuse for legitimate incentives.

### Digital Accountability and Transparency Act (DATA Act) [DOA]

Monday, March 4th, 2013

I started this series of posts in: Digital Accountability and Transparency Act (DATA Act) [The Details], where I concluded the DATA Act had the following characteristics:

• Secretary of the Treasury has one (1) year to design a common data format for unknown financial data in Federal agencies.
• Federal agencies have one (1) year to comply with the common data format from the Secretary of the Treasury.
• No penalties or bonuses for the Secretary of the Treasury.
• No penalties or bonuses for Federal agencies failing to comply.
• No funding for the Secretary of the Treasury to carry out the assigned duties.
• No funding for Federal agencies to carry out the assigned duties.

As written, the Digital Accountability and Transparency Act (DATA Act) will be DOA (Dead On Arrival) in the current or any future session of Congress.

There are three (3) main reasons why that is the case.

A Common Data Format

Let me ask a dumb question: Do you remember 9/11?

Of course you do. And the United States has been in a state of war on terrorism ever since.

I point that out because intelligence sharing (read common data format) was identified as a reason why the 9/11 attacks weren’t stopped and has been a high priority to solve since then.

Think about that: Reason why the attacks weren’t stopped and a high priority to correct.

This next September 11th will be the twelfth anniversary of those attacks.

Progress on intelligence sharing: Progress Made and Challenges Remaining in Sharing Terrorism-Related Information which I gloss in Read’em and Weep, along with numerous other GAO reports on intelligence sharing.

The good news is that we are less than five (5) years away from some unknown level of intelligence sharing.

The bad news is that puts us sixteen (16) years after 9/11 with some unknown level of intelligence sharing.

And that is for a subset of the entire Federal government.

A smaller set than will be addressed by the Secretary of the Treasury.

Common data format in a year? Really?

To say nothing of the likelihood of agencies changing the multitude of systems they have in place in a year.

No penalties or bonuses

You can think of this as the proverbial carrot and stick if you like.

What incentive do the Secretary of the Treasury and Federal agencies have to engage in this fool’s errand of pursuing a common data format?

In case you have forgotten, both the Secretary of the Treasury and Federal agencies have obligations under their existing missions.

Missions which they are designed by legislation and habit to discharge before they turn to additional reporting duties.

And what happens if they discharge their primary mission but don’t do the reporting?

Oh, they get reported to Congress. And ranked in public.

As Ben Stein would say, “Wow.”

No Funding

To add insult to injury, there is no additional funding for either the Secretary of the Treasury or Federal agencies to engage in any of the activities specified by the Digital Accountability and Transparency Act (DATA Act).

As I noted above, the Secretary of the Treasury and Federal agencies already have full plates with their current missions.

Now they are to be asked to undertake unfamiliar tasks: creating a chimerical “common data format” and submitting reports based upon it.

Without any additional staff, training, or other resources.

Directives without resources to fulfill them are directives that are going to fail. (full stop)

Tentative Conclusion

If you are asking yourself, “Why would anyone advocate the Digital Accountability and Transparency Act (DATA Act)?” five points for your house!

I don’t know of anyone who understands:

1. the complexity of Federal data,
2. the need for incentives,
3. the need for resources to perform required tasks,

who thinks the Digital Accountability and Transparency Act (DATA Act) is viable.

Why advocate non-viable legislation?

Its non-viability makes it an attractive fund raising mechanism.

Advocates can email, fund raise, telethon, rant, etc., to their heart’s content.

Advocating non-viable transparency lines an organization’s pocket at no risk of losing its rationale for existence.

The third post in this series, suggesting a viable way forward, will appear tomorrow under: Transparency and the Digital Oil Drop.

### Digital Accountability and Transparency Act (DATA Act) [The Details]

Monday, March 4th, 2013

The Data Transparency Coalition, the Sunlight Foundation and others are calling for reintroduction of the Digital Accountability and Transparency Act (DATA Act) in order to make U.S. government spending more transparent.

Transparency in government spending is essential for an informed electorate. An electorate that can call attention to spending that is inconsistent with policies voted for by the electorate. Accountability as it were.

But saying “transparency” is easy. Achieving transparency, not so easy.

Let’s look at some of the details in the DATA Act.

(2) DATA STANDARDS-

‘(A) IN GENERAL- The Secretary of the Treasury, in consultation with the Director of the Office of Management and Budget, the General Services Administration, and the heads of Federal agencies, shall establish Government-wide financial data standards for Federal funds, which may–

‘(i) include common data elements, such as codes, unique award identifiers, and fields, for financial and payment information required to be reported by Federal agencies;

‘(ii) to the extent reasonable and practicable, ensure interoperability and incorporate–

‘(I) common data elements developed and maintained by an international voluntary consensus standards body, as defined by the Office of Management and Budget, such as the International Organization for Standardization;

‘(II) common data elements developed and maintained by Federal agencies with authority over contracting and financial assistance, such as the Federal Acquisition Regulatory Council; and

‘(III) common data elements developed and maintained by accounting standards organizations; and

‘(iii) include data reporting standards that, to the extent reasonable and practicable–

‘(I) incorporate a widely accepted, nonproprietary, searchable, platform-independent computer-readable format;

‘(II) be consistent with and implement applicable accounting principles;

‘(III) be capable of being continually upgraded as necessary; and

‘(IV) incorporate nonproprietary standards in effect on the date of enactment of the Digital Accountability and Transparency Act of 2012.

‘(i) GUIDANCE- The Secretary of the Treasury, in consultation with the Director of the Office of Management and Budget, shall issue guidance on the data standards established under subparagraph (A) to Federal agencies not later than 1 year after the date of enactment of the Digital Accountability and Transparency Act of 2012.

‘(ii) AGENCIES- Not later than 1 year after the date on which the guidance under clause (i) is issued, each Federal agency shall collect, report, and maintain data in accordance with the data standards established under subparagraph (A).

OK, I have a confession to make: I was a lawyer for ten years and reading this sort of thing is second nature to me. Haven’t practiced law in decades but I still read legal stuff for entertainment.

First, read section A and write down the types of data you would have to collect for each of those items.

Don’t list the agencies/organizations you would have to contact, you probably don’t have enough paper in your office for that task.

Second, read section B and notice that the Secretary of the Treasury has one (1) year to issue guidance for all the data you listed under Section A.

That means gathering, analyzing, testing and designing a standard for all that data, most of which is unknown. Even to the GAO.

And, if they meet that one (1) year deadline, the various agencies have only one (1) year to comply with the guidance from the Secretary of the Treasury.

Do I need to comment on the likelihood of success?

As for the Secretary of the Treasury, what happens if they don’t meet the one year deadline? Do you see any penalties?

Assuming some guidance emerges, what happens to any Federal agency that does not comply? Any penalties for failure? Any incentives to comply?

• Secretary of the Treasury has one (1) year to design a common data format for unknown financial data in Federal agencies.
• Federal agencies have one (1) year to comply with the common data format from the Secretary of the Treasury.
• No penalties or bonuses for the Secretary of the Treasury.
• No penalties or bonuses for Federal agencies failing to comply.
• No funding for the Secretary of the Treasury to carry out the assigned duties.
• No funding for Federal agencies to carry out the assigned duties.

Do you disagree with that reading of the Digital Accountability and Transparency Act (DATA Act)?

My analysis of that starting point appears in Digital Accountability and Transparency Act (DATA Act) [DOA].

### $1.55 Trillion in Federal Spending Misreported in 2011

Monday, March 4th, 2013

With $1.55 Trillion in Federal Spending Misreported in 2011, Data Transparency Coalition Renews Call for Congressional Action

Updating Senator Dirksen for inflation: “A trillion here, a trillion there, and pretty soon you’re talking real money.” (Attributed to Senator Dirksen but not documented.)

From the press release:

The Data Transparency Coalition, the only group unifying the technology industry in support of federal data reform, applauded the release today of the Sunlight Foundation’s Clearspending report and called for the U.S. Congress to reintroduce and pass the Digital Accountability and Transparency Act (DATA Act) in order to rectify the misreporting of trillions of dollars in federal spending each year.

No Joy in Vindication. Seventh post, Confirmation by the GAO that the problem I describe in the 560+ $Billion Shell Game exists in the DoD. (January 21, 2013)

Update: Refried Numbers from the OMB. In its current attempt at sequester obfuscation, the OMB combined the approaches used in Appendices A and B of its earlier report and reduced the percentage of sequestration. See: OMB REPORT TO THE CONGRESS ON THE JOINT COMMITTEE SEQUESTRATION FOR FISCAL YEAR 2013.

### From President Obama, The Opaque

Thursday, February 28th, 2013

Leaked BLM Draft May Hinder Public Access to Chemical Information

From the post:

On Feb. 8, EnergyWire released a leaked draft proposal from the U.S. Department of the Interior’s Bureau of Land Management on natural gas drilling and extraction on federal public lands. If finalized, the proposal could greatly reduce the public’s ability to protect our resources and communities. The new draft indicates a disappointing capitulation to industry recommendations.

The draft rule affects oil and natural gas drilling operations on the 700 million acres of public land administered by BLM, plus 56 million acres of Indian lands. This includes national forests, which are the sources of drinking water for tens of millions of Americans, national wildlife refuges, and national parks, which are widely used for recreation.

The Department of the Interior estimates that 90 percent of the 3,400 wells drilled each year on public and Indian lands use natural gas fracking, a process that pumps large amounts of water, sand, and toxic chemicals into gas wells at very high pressure to cause fissures in shale rock that contains methane gas. Fracking fluid is known to contain benzene (which causes cancer), toluene, and other harmful chemicals. Studies link fracking-related activities to contaminated groundwater, air pollution, and health problems in animals and humans.
If the leaked draft is finalized, the changes in chemical disclosure requirements would represent a major concession to the oil and gas industry. The rule would allow drilling companies to report the chemicals used in fracking to an industry-funded website, called FracFocus.org. Though the move by the federal government to require online disclosure is encouraging, the choice of FracFocus as the vehicle is problematic for many reasons.

First, the site is not subject to federal laws or oversight. The site is managed by the Ground Water Protection Council (GWPC) and the Interstate Oil and Gas Compact Commission (IOGCC), nonprofit intergovernmental organizations comprised of state agencies that promote oil and gas development. However, the site is paid for by the American Petroleum Institute and America’s Natural Gas Alliance, industry associations that represent the interests of member companies. BLM would have little to no authority to ensure the quality and accuracy of the data reported directly to such a third-party website. Additionally, the data will not be accessible through the Freedom of Information Act since BLM is not collecting the information. The IOGCC has already declared that it is not subject to federal or state open records laws, despite its role in collecting government-mandated data.

Second, FracFocus.org makes it difficult for the public to use the data on wells and chemicals. The leaked BLM proposal fails to include any provisions to ensure minimum functionality on searching, sorting, downloading, or other mechanisms to make complex data more usable. Currently, the site only allows users to download PDF files of reports on fracked wells, which makes it very difficult to analyze data in a region or track chemical use.
Despite some plans to improve searching on FracFocus.org, the oil and gas industry opposes making chemical data easier to download or evaluate for fear that the public “might misinterpret it or use it for political purposes.”

Don’t you feel safer? Knowing the oil and gas industry is working so hard to protect you from misinterpreting data?

Why the government is helping the oil and gas industry protect us from data I cannot say.

I mention this as an example of testing for “transparency.”

Anything the government freely makes available with spreadsheet capabilities, isn’t transparency. It’s distraction.

Any data that the government tries to hide, that data has potential value.

The Center for Effective Government points out these are draft rules and when published, you need to comment.

Not a bad plan but not very reassuring given the current record of President Obama, the Opaque.

Alternatives? Suggestions for how data mining could expose those who own floors of the BLM, who drill the wells, etc?

### EU Commission – Open Data Portal Open

Tuesday, February 26th, 2013

EU Commission – Open Data Portal Open

From the post:

The European Union Commission has unveiled a new Open Data Portal, with over 5,580 data sets – the majority of which comes from the Eurostat (the statistical office of the European Union). The portal is the result of the Commission’s ‘Open Data Strategy for Europe’, and will publish data from the European Commission and other bodies of the European Union; it already holds data from the European Environment Agency.

The portal has a SPARQL endpoint to provide linked data, and will also feature applications that use this data. The published data can be downloaded by everyone interested to facilitate reuse, linking and the creation of innovative services. This shows the commitment of the Commission to the principles of openness and transparency.

For more information https://ec.europa.eu/digital-agenda/en/blog/eu-open-data-portal-here.
If the Commission is committed to “principles of openness and transparency,” when can we expect to see:

1. Rosters of the institutions and individual participants in EU funded research from 1980 to present?
2. Economic analysis of the results of EU funded projects, on a project by project basis, from 1980 to present?

Noting from 1984 – 2013, the total research funding exceeds EUR 118 billion.

To be fair, CORDIS: Community Research and Development Information Service has report summaries and project reports for FP5, FP6 and FP7.

And CORDIS Search Service provides coverage back to the early 1980s.

About Projects on Cordis has a wealth of information to guide searching into EU funded research.

While a valuable resource, CORDIS requires the extraction of detailed information on a project by project basis, making large scale analysis difficult if not prohibitively expensive.

PS: Of the 5855 datasets, some 5680 datasets, were previously published by EuroStat. European Environmental Agency, 106 datasets. Perhaps a net increase of 59 datasets over those previously available.

### U.S. Statutes at Large 1951-2009

Saturday, February 23rd, 2013

From the post:

The GPO’s recent electronic publication of all legislation enacted by Congress from 1951-2009 is noteworthy for several reasons. It makes available nearly 40 years of lawmaking that wasn’t previously available online from any official source, narrowing part of a much larger information gap. It meets one of three long-standing directives from Congress’s Joint Committee on Printing regarding public access to important legislative information. And it has published the information in a way that provides a platform for third-party providers to cleverly make use of the information. While more work is still needed to make important legislative information available to the public, this online release is a useful step in the right direction.
Narrowing the Gap

In mid-January 2013, GPO published approximately 32,000 individual documents, along with descriptive metadata, including all bills enacted into law, joint concurrent resolutions that passed both chambers of Congress, and presidential proclamations from 1951-2009. The documents have traditionally been published in print in volumes known as the “Statutes at Large,” which commonly contain all the materials issued during a calendar year.

The Statutes at Large are literally an official source for federal laws and concurrent resolutions passed by Congress. The Statutes at Large are compilations of “slip laws,” bills enacted by both chambers of Congress and signed by the President. By contrast, while many people look to the US Code to find the law, many sections of the Code in actuality are not the “official” law. A special office within the House of Representatives reorganizes the contents of the slip laws thematically into the 50 titles that make up the US Code, but unless that reorganized document (the US Code) is itself passed by Congress and signed into law by the President, it remains an incredibly helpful but ultimately unofficial source for US law. (Only half of the titles of the US Code have been enacted by Congress, and thus have become law themselves.) Moreover, if you want to see the intact text of the legislation as originally passed by Congress — before it’s broken up and scattered throughout the US Code — the place to look is the Statutes at Large.

Policy wonks and trivia experts will have a field day but the value of the Statutes at Large isn’t apparent to me.

I assume there are cases where errors can be found between the U.S.C. (United States Code) and the Statutes at Large. The significance of those errors is unknown.

Like my comments on the SEC Midas program, knowing a law was passed isn’t the same as knowing who benefits from it. Or who paid for its passage.

Knowing which laws were passed is useful.
Knowing who benefited or who paid, priceless.

### Failure By Design

Saturday, February 23rd, 2013

Did you know the Securities and Exchange Commission (SEC) is now collecting 400 gigabytes of market data daily?

Midas [Market Information Data Analytics System], which is costing the SEC $2.5 million a year, captures data such as time, price, trade type and order number on every order posted on national stock exchanges, every cancellation and modification, and every trade execution, including some off-exchange trades. Combined it adds up to billions of daily records.

So, what’s my complaint?

Midas won’t be able to fill in all of the current holes in SEC’s vision. For example, the SEC won’t be able to see the identities of entities involved in trades and Midas doesn’t look at, for example, futures trades and trades executed outside the system in what are known as “dark pools.” (emphasis added)

What?

The one piece of information that could reveal patterns of insider trading, churning, and a whole host of other securities crimes, is simply not collected.

I wonder who would benefit from the SEC not being able to track insider trading, churning, etc.?

People engaged in insider trading, churning, etc. would be my guess.

You?

Maybe someone should ask SEC chairman Elisse Walter or Gregg Berman (who oversees MIDAS) if tracking entities would help with SEC enforcement?

If they agree, then ask why not now?

For that matter, why not open up the data + entities so others can help the SEC with analysis of the data?

Obvious questions J. Nicholas Hoover should have asked for SEC Makes Big Data Push To Analyze Markets.

### G-8 International Conference on Open Data for Agriculture

Tuesday, February 19th, 2013

G-8 International Conference on Open Data for Agriculture

April 29-30, 2013 Washington, D.C.

Deadline for proposals: Midnight, February 28, 2013.

From the call for ideas:

Are you interested in addressing global challenges, such as food security, by providing open access to information? Would you like the opportunity to present to leaders from around the world?

We are seeking innovative products and ideas that demonstrate the potential of using open data to increase food security. This April 29-30th in Washington, D.C., the G-8 International Conference on Open Data for Agriculture will host policy makers, thought leaders, food security stakeholders, and data experts to build a strategy to share agriculture data and make innovation more accessible. As part of the conference, we are giving innovators a chance to showcase innovative uses of open data for food security in a lightning presentation or in the exhibit hall. This call for ideas is a chance to demonstrate the potential that open data can have in ensuring food security, and can inform an unprecedented global collaboration. Visit data.gov to see what agricultural data is already available and connect to other G-8 open data sites!

We are seeking top innovators to show the world what can be done with open data through:

• Lightning Presentations: brief (3-5 minute), image rich presentations intended to convey an idea
• Exhibit Hall: an opportunity to convey an idea through an image-rich exhibit.

Presentations should inspire others to share their data or imagine how open data could be used to increase food security. Presentations may include existing, new, or proposed applications of open data and should meet one or more of the following criteria:

• Demonstrate the impact of open data on food security.
• Demonstrate the impact of access to agriculturally-relevant data on developed and/or developing countries.
• Demonstrate the impact of bringing multiple sources of agriculturally-relevant public and/or private open data together (think about the creation of an agriculture equivalent of weather.com)

For those with a new idea, we invite you to submit your proposal to present it to leading experts in food security, technology and data innovation. Proposals should identify which data is needed that is publicly available, for free, on the internet. Proposals must also include a design of the application including relevance to the target audience and plans for beta testing. A successful prototype will be mobile, interactive, and scalable. Proposals to showcase existing products or pitch new ideas will be reviewed by a global panel of technical experts from the G-8 countries.

Short notice but from the submission form on the website, you only get 75-100 words to summarize your proposal.

Hell, I have trouble identifying myself in 75-100 words.

Still, if you are in D.C. and interested, it could be a good way to meet people in this area.

The nine flags for the G-8 are confusing at first. Not an example of government committee counting. The EU has a representative at G-8 meetings.