Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

November 10, 2015

How Computers Broke Science… [Soon To Break Businesses …]

Filed under: Business Intelligence,Replication,Scientific Computing,Transparency — Patrick Durusau @ 3:04 pm

How Computers Broke Science — and What We can do to Fix It by Ben Marwick.

From the post:

Reproducibility is one of the cornerstones of science. Made popular by British scientist Robert Boyle in the 1660s, the idea is that a discovery should be reproducible before being accepted as scientific knowledge.

In essence, you should be able to produce the same results I did if you follow the method I describe when announcing my discovery in a scholarly publication. For example, if researchers can reproduce the effectiveness of a new drug at treating a disease, that’s a good sign it could work for all sufferers of the disease. If not, we’re left wondering what accident or mistake produced the original favorable result, and would doubt the drug’s usefulness.

For most of the history of science, researchers have reported their methods in a way that enabled independent reproduction of their results. But, since the introduction of the personal computer — and the point-and-click software programs that have evolved to make it more user-friendly — reproducibility of much research has become questionable, if not impossible. Too much of the research process is now shrouded by the opaque use of computers that many researchers have come to depend on. This makes it almost impossible for an outsider to recreate their results.

Recently, several groups have proposed similar solutions to this problem. Together they would break scientific data out of the black box of unrecorded computer manipulations so independent readers can again critically assess and reproduce results. Researchers, the public, and science itself would benefit.

Whether you are looking for specific proposals to make computed results capable of replication or quotes to support that idea, this is a good first stop.

A question for business analysts: how will you replicate the results of computer runs to establish your “due diligence” before critical business decisions?

What looked like a science or academic issue has liability implications!

Changing a few variables in a spreadsheet, or in a more complex machine learning pipeline, can make you look criminally negligent, if not criminal.

The computer illiteracy/incompetence of prosecutors and litigants is only going to last so long. Prepare defensive audit trails to enable the replication of your actual* computer-based business analysis.

*I offer advice on techniques for such audit trails. The audit trails you choose to build are up to you.
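
To make that concrete, here is a minimal sketch of such an audit trail in Python, assuming a log format of my own invention; the file names and parameters are hypothetical. The idea is to record enough (input and output hashes, parameters, environment) that a specific run can be re-identified and replicated later.

import hashlib
import json
import sys
from datetime import datetime, timezone

def sha256_of(path):
    # Digest of a file, so the exact input/output can be re-identified later.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def record_run(input_path, output_path, parameters, log_path="audit_log.jsonl"):
    # Append one audit record per analysis run.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_file": input_path,
        "input_sha256": sha256_of(input_path),
        "output_file": output_path,
        "output_sha256": sha256_of(output_path),
        "parameters": parameters,  # e.g. spreadsheet variables, model settings
        "python_version": sys.version,
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")

# Hypothetical usage, after an analysis has produced forecast.csv:
# record_run("sales_2015.csv", "forecast.csv", {"discount_rate": 0.07})

The point is not this particular format but that every decision-relevant run leaves a record that an outsider could follow.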

August 21, 2015

Disclosing Government Contracts

Filed under: Government,Government Data,Public Data,Transparency — Patrick Durusau @ 4:37 pm

The More the Merrier? How much information on government contracts should be published and who will use it by Gavin Hayman.

From the post:

A huge bunch of flowers to Rick Messick for his excellent post asking two key questions about open contracting. And some luxury cars, expensive seafood and a vat or two of cognac.

Our lavish offerings all come from Slovakia, where in 2013 the Government Public Procurement Office launched a new portal publishing all its government contracts. All these items were part of the excessive government contracting uncovered by journalists, civil society and activists. In the case of the flowers, teachers investigating spending at the Department of Education uncovered florists’ bills for thousands of euros. Spending on all of these has subsequently declined: a small victory for fiscal probity.

The flowers, cars, and cognac help to answer the first of two important questions that Rick posed: Will anyone look at contracting information? In the case of Slovakia, it is clear that lowering the barriers to access information did stimulate some form of response and oversight.

The second question was equally important: “How much contracting information should be disclosed?”, especially in commercially sensitive circumstances.

These are two of key questions that we have been grappling with in our strategy at the Open Contracting Partnership. We thought that we would share our latest thinking below, in a post that is a bit longer than usual. So grab a cup of tea and have a read. We’ll be definitely looking forward to your continued thoughts on these issues.

Not a short read, so do grab some coffee (outside of Europe) and settle in for a good read.

Disclosure: I’m financially interested in government disclosure in general and contracts in particular. With openness comes more effort to conceal semantics, and with that a greater need for topic maps to pierce the darkness.

I don’t think openness reduces the amount of fraud and misconduct in government; it only gives an alignment between citizens and the career interests of a prosecutor a sporting chance to catch someone out.

Disclosure should be as open as possible and what isn’t disclosed voluntarily, well, one hopes for brave souls who will leak the remainder.

Support disclosure of government contracts and leakers of the same.

If you need help “connecting the dots,” consider topic maps.

June 4, 2015

Reputation instead of obligation:…

Filed under: Open Access,Open Data,Transparency — Patrick Durusau @ 10:16 am

Reputation instead of obligation: forging new policies to motivate academic data sharing by Sascha Friesike, Benedikt Fecher, Marcel Hebing, and Stephanie Linek.

From the post:

Despite strong support from funding agencies and policy makers academic data sharing sees hardly any adoption among researchers. Current policies that try to foster academic data sharing fail, as they try to either motivate researchers to share for the common good or force researchers to publish their data. Instead, Sascha Friesike, Benedikt Fecher, Marcel Hebing, and Stephanie Linek argue that in order to tap into the vast potential that is attributed to academic data sharing we need to forge new policies that follow the guiding principle reputation instead of obligation.

In 1996, leaders of the scientific community met in Bermuda and agreed on a set of rules and standards for the publication of human genome data. What became known as the Bermuda Principles can be considered a milestone for the decoding of our DNA. These principles have been widely acknowledged for their contribution towards an understanding of disease causation and the interplay between the sequence of the human genome. The principles shaped the practice of an entire research field as it established a culture of data sharing. Ever since, the Bermuda Principles are used to showcase how the publication of data can enable scientific progress.

Considering this vast potential, it comes as no surprise that open research data finds prominent support from policy makers, funding agencies, and researchers themselves. However, recent studies show that it is hardly ever practised. We argue that the academic system is a reputation economy in which researchers are best motivated to perform activities if those pay in the form of reputation. Therefore, the hesitant adoption of data sharing practices can mainly be explained by the absence of formal recognition. And we should change this.

(emphasis in the original)

Understanding what motivates researchers to share data is an important step towards encouraging data sharing.

But at the same time, would we say that every researcher is as good as every other researcher at preparing data for sharing? At documenting data for sharing? At doing any number of tasks that aren’t really research, but are just as important for sharing data?

Rather than focusing exclusively on researchers, funders should fund projects to include data sharing specialists who have the skills and interests necessary to effectively share data as part of a project’s output. Their reputations would be more closely tied to the successful sharing of data, and researchers would gain reputation for the high-quality data that is shared. A much better fit for the authors’ recommendation.

Or to put it differently, lecturing researchers on how they should spend their limited time and resources to satisfy your goals isn’t going to motivate anyone. “Pay the man!” (Richard Pryor in Silver Streak)

TPP – Just One of Many Government Secrets

Filed under: Business Intelligence,Government,Transparency — Patrick Durusau @ 8:22 am

The Trans-Pacific Partnership is just one of many government secrets.

Reading Army ELA: Weapon Of Mass Confusion? by Kevin McLaughlin, I discovered yet another.

From the post:


As DISA and VMware work on a new JELA proposal, sources familiar with the matter said the relationship between the two is coming under scrutiny from other enterprise vendors. What’s more, certain details of the JELA remain shrouded in secrecy.

DISA’s JELA document contains several large chunks of redacted text, including one entire section titled “Determination Of Fair And Reasonable Cost.”

In other parts, DISA has redacted specific figures, such as VMware’s percentage of the DOD’s virtualized environments and the total amount the DOD has invested in VMware software licenses. The redacted portions have fueled industry speculation about why these and other aspects of the contract were deemed unfit for the eyes of the public.

DISA’s rationale for awarding the Army ELA and DOD JELA to VMware without opening it up to competition is also suspect, one industry executive who’s been tracking both deals told CRN. “Typically, the justification for sole-sourcing contracts to a vendor is that they only cover maintenance, yet these contracts obviously weren’t maintenance-only,” said the source.

The situation is complex, but essentially the Army signed a contract with VMware under which the Army downloaded suites of software when it wanted only one particular part of each suite, yet was billed for maintenance on the entire suite.

That appears to be what was specified in the VMware ELA, which should be a motivation for using topic maps in connection with government contracts.

Did that go by a little fast? The jump from the VMware ELA to topic maps?

Think about it. The “Army” didn’t really sign a contract with “VMware” any more than “VMware” signed a contract with the “Army.”

No, someone in particular, a nameable individual or group of nameable individuals, had meetings, reviews, and ultimately decided to agree to the contract between the “Army” and “VMware.” All of those individuals had roles in the proceedings that resulted in the ELA in question.

Yet, when it comes time to discuss the VMware ELA, the best we can do is talk about it as though these large organizations acted on their own. The only named individual who might be in some way responsible for the mess is the Army’s current CIO, Lt. Gen. Robert S. Ferrell, and he got there after the original agreement but before its later extension.

Since topic maps don’t require plotting the domain before we start uncovering relationships and roles, they could easily construct a history of contacts (email, phone, physical meetings), aligned with documents (drafts, amendments), for all the individuals on all sides of this sorry episode.

Topic maps can’t guarantee that the government, the DOD in this case, won’t make ill-advised contracts in the future. No software can do that. What topic maps can do is trace responsibility for such contracts to named individuals. Having an accountable government means having accountable government employees.
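
For illustration only, here is a minimal sketch of recording such contacts and roles, in plain Python rather than any topic map API; every name and event in it is hypothetical.

from dataclasses import dataclass, field

@dataclass
class Topic:
    # A subject under discussion: a person, organization, document, or meeting.
    name: str
    kind: str
    also_known_as: list = field(default_factory=list)

@dataclass
class Association:
    # A typed relationship between topics, naming the role each topic plays.
    kind: str
    roles: dict  # role name -> Topic

# Hypothetical individuals and artifacts; none of these names come from the article.
officer = Topic("J. Doe", "person", also_known_as=["Contracting Officer, DISA"])
vendor_rep = Topic("R. Roe", "person")
ela_draft = Topic("VMware ELA draft 3", "document")

meeting = Association("reviewed", {"reviewer": officer, "counterparty": vendor_rep,
                                   "subject": ela_draft})

def who_played(associations, kind, role):
    # Tracing responsibility becomes a query over associations.
    return [a.roles[role] for a in associations if a.kind == kind and role in a.roles]

print([t.name for t in who_played([meeting], "reviewed", "reviewer")])  # ['J. Doe']

The payoff is that “who decided?” becomes a query over associations rather than an archival expedition.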

PS: How does your government, agency, enterprise track responsibility?

PPS: Topic maps could also trace, given the appropriate information, who authorized the redactions to the DISA JELA. That person should be first on a transport to Syria as a permanent advisor. Seriously. Remember what I said about accountable government requiring accountable employees.

June 2, 2015

$100,000 reward for leaking the Trans-Pacific Partnership ‘TPP’

Filed under: Government,Politics,Transparency — Patrick Durusau @ 12:26 pm

Wikileaks is raising a $100,000 bounty for the missing twenty-six (26) chapters of the TPP. (Hope they get the revised versions after the meeting in Lima this summer.)

Donate Today!

As of 13:23 on 2 June 2015, $23,927.17 from 68 people, or about 24% of the goal. No guarantees, but it sounds like a good plan.

May 30, 2015

Congress Can — and Should — Declassify the TPP

Filed under: Government,Politics,Transparency — Patrick Durusau @ 7:09 pm

Congress Can — and Should — Declassify the TPP by Robert Naiman.

From the post:

One of the most controversial aspects of the proposed Trans-Pacific Partnership (TPP) is the fact that the Obama administration has tried to impose a public blockade on the text of the draft agreement.

When Congress votes on whether to grant the president “fast-track authority” to negotiate the TPP — which would bar Congress from making any changes to the secret pact after it’s negotiated — it will effectively be a vote to pre-approve the TPP itself.

Although the other negotiating countries and “cleared” corporate advisers to the US Trade Representative have access to the draft TPP agreement, the American people haven’t been allowed to see it before Congress votes on fast track. Members of Congress can read the draft agreement under heavy restrictions, but they can’t publicly discuss or consult on what they have read.

Correction: The Obama administration hasn’t “tried to impose a public blockade on the text of the draft agreement,” it has succeeded in imposing a public blockade on the text of the TPP.

The question is: What is Congress going to do to break the current blockade on the text of the TPP?

Robert has a great writeup of all the reasons why the American public should be allowed to see the text of the TPP. For example: the other parties to the agreement already know what it says, so why not the American people? Interim texts of agreements get published all the time, so why not this one?

The United States Senate or the House of Representatives can declassify the TPP text. I would say to write to your Senators and Representatives, but not this time. Starting Monday, June 1, 2015, I am going to call both of my Senators and my Representative until I have spoken with each one of them personally to express my concern that the TPP text should be available to the American public before it is submitted to Congress for approval. Including any additional or revised versions.

I will be polite and courteous but will keep calling until contact is made. I suggest you do the same. Leave your request that the TPP be declassified (including later versions) by (appropriate body) for the American public with every message.

BTW, keep count of how many calls it takes to speak to your Senator or Representative. It may give you a better understanding of how effective democracy is in the United States.

I first saw this in a tweet by BillMoyers.com.

Government and “legitimate secrets?”

Filed under: Government,Politics,Transparency — Patrick Durusau @ 9:47 am

Benjamin Wittes in An Approach to Ameliorating Press-IC Tensions Over Classified Information gives a good set of pointers to the recent dispute between the intelligence community and the New York Times:

I’ve been thinking about the exchange over the past couple of weeks—much of which took place on Lawfare—between the New York Times and the intelligence community over the naming of CIA undercover officers in a Times story. (A brief recap in links: here are Bob Litt’s original comments, the 20 former intelligence officers’ letter, Jack’s interview with Dean Baquet, my comments in response, Mark Mazzetti’s comments, and Jack’s comments.)

I want to float an idea for a mechanism that might ameliorate tensions over this sort of issue in the future. It won’t eliminate those tensions, which are inherent in the relationship between a government that has legitimate secrets to keep and a press that rightly wants to report on the activities of government, but it might give the public a lens through which to view individual disputes, and it might create a means through the which the two sides can better and more fully communicate in high-stakes situations.

The basic problem when the press has an undoubtedly newsworthy item that involves legitimately sensitive information is two-fold: the government will tend to err on the side of overstating the sensitivity of the information, because it has to protect against the biggest risk, and the government often cannot disclose to the newspaper the full reasons for its concerns.

I can’t say that I care for Wittes’ proposal because it begins with the assumption that a democratically elected government can have “legitimate secrets.” Once that point is conceded, the only argument is about the degree of ignorance of the electorate. That the electorate will never know the truth about government activities (hopefully so, in the eyes of some) is taken as a given.

For example, did you know that the United States government supported the Pol Pot regime in Cambodia? A regime that reduced the population of Cambodia by 25%, deaths coming from executions, forced labor, starvation, etc.

Question for readers who vote in the United States:

Would United States support for a genocidal regime affect your vote in an election?

Assuming that one candidate promised continued support for such a regime and their opponent promised to discontinue support.

That seems pretty obvious, but that is exactly the sort of secret that the government keeps from the voters.

How do I know the United States government supported Pol Pot? Good question! Sources? Would you believe diplomatic cables from the relevant time period? Recently published by Wikileaks?

The Pol Pot dilemma by Charles Parkinson, Alice Cuddy and Daniel Pye, reads in part:

A trove of more than 500,000 US diplomatic cables from 1978 released by WikiLeaks on Wednesday includes hundreds that paint a vivid picture of a US administration torn between revulsion at the brutality of Pol Pot’s government and fear of Vietnamese influence should it collapse.

“We believe a national Cambodia must exist even though we believe the Pol Pot regime is the world’s worst violator of human rights,” reads a cable sent by the State Department to six US embassies in Asia on October 11, 1978. “We cannot support [the] Pol Pot government, but an independent Kampuchea must exist.”

They are the second batch of cables to be released by the whistle-blowing website from Jimmy Carter’s presidency, which was marked by a vocal emphasis on human rights. That focus shines through in much of the correspondence, even to the point of wishing success on the Khmer Rouge in repelling Vietnamese incursions during the ongoing war between the two countries, in the hope it would, paradoxically, prevent more of the worst excesses of the government in Phnom Penh.

“While the Pol Pot government has few, if any, redeeming features, the cause of human rights is not likely to be served by the continuation of fighting between the Vietnamese and the government,” reads a cable sent by the US Embassy in Thailand to the State Department on October 17. “A negotiated settlement of [Vietnamese-Cambodian] differences might reduce the purges.”

Read also: SRV-KHMER CONFLICT PRESENTS BENEFITS AND POTENTIAL PROBLEMS FOR MOSCOW to get a feel for the proxy war status of the conflict between Cambodia and Vietnam during this period.

Although the government keeps a copy of your financial information, social security number, etc., that is your information that it holds in trust. Secrecy of that information should not be in doubt.

However, when we are talking about information that is generated in the course of government relations to other governments or in carrying out government policy, we aren’t talking about information that belongs to individuals. At least in a democracy, we are talking about information that belongs to the general public.

In your next debate about government secrecy, challenge the presumption of a need for government secrecy. History is on your side.

April 21, 2015

Why nobody knows what’s really going into your food

Filed under: Government,Transparency — Patrick Durusau @ 4:14 pm

Why nobody knows what’s really going into your food by Phillip Allen, et al.

From the webpage:

Why doesn’t the government know what’s in your food? Because industry can declare on their own that added ingredients are safe. It’s all thanks to a loophole in a 57-year-old law that allows food manufacturers to circumvent the approval process by regulators. This means companies can add substances to their food without ever consulting the Food and Drug Administration about potential health risks.

The animation is quite good and worth your time to watch.

If you think the animation is disheartening, you could spend some time at the Generally Recognized as Safe (GRAS) page over at the FDA.

From the webpage:

“GRAS” is an acronym for the phrase Generally Recognized As Safe. Under sections 201(s) and 409 of the Federal Food, Drug, and Cosmetic Act (the Act), any substance that is intentionally added to food is a food additive, that is subject to premarket review and approval by FDA, unless the substance is generally recognized, among qualified experts, as having been adequately shown to be safe under the conditions of its intended use, or unless the use of the substance is otherwise excluded from the definition of a food additive.

Links to legislation, regulations, applications, and other sources of information.

Leaving the question of regulation to one side, every product should be required to list all of its ingredients. In addition to the listing on the package, manufacturers should be required to post a full chemical analysis online.

Disclosure would not reach everyone but at least careful consumers would have a sporting chance to discover what they are eating.

April 10, 2015

UNESCO Transparency Portal

Filed under: Government,Transparency — Patrick Durusau @ 7:09 pm

UNESCO Transparency Portal

From the about page:

Public access to information is a key component of UNESCO’s commitment to transparency and its accountability vis-à-vis stakeholders. UNESCO recognizes that there is a positive correlation between a high level of transparency through information sharing and public participation in UNESCO-supported activities.

The UNESCO transparency portal has been designed to enable public access to information about the Organization’s activities across sectors, countries, and regions, accompanied by some detail on budgetary and donor information. We see this as a work in progress. Our objective is to enable access to as much quality data about our activities as possible. The portal will be regularly updated and improved.

The data is a bit stale (2014) and, by the site’s admission, data on “10 Category I Institutes operating as separate economic entities” and the “UNESCO Brasilia Office” will be included “in a later phase….”

The map navigation on the default homepage works quite well. I tested it by focusing on Zimbabwe, led by everyone’s favorite, Robert Mugabe. If you zoom in and select Zimbabwe on the map, the world map displays with a single icon over Zimbabwe. Hovering over that icon displays the number of projects and the budget, so I selected projects and the screen scrolled down to show the five projects.

I then selected: UBRAF: Supporting Comprehensive Education Sector Responses to HIV, Sexual and Reproductive Health in Botswana, Malawi, Zambia and Zimbabwe, only to be presented with the same summary information again.

Not a great showing of transparency. Even the United States Congress can manage that much, most of the time. Transparency isn’t well served by totals and bulk amounts. Those are more suited to concealment than transparency.

At a minimum, transparency requires disclosure of who the funds were disbursed to (one assumes some entity in Zimbabwe) and to whom that entity transferred funds, and so on, until we reach consumables or direct services. Along with the identities of every actor along that trail.

I first saw this in The Research Desk: UNESCO, SIPRI, and Searching iTunes by Gary Price.

March 28, 2015

Tracking NSA/CIA/FBI Agents Just Got Easier

Filed under: Security,Transparency — Patrick Durusau @ 10:33 am

I pointed out in The DEA is Stalking You! the widespread use of automobile license reading applications by the DEA. I also suggested that citizens start using their cellphones to take photos of people coming and going from DEA, CIA, FBI offices and posting them online.

The good news is that Big Data has come to the aid of citizens to track NSA/CIA/FBI, etc. agents.

Lisa Vaas writes in Entire Oakland Police license plate reader data set handed to journalist:

Howard Matis, a physicist who works at the Lawrence Berkeley National Laboratory in California, didn’t know that his local police department had license plate readers (LPRs).

But even if they did have LPRs (they do: they have 33 automated units), he wasn’t particularly worried about police capturing his movements.

Until, that is, he gave permission for Ars Technica’s Cyrus Farivar to get data about his own car and its movements around town.

The data is, after all, accessible via public records law.

Ars obtained the entire LPR dataset of the Oakland Police Department (OPD), including more than 4.6 million reads of over 1.1 million unique plates captured in just over 3 years.

Then, to make sense out of data originally provided in 18 Excel spreadsheets, each containing hundreds of thousands of lines, Ars hired a data visualization specialist who created a simple tool that allowed the publication to search any given plate and plot its locations on a map.

How cool is that!?
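
As a rough illustration of what the hired specialist built, here is a minimal sketch using pandas, assuming hypothetical spreadsheet names and a hypothetical column layout (the actual OPD files surely differ):

import glob
import pandas as pd

# Assumed layout: each spreadsheet has columns PLATE, TIMESTAMP, LAT, LON.
frames = [pd.read_excel(path) for path in glob.glob("opd_lpr_*.xlsx")]
reads = pd.concat(frames, ignore_index=True)

def sightings(plate):
    # Every recorded read of one plate, oldest first, ready to plot on a map.
    hits = reads[reads["PLATE"] == plate]
    return hits.sort_values("TIMESTAMP")[["TIMESTAMP", "LAT", "LON"]]

print(sightings("6ABC123"))  # hypothetical plate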

Of course, your mileage may vary as the original Ars article reports:

In August 2014, the American Civil Liberties Union and the Electronic Frontier Foundation lost a lawsuit to compel the Los Angeles Police Department and the Los Angeles Sheriff’s Department to hand over a mere week’s worth of all LPR data. That case is now on appeal.

The trick is that the state doesn’t mind invading your privacy but is very concerned that you not invade its privacy, or know enough about its activities to be an informed member of the voting electorate.

If you believe that the government wants to keep information like license reading secret to protect the privacy of other citizens, you need to move to North Korea. I understand they have a very egalitarian society.

Of course these are license reading records collected by the state. Since automobiles are in public view, anyone could start collecting license plate numbers with locations. Now there’s a startup idea. Blanket the more important parts of D.C. inside the Beltway with private license readers. That would be a big data set with commercial value.

To give you an idea of the possibilities, visit Police License Plate Readers at PoliceOne.com. You will find links to a wide variety of license plate reading solutions, including:

A fixed installation device from Vigilant Solutions.

You could wire something together, but if you are serious about keeping track of the government keeping track of all of us, you should go with professional grade equipment. As well as adopt an activist response to government surveillance. Being concerned, frightened, “speaking truth to power,” etc. are as effective as doing nothing at all.

Think about a citizen-group-based license plate data collection. Possible discoveries could include government vehicles at local motels and massage parlors, explaining the donut glaze in some police officers’ eyes, and meetings between regulators and the regulated; a whole range of governmental wrongdoing is waiting to be discovered. Think about investing in a mobile license plate reader for your car today!

If you don’t like government surveillance, invite them into the fish bowl.

They have a lot more to hide than you do.

March 18, 2015

“We live in constant fear of upsetting the WH (White House).”

Filed under: Government,Politics,Transparency — Patrick Durusau @ 5:44 pm

Administration sets record for withholding government files by Ted Bridis.

From the post:

The Obama administration set a record again for censoring government files or outright denying access to them last year under the U.S. Freedom of Information Act, according to a new analysis of federal data by The Associated Press.

The government took longer to turn over files when it provided any, said more regularly that it couldn’t find documents and refused a record number of times to turn over files quickly that might be especially newsworthy.

It also acknowledged in nearly 1 in 3 cases that its initial decisions to withhold or censor records were improper under the law — but only when it was challenged.

Its backlog of unanswered requests at year’s end grew remarkably by 55 percent to more than 200,000. It also cut by 375, or about 9 percent, the number of full-time employees across government paid to look for records. That was the fewest number of employees working on the issue in five years.

The government’s new figures, published Tuesday, covered all requests to 100 federal agencies during fiscal 2014 under the Freedom of Information law, which is heralded globally as a model for transparent government. They showed that despite disappointments and failed promises by the White House to make meaningful improvements in the way it releases records, the law was more popular than ever. Citizens, journalists, businesses and others made a record 714,231 requests for information. The U.S. spent a record $434 million trying to keep up. It also spent about $28 million on lawyers’ fees to keep records secret.

Ted does a great job detailing the secretive and paranoid Obama White House, up to and including a censored document that failed to redact the line:

“We live in constant fear of upsetting the WH (White House).”

Although I must confess that I don’t know if it is worse that President Obama and company are so non-transparent or that they lie about how transparent they are with such easy smiles. No shame, no embarrassment, they lie when the truth would do just as well.

Not that I think any other member of government does any better, but that is hardly an excuse.

The only legitimate solution that I see going forward is massive leaks from all parts of government. If you aren’t leaking, you are part of the problem.

March 11, 2015

What’s all the fuss about Dark Data?…

Filed under: Dark Data,Transparency — Patrick Durusau @ 6:29 pm

What’s all the fuss about Dark Data? Big Data’s New Best Friend by Martyn Jones.

From the post:

Dark data, what is it and why all the fuss?

First, I’ll give you the short answer. The right dark data, just like its brother right Big Data, can be monetised – honest, guv! There’s loadsa money to be made from dark data by ‘them that want to’, and as value propositions go, seriously, what could be more attractive?

Let’s take a look at the market.

Gartner defines dark data as "the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes" (IT Glossary – Gartner)

Techopedia describes dark data as being data that is "found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making." (Techopedia – Cory Jannsen)

Cory also wrote that "IDC, a research firm, stated that up to 90 percent of big data is dark data."

In an interesting whitepaper from C2C Systems it was noted that "PST files and ZIP files account for nearly 90% of dark data by IDC Estimates." and that dark data is "Very simply, all those bits and pieces of data floating around in your environment that aren’t fully accounted for:" (Dark Data, Dark Email – C2C Systems)

Elsewhere, Charles Fiori defined dark data as "data whose existence is either unknown to a firm, known but inaccessible, too costly to access or inaccessible because of compliance concerns." (Shedding Light on Dark Data – Michael Shashoua)

Not quite the last insight, but in a piece published by Datameer, John Nicholson wrote that "Research firm IDC estimates that 90 percent of digital data is dark." And went on to state that "This dark data may come in the form of machine or sensor logs" (Shine Light on Dark Data – Joe Nicholson via Datameer)

Finally, Lug Bergman of NGDATA wrote this in a sponsored piece in Wired: "It" – dark data – "is different for each organization, but it is essentially data that is not being used to get a 360 degree view of a customer."

Well, I would say that 90% of 2.7 zettabytes (as of last October) of data being dark is a reason to be concerned.

But like the Wizard of Oz, Martyn knows what you are lacking, a data inventory:


You don’t need a Chief Data Officer in order to be able to catalogue all your data assets. However, it is still good idea to have a reliable inventory of all your business data, including the euphemistically termed Big Data and dark data.

If you have such an inventory, you will know:

What you have, where it is, where it came from, what it is used in, what qualitative or quantitative value it may have, and how it relates to other data (including metadata) and the business.

Really? A data inventory? It’s a relief to know the MDM (master data management) folks have been struggling for the past two decades for no reason. All they needed was a data inventory!

You might want to recall AnHai Doan’s observation for a single enterprise mapping project:

…the manual creation of semantic mappings has long been known to be extremely laborious and error-prone. For example, a recent project at the GTE telecommunications company sought to integrate 40 databases that have a total of 27,000 elements (i.e., attributes of relational tables) [LC00]. The project planners estimated that, without the database creators, just finding and documenting the semantic mappings among the elements would take more than 12 person years.

That’s right. One enterprise, 40 databases, 12 person years.

How that works out at the scale of 2.7 zettabytes (person-years × zettabytes = ???), no one knows. For a rough sense of the rate, that project came to about 2,250 elements per person-year.

Oh, why did I set aside the 90% “dark data” figure? Simple enough: the data AnHai was mapping wasn’t entirely “dark.” At least it had headers that were meaningful to someone. Unstructured data has no headers at all.

What is Martyn missing?

What is known about data is the measure of its darkness, not its usage.

But supplying opaque terms (all terms are opaque to someone) for data only puts you into the AnHai situation. Either you enlist people who know the meanings of the terms and/or you create new meanings for them from scratch. Hopefully, in the latter case, you approximate the original meanings assigned to the terms.

If you want to improve on opaque terms, you need to provide alternative opaque terms that may be recognized by some future user instead of the primary opaque term you would use otherwise.

Make no mistake, it isn’t possible to escape opacity, but you can increase your odds that your data will be useful at some future point in time. How many alternatives yield some degree of future usefulness isn’t known.

So far as I know, the question hasn’t been researched. Every new set of opaque terms (read ontology, classification, controlled vocabulary) presents itself as possessing semantics for the ages. Given the number of such efforts, I find their confidence misplaced.
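
To make the alternative-terms idea concrete, here is a minimal sketch with a hypothetical data element, in plain Python:

# A hypothetical data element carrying alternative "opaque" terms.
# Any single term may be opaque to a future user; several raise the odds of recognition.
element = {
    "primary_term": "CUST_NO",
    "alternatives": ["customer number", "account id", "Kundennummer", "client_ref"],
    "definition": "Unique identifier assigned to a customer at account creation.",
}

def matches(element, query):
    # True if a search term hits the primary term or any recorded alternative.
    q = query.lower()
    return q == element["primary_term"].lower() or \
           any(q == alt.lower() for alt in element["alternatives"])

print(matches(element, "account id"))  # True
print(matches(element, "CUST_NO"))     # True

Each recorded alternative is one more chance that some future user recognizes the element.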

February 28, 2015

MI5 accused of covering up sexual abuse at boys’ home

Filed under: Government,Politics,Transparency — Patrick Durusau @ 10:18 am

MI5 accused of covering up sexual abuse at boys’ home by Vikram Dodd and Richard Norton-Taylor.

From the post:

MI5 is facing allegations it was complicit in the sexual abuse of children, the high court in Northern Ireland will hear on Tuesday.

Victims of the abuse are taking legal action to force a full independent inquiry with the power to compel witnesses to testify and the security service to hand over documents.

The case, in Belfast, is the first in court over the alleged cover-up of British state involvement at the Kincora children’s home in Northern Ireland in the 1970s. It is also the first of the recent sex abuse cases allegedly tying in the British state directly. Victims allege that the cover-up over Kincora has lasted decades.

The victims want the claims of state collusion investigated by an inquiry with full powers, such as the one set up into other sex abuse scandals chaired by the New Zealand judge Lowell Goddard.

Amnesty International branded Kincora “one of the biggest scandals of our age” and backed the victims’ calls for an inquiry with full powers: “There are longstanding claims that MI5 blocked one or more police investigations into Kincora in the 1970s in order to protect its own intelligence-gathering operation, a terrible indictment which raises the spectre of countless vulnerable boys having faced further years of brutal abuse.

It’s too early to claim victory, but see Belfast boys’ home abuse victims win legal bid by Henry McDonald:

Residents of a notorious Northern Ireland boys’ home are to be allowed to challenge a decision to exclude it from the UK-wide inquiry into establishment paedophile rings.

A high court judge in Belfast on Tuesday granted a number of former inmates from the Kincora home a judicial review into the decision to keep this scandal out of the investigation, headed by judge Lowell Goddard from New Zealand.

The Kincora boys’ home has been linked to a paedophile ring, some of whose members were allegedly being blackmailed by MI5 and other branches of the security forces during the Troubles.

Until now, the home secretary, Theresa May, has resisted demands from men who were abused at the home – and Amnesty International – that the inquiry be widened to include Kincora.

The campaigners want to establish whether the security services turned a blind eye to the abuse and instead used it to compromise a number of extreme Ulster loyalists guilty of abusing boys at the home.

If you read carefully you will see the abuse victims have won the right to challenge the exclusion of the boys’ home from a UK-wide investigation. That is a long way from forcing MI5 and other collaborators in the sexual abuse of children to provide an accounting in the clear light of day.

Leaked documents, caches of spy cables and spy documents, always show agents of the government protecting war criminals and paedophiles, engaging in torture, including rape, and in other dishonorable conduct.

My question is: why does the mainstream media honor the fiction that government secrets are meant to protect the public? Government secrets are used to protect the guilty, the dishonorable and the despicable. What’s unclear about that?

February 24, 2015

Black Site in USA – Location and Details

Filed under: Government,Transparency — Patrick Durusau @ 4:37 pm

The disappeared: Chicago police detain Americans at abuse-laden ‘black site’ by Spencer Ackerman.

From the post:

The Chicago police department operates an off-the-books interrogation compound, rendering Americans unable to be found by family or attorneys while locked inside what lawyers say is the domestic equivalent of a CIA black site.

The facility, a nondescript warehouse on Chicago’s west side known as Homan Square, has long been the scene of secretive work by special police units. Interviews with local attorneys and one protester who spent the better part of a day shackled in Homan Square describe operations that deny access to basic constitutional rights.

Alleged police practices at Homan Square, according to those familiar with the facility who spoke out to the Guardian after its investigation into Chicago police abuse, include:

  • Keeping arrestees out of official booking databases.
  • Beating by police, resulting in head wounds.
  • Shackling for prolonged periods.
  • Denying attorneys access to the “secure” facility.
  • Holding people without legal counsel for between 12 and 24 hours, including people as young as 15.

At least one man was found unresponsive in a Homan Square “interview room” and later pronounced dead.

And it gets worse, far worse.

It is a detailed post and merits a slow read, particularly the statement by Jim Trainum, a former DC homicide detective:


“I’ve never known any kind of organized, secret place where they go and just hold somebody before booking for hours and hours and hours. That scares the hell out of me that that even exists or might exist,” said Trainum, who now studies national policing issues, to include interrogations, for the Innocence Project and the Constitution Project.

If a detective who lived with death and violence on a day to day basis is frightened of police black sites, what should our reaction be?

Imperiling Investigations With Secrecy

Filed under: Government,Politics,Transparency — Patrick Durusau @ 1:46 pm

Spy Cables expose ‘desperate’ US approach to Hamas by Will Jordan and Rahul Radhakrishnan.

From the post:

A CIA agent “desperate” to make contact with Hamas in Gaza pleaded for help from a South African spy in the summer of 2012, according to intelligence files leaked to Al Jazeera’s Investigative Unit. The US lists Hamas as a terrorist organisation and, officially at least, has no contact with the group.

That was just one of the revelations of extensive back-channel politicking involving the US, Israel and the Palestinian Authority as they navigate the Israeli-Palestinian conflict amid a stalled peace process.

Classified South African documents obtained by Al Jazeera also reveal an approach by Israel’s then-secret service chief, Meir Dagan, seeking Pretoria’s help in its efforts to scupper a landmark UN-authorised probe into alleged war crimes in Gaza, which was headed by South African judge Richard Goldstone.

Dagan explained that his effort to squelch the Goldstone Report had strong support from Palestinian Authority (PA) president Mahmoud Abbas.

The Mossad director told the South Africans that Abbas privately backed the Israeli position, saying he wanted the report rejected because he feared it would “play into the hands” of Hamas, his key domestic political rival.

The Spy Cables also reveal that US President Barack Obama made a direct threat to Abbas in hope of dissuading him from pursuing United Nations recognition for a Palestinian state.

In case you don’t know the “back story,” the Goldstone report was, in its own words:

On 3 April 2009, the President of the Human Rights Council established the United Nations Fact Finding Mission on the Gaza Conflict with the mandate “to investigate all violations of international human rights law and international humanitarian law that might have been committed at any time in the context of the military operations that were conducted in Gaza during the period from 27 December 2008 and 18 January 2009, whether before, during or after.”

When produced, the report found there was evidence that both Hamas and Israel had committed war crimes. The chief judge, Richard Goldstone, subsequently stated the report would have been substantially different had information in the possession of Israel been shared with the investigation. Specifically, Judge Goldstone is satisfied that Israel did not target civilians as a matter of policy.

Israel has only itself to blame for the initial report reaching erroneous conclusions due to its failure to cooperate at all with the investigation. Secrecy and non-cooperation were their own reward in that case.

Even worse, however, is the revelation that the United States and others had no interest in whether Hamas or Israel had in fact committed war crimes, only in how the politics of the report would impact their allies.

I can only imagine what the election results in the United States would have been had Obama’s acceptance speech read in part:

“I will build new partnerships to defeat the threats of the 21st century: terrorism and nuclear proliferation; poverty and genocide; climate change and disease.” I will stop any reports critical of Israel and scuttle any investigation into Israel’s conduct in Gaza or against the Palestinians. Helping our allies, Israel and at times the Palestinian Liberation Authority, will require that we turn a blind eye to potential war crimes and those who have committed them. I will engage in secret negotiations to protect anyone, no matter how foul, if it furthers the interest of the United States or one of its allies. In all those ways, “I will restore our moral standing, so that America is once again that last, best hope for all who are called to the cause of freedom, who long for lives of peace, and who yearn for a better future.” (The non-bolded text was added.)

Interesting to see how additional information shapes your reading of the speech isn’t it?

Transparent government isn’t a technical issue but a political one. Although I hasten to add that topic maps can assist with transparency, for governments so minded.

PS: One hopes that in any future investigations Israel cooperates and the facts can be developed more quickly and openly.

February 15, 2015

Federal Spending Data Elements

Filed under: Government Data,Transparency — Patrick Durusau @ 10:43 am

Federal Spending Data Elements

From the webpage:

The data elements in the below list represent the existing Federal Funding Accountability and Transparency Act (FFATA) data elements currently displayed on USAspending.gov and the additional data elements that will be posted pursuant to the DATA Act. These elements are currently being deliberated on and discussed by the Federal community as a part of DATA Act implementation. At this point, this list is exhaustive. However, additional data elements may be standardized for transparency reporting in the future based on agency or community needs.

Join the Conversation

At this time, we are asking for comments in response to the following questions:

  1. Which data elements are most crucial to your current reporting and/or analysis?
  2. In setting standards, what are industry standards the Treasury and OMB should be considering?
  3. What are some of the considerations that Treasury and OMB should take into account when establishing data standards?

Just reading the responses to the questions on GitHub will give you a sense of what other community members are thinking about.

What responses are you going to contribute?

I first saw this in a tweet by Hudson Hollister.

February 14, 2015

OpenGov Voices: Bringing transparency to earmarks buried in the budget

Filed under: Government,Government Data,Politics,Transparency — Patrick Durusau @ 7:29 pm

OpenGov Voices: Bringing transparency to earmarks buried in the budget by Matthew Heston, Madian Khabsa, Vrushank Vora, Ellery Wulczyn and Joe Walsh.

From the post:

Last week, President Obama kicked off the fiscal year 2016 budget cycle by unveiling his $3.99 trillion budget proposal. Congress has the next eight months to write the final version, leaving plenty of time for individual senators and representatives, state and local governments, corporate lobbyists, bureaucrats, citizens groups, think tanks and other political groups to prod and cajole for changes. The final bill will differ from Obama’s draft in major and minor ways, and it won’t always be clear how those changes came about. Congress will reveal many of its budget decisions after voting on the budget, if at all.

We spent this past summer with the Data Science for Social Good program trying to bring transparency to this process. We focused on earmarks – budget allocations to specific people, places or projects – because they are “the best known, most notorious, and most misunderstood aspect of the congressional budgetary process” — yet remain tedious and time-consuming to find. Our goal: to train computers to extract all the earmarks from the hundreds of pages of mind-numbing legalese and numbers found in each budget.

Watchdog groups such as Citizens Against Government Waste and Taxpayers for Common Sense have used armies of human readers to sift through budget documents, looking for earmarks. The White House Office of Management and Budget enlisted help from every federal department and agency, and the process still took three months. In comparison, our software is free and transparent and generates similar results in only 15 minutes. We used the software to construct the first publicly available database of earmarks that covers every year back to 1995.

Despite our success, we barely scratched the surface of the budget. Not only do earmarks comprise a small portion of federal spending but senators and representatives who want to hide the money they budget for friends and allies have several ways to do it:

I was checking the Sunlight Foundation Blog for any updated information on the soon to be released indexes of federal data holdings when I encountered this jewel on earmarks.

Important to read/support because:

  1. By dramatically reducing the human time investment to find earmarks, it frees up that time to be spent gathering deeper information about each earmark
  2. It represents a major step forward in the ability to discover relationships between players in the data (what the NSA wants to do but with a rationally chosen data set).
  3. It will educate you on earmarks and their hiding places.
  4. It is an inspirational example of how darkness can be replaced with transparency, some of it anyway.
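
To give a feel for what such extraction software has to do, here is a minimal, hypothetical sketch of the kind of pattern matching one might start from; the budget text and the pattern are invented, and the system described in the post is far more sophisticated than a single regular expression.

import re

# Toy budget text, invented for illustration.
text = """
Provided, That $2,500,000 shall be for the Riverbend Water Reclamation Project;
Provided further, That $750,000 shall be available for Maple County Transit.
"""

# Hypothetical pattern: a dollar amount followed by a designated recipient or project.
EARMARK = re.compile(
    r"\$(?P<amount>[\d,]+)\s+shall\s+be\s+(?:for|available\s+for)\s+(?P<recipient>[^;.]+)",
    re.IGNORECASE,
)

for m in EARMARK.finditer(text):
    amount = int(m.group("amount").replace(",", ""))
    print(amount, "->", m.group("recipient").strip())
# 2500000 -> the Riverbend Water Reclamation Project
# 750000 -> Maple County Transit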

Will transparency reduce earmarks? I rather doubt it because a sense of shame doesn’t seem to motivate elected and appointed officials.

What transparency can do is create a more level playing field for those who want to buy government access and benefits.

For example, if I knew what it cost to have the following exemption in the FOIA:

Exemption 9: Geological information on wells.

it might be possible to raise enough funds to purchase the deletion of:

Exemption 5: Information that concerns communications within or between agencies which are protected by legal privileges, that include but are not limited to:

Deliberative Process Privilege

Which is where some staffers hide their negotiations with former staffers as they prepare to exit the government.

I don’t know that matching what Big Oil paid for the geological information on wells exemption would be enough but it would set a baseline for what it takes to start the conversation.

I say “Big Oil paid…” assuming that most of us don’t equate matters of national security with geological information. Do you have another explanation for such an offbeat provision?

If government is (and I think it is) for sale, then let’s open up the bidding process.

A big win for open government: Sunlight gets U.S. to…

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 6:58 pm

A big win for open government: Sunlight gets U.S. to release indexes of federal data by Matthew Rumsey, Sean Vitka and John Wonderlich.

From the post:

For the first time, the United States government has agreed to release what we believe to be the largest index of government data in the world.

On Friday, the Sunlight Foundation received a letter from the Office of Management and Budget (OMB) outlining how they plan to comply with our FOIA request from December 2013 for agency Enterprise Data Inventories. EDIs are comprehensive lists of a federal agency’s information holdings, providing an unprecedented view into data held internally across the government. Our FOIA request was submitted 14 months ago.

These lists of the government’s data were not public, however, until now. More than a year after Sunlight’s FOIA request and with a lawsuit initiated by Sunlight about to be filed, we’re finally going to see what data the government holds.

Since 2013, federal agencies have been required to construct a list of all of their major data sets, subject only to a few exceptions detailed in President Obama’s executive order as well as some information exempted from disclosure under the FOIA.

Many kudos to the Sunlight Foundation!

As to using the word “win,” do we need to wait and see which Enterprise Data Inventories are in fact produced?

I say that because the executive order of President Obama that is cited in the post provides this exemption from disclosure:

4(d) Nothing in this order shall compel or authorize the disclosure of privileged information, law enforcement information, national security information, personal information, or information the disclosure of which is prohibited by law.

Will that be taken as an excuse to not list the data collections at all?

Or, will the NSA say:

one (1) collection of telephone metadata, timeSpan: 4(d) exempt, size: 4(d) exempt, metadataStructure: 4(d) exempt, source: 4(d) exempt

Do they mean internal NSA phone logs? Do they mean some other source?

Or will they simply not list telephone metadata at all?

What’s exempt under FOIA? (From FOIA.gov):

Not all records can be released under the FOIA.  Congress established certain categories of information that are not required to be released in response to a FOIA request because release would be harmful to governmental or private interests.   These categories are called "exemptions" from disclosures.  Still, even if an exemption applies, agencies may use their discretion to release information when there is no foreseeable harm in doing so and disclosure is not otherwise prohibited by law.  There are nine categories of exempt information and each is described below.  

Exemption 1: Information that is classified to protect national security.  The material must be properly classified under an Executive Order.

Exemption 2: Information related solely to the internal personnel rules and practices of an agency.

Exemption 3: Information that is prohibited from disclosure by another federal law. Additional resources on the use of Exemption 3 can be found on the Department of Justice FOIA Resources page.

Exemption 4: Information that concerns business trade secrets or other confidential commercial or financial information.

Exemption 5: Information that concerns communications within or between agencies which are protected by legal privileges, that include but are not limited to:

  1. Attorney-Work Product Privilege
  2. Attorney-Client Privilege
  3. Deliberative Process Privilege
  4. Presidential Communications Privilege

Exemption 6: Information that, if disclosed, would invade another individual’s personal privacy.

Exemption 7: Information compiled for law enforcement purposes if one of the following harms would occur.  Law enforcement information is exempt if it: 

  • 7(A). Could reasonably be expected to interfere with enforcement proceedings
  • 7(B). Would deprive a person of a right to a fair trial or an impartial adjudication
  • 7(C). Could reasonably be expected to constitute an unwarranted invasion of personal privacy
  • 7(D). Could reasonably be expected to disclose the identity of a confidential source
  • 7(E). Would disclose techniques and procedures for law enforcement investigations or prosecutions
  • 7(F). Could reasonably be expected to endanger the life or physical safety of any individual

Exemption 8: Information that concerns the supervision of financial institutions.

Exemption 9: Geological information on wells.

And the exclusions:

Congress has provided special protection in the FOIA for three narrow categories of law enforcement and national security records. The provisions protecting those records are known as “exclusions.” The first exclusion protects the existence of an ongoing criminal law enforcement investigation when the subject of the investigation is unaware that it is pending and disclosure could reasonably be expected to interfere with enforcement proceedings. The second exclusion is limited to criminal law enforcement agencies and protects the existence of informant records when the informant’s status has not been officially confirmed. The third exclusion is limited to the Federal Bureau of Investigation and protects the existence of foreign intelligence or counterintelligence, or international terrorism records when the existence of such records is classified. Records falling within an exclusion are not subject to the requirements of the FOIA. So, when an office or agency responds to your request, it will limit its response to those records that are subject to the FOIA.

You can spot, as well as I can, the truck-sized holes that may prevent disclosure.

One analytic challenge upon the release of the Enterprise Data Inventories will be to determine what is present, and what is missing but should be present. Another will be to assist the Sunlight Foundation in its pursuit of additional FOIA requests to obtain data listed but not available. Perhaps I should call this an important victory, although of a battle and not of the long-term war for government transparency.
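A minimal sketch of that first challenge, assuming two hypothetical CSV exports, one of an agency’s Enterprise Data Inventory and one of its public data listing (the file names and the “identifier” column are my inventions):

```python
import csv

def identifiers(path, column="identifier"):
    """Collect the dataset identifiers found in one CSV export."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[column] for row in csv.DictReader(f)}

# Hypothetical exports: the full internal inventory vs. the public listing.
inventory = identifiers("enterprise_data_inventory.csv")
public = identifiers("public_data_listing.csv")

# Datasets the agency admits having but has not publicly listed.
missing = inventory - public
print(f"{len(missing)} inventoried datasets are not publicly listed:")
for dataset_id in sorted(missing):
    print(" ", dataset_id)
```

The set difference is the easy part. Deciding what should have been in the inventory but was never listed at all is the harder, human judgment call.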

Thoughts?

February 5, 2015

Intelligence agencies tout transparency [Clapper? Eh?]

Filed under: Government,NSA,Transparency — Patrick Durusau @ 6:54 pm

Intelligence agencies tout transparency by Josh Gerstein.

From the post:

A year and a half after Edward Snowden’s surveillance revelations changed intelligence work forever, the U.S. intelligence community is formally embracing the value of transparency. Whether America’s spies and snoopers are ready to take that idea to heart remains an open question.

On Tuesday, Director of National Intelligence James Clapper released a set of principles that amounts to a formal acknowledgement that intelligence agencies had tilted so far in the direction of secrecy that it actually undermined their work by harming public trust.

“The thought here was we needed to strategically get on the same page in terms of what we were trying to do with transparency,” DNI Civil Liberties Protection Officer Alex Joel told POLITICO Monday. “The intelligence community is by design focused on keeping secrets rather than disclosing them. We have to figure out how we can work with our very dedicated work force to be transparent while they’re keeping secrets.”

The principles (posted here) are highly general and include a call to “provide appropriate transparency to enhance public understanding about the IC’s mission and what the IC does to accomplish it (including its structure and effectiveness).” The new statement is vague on whether specific programs or capabilities should be made public. In addition, the principle on handling of classified information appears largely to restate the terms of an executive order President Barack Obama issued on the subject in 2009.

If I understand the gist of this story correctly, the Director of National Intelligence (DNI) James Clapper, the same James Clapper who lied to Congress about the NSA, wants to regain the public’s trust. Really?

Hmmm, how about James Clapper and every appointed official in the security services resigning as a start. The second step would be congressional appointment of oversight personnel who can go anywhere, see any information, question anyone, throughout the security apparatus and report back to Congress. Those reports back to Congress can elide details where necessary but by rotating the oversight personnel, they won’t become captives of the agencies where they work.

BTW, the intelligence community is considering how it can release more information to avoid “program shock” from Snowden-like disclosures. Not that they have released any such information, but they are thinking about it. OK, I’m thinking about winning $1 million in the next lottery drawing. That doesn’t mean it is going to happen.

Let’s get off the falsehood merry-go-round that Clapper and others want to keep spinning. Unless and until all the known liars are out of government and kept out of government, including jobs with security contractors, there is no more reason to trust our intelligence community than there is to trust the North Korean intelligence community.

Perhaps more of a reason to trust the North Korean intelligence community because at least we know whose side they are on. As far as the DNI and the rest of the U.S. security community, hard to say whose side they are on. Booz Allen’s? NSA’s? CIA’s? Some other contractors? Certainly not on the side of Congress and not on the side of the American people, despite their delusional pretensions to the contrary.

No doubt there is a role for a well-functioning and accountable intelligence community in the United States. That description in no way applies to our current intelligence community, which is a collection of parochial silos more concerned with guarding their turf and benefiting their contractors than with any semblance of service to the American people.

Congress needs to end the intelligence community as we know it, and soon. In the not-too-distant future, the DNI, and not the President, will be the decision maker in Washington.

Forty and Seven Inspectors General Hit a Stone Wall

Filed under: Government,Government Data,Transparency — Patrick Durusau @ 3:16 pm

Inspectors general testify against agency ‘stonewalling’ before Congress by Sarah Westwood.

From the post:

Frustration with federal agencies that block probes from their inspectors general bubbled over Tuesday in a congressional hearing that dug into allegations of obstruction from a number of government watchdogs.

The Peace Corps, Environmental Protection Agency and Justice Department inspectors general each argued to members of the House Oversight and Government Reform Committee that some of their investigations had been thwarted or stalled by officials who refused to release necessary information to their offices.

Committee members from both parties doubled down on criticisms of the Justice Department’s lack of transparency and called for solutions to the government-wide problem during their first official hearing of the 114th Congress.

“If you can’t do your job, then we can’t do our job in Congress,” Chairman Jason Chaffetz, R-Utah, told the three witnesses and the scores of agency watchdogs who also attended, including the Department of Homeland Security and General Service Administration inspectors general.

Michael Horowitz, the Justice Department’s inspector general, testified that the FBI began reviewing requested documents in 2010 in what he said was a clear violation of federal law that is supposed to grant watchdogs unfettered access to agency records.

The FBI’s process, which involves clearing the release of documents with the attorney general or deputy attorney general, “seriously impairs inspector general independence, creates excessive delays, and may lead to incomplete, inaccurate or significantly delayed findings or recommendations,” Horowitz said.

Perhaps it is no surprise that the FBI shows up in the non-transparency column. But given the number of inspectors general with similar problems (47), it seems to be part of a larger herd.

If you are interested in going further into this issue, there was a hearing last August (2014), Obstructing Oversight: Concerns from Inspectors General, which is here in ASCII and here with video and witness statements in PDF.

Both sources omit the following documents:

Sept. 9, 2014, letter to Chairman Issa from OMB, submitted by Chairman Issa (p. 58)
Aug. 5, 2014, letter to Reps. Issa, Cummings, Carper, and Coburn from 47 IGs, submitted by Rep. Chaffetz (p. 61)
Aug. 8, 2014, letter to OMB from Reps. Carper, Coburn, Issa and Cummings, submitted by Rep. Walberg (p. 69)
Statement for the record from The Institute of Internal Auditors (p. 71)

Isn’t that rather lame? To leave these items in the table of contents but to omit them from the ASCII version and to not even include them with the witness statements.

I’m curious who the other forty-four (44) inspectors general might be. Aren’t you?

If you know where to find these appendix materials, please send me a pointer.

I think it will be more effective to list all of the inspectors general who have encountered this stone-wall treatment than to treat them as all and sundry.

Chairman Jason Chaffetz suggests that Congress can force transparency by controlling funding. I would use a finer knife. Cut all funding for health care and retirement benefits in the agencies/departments in question. See how the rank and file in those agencies like them apples.

Assuming transparency results, I would not restore those benefits retroactively. Staff chose to support, explicitly or implicitly, illegal behavior. Making bad choices has negative consequences. It would be a teaching opportunity for all future federal staff members.

January 13, 2015

While We Were Distracted….

Filed under: Finance Services,Transparency — Patrick Durusau @ 8:19 pm

I have long suspected that mainstream news, with its terrorist attacks, high profile political disputes, etc., is a dangerous distraction. Here is one more brick to shore up that opinion.

Congress attempts giant leap backward on data transparency by Pam Baker.

From the post:

The new Republican Congress was incredibly busy on its first full day at work. 241 bills were introduced on that day and more than a few were highly controversial. While polarizing bills on abortion, Obamacare and immigration got all the media headlines, one very important Congressional action dipped beneath the radar: an attempt to eliminate data transparency in financial reporting.

The provision to the “Promoting Job Creation and Reducing Small Business Burdens Act” would exempt nearly 60 percent of public companies from filing data-based reports with the Securities and Exchange Commission (SEC), according to the Data Transparency Coalition.

“This action will set the U.S. on a path backwards and put our financial regulators, public companies and investors at a significant disadvantage to global competitors. It is tremendously disappointing to see that one of the first actions of the new Congress is to put forward legislation that would harm American competitiveness and deal a major setback to data transparency in financial regulation,” said Hudson Hollister, the executive director of the Data Transparency Coalition, a trade association pursuing the publication of government information as standardized, machine-readable data.

See Pam’s post for some positive steps you can take with regard to this bill and how to remain informed about similar attempts in the future.

To be honest, the SEC apparently is having all sorts of data management difficulties, which, given the success rate of government data projects, is not hard to believe. But the solution to such a problem isn’t to simply stop collecting information.

No doubt the SEC is locked into various custom/proprietary systems, but what if they opened up all the information about those systems for an open source project, say under the Apache Foundation, to integrate some specified data set into their systems?

It surely could not fare any worse than projects for which the government hires contractors.

December 2, 2014

GiveDirectly (Transparency)

Filed under: Open Access,Open Data,Transparency — Patrick Durusau @ 3:53 pm

GiveDirectly

From the post:

Today we’re launching a new website for GiveDirectly—the first major update since www.givedirectly.org went live in 2011.

Our main goal in reimagining the site was to create radical transparency into what we do and how well we do it. We’ve invested a lot to integrate cutting-edge technology into our field model so that we have real-time data to guide internal management. Why not open up that same data to the public? All we needed were APIs to connect the website and our internal field database (which is powered by our technology partner, Segovia).

Transparency is of course a non-profit buzzword, but I usually see it used in reference to publishing quarterly or annual reports, packaged for marketing purposes—not the kind of unfiltered data and facts I want as a donor. We wanted to use our technology to take transparency to an entirely new level.

Two features of the new site that I’m most excited about:

First, you can track how we’re doing on our most important performance metrics, at the same time we do. For example, the performance chart on the home page mirrors the dashboard we use internally to track performance in the field. If recipients aren’t understanding our program, you’ll learn about it when we do. If the follow-up team falls behind or outperforms, metrics will update accordingly. We want to be honest about our successes and failures alike.

Second, you can verify our claims about performance. We don’t think you should have to trust that we’re giving you accurate information. Each “Verify this” tag downloads a csv file with the underlying raw data (anonymized). Every piece of data is generated by a GiveDirectly staff member’s work in the field and is stored using proprietary software; it’s our end-to-end model in action. Explore the data for yourself and absolutely question us on what you find.
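Before moving on, here is a minimal sketch of that “verify it yourself” step, assuming a hypothetical anonymized CSV (the file name and column names are my inventions, not GiveDirectly’s schema):

```python
import pandas as pd

# Load one of the anonymized "Verify this" exports and recompute a
# headline metric instead of taking it on faith.
df = pd.read_csv("givedirectly_followup.csv")

# Share of recipients actually reached by the follow-up team,
# computed from the raw rows rather than a marketing summary.
reached = (df["followup_status"] == "reached").mean()
print(f"Follow-up rate from raw data: {reached:.1%}")
```

A few lines of code against raw data beats a glossy annual report every time.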

’Tis the season for soliciting donations, by every known form of media.

Suggestion: Copy and print out this response:

___________________________, I would love to donate to your worthy cause but before I do, please send a weblink to the equivalent of: http://www.givedirectly.org. Wishing you every happiness this holiday season.

___________________________

Where no response or no equivalent website = no donation.

I first saw this in a tweet by Stefano Bertolo.

November 21, 2014

Land Matrix

Filed under: Data,Government,Transparency — Patrick Durusau @ 6:34 pm

Land Matrix: The Online Public Database on Land Deals

From the webpage:

The Land Matrix is a global and independent land monitoring initiative that promotes transparency and accountability in decisions over land and investment.

This website is our Global Observatory – an open tool for collecting and visualising information about large-scale land acquisitions.

The data represented here is constantly evolving; to make this resource more accurate and comprehensive, we encourage your participation.

The deals collected as data must meet the following criteria:

  • Entail a transfer of rights to use, control or ownership of land through sale, lease or concession;
  • Have been initiated since the year 2000;
  • Cover an area of 200 hectares or more;
  • Imply the potential conversion of land from smallholder production, local community use or important ecosystem service provision to commercial use.

FYI, 200 hectares = 2 square kilometers.
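As a minimal sketch, those criteria translate into a simple filter over deal records (the field names below are my inventions, not the Land Matrix schema):

```python
# Hypothetical deal records; the field names are illustrative only.
deals = [
    {"id": 1, "year": 2005, "hectares": 350, "transfer_of_rights": True,
     "converts_from": "smallholder production"},
    {"id": 2, "year": 1998, "hectares": 500, "transfer_of_rights": True,
     "converts_from": "commercial use"},
]

CONVERTIBLE_USES = {"smallholder production", "local community use",
                    "ecosystem service provision"}

def meets_criteria(deal):
    """Apply the four Land Matrix inclusion criteria to one record."""
    return (deal["transfer_of_rights"]       # rights transferred
            and deal["year"] >= 2000         # initiated since 2000
            and deal["hectares"] >= 200      # 200 hectares (2 km2) or more
            and deal["converts_from"] in CONVERTIBLE_USES)  # to commercial use

qualifying = [d for d in deals if meets_criteria(d)]
print([d["id"] for d in qualifying])  # -> [1]
```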

Land ownership and its transfer are matters of law and law means government.

The project describes its data this way:

The dataset is inherently unreliable, but over time it is expected to become more accurate. Land deals are notoriously un-transparent. In many countries, established procedures for decision-making on land deals do not exist, and negotiations and decisions do not take place in the public realm. Furthermore, a range of government agencies and levels of government are usually responsible for approving different kinds of land deals. Even official data sources in the same country can therefore vary, and none may actually reflect reality on the ground. Decisions are often changed, and this may or may not be communicated publically.

I would start earlier than the year 2000 but the same techniques could be applied along the route of the Keystone XL pipeline. I am assuming that you are aware that pipelines, roads and other public works are not located purely for physical or aesthetic reasons. Yes?

Please take the time to view and support the Land Matrix project and consider similar projects in your community.

If the owners can be run to ground, you may find the parties to the transactions are linked by other “associations.”

November 20, 2014

Conflict of Interest – Reversing the Definition

Filed under: Government,Transparency — Patrick Durusau @ 4:58 pm

Just a quick heads up that the semantics of “conflict of interest” has changed, at least in the context of the US House of Representatives.

Traditionally, the meaning of “conflict of interest” is captured by Wikipedia’s one-liner:

A conflict of interest (COI) is a situation occurring when an individual or organization is involved in multiple interests, one of which could possibly corrupt the motivation.

That seems fairly straightforward.

However, in H.R.1422 — 113th Congress (2013-2014), passed on 11/18/2014, the House authorized paid representatives of industry interests to be appointed to the EPA Science Advisory Board, saying:

SEC. 2. SCIENCE ADVISORY BOARD.(b)(2)(C) – persons with substantial and relevant expertise are not excluded from the Board due to affiliation with or representation of entities that may have a potential interest in the Board’s advisory activities, so long as that interest is fully disclosed to the Administrator and the public and appointment to the Board complies with section 208 of title 18, United States Code;

So, the House of Representatives has just reversed the standard definition of “conflict of interest” to say that hired guns of industry players have no “conflict of interest” sitting on the EPA Science Board, so long as they say they are hired guns.

I thought I was fairly hardened to hearing bizarre things out of government but reversing the definition of “conflict of interest” is a new one on me.

The science board is supposed to be composed of scientists, unsurprisingly. Scientists, by the very nature of their profession, do science: experiments, reports, projects, etc. And no surprise, the scientists on the EPA science panel work on … that’s right, environmental science.

Care to guess who H.R.1422 prohibits from certain advisory activities?

SEC. 2. SCIENCE ADVISORY BOARD.(b)(2)(D)

Board members may not participate in advisory activities that directly or indirectly involve review or evaluation of their own work;

Scientists are excluded from advisory activities where they have expertise.

Being an expert in a field is a “conflict of interest” and being a hired gun is not (so long as being a hired gun is disclosed).

So the revision of “conflict of interest” is even worse than I thought.

I don’t have the heart to amend the Wikipedia article on conflict of interest. Would someone do that for me?

PS: I first saw this at House Republicans just passed a bill forbidding scientists from advising the EPA on their own research by Lindsay Abrams. Lindsay does a great job summarizing the issues and links to the legislation. I followed her link to the bill and reported just the legislative language. I think that is chilling enough.

PPS: Did your member of the House of Representatives vote for this bill?

I first saw this in a tweet by Wilson da Silva.

October 3, 2014

Beyond Light Table

Filed under: Computer Science,Interface Research/Design,Programming,Transparency — Patrick Durusau @ 10:38 am

Beyond Light Table by Chris Granger.

From the post:

I have three big announcements to make today. The first is the official announcement of our next project. We’ve been quietly talking about it over the past few months, but today we want to tell you a bit more about it and finally reveal its name:

eve

Eve is our way of bringing the power of computation to everyone, not by making everyone a programmer but by finding a better way for us to interact with computers. On the surface, Eve is an environment a little like Excel that allows you to “program” simply by moving columns and rows around in tables. Under the covers it’s a powerful database, a temporal logic language, and a flexible IDE that allows you to build anything from a simple website to complex algorithms. Instead of poring over text files full of abstract symbols, you interact with domain editors that are parameterized by grids of data. To build a UI you don’t open a text editor, you just draw it on the screen and drag data to it. It’s much closer to the ideal we’ve always had of just describing what we want and letting the machine do the rest. Eve makes the computer a real tool again – one that doesn’t require decades of training to use.

Imagine a world where everyone has access to computation without having to become a professional programmer – where a scientist doesn’t have to rely on the one person in the lab who knows python, where a child could come up with an idea for a game and build it in a couple of weekends, where your computer can help you organize and plan your wedding/vacation/business. A world where programmers could focus on solving the hard problems without being weighed down by the plumbing. That is the world we want to live in. That is the world we want to help create with Eve.

We’ve found our way to that future by studying the past and revisiting some of the foundational ideas of computing. In those ideas we discovered a simpler way to think about computation and have used modern research to start making it into reality. That reality will be an open source platform upon which anyone can explore and contribute their own ideas.

Chris goes onto announce that they have raised more money and they are looking to make one or more new hires.

Exciting news and I applaud viewing computers as tools, not as oracles that perform operations on data beyond our ken and deliver answers.

Except easy access to computation doesn’t guarantee useful results. Consider the case of automobiles: easy access to complex machines results in some 37,000 deaths and 2.35 million injuries in the U.S. each year.

Easy access to computers for word processing, email, blogging, webpages, Facebook, etc., hasn’t resulted in a single Shakespearean sonnet, much less the complete works of Shakespeare.

Just as practically, when I am dragging and dropping, how do I distinguish success on the iris dataset from “success” on a data set with missing values, which can make a significant difference in the results?
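To make the concern concrete, here is a minimal illustration using NumPy (nothing Eve-specific): the same column of measurements gives different answers depending on how the tool treats a missing value, and a drag-and-drop interface may never tell you which choice it made.

```python
import numpy as np

# The same five measurements, once complete and once with a gap.
complete = np.array([5.1, 4.9, 4.7, 4.6, 5.0])
with_gap = np.array([5.1, 4.9, np.nan, 4.6, 5.0])

print(np.mean(complete))     # 4.86
print(np.mean(with_gap))     # nan -- the gap propagates loudly
print(np.nanmean(with_gap))  # 4.9 -- the gap is silently dropped
```

Neither behavior is wrong, but a user who never sees the choice cannot weigh its effect on the result.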

I am not a supporter of using artificial barriers to exclude people from making use of computation but on the other hand, what weight should be given to their “results?”

As “computation” spreads, will “verification of results” become a new discipline in CS?

July 10, 2014

Peer Review Ring

Filed under: Peer Review,Transparency — Patrick Durusau @ 10:25 am

Scholarly journal retracts 60 articles, smashes ‘peer review ring’ by Fred Barbash.

From the post:

Every now and then a scholarly journal retracts an article because of errors or outright fraud. In academic circles, and sometimes beyond, each retraction is a big deal.

Now comes word of a journal retracting 60 articles at once.

The reason for the mass retraction is mind-blowing: A “peer review and citation ring” was apparently rigging the review process to get articles published.

You’ve heard of prostitution rings, gambling rings and extortion rings. Now there’s a “peer review ring.”

Favorable reviews were entered using fake identities as part of an open peer review process. The favorable reviews resulted in publication of those articles.

This was a peer review ring that depended upon false identities.

If peer review were more transparent, publications could mine the relationships between reviewers and the authors whose papers, grants, and proposals they reviewed, including prior reviews of the same authors and projects, for interesting patterns.
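A minimal sketch of that kind of pattern hunting, assuming a hypothetical log of (reviewer, author) pairs: reciprocal review relationships fall straight out of a directed graph.

```python
import networkx as nx

# Hypothetical (reviewer, author) pairs from a transparent review log.
reviews = [("alice", "bob"), ("bob", "alice"),
           ("carol", "dave"), ("alice", "carol")]

g = nx.DiGraph(reviews)

# "You review mine, I review yours" pairs are an obvious first flag.
reciprocal = {tuple(sorted((u, v))) for u, v in g.edges if g.has_edge(v, u)}
print(reciprocal)  # {('alice', 'bob')}
```

Reciprocity alone proves nothing, but it tells an editor where to look first.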

I first saw this in a tweet by Steven Strogatz.

June 26, 2014

Charities, Transparency and Trade Secrets

Filed under: Transparency — Patrick Durusau @ 7:00 pm

Red Cross: How We Spent Sandy Money Is a ‘Trade Secret’ by Justin Elliott.

From the post:

Just how badly does the American Red Cross want to keep secret how it raised and spent over $300 million after Hurricane Sandy?

The charity has hired a fancy law firm to fight a public request we filed with New York state, arguing that information about its Sandy activities is a “trade secret.”

The Red Cross’ “trade secret” argument has persuaded the state to redact some material, though it’s not clear yet how much since the documents haven’t yet been released.

The documents include “internal and proprietary methodology and procedures for fundraising, confidential information about its internal operations, and confidential financial information,” wrote Gabrielle Levin of Gibson Dunn in a letter to the attorney general’s office.

If those details were disclosed, “the American Red Cross would suffer competitive harm because its competitors would be able to mimic the American Red Cross’s business model for an increased competitive advantage,” Levin wrote.

The letter doesn’t specify who the Red Cross’ “competitors” are.

I see bizarre stories on a regular basis but this is a real “man bites dog” sort of story.

See Justin’s post for the details, such as are known now. I am sure there will be follow up stories on these records.

It may just be my background, but when anyone (government, charity, industry) assures me that information I can’t see is OK, that sets off multiple alarm bells.

You?

PS: Not that I think transparency automatically leads to better government or decision making. I do know that a lack of transparency, cf. the NSA, leads to very poor decision making.

March 4, 2014

Beyond Transparency

Filed under: Open Data,Open Government,Transparency — Patrick Durusau @ 1:53 pm

Beyond Transparency, edited by Brett Goldstein and Lauren Dyson.

From the webpage:

The rise of open data in the public sector has sparked innovation, driven efficiency, and fueled economic development. And in the vein of high-profile federal initiatives like Data.gov and the White House’s Open Government Initiative, more and more local governments are making their foray into the field with Chief Data Officers, open data policies, and open data catalogs.

While still emerging, we are seeing evidence of the transformative potential of open data in shaping the future of our civic life. It’s at the local level that government most directly impacts the lives of residents—providing clean parks, fighting crime, or issuing permits to open a new business. This is where there is the biggest opportunity to use open data to reimagine the relationship between citizens and government.

Beyond Transparency is a cross-disciplinary survey of the open data landscape, in which practitioners share their own stories of what they’ve accomplished with open civic data. It seeks to move beyond the rhetoric of transparency for transparency’s sake and towards action and problem solving. Through these stories, we examine what is needed to build an ecosystem in which open data can become the raw materials to drive more effective decision-making and efficient service delivery, spur economic activity, and empower citizens to take an active role in improving their own communities.

Let me list the titles for two (2) parts out of five (5):

  • PART 1 Opening Government Data
    • Open Data and Open Discourse at Boston Public Schools Joel Mahoney
    • Open Data in Chicago: Game On Brett Goldstein
    • Building a Smarter Chicago Dan X O’Neil
    • Lessons from the London Datastore Emer Coleman
    • Asheville’s Open Data Journey: Pragmatics, Policy, and Participation Jonathan Feldman
  • PART 2 Building on Open Data
    • From Entrepreneurs to Civic Entrepreneurs, Ryan Alfred, Mike Alfred
    • Hacking FOIA: Using FOIA Requests to Drive Government Innovation, Jeffrey D. Rubenstein
    • A Journalist’s Take on Open Data, Elliott Ramos
    • Oakland and the Search for the Open City, Steve Spiker
    • Pioneering Open Data Standards: The GTFS Story, Bibiana McHugh

Steve Spiker captures my concerns about the efficacy of “open data” in his opening sentence:

At the center of the Bay Area lies an urban city struggling with the woes of many old, great cities in the USA, particularly those in the rust belt: disinvestment, white flight, struggling schools, high crime, massive foreclosures, political and government corruption, and scandals. (Oakland and the Search for the Open City)

It may well be that I agree with “open data,” in part because I have no real data to share. So any sharing of data is going to benefit me and whatever agenda I want to pursue.

People who are pursuing their own agendas without open data have nothing to gain from an open playing field and more than a little to lose. Particularly if they are on the corrupt side of public affairs.

All the more reason to pursue open data in my view but with the understanding that every line of data access benefits some and penalizes others.

Take the long-standing tradition of not publishing who meets with the President of the United States, justified on the basis that the President needs open and frank advice from people who feel free to speak openly.

That’s one explanation. Another is that being clubby with media moguls would look inconvenient while the U.S. trade delegation is pushing a pro-media position, to the detriment of us all.

When open data is used to take down members of Congress, the White House, heads and staffs of agencies, it will truly have arrived.

Until then, open data is just whistling as it walks past a graveyard in the dark.

I first saw this in a tweet by ladyson.

February 23, 2014

Making the meaning of contracts visible…

Filed under: Law,Law - Sources,Legal Informatics,Transparency,Visualization — Patrick Durusau @ 4:27 pm

Making the meaning of contracts visible – Automating contract visualization by Stefania Passera, Helena Haapio, Michael Curtotti.

Abstract:

The paper, co-authored by Passera, Haapio and Curtotti, presents three demos of tools to automatically generate visualizations of selected contract clauses. Our early prototypes include common types of term and termination, payment and liquidated damages clauses. These examples provide proof-of-concept demonstration tools that help contract writers present content in a way readers pay attention to and understand. These results point to the possibility of document assembly engines compiling an entirely new genre of contracts, more user-friendly and transparent for readers and not too challenging to produce for lawyers.

Demo.

Slides.

From slides 2 and 3:

Need for information to be accessible, transparent, clear and easy to understand
   Contracts are no exception.

Benefits of visualization

  • Information encoded explicitly is easier to grasp & share
  • Integrating pictures & text prevents cognitive overload by distributing effort on 2 different processing systems
  • Visual structures and cues act as paralanguage, reducing the possibility of misinterpretation

Sounds like the output from a topic map, doesn’t it?

A contract is “explicit and transparent” to a lawyer, but that doesn’t mean everyone reading it sees the contract as “explicit and transparent.”

Making what the lawyer “sees” explicit, in other words, is another identification of the same subject, just a different way to describe it.

What’s refreshing is the recognition that not everyone understands the same description, hence the need for alternative descriptions.
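A minimal sketch of that idea in code, with invented clause identifiers: two descriptions that share a subject identifier merge into a single topic carrying both names.

```python
from collections import defaultdict

# Two descriptions of the same contract clause, keyed by a shared
# subject identifier (the identifier and fields are invented here).
descriptions = [
    {"subject": "clause:liquidated-damages",
     "name": "Liquidated damages clause", "audience": "lawyer"},
    {"subject": "clause:liquidated-damages",
     "name": "What you owe if the work is late", "audience": "everyone"},
]

# Merge on subject identity, topic-map style: one subject, many names.
topics = defaultdict(list)
for d in descriptions:
    topics[d["subject"]].append((d["audience"], d["name"]))

for subject, names in topics.items():
    print(subject)
    for audience, name in names:
        print(f"  [{audience}] {name}")
```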

Some additional leads to explore on these authors:

Stefania Passera Homepage with pointers to her work.

Helena Haapio Profile at Lexpert, pointers to her work.

Michael Curtotti – Computational Tools for Reading and Writing Law.

There is a growing interest in making the law transparent to non-lawyers, which is going to require a lot more than “this is the equivalent of that, because I say so.” Particularly for re-use of prior mappings.

Looks like a rapid growth area for topic maps to me.

You?

I first saw this at: Passera, Haapio and Curtotti: Making the meaning of contracts visible – Automating contract visualization.

January 26, 2014

Pricing “The Internet of Everything”

Filed under: Transparency,WWW — Patrick Durusau @ 8:11 pm

I was reading Embracing the Internet of Everything To Capture Your Share of $14.4 Trillion by Joseph Bradley, Joel Barbier, and Doug Handler, when I realized their projected Value at Stake of $14.4 trillion left out an important number: the price of an Internet of Everything.

Costs are usually calculated as the price of a product multiplied by the quantity purchased. Let’s start there to evaluate Cisco’s pricing.

In How Many Things Are Currently Connected To The “Internet of Things” (IoT)?, appearing in Forbes, Rob Soderberry, Cisco Executive, said that:

the number of connected devices reached 8.7 billion in 2012.

The Internet of Everything (IoE) paper projects 50 billion “things” being connected by 2020.

Roughly that’s 41.3 billion more connections than exist at present.

Let’s take some liberties with Cisco’s numbers. Assume the networking in each device, leaving aside the cost of a new device with networking capability, is $10. So $10 times 41.3 billion connections = $413 billion. The projected ROI just dropped from $14.4 trillion to roughly $14 trillion.

Let’s further assume that Internet connectivity has radically dropped in price, so it costs only $10 per month. For our additional 41.3 billion devices, that is $10 times 41.3 billion things times 12 months, or $4.956 trillion per year. The projected ROI just dropped to roughly $9 trillion.
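The back-of-the-envelope arithmetic, spelled out (every input is an assumption from the preceding paragraphs, not a Cisco figure):

```python
# Back-of-the-envelope IoE connection costs, using the assumptions above.
projected_things = 50e9        # Cisco's 2020 projection
connected_2012 = 8.7e9         # Soderberry's 2012 figure
new_connections = projected_things - connected_2012  # 41.3 billion

hardware = new_connections * 10            # $10 of networking per device
connectivity = new_connections * 10 * 12   # $10/month, per device, per year

print(f"New connections:   {new_connections / 1e9:.1f} billion")
print(f"One-time hardware: ${hardware / 1e12:.3f} trillion")
print(f"Connectivity/year: ${connectivity / 1e12:.3f} trillion")
# -> 41.3 billion, $0.413 trillion, $4.956 trillion
```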

I say the ROI “dropped,” but that’s not really true. Someone is getting paid for Internet access, the infrastructure to support it, etc. Can you spell “C-i-s-c-o?”

In terms of complexity, consider Mark Zuckerberg’s (Facebook founder) Internet.org, which is working with Ericsson, MediaTek, Nokia, Opera, Qualcomm, and Samsung:

to help bring web access to the five billion people who are not yet connected. (From: Mark Zuckerberg launches Internet.org to help bring web access to the whole world by Mark Wilson.)

A coalition of major players working on connecting 5 billion people versus Cisco’s hand-waving about connecting 50 billion “things.”

That’s not a cost estimate but it does illustrate the enormity of the problem of creating the IoE.

But the cost of the proposed IoE isn’t just connecting to the Internet.

For commercial ground vehicles the Cisco report says:

As vehicles become more connected with their environment (road, signals, toll booths, other vehicles, air quality reports, inventory systems), efficiencies and safety greatly increase. For example, the driver of a vending-machine truck will be able to look at a panel on the dashboard to see exactly which locations need to be replenished. This scenario saves time and reduces costs.

Just taking roads and signals, do you know how much is spent on highway and street construction in the United States every month?

Would you believe it runs at an annualized rate of between $77 billion and $83+ billion? US Highway and Street Construction Spending:
82.09B USD (seasonally adjusted annual rate) for Nov 2013

And the current state of road infrastructures in the United States?

Forty-two percent of America’s major urban highways remain congested, costing the economy an estimated $101 billion in wasted time and fuel annually. While the conditions have improved in the near term, and Federal, state, and local capital investments increased to $91 billion annually, that level of investment is insufficient and still projected to result in a decline in conditions and performance in the long term. Currently, the Federal Highway Administration estimates that $170 billion in capital investment would be needed on an annual basis to significantly improve conditions and performance. (2013 Report Card: Roads D+. For more infrastructure reports see: 2013 Report Card )

I read that to say an estimated $170 billion is needed annually just to improve current roads. Yes?

That doesn’t include the costs of Internet infrastructure, the delivery vehicle, other vehicles, inventory systems, etc.

I am certain that however and whenever the Internet of Things comes into being, Cisco, as part of the core infrastructure now, will prosper. I can see Cisco’s ROI from the IoE.

What I don’t see is the ROI for the public or private sector, even assuming the Cisco numbers are spot on.

Why? Because there is no price tag for the infrastructure to make the IoE a reality. Someone, maybe a lot of someones, will be paying that cost.

If you encounter cost estimates sufficient for players in the public or private sectors to make their own ROI calculations, please point them out. Thanks!

PS: A future Internet more to my taste would have tagged Cisco’s article with “speculation,” “no cost data,” etc. as aids for unwary readers.

PPS: Apologies for only U.S. cost figures. Other countries will have similar issues but I am not as familiar with where to find their infrastructure data.
