Archive for the ‘Government Data’ Category

“This culture of leaking must stop.” Taking up Sessions’ Gage

Friday, August 4th, 2017

Jeff Sessions, the current (4 August 2017) Attorney General of the United States, wants to improve on Barack Obama’s legacy as the most secretive presidency of the modern era.

Sessions has announced a tripling of Justice Department probes into leaks and a review of guidelines for subpoenas for members of the news media. Attorney General says Justice Dept. has tripled the number of leak probes. (Media subpoenas are an effort to discover media sources and hence to plug the “leaks.”)

Sessions has thrown down his gage, declaring war on occasional transparency from government leakers. Indirectly, that war will include members of the media as casualties.

Shakespeare penned the best response for taking up Sessions’ gage:

Cry ‘Havoc,’ and let slip the dogs of war;

In case you don’t know the original sense of “Havoc:”

The military order Havoc! was a signal given to the English military forces in the Middle Ages to direct the soldiery (in Shakespeare’s parlance ‘the dogs of war’) to pillage and chaos. Cry havoc and let slip the dogs of war

It’s on all of us to create enough chaos to protect leakers and members of the media who publish their leaks.

Observations – Not Instructions

Data access: Phishing emails succeed 33% of the time. Do they punish would-be leakers who fall for phishing emails?

Exfiltration: Tracing select documents to a leaker is commonplace. How do you trace an entire server disk? The larger and more systematic the data haul, the greater the difficulty in pinning the leak on particular documents. (Back to school specials often include multi-terabyte drives.)

Protect the Media: Full drive leaks posted to a Torrent or Dark Web server mean the media can answer subpoenas with: go to: https://some-location. 😉

BTW, full drive leaks provide transparency for the relationship between the leaked data and media reports. Accountability is as important for the media as the government.

One or more of my observations may constitute crimes depending upon your jurisdiction.

Which I guess is why Nathan Hale is recorded as saying:

Gee, that sounds like a crime. You know, I could get arrested, even executed. None for me please!

Not!

Nathan Hale volunteered to be a spy, was caught and executed, having said:

I only regret, that I have but one life to lose for my country.

Question for you:

Are you a ‘dog of war’ making the government bleed data?

PS: As a security measure, don’t write that answer down or tell anyone. When you read about leaks, you can inwardly smile and know you played your part.

“But it feels better when I sneak”

Wednesday, August 2nd, 2017

Email prankster tricks White House officials by Graham Cluley is ample evidence for why you should abandon FOIA requests in favor of phishing/hacking during the reign of Donald Trump.

People can and do obtain mountains of information using FOIA requests, but in the words of Ray Parker Jr., “The Other Woman”:

“Now I hate to have to cheat
But it feels better when I sneak”

In addition to feeling better, not using FOIA requests during the Trump regime results in:

  1. Access to competitor’s data deposited with the government
  2. Avoids the paperwork and delay of the FOIA process
  3. Bidding and contract data
  4. Develop long-term stealth access that spans presidencies
  5. Incompetence of staff gives broad and deep access across agencies
  6. Mine papers of extremely secretive prior presidents, like Obama
  7. Transparency when least expected and most inconvenient

If that sounds wishful, remember Cluley reports the “technique” used by the prankster was: 1) create an email account in the name of a White House staffer, 2) send an email from that account. This has to be a new low bar for “fake” emails.

Can you afford to be a goody two shoes?

Manning Leaks — No Real Harm (Database of Government Liars Anyone?)

Tuesday, June 20th, 2017

Secret Government Report: Chelsea Manning Leaks Caused No Real Harm by Jason Leopold.

From the post:

In the seven years since WikiLeaks published the largest leak of classified documents in history, the federal government has said they caused enormous damage to national security.

But a secret, 107-page report, prepared by a Department of Defense task force and newly obtained by BuzzFeed News, tells a starkly different story: It says the disclosures were largely insignificant and did not cause any real harm to US interests.

Regarding the hundreds of thousands of Iraq-related military documents and State Department cables provided by the Army private Chelsea Manning, the report assessed “with high confidence that disclosure of the Iraq data set will have no direct personal impact on current and former U.S. leadership in Iraq.”

The 107 page report, redacted, runs 35 pages. Thanks to BuzzFeed News for prying that much of a semblance of the truth out of the government.

It is further proof that US prosecutors and other federal government representatives lie to the courts, the press and the public, whenever it suits their purposes.

Anyone with transcripts from the original Manning hearings, should identify statements by prosecutors at variance with this report, noting the prosecutor’s name, rank and recording the page/line reference in the transcript.

That individual prosecutors and federal law enforcement witnesses lie is a commonly known fact. What I haven’t seen, is a central repository of all such liars and the lies they have told.

I mention a central repository because to say one or two prosecutors have lied or been called down by a judge grabs a headline, but showing a pattern over decades of lying by the state, that could move to an entirely different level.

Judges, even conservative ones (especially conservative ones?), don’t appreciate being lied to by anyone, including the state.

The state has chosen lying as its default mode of operation.

Let’s help them wear that banner.

Interested?

DoD Audit Ready By End of September (Which September? Define “ready.”)

Monday, June 19th, 2017

For your Monday amusement: Pentagon Official: DoD will be audit ready by end of September by Eric White.

From the post:

In today’s Federal Newscast, the Defense Department’s Comptroller David Norquist said the department has been properly preparing for its deadline for audit readiness.

The Pentagon’s top financial official said DoD will meet its deadline to be “audit ready” by the end of September. DoD has been working toward the deadline for the better part of seven years, and as the department pointed out in its most recent audit readiness update, most federal agencies haven’t earned clean opinions until they’ve been under full-scale audits for several years. But newly-confirmed comptroller David Norquist said now’s the time to start. He said the department has already contracted with several outside accounting firms to perform the audits, both for the Defense Department’s various components and an overarching audit of the entire department.

I’m reminded of the alleged letter by the Duke of Wellington to Whitehall:

Gentlemen,

Whilst marching from Portugal to a position which commands the approach to Madrid and the French forces, my officers have been diligently complying with your requests which have been sent by H.M. ship from London to Lisbon and thence by dispatch to our headquarters.

We have enumerated our saddles, bridles, tents and tent poles, and all manner of sundry items for which His Majesty’s Government holds me accountable. I have dispatched reports on the character, wit, and spleen of every officer. Each item and every farthing has been accounted for, with two regrettable exceptions for which I beg your indulgence.

Unfortunately the sum of one shilling and ninepence remains unaccounted for in one infantry battalion’s petty cash and there has been a hideous confusion as to the number of jars of raspberry jam issued to one cavalry regiment during a sandstorm in western Spain. This reprehensible carelessness may be related to the pressure of circumstance, since we are at war with France, a fact which may come as a bit of a surprise to you gentlemen in Whitehall.

This brings me to my present purpose, which is to request elucidation of my instructions from His Majesty’s Government so that I may better understand why I am dragging an army over these barren plains. I construe that perforce it must be one of two alternative duties, as given below. I shall pursue either one with the best of my ability, but I cannot do both:

1. To train an army of uniformed British clerks in Spain for the benefit of the accountants and copy-boys in London or perchance.

2. To see to it that the forces of Napoleon are driven out of Spain.

Your most obedient servant,

Wellington

The primary function of any military organization is suppression of the currently designated “enemy.”

Congress should direct the Department of Homeland Security (DHS) to audit the DoD.

Instead of chasing fictional terrorists, DHS staff would be chasing known-to-exist dollars and alleged expenses.

OpSec Reminder

Saturday, June 17th, 2017

Catalin Cimpanu covers a hack of the DoD’s Enhanced Mobile Satellite Services (EMSS) satellite phone network in 2014 in British Hacker Used Home Internet Connection to Hack the DoD in 2014.

The details are amusing but the most important part of Cimpanu’s post is a reminder about OpSec:


In a statement released yesterday, the NCA said it had a solid case against Caffrey because they traced back the attack to his house, and found the stolen data on his computer. Furthermore, officers found an online messaging account linked to the hack on Caffrey’s computer.

Caffrey’s OpSec stumbles:

  1. Connection traced to his computer (No use of Tor or VPN)
  2. Data found on his hard drive (No use of encryption and/or storage elsewhere)
  3. Online account used in hack operated from his computer (Again, no use of Tor or VPN)

I’m sure the hack was a clever one but Caffrey’s OpSec was less so. Decidedly less so.

PS: The National Crime Agency (NCA) report on Caffrey.

FOIA Success Prediction

Friday, June 16th, 2017

Will your FOIA request succeed? This new machine will tell you by Benjamin Mullin.

From the post:

Many journalists know the feeling: There could be a cache of documents that might confirm an important story. Your big scoop hinges on one question: Will the government official responsible for the records respond to your FOIA request?

Now, thanks to a new project from a data storage and analysis company, some of the guesswork has been taken out of that question.

Want to know the chances your public records request will get rejected? Plug it into FOIA Predictor, a probability analysis web application from Data.World, and it will provide an estimation of your success based on factors including word count, average sentence length and specificity.

Accuracy?

Best way to gauge that is experience with your FOIA requests.

Try starting at MuckRock.com.
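To get a feel for the kind of surface features Mullin mentions, here is a minimal sketch that computes word count and average sentence length for a draft request. It is not how FOIA Predictor works internally (the post doesn't say); the function name, the sentence-splitting heuristic, and the sample draft are all my own assumptions.

```python
import re

def request_features(text: str) -> dict:
    """Compute simple surface features of a FOIA request draft."""
    # Rough heuristic: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of letters, digits, apostrophes and hyphens.
    words = re.findall(r"[A-Za-z0-9'-]+", text)
    avg_len = len(words) / len(sentences) if sentences else 0.0
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": avg_len,
    }

# Made-up draft request, for illustration only.
draft = (
    "I request all emails sent by the Office of the Comptroller in May 2017. "
    "Please include attachments. I am willing to pay fees up to $25."
)
features = request_features(draft)
print(features)
```

Comparing these numbers across your own granted and rejected requests is one cheap way to check whether such features mean anything for the agencies you deal with.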

Enjoy!

Real Talk on Reality (Knowledge Gap on Leaking)

Friday, June 9th, 2017

Real Talk on Reality : Leaking is high risk by the grugq.

From the post:

On June 5th The Intercept released an article based on an anonymously leaked Top Secret NSA document. The article was about one aspect of the Russian cyber campaign against the 2016 US election — the targeting of election device manufacturers. The relevance of this aspect of the Russian operation is not exactly clear, but we’ll address that in a separate post because… just hours after The Intercept’s article went live the US Department of Justice released an affidavit (and search warrant) covering the arrest of Reality Winner — the alleged leaker. Let’s look at that!

You could teach a short course on leaking from this one post but there is one “meta” issue that merits your attention.

The failures of Reality Winner and the Intercept signal users need educating in the art of information leaking.

With widespread tracking of web browsers, training on information leaking needs to be pushed to users. It would stand out if one member of the military requested and was sent an email lesson on leaking. An email that went to everyone in a particular command, not so much.

Public Service Announcements (PSAs) in web zines, as ads, etc. with only the barest of tips, is another mechanism to consider.

If you are very creative, perhaps “Mr. Bill” claymation episodes with one principle of leaking each? Need to be funny enough that viewing/sharing isn’t suspicious.

Other suggestions?

Raw FBI Uniform Crime Report (UCR) Files for 2015 (NICAR Database Library)

Friday, June 9th, 2017

IRE & NICAR to freely publish unprocessed data by Charles Minshew.

From the post:

Inspired by our members, IRE is pleased to announce the first release of raw, unprocessed data from the NICAR Database Library.

The contents of the FBI’s Uniform Crime Report (UCR) master file for 2015 are now available for free download on our website. The package contains the original fixed-width files, data dictionaries for the tables as well as the FBI’s UCR user guide. We are planning subsequent releases of other raw data that is not readily available online.

The yearly data from the FBI details arrest and offense numbers for police agencies across the United States. If you download this unprocessed data, expect to do some work to get it in a useable format. The data is fixed-width, across multiple tables, contains many records on a single row that need to be unpacked and in some cases decoded, before being cleaned and imported for use in programs like Excel or your favorite database manager. Not up to the task? We do all of this work in the version of the data that we will soon have for sale in the Database Library.

I have peeked at the data and documentation files and “raw” is the correct term.

Think of it as great exercise for when an already cleaned and formatted data set isn’t available.
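As a warm-up, a minimal sketch of slicing a fixed-width record in Python. The field names and column positions below are hypothetical placeholders, not the UCR layout; the real positions have to come from the data dictionaries in the download, and the "many records on a single row" unpacking is not attempted here.

```python
# Hypothetical layout: (field name, start, end), 0-based, end exclusive.
# Replace with positions taken from the data dictionary for the table.
LAYOUT = [
    ("state_code", 0, 2),
    ("agency_id", 2, 9),
    ("year", 9, 13),
    ("offense_count", 13, 18),
]

def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped strings."""
    return {name: line[start:end].strip() for name, start, end in layout}

# Made-up record that matches the hypothetical layout above.
sample = "GAGA00102201500042"
record = parse_fixed_width(sample)
record["offense_count"] = int(record["offense_count"])
print(record)
```

Keeping the layout as data rather than hard-coded slices makes it easier to swap in a different table's dictionary later.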

More to follow on processing this data set.

(Legal) Office of Personnel Management Data!

Friday, June 9th, 2017

We’re Sharing A Vast Trove Of Federal Payroll Records by Jeremy Singer-Vine.

From the post:

Today, BuzzFeed News is sharing an enormous dataset — one that sheds light on four decades of the United States’ federal payroll.

The dataset contains hundreds of millions of rows and stretches all the way back to 1973. It provides salary, title, and demographic details about millions of U.S. government employees, as well as their migrations into, out of, and through the federal bureaucracy. In many cases, the data also contains employees’ names.

We obtained the information — nearly 30 gigabytes of it — from the U.S. Office of Personnel Management, via the Freedom of Information Act (FOIA). Now, we’re sharing it with the public. You can download it for free on the Internet Archive.

This is the first time, it seems, that such extensive federal payroll data is freely available online, in bulk. (The Asbury Park Press and FedsDataCenter.com both publish searchable databases. They’re great for browsing, but don’t let you download the data.)

We hope that policy wonks, sociologists, statisticians, fellow journalists — or anyone else, for that matter — find the data useful.

We obtained the information through two Freedom of Information Act requests to OPM. The first chunk of data, provided in response to a request filed in September 2014, covers late 1973 through mid-2014. The second, provided in response to a request filed in December 2015, covers late 2014 through late 2016. We have submitted a third request, pending with the agency, to update the data further.

Between our first and second requests, OPM announced it had suffered a massive computer hack. As a result, the agency told us, it would no longer release certain information, including the employee “pseudo identifier” that had previously disambiguated employees with common names.

What a great data release! Kudos and thanks to BuzzFeed News!

If you need the “pseudo identifiers” for the second or following releases and/or data for the employees withheld (generally the more interesting ones), consult data from the massive computer hack.

Or obtain the excluded data directly from the Office of Personnel Management without permission.

Enjoy!

Fiscal Year 2018 Budget

Tuesday, May 23rd, 2017

Fiscal Year 2018 Budget.

In the best pay-to-play tradition, the Government Printing Office (GPO) has these volumes for sale:

America First: A Budget Blueprint To Make America Great Again By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00719-9 ISBN: 9780160937620. Price: $10.00.

Budget of the United States Government, FY 2018 (Paperback Book) By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00723-7 ISBN: 9780160939228. Price: $38.00.

Appendix, Budget of the United States Government, FY 2018 By: Executive Office of the President, Office of Management and Budget GPO Stock # 041-001-00720-2 ISBN: 9780160939334. Price: $79.00.

Budget of the United States Government, FY 2018 (CD-ROM) By: Executive Office of the President, Office of Management and Budget GPO Stock # 041-001-00722-9 ISBN: 9780160939358. Price: $29.00.

Analytical Perspectives, Budget of the United States Government, FY 2018 By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00721-1 ISBN: 9780160939341. Price: $56.00.

Major Savings and Reforms: Budget of the United States Government, Fiscal Year 2018 By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00724-5 ISBN: 9780160939457. Price: $35.00.

If someone doesn’t beat me to it (very likely), I will be either uploading the CD-ROM and/or pointing you to a location with the contents of the CD-ROM.

As citizens, whether you voted or not, you should have the opportunity to verify news accounts, charges and counter-charges with regard to the budget.

What’s Up With Data Padding? (Regulations.gov)

Wednesday, March 29th, 2017

I forgot to mention in Copyright Troll Hunting – 92,398 Possibles -> 146 Possibles that, while using LibreOffice, I deleted a large number of columns that were either N/A only or not relevant for troll-mining.zip.

After removal of the “no last name” records, the following fields had N/A for all records, except as noted:

  1. L – Implementation Date
  2. M – Effective Date
  3. N – Related RINs
  4. O – Document SubType (Comment(s))
  5. P – Subject
  6. Q – Abstract
  7. R – Status – (Posted, except for 2)
  8. S – Source Citation
  9. T – OMB Approval Number
  10. U – FR Citation
  11. V – Federal Register Number (8 exceptions)
  12. W – Start End Page (8 exceptions)
  13. X – Special Instructions
  14. Y – Legacy ID
  15. Z – Post Mark Date
  16. AA – File Type (1 docx)
  17. AB – Number of Pages
  18. AC – Paper Width
  19. AD – Paper Length
  20. AE – Exhibit Type
  21. AF – Exhibit Location
  22. AG – Document Field_1
  23. AH – Document Field_2
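I found these by eyeballing columns in LibreOffice, but the same check can be scripted. A minimal sketch, assuming the export is a CSV with a header row and that padding shows up as the literal string "N/A" or an empty cell; the sample rows are made up for illustration.

```python
import csv
import io

def na_only_columns(rows, na_values=("N/A", "")):
    """Return header names whose value is N/A (or empty) in every row."""
    rows = list(rows)
    if not rows:
        return []
    return [
        name for name in rows[0].keys()
        if all((row[name] or "").strip() in na_values for row in rows)
    ]

# Tiny stand-in for the export; ids and values are illustrative only.
sample_csv = io.StringIO(
    "Document ID,Subject,Abstract,Status\n"
    "X-0001,N/A,N/A,Posted\n"
    "X-0002,N/A,N/A,Posted\n"
)
noise = na_only_columns(csv.DictReader(sample_csv))
print(noise)
```

For the real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample. Columns with a handful of exceptions (like V and W above) will not be reported, which is the behavior you want before deleting anything.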

Regulations.gov, not the Copyright Office, is responsible for the collection and management of comments, including the bulked up export of comments.

From the state of the records, one suspects the “bulking up” is NOT an artifact of the export but represents the storage of each record.

One way to test that theory would be a query on the noise fields via the API for Regulations.gov.

The documentation for the API is out-dated: the Field References documentation lacks the Document Detail (field AI), which contains the URL to access the comment.

The closest thing I could find was:

fileFormats Formats of the document, included as URLs to download from the API

How easy/hard it will be to download attachments isn’t clear.

BTW, the comment pages themselves are seriously puffed up. Take https://www.regulations.gov/document?D=COLC-2015-0013-52236.

Saved to disk: 148.6 KB.

Content of the comment: 2.5 KB.

The content of the comment is 1.6 % of the delivered webpage.
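The arithmetic, for anyone who wants to repeat it on other pages. This uses only the two rounded sizes quoted above, so it comes out near 1.7%, in the same ballpark as the figure given; the exact byte counts would settle the last digit.

```python
page_kb = 148.6    # size of the saved page, from the post
comment_kb = 2.5   # size of the comment text, from the post

signal = comment_kb / page_kb   # share of the page that is comment text
noise = 1 - signal              # everything else
print(f"signal: {signal:.1%}, noise: {noise:.1%}")
```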

It must have taken serious effort to achieve a 98.4% noise to 1.6% signal ratio.

How transparent is data when you have to mine for the 1.6% that is actual content?

Executive Orders (Bulk Data From Federal Register)

Tuesday, January 31st, 2017

Executive Orders

From the webpage:

The President of the United States manages the operations of the Executive branch of Government through Executive orders. After the President signs an Executive order, the White House sends it to the Office of the Federal Register (OFR). The OFR numbers each order consecutively as part of a series, and publishes it in the daily Federal Register shortly after receipt.

Executive orders issued since 1994 are available as a single bulk download and as a bulk download by President, or you can browse by President and year from the list below. More details about our APIs and other developer tools can be found on our developer pages.

Don’t ignore the developer pages.

Whether friend or foe of the current regime in Washington, the FederalRegister.gov API enables access to all the regulatory material published in the Federal Register. Use it.
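A minimal sketch of pulling executive orders through the API, standard library only. The endpoint and the `conditions[...]` parameter names are as I recall them from the developer pages, so treat them as assumptions and check there before relying on them; the function names are mine.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://www.federalregister.gov/api/v1/documents.json"

def executive_orders_url(per_page=20, page=1):
    """Build a search URL for presidential documents of type executive order."""
    params = [
        ("conditions[type][]", "PRESDOCU"),  # assumed: presidential documents
        ("conditions[presidential_document_type]", "executive_order"),  # assumed
        ("per_page", per_page),
        ("page", page),
    ]
    return f"{BASE}?{urlencode(params)}"

def fetch_titles(url):
    """Fetch one page of results and return the document titles (network call)."""
    with urlopen(url) as resp:
        data = json.load(resp)
    return [doc.get("title") for doc in data.get("results", [])]

url = executive_orders_url(per_page=5)
print(url)
# titles = fetch_titles(url)  # uncomment to query the live API
```

Building the URL separately from fetching it makes it easy to paste into a browser first and see what actually comes back.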

It should be especially useful in light of Presidential Executive Order on Reducing Regulation and Controlling Regulatory Costs, which provides in part:


Sec. 2. Regulatory Cap for Fiscal Year 2017. (a) Unless prohibited by law, whenever an executive department or agency (agency) publicly proposes for notice and comment or otherwise promulgates a new regulation, it shall identify at least two existing regulations to be repealed.

Disclaimer: Any resemblance to an executive order is purely coincidental:

The CIA’s Secret History Is Now Online [Indexing, Mapping, NLP Anyone?]

Wednesday, January 18th, 2017

The CIA’s Secret History Is Now Online by Jason Leopold.

From the post:

Decades ago, the CIA declassified a 26-page secret document cryptically titled “clarifying statement to Fidel Castro concerning assassination.”

It was a step toward greater transparency for one of the most secretive of all federal agencies. But to find out what the document actually said, you had to trek to the National Archives in College Park, Maryland, between the hours of 9 a.m. and 4:30 p.m. and hope that one of only four computers designated by the CIA to access its archives would be available.

But today the CIA posted the Castro record on its website along with more than 12 million pages of the agency’s other declassified documents that have eluded the public, journalists, and historians for nearly two decades. You can view the documents here.

The title of the Castro document, as it turns out, was far more interesting than the contents. It includes a partial transcript of a 1977 transcript between Barbara Walters and Fidel Castro in which she asked the late Cuban dictator whether he had “proof” of the CIA’s last attempt to assassinate him. The transcript was sent to Adm. Stansfield Turner, the CIA director at the time, by a public affairs official at the agency with a note highlighting all references to CIA.

But that’s just one of the millions of documents, which date from the 1940s to the 1990s and are wide-ranging, covering everything from Nazi war crimes to mind-control experiments to the role the CIA played in overthrowing governments in Chile and Iran. There are also secret documents about a telepathy and precognition program known as Star Gate, files the CIA kept on certain media publications, such as Mother Jones, photographs, more than 100,000 pages of internal intelligence bulletins, policy papers, and memos written by former CIA directors.

Michael Best, @NatSecGeek has pointed out the “CIA de-OCRed at least some of the CREST files before they uploaded them.”

Spy agency class petty. Grant public access but force the restoration of text search.

The restoration of text search work is underway so next steps will be indexing, NLP, mapping, etc.
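Once text is back, the indexing step can start small. A minimal sketch of an inverted index with AND search, assuming the re-OCRed text is available as plain strings keyed by some document id; the ids and text below are placeholders, not CREST records.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every token in the query."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return set()
    hits = [index.get(t, set()) for t in tokens]
    return set.intersection(*hits)

# Placeholder ids and text, standing in for re-OCRed pages.
docs = {
    "doc-1": "Memorandum on the Star Gate program",
    "doc-2": "Internal bulletin: program budget",
}
index = build_index(docs)
print(search(index, "star gate"))
```

At 12 million pages you will want a real search engine rather than a Python dict, but the data structure underneath is the same idea.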

A great set of documents to get ready for future official and unofficial leaks of CIA documents.

Enjoy!

PS: Curious if any of the search engine vendors will use CREST as demonstration data? Non-trivial size, interesting search issues, etc.

Ask at the next search conference.

CIA Cartography [Comparison to other maps?]

Monday, November 28th, 2016

CIA Cartography

From the webpage:

Tracing its roots to October 1941, CIA’s Cartography Center has a long, proud history of service to the Intelligence Community (IC) and continues to respond to a variety of finished intelligence map requirements. The mission of the Cartography Center is to provide a full range of maps, geographic analysis, and research in support of the Agency, the White House, senior policymakers, and the IC at large. Its chief objectives are to analyze geospatial information, extract intelligence-related geodata, and present the information visually in creative and effective ways for maximum understanding by intelligence consumers.

Since 1941, the Cartography Center maps have told the stories of post-WWII reconstruction, the Suez crisis, the Cuban Missile crisis, the Falklands War, and many other important events in history.

There you will find:

Cartography Tools 211 photos

Cartography Maps 1940s 22 photos

Cartography Maps 1950s 14 photos

Cartography Maps 1960s 16 photos

Cartography Maps 1970s 19 photos

Cartography Maps 1980s 12 photos

Cartography Maps 1990s 16 photos

Cartography Maps 2000s 16 photos

Cartography Maps 2010s 15 photos

The albums have this motto at the top:

CIA Cartography Center has been making vital contributions to our Nation’s security, providing policymakers with crucial insights that simply cannot be conveyed through words alone.

President-elect Trump is said to be gaining foreign intelligence from sources other than his national security briefings. Trump is ignoring daily intelligence briefings, relying on ‘a number of sources’ instead. That report is based on a Washington Post account, which puts its credibility somewhere between a conversation overheard in a laundromat and a stump speech by a member of Congress.

Assuming Trump is gaining intelligence from other sources, just how good are other sources of intelligence?

This release of maps by the CIA, some 160 maps spread from the 1940’s to the 2010’s, provides one axis for evaluating CIA intelligence versus what was commonly known at the time.

As a starting point, may I suggest: Image matching for historical maps comparison by C. Balletti and F. Guerra, e-Perimetron, Vol. 4, No. 3, 2009 [180-186] www.e-perimetron.org | ISSN 1790-3769?

Abstract:

In cartographic heritage we suddenly find maps of the same mapmaker and of the same area, published in different years, or new editions due to integration of cartographic, such us in national cartographic series. These maps have the same projective system and the same cut, but they present very small differences. The manual comparison can be very difficult and with uncertain results, because it’s easy to leave some particulars out. It is necessary to find an automatic procedure to compare these maps and a solution can be given by digital maps comparison.

In the last years our experience in cartographic data processing was opted for find new tools for digital comparison and today solution is given by a new software, ACM (Automatic Correlation Map), which finds areas that are candidate to contain differences between two maps. ACM is based on image matching, a key component in almost any image analysis process.

Interesting paper but it presupposes a closeness of the maps that is likely to be missing when comparing CIA maps to other maps of the same places and time period.
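To make the presupposition concrete, here is a toy sketch of tile-by-tile comparison. This is not ACM and not the authors' method, just a plain mean-absolute-difference check of my own; it assumes two grayscale images of identical size that are already registered to the same projection and extent, which is exactly the closeness unrelated maps will lack.

```python
def differing_tiles(img_a, img_b, tile=2, threshold=10.0):
    """Return (row, col) tile origins whose mean absolute pixel difference
    exceeds threshold.

    img_a and img_b are equally sized grayscale images given as lists of
    rows of 0-255 ints, assumed already aligned pixel for pixel.
    """
    height, width = len(img_a), len(img_a[0])
    flagged = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            diffs = [
                abs(img_a[y][x] - img_b[y][x])
                for y in range(top, min(top + tile, height))
                for x in range(left, min(left + tile, width))
            ]
            if sum(diffs) / len(diffs) > threshold:
                flagged.append((top, left))
    return flagged

# Two made-up 4x4 "maps"; the second has one feature the first lacks.
map_a = [[0] * 4 for _ in range(4)]
map_b = [row[:] for row in map_a]
map_b[2][3] = 200
candidates = differing_tiles(map_a, map_b)
print(candidates)
```

Shift one map by a pixel, or draw it at a different scale, and nearly every tile lights up, which is why registration has to come before any comparison of this kind.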

I am in the process of locating other tools for map comparison.

Any favorites you would like to suggest?

OPM Farce Continues – 2016 Inspector General Report

Monday, November 21st, 2016

U.S. Office of Personnel Management – Office of the Inspector General – Office of Audits

The Office of Personnel Management hack was back in the old days when China was being blamed for every hack. There’s no credible evidence of that but the Chinese were blamed in any event.

The OPM hack illustrated the danger inherent in appointing campaign staff to run mission critical federal agencies. For just a sampling of the impressive depth of Archuleta’s incompetence, read Flash Audit on OPM Infrastructure Update Plan.

The executive summary of the current report offers little room for hope:

This audit report again communicates a material weakness related to OPM’s Security Assessment and Authorization (Authorization) program. In April 2015, the then Chief Information Officer issued a memorandum that granted an extension of the previous Authorizations for all systems whose Authorization had already expired, and for those scheduled to expire through September 2016. Although the moratorium on Authorizations has since been lifted, the effects of the April 2015 memorandum continue to have a significant negative impact on OPM. At the end of fiscal year (FY) 2016, the agency still had at least 18 major systems without a valid Authorization in place.

However, OPM did initiate an “Authorization Sprint” during FY 2016 in an effort to get all of the agency’s systems compliant with the Authorization requirements. We acknowledge that OPM is once again taking system Authorization seriously. We intend to perform a comprehensive audit of OPM’s Authorization process in early FY 2017.

This audit report also re-issues a significant deficiency related to OPM’s information security management structure. Although OPM has developed a security management structure that we believe can be effective, there has been an extremely high turnover rate of critical positions. The negative impact of these staffing issues is apparent in the results of our current FISMA audit work. There has been a significant regression in OPM’s compliance with FISMA requirements, as the agency failed to meet requirements that it had successfully met in prior years. We acknowledge that OPM has placed significant effort toward filling these positions, but simply having the staff does not guarantee that the team can effectively manage information security and keep OPM compliant with FISMA requirements. We will continue to closely monitor activity in this area throughout FY 2017.

It’s illegal but hacking the OPM remains easier than the NSA.

Hacking the NSA requires a job at Booz Allen and a USB drive.

“connecting the dots” requires dots (Support Michael Best)

Friday, November 11th, 2016

Michael Best is creating a massive archive of government documents.

From the post:

Since 2015, I’ve published millions of government documents (about 10% of the text items on the Internet Archive, with some items containing thousands of documents) and terabytes of data; but in order to keep going, I need your help. Since I’ve gotten started, no outlet has matched the number of government documents that I’ve published and made freely available. The only non-governmental publisher that rivals the size and scope of the government files I’ve uploaded is WikiLeaks. While I analyze and write about these documents, I consider publishing them to be more important because it enables and empowers an entire generation of journalists, researchers and students of history.

I’ve also pressured government agencies into making their documents more widely available. This includes the more than 13,000,000 pages of CIA documents that are being put online soon, partially in response to my Kickstarter and publishing efforts. These documents are coming from CREST, which is a special CIA database of declassified records. Currently, it can only be accessed from four computers in the world, all of them just outside of Washington D.C. These records, which represent more than 3/4 of a million CIA files, will soon be more accessible than ever – but even once that’s done, there’s a lot more work left to do.

Question: Do you want a transparent and accountable Trump presidency?

Potential Answers include:

1) Yes, but I’m going to spend time and resources hyper-ventilating with others and roaming the streets.

2) Yes, and I’m going to support Michael Best and FOIA efforts.

Governments, even Trump’s presidency, don’t spring from ocean foam.

(image omitted: Sandro Botticelli, La nascita di Venere)

The people chosen to fill cabinet and other posts have histories, in many cases government histories.

For example, I heard a rumor today that Ed Meese, a former government crime lord, is on the Trump transition team. Hell, I thought he was dead.

Michael’s efforts produce the dots that connect past events, places, people, and even present administrations.

The dots Michael produces may support your expose, winning story and/or indictment.

Are you in or out?

Attn: Secrecy Bed-Wetters! All Five Volumes of Bay of Pigs History Released!

Thursday, November 3rd, 2016

Hand-wringers and bed-wetters who use government secrecy to hide incompetence and errors will sleep less easy tonight.

All Five Volumes of Bay of Pigs History Released and Together at Last: FRINFORMSUM 11/3/2016 by Lauren Harper.

From the post:

“After more than twenty years, it appears that fear of exposing the Agency’s dirty linen, rather than any significant security information, is what prompts continued denial of requests for release of these records. Although this volume may do nothing to modify that position, hopefully it does put one of the nastiest internal power struggles into proper perspective for the Agency’s own record.” This is according to Agency historian Jack Pfeiffer, author of the CIA’s long-contested Volume V of its official history of the Bay of Pigs invasion that was released after years of work by the National Security Archive to win the volume’s release. Chief CIA Historian David Robarge states in the cover letter announcing the document’s release that the agency is “releasing this draft volume today because recent 2016 changes in the Freedom of Information Act (FOIA) requires us to release some drafts that are responsive to FOIA requests if they are more than 25 years old.” This improvement – codified by the FOIA Improvement Act of 2016 – came directly from the National Security Archive’s years of litigation.

The CIA argued in court for years – backed by Department of Justice lawyers – that the release of this volume would “confuse the public.” National Security Archive Director Tom Blanton says, “Now the public gets to decide for itself how confusing the CIA can be. How many thousands of taxpayer dollars were wasted trying to hide a CIA historian’s opinion that the Bay of Pigs aftermath degenerated into a nasty internal power struggle?”

To read all five volumes of the CIA’s Official History of the Bay of Pigs Operation – together at last – visit the National Security Archive’s website.

Even the CIA’s own retelling of the story, The Bay of Pigs Invasion, ends with a chilling reminder for all “rebels” being presently supported by the United States.


Brigade 2506’s pleas for air and naval support were refused at the highest US Government levels, although several CIA contract pilots dropped munitions and supplies, resulting in the deaths of four of them: Pete Ray, Leo Baker, Riley Shamburger, and Wade Gray.

Kennedy refused to authorize any extension beyond the hour granted. To this day, there has been no resolution as to what caused this discrepancy in timing.

Without direct air support—no artillery and no weapons—and completely outnumbered by Castro’s forces, members of the Brigade either surrendered or returned to the turquoise water from which they had come.

Two American destroyers attempted to move into the Bay of Pigs to evacuate these members, but gunfire from Cuban forces made that impossible.

In the following days, US entities continued to monitor the waters surrounding the bay in search of survivors, with only a handful being rescued. A few members of the Brigade managed to escape and went into hiding, but soon surrendered due to a lack of food and water. When all was said and done, more than seventy-five percent of Brigade 2506 ended up in Cuban prisons.

100% captured or killed. There’s an example of US support.

In a less media savvy time, the US did pay $53 million (in 1962 dollars, about $424 million today) for the release of 1113 members of Brigade 2506.

Another important fact is that fifty-seven (57) years of delay enabled the participants to escape censure and/or a trip to the gallows for their misdeeds and crimes.

Let’s not let that happen with the full CIA Torture Report. Even the sanitized 6,700 page version would be useful. More so the documents upon which it was based.

All of that exists somewhere. We lack a person with access and moral courage to inform their fellow citizens of the full truth about the CIA torture program. So far.


Update: Michael Best, NatSecGeek advises CIA Histories has the most complete CIA history collection. Thanks Michael!

Hackers May Fake Documents, Congress Publishes False Ones

Monday, September 19th, 2016

I pointed out in Lions, Tigers, and Lies! Oh My! that Bruce Schneier’s concerns over the potential for hackers faking documents to be leaked pale beside the misinformation distributed by government.

Executive Summary of Review of the Unauthorized Disclosures of Former National Security Agency Contractor Edward Snowden (their title, not mine), is a case in point.

Barton Gellman in The House Intelligence Committee’s Terrible, Horrible, Very Bad Snowden Report leaves no doubt the House Permanent Select Committee on Intelligence (HPSCI) report is a sack of lies.

Not mistakes, not exaggerations, not simply misleading, but actual, factual lies.

For example:


Since I’m on record claiming the report is dishonest, let’s skip straight to the fourth section. That’s the one that describes Snowden as “a serial exaggerator and fabricator,” with “a pattern of intentional lying.” Here is the evidence adduced for that finding, in its entirety.

“He claimed to have obtained a high school degree equivalent when in fact he never did.”

I do not know how the committee could get this one wrong in good faith. According to the official Maryland State Department of Education test report, which I have reviewed, Snowden sat for the high school equivalency test on May 4, 2004. He needed a score of 2250 to pass. He scored 3550. His Diploma No. 269403 was dated June 2, 2004, the same month he would have graduated had he returned to Arundel High School after losing his sophomore year to mononucleosis. In the interim, he took courses at Anne Arundel Community College.

See Gellman’s post for more examples.

All twenty-two members of the HPSCI signed the report. To save you time in the future, here’s a listing of the members of Congress who agreed to report these lies:

Republicans

Democrats

I sorted each group into alphabetical order. The original listings were in an order that no doubt makes sense to fellow rodents but not to the casual reader.

That’s twenty-two members of Congress who are willing to distribute known falsehoods.

Does anyone have an equivalent list of hackers?

Congress.gov Corrects Clinton-Impeachment Search Results

Monday, September 19th, 2016

After posting Congress.gov Search Alert: “…previous total of 261 to the new total of 0.” [Solved] yesterday, pointing out that a change from http:// to https:// altered a search result for Clinton w/in 5 words impeachment, I got an email this morning:

(image omitted: correction email from Congress.gov)

I appreciate the update and correction for saved searches, but my point about remote data changing without notice to you remains valid.

I’m still waiting for word on bulk downloads from both Wikileaks and DC Leaks.

Why leak information vital to public discussion and then limit access to search?

Congress.gov Search Alert: “…previous total of 261 to the new total of 0.” [Solved]

Sunday, September 18th, 2016

Odd message from the Congress.gov search alert this AM:

(image omitted: Congress.gov search alert)

Here’s the search I created back in June, 2016:

(image omitted: saved search)

My probably inaccurate recollection is that I was searching for some quote from the impeachment of Bill Clinton and was too lazy to specify a term of Congress, hence:

all congresses – searching for Clinton within five words, impeachment

Fairly trivial search that produced 261 “hits.”

I set the search alert more to explore the search options than any expectation of different future results.

Imagine my surprise to find that all congresses – searching for Clinton within five words, impeachment performed today, results in 0 “hits.”

Suspecting some internal changes to the search interface, I re-entered the search today and got 0 “hits.”

Other saved searches with radically different search results as of today?

This is not, repeat not, the result of some elaborate conspiracy to assist Secretary Clinton in her bid for the presidency.

I do think something fundamental has gone wrong with searching at Congress.gov and it needs to be fixed.

This is an illustration of why Wikileaks, DC Leaks and other data sites should provide easy to access downloads in bulk of their materials.

Providing search interfaces to document collections is a public service, but document collections or access to them can change in ways not transparent to search users, as the CIA demonstrated by removing documents previously delivered to the Senate.

Petition Wikileaks, DC Leaks and other data sites for easy bulk downloads.

That will ensure the “evidence” does not shift under your feet and will make possible more sophisticated means of analysis than brute-force search.


Update: The change from http:// to https:// by the congress.gov site broke both my saved query and my attempt to re-run the same search using http://.

Using https:// returns the same 261 search results.

What is your experience with other saved searches at congress.gov?

Inside the fight to reveal the CIA’s torture secrets [Support The Guardian]

Monday, September 12th, 2016

Inside the fight to reveal the CIA’s torture secrets by Spencer Ackerman.

Part one: Crossing the bridge

Part two: A constitutional crisis

Part three: The aftermath

Ackerman captures the drama of a failed attempt by the United States Senate to exercise oversight on the Central Intelligence Agency (CIA) in this series.

I say “failed attempt” because even if the full 6,200+ page report is ever released, the lead Senate investigator, Daniel Jones, obscured the identities of all the responsible CIA personnel and sources of information in the report.

Even if the full report is serialized in your local newspaper, the CIA contractors and staff guilty of multiple felonies will not be one step closer to being brought to justice.

To that extent, the “full” report is itself a disservice to the American people, who elect their congressional leaders and expect them to oversee agencies such as the CIA.

From Ackerman’s account you will learn that the CIA can dictate to its overseers the location and conditions under which they can view documents, decide which documents they are allowed to see and, in cases of conflict, spy on the Senate Select Committee on Intelligence.

Does that sound like effective oversight to you?

BTW, you will also learn that members of the “most transparent administration in history” aided and abetted the CIA in preventing an effective investigation into the CIA and its torture program. I use “aided and abetted” deliberately and in its legal sense.

I mention in my header that you should support The Guardian.

This story by Spencer Ackerman is one reason.

Another reason is that given the plethora of names and transfers recited in Ackerman’s story, we need The Guardian to cover future breaks in this story.

Despite the tales of superhuman security, nobody is that good.

I leave you with the thought that if more than one person knows a secret, then it can be discovered.

Check Ackerman’s story for a starting list of those who know secrets about the CIA torture program.

Good hunting!

New Plea: Charges Don’t Reflect Who I Am Today

Wednesday, September 7th, 2016

Traditionally, pleas have been guilty, not guilty, not guilty by reason of insanity and nolo contendere (no contest).

Beth Cobert, acting director at the OPM, has added a fifth plea:

Charges Don’t Reflect Who I Am Today

Greg Masters captures the new plea in Congressional report faults OPM over breach preparedness and response:


While welcoming the committee’s acknowledgement of the OPM’s progress, Beth Cobert, acting director at the OPM, disagreed with the committee’s findings in a blog post published on the OPM site on Wednesday, responding that the report does “not fully reflect where this agency stands today.”
… (emphasis added)

Any claims about “…where this agency stands today…” are a distraction from the question of responsibility for a system-wide failure of security.

If you know any criminal defense lawyers, suggest they quote Beth Cobert as setting a precedent for responding to allegations of prior misconduct with:

Charges Don’t Reflect Who I Am Today

Please forward links to news reports of successful use of that plea to my attention.

Congressional Research Service Fiscal 2015 – Full Report List

Saturday, August 6th, 2016

Congressional Research Service Fiscal 2015

The Director’s Message:

From international conflicts and humanitarian crises, to immigration, transportation, and secondary education, the Congressional Research Service (CRS) helped every congressional office and committee navigate the wide range of complex and controversial issues that confronted Congress in FY2015.

We kicked off the year strongly, preparing for the newly elected Members of the 114th Congress with the tenth biannual CRS Seminar for New Members, and wrapped up 2015 supporting the transition to a new Speaker and the crafting of the omnibus appropriations bill. In between, CRS experts answered over 62,000 individual requests; hosted over 7,400 Congressional participants at seminars, briefings and trainings; provided over 3,600 new or refreshed products; and summarized over 8,000 pieces of legislation.

While the CRS mission remains the same, Congress and the environment in which it works are continually evolving. To ensure that the Service is well positioned to anticipate and meet the information and research needs of a 21st-century Congress, we launched a comprehensive strategic planning effort that has identified the most critical priorities, goals, and objectives that will enable us to most efficiently and effectively serve Congress as CRS moves into its second century.

Responding to the increasingly rapid pace of congressional business, and taking advantage of new technologies, we continued to explore new and innovative ways to deliver authoritative information and timely analysis to Congress. For example, we introduced shorter report formats and added infographics to our website CRS.gov to better serve congressional needs.

It is an honor and privilege to work for the U.S. Congress. With great dedication, our staff creatively supports Members, staff and committees as they help shape and direct the legislative process and our nation’s future. Our accomplishments in fiscal 2015 reflect that dedication.

All true but also true that the funders of all those wonderful efforts, taxpayers, have spotty and/or erratic access to those research goodies.

Perhaps that will change in the not too distant future.

But until then, perhaps a list of all the new CRS products in 2015, which runs from page 47 to page 124 may be of interest.

Not all entries are unique as they may appear under different categories.

Sadly the only navigation you are offered is by chunky categories like “Health” and “Law and Justice.”

Hmmm, perhaps that can be fixed, at least to some degree.

Watch for more CRS news this coming week.

How-To Track Projects Like A Defense Contractor

Sunday, July 31st, 2016

Transparency Tip: How to Track Government Projects Like a Defense Contractor by Dave Maass.

From the post:

Over the last year, thousands of pages of sensitive documents outlining the government’s intelligence practices have landed on our desktops.

One set of documents describes the Director of National Intelligence’s goal of funding “dramatic improvements in unconstrained face recognition.” A presentation from the Navy uses examples from Star Trek to explain its electronic warfare program. Other records show the FBI was purchasing mobile phone extraction devices, malware and fiber network-tapping systems. A sign-in list shows the names and contact details of hundreds of cybersecurity contractors who turned up a Department of Homeland Security “Industry Day.” Yet another document, a heavily redacted contract, provides details of U.S. assistance with drone surveillance programs in Burundi, Kenya and Uganda.

But these aren’t top-secret records carefully leaked to journalists. They aren’t classified dossiers pasted haphazardly on the Internet by hacktivists. They weren’t even liberated through the Freedom of Information Act. No, these public documents are available to anyone who looks at the U.S. government’s contracting website, FBO.gov. In this case “anyone,” is usually just contractors looking to sell goods, services, or research to the government. But, because the government often makes itself more accessible to businesses than the general public, it’s also a useful tool for watchdogs. Every government program costs money, and whenever money is involved, there’s a paper trail.

Searching FBO.gov is difficult enough that there are firms that offer search services to assist contractors with locating business opportunities.

Collating FBO.gov data with topic maps (read adding non-FBO.gov data) will be a value-add to watchdogs, potential contractors (including yourself), or watchers watching watchers.

Dave’s post will get you started on your way.

U.S. Climate Resilience Toolkit

Thursday, July 28th, 2016

Bringing climate information to your backyard: the U.S. Climate Resilience Toolkit by Tamara Dickinson and Kathryn Sullivan.

From the post:

Climate change is a global challenge that will require local solutions. Today, a new version of the Climate Resilience Toolkit brings climate information to your backyard.

The Toolkit, called for in the President’s Climate Action Plan and developed by the National Oceanic and Atmospheric Administration (NOAA), in collaboration with a number of Federal agencies, was launched in 2014. After collecting feedback from a diversity of stakeholders, the team has updated the Toolkit to deliver more locally-relevant information and to better serve the needs of its users. Starting today, Toolkit users will find:

  • A redesigned user interface that is responsive to mobile devices;
  • County-scale climate projections through the new version of the Toolkit’s Climate Explorer;
  • A new “Reports” section that includes state and municipal climate-vulnerability assessments, adaptation plans, and scientific reports; and
  • A revised “Steps to Resilience” guide, which communicates steps to identifying and addressing climate-related vulnerabilities.

Thanks to the Toolkit’s Climate Explorer, citizens, communities, businesses, and policy leaders can now visualize both current and future climate risk on a single interface by layering up-to-date, county-level, climate-risk data with maps. The Climate Explorer allows coastal communities, for example, to overlay anticipated sea-level rise with bridges in their jurisdiction in order to identify vulnerabilities. Water managers can visualize which areas of the country are being impacted by flooding and drought. Tribal nations can see which of their lands will see the greatest mean daily temperature increases over the next 100 years.  

A number of decision makers, including the members of the State, Local, and Tribal Leaders Task Force, have called on the Federal Government to develop actionable information at local-to-regional scales.  The place-based, forward-looking information now available through the Climate Explorer helps to meet this demand.

The Climate Resilience Toolkit update builds upon the Administration’s efforts to boost access to data and information through resources such as the National Climate Assessment and the Climate Data Initiative. The updated Toolkit is a great example of the kind of actionable information that the Federal Government can provide to support community and business resilience efforts. We look forward to continuing to work with leaders from across the country to provide the tools, information, and support they need to build healthy and climate-ready communities.

Check out the new capabilities today at toolkit.climate.gov!

I have only started to explore this resource but thought I should pass it along.

Of particular interest to me is the integration of data/analysis from this resource with other data.

Suggestions/comments?

Accessing IRS 990 Filings (Old School)

Monday, July 25th, 2016

Like many others, I was glad to see: IRS 990 Filings on AWS.

From the webpage:

Machine-readable data from certain electronic 990 forms filed with the IRS from 2011 to present are available for anyone to use via Amazon S3.

Form 990 is the form used by the United States Internal Revenue Service to gather financial information about nonprofit organizations. Data for each 990 filing is provided in an XML file that contains structured information that represents the main 990 form, any filed forms and schedules, and other control information describing how the document was filed. Some non-disclosable information is not included in the files.

This data set includes Forms 990, 990-EZ and 990-PF which have been electronically filed with the IRS and is updated regularly in an XML format. The data can be used to perform research and analysis of organizations that have electronically filed Forms 990, 990-EZ and 990-PF. Forms 990-N (e-Postcard) are not available within this data set. Forms 990-N can be viewed and downloaded from the IRS website.

I could use AWS but I’m more interested in deep analysis of a few returns than analysis of the entire dataset.

Fortunately the webpage continues:


An index listing all of the available filings is available at s3://irs-form-990/index.json. This file includes basic information about each filing including the name of the filer, the Employer Identification Number (EIN) of the filer, the date of the filing, and the path to download the filing.

All of the data is publicly accessible via the S3 bucket’s HTTPS endpoint at https://s3.amazonaws.com/irs-form-990. No authentication is required to download data over HTTPS. For example, the index file can be accessed at https://s3.amazonaws.com/irs-form-990/index.json and the example filing mentioned above can be accessed at https://s3.amazonaws.com/irs-form-990/201541349349307794_public.xml (emphasis in original).

I open a terminal window and type:

wget https://s3.amazonaws.com/irs-form-990/index.json

which as of today, results in:

-rw-rw-r-- 1 patrick patrick 1036711819 Jun 16 10:23 index.json

A trial grep:

grep "NATIONAL RIFLE" index.json > nra.txt

Which produces:

{"EIN": "530116130", "SubmittedOn": "2014-11-25", "TaxPeriod": "201312", "DLN": "93493309004174", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201423099349300417_public.xml", "FormType": "990", "ObjectId": "201423099349300417", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2013-12-20", "TaxPeriod": "201212", "DLN": "93493260005203", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201302609349300520_public.xml", "FormType": "990", "ObjectId": "201302609349300520", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2012-12-06", "TaxPeriod": "201112", "DLN": "93493311011202", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201203119349301120_public.xml", "FormType": "990", "ObjectId": "201203119349301120", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "396056607", "SubmittedOn": "2011-05-12", "TaxPeriod": "201012", "FormType": "990EZ", "LastUpdated": "2016-06-14T01:22:09.915971Z", "OrganizationName": "EAU CLAIRE NATIONAL RIFLE CLUB", "IsElectronic": false, "IsAvailable": false},
{"EIN": "530116130", "SubmittedOn": "2011-11-09", "TaxPeriod": "201012", "DLN": "93493270005081", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201132709349300508_public.xml", "FormType": "990", "ObjectId": "201132709349300508", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2016-01-11", "TaxPeriod": "201412", "DLN": "93493259005035", "LastUpdated": "2016-04-29T13:40:20", "URL": "https://s3.amazonaws.com/irs-form-990/201532599349300503_public.xml", "FormType": "990", "ObjectId": "201532599349300503", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},

We have one errant result, the “EAU CLAIRE NATIONAL RIFLE CLUB,” so let’s delete that and re-order by year. The NATIONAL RIFLE ASSOCIATION OF AMERICA results then read (most recent to oldest):

{"EIN": "530116130", "SubmittedOn": "2016-01-11", "TaxPeriod": "201412", "DLN": "93493259005035", "LastUpdated": "2016-04-29T13:40:20", "URL": "https://s3.amazonaws.com/irs-form-990/201532599349300503_public.xml", "FormType": "990", "ObjectId": "201532599349300503", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2014-11-25", "TaxPeriod": "201312", "DLN": "93493309004174", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201423099349300417_public.xml", "FormType": "990", "ObjectId": "201423099349300417", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2013-12-20", "TaxPeriod": "201212", "DLN": "93493260005203", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201302609349300520_public.xml", "FormType": "990", "ObjectId": "201302609349300520", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2012-12-06", "TaxPeriod": "201112", "DLN": "93493311011202", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201203119349301120_public.xml", "FormType": "990", "ObjectId": "201203119349301120", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
{"EIN": "530116130", "SubmittedOn": "2011-11-09", "TaxPeriod": "201012", "DLN": "93493270005081", "LastUpdated": "2016-03-21T17:23:53", "URL": "https://s3.amazonaws.com/irs-form-990/201132709349300508_public.xml", "FormType": "990", "ObjectId": "201132709349300508", "OrganizationName": "NATIONAL RIFLE ASSOCIATION OF AMERICA", "IsElectronic": true, "IsAvailable": true},
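
If you would rather not delete and re-order by hand, the clean-up can be sketched in the shell. This is only a sketch under assumptions: filter_and_sort is a hypothetical helper name, and it assumes each index line uses plain ASCII double quotes and carries a "SubmittedOn": "YYYY-MM-DD" field spaced as in the grep output above.

```shell
# Sketch: keep exact-name matches from the grep output, newest filing first.
# filter_and_sort is a hypothetical helper, not part of any existing tool.
# Assumes each line contains "SubmittedOn": "YYYY-MM-DD" with that exact spacing.
# Steps: keep exact name matches, prefix each line with its date,
# sort newest first (ISO dates sort correctly as text), drop the prefix.
filter_and_sort() {
  # $1: exact organization name, $2: file of index lines (e.g. nra.txt)
  grep -F "\"OrganizationName\": \"$1\"" "$2" |
    awk 'match($0, /"SubmittedOn": "[0-9-]+"/) { print substr($0, RSTART + 16, 10) "\t" $0 }' |
    sort -r |
    cut -f2-
}

# Usage: filter_and_sort "NATIONAL RIFLE ASSOCIATION OF AMERICA" nra.txt > nra-sorted.txt
```

Matching on the full quoted name, closing quote included, is what drops near-misses such as the rifle club.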

Of course, now you want the XML 990 returns, so extract the URLs for the 990s to a file, here nra-urls.txt (I would use awk if it is more than a handful):

https://s3.amazonaws.com/irs-form-990/201532599349300503_public.xml
https://s3.amazonaws.com/irs-form-990/201423099349300417_public.xml
https://s3.amazonaws.com/irs-form-990/201302609349300520_public.xml
https://s3.amazonaws.com/irs-form-990/201203119349301120_public.xml
https://s3.amazonaws.com/irs-form-990/201132709349300508_public.xml
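
For more than a handful of filings, the awk route might look like the following. Again a sketch, not the only way to do it: extract_urls is a hypothetical helper name, and it assumes the "URL": "..." field appears with plain ASCII double quotes and the spacing shown in the index lines above.

```shell
# Sketch: print the URL field of every index line that has one.
# extract_urls is a hypothetical helper name.
# Each line is split on the '"URL": "' marker; lines without a URL
# (filings that are not available) have only one field and are skipped.
extract_urls() {
  # $1: file of index lines (e.g. nra.txt)
  awk -F'"URL": "' 'NF > 1 { sub(/".*/, "", $2); print $2 }' "$1"
}

# Usage: extract_urls nra.txt > nra-urls.txt
```

The output is one URL per line, which is the format wget -i expects.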

Back to wget:

wget -i nra-urls.txt

Results:

-rw-rw-r-- 1 patrick patrick 111798 Mar 21 16:12 201132709349300508_public.xml
-rw-rw-r-- 1 patrick patrick 123490 Mar 21 19:47 201203119349301120_public.xml
-rw-rw-r-- 1 patrick patrick 116786 Mar 21 22:12 201302609349300520_public.xml
-rw-rw-r-- 1 patrick patrick 122071 Mar 21 15:20 201423099349300417_public.xml
-rw-rw-r-- 1 patrick patrick 132081 Apr 29 10:10 201532599349300503_public.xml

Ooooh, it’s in XML! 😉

For the XML you are going to need: Current Valid XML Schemas and Business Rules for Exempt Organizations Modernized e-File, not to mention a means of querying the data (may I suggest XQuery?).

Once you have the index.json file, with grep, a little awk and wget, you can quickly explore IRS 990 filings for further analysis or to prepare queries for running on AWS (such as discovery of common directors, etc.).

Enjoy!

What’s the “CFR” and Why Is It So Important to Me?

Wednesday, July 20th, 2016

What’s the “CFR” and Why Is It So Important to Me? Government Printing Office (GPO) blog, GovernmentBookTalk.

From the post:

If you’re a GPO Online Bookstore regular or public official you probably know we’re speaking about the “Code of Federal Regulations.” CFRs are produced routinely by all federal departments and agencies to inform the public and government officials of regulatory changes and updates for literally every subject that the federal government has jurisdiction to manage.

For the general public these constantly updated federal regulations can spell fantastic opportunity. Farmer, lawyer, construction owner, environmentalist, it makes no difference. Within the 50 codes are a wide variety of regulations that impact citizens from all walks of life. Federal Rules, Regulations, Processes, or Procedures on the surface can appear daunting, confusing, and even may seem to impede progress. In fact, the opposite is true. By codifying critical steps to anyone who operates within the framework of any of these sectors, the CFR focused on a particular issue can clarify what’s legal, how to move forward, and how to ultimately successfully translate one’s projects or ideas into reality.

Without CFR documentation the path could be strewn with uncertainty, unknown liabilities, and lost opportunities, especially regarding federal development programs, simply because an interested party wouldn’t know where or how to find what’s available within their area of interest.

The authors of CFRs are immersed in the technical and substantive issues associated within their areas of expertise. For a private sector employer or entrepreneur who becomes familiar with the content of CFRs relative to their field of work, it’s like having an expert staff on board.

I like the CFRs but I stumbled on:

For a private sector employer or entrepreneur who becomes familiar with the content of CFRs relative to their field of work, it’s like having an expert staff on board.

I don’t doubt the expertise of the CFR authors, but their writing often requires an expert for accurate interpretation. If you doubt that statement, test your reading skills on any section of CFR Title 26, Internal Revenue.

Try your favorite NLP parser out on any of the CFRs.

The post lists a number of ways to acquire the CFRs but personally I would use the free Electronic Code of Federal Regulations unless you need to impress clients with the paper version.

Enjoy!

IRS E-File Bucket – Internet Archive

Saturday, June 18th, 2016

IRS E-File Bucket courtesy of Carl Malamud and Public.Resource.Org.

From the webpage:

This bucket contains a mirror of the IRS e-file release as of June 16, 2016. You may access the source files at https://aws.amazon.com/public-data-sets/irs-990/. The present bucket may or may not be updated in the future.

To access this bucket, use the download links.

Note that tarballs of image scans from 2002-2015 are also available in this IRS 990 Forms collection.

Many thanks to the Internal Revenue Service for making this information available. Here is their announcement on June 16, 2016. Here is a statement from Public.Resource.Org congratulating the IRS on a job well done.

As I noted in IRS 990 Filing Data (2001 to date):

990* disclosures aren’t detailed enough to pinch but when combined with other data, say leaked data, the results can be remarkable.

It’s up to you to see that public disclosures pinch.

IRS 990 Filing Data (2001 to date)

Thursday, June 16th, 2016

IRS 990 Filing Data Now Available as an AWS Public Data Set

From the post:

We are excited to announce that over one million electronic IRS 990 filings are available via Amazon Simple Storage Service (Amazon S3). Filings from 2011 to the present are currently available and the IRS will add new 990 filing data each month.

Form 990 is the form used by the United States Internal Revenue Service (IRS) to gather financial information about nonprofit organizations. By making electronic 990 filing data available, the IRS has made it possible for anyone to programmatically access and analyze information about individual nonprofits or the entire nonprofit sector in the United States. This also makes it possible to analyze it in the cloud without having to download the data or store it themselves, which lowers the cost of product development and accelerates analysis.

Each electronic 990 filing is available as a unique XML file in the “irs-form-990” S3 bucket in the AWS US East (N. Virginia) region. Information on how the data is organized and what it contains is available on the IRS 990 Filings on AWS Public Data Set landing page.
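
As a rough sketch of what working with one of those XML files might look like once you have it on disk, the following uses only the Python standard library. The namespace URI and the element names (`ReturnHeader`, `Filer`, `EIN`, `BusinessName`) are my assumptions about the e-file schema, and `filer_summary` and the sample document are hypothetical; check the landing page documentation before relying on any of it.

```python
import xml.etree.ElementTree as ET

# Assumption: e-file returns declare this default XML namespace.
NS = {"irs": "http://www.irs.gov/efile"}


def filer_summary(xml_text):
    """Pull the filer's EIN and name out of one e-file XML document."""
    root = ET.fromstring(xml_text)
    filer = root.find("irs:ReturnHeader/irs:Filer", NS)
    if filer is None:
        return None
    ein = filer.findtext("irs:EIN", default="", namespaces=NS)
    # Child element names under BusinessName may differ between schema
    # versions, so take the first non-empty text found beneath it.
    name = ""
    name_node = filer.find("irs:BusinessName", NS)
    if name_node is not None:
        name = next((t.strip() for t in name_node.itertext() if t.strip()), "")
    return {"ein": ein, "name": name}


# Made-up filing for illustration only; not a real organization or EIN.
SAMPLE = """<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <Filer>
      <EIN>000000000</EIN>
      <BusinessName>
        <BusinessNameLine1Txt>Example Nonprofit Inc</BusinessNameLine1Txt>
      </BusinessName>
    </Filer>
  </ReturnHeader>
</Return>"""

print(filer_summary(SAMPLE))
```

Looping a function like this over downloaded filings gives you a table of filers to join against whatever other data you hold.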

Some of the forms and instructions that will help you make sense of the data reported:

990 – Form 990 Return of Organization Exempt from Income Tax, Annual Form 990 Requirements for Tax-Exempt Organizations

990-EZ – 2015 Form 990-EZ, Instructions for IRS 990 EZ – Internal Revenue Service

990-PF – 2015 Form 990-PF, 2015 Instructions for Form 990-PF

As always, use caution with law related data as words may have unusual nuances and/or unexpected meanings.

These forms and instructions are only a tiny part of a vast iceberg of laws, regulations, rulings, court decisions and the like.

990* disclosures aren’t detailed enough to pinch but when combined with other data, say leaked data, the results can be remarkable.

Breaking Californication (An Act Performed On The Public)

Monday, June 6th, 2016

Law Enforcement Lobby Succeeds In Killing California Transparency Bill by Kit O’Connell.

From the post:

A California Senate committee killed a bill to increase transparency in police misconduct investigations, hampering victims’ efforts to obtain justice.

Chauncee Smith, legislative advocate at the ACLU of California, told MintPress News that the state Legislature “caved to the tremendous influence and power of the law enforcement lobby” and “failed to listen to the demands and concerns of everyday Californian people.”

California has some of the most secretive rules in the country when it comes to investigations into police misconduct and excessive use of force. Records are kept sealed, regardless of the outcome, as the ACLU of Northern California explains on its website:

“In places like Texas, Kentucky, and Utah, peace officer records are made public when an officer is found guilty of misconduct. Other states make records public regardless of whether misconduct is found. This is not the case in California.”

“Right now, there is a tremendous cloud of secrecy that is unparalleled compared to many other states,” Smith added. “California is in the minority in which the public do not know basic information when someone is killed or potentially harmed by those who are sworn to serve and protect them.”

In February, Sen. Mark Leno, a Democrat from San Francisco, introduced SB 1286, the “Enhance Community Oversight on Police Misconduct and Serious Uses of Force” bill. It would have allowed “public access to investigations, findings and discipline information on serious uses of force by police” and would have increased transparency in other cases of police misconduct, according to an ACLU fact sheet. Polling data cited by the ACLU suggests about 80 percent of Californians would support the measure.

But the bill’s progress through the legislature ended on May 27, when it failed to pass out of the Senate Appropriations committee.

“Today is a sad day for transparency, accountability, and justice in California,” said Peter Bibring, police practices director for the ACLU of California, in a May 27 press release.

Mistrust between police officers and citizens makes the job of police officers more difficult and dangerous, while denying citizens the full advantages of a trained police force, paid for by their tax dollars.

The state legislature, finding that sowing and fueling mistrust between police officers and citizens has electoral upsides, fans those flames with secrecy over police misconduct investigations.

Open proceedings, not secret ones (read: grand jury), where witnesses can be fairly examined (unlike the deliberately thrown Michael Brown investigation), can go a long way toward re-establishing trust between the police and the public.

Members of the community know when someone was a danger to police officers and others, whether their family members will admit it or not. Likewise, police officers know which officers are far too quick to escalate to deadly force. Want better community policing? Want better citizen cooperation? That’s not going to happen with completely secret police misconduct investigations.

So the State of California is going to collect the evidence, statements, etc., in police misconduct investigations, but won’t share that information with the public. At least not willingly.

Official attempts to break illegitimate government secrecy failed. Even if they had succeeded, you’d be paying at least $0.25 per page plus a service fee.

Two observations about government networks:

  • Secret (and otherwise) government documents are usually printed on networked printers.
  • Passively capturing Ethernet traffic (network tap) captures printer traffic too.

Whistleblowers don’t have to hack heavily monitored systems or steal logins/passwords. Leaking illegally withheld documents is within the reach of anyone who can plug in an Ethernet cable.

There’s a bit more to it than that, but remember all those network cables running through the ceiling, walls, and closets the next time your security consultant assures you of your network’s security.

As a practical matter, if you start leaking party menus and football pools, someone will start looking for a network tap.

Leak when it makes a significant difference to public discussion and/or legal proceedings. Even then, look for ways to attribute the leak to factions within the government.

Remember the DoD’s amused reaction to State’s huffing and puffing over the Afghan diplomatic cables? That sort of rivalry exists at every level of government. You should use it to your advantage.

The State of California would have you believe that government information sharing is at its sufferance.

I beg to differ.

So should you.