CIA Documents or Reports of CIA Documents? Vault7

As I tool up to analyze the 1134 non-duplicate/artifact HTML files in Vault 7: CIA Hacking Tools Revealed, it occurred to me those aren’t “CIA documents.”

Take Weeping Angel (Extending) Engineering Notes as an example.

Caveat: My range of experience with “CIA documents” is limited to those obtained by Michael Best and others using Freedom of Information Act requests. But that should be sufficient to identify “CIA documents.”

Some things I notice about Weeping Angel (Extending) Engineering Notes:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. reported text
  5. More Wikileaks navigation
  6. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

I’m going to say that the 1134 non-duplicate/artifact HTML files in Vault7, Part1, are reports of portions (which portions is unknown) of some unknown number of CIA documents.

A distinction that influences searching, indexing, concordances, word frequency, just to name a few.

What I need is the reported text, minus:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. More Wikileaks navigation
  5. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

Check in tomorrow when I boil 1134 reports of CIA documents to get something better suited for text analysis.

One Response to “CIA Documents or Reports of CIA Documents? Vault7”

  1. […] You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own […]