Archive for the ‘Wikileaks’ Category

CIA Documents or Reports of CIA Documents? Vault7

Wednesday, March 22nd, 2017

As I tool up to analyze the 1134 non-duplicate/artifact HTML files in Vault 7: CIA Hacking Tools Revealed, it occurred to me those aren’t “CIA documents.”

Take Weeping Angel (Extending) Engineering Notes as an example.

Caveat: My range of experience with “CIA documents” is limited to those obtained by Michael Best and others using Freedom of Information Act requests. But that should be sufficient to identify “CIA documents.”

Some things I notice about Weeping Angel (Extending) Engineering Notes:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. reported text
  5. More Wikileaks navigation
  6. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

I’m going to say that the 1134 non-duplicate/artifact HTML files in Vault7, Part1, are reports of portions (which portions is unknown) of some unknown number of CIA documents.

That distinction influences searching, indexing, concordances, and word frequency, to name just a few.

What I need is the reported text, minus:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. More Wikileaks navigation
  5. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

Check in tomorrow when I boil 1134 reports of CIA documents to get something better suited for text analysis.

Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 2 – The PDF Files)

Tuesday, March 21st, 2017

You may want to read Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 1) before reading this post. In Part 1, I walk you through obtaining a copy of Wikileaks’ Vault 7: CIA Hacking Tools Revealed so you can follow and check my analysis and conclusions.

Fact checking applies to every source, including this blog.

I proofed my listing of the 357 PDF files in the first Vault 7 release and report an increase in arguably CIA files and a slight decline in public documents. An increase from 114 to 125 for the CIA and a decrease from 109 to 98 for public documents.

  1. Arguably CIA – 125
  2. Public – 98
  3. Wikileaks placeholders – 134

The listings to date:

  1. CIA (maybe)
  2. Public documents
  3. Wikileaks placeholders

For public documents, I created hyperlinks whenever possible. (Saying a fact and having evidence are two different things.) Vendor documentation that was not marked with a security classification I counted as public.

All I can say for the Wikileaks placeholders, some 134 of them, is to ignore them unless you like mining low grade ore.

I created notes in the CIA listing to help narrow your focus down to highly relevant materials.

I have more commentary in the works but I wanted to release these listings in case they help others make efficient use of their time.

Enjoy!

PS: A question I want to start addressing this week is how the dilution of a leak impacts the use of same?

Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 1)

Monday, March 20th, 2017

Executive Summary:

If you reported Vault 7: CIA Hacking Tools Revealed as containing:

8,761 documents and files from an isolated, high-security network situated inside the CIA’s Center for Cyber Intelligence in Langley, Virgina…. (Vault 7: CIA Hacking Tools Revealed)

you failed to check your facts.

I detail my process below but in terms of numbers:

  1. Of 7809 HTML files, 6675 are duplicates or Wikileaks artifacts
  2. Of 357 PDF files, 134 are Wikileaks artifacts (for materials not released). Of the remaining 223 PDF files, 109 are public information, the GNU Make Manual for instance. Out of the 357 PDF files, Wikileaks has delivered 114 arguably from the CIA, and some of those are doubtful. (Part 2, forthcoming)

Wikileaks haters will find little solace here. My criticisms of Wikileaks are for padding the leak and not enabling effective use of the leak. Padding the leak is obvious from the inclusion of numerous duplicate and irrelevant documents. Effective use of the leak is impaired by the padding but also by teases of what could have been released but wasn’t.

Getting Started

To start on common ground, fire up a torrent client, obtain and decompress: Wikileaks-Year-Zero-2017-v1.7z.torrent.

Decompression requires this password: SplinterItIntoAThousandPiecesAndScatterItIntoTheWinds

The root directory is year0.

When I run a recursive ls from above that directory:

ls -R year0 | wc -l

My system reports: 8820

Change to the year0 directory and ls reveals:

bootstrap/ css/ highlighter/ IMG/ localhost:6081@ static/ vault7/

Checking the files in vault7:

ls -R vault7/ | wc -l

returns: 8755

Change to the vault7 directory and ls shows:

cms/ files/ index.html logo.png

The directory files/ has only one file, org-chart.png. It is an organization chart of the CIA, but with sub-departments listed as acronyms and “???.” Did the author of the chart not know the names of those departments? I point that out as the first of many file anomalies.

Some 7809 HTML files are found under cms/.

The cms/ directory has a sub-directory files, plus main.css and 7809 HTML files (including the index.html file).
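
Note that ls -R | wc -l counts lines of output, which include directory headers and blank separator lines, so it slightly overstates the number of files. As a cross-check, here is a minimal Python sketch that tallies files by extension. The function name and the path in the usage note are my own choices for illustration, not part of the Wikileaks archive.

```python
import os
from collections import Counter


def count_files_by_extension(root):
    """Walk `root` and tally files by lowercase extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
    return counts
```

Calling count_files_by_extension("year0/vault7/cms") from the directory that holds year0 (an assumed location) should let you compare your own .html and .pdf totals against the numbers in this post.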

Duplicated HTML Files

I discovered duplication of the HTML files quite by accident. I had prepared the files with Tidy for parsing with Saxon and compressed a set of those files for uploading.

The 7808 files I compressed started at 296.7 MB.

The compressed size, using 7z, was approximately 3.5 MB.

That’s almost two orders of magnitude of compression. 7z is good, but it’s not quite that good. 😉

Checking my file compression numbers

You don’t have to take my word for the file compression experience. If you select all the page_*, space_* and user_* HTML files in a file browser, it should report a total size of 296.7 MB.

Create a sub-directory of year0/vault7/cms/, say mkdir htmlfiles, and then:

cp *.html htmlfiles

Then: cd htmlfiles

and,

7z a compressedhtml.7z *.html

Run: ls -l compressedhtml.7z

Result: 3488727 Mar 16 16:31 compressedhtml.7z
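
The figures above (296.7 MB down to 3,488,727 bytes) work out to a ratio somewhere between 85:1 and 90:1, depending on whether MB means 10^6 or 2^20 bytes. If you don’t have 7z handy, the following Python sketch measures the same kind of redundancy with the standard library’s lzma module. It is an approximation under stated assumptions: the files are concatenated in name order and compressed as a single stream, so the ratio will not match 7z exactly, and the whole set is read into memory.

```python
import lzma
from pathlib import Path


def compression_ratio(directory, pattern="*.html"):
    """Return (raw_bytes, compressed_bytes, ratio) for the matching files.

    Files are concatenated in sorted order and compressed as one LZMA
    stream, which only roughly mimics a solid 7z archive.
    """
    paths = sorted(Path(directory).glob(pattern))
    raw = b"".join(p.read_bytes() for p in paths)
    compressed = lzma.compress(raw)
    return len(raw), len(compressed), len(raw) / len(compressed)
```

A ratio far above what ordinary HTML achieves is the hint that many files are near copies of each other.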

Tom Harris, in How File Compression Works, explains that:


Most types of computer files are fairly redundant — they have the same information listed over and over again. File-compression programs simply get rid of the redundancy. Instead of listing a piece of information over and over again, a file-compression program lists that information once and then refers back to it whenever it appears in the original program.

If you don’t agree the HTML files are highly repetitive, check the next section where one source of duplication is demonstrated.

Demonstrating Duplication of HTML files

Let’s start with the same file as we look for a source of duplication. Load Cinnamon Cisco881 Testing at Wikileaks into your browser.

Scroll to near the bottom of the file where you see:

Yes! There are 136 prior versions of this alleged CIA file in the directory.

Cinnamon Cisco881 Testing has the most prior versions but all of them have prior versions.

Are we now in agreement that duplicated versions of the HTML pages exist in the year0/vault7/cms/ directory?

Good!

Now we need to count how many duplicated files there are in year0/vault7/cms/.

Counting Prior Versions of the HTML Files

You may or may not have noticed but every reference to a prior version takes the form:

<a href="filename.html">integer</a>

That’s going to be an important fact but let’s clean up the HTML so we can process it with XQuery/Saxon.

Preparing for XQuery

Before we start crunching the HTML files, let’s clean them up with Tidy.

Here’s my Tidy config file:

output-xml: yes
quote-nbsp: no
show-warnings: no
show-info: no
quiet: yes
write-back: yes

In htmlfiles I run:

tidy -config tidy.config *.html

Tidy reports two errors:


line 887 column 1 - Error: <declarations> is not recognized!
line 887 column 15 - Error: <string> is not recognized!

Grepping for “declarations”:

grep "declarations" *.html

Returns:

page_26345506.html:<declarations><string name="½ö"></string></declarations><p>›<br>

The string element is present as well so we open up the file and repair it with XML comments:

<!-- <declarations><string name="½ö"></string></declarations><p>›<br> -->
<!-- prior line commented out to avoid Tidy error, pld 14 March 2017-->

Rerun Tidy:

tidy -config tidy.config *.html

Now Tidy returns no errors.

XQuery Finds Prior Versions

Our files are ready to be queried but 7809 is a lot of files.

There are a number of solutions but a simple one is to create an XML collection of the documents and run our XQuery statements across the files as a set.

Here’s how I created a collection file for these files:

I did an ls in the directory and piped that to collection.xml. Opening the file I deleted index.html, started each entry with <doc href=" and ended each one with "/>, inserted <collection> before the first entry and </collection> after the last entry and then saved the file.

Your version should look something like:

<collection>
  <doc href="page_10158081.html"/>
  <doc href="page_10158088.html"/>
  <doc href="page_10452995.html"/>
...
  <doc href="user_7995631.html"/>
  <doc href="user_8650754.html"/>
  <doc href="user_9535837.html"/>
</collection>
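
If hand editing the ls output sounds tedious, the same collection file can be generated. This is a minimal sketch, assuming the tidied files sit in one directory and that index.html is the only file to exclude. The function name write_collection and its parameters are mine.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr


def write_collection(directory, out_name="collection.xml", skip=("index.html",)):
    """Write a <collection> file listing every *.html file in `directory`."""
    directory = Path(directory)
    names = sorted(p.name for p in directory.glob("*.html") if p.name not in skip)
    lines = ["<collection>"]
    # quoteattr adds the surrounding quotes and escapes special characters.
    lines += [f"  <doc href={quoteattr(name)}/>" for name in names]
    lines.append("</collection>")
    (directory / out_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(names)
```

The return value is the number of entries written, which should match the number of files you compressed earlier.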

The prior versions in Cinnamon Cisco881 Testing from Wikileaks, have this appearance in HTML source:

<h3>Previous versions:</h3>
<p>| <a href="page_17760540.html">1</a> <span class="pg-tag"><i>empty</i></span>
| <a href="page_17760578.html">2</a> <span class="pg-tag"></span>

…..

| <a href="page_23134323.html">135</a> <span class="pg-tag">[Xetron]</span>
| <a href="page_23134377.html">136</a> <span class="pg-tag">[Xetron]</span>
|</p>
</div>

You will need to spend some time with the files (I have obviously) to satisfy yourself that <a> elements that contain only numbers are exclusively used for prior references. If you come across any counter-examples, I would be very interested to hear about them.

To get a file count on all the prior references, I used:

let $count := count(collection('collection.xml')//a[matches(.,'^\d+$')])
return $count

Run that script to find: 6514 previous editions of the base files
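
For readers without Saxon, here is a rough Python cross-check of the same count. It is a sketch, not the method used above: it relies on the standard library’s html.parser instead of XQuery, reads the files directly instead of through collection.xml, and the class and function names are my own. Treat any difference from 6514 as something to investigate, not as a correction.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class DigitLinkCounter(HTMLParser):
    """Count <a> elements whose entire text content is one or more digits."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.count = 0
        self._text = None  # list of text chunks while inside an <a>, else None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._text is not None:
            # fullmatch plays the role of the ^ and $ anchors in the XQuery.
            if re.fullmatch(r"\d+", "".join(self._text)):
                self.count += 1
            self._text = None


def count_prior_versions(directory, skip=("index.html",)):
    """Sum digit-only <a> elements across the HTML files in `directory`."""
    total = 0
    for path in sorted(Path(directory).glob("*.html")):
        if path.name in skip:
            continue
        parser = DigitLinkCounter()
        parser.feed(path.read_text(encoding="utf-8", errors="replace"))
        parser.close()
        total += parser.count
    return total
```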

Unpacking the XQuery

Rest assured that’s not how I wrote the first XQuery on this data set! 😉

Without exploring all the by-ways and alleys I traversed, I will unpack that query.

First, the goal of the query is to identify every <a> element that contains only digits. Recall that previous-version links have only digits in their <a> elements.

A shout out to Jonathan Robie, Editor of XQuery, for reminding me that regular expressions in matches() match substrings unless they have beginning and ending anchors. Here:

'^\d+$'

The \d matches only digits, the + enables matching 1 or more digits, and the beginning ^ and ending $ eliminate any <a> elements that might start with one or more digits, but also contain text. Like links to files, etc.

Expanding out a bit more, [matches(.,'^\d+$')], the [ ] enclose a predicate that consists of the matches function, which takes two arguments. The . here represents the content of an <a> element, followed by a comma as a separator and then the regex that provides the pattern to match against.

Although talked about as a “code smell,” the //a in //a[matches(.,'^\d+$')] enables us to pick up the <a> elements wherever they are located. We did have to repair these HTML files and I don’t want to spend time debugging ephemeral HTML.

Almost there! The collection file, along with the collection function, collection('collection.xml') enables us to apply the XQuery to all the files listed in the collection file.

Finally, we surround all of the foregoing with the count function: count(collection('collection.xml')//a[matches(.,'^\d+$')]) and declare a variable to capture the result of the count function: let $count :=

So far so good? I know, tedious for XQuery jocks but not all news reporters are XQuery jocks, at least not yet!

Then we produce the results: return $count.

But 6514 files aren’t 6675 files, you said 6675 files

Yes, you’re right! Thanks for paying attention!

I said at the top, 6675 are duplicates or Wikileaks artifacts.

Where are the others?

If you look at User #71477, which has the file name, user_40828931.html, you will find it’s not a CIA document but part of Wikileaks administration for these documents. There are 90 such pages.

If you look at Marble Framework, which has the file name, space_15204359.html, you find it’s not a CIA document but a form of indexing created by Wikileaks. There are 70 such pages.

Don’t forget the index.html page.

When added together, 6514 (duplicates), 90 (user pages), 70 (space pages), index.html, I get 6675 duplicates or Wikileaks artifacts.
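
If you want to check that arithmetic yourself, it takes only a few lines. The numbers are the ones stated in this post. The variable names are mine.

```python
# Figures as stated in the post.
html_total = 7809       # HTML files under cms/, including index.html
prior_versions = 6514   # digit-only <a> links found by the XQuery
user_pages = 90         # Wikileaks administration pages
space_pages = 70        # Wikileaks index pages
index_page = 1          # index.html

artifacts = prior_versions + user_pages + space_pages + index_page
remaining = html_total - artifacts

print(artifacts)   # 6675
print(remaining)   # 1134
```

The remainder, 1134, is the number of HTML files that are neither duplicates nor Wikileaks artifacts.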

What’s your total?


Tomorrow:

In Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 2), I look under year0/vault7/cms/files to discover:

  1. Arguably CIA files (maybe) – 114
  2. Public documents – 109
  3. Wikileaks artifacts – 134

I say “Arguably CIA” because there are file artifacts and anomalies that warrant your attention in evaluating those files.

How Bad Is Wikileaks Vault7 (CIA) HTML?

Thursday, March 9th, 2017

How bad?

Unless you want to hand correct 7809 HTML files to use with XQuery, grab the latest copy of Tidy.

It’s not the worst HTML I have ever seen, but put that in the context of having seen a lot of really poor HTML.

I’ve “tidied” up a test collection and will grab a fresh copy of the files before producing and releasing a clean set of the HTML files.

The goal is a document collection for XQuery processing, working towards something suitable for application of NLP and other tools.

That CIA exploit list in full: … [highlights]

Wednesday, March 8th, 2017

That CIA exploit list in full: The good, the bad, and the very ugly by Iain Thomson.

From the post:

We’re still going through the 8,761 CIA documents published on Tuesday by WikiLeaks for political mischief, although here are some of the highlights.

First, though, a few general points: one, there’s very little here that should shock you. The CIA is a spying organization, after all, and, yes, it spies on people.

Two, unlike the NSA, the CIA isn’t mad keen on blanket surveillance: it targets particular people, and the hacking tools revealed by WikiLeaks are designed to monitor specific persons of interest. For example, you may have seen headlines about the CIA hacking Samsung TVs. As we previously mentioned, that involves breaking into someone’s house and physically reprogramming the telly with a USB stick. If the CIA wants to bug you, it will bug you one way or another, smart telly or no smart telly. You’ll probably be tricked into opening a dodgy attachment or download.

That’s actually a silver lining to all this: end-to-end encrypted apps, such as Signal and WhatsApp, are so strong, the CIA has to compromise your handset, TV or computer to read your messages and snoop on your webcam and microphones, if you’re unlucky enough to be a target. Hacking devices this way is fraught with risk and cost, so only highly valuable targets will be attacked. The vast, vast majority of us are not walking around with CIA malware lurking in our pockets, laptop bags, and living rooms.

Thirdly, if you’ve been following US politics and WikiLeaks’ mischievous role in the rise of Donald Trump, you may have clocked that Tuesday’s dump was engineered to help the President pin the hacking of his political opponents’ email server on the CIA. The leaked documents suggest the agency can disguise its operations as the work of a foreign government. Thus, it wasn’t the Russians who broke into the Democrats’ computers and, by leaking the emails, helped swing Donald the election – it was the CIA all along, Trump can now claim. That’ll shut the intelligence community up. The President’s pet news outlet Breitbart is already running that line.

Iain does a good job of picking out some of the more interesting bits from the CIA (alleged) file dump. No, you will have to read Iain’s post for those.

I mention Iain’s post primarily as a way to entice you into reading all the files in hopes of discovering more juicy tidbits.

Read the files. Your security depends on the indifference of the CIA and similar agencies. Is that your model for privacy?

Vault 7: CIA Hacking Tools In Bulk Download

Tuesday, March 7th, 2017

If you want to avoid mirroring Vault 7: CIA Hacking Tools Revealed for yourself, check out: https://archive.org/details/wikileaks.vault7part1.tar.

Why Wikileaks doesn’t offer bulk access to its data sets, you would have to ask Wikileaks.

Enjoy!

Wikileaks Armed – You’re Not

Tuesday, March 7th, 2017

Vault 7: CIA Hacking Tools Revealed (Wikileaks).

Very excited to read:

Today, Tuesday 7 March 2017, WikiLeaks begins its new series of leaks on the U.S. Central Intelligence Agency. Code-named “Vault 7” by WikiLeaks, it is the largest ever publication of confidential documents on the agency.

The first full part of the series, “Year Zero”, comprises 8,761 documents and files from an isolated, high-security network situated inside the CIA’s Center for Cyber Intelligence in Langley, Virgina. It follows an introductory disclosure last month of CIA targeting French political parties and candidates in the lead up to the 2012 presidential election.

Recently, the CIA lost control of the majority of its hacking arsenal including malware, viruses, trojans, weaponized “zero day” exploits, malware remote control systems and associated documentation. This extraordinary collection, which amounts to more than several hundred million lines of code, gives its possessor the entire hacking capacity of the CIA. The archive appears to have been circulated among former U.S. government hackers and contractors in an unauthorized manner, one of whom has provided WikiLeaks with portions of the archive.

Very disappointed to read:


Wikileaks has carefully reviewed the “Year Zero” disclosure and published substantive CIA documentation while avoiding the distribution of ‘armed’ cyberweapons until a consensus emerges on the technical and political nature of the CIA’s program and how such ‘weapons’ should analyzed, disarmed and published.

Wikileaks has also decided to redact and anonymise some identifying information in “Year Zero” for in depth analysis. These redactions include ten of thousands of CIA targets and attack machines throughout Latin America, Europe and the United States. While we are aware of the imperfect results of any approach chosen, we remain committed to our publishing model and note that the quantity of published pages in “Vault 7” part one (“Year Zero”) already eclipses the total number of pages published over the first three years of the Edward Snowden NSA leaks.

For all of the fretting over the “…extreme proliferation risk in the development of cyber ‘weapons’…”, bottom line is Wikileaks and its agents are armed with CIA cyber weapons and you are not.

Assange/Wikileaks have cast their vote in favor of arming themselves and protecting the CIA and others.

Responsible leaking of cyber weapons means arming everyone equally.

Online Database of “Verified” Twitter Accounts (Right On!)

Friday, January 6th, 2017

The WikiLeaks Task Force tweeted on 6 Jan. 2017:

We are thinking of making an online database with all “verified” twitter accounts & their family/job/financial/housing relationships.

There are a number of comments on this tweet. The ones containing “dox,” “doxx,” “doxing,” “creepy,” “evil,” etc. should be ignored.

Ignored because intelligence agencies, news organizations, merchants, banks, etc. are all collecting and organizing that data and more.

Ignored because the public should not preemptively disarm itself.

If anything, the Wikileaks Task Force should start with “verified” Twitter accounts and expand outwards, rapidly.

The public should be able to rapidly find relationships of individuals nominated for office, who contribute money to candidates, who profit from contracts, who launder public money. The public should have the same advantages intelligence agencies enjoy today.

To the nay-sayers to the WikiLeaks Task Force proposal:

Why do you seek to prevent putting the public on a better footing vis-a-vis government?

Question to my readers: What do the nay-sayers gain from a disarmed public?

Drip, Drip, Drip, Leaking At Wikileaks

Monday, November 7th, 2016

Two days before the U.S. Presidential election, Wikileaks released 8,200 emails from the Democratic National Committee (DNC). These were in addition to its daily drip, drip, drip leaking of emails from John Podesta, Hillary Clinton’s campaign chair.

The New York Times, a sometimes collaborator with Wikileaks (The War Logs (NYT)), has sponsored a series of disorderly and nearly incoherent attacks on Wikileaks for these leaks.

The dominant theme in those attacks is that readers should not worry their shallow and insecure minds about social media but rely upon media outlets to clearly state any truth readers need to know.

I am not exaggerating. The exact language that appears in one such attack was:

…people rarely act like rational, civic-minded automatons. Instead, we are roiled by preconceptions and biases, and we usually do what feels easiest — we gorge on information that confirms our ideas, and we shun what does not.

Is that how you think of yourself? It is how the New York Times thinks about you.

There are legitimate criticisms concerning Wikileaks and its drip, drip, drip leaking but the Times manages to miss all of them.

For example, the daily drops of Podesta emails, selected on some “unknown to the public” criteria, prevented the creation of a coherent narrative by reporters and the public. The next day’s leak might contain some critical link, or not.

Reporters, curators and the public were teased with drips and drabs of information, which served to drive traffic to the Wikileaks site, traffic that serves no public interest.

If that sounds ungenerous, consider that as the game draws to a close, Wikileaks has finally posted a link to the Podesta emails in bulk: https://file.wikileaks.org/file/podesta-emails/.

To be sure, some use has been made of the Podesta emails, my work and that of others on DKIM signatures (unacknowledged by Wikileaks when it “featured” such verification on email webpages), graphs, etc. but early bulk release of the emails would have enabled much more.

For example:

  • Concordances of the emails and merging those with other sources
  • Connecting the dots to public or known only to others data
  • Entity recognition and linking in extant resources and news stories
  • Fitting the events, people, places into coherent narratives
  • Sentiment analysis
  • etc.

All of that lost because of the “Wikileaks look at me” strategy for releasing the Podesta emails.

I applaud Wikileaks obtaining and leaking data, including the Podesta emails, but a “look at me” strategy impairs the full exploration and use of leaked data.

Is that really the goal of Wikileaks?

PS: If you are interested in leaking without games or redaction, ping me. I’m interested in helping with such leaks.

Freedom of Speech/Press – Great For “Us” – Not So Much For You (Wikileaks)

Saturday, November 5th, 2016

The New York Times, sensing a possible defeat of its neo-liberal agenda on November 8, 2016, has loosed the dogs of war on social media in general and Wikileaks in particular.

Consider the sleight of hand in Farhad Manjoo’s How the Internet Is Loosening Our Grip on the Truth, which argues on one hand,


You’re Not Rational

The root of the problem with online news is something that initially sounds great: We have a lot more media to choose from.

In the last 20 years, the internet has overrun your morning paper and evening newscast with a smorgasbord of information sources, from well-funded online magazines to muckraking fact-checkers to the three guys in your country club whose Facebook group claims proof that Hillary Clinton and Donald J. Trump are really the same person.

A wider variety of news sources was supposed to be the bulwark of a rational age — “the marketplace of ideas,” the boosters called it.

But that’s not how any of this works. Psychologists and other social scientists have repeatedly shown that when confronted with diverse information choices, people rarely act like rational, civic-minded automatons. Instead, we are roiled by preconceptions and biases, and we usually do what feels easiest — we gorge on information that confirms our ideas, and we shun what does not.

This dynamic becomes especially problematic in a news landscape of near-infinite choice. Whether navigating Facebook, Google or The New York Times’s smartphone app, you are given ultimate control — if you see something you don’t like, you can easily tap away to something more pleasing. Then we all share what we found with our like-minded social networks, creating closed-off, shoulder-patting circles online.

This gets to the deeper problem: We all tend to filter documentary evidence through our own biases. Researchers have shown that two people with differing points of view can look at the same picture, video or document and come away with strikingly different ideas about what it shows.

You caught the invocation of authority by Manjoo, “researchers have shown,” etc.

But did you notice he never shows his other hand?

If the public is so bat-shit crazy that it takes all social media content as equally trustworthy, what are we to do?

Well, that is the question isn’t it?

Manjoo invokes “dozens of news outlets” who are tirelessly but hopelessly fact checking on our behalf in his conclusion.

The strong implication is that without the help of “media outlets,” you are a bundle of preconceptions and biases doing what feels easiest.

“News outlets,” on the other hand, are free from those limitations.

You bet.

If you thought Manjoo was bad, enjoy seething through Zeynep Tufekci’s Wikileaks Isn’t Whistleblowing, which claims that Wikileaks is an opponent of privacy, a sponsor of censorship and an opponent of democracy, all in a little over 1,000 words (1069 by exact count).

It’s a breathtaking piece of half-truths.

For example, playing for your sympathy, Tufekci invokes the need of dissidents for privacy. Even to the point of invoking the ghost of the former Soviet Union.

Tufekci overlooks, and hopes you do as well, that these emails weren’t from dissidents, but from people who traded in and on the whims and caprices at the pinnacles of American power.

Perhaps realizing that is too transparent a ploy, she recounts other data dumps by Wikileaks to which she objects. As lawyers say, if the facts are against you, pound on the table.

In an echo of Manjoo, did you know you are too dumb to distinguish critical information from trivial?

Tufekci writes:


These hacks also function as a form of censorship. Once, censorship worked by blocking crucial pieces of information. In this era of information overload, censorship works by drowning us in too much undifferentiated information, crippling our ability to focus. These dumps, combined with the news media’s obsession with campaign trivia and gossip, have resulted in whistle-drowning, rather than whistle-blowing: In a sea of so many whistles blowing so loud, we cannot hear a single one.

I don’t think you are that dumb.

Do you?

But who will save us? You can guess Tufekci’s answer, but here it is in full:


Journalism ethics have to transition from the time of information scarcity to the current realities of information glut and privacy invasion. For example, obsessively reporting on internal campaign discussions about strategy from the (long ago) primary, in the last month of a general election against a different opponent, is not responsible journalism. Out-of-context emails from WikiLeaks have fueled viral misinformation on social media. Journalists should focus on the few important revelations, but also help debunk false misinformation that is proliferating on social media.

If you weren’t frightened into agreement by the end of her parade of horrors:


We can’t shrug off these dangers just because these hackers have, so far, largely made relatively powerful people and groups their targets. Their true target is the health of our democracy.

So now Wikileaks is gunning for democracy?

You bet. 😉

Journalists of my youth, think Vietnam, Watergate, were aggressive critics of government and the powerful. The Panama Papers project is evidence that level of journalism still exists.

Instead of whining about releases by Wikileaks and others, journalists* need to step up and provide context they see as lacking.

It would sure beat the hell out of repeating news releases from military commanders, “justice” department mouthpieces, and official but “unofficial” leaks from the American intelligence community.

* Like any generalization this is grossly unfair to the many journalists who work on behalf of the public every day but lack the megaphone of the government lapdog New York Times. To those journalists and only them, do I apologize in advance for any offense given. The rest of you, take such offense as is appropriate.

Clinton/Podesta Map (through #30)

Saturday, November 5th, 2016

Charlie Grapski created Navigating Wikileaks: A Guide to the Podesta Emails.

[Image: sample from Charlie Grapski’s Podesta email map]

The listing takes 365 pages to date so this is just a tiny sample image.

I don’t have a legend for the row coloring but have tweeted to Charlie about the same.

Enjoy!

Wikileaks Podesta Docs Proven To Be False

Wednesday, November 2nd, 2016

Glenn Greenwald tweeted this list of all the Podesta Docs from Wikileaks that have proven to be false:

https://t.co/3QAb3LLxn0

Journalists should keep that in mind when judging contested facts between Wikileaks and government sources.

Yes?

Validating Wikileaks Emails [Just The Facts]

Saturday, October 22nd, 2016

A factual basis for reporting on alleged “doctored” or “falsified” emails from Wikileaks has emerged.

Now to see if the organizations and individuals responsible for repeating those allegations, some 260,000 times, will put their doubts to the test.

You know where my money is riding.

If you want to verify the Podesta emails or other email leaks from Wikileaks, consult the following resources.

Yes, we can validate the Wikileaks emails by Robert Graham.

From the post:

Recently, WikiLeaks has released emails from Democrats. Many have repeatedly claimed that some of these emails are fake or have been modified, that there’s no way to validate each and every one of them as being true. Actually, there is, using a mechanism called DKIM.

DKIM is a system designed to stop spam. It works by verifying the sender of the email. Moreover, as a side effect, it verifies that the email has not been altered.

Hillary’s team uses “hillaryclinton.com”, which as DKIM enabled. Thus, we can verify whether some of these emails are true.

Recently, in response to a leaked email suggesting Donna Brazile gave Hillary’s team early access to debate questions, she defended herself by suggesting the email had been “doctored” or “falsified”. That’s not true. We can use DKIM to verify it.

Bob walks you through validating a raw email from Wikileaks with the DKIM verifier plugin for Thunderbird. He also demonstrates that the same process can detect “doctored” or “falsified” emails.
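
To make the “has not been altered” part less of a black box, here is a standard-library sketch of one step a DKIM verifier performs: recomputing the body hash and comparing it to the bh= tag in the DKIM-Signature header. This is not Bob’s method and it is not a full verification. It assumes “relaxed” body canonicalization as described in RFC 6376, ignores the l= tag, and does not check the RSA signature over the headers against the signer’s public key in DNS. The function names are mine.

```python
import base64
import hashlib
import re


def relaxed_body(body: bytes) -> bytes:
    """Apply RFC 6376 'relaxed' body canonicalization."""
    # Assumption: saved emails may use bare LF, so normalize to CRLF first.
    body = re.sub(rb"\r?\n", b"\r\n", body)
    lines = body.split(b"\r\n")
    # Collapse runs of spaces/tabs, then drop whitespace at the end of lines.
    lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    # Ignore empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


def body_hash(body: bytes, algorithm: str = "sha256") -> str:
    """Base64 digest of the canonicalized body, comparable to the bh= tag."""
    digest = hashlib.new(algorithm, relaxed_body(body)).digest()
    return base64.b64encode(digest).decode("ascii")


def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Simplification: strip all whitespace, since folded values
            # such as bh= and b= may be wrapped across lines.
            tags[name.strip()] = re.sub(r"\s+", "", value)
    return tags
```

If body_hash(body) differs from the bh value returned by parse_dkim_tags, the body was changed after signing or was not signed with relaxed canonicalization. A match alone proves nothing until the signature itself is verified, which is what the Thunderbird plugin does for you.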

Bob concludes:

I was just listening to ABC News about this story. It repeated Democrat talking points that the WikiLeaks emails weren’t validated. That’s a lie. This email in particular has been validated. I just did it, and shown you how you can validate it, too.

Btw, if you can forge an email that validates correctly as I’ve shown, I’ll give you 1-bitcoin. It’s the easiest way of solving arguments whether this really validates the email — if somebody tells you this blogpost is invalid, then tell them they can earn about $600 (current value of BTC) proving it. Otherwise, no.

BTW, Bob also points to:

Here’s Cryptographic Proof That Donna Brazile Is Wrong, WikiLeaks Emails Are Real by Luke Rosiak, which includes this Python code to verify the emails:

clinton-python-email-460

and,

Verifying Wikileaks DKIM-Signatures by teknotus, offers this manual approach for testing the signatures:

clinton-sig-check-460

But those are all one-off methods and there are thousands of emails.

But the post by teknotus goes on:

Preliminary results

I only got signature validation on some of the emails I tested initially but this doesn’t necessarily invalidate them as invisible changes to make them display correctly on different machines done automatically by browsers could be enough to break the signatures. Not all messages are signed. Etc. Many of the messages that failed were stuff like advertising where nobody would have incentive to break the signatures, so I think I can safely assume my test isn’t perfect. I decided at this point to try to validate as many messages as I could so that people researching these emails have any reference point to start from. Rather than download messages from wikileaks one at a time I found someone had already done that for the Podesta emails, and uploaded zip files to Archive.org.

Emails 1-4160
Emails 4161-5360
Emails 5361-7241
Emails 7242-9077
Emails 9078-11107

It only took me about 5 minutes to download all of them. Writing a script to test all of them was pretty straightforward. The program dkimverify just calls a python function to test a message. The tricky part is providing context, and making the results easy to search.

Automated testing of thousands of messages

It’s up on Github

It’s main output is a spreadsheet with test results, and some metadata from the message being tested. Results Spreadsheet 1.5 Megs

It has some significant bugs at the moment. For example Unicode isn’t properly converted, and spreadsheet programs think the Unicode bits are formulas. I also had to trap a bunch of exceptions to keep the program from crashing.

Warning: I have difficulty opening the verify.xlsx file in Calc, Excel, or a CSV converter. Teknotus reports it opens in LibreOffice Calc, which just failed to install on an older Ubuntu distribution. Sing out if you can successfully open the file.
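For anyone who wants to try the bulk approach without the spreadsheet headaches, here is a rough sketch of my own, not teknotus's script. It assumes the zip members end in .eml, takes the verifier as any callable you supply (a hypothetical `verify` parameter), decodes encoded header text, writes plain CSV, and prefixes formula-looking cells with an apostrophe so spreadsheet programs leave them alone.

```python
import csv
import email
import io
import zipfile
from email.header import decode_header, make_header


def safe_cell(value):
    """Neutralize text a spreadsheet might interpret as a formula."""
    text = str(value)
    return "'" + text if text[:1] in ("=", "+", "-", "@") else text


def decoded(value):
    """Decode RFC 2047 encoded-words (=?utf-8?q?...?=) into plain text."""
    return str(make_header(decode_header(value))) if value else ""


def summarize_archive(zip_bytes, verify):
    """Yield [file, from, subject, result] for each .eml member of a zip.

    `verify` is any callable that takes raw message bytes.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".eml"):
                continue
            raw = archive.read(name)
            msg = email.message_from_bytes(raw)
            try:
                result = verify(raw)
            except Exception as exc:  # record the failure and keep going
                result = "error: " + type(exc).__name__
            yield [name, decoded(msg["From"]), decoded(msg["Subject"]), result]


def write_report(rows, out):
    """Write the rows as CSV to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["file", "from", "subject", "result"])
    for row in rows:
        writer.writerow([safe_cell(cell) for cell in row])
```

Pass the bytes of one downloaded zip file plus whatever verifier you trust to `summarize_archive`, then hand the rows to `write_report` with a file opened in text mode.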

Journalists: Are you going to validate Podesta emails that you cite? Or that others claim are false/modified?

The Podesta Emails [In Bulk]

Wednesday, October 19th, 2016

Wikileaks has been posting:

The Podesta Emails, described as:

WikiLeaks series on deals involving Hillary Clinton campaign Chairman John Podesta. Mr Podesta is a long-term associate of the Clintons and was President Bill Clinton’s Chief of Staff from 1998 until 2001. Mr Podesta also owns the Podesta Group with his brother Tony, a major lobbying firm and is the Chair of the Center for American Progress (CAP), a Washington DC-based think tank.

long enough for them to be decried as “interference” with the U.S. presidential election.

You have two search options, basic:

podesta-basic-search-460

and, advanced:

podesta-adv-search-460

As handy as these search interfaces are, you cannot easily:

  • Analyze relationships between multiple senders and/or recipients of emails
  • Perform entity recognition across the emails as a corpus
  • Process the emails with other software
  • Integrate the emails with other data sources
  • etc., etc.

Michael Best, @NatSecGeek, is posting all the Podesta emails as they are released at: Podesta Emails (zipped).

As of Podesta Emails 13, there is approximately 2 GB of zipped email files available for downloading.

The search interfaces at Wikileaks may work for you, but if you want to get closer to the metal, you have Michael Best to thank for that opportunity!
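As a taste of what the raw files make possible, here is a small sketch, my own and untested against the full archive, that counts sender/recipient pairs across a set of raw messages. The function name is hypothetical and it looks only at the From, To and Cc headers.

```python
from collections import Counter
from email import message_from_bytes
from email.utils import getaddresses


def correspondent_pairs(raw_messages):
    """Count (sender, recipient) address pairs across raw email messages."""
    pairs = Counter()
    for raw in raw_messages:
        msg = message_from_bytes(raw)
        senders = getaddresses(msg.get_all("From", []))
        recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
        for _, sender in senders:
            for _, recipient in recipients:
                if sender and recipient:
                    # Lowercase so Bob@ and bob@ count as the same mailbox.
                    pairs[(sender.lower(), recipient.lower())] += 1
    return pairs
```

Calling `correspondent_pairs(...).most_common(20)` on the unzipped messages gives a first cut at who writes to whom, something the search forms cannot tell you.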

Enjoy!

Why Journalists Should Not Rely On Wikileaks Indexing – Podesta Emails

Saturday, October 15th, 2016

Clinton on Fracking, or, Another Reason to Avoid Wikileaks Indexing

fracking-podesta-460

The quote in the tweet is false.

Politico supplies the correct quotation in its post:


“Bernie Sanders is getting lots of support from the most radical environmentalists because he’s out there every day bashing the Keystone pipeline. And, you know, I’m not into it for that,” Clinton told the unions, according to the transcript. “My view is, I want to defend natural gas. … I want to defend fracking under the right circumstances.”

I’m guessing that “…under the right circumstances.” must have pushed Wikileaks too close to the 140 character barrier.

Ditto for the Wikileaks mis-quote of: “Get a life.”

Which, reported as in the tweet, appears to refer to unbridled fracking.

Not so in the Politico post:


“I’m already at odds with the most organized and wildest” of the environmental movement, Clinton told building trades unions in September 2015, according to a transcript of the remarks apparently circulated by her aides. “They come to my rallies and they yell at me and, you know, all the rest of it. They say, ‘Will you promise never to take any fossil fuels out of the earth ever again?’ No. I won’t promise that. Get a life, you know.”

Doesn’t read quite the same way, does it?

I suppose once you start lying it’s really hard to stop. Clinton is a good example of that and Wikileaks should not follow her example.

It’s hard to spot these lies because Wikileaks isn’t indexing the attachments.

You can search all day for “defend fracking,” “get a life” (by Clinton) and you will come up empty (at least as of today).
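If you have the raw emails, you can search the attachments yourself. A minimal sketch, assuming the attachment is plain text (the function name is my own; PDF or Word attachments would need a separate text extractor first):

```python
from email import message_from_bytes, policy


def attachments_mentioning(raw_message, phrase):
    """Return filenames of text attachments containing `phrase` (any case)."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    hits = []
    for part in msg.iter_attachments():
        if part.get_content_maintype() != "text":
            continue  # binary formats need their own text extraction step
        if phrase.lower() in part.get_content().lower():
            hits.append(part.get_filename())
    return hits
```

Run it over every message in the archive with “defend fracking” as the phrase and you are searching text that the Wikileaks index skips.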

So that you don’t have to search for: 20150909 Transcript | Building Trades Union (Keystone XL) at Wikileaks – Podesta Emails, I have produced a PDF version of that attachment, Building-Trades-Union-Clinton-Sept-09-2015.pdf (my naming), for your viewing pleasure.

How to Navigate Wikileak Torrents (wlstorage.net)?

Wednesday, August 24th, 2016

How to Navigate Wikileak Torrents (wlstorage.net)? is a query I posted earlier today at Open Data on Stack Exchange.

From the query:

I can download Wikileak files from either wlstorage.net or file.wikileaks.org but I’m having difficulty identifying the files of interest.

For example, at http://wikileaks.org, you see “DNC Email Archive,” and “AKP Email Archive,” but I have been unable to match those with any entry for the Wikileaks archives. Dates don’t help because the archives all list as 01-Jan-1984.

Am I missing a well known mapping file to the archives? Thanks!

A mapping from common names for collections to the archives would be a very useful thing.

Pointers? Suggestions?

Double Standards At NPR

Wednesday, August 17th, 2016

NPR Host Demands That Assange Do Something Its Own Reporters Are Told Never to Do by Naomi LaChance.

From the post:

In a ten-minute interview aired Wednesday morning, NPR’s David Greene asked Wikileaks founder Julian Assange five times to reveal the sources of the leaked information he has published on the internet.

A major tenet of American journalism is that reporters protect their sources. Wikileaks is certainly not a traditional news organization, but Greene’s persistent attempts to get Assange to violate confidentiality was alarming, especially considering that there has been no challenge to the authenticity of the material in question.

NPR (National Public Radio) shows its true colors, not as a free and independent press but as a lackey of the Democratic Party in this interview with Assange.

David Greene (Morning Edition) was fixated on repeating the unconfirmed reports that the Russians (which Russians no one ever says) were behind the leak of DNC emails.

You can read the transcript of the Assange/Greene interview for yourself.

Greene never asks one substantive question about the 20,000 emails. Not one. The first leak of its kind and all Greene does is whine about rumors of Russian involvement.

Well, that’s not entirely fair, Greene does have this exchange with Assange:


GREENE: Well, let me – apart from the different investigations, could you see people in the U.S. government thinking that you might be a threat to national security?

ASSANGE: Well, I mean, there’s great people in the U.S. government – many of them are our sources – and there’s terrible people in the U.S. government. Unfortunately, the U.S. government is a – you know, a reflection, to some degree, of the rest of society. So it’s filled with its share of paranoid and sociopathic power climbers…

GREENE: But is it paranoid to look at these uncensored documents?

ASSANGE: …People who make errors of judgment, etc.

GREENE: Is it paranoid to look at these uncensored documents, these emails, that are released by you? And if they believe that that could change a U.S. presidential election, could be a threat to national security, why isn’t it logical…

ASSANGE: I just – I mean…

GREENE: …For them to see you as a possible threat?

Hmmm, telling the truth about DNC emails can be a threat to national security?

What a bizarre concept in a democracy! Disclosure of evidence of manipulation of the democratic process is a “…threat to national security?”

NPR can and should do better than David Greene shilling for the Democratic Party.

WikiLeaks AKP dump contains 80 types of malware (!OutLook)

Tuesday, August 16th, 2016

WikiLeaks AKP dump contains 80 types of malware by Nicky Cappella.

From the post:

The latest WikiLeaks AKP email contains more than 80 types of malware, an independent researcher has confirmed. The malware includes ransomware and remote-access trojans.

WikiLeaks released emails from the Turkish political party AKP in two parts: one in July, and one on August 5. Anti-virus and malware expert Vesselin Bontchev reviewed the content of those emails and published his findings on his GitHub page. Bontchev listed more than 200 individual emails that contain a link to a confirmed malicious attachment.

His report shows a link to infected emails on the WikiLeaks site, the URL for the malware attachment within the email, and a link to a VirusTotal page, showing the way that different anti-virus scanners are reporting the malware. The URL to the malicious attachment has been made unclickable by substituting ‘hxxxxx’ for ‘https’, as the URL listed is a direct link to the malware and a click would result in an immediate download.

A word to the wise I suppose.

You weren’t going to look at a stolen email archive using OutLook were you?

Joel Simon (@Joelcpj): Woodward and Bernstein Not “Ethical and Committed” Journalists

Thursday, August 4th, 2016

Joel Simon’s opinion piece, How journalists can cover leaks without helping spies, leaves you with the conclusion that Woodward and Bernstein (Watergate) were not “ethical and committed” journalists.

Skipping the nationalistic ranting and “compelling evidence,” which turns out to be the New York Times parroting surmises and guesses by known liars (U.S. intelligence community), Simon writes of the Wikileaks dump of DNC emails:


As for WikiLeaks, by publishing a data dump without verifying the source or providing its readers with the context to make informed decisions about the motivations of the leakers, it is allowing itself to be a vehicle for governments like Russia that are weaponizing information and using it to achieve policy objectives. Ethical and committed journalists should do all within their power to ensure they are never put in such a position. (emphasis added)

For more than thirty years, 1972 – 2005, the Watergate source known as “Deep Throat (W. Mark Felt),” and his motives, remained a mystery to the American public.

Yet, his revelations were instrumental in bringing down an American president (Richard Nixon).

Mark Felt was a friend of Bob Woodward and their meeting in a parking garage on October 9, 1972, led to the October 10, 1972 Washington Post story titled: FBI Finds Nixon Aides Sabotaged Democrats.

In case you don’t remember, 1972 was a presidential election year, with the election being held on November 7, 1972.

Consider those three dates: the discussion between Woodward and Felt (October 9, 1972), the Washington Post story (October 10, 1972) and the presidential election (November 7, 1972). Or perhaps better:


October 9, 1972 – 29 days until voting begins in presidential election

October 10, 1972 – 28 days until voting begins in presidential election

November 7, 1972 (election day)
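If you want to check the day counts, the arithmetic is a two-liner (the function name is just for illustration):

```python
from datetime import date

ELECTION_DAY = date(1972, 11, 7)


def days_until_election(day):
    """Number of days from `day` to the 1972 presidential election."""
    return (ELECTION_DAY - day).days


print(days_until_election(date(1972, 10, 9)))   # 29
print(days_until_election(date(1972, 10, 10)))  # 28
```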

The timing of the leak and its publication by the Washington Post less than thirty (30) days prior to a presidential election certainly make the motives of the leaker a relevant question.

Yet, Deep Throat remained unknown and “…readers with[out] the context to make informed decisions about the motivations of the [Deep Throat/Mark Felt]…” for more than thirty years.

Contrary to Joel Simon’s criteria, Woodward and Bernstein verified and corroborated the information given to them by Deep Throat/Mark Felt to be truthful and did not explore for their readers any possible motivations on his part.

The authenticity of the DNC emails has not been challenged, and the resignations of Wasserman Schultz (DNC Chair), Amy Dacey (DNC CEO), Brad Marshall (DNC CFO), Luis Miranda (DNC Communications Director) and a public apology to Bernie Sanders by the Democratic National Committee are all supporting evidence that the DNC email leak is both accurate and authentic.

Unlike Joel Simon, I think Woodward and Bernstein were “ethical and committed” journalists during Watergate, providing their readers with accurate information in a timely manner.

Without exploring the motives of why someone would leak truthful information.

The CJR, Joel Simon and the media generally should abandon their attempt to twist journalistic ethics to exclude publication of truthful information of legitimate interest to a voting public.

Judging from the tone of Simon’s post, his concerns are driven more by rabid nationalism and jingoism than any legitimate concern for journalistic ethics.

Wikileaks Mentions In DNC Email – .000718%. Hillary To/From Emails – .000000% (RDON)

Saturday, July 23rd, 2016

Cryptome tweeted today:

wikileaks-dnc-460

Would you believe that Hillary Clinton is more irrelevant than Wikileaks?

Consider the evidence:

Search for hillaryclinton.com at Search the DNC email database

Scrape the 533 results, as of Saturday, 23 July 2016, into a file.

Grep for hillaryclinton.com and pipe that to another file.

Clean out the remaining markup, insert line returns for commas in cc: field, lowercase and sort, then uniq.

Results:

  1. aelrod@hillaryclinton.com – Adrienne K. Elrod
  2. creynolds@hillaryclinton.com – never a sender
  3. dcheng@hillaryclinton.com – Dennis Cheng
  4. djtspeaks@hillaryclinton.com – never a sender
  5. jklein@hillaryclinton.com – Justin Klein
  6. jschwerin@hillaryclinton.com – Josh Schwerin
  7. kgasperine@hillaryclinton.com – Kathleen Gasperine
  8. lroitman@hillaryclinton.com – Lindsay Roitman
  9. mhalle@hillaryclinton.com – never a sender
  10. mjennings@hillaryclinton.com – Mary Rutherford Jennings
  11. press@hillaryclinton.com – no author
  12. tvclips@hillaryclinton.com – 1 post, no sig
  13. zpetkanas@hillaryclinton.com – Zac Petkanas
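If you would rather not chain grep, sort and uniq by hand, the extraction steps above can be sketched in a few lines of Python. This is my own rough equivalent, not the commands I actually ran; the regular expression is a simplification of what counts as an email address, and it assumes you have already saved the scraped search results as text.

```python
import re

# Simplified pattern: a local part followed by the domain of interest.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@hillaryclinton\.com", re.IGNORECASE)


def unique_addresses(scraped_text):
    """Return sorted, lowercased, de-duplicated hillaryclinton.com addresses."""
    return sorted({match.lower() for match in ADDRESS.findall(scraped_text)})
```

Because the pattern pulls addresses straight out of the markup, the separate “clean out the remaining markup” and “split the cc: field on commas” steps fall away.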

That’s right! From January of 2015 until May of 2016, Hillary Clinton apparently had no emails to or from the DNC.

I find that to be unlikely to say the least.

What’s your explanation for the absence of Hillary Clinton emails to and from the DNC?

My explanation is that Wikileaks is manipulating both the data and all of us.

Here’s a motto for data leaks: Raw Data Or Nothing (RDON)

Say it, repeat it, demand it – RDON!

Yes Luis, There Is A Fuck You Emoji

Friday, July 22nd, 2016

Luis Miranda, Communications Director of the DNC asks:

fuck-you-emoji-460

Yes, there is a Fuck You emoji!

For example, here is the Google version:

google-fuck-you

I don’t know if Luis is still looking for an answer to that question but if so, consider it answered!

Searching the DNC email database can be amusing, even educational, as the question from Luis demonstrates, but I would prefer the ability to browse and to download the dataset for deeper analysis.

What have you found in the DNC email database?

Hillary Clinton Email Archive

Tuesday, July 5th, 2016

Hillary Clinton Email Archive by Wikileaks.

From the webpage:

On March 16, 2016 WikiLeaks launched a searchable archive for 30,322 emails & email attachments sent to and from Hillary Clinton’s private email server while she was Secretary of State. The 50,547 pages of documents span from 30 June 2010 to 12 August 2014. 7,570 of the documents were sent by Hillary Clinton. The emails were made available in the form of thousands of PDFs by the US State Department as a result of a Freedom of Information Act request. The final PDFs were made available on February 29, 2016.

“Truthers” may be interested in this searchable archive of Clinton’s emails while Secretary of State.

“Truthers” because the FBI’s recommendation of no charges effectively ends this particular approach to derail Clinton’s run for the presidency.

Many wish the result were different but when the last strike is called, arguing about it isn’t going to change the score of the game.

New evidence and new facts, on the other hand, are unknown factors and could make a difference whereas old emails will not.

Are you going to be looking for new evidence and facts or crying over calls in a game already lost?

A Taste of the DNC

Wednesday, June 15th, 2016

GUCCIFER 2.0 DNC’S SERVERS HACKED BY A LONE HACKER by Guccifer2.

From the post:

Worldwide known cyber security company CrowdStrike announced that the Democratic National Committee (DNC) servers had been hacked by “sophisticated” hacker groups.

I’m very pleased the company appreciated my skills so highly))) But in fact, it was easy, very easy.

Guccifer may have been the first one who penetrated Hillary Clinton’s and other Democrats’ mail servers. But he certainly wasn’t the last. No wonder any other hacker could easily get access to the DNC’s servers.

Shame on CrowdStrike: Do you think I’ve been in the DNC’s networks for almost a year and saved only 2 documents? Do you really believe it?

Here are just a few docs from many thousands I extracted when hacking into DNC’s network.

A taste of what was liberated from the DNC servers, including:

  • Donald Trump Report.
  • DNC donor lists (compare to FEC records).
  • A secret document from Clinton’s days as Secretary of State.
  • A scattering of other documents.

The main part of the papers were given to Wikileaks.

Sigh.

Hopefully that won’t mean sanitized documents but we will have to wait and see. Remember the Afghan War Diaries? Edited so as to not discomfort the U.S. government too much.

Hacking Team Email Archive

Saturday, July 11th, 2015

Hacking Team Email Archive (Wikileaks)

Wikileaks has created a searchable version of over one (1) million emails from Hacking Team.

Enjoy!

Sony Emails and Dilbert Cartoons

Saturday, May 2nd, 2015

WikiLeaks Adds More Hacked Emails From Sony Pictures Entertainment by Sohini Auddy.

From the post:

WikiLeaks has added thousands more of Sony Pictures Entertainment’s hacked emails in its database, as mentioned in a Twitter post on Thursday.

Sony has yet to develop a sense of humor over the hack attack late last year.

Suggestion: Search the Sony emails at Wikileaks and then the Dilbert archives for a matching Dilbert cartoon.

Tweet the link for the Sony email and your matching Dilbert cartoon, #sonydilbert.

Let’s try that for a week, ending May 9, 2015.

Tweet with the most retweets will be declared the winner by acclamation. (Contest not open to Sony managers.)

Enjoy!

Sony at Wikileaks! (MPAA Privacy versus Your Privacy)

Monday, April 20th, 2015

Sony at Wikileaks!

From the press release:

Today, 16 April 2015, WikiLeaks publishes an analysis and search system for The Sony Archives: 30,287 documents from Sony Pictures Entertainment (SPE) and 173,132 emails, to and from more than 2,200 SPE email addresses. SPE is a US subsidiary of the Japanese multinational technology and media corporation Sony, handling their film and TV production and distribution operations. It is a multi-billion dollar US business running many popular networks, TV shows and film franchises such as Spider-Man, Men in Black and Resident Evil.

In November 2014 the White House alleged that North Korea’s intelligence services had obtained and distributed a version of the archive in revenge for SPE’s pending release of The Interview, a film depicting a future overthrow of the North Korean government and the assassination of its leader, Kim Jong-un. Whilst some stories came out at the time, the original archives, which were not searchable, were removed before the public and journalists were able to do more than scratch the surface.

Now published in a fully searchable format The Sony Archives offer a rare insight into the inner workings of a large, secretive multinational corporation. The work publicly known from Sony is to produce entertainment; however, The Sony Archives show that behind the scenes this is an influential corporation, with ties to the White House (there are almost 100 US government email addresses in the archive), with an ability to impact laws and policies, and with connections to the US military-industrial complex.

WikiLeaks editor-in-chief Julian Assange said: “This archive shows the inner workings of an influential multinational corporation. It is newsworthy and at the centre of a geo-political conflict. It belongs in the public domain. WikiLeaks will ensure it stays there.”

Lee Munson writes in WikiLeaks publishes massive searchable archive of hacked Sony documents,


According to the Guardian, former senator Chris Dodd, chairman of the MPAA, wrote how the republication of this information signifies a further attack on the privacy of those involved:

This information was stolen from Sony Pictures as part of an illegal and unprecedented cyberattack. Wikileaks is not performing a public service by making this information easily searchable. Instead, with this despicable act, Wikileaks is further violating the privacy of every person involved.

Hacked Sony documents soon began appearing online and were available for download from a number of different sites but interested parties had to wade through vast volumes of data to find what they were looking for.

WikiLeaks’ new searchable archive will, sadly, make it far easier to discover the information they require.

I don’t see anything sad about the posting of the Sony documents in searchable form by Wikileaks.

If anything, I regret there aren’t more leaks, breaches, etc., of both corporate and governmental document archives. Leaks and breaches that should be posted “as is” with no deletions by Wikileaks, the Guardian or anyone else.

Chris Dodd’s privacy concerns aren’t your privacy concerns. Not even close.

Your privacy concerns (some of them):

  • personal finances
  • medical records
  • phone calls (sorry, already SOL on that one)
  • personal history and relationships
  • more normal sort of stuff

The MPAA, Sony and such, have much different privacy concerns:

  • concealment of meetings with and donations to members of government
  • concealment of hiring practices and work conditions
  • concealment of agreements with other businesses
  • concealment of offenses against the public
  • concealment of the exercise of privilege

Not really the same, are they?

Your privacy centers on you; the MPAA/Sony privacy centers on what they have done to others.

New terms? You have a privacy interest, MPAA/Sony has an interest in concealing information.

That sets a better tone for the discussion.

Wikileaks: Kissinger Cables

Saturday, April 13th, 2013

Wikileaks: Kissinger Cables

The code behind the Public Library of US Diplomacy.

Another rich source of information for anyone creating a mapping of relationships and events in the early 1970’s.

My only puzzle over Wikileaks is their apparent focus on US diplomatic cables.

Where are the diplomatic cables of the former government in Egypt? Or the USSR? Or of any of the many existing regimes around the globe?

Surely those aren’t more difficult to obtain than those of the US?

Perhaps that would make an interesting topic map.

Those who could be exposed by Wikileaks but aren’t.

I first saw this as: Wikileaks ProjectK Code (Github) on Nat Torkington’s Four short links: 12 April 2013.

WikiLeaks as Wakeup Call?

Thursday, May 31st, 2012

Must be a slow news week. Federal Computer Week is recycling Wikileaks as a “wake up” call.

In case you have forgotten (or is that why the story is coming back up?), Robert Gates (Sec. of Defense) found that Wikileaks did not disclose sensitive intelligence sources or methods.

Hardly “…a security breach of epic proportions…” as claimed by the State Department.

If you want to claim Wikileaks was a “wakeup call,” make it a wake up call about “data dumpster” techniques for sharing intelligence data.

“Here are all our reports. Good luck finding something, anything.”

Security breach written all over it. Useless other than as material for a security breach. Easy to copy in bulk, etc.

What about this says “potential security breach” to you?

Best methods for sharing intelligence vary depending on the data, security requirements and a host of other factors. Take Wikileaks as motivation (if lacking before) to strive for useful intelligence sharing.

Not sharing for the sake of saying you are sharing.

Cablemap

Thursday, February 17th, 2011

Cablemap

Just in case you have been in a coma for the last 6 months or in solitary confinement, Wikileaks is publishing a set of diplomatic cables it describes as follows:

Wikileaks began on Sunday November 28th publishing 251,287 leaked United States embassy cables, the largest set of confidential documents ever to be released into the public domain. The documents will give people around the world an unprecedented insight into US Government foreign activities.

The cables, which date from 1966 up until the end of February this year, contain confidential communications between 274 embassies in countries throughout the world and the State Department in Washington DC. 15,652 of the cables are classified Secret.

….

The cables show the extent of US spying on its allies and the UN; turning a blind eye to corruption and human rights abuse in “client states”; backroom deals with supposedly neutral countries; lobbying for US corporations; and the measures US diplomats take to advance those who have access to them.

This document release reveals the contradictions between the US’s public persona and what it says behind closed doors – and shows that if citizens in a democracy want their governments to reflect their wishes, they should ask to see what’s going on behind the scenes.

The online treatments I have seen by the Guardian and the New York Times are more annoying than the parade of horrors suggested by US government sources.

True, the cables show diplomats to be venal and dishonest creatures in the service of even more venal and dishonest creatures but everyone outside of an asylum and over 12 years of age knew that already.

Just as everyone knew that US foreign policy benefits friends and benefactors of elected US officials, not the general U.S. population.

Here is the test: Look over all the diplomatic cables since 1966 and find one where the result benefited you personally. Now pick one at random and identify the person or group who benefited from the activity or policy discussed in the cable.

A topic map that matched up individuals or groups who benefited from the activities or policies discussed in the cables would be a step towards being more than annoying.

Topic mapping in Google map locations for those individuals or representatives of those groups would be more than annoying still.

Add the ability to seamlessly integrate leaked information into another intelligence system and you are edging towards the potential of topic maps.

Cablemap is a step towards the production of a Cablegate resource that is more than simply annoying.