Podesta/Clinton Emails: Filtering by Email Address (Pimping Bill Clinton)

The Bill Clinton, Inc. story reminds me of:

Although I steadfastly resist imaging either Bill or Hillary in that video. Just won’t go there!

Where a graph display can make a difference is that instead of just the one email/memo from Bill’s pimp, we can rapidly survey all of the emails in which he appears, in any role.

import-spigot-email-filter-460

I ran that on Gephi 8.2 against podesta-release-1-18 but the results were:

Nodes 0, Edges 0.

Hmmm, there is something missing, possibly the CSV file?

I checked and podesta-release-1-18 has 393 emails where doug@presidentclinton.com appears.

Could try to find the “right” way to filter on email addresses but for now, let’s take a dirty short-cut.

I created a directory to hold all emails with doug@presidentclinton.com and ran into all manner of difficulties because the file names are plagued with spaces!

So much so that I unintentionally (read “by mistake”) saved all the doug@presidentclinton.com posts from podesta-release-1-18 to a different folder than the ones from podesta-release-19.

🙁

Well, but there is a happy outcome and an illustration of yet another Gephi capability.

I build the first graph from the doug@presidentclinton.com posts from podesta-release-1-18 and then with that graph open, imported the doug@presidentclinton.com from podesta-release-19 and appended those results to the open graph.

How cool is that!

Imagine doing that across data sets, assuming you paid close attention to identifiers, etc.

Sorry, back to the graphs, here is the random layout once the graphs were combined:

doug-default-460

Applying the Yifan Hu network layout:

doug-network-yifan-hu-460

I ran network statistics on network diameter and applied colors based on betweenness:

doug-network-centrality-460

And finally, adjusted the font and turned on the labels:

doub-network-labels-460

I have spent a fair amount of time just moving stuff about but imagine if you could interactively explore the emails, creating and trashing networks based on to:, from:, cc:, dates, content, etc.

The limits of Gephi imports were a major source of pain today.

I’m dodging those tomorrow in favor of creating node and adjacency tables with scripts.

PS: Don’t John Podesta and Doug Band look like two pimps in a pod? 😉

PPS: If you haven’t read the pimping Bill Clinton memo. (I think it has some other official title.)

Comments are closed.