Archive for April, 2016

MATISSE – Solar System Exploration

Saturday, April 30th, 2016

MATISSE: A novel tool to access, visualize and analyse data from planetary exploration missions by Angelo Zinzi, Maria Teresa Capria, Ernesto Palomba, Paolo Giommi, Lucio Angelo Antonelli.


The increasing number and complexity of planetary exploration space missions require new tools to access, visualize and analyse data to improve their scientific return.

ASI Science Data Center (ASDC) addresses this request with the web-tool MATISSE (Multi-purpose Advanced Tool for the Instruments of the Solar System Exploration), allowing the visualization of single observation or real-time computed high-order products, directly projected on the three-dimensional model of the selected target body.

Using MATISSE it will be no longer needed to download huge quantity of data or to write down a specific code for every instrument analysed, greatly encouraging studies based on joint analysis of different datasets.

In addition the extremely high-resolution output, to be used offline with a Python-based free software, together with the files to be read with specific GIS software, makes it a valuable tool to further process the data at the best spatial accuracy available.

MATISSE modular structure permits addition of new missions or tasks and, thanks to dedicated future developments, it would be possible to make it compliant to the Planetary Virtual Observatory standards currently under definition. In this context the recent development of an interface to the NASA ODE REST API by which it is possible to access to public repositories is set.

Continuing a long tradition of making big data and tools for processing big data freely available online (hint, hint, Panama Papers hoarders), this paper describes MATISSE (Multi-purpose Advanced Tool for the Instruments for the Solar System Exploration), which you can find online at:

Data currently available:

MATISSE currently ingests both public and proprietary data from 4 missions (ESA Rosetta, NASA Dawn, Chinese Chang’e-1 and Chang’e-2), 4 targets (4 Vesta, 21 Lutetia, 67P Churyumov-Gerasimenko, the Moon) and 6 instruments (GIADA, OSIRIS, VIRTIS-M, all onboard Rosetta, VIR onboard Dawn, elemental abundance maps from Gamma Ray Spectrometer, Digital Elevation Models by Laser Altimeter and Digital Orthophoto by CCD Camera from Chang’e-1 and Chang’e-2).

If those names don’t sound familiar (links to mission pages):

4 Vesta – asteroid (NASA)

21 Lutetia – asteroid (ESA)

67P Churyumov-Gerasimenko – comet (ESA)

the Moon – As in “our” moon.

You can do professional level research on extra-worldly data, but with worldly data (Panama Papers), not so much. Don’t be deceived by the forthcoming May 9th dribble of corporate data from the Panama Papers. Without the details contained in the documents, it’s little more than a list of suspects.

DIY OpenShift Cluster

Saturday, April 30th, 2016


No videos are planned for the Neo4j cluster I mentioned in Visual Searching with Google – One Example – Neo4j – Raspberry Pi but that’s all right, Marek Jelen has started a series on building an OpenShift cluster.

Deploying embedded OpenShift cluster (part 1) introduces the project:

In this series we are going to discuss deploying OpenShift cluster on development boards, specifically MinnowBoards. You might be asking, why the hell would I do that? Well, there are some benefits. First, they do have much lower power consumption. In my case, I am using Minnowboards, with light demand, one board takes approximately 3W-4W. Running a cluster of 4 boards including a switch takes 17W, deploying and starting 10 containers adds 1W. But yeah, that does not include fast disks. But that will come as well. Next benefit is the form factor. My cluster of four boards has dimensions of 7.5cm x 10.0cm x 8cm, about the size of a pack of credit cards. Quite a powerful cluster that can fit pretty much anywhere. The small size bring another benefit – mobility. Do you need computer power on the go? Well, this kind of boards can help solve your problem. Anyway, let’s get on with it.

Lower power consumption and form factor aren’t high on my list as reasons to pursue this project.

Security while learning about OpenShift clusters would be my top reason.

Anything without an air gap between it and outside networks is by definition insecure. Even with air gaps, systems can be insecure, but air gaps reduce the attack surface.

I appreciate Marek’s preference for MinnowBoards but there is a vibrant community around the Raspberry Pi.

Looking forward to the next post in this series!

PS: Physical security is rarely accorded the priority it deserves. Using a MinnowBoard or Raspberry Pi, a very small form factor computer could be installed behind any external firewalls. “PC call home.”

Privacy Protects Murderers

Friday, April 29th, 2016

What a broad shadow “privacy” can cast.

A week or so ago, in Keeping Panama Papers Secret? Law Firms, Journalists and Privacy, I pointed out the specious “we’re protecting privacy” claims of Süddeutsche Zeitung.

Now, the United States cites “privacy concerns” in not revealing the identities of sixteen military personnel who murdered 42 people and wounded 37 others in an attack on a Doctors Without Borders (MSF) hospital in Afghanistan last year. See: US: Afghan MSF hospital air strike was not a war crime

The acts of the aircraft crews may not be war crimes, since crews can only act on the information others give them. But the casual indifference that resulted in the wholly inadequate information systems they relied upon certainly could result in command level charges of war crimes.

Moving war crimes charges up the chain of command could well result in much needed accountability.

But, as in the case of Süddeutsche Zeitung, accountability is something that is desired for others. Never for those calling upon privacy.

Elixir RAM and the Template of Doom

Friday, April 29th, 2016

Elixir RAM and the Template of Doom by Evan Miller.

From the post:

I will attempt to convince you, in two lines of code, that Elixir is more interesting than any programming language you’ve ever used.

Are you ready? Don’t worry, the code doesn’t involve quicksort, or metaprogramming, or anything like that.

Here we go.

Sorry, no spoilers, but I can say that the use of strace or dtruss figures in explaining some amazing performance characteristics.

It reminds me of the sort of post I expect from Julia Evans. It’s that good.

If you are not interested in possibly slashing CPU and RAM usage, move on to another post. If you are, spend some time with Evan’s post.

I’m thinking the smaller the footprint the better. Yes?


Defender Services Office

Friday, April 29th, 2016

Defender Services Office

I discovered the Defender Services Office while searching for something beyond the usual complaints about government prosecutions. Complaining is ok, but blunting government efforts requires something more.

From the about page:

The Defender Services Office (DSO) of the Administrative Office of the U.S. Courts assists in administering the Defender Services Program under the Criminal Justice Act (CJA), the law governing the provision of federal criminal defense services to those unable to afford representation. The Training Division of DSO provides substantial training and other resource support to Federal Defender Organization (FDO) staff and CJA panel attorneys. The Training Division has seven principal tasks:

  • Providing substantive information on federal criminal law and procedure, publications, training materials and other online resources to CJA panel attorneys and FDO staff through the Training Branch websites, and
  • Designing, implementing and teaching at national and local training programs for CJA panel attorneys and FDO attorneys, paralegals, and investigators.
  • Delivering training programs to FDO attorneys, paralegals and investigators through an interagency agreement with the Federal Judicial Center (FJC) and assisting in the design of those programs.
  • Working with contractors on the planning and implementation of federal death penalty and federal capital habeas corpus training for FDO staff and CJA panel attorneys.
  • Providing guidance and information to members of the CJA panel and FDO staff on CJA cases regarding all aspects of criminal law and procedure through our hotline (800-788-9908).
  • Implementing the Supreme Court Advocacy Program, which arranges moots, performs legal research, provides substantive and strategic advice, or editing and writing drafts of merits briefs, to CJA panel members and FDO attorneys representing CJA-eligible defendants in the United States Supreme Court.
  • Providing advice and consultation on litigation support tools, services and processes to federal courts, federal defender organizations, and CJA panel attorneys.

There are a number of resource materials, mostly of interest to lawyers and paralegals.


Saving Time With Automation (Regex for USTR?)

Friday, April 29th, 2016

Some humor to get started on a beautiful Friday (near Atlanta, GA.). Your local conditions may vary!


Speaking of automation, does anyone have a regex for United States Trade Representative, USTR, or named staff of the USTR?

It could be used in filters that pipe USTR comments, emails, webpages, reports, etc., to /dev/null.

Pointers anyone?
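
As a starting point, here is a minimal sketch of such a filter in Python. The pattern covers only the agency's full name and its acronym; the names `USTR_PATTERN` and `drop_ustr` are made up for this illustration, and any staff names would have to be added as extra alternatives by whoever maintains the list.

```python
import re

# Hypothetical pattern: the agency's full name or its acronym, in any case.
# Named staff could be added as further alternatives; none are listed here.
USTR_PATTERN = re.compile(
    r"\b(?:United\s+States\s+Trade\s+Representative|U\.?S\.?T\.?R\.?)(?!\w)",
    re.IGNORECASE,
)


def drop_ustr(lines):
    """Yield only the lines that do not mention the USTR."""
    for line in lines:
        if not USTR_PATTERN.search(line):
            yield line
```

Used as a pipe filter, something like `sys.stdout.writelines(drop_ustr(sys.stdin))` would pass everything else through and silently discard the matching lines, which is as close to /dev/null as a line filter gets. The leading word boundary keeps words such as "industry" from matching on their embedded "ustr".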

Reporting U.S. Industry’s Tantrum-By-Proxy

Thursday, April 28th, 2016

Captured U.S. Trade Agency Resorts to Bullying Again in 2016 Special 301 Report by Jeremy Malcolm.

Jeremy does a great job dissecting the latest tantrum-by-proxy of U.S. industry, delivered by the United States Trade Representative. But it’s hardly news that the USTR is a captive of the U.S. entertainment industry and Big Pharma.

No, the news waits until the last paragraph of Jeremy’s post:

… the foreign press often wrongly reports on the Special 301 as if it were more than just a unilateral wish-list from certain U.S. industries. The result is foreign governments coming under unfair pressure to amend their laws and to divert enforcement resources, without any international obligation for them to do so….

That’s news, and it is something that can be addressed by ordinary readers.

First, readers fluent in languages other than English should seek out non-U.S. news reporting on the most recent U.S. industry tantrum-by-proxy report.

You can refer to it by its formal title, 2016 Special 301 Report, but always include (U.S. industry tantrum-by-proxy report) as an alternative title.

Second, use Jeremy’s post and compare the resources he cites to non-U.S. press reports. Contact reporters to correct stories that don’t point out the entertainment or Big Pharma origins of those claims.

Reporters are always over-worked and under-resourced so be polite, brief and specific about your corrections and how they can verify the correctness of your statements.

It may well be that the actors who have corrupted the United States Trade Representative (USTR) are the same ones putting pressure on a foreign government, which makes reporting of undue influence on trade issues even more important.

Third, remember that taking legal advice from the world’s largest arms dealer, the architect of trade agreements that favor corporations over natural persons, the tireless servant of U.S. business interests, is like getting career counseling from a pimp. Your best interest isn’t uppermost in their mind.

U.S. Government Surveillance Breeds Meekness, Fear and Self-Censorship [Old News]

Thursday, April 28th, 2016

New Study Shows Mass Surveillance Breeds Meekness, Fear and Self-Censorship by Glenn Greenwald.

From the post:

A newly published study from Oxford’s Jon Penney provides empirical evidence for a key argument long made by privacy advocates: that the mere existence of a surveillance state breeds fear and conformity and stifles free expression. Reporting on the study, the Washington Post this morning described this phenomenon: “If we think that authorities are watching our online actions, we might stop visiting certain websites or not say certain things just to avoid seeming suspicious.”

The new study documents how, in the wake of the 2013 Snowden revelations (of which 87% of Americans were aware), there was “a 20 percent decline in page views on Wikipedia articles related to terrorism, including those that mentioned ‘al-Qaeda,’ ‘car bomb’ or ‘Taliban.’” People were afraid to read articles about those topics because of fear that doing so would bring them under a cloud of suspicion. The dangers of that dynamic were expressed well by Penney: “If people are spooked or deterred from learning about important policy matters like terrorism and national security, this is a real threat to proper democratic debate.”

As the Post explains, several other studies have also demonstrated how mass surveillance crushes free expression and free thought. A 2015 study examined Google search data and demonstrated that, post-Snowden, “users were less likely to search using search terms that they believed might get them in trouble with the US government” and that these “results suggest that there is a chilling effect on search behavior from government surveillance on the Internet.”

While I applaud Greenwald and others who are trying to expose the systematic dismantling of civil liberties in the United States, at least as enjoyed by the privileged, the breeding of meekness, fear and self-censorship is hardly new.

Meekness, fear and self-censorship are especially not new to the non-privileged.

Civil Rights:

Many young activists of the 1960s saw their efforts as a new departure and themselves as a unique generation, not as actors with much to learn from an earlier, labor-infused civil rights tradition. Persecution, censorship, and self-censorship reinforced that generational divide by sidelining independent black radicals, thus whitening the memory and historiography of the Left and leaving later generations with an understanding of black politics that dichotomizes nationalism and integrationism.

The Long Civil Rights Movement and the Political Uses of the Past by Jacquelyn Dowd Hall, at page 1253.


Those who might object to a policy that is being defended on the grounds that it is protecting threats to the American community may remain silent rather than risk isolation. Arguably, this was the greatest long-term consequence of McCarthyism. No politician thereafter could be seen to be soft on Communism, so that America could slide, almost by consensus, into a war against Vietnamese communists without rigorous criticism of successive administrations’ policies ever being mounted. Self-censoring of political and social debate among politicians and others can act to counter the positive effects of the country’s legal rights of expression.

Political Conflict in America by Alan Ware, pages 63-64.

The breeding of meekness, fear and self-censorship has long been a tradition in the United States. A tradition far older than the Internet.

A tradition that was enforced by fear of loss of employment, social isolation, loss of business.

You may recall in Driving Miss Daisy when her son (Boolie) worries about not getting invited to business meetings if he openly supports Dr. Martin Luther King. You may mock Boolie now but that was a day to day reality. Still is, most places.

How to respond?

Supporting Wikileaks, Greenwald and other journalists is a start towards resisting surveillance, but don’t take it as a given that journalists will be able to preserve free expression for all of us.

As a matter of fact, journalists have been shown to be as reticent as the non-privileged:

Even the New York Times, the most aggressive news organization throughout the year of investigations, proved receptive to government pleas for secrecy. The Times refused to publicize President Ford’s unintentional disclosure of assassination plots. It joined many other papers in suppressing the Glomar Explorer story and led the editorial attacks on the Pike committee and on Schorr. The real question, as Tom Wicker wrote in 1978, is not “whether the press had lacked aggressiveness in challenging the national-security mystique, but why?” Why, indeed, did most journalists decide to defer to the administration instead of pursuing sensational stories?

Challenging the Secret Government by Kathryn S. Olmsted, at page 183.

You may have noticed the lack of national press organs in the United States challenging the largely fictional “war on terrorism.” There is the odd piece, You’re more likely to be fatally crushed by furniture than killed by a terrorist by Andrew Shaver, but those are easily missed in the maelstrom of unquestioning coverage of any government press release on terrorism.

My suggestion? Don’t be meek or fearful, and don’t self-censor. Easier said than done, but every instance of meekness, fearfulness or self-censorship is another step towards the docile population desired by governments and others.

Let’s disappoint them together.

Office of the Historian, U.S. Department of State (+ XQuery)

Thursday, April 28th, 2016

Office of the Historian (website) : Office of the Historian, U.S. Department of State (GitHub).

All of the XQuery code and data from the website is available at GitHub.

You will find such goodies as:

Office of the Historian Subject Taxonomy of the History of U.S. Foreign Relations (XML)

Foreign Relations of the United States

The Foreign Relations of the United States (FRUS) series presents the official documentary historical record of major U.S. foreign policy decisions and significant diplomatic activity. The series is published in print and online editions at the U.S. Department of State Office of the Historian website.

Encoded using TEI with additional tools for quality checking.

Impressive but perhaps not as immediately useful as:

A Guide to the United States’ History of Recognition, Diplomatic, and Consular Relations, by Country, since 1776

I checked and there is an entry for Texas that will need to be updated depending on who you listen to in Texas.

There are XML, Schematron, XQuery files galore so there is plenty of production and/or practice material, depending upon your interests.

Quantum Shannon Theory (Review Request)

Thursday, April 28th, 2016

Quantum Shannon Theory by John Preskill.


This is the 10th and final chapter of my book on Quantum Information, based on the course I have been teaching at Caltech since 1997. An early version of this chapter (originally Chapter 5) has been available on the course website since 1998, but this version is substantially revised and expanded. The level of detail is uneven, as I’ve aimed to provide a gentle introduction, but I’ve also tried to avoid statements that are incorrect or obscure. Generally speaking, I chose to include topics that are both useful to know and relatively easy to explain; I had to leave out a lot of good stuff, but on the other hand the chapter is already quite long. This is a working draft of Chapter 10, which I will continue to update. See the URL on the title page for further updates and drafts of other chapters, and please send me an email if you notice errors. Eventually, the complete book will be published by Cambridge University Press.

Preskill tweeted requesting reviews of and comments on this 112 page “chapter” from Quantum Information (forthcoming, appropriately, no projected date).

Be forewarned that Preskill compresses classical information theory into 14 pages or so. 😉

You can find more chapters at: Quantum Computation.

Previous problem sets with solutions are also available.

Quantum computing is coming. Are you going to be the first quantum hacker?


Panama Papers – Shake That Money Maker

Thursday, April 28th, 2016

ICIJ to Release Panama Papers Offshore Companies Data by Marina Walker Guevara.

From the post:

The International Consortium of Investigative Journalists will release on May 9 a searchable database with information on more than 200,000 offshore entities that are part of the Panama Papers investigation.

While the database opens up a world that has never been revealed on such a massive scale, the application will not be a “data dump” of the original documents – it will be a careful release of basic corporate information.

ICIJ won’t release personal data en masse; the database will not include records of bank accounts and financial transactions, emails and other correspondence, passports and telephone numbers. The selected and limited information is being published in the public interest.

Meanwhile ICIJ, the German newspaper Süddeutsche Zeitung which received the leak, and other global media partners, including several new outlets in countries where ICIJ has not been able to report, will continue to investigate and publish stories in the weeks and months to come. (emphasis added)

A teaser from ICIJ.

ICIJ is shaking the Panama Papers as a money maker.

Here is a video depiction:

I don’t object to ICIJ and its 400 or so blessed journalists making money from the Panama Papers.

A lot of money has been invested in making the data dump useful and profits here will support more investigations in the future.

Admitting profit is driving the concealment of the Panama Papers enables a rational discussion on releasing the data dump.

For example, when law enforcement authorities request copies of data relevant to their jurisdictions, they should have to pay for the research to segregate and package those files, along with agreeing not to post them publicly for some set time.

In terms of public access, Süddeutsche Zeitung (SZ)/ICIJ has had these documents for more than a year. Two years after the first publication, how much low-hanging fruit could be left? Especially given the need to re-process the raw data to explore it.

Reasonable profits are necessary and just; hoarding (think monopoly/anti-trust) and avoiding accountability are not.

Kiddie Porn – Anti-Tor Malware

Thursday, April 28th, 2016

U.S. v. COTTOM (December 22, 2015).

This quote tweeted April 27, 2016 by Anonymous:

Dr. Matt Edman also testified at the hearing. Id. at 84-101. In the Fall of 2012 he was employed by the Mitre Corporation as a senior cyber security engineer assigned to the FBI’s Remote Operations Unit. Id. at 84. He testified he has a bachelor of science degree in computer science from Baylor University and a Master’s Degree and Ph. D. in computer science from Rensselaer Polytechnic Institute. Id. at 85. He essentially corroborated Smith’s testimony. Id. at 85-89. He stated he adapted and configured the application found on to collect the limited set of information from a user’s computer (a unique identifier, the user’s operating system type, version, and architecture) and then send that information to the FBI-controlled server. Id. at 89. He wrote the source code and called it “Cornhusker.” Id. at 87. He stated there was no other functionality installed. Id. He further testified he did not plant porn on anyone’s computer. Id. (emphasis in the Anonymous tweet but not in the original decision)

Without more context, I was puzzled why that portion of the opinion was significant to Anonymous.

Mystery solved this morning when I saw: Former Tor Developer Created Malware for FBI to Unmask Tor Users by Swati Khandelwal.

From Swati’s post:

According to an investigation, Matthew Edman, a cyber security expert and former employee of the Tor Project, helped the FBI with Cornhusker a.k.a Torsploit malware that allowed Feds to hack and unmask Tor users in several high-profile cases, including Operation Torpedo and Silk Road.

I say “mystery solved,” but not really, because I still fail to see the complaint about Matthew Edman working on anti-Tor malware.

No one claims Edman did poor work on Tor in hopes of a future exploit.

He was a former Tor employee working for Mitre, who had a client requesting anti-Tor malware.

Who should Mitre have tasked with that job?

Someone who had never used Tor or perhaps someone with greater familiarity with it?

For another take on this issue, see: Gamekeeper turns poacher? The ex-Tor developer who unmasked Tor users for the FBI by Paul Ducklin.

Paul writes:

…Edman is nevertheless being pilloried in the media, as though he were some sort of “gamekeeper turned poacher”, and as though, having once worked on Tor, he ought to have turned his back on law enforcement for ever.

What do you think? Is Edman some sort of turncoat?

Or has he shown that you can be in favour of privacy while also supporting the uncloaking of users when investigating serious crimes?

My answer is: Next question?

Edman was hired and owed his client in each case his best efforts.

What more could anyone ask?

Hacking Book Sale! To Support the Electronic Frontier Foundation

Wednesday, April 27th, 2016

Humble Books Bundle: Hacking

No Starch Press has teamed up with Humble Bundle to raise money for the Electronic Frontier Foundation (EFF)!

$366 worth of No Starch hacking books on a pay what you want basis!

Charitable opportunities don’t get any better than this!

As I type this post, sales of these bundles have rolled over 6,200!

To help me participate in this sale, consider a donation.


Topic Map Foodie Alert!

Wednesday, April 27th, 2016

Our Tagged Ingredients Data is Now on GitHub by Erica Greene and Adam McKaig.

From the post:

Since publishing our post about “Extracting Structured Data From Recipes Using Conditional Random Fields,” we’ve received a tremendous number of requests to release the data and our code. Today, we’re excited to release the roughly 180,000 labeled ingredient phrases that we used to train our machine learning model.

You can find the data and code in the ingredient-phrase-tagger GitHub repo. Instructions are in the README and the raw data is in nyt-ingredients-snapshot-2015.csv.

Reaching a critical mass for any domain is a stumbling block for any topic map. Erica and Adam kick start your foodie topic map adventures with ~ 180,000 labeled ingredient phrases.
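
For a first step towards a foodie topic map, here is a minimal sketch that groups labeled phrases by ingredient name. The column names (`input`, `name` and the rest of the header) are my assumption about the layout of the CSV, so check the README before relying on them. The sample rows are made up for illustration and are not taken from the real file, and `group_by_name` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the ASSUMED column layout; not taken from the real file.
SAMPLE = """input,name,qty,range_end,unit,comment
2 cups all-purpose flour,flour,2.0,0.0,cup,all-purpose
1 tablespoon flour,Flour,1.0,0.0,tablespoon,
3 cloves garlic,garlic,3.0,0.0,clove,
"""


def group_by_name(rows):
    """Map each normalized ingredient name to the phrases that mention it."""
    groups = defaultdict(list)
    for row in rows:
        key = row["name"].strip().lower()
        if key:  # skip rows with no labeled ingredient name
            groups[key].append(row["input"])
    return dict(groups)


groups = group_by_name(csv.DictReader(io.StringIO(SAMPLE)))
```

Lowercasing and trimming the name is the crudest possible subject identity test. Swapping `io.StringIO(SAMPLE)` for an open file handle on the downloaded CSV should work the same way, provided the header matches the assumption above.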

You are looking at the end result of six years of data mining and some clever programming so be sure to:

  1. Always acknowledge this project along with Erica and Adam in your work.
  2. Contribute back improved data.
  3. Contribute back improvements on the conditional random fields (CRF).
  4. Have a great time extending this data set!

Possible extensions include automatic translation (with mapping of “equivalent” terms), melding in the USDA food database (it’s formally known as: USDA National Nutrient Database for Standard Reference) with nutrient content information on ~8,800 foods, and, of course, the “correct” way to make a roux as reflected in your mother’s cookbook.

It is, unfortunately, true that you can buy a mix for roux in a cardboard box. That requires a food processor to chop up the cardboard to enjoy with the roux that came in it. I’m originally from Louisiana and the thought of a roux mix is depressing, if not heretical.

Reboot Your $100+ Million F-35 Stealth Jet Every 10 Hours Instead of 4 (TM Fusion)

Wednesday, April 27th, 2016

Pentagon identifies cause of F-35 radar software issue

From the post:

The Pentagon has found the root cause of stability issues with the radar software being tested for the F-35 stealth fighter jet made by Lockheed Martin Corp, U.S. Defense Acquisition Chief Frank Kendall told a congressional hearing on Tuesday.

Last month the Pentagon said the software instability issue meant the sensors had to be restarted once every four hours of flying.

Kendall and Air Force Lieutenant General Christopher Bogdan, the program executive officer for the F-35, told a Senate Armed Service Committee hearing in written testimony that the cause of the problem was the timing of “software messages from the sensors to the main F-35” computer. They added that stability issues had improved to where the sensors only needed to be restarted after more than 10 hours.

“We are cautiously optimistic that these fixes will resolve the current stability problems, but are waiting to see how the software performs in an operational test environment,” the officials said in a written statement.
… (emphasis added)

A $100+ million plane that requires rebooting every ten hours? I’m not a pilot but that sounds like a real weakness.

The precise nature of the software glitch isn’t described but you can guess one of the problems from Lockheed Martin’s Software You Wish You Had: Inside the F-35 Supercomputer:

The human brain relies on five senses—sight, smell, taste, touch and hearing—to provide the information it needs to analyze and understand the surrounding environment.

Similarly, the F-35 relies on five types of sensors: Electronic Warfare (EW), Radar, Communication, Navigation and Identification (CNI), Electro-Optical Targeting System (EOTS) and the Distributed Aperture System (DAS). The F-35 “brain”—the process that combines this stellar amount of information into an integrated picture of the environment—is known as sensor fusion.

At any given moment, fusion processes large amounts of data from sensors around the aircraft—plus additional information from datalinks with other in-air F-35s—and combines them into a centralized view of activity in the jet’s environment, displayed to the pilot.

In everyday life, you can imagine how useful this software might be—like going out for a jog in your neighborhood and picking up on real-time information about obstacles that lie ahead, changes in traffic patterns that may affect your route, and whether or not you are likely to pass by a friend near the local park.

F-35 fusion not only combines data, but figures out what additional information is needed and automatically tasks sensors to gather it—without the pilot ever having to ask.
… (emphasis added)

The fusion of data from other in-air F-35s is a classic topic map merging of data problem.

You have one subject, say an anti-aircraft missile site, seen from up to four (in the F-35 specs) F-35s. As is the habit of most physical objects, it has only one geographic location but the fusion computer for the F-35 doesn’t come up with that answer.

Kris Osborn writes in Software Glitch Causes F-35 to Incorrectly Detect Targets in Formation:

“When you have two, three or four F-35s looking at the same threat, they don’t all see it exactly the same because of the angles that they are looking at and what their sensors pick up,” Bogdan told reporters Tuesday. “When there is a slight difference in what those four airplanes might be seeing, the fusion model can’t decide if it’s one threat or more than one threat. If two airplanes are looking at the same thing, they see it slightly differently because of the physics of it.”

For example, if a group of F-35s detect a single ground threat such as anti-aircraft weaponry, the sensors on the planes may have trouble distinguishing whether it was an isolated threat or several objects, Bogdan explained.

As a result, F-35 engineers are working with Navy experts and academics from Johns Hopkins Applied Physics Laboratory to adjust the sensitivity of the fusion algorithms for the JSF’s 2B software package so that groups of planes can correctly identify or discern threats.

“What we want to have happen is no matter which airplane is picking up the threat – whatever the angles or the sensors – they correctly identify a single threat and then pass that information to all four airplanes so that all four airplanes are looking at the same threat at the same place,” Bogdan said.

Unless Bogdan is using “sensitivity” in a very unusual sense, that doesn’t sound like the issue with the fusion computer of the F-35.

Rather the problem is the fusion computer has no explicit doctrine of subject identity to use when it is merging data from different F-35s, whether it be two, three, four or even more F-35s. The display of tactical information should be seamless to the pilot and without human intervention.

I’m sure members of Congress were impressed with General Bogdan using words like “angles” and “physics,” but the underlying subject identity issue isn’t hard to address.

At issue is the location of a potential target on the ground. Within some pre-defined metric, anything located within a given area is the “same target.”

The Air Force has already paid for this type of analysis and the mathematics of what is called Circular Error Probability (CEP) has been published in Use of Circular Error Probability in Target Detection by William Nelson (1988).

You need to use the “current” location of the detecting aircraft, allowances for inaccuracy in estimating the location of the target, etc., but once you call out the subject identity as an issue, it’s a matter of making choices of how accurate you want the subject identification to be.
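To make the “same target” rule concrete, here is a minimal sketch in Python. It is not the F-35 fusion algorithm and not the mathematics from Nelson’s paper. It assumes each aircraft reports an estimated target position on a flat local grid in metres plus a CEP radius (the radius expected to contain the true position half the time), and it treats two reports as the same subject when their estimates lie closer together than a scaled sum of their CEPs. The names Detection, same_subject, fuse and scale are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    aircraft: str  # hypothetical ID of the reporting aircraft
    x: float       # estimated target easting in metres (flat local grid, assumed)
    y: float       # estimated target northing in metres
    cep: float     # CEP radius in metres for this estimate


def same_subject(a, b, scale=1.0):
    """Identity rule (an assumption): two estimates are one target when they
    are no farther apart than the scaled sum of their CEP radii."""
    return hypot(a.x - b.x, a.y - b.y) <= scale * (a.cep + b.cep)


def fuse(detections, scale=1.0):
    """Group detections into targets by single-link clustering on same_subject."""
    groups = []
    for d in detections:
        merged, rest = [d], []
        for g in groups:
            if any(same_subject(d, m, scale) for m in g):
                merged.extend(g)  # d links this group to the merged target
            else:
                rest.append(g)
        groups = rest + [merged]
    return groups


reports = [
    Detection("F35-1", 1000.0, 2000.0, 30.0),
    Detection("F35-2", 1020.0, 1990.0, 30.0),  # about 22 m from the first report
    Detection("F35-3", 5000.0, 5000.0, 30.0),  # kilometres away
]
print(len(fuse(reports)))  # 2 targets: the first two reports merge
```

The scale parameter is where the “how accurate do you want it” choice lives: a larger value merges more aggressively, a smaller one splits more readily.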

Before you forward this to Gen. Bogdan as a way forward on the fusion computer, realize that CEP is only one aspect of target identification. But calling out the subject identity of targets explicitly enables reliable presentation of single/multiple targets to pilots.

Your call, confusing displays or a reliable, useful display.

PS: I assume military subject identity systems would not be running XTM software. Same principles apply even if the syntax is different.

Visual Searching with Google – One Example – Neo4j – Raspberry Pi

Tuesday, April 26th, 2016

Just to show I don’t spend too much time thinking of ways to gnaw on the ankles of Süddeutsche Zeitung (SZ), the hoarders of the Panama Papers, here is my experience with visual searching with Google today.

I saw this image on Twitter:


I assumed that cutting the “clutter” from around the cluster might produce a better result. Besides, the plastic separators looked (to me) to be standard and not custom made.

Here is my cropped image for searching:


Google responded this looks like: “water.” 😉

OK, so I tried cropping it more just to show the ports, thinking that might turn up similar port arrangements, here’s that image:


Google says: “machinery.” With a number of amusing “similar” images.

BTW, when I tried the full image, the first one, Google says: “electronics.”

OK, so much for Google image searching. What if I try?

Searching on neo4j cluster and raspberry pi (the most likely suspect), my first “hit” had this image:


Same height as the search image.

My seventh “hit” has this image:


Same height and logo as the search image. That’s Stefan Armbruster next to the cluster. (He does presentations on building the cluster, but I have yet to find a video of one of those presentations.)

My eighth “hit” has this image:


Common wiring color (networking cable), height.

Definitely Raspberry Pi but I wasn’t able to uncover further details.

Very interested in seeing a video of Stefan putting one of these together!

Open Data Institute – Join From £1 (Süddeutsche Zeitung (SZ), “Nein!”)

Tuesday, April 26th, 2016

A new offer for membership in the Open Data Institute:

Data impacts everybody. It’s the infrastructure that underpins transparency, accountability, public services, business innovation and civil society.

Together we can embrace open data to improve how we access healthcare services, discover cures for diseases, understand our governments, travel around more easily and much, much more.

Are you eager to learn more about it, collaborate with it or meet others who are already making a difference with it? From just £1 join our growing, collaborative global network of individuals, students, businesses, startups and organisations, and receive:

  • invitations to events and open evenings organised by the ODI and beyond
  • opportunities to promote your own news and events across the network
  • updates up to twice a month from the world of data and open innovation
  • 30% discount on all our courses
  • 20% reduction on our annual ODI Summit

Become a member from £1

I’d like to sign my organisation up

If you search for Süddeutsche Zeitung (SZ), the hoarders of the Panama Papers, you will come up empty.

SZ is in favor of transparency and accountability, but only for others. Never for SZ.

SZ claims in some venues to be concerned with the privacy of individuals mentioned in the Panama Papers.

How to judge between privacy rights of individuals, parties to looting nations, against the public’s right to judge reporting on the same? How is financial regulation reform possible without the details?

SZ is comfortable with protecting looters of nations and obstructing meaningful financial reform.

You can judge news media by the people they protect.

SVGs beyond mere shapes

Tuesday, April 26th, 2016

SVGs beyond mere shapes by Nadieh Bremer

From the post:

I was exhilarated (and honored) to have my talk accepted for OpenVis 2016. Yesterday April 25th, 2016, I was on the stage of the Simons IMAX Theatre in Boston’s New England Aquarium to inspire the audience with some dataviz eye candy. My talk was titled SVGs beyond mere shapes:

SVG can do much more than create nice shapes and paths. In my talk I discuss several techniques and demonstrate how to implement them in D3: from dynamic gradients based on data, to SVG filters, to creating glow, gooey, and fuzzy effects that brighten up any visual.

My eventual goal was to give people a whole bunch of effective or fun examples but to also show them that, even if I focus on a subject as narrow as SVG gradient and filters, if you try to experiment and use things in an unconventional manner you can create some very interesting results. I hope I’ve managed to inspire the audience to show a dedication to the details, to go beyond the norm, so they have to make as few concessions to the computer as possible to recreate the image that they have in their mind.

I’ve received so many wonderful reactions, it was really an amazing experience and well worth the time invested and the nerves I’ve had building up inside of me since hearing I’d been accepted last November 🙂

Are you ready to take SVG beyond shapes?

The start of a series so check back often and/or follow @NadiehBremer.

One Data Journalism Toolkit

Tuesday, April 26th, 2016

A Data Journalism Expert’s Personal Toolkit by Duc Quang Nguyen.

From the post:

I was interested to review my own toolkit. Spoiler alert — this post is code-centric and will mention R a lot. This is just because I am familiar with it. I do not think everybody should necessarily use my workflow. I will not discuss much Excel, Python, Javascript, … I am well aware, however, that they are more typically used in ddj.

Before I dive into my typical workflow and tools for 2016 so far, I should mention that I work as the sole data journalist in my newsroom. It is more common in news outlets to have data/visual journalism teams, with people specialized in specific sub-areas of data-driven journalism. My workflow is pretty much data journalism on a shoe string.

Also, by ideology and because I am a nerd, I use (nearly) solely open-source free tools. Again, it is just because these are what I am more familiar with. But if there was a proprietary framework with which I can do things faster and better, I would switch in a heartbeat.

Your starting toolkit may not have all the capabilities of Nguyen’s toolkit but it is a good target to grow towards.

Start from your everyday needs and workflow and then select tools that meet those needs and workflow. Many fine tools won’t suit your present needs and there’s no shame in that. You will be far better off mastering tools that do meet your current needs.


Peda(bot)bically Speaking:…

Monday, April 25th, 2016

Peda(bot)bically Speaking: Teaching Computational and Data Journalism with Bots by Nicholas Diakopoulos.

From the post:

Bots can be useful little creatures for journalism. Not only because they help us automate tasks like alerting and filtering, but also because they encapsulate how data and computing can work together, in service of automated news. At the University of Maryland, where I’m a professor of journalism, my students are using the power of news bots to learn concepts and skills in computational journalism—including both editorial thinking and computational thinking.

Hmmm, bot that filters all tweets that don’t contain a URL? (To filter cat pics and the like.) 😉

Or retweets tweets with #’s that trigger creation of topics/associations?

I don’t think there is a requirement that hashtags be meaningful to others. Yes?
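Both bot ideas are easy to prototype before going anywhere near the Twitter API. The sketch below works on plain strings and uses only the standard library; the function names and the sample tweets are made up for illustration, and the regular expressions are deliberately rough (a “#” inside a URL fragment would be picked up as a hashtag, for instance).

```python
import re

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def keep_tweets_with_urls(tweets):
    """Drop every tweet that does not contain at least one http(s) URL."""
    return [t for t in tweets if URL_RE.search(t)]


def hashtag_topics(tweets):
    """Map each lower-cased hashtag (a candidate topic) to the tweets using it."""
    topics = {}
    for t in tweets:
        for tag in HASHTAG_RE.findall(t):
            topics.setdefault(tag.lower(), []).append(t)
    return topics


sample = [
    "Look at my cat!",
    "New #Neo4j post https://example.com/post",
    "#neo4j on a #RaspberryPi",
]
print(keep_tweets_with_urls(sample))
print(sorted(hashtag_topics(sample)))
```

Each hashtag key could then seed a topic, with the tweets that share it as candidate associations.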

Sounds like a great class!

Seriously, Who’s Gonna Find It?

Monday, April 25th, 2016


Graphic whimsy via Bruce Sterling.

Are your information requirements met by finding something or by finding the right thing?

Panama Papers: Süddeutsche Zeitung’s (SZ) Claims “National Security,” Press Plays Dead

Monday, April 25th, 2016

“National security” is the battle cry of governments seeking to conceal materials from the public.

In an odd turn of events, Süddeutsche Zeitung (SZ), the recipient of the Panama Papers leak, claims the equivalent of national security to withhold the leaked data.

No one doubts the obligation of Süddeutsche Zeitung to protect the identity of the leaker, but specious logic leads SZ astray:

“As journalists, we have to protect our source: we can’t guarantee that there is no way for someone to find out who the source is with the data. That’s why we can’t make the data public,” the team said during an “Ask Me Anything” session on Reddit, which included journalist Bastian Obermayer, who was first contacted by the anonymous source.

“You don’t harm the privacy of people, who are not in the public eye. Blacking out private data is a task that would require a lifetime of work – we have eleven million documents,” the unit added.

Change SZ to the United States government, any government, and play back the argument:

Out of 11.5 million documents “…we can’t guarantee that there is no way for someone to find out who the source is with the data. That’s why we can’t make the data public….”

A twitter storm of mockery would follow along with Charlie Hebdo cartoons.

Süddeutsche Zeitung makes such an absurd claim and the press responds with:

(dead air)

All Süddeutsche Zeitung can guarantee is their secrecy:

… prevents the general public from checking government abuses of power and participating in democratic deliberation over the optimal [financial regulation] policies. page 817 Secrecy and National Security Investigations by Nathan Alexander Sales.

And the public cannot hold Süddeutsche Zeitung and others responsible for those who have not been named in current reports.

Has anyone mentioned Süddeutsche Zeitung remains in possession of a treasure trove that will see present cub reporters past retirement?

One of the righteous 400 journalists with access to the Panama Papers needs to complete the leaking process. Empower the public to reach its own conclusions. Aided by the independent press but not dependent upon it.

Leak the leak!

Beautiful People Love MongoDB (But Not Brits)

Monday, April 25th, 2016

BeautifulPeople.com Leaks Very Private Data of 1.1 Million ‘Elite’ Daters — And It’s All For Sale by Thomas Fox-Brewster.

From the post:

Sexual preference. Relationship status. Income. Address. These are just some details applicants for the controversial dating site are asked to supply before their physical appeal is judged by the existing user base, who vote on who is allowed in to the “elite” club based on looks alone. All of this, of course, is supposed to remain confidential. But much of that supposedly-private information is now public, thanks to the leak of a database containing sensitive data of 1.1 million users. The leak, according to one researcher, also included 15 million private messages between users. Another said the data is now being sold by traders lurking in the murky corners of the web.

News of the breach was passed to FORBES initially in December 2015 by researcher Chris Vickery. At the time, BeautifulPeople.com said the compromised data came from a test server, which was quickly locked up. It did not appear to be a serious incident.

But the information – which now appears to be real user data despite being hosted on a non-production server – was taken by one or more less-than-scrupulous individuals before the lockdown, making it out into the dirty world of data trading this year.

“We’re looking at in excess of 100 individual data attributes per person,” Hunt told FORBES. “Everything you’d expect from a site of this nature is in there.”

Vickery said the database he’d obtained contained 15 million messages between users. One exchange shown to FORBES involved users asking for prurient pictures of one another. A separate message read: “I didn’t even think to look for a better photo because the brits, on average, are some ugly motherf***ers anyway.” This would appear to chime with BeautifulPeople.com’s own “research”.

Don’t be in the act of drinking any hot or cold beverages when you visit BeautifulPeople.com’s own “research.” You may hurt yourself or ruin a keyboard. Fair warning.

The relative inaccessibility of these hacked data sets prevents leaks from acting as incentives for online services to improve their data security.

Imagine Forbes running data market pricing for “beautiful people,” living in Stockholm, for example. A very large number of people would imagine themselves to be in that set, which would set the price of that sub-set accordingly.

Moreover, it would be harder for BeautifulPeople.com to recruit new members who are aware of the company’s lack of security practices.

Thomas says that the leak was from a non-production MongoDB server.

That’s one of those databases that installs with no password for root and no obvious (in the manual) way to set it. I say “not obvious,” take a look at page 396 of the MongoDB Reference Manual, Release 3.2.5, April 25, 2016, where you will find:

The localhost exception allows you to enable access control and then create the first user in the system. With the localhost exception, after you enable access control, connect to the localhost interface and create the first user in the admin database. The first user must have privileges to create other users, such as a user with the userAdmin (page 488) or userAdminAnyDatabase (page 493) role.

Changed in version 3.0: The localhost exception changed so that these connections only have access to create the first user on the admin database. In previous versions, connections that gained access using the localhost exception had unrestricted access to the MongoDB instance.

The localhost exception applies only when there are no users created in the MongoDB instance.

First mention of password in the manual.
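For what it is worth, the step the manual describes is short once you find it. The sketch below only builds the createUser command document for that first user, using the userAdminAnyDatabase role named in the quoted passage. The helper name, user name and passphrase are hypothetical, and the pymongo usage in the trailing comment is an assumption that has not been run against a live server here.

```python
def first_user_command(username, password):
    """Build a createUser command document for the first user on the admin database."""
    if not password:
        raise ValueError("refusing to create an administrative user without a password")
    return {
        "createUser": username,  # the command name has to be the first key
        "pwd": password,
        "roles": [{"role": "userAdminAnyDatabase", "db": "admin"}],
    }


# Hypothetical usage with the pymongo driver, run on the server itself so the
# connection arrives over the localhost interface (not exercised here):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   client.admin.command(first_user_command("siteAdmin", "a-long-passphrase"))
```

Per the quoted text, this only helps if access control is actually enabled on the instance; the localhost exception exists to let you create that first user afterwards.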

Should you encounter a MongoDB instance in the wild, 3.0 or earlier….

Anonymity and Privacy – Lesson 1

Monday, April 25th, 2016

“Welcome to ‘How to Triforce’ advanced”

Transcript of the first OnionIRC class on anonymity and privacy.

From the introduction:

Welcome to the first of (hopefully) many lessons to come here on the OnionIRC, coming to you live from The Onion Routing network! This lesson is an entry-level course on Anonymity and Privacy.

Some of you may be wondering why we are doing this. What motivates us? Some users have shown concern that this network might be “ran by the feds” and other such common threads of discussion in these dark corners of the web. I assure you, our goal is to educate. And I hope you came to learn. No admin here will ask you something that would compromise your identity nor ask you to do anything illegal. We may, however, give you the tools and knowledge necessary to commit what some would consider a crime. (Shout out to all the prisons out there that need a good burning!) What you do with the knowledge you obtain here is entirely your business.

We are personally motivated to participate in this project for various reasons. Over the last five years we have seen the numbers of those aligning with Anonymous soaring, while the average users’ technical knowhow has been on the decline. The average Anonymous “member” believes that DDoS and Twitter spam equates to hacking & activism, respectively. While this course is not covering these specific topics, we think this is a beginning to a better understanding of what “hacktivism” is & how to protect yourself while subverting corrupt governments.

Okay, enough with the back story. I’m sure you are all ready to start learning.

An important but somewhat jumpy discussion of OpSec (Operational Security) occurs between time marks 0:47 and 1:04.

Despite what you read in these notes, you have a substantial advantage over the NSA or any large organization when it comes to Operational Security.

You and you alone are responsible for your OpSec.

All large organizations, including the NSA, are vulnerable through employees (current/former), contractors (current/former), oversight committees, auditors, public recruitment, etc. They all leak, some more than others.

Given the NSA’s footprint, you should have better than NSA-grade OpSec from the outset. If you don’t, you need a safer hobby. Try binging on X-Files reruns.

The chat is informative, sometimes entertaining, and tosses out a number of useful tidbits but you will get more details out of the notes.


Women in Data Science (~632) – Twitter List

Monday, April 25th, 2016

Data Science Renee has a twitter list of approximately 632 women in data science.

I say “approximately” because when I first saw her post about the list it had 630 members. When I looked this AM, it had 632 members. By the time you look, that number will be different again.

If you are making a conscious effort to seek a diversity of speakers for your next data science conference, it should be on your list of sources.


Criticizing Erdoğan

Sunday, April 24th, 2016

Dutch journalist arrested in Turkey for criticising Erdoğan.

From the post:

A Dutch journalist was arrested early on Sunday at her home in Turkey for tweets deemed critical of the Turkish president, Recep Tayyip Erdoğan, according to her Twitter account.

“Police at the door. No joke,” wrote Ebru Umar, a well-known atheist and feminist journalist of Turkish origin.

If the story wasn’t disturbing enough, it concludes:

Trials in Turkey for insulting Erdoğan have multiplied since his election to the presidency in August 2014, with nearly 2,000 such cases currently open.

I started this post to ask for suggested criticisms of or insults for Erdoğan in Turkish.

But my criticisms and/or insults, in English and/or Turkish, aren’t going to burden Erdoğan or those using big data to track all the criticisms/insults. What’s one more?

Saying that I support those who criticize and/or insult Erdoğan is true, but again, that’s no skin off of Erdoğan and his overworked criticism/insult trackers. They must have NSA-sized cloud space just to keep up with the ones in Turkish.

Whatever the Turkish equivalent of “asshole” is, every occurrence in print or speech is likely a direct and/or indirect reference to Erdoğan. That is by definition a big data problem.

With his sensitivity to criticism and insults, how would Erdoğan react to all the banks in Turkey going dark? All of them. Turkish and foreign.

The banking/business community would take that to reflect unfavorably on Erdoğan. Yes?

Therefore, avoiding that sort of problem would be good planning on the part of Erdoğan. Which would include ending and apologizing for all the past and present insult/criticism charges.

It isn’t the case that “…business as usual…” is an absolute; “…business as usual…” is an indulgence of those who control the switches and hubs of modern communication and networking.

Erdoğan has voluntarily departed from the norms expected for “…business as usual….” Let’s all hope that he quickly and voluntarily returns to expected norms of civilized behavior. For the banks’ sake if no one else’s.

PS: It would require more planning and expertise than defacing a KKK website (or Denver’s) but where’s the challenge in twitting racists?

The New Normal

Saturday, April 23rd, 2016

The New Normal, a series by Michael Nygard.

I encountered one of the more recent posts in this series and went looking for its beginning: The New Normal: Failure is a Good Thing.

From that starting post:

Everything breaks. It’s just a question of when and how badly.

What we need is a new approach where “continuous partial failure” is the normal state of affairs. Continuous partial failure opens the doors to making big changes happen because you’re already good at executing the small stuff.

In subsequent posts, I’ll talk about moving from the mentality of preventing problems to actually promoting them. I’ll look at the aging models for achieving resiliency and introduce microservices as an extension of the concept of antifragility into the design of IT infrastructure, applications, and organizations.

Along the way, I’ll share some stories about Netflix and their classic Chaos Monkey, how Amazon is becoming an increasingly terrifying competitor, the significance of maneuverability and the art of war, the unforeseen consequences of outsourcing and how Cognitect’s simple and sharp tools play a pivotal role in shaping the new IT blueprint.

Does anyone seriously doubt the proposition: Everything breaks?

From a security perspective, I would not argue with Everything’s broken.

I’m starting at the beginning and working my way forward in this series. It promises to be seriously rewarding.


Functors, Applicatives, and Monads in Plain English

Saturday, April 23rd, 2016

Functors, Applicatives, and Monads in Plain English by Russ Bishop.

From the post:

Let’s learn what Monads, Applicatives, and Functors are, only instead of relying on obscure functional vocabulary or category theory we’ll just, you know, use plain english instead.

See what you think.

I say Russ was successful.


Create a Heatmap in Excel

Saturday, April 23rd, 2016

Create a Heatmap in Excel by Jonathan Schwabish.

From the post:

Last week, I showed you how to use Excel’s Conditional Formatting menu to add cell formats to highlight specific data values. Here, I’ll show you how to easily use the Color Scales options in that menu to create a Heatmap.

Simply put, a heatmap is a table where the data are visualized using color. They pop up fairly regularly these days, sometimes showing the actual data values and sometimes not, like these two I pulled from FlowingData.

In addition to this post, there are a number of other Excel-centric visualization posts, podcasts and other high quality materials.

Even if you aren’t sold on Excel, you will learn a lot about visualization here.


Loading the Galaxy Network of the “Cosmic Web” into Neo4j

Saturday, April 23rd, 2016

Loading the Galaxy Network of the “Cosmic Web” into Neo4j by Michael Hunger.

Cypher script for loading “Cosmic Web” into Neo4j.

You remember “Cosmic Web:”