Abandon All Hope Prior To IE 11

August 26th, 2015

Stay up-to-date with Internet Explorer

From the post:

As we shared in May, Microsoft is prioritizing helping users stay up-to-date with the latest version of Internet Explorer. Today we would like to share important information on migration resources, upgrade guidance, and details on support timelines to help you plan for moving to the latest Internet Explorer browser for your operating system.

Microsoft offers innovative and transformational services for a mobile-first and cloud-first world, so you can do more and achieve more; Internet Explorer is core to this vision. In today’s digital world, billions of people use Internet-connected devices, powered by cloud service-based applications, spanning both work and life experiences. Running a modern browser is more important than ever for the fastest, most secure experience on the latest Web sites and services, connecting anytime, anywhere, on any device.

Microsoft recommends enabling automatic updates to ensure an up-to-date computing experience—including the latest version of Internet Explorer—and most consumers use automatic updates today. Commercial customers are encouraged to test and accept updates quickly, especially security updates. Regular updates provide significant benefits, such as decreased security risk and increased reliability, and Windows Update can automatically install updates for Internet Explorer and Windows.

For customers not yet running the latest browser available for your operating system, we encourage you to upgrade and stay up-to-date for a faster, more secure browsing experience. Beginning January 12, 2016, the following operating systems and browser version combinations will be supported:

Windows Platform              Internet Explorer Version
Windows Vista SP2             Internet Explorer 9
Windows Server 2008 SP2       Internet Explorer 9
Windows 7 SP1                 Internet Explorer 11
Windows Server 2008 R2 SP1    Internet Explorer 11
Windows 8.1                   Internet Explorer 11
Windows Server 2012           Internet Explorer 10
Windows Server 2012 R2        Internet Explorer 11

After January 12, 2016, only the most recent version of Internet Explorer available for a supported operating system will receive technical support and security updates. For example, customers using Internet Explorer 8, Internet Explorer 9, or Internet Explorer 10 on Windows 7 SP1 should migrate to Internet Explorer 11 to continue receiving security updates and technical support. For more details regarding support timelines on Windows and Windows Embedded, see the Microsoft Support Lifecycle site.

I can’t comment on the security of IE 11, but it will create a smaller footprint for support. Perhaps some hackers will be drawn away to easier pickings on earlier versions.

You are already late planning your migration path to IE 11.

What IE version are you going to be running on January 12, 2016?
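
If you want to answer that question across a fleet of machines, the support table can be turned into a simple lookup. This is only a minimal sketch: the platform names and versions are copied from the quoted table, while the function name needs_upgrade and the idea of feeding it inventory data are my own assumptions.

```python
# Supported combinations as of January 12, 2016, copied from the quoted table.
SUPPORTED_IE = {
    "Windows Vista SP2": 9,
    "Windows Server 2008 SP2": 9,
    "Windows 7 SP1": 11,
    "Windows Server 2008 R2 SP1": 11,
    "Windows 8.1": 11,
    "Windows Server 2012": 10,
    "Windows Server 2012 R2": 11,
}


def needs_upgrade(platform: str, installed_ie: int) -> bool:
    """Return True when the installed IE is older than the supported version.

    Hypothetical helper: raises KeyError for platforms not in the table.
    """
    return installed_ie < SUPPORTED_IE[platform]


# The example from the post: IE 8, 9 or 10 on Windows 7 SP1 must move to IE 11.
for version in (8, 9, 10, 11):
    print(version, needs_upgrade("Windows 7 SP1", version))
```

Any platform missing from the table raises an error on purpose, since the post only speaks to the combinations it lists.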

Spreadsheets are graphs too!

August 26th, 2015

Spreadsheets are graphs too! by Felienne Hermans.

Presentation with transcript.

Felienne starts with a great spreadsheet story:

When I was in grad school, I worked with an investment bank doing spreadsheet research. On my first day, I went to the head of the Excel team.

I said, ‘Hello, can I have a list of all your spreadsheets?’

There was no such thing.

‘We don’t have a list of all the spreadsheets,’ he said. ‘You could ask Frank in Accounting or maybe Harry over at Finance. He’s always talking about spreadsheets. I don’t really know, but I think we might have 10,000 spreadsheets.’

10,000 spreadsheets was a gold mine of research, so I went to the IT department and conducted my first spreadsheet scan with root access in Windows Explorer.

Within one second, it had already found 10,000 spreadsheets. Within an hour, it was still finding more, with over one million Excel files located. Eventually, we found 2.5 million spreadsheets.

In short, spreadsheets run the world.

She continues to outline spreadsheet horror stories and then demonstrates how complex relationships between cells can be captured by Neo4j.

Those relationships are much easier to query with Cypher than with SQL!

While I applaud:

I realized that spreadsheet information is actually very graphy. All the cells are connected to references to each other and they happen to be in a worksheet or on the spreadsheet, but that’s not really what matters. What matters is the connections.

I would be more concerned with the identity of the subjects between which connections have been made.

Think of it as documenting the column headers from a five-year-old spreadsheet that you are now using by rote.

Knowing the connections between cells is a big step forward. Knowing what the cells are supposed to represent is an even bigger one.
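
To make the "cells are connected by references" idea concrete, here is a rough sketch in plain Python rather than Neo4j and Cypher. The worksheet contents, the regular expression and the function names are all hypothetical and are not taken from Felienne's presentation.

```python
import re

# Hypothetical worksheet: cell address -> literal value or formula text.
cells = {
    "A1": "100",
    "A2": "250",
    "B1": "=A1+A2",
    "C1": "=B1*0.2",
    "D1": "=B1-C1",
}

# Simple A1-style references only; no ranges, sheet names or absolute ($) refs.
CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")


def build_graph(sheet):
    """Map each cell to the set of cells its formula references."""
    graph = {}
    for address, content in sheet.items():
        is_formula = content.startswith("=")
        graph[address] = set(CELL_REF.findall(content)) if is_formula else set()
    return graph


def precedents(graph, address):
    """All cells that `address` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(address, ()))
    while stack:
        cell = stack.pop()
        if cell not in seen:
            seen.add(cell)
            stack.extend(graph.get(cell, ()))
    return seen


graph = build_graph(cells)
print(sorted(precedents(graph, "D1")))
```

The graph records that D1 ultimately depends on A1 and A2, but nothing in it says what A1 or A2 are supposed to represent. That is the missing piece.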

Trademark Litigation Attorney Needed – Contingency Fee Case

August 26th, 2015

No, not for me but for Grsecurity.

Important Notice Regarding Public Availability of Stable Patches by Brad Spengler & The PaX Team.

From the webpage:

Grsecurity has existed for over 14 years now. During this time it has been the premier solution for hardening Linux against security exploits and served as a role model for many mainstream commercial applications elsewhere. All modern OSes took our lead and implemented to varying degrees a number of security defenses we pioneered; some have even been burned into silicon in newer processors. Over the past decade, these defenses (a small portion of those we’ve created and have yet to release) have single-handedly caused the greatest increase in security for users worldwide.


A multi-billion dollar corporation had made grsecurity a critical component of their embedded platform. This in itself isn’t a problem, nor is it necessarily (albeit extremely unwise) that they’re using an old, unsupported kernel and a several year old, unsupported version of grsecurity that they’ve modified. This seems to be the norm for the embedded Linux industry, seemingly driven by a need to mark a security checkbox at the lowest cost possible. So it’s no surprise that they didn’t bother to hire us to perform the port properly for them or to actively maintain the security of the kernel they’re providing to their paid customers.

They are publishing a “grsecurity” for a kernel version we never released a patch for. We provided evidence to their lawyers of one of their employees registering on our forums and asking for free help with backporting an EFI fix to their modified version of grsecurity based off a very old patch of ours (a test patch that wasn’t even the last one released for that major kernel version). The company’s lawyers repeatedly claimed the company had not modified the grsecurity code in any way and that therefore all the references to “grsecurity” in their product were therefore only nominative use of the trademark to refer to our external work. They would therefore not cease using our trademark and would continue to do so despite our objections. This final assertion occurred three months after our initial cease and desist letter. They also threatened to request “all available sanctions and attorneys’ fees” were we to proceed with a lawsuit against them.

This announcement is our public statement that we’ve had enough. Companies in the embedded industry not playing by the same rules as every other company using our software violates users’ rights, misleads users and developers, and harms our ability to continue our work. Though I’ve only gone into depth in this announcement on the latest trademark violation against us, our experience with two GPL violations over the previous year have caused an incredible amount of frustration. These concerns are echoed by the complaints of many others about the treatment of the GPL by the embedded Linux industry in particular over many years.

With that in mind, today’s announcement is concerned with the future availability of our stable series of patches. We decided that it is unfair to our sponsors that the above mentioned unlawful players can get away with their activity. Therefore, two weeks from now, we will cease the public dissemination of the stable series and will make it available to sponsors only. The test series, unfit in our view for production use, will however continue to be available to the public to avoid impact to the Gentoo Hardened and Arch Linux communities. If this does not resolve the issue, despite strong indications that it will have a large impact, we may need to resort to a policy similar to Red Hat’s, described here or eventually stop the stable series entirely as it will be an unsustainable development model.

If you know a trademark attorney or would like to donate the services of one (large corporations have them by the bag full), consider contacting Grsecurity.

Bottom feeders, as described in Brad’s post, remind me of the “pigs in their stys with all their backing, what they need is a damned good whacking.”

Please re-post, forward, distribute, etc.

Looking for Big Data? Look Up!

August 25th, 2015

Gaia’s first year of scientific observations

From the post:

After launch on 19 December 2013 and a six-month long in-orbit commissioning period, the satellite started routine scientific operations on 25 July 2014. Located at the Lagrange point L2, 1.5 million km from Earth, Gaia surveys stars and many other astronomical objects as it spins, observing circular swathes of the sky. By repeatedly measuring the positions of the stars with extraordinary accuracy, Gaia can tease out their distances and motions through the Milky Way galaxy.

For the first 28 days, Gaia operated in a special scanning mode that sampled great circles on the sky, but always including the ecliptic poles. This meant that the satellite observed the stars in those regions many times, providing an invaluable database for Gaia’s initial calibration.

At the end of that phase, on 21 August, Gaia commenced its main survey operation, employing a scanning law designed to achieve the best possible coverage of the whole sky.

Since the start of its routine phase, the satellite recorded 272 billion positional or astrometric measurements, 54.4 billion brightness or photometric data points, and 5.4 billion spectra.

The Gaia team have spent a busy year processing and analysing these data, en route towards the development of Gaia’s main scientific products, consisting of enormous public catalogues of the positions, distances, motions and other properties of more than a billion stars. Because of the immense volumes of data and their complex nature, this requires a huge effort from expert scientists and software developers distributed across Europe, combined in Gaia’s Data Processing and Analysis Consortium (DPAC).

In case you missed it:

Since the start of its routine phase, the satellite recorded 272 billion positional or astrometric measurements, 54.4 billion brightness or photometric data points, and 5.4 billion spectra.

It sounds like big data. Yes? ;-)
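
For a rough sense of scale, the quoted counts can be added up and spread over the observing period. This is a back-of-the-envelope sketch only: the three counts come from the post, while the 365-day period is my assumption (the post says only "first year").

```python
# Counts quoted in the post.
astrometric = 272_000_000_000   # positional measurements
photometric = 54_400_000_000    # brightness data points
spectra = 5_400_000_000

total = astrometric + photometric + spectra

# Assumption: roughly one year of routine operations.
ASSUMED_DAYS = 365
per_day = total // ASSUMED_DAYS

print(f"total measurements: {total:,}")
print(f"approximate measurements per day: {per_day:,}")
print(f"astrometric per photometric point: {astrometric / photometric:.1f}")
```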

Public release of the data is pending. Check back at the Gaia homepage for the latest news.

Your Fridge Joined Ashley Madison?

August 24th, 2015

Samsung smart fridge leaves Gmail logins open to attack by John Leyden.

From the post:

Security researchers have discovered a potential way to steal users’ Gmail credentials from a Samsung smart fridge.

Pen Test Partners discovered the MiTM (man-in-the-middle) vulnerability that facilitated the exploit during an IoT hacking challenge run by Samsung at the recent DEF CON hacking conference.

The hack was pulled off against the RF28HMELBSR smart fridge, part of Samsung’s line-up of Smart Home appliances which can be controlled via their Smart Home app. While the fridge implements SSL, it fails to validate SSL certificates, thereby enabling man-in-the-middle attacks against most connections.

The internet-connected device is designed to download Gmail Calendar information to an on-screen display. Security shortcomings mean that hackers who manage to jump on to the same network can potentially steal Google login credentials from their neighbours.
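
The flaw described, using SSL without validating certificates, is easy to picture in code. The sketch below uses Python's standard ssl module to contrast a validating client context with one configured the way the report suggests the fridge behaves. It illustrates the class of mistake and is not the fridge's actual code, which the post does not show; the function names are mine.

```python
import ssl


def strict_context() -> ssl.SSLContext:
    """Context that validates the server's certificate chain and hostname."""
    return ssl.create_default_context()


def fridge_like_context() -> ssl.SSLContext:
    """Context that encrypts traffic but accepts any certificate.

    Shown only to illustrate the mistake; a man-in-the-middle can present
    its own certificate and the client will not notice.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE   # no certificate validation at all
    return ctx


print(strict_context().verify_mode)       # certificate required
print(fridge_like_context().verify_mode)  # no validation
```

Encryption without validation protects against passive eavesdropping only. Anyone who can sit between the device and the server can still read the traffic.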

The certainty of online transactions diminishes with the spread of the internet-of-things (IoT).

Think about it. My email, packets with my router address, etc. may appear in massive NSA data vacuum bags. What defense do I have other than “I didn’t send, receive, etc.?” I’m not logging or at least not preserving logs of every bit of every keystroke on my computer.

Are you?

And if you did, how would you authenticate it to the NSA?

Of course, the authenticity, or subject identity in topic map terms, of an email in the Ashley Madison data should depend on a number of related factors, such as the user profile associated with that email. Are sexual profiles as unique as fingerprints?

Authenticity hasn’t been raised in the NSA phone surveillance debate but if you think “phone call tracking is connecting dots,” it isn’t likely to come up.

I used to call the time-of-day service after every client call. It wasn’t a secretive messaging technique; it was to clear the last called buffer on the phone. Sometimes a lack of a pattern is just that, lack of a pattern.

Linux on the Mainframe

August 24th, 2015

Linux Foundation Launches Open Mainframe Project to Advance Linux on the Mainframe

From the post:

The Linux Foundation, the nonprofit organization dedicated to accelerating the growth of Linux and collaborative development, announced the Open Mainframe Project. This initiative brings together industry experts to drive innovation and development of Linux on the mainframe.

Founding Platinum members of the Open Mainframe Project include ADP, CA Technologies, IBM and SUSE. Founding Silver members include BMC, Compuware, LC3, RSM Partners and Vicom Infinity. The first academic institutions participating in the effort include Marist College, University of Bedfordshire and The Center for Information Assurance and Cybersecurity at University of Washington. The announcement comes as the industry marks 15 years of Linux on the mainframe.

In just the last few years, demand for mainframe capabilities have drastically increased due to Big Data, mobile processing, cloud computing and virtualization. Linux excels in all these areas, often being recognized as the operating system of the cloud and for advancing the most complex technologies across data, mobile and virtualized environments. Linux on the mainframe today has reached a critical mass such that vendors, users and academia need a neutral forum to work together to advance Linux tools and technologies and increase enterprise innovation.

“Linux today is the fastest growing operating system in the world. As mobile and cloud computing become globally pervasive, new levels of speed and efficiency are required in the enterprise and Linux on the mainframe is poised to deliver,” said Jim Zemlin executive director at The Linux Foundation. “The Open Mainframe Project will bring the best technology leaders together to work on Linux and advanced technologies from across the IT industry and academia to advance the most complex enterprise operations of our time.”

Linux Foundation Collaborative Projects, visit: http://collabprojects.linuxfoundation.org/

Open Mainframe Project, visit: https://www.openmainframeproject.org/

In terms of ancient topic map history, recall that both topic maps and DocBook arose out of what became the X-Windows series by O’Reilly. If you are familiar with the series, you can imagine the difficulty of adapting it to the nuances of different vendor releases and vocabularies.

Several of the volumes from the X-Windows series are available in the O’Reilly OpenBook Project.

I mention that item of topic map history because documenting mainframe Linux isn’t going to be a trivial task. A useful index across documentation from multiple authors is going to require topic maps or something very close to it.

One last bit of trivia: the X-Windows project can be found at www.x.org. How’s that for cool? A single letter name.

Popcorn Time Information Banned in Denmark

August 24th, 2015

Not content to prosecute actual copyright violators, Denmark is prosecuting people who spread information about software that can violate copyrights.

That’s right! Information, not links to pirated content, not the software, just information about the software.

Police Arrest Men For Spreading Popcorn Time Information.

From the post:

While arrests of file-sharers and those running sites that closely facilitate infringement are nothing new, this week’s arrests appear to go way beyond anything seen before. The two men are not connected to the development of Popcorn Time and have not been offering copyrighted content for download.

Both sites were information resources, offering recent news on Popcorn Time related developments, guides, FAQ sections and tips on how to use the software.

Both men stand accused of distributing knowledge and guides on how to obtain illegal content online and are reported to have confessed.

I wonder what “confessed” means under these circumstances? Confessed to providing up-to-date and useful information on Popcorn Time? That’s hardly a crime by any stretch of the imagination.

I realize there is a real shortage of crime in Denmark (http://www.nationmaster.com/country-info/profiles/Denmark/Crime), but that’s no excuse to get overly inventive with regard to intellectual property crimes.

Before I forget:

Those looking for a clearer (and live) idea of what the site looked like before it was taken down should check out getpopcorntime.co.uk, which was previously promoted by PopcornTime.dk as an English language version of their site.

Whenever you encounter banned sites or information, be sure to pass the banned information along.

Censorship has no legitimate role on the Internet. If you don’t want to see particular content, don’t look. What other people choose to look at is their business and none of yours.

Child porn is the oft-cited example for censorship on the Internet. I agree it is evil, etc., but why concentrate on people sharing child porn? Shouldn’t the police be seeking the people making child porn?

Makes you wonder, doesn’t it? Are the police ineffectually swatting (sorry) at the distribution of child porn and ignoring the real crimes of making child porn?

With modern day image recognition, you have to wonder why the police aren’t identifying more children in child porn? Or are they so wedded to ineffectual but budget supporting techniques that they haven’t considered the alternatives?

I am far more sympathetic to the use of technology to catch the producers of child porn than to state functionaries attempting to suppress the free interchange of information on the Internet.

Lisp for the Modern Web

August 23rd, 2015

Lisp for the Modern Web by Vito Van.

From the post:

What to Expect

This piece is about how to build a modern web application with Common Lisp in the backend, from scratch.

You may need to have some knowledge about Front End Development, cause we won’t explain the steps for building the client.

Why Lisp? Again

It is awesome.

I don’t think we need another reason for using Lisp, do we? Life is short, let’s be awesome!

It’s been more than half a century since Lisp first appeared, she’s like the The One Ring in the Middle-earth. The one who mastered the spell of Lisp, will rule the world, once again.

Other reasons.

If you need some other reasons beside awesome, here is some articles about Lisp, enjoy them.

I have never been fond of triumphalism so let me give you a more pragmatic reason for using Lisp:

Ericka Chickowski reports in Angler Climbing To Top Of Exploit Heap that Angler makes up 82% of the exploit kits in use.

Angler targets? Adobe Flash and Java.

Any questions?

You can write vulnerable code in Lisp just as you can in any other language. But then it will be your mistake and not something broken at the outset.

I first saw this in a tweet by Christophe Lalanne.

1962 United States Tourist Map

August 23rd, 2015

Visit every place on this vintage US map for the most epic road trip ever by Phil Edwards.

Part of the joy of this map comes from being old enough to remember maps similar to this one.

Critics can scan the map for what isn’t represented as tourist draws.

Consider it to be a snapshot of the styles and interests.

Most notable absence? Cape Canaveral.

I suspect its absence reflects the lead time involved in the drafting and publishing of a map at the time.

Explorer 1 (1958) and the first American in space, Alan Shepard (1961), both preceded this map.


Decoding Satellite-Based Text Messages… [Mini-CIA]

August 23rd, 2015

Decoding Satellite-Based Text Messages with RTL-SDR and Hacked GPS Antenna by Rick Osgood.

From the post:

[Carl] just found a yet another use for the RTL-SDR. He’s been decoding Inmarsat STD-C EGC messages with it. Inmarsat is a British satellite telecommunications company. They provide communications all over the world to places that do not have a reliable terrestrial communications network. STD-C is a text message communications channel used mostly by maritime operators. This channel contains Enhanced Group Call (EGC) messages which include information such as search and rescue, coast guard, weather, and more.

Not much equipment is required for this, just the RTL-SDR dongle, an antenna, a computer, and the cables to hook them all up together. Once all of the gear was collected, [Carl] used an Android app called Satellite AR to locate his nearest Inmarsat satellite. Since these satellites are geostationary, he won’t have to move his antenna once it’s pointed in the right direction.

You may have to ally with a neighbor who is good with a soldering iron but considering the amount of RF in the air, you should be able to become the mini-CIA for your area.

Not that the data itself may be all that interesting, but munging cellphone data with video surveillance of street traffic, news and other feeds, plus other RF sources, will hone your data handling skills.

For example, have you ever wondered how many of your neighbors obey watering restrictions during droughts? One way to find out is to create a baseline set of data for water usage (meters now report digitally) and check periodically when drought restrictions are in effect.
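
A baseline comparison of that kind might look like the sketch below. Everything in it is hypothetical: the addresses, the readings, the 20% required reduction and the function name are invented for illustration, and real meter data would need far more careful handling.

```python
from statistics import mean

# Hypothetical weekly meter readings in gallons, keyed by address.
baseline = {
    "12 Elm St": [1800, 1750, 1900],
    "14 Elm St": [2100, 2000, 2200],
}
during_restrictions = {
    "12 Elm St": [1200, 1150],
    "14 Elm St": [2150, 2300],
}


def flag_non_compliant(baseline, restricted, required_cut=0.2):
    """Return addresses whose average use did not drop by `required_cut`.

    `required_cut` is an assumed policy: 0.2 means usage must fall at least
    20% below the household's own baseline average.
    """
    flagged = []
    for address, readings in restricted.items():
        allowed = mean(baseline[address]) * (1 - required_cut)
        if mean(readings) > allowed:
            flagged.append(address)
    return sorted(flagged)


print(flag_non_compliant(baseline, during_restrictions))
```

Comparing each household against its own baseline, rather than against a town-wide average, avoids flagging large families who have in fact cut back.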

Nothing enlivens a town or county meeting like a color-coded chart of water cheats. (That will exercise your mapping skills as well.)

Using topic maps will facilitate merging your water surveillance data with other data, such as traffic patterns of different cars at particular locations, or the times at which cars arrive at and depart from some location.

Cisco 2015 Midyear Security Report

August 23rd, 2015

Cisco 2015 Midyear Security Report

A must read for this graphic if nothing else:



The top three?

  1. Buffer Errors – 471
  2. Input Validation – 244
  3. Resource Management Errors – 238

If we assume that #4, Permissions, Privileges and Access Control – 155 and Information Leak/Disclosure – 138, are not within a vendor’s control, the remaining 295 are. Added to the top three vulnerabilities (953), vendor preventable vulnerabilities total 1248 out of 1541 or 81% of the vulnerabilities in the graphic.
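
The arithmetic is easy to check. The sketch below uses only the counts quoted in this post (the graphic itself is not reproduced here); the variable names are mine.

```python
# Counts as quoted from the Cisco graphic.
top_three = {
    "Buffer Errors": 471,
    "Input Validation": 244,
    "Resource Management Errors": 238,
}
outside_vendor_control = {
    "Permissions, Privileges and Access Control": 155,
    "Information Leak/Disclosure": 138,
}
total = 1541

top = sum(top_three.values())
excluded = sum(outside_vendor_control.values())
remaining = total - top - excluded   # everything else in the graphic
preventable = top + remaining        # vendor preventable under the assumption

print("top three:", top)
print("remaining:", remaining)
print("preventable:", preventable)
print("preventable share: %d%%" % round(100 * preventable / total))
print("top three share: %d%%" % round(100 * top / total))
```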

Cisco has an answer for why this pattern repeats year after year:

The problem lies in insufficient attention being paid to the secure development lifecycle. Security safeguards and vulnerability tests should be built in as a product is being developed. Instead, vendors wait until the product reaches the market and then address its vulnerabilities.

You can’t say that Cisco is anti-vendor, being a software vendor itself.

Under current law, it is cheaper for software vendors to fix only the vulnerabilities that are discovered (for free) by others.

Yes, the key phrase is “under current law.”

Strict liability and minimum (say $5K) damages for

  1. Buffer Errors
  2. Input Validation
  3. Resource Management Errors

would be a large step towards eliminating 62% of the vulnerabilities each year.

Vendors would not have to hunt for every possible vulnerability, just those that fall into those three categories. (This is to blunt the argument that hunting vulnerabilities is sooo difficult. Perhaps, but I propose to eliminate only three categories of them.)

Strict liability would eliminate all the tiresome EULA issues for all plaintiffs.

Mandatory minimum damages would make finding lawyers to bring the suits easy.

Setting specific vulnerability criteria limits the cry that “perfect” software isn’t possible. True, but techniques to avoid buffer overflows existed in the 1960s.

Users aren’t asking for perfection but that they and their files have a higher status than digital litter.

TinkerPop3 Promo in One Paragraph

August 22nd, 2015

Marko A. Rodriguez tweeted https://news.ycombinator.com/item?id=10104282 as a “single paragraph” explanation of why you should prefer TinkerPop3 over TinkerPop2.

Of course, I didn’t believe the advantages could be contained in a single paragraph but you be the judge:

Check out http://tinkerpop.com. Apache TinkerPop 3.0.0 was released in June 2015 and it is a quantum leap forward. Not only is it now apart of the Apache Software Foundation, but the Gremlin3 query language has advanced significantly since Gremlin2. The language is much cleaner, provides declarative graph pattern matching constructs, and it supports both OLTP graph databases (e.g. Titan, Neo4j, OrientDB) and OLAP graph processors (e.g. Spark, Giraph). With most every graph vendor providing TinkerPop-connectivity, this should make it easier for developers as they don’t have to learn a new query language for each graph system and developers are less prone to experience vendor lock-in as their code (like JDBC/SQL) can just move to another underlying graph system.

Are my choices developer lock-in versus vendor lock-in? That’s a tough call. ;-)

Do check out TinkerPop3!

100 open source Big Data architecture papers for data professionals

August 22nd, 2015

100 open source Big Data architecture papers for data professionals by Anil Madan.

From the post:

Big Data technology has been extremely disruptive with open source playing a dominant role in shaping its evolution. While on one hand it has been disruptive, on the other it has led to a complex ecosystem where new frameworks, libraries and tools are being released pretty much every day, creating confusion as technologists struggle and grapple with the deluge.

If you are a Big Data enthusiast or a technologist ramping up (or scratching your head), it is important to spend some serious time deeply understanding the architecture of key systems to appreciate its evolution. Understanding the architectural components and subtleties would also help you choose and apply the appropriate technology for your use case. In my journey over the last few years, some literature has helped me become a better educated data professional. My goal here is to not only share the literature but consequently also use the opportunity to put some sanity into the labyrinth of open source systems.

One caution, most of the reference literature included is hugely skewed towards deep architecture overview (in most cases original research papers) than simply provide you with basic overview. I firmly believe that deep dive will fundamentally help you understand the nuances, though would not provide you with any shortcuts, if you want to get a quick basic overview.

Jumping right in…

You will have a great background in Big Data if you read all one hundred (100) papers.

What you will be missing is an overview that ties the many concepts and terms together into a coherent narrative.

Perhaps after reading all 100 papers, you will start over to map the terms and concepts one to the other.

That would be both useful and controversial within the field of Big Data!


I first saw this in a tweet by Kirk Borne.

Images for Social Media

August 21st, 2015

23 Tools and Resources to Create Images for Social Media

From the post:

Through experimentation and iteration, we’ve found that including images when sharing to social media increases engagement across the board — more clicks, reshares, replies, and favorites.

Using images in social media posts is well worth trying with your profiles.

As a small business owner or a one-man marketing team, is this something you can pull off by yourself?

At Buffer, we create all the images for our blogposts and social media sharing without any outside design help. We rely on a handful of amazing tools and resources to get the job done, and I’ll be happy to share with you the ones we use and the extras that we’ve found helpful or interesting.

If you tend to scroll down numbered lists (like I do), you will be left thinking the creators of the post don’t know how to count: the end of the numbered list isn’t 23.

If you look closely, there are several lists of unnumbered resources. So, you’re thinking that they do know how to count, but some of the items are unnumbered.

Should be, but it’s not. There are thirteen (13) unnumbered items, which added to fifteen (15), makes twenty-eight (28).

So, I suspect the title should read: 28 Tools and Resources to Create Images for Social Media.

In any event, it’s a fair collection of tools that, with some effort on your part, can increase your social media presence.


Parens of the Dead

August 21st, 2015

Parens of the Dead: A screencast series of zombie-themed games written with Clojure and ClojureScript.

Three episodes posted thus far:

Episode 1: Lying in the Ground

Starting with an empty folder, we’ll lay the technical groundwork for our game. We’ll get a Clojure web server up and running, compiling and sending ClojureScript to the browser.

Episode 2: Frontal Assault

In this one, we create most of the front-end code. We take a look at the data structure that describes the game, using that to build up our UI.

Episode 3: What Lies Beneath

The player has only one action available; revealing a tile. We’ll start implementing the central ‘reveal-tile’ function on the backend, writing tests along the way.

Next episode? Follow @parensofthedead

Another innovative instruction technique!


1) Have your volume control available because I found the sound in the screencasts to be very soft.

2) Be prepared to move very quickly as episode one, for example, is only eleven minutes long.

3) Download the code and walk through it at a slower pace.


Pandering for Complaints

August 21st, 2015

Yesterday, in Censorship of Google Spreads to the UK, I mentioned that the UK has joined the ranks of censors of Google and is attempting to fine-tune search results for a given name.

Today, Simon Rice of the Information Commissioner’s Office posted: Personal data in leaked datasets is still personal data.

Simon starts off by mentioning the Ashley Madison data dumps and then says:

Anyone in the UK who might download, collect or otherwise process the leaked data needs to be aware they could be taking on data protection responsibilities defined in the UK’s Data Protection Act.

Similarly, seeking to identify an individual from a leaked dataset will be an intrusion into their private life and could also lead to a breach of the DPA.

Individuals will have a range of personal reasons for having created an account with particular online services (or even had an account created without their knowledge) and any publication of further personal data without their consent can cause them significant damage or distress.

It’s worth noting too that any individual or organisation seeking to rely on the journalism exemption should be reminded that this is not a blanket exemption to the DPA and be encouraged to read our detailed guide on how the DPA applies to journalism.

Talk about chilling free speech. You shouldn’t even look to see if the data is genuine. Just don’t look!

You could let your “betters” in the professional press tell you what they want you to know, but I suspect you are brighter than that. What are the press motives behind what you see and what you don’t?

To make matters even worse, Simon closes with a solicitation for complaints:

If you find your personal data being published online then you have a right to go to that publisher and request that the information is removed. This applies equally to information being shared on social media. If the publisher is based in the UK and fails to remove your information you can complain to the ICO.

I don’t have a lot of extra webspace but if you get a complaint from the ICO, I’m willing to host whatever data I can. It won’t be much so don’t get too excited about free space.

We all need to step up and offer storage space for content censored by the UK and others.

Disclosing Government Contracts

August 21st, 2015

The More the Merrier? How much information on government contracts should be published and who will use it by Gavin Hayman.

From the post:

A huge bunch of flowers to Rick Messick for his excellent post asking two key questions about open contracting. And some luxury cars, expensive seafood and a vat or two of cognac.

Our lavish offerings all come from Slovakia, where in 2013 the Government Public Procurement Office launched a new portal publishing all its government contracts. All these items were part of the excessive government contracting uncovered by journalists, civil society and activists. In the case of the flowers, teachers investigating spending at the Department of Education uncovered florists’ bills for thousands of euros. Spending on all of these has subsequently declined: a small victory for fiscal probity.

The flowers, cars, and cognac help to answer the first of two important questions that Rick posed: Will anyone look at contracting information? In the case of Slovakia, it is clear that lowering the barriers to access information did stimulate some form of response and oversight.

The second question was equally important: “How much contracting information should be disclosed?”, especially in commercially sensitive circumstances.

These are two of key questions that we have been grappling with in our strategy at the Open Contracting Partnership. We thought that we would share our latest thinking below, in a post that is a bit longer than usual. So grab a cup of tea and have a read. We’ll be definitely looking forward to your continued thoughts on these issues.

Not a short read so do grab some coffee (outside of Europe) and settle in for a good read.

Disclosure: I’m financially interested in government disclosure in general and contracts in particular. With openness comes more effort to conceal semantics, which increases the need for topic maps to pierce the darkness.

I don’t think openness reduces the amount of fraud and misconduct in government; it only gives an alignment between citizens and the career interests of a prosecutor a sporting chance to catch someone out.

Disclosure should be as open as possible and what isn’t disclosed voluntarily, well, one hopes for brave souls who will leak the remainder.

Support disclosure of government contracts and leakers of the same.

If you need help “connecting the dots,” consider topic maps.

TSA Master Luggage Keys

August 21st, 2015

The paragons of security who keep you safe (sic) in the air, the TSA, helpfully offered a photo op for some of their masterkeys to your luggage.


Frantic discussion of images of these keys should be tempered with the knowledge that an ordinary screwdriver will open more suitcase locks than the entire set of TSA masterkeys.

Oh, but what if they want to keep it a secret? You mean like the people with the keys who put flyers in your bag when they open it? Yes?

Anyone else opening your luggage is looking for the quickest way possible and relocking isn’t a value to them.

Still, an example of the highly trained and security aware public servants who are making air travel safe, while never catching a single terrorist.

Must be like driving all the snakes out of Ireland.

Solving the Stable Marriage problem…

August 21st, 2015

Solving the Stable Marriage problem with Erlang by Yan Cui.

With all the Ashley Madison hack publicity, I didn’t know there was a “stable marriage problem.” ;-)

Turns out it is like the Eight-Queens problem. It is a “problem” but it isn’t one you are likely to encounter outside of a CS textbook.

Yan sets up the problem with this quote from Wikipedia:

The stable marriage problem is commonly stated as:

Given n men and n women, where each person has ranked all members of the opposite sex with a unique number between 1 and n in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. If there are no such people, all the marriages are “stable”. (It is assumed that the participants are binary gendered and that marriages are not same-sex).

The wording is a bit awkward. I would rephrase it to say that no two people, each married to someone else, prefer each other to their current partners. One partner can prefer someone else, but if that someone else does not share the preference, both marriages are “stable.”

The Wikipedia article does observe:

While the solution is stable, it is not necessarily optimal from all individuals’ points of view.

Yan sets up the problem and then walks through the required code.
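For readers who want to see the idea before opening the Erlang walkthrough, here is a minimal Python sketch of the classic Gale-Shapley proposal algorithm. It is not Yan's code, and I have not checked that his post uses this exact approach. The names `stable_marriage`, `is_stable`, and the two-couple sample data are hypothetical, and the sketch assumes every person ranks every member of the other group.

```python
from collections import deque


def stable_marriage(men_prefs, women_prefs):
    """Gale-Shapley: men propose in preference order, women keep their best offer.

    Both arguments map a name to a complete, ordered preference list
    (an assumption of this sketch). Returns a dict of man -> woman.
    """
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman to ask
    engaged_to = {}                          # woman -> man
    free_men = deque(men_prefs)

    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = engaged_to.get(w)
        if current is None:
            engaged_to[w] = m                # w was free, so she accepts
        elif rank[w][m] < rank[w][current]:
            engaged_to[w] = m                # w trades up; her ex is free again
            free_men.append(current)
        else:
            free_men.append(m)               # w rejects m; he asks someone else later
    return {m: w for w, m in engaged_to.items()}


def is_stable(matching, men_prefs, women_prefs):
    """True if no man and woman both prefer each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break                        # everyone after this ranks below his wife
            their_ranking = women_prefs[w]
            if their_ranking.index(m) < their_ranking.index(husband[w]):
                return False                 # m and w would both rather switch
    return True


# Hypothetical sample data: both men prefer cat, and cat prefers bob.
men = {"abe": ["cat", "dee"], "bob": ["cat", "dee"]}
women = {"cat": ["bob", "abe"], "dee": ["abe", "bob"]}

if __name__ == "__main__":
    result = stable_marriage(men, women)
    print(result, is_stable(result, men, women))
```

In the sample data abe ends up with his second choice, yet the result is stable because cat does not return his preference. That is the one-sided case described above.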


Conversations With Datomic

August 21st, 2015

Conversations With Datomic by Carin Meier. (See Conversations With Datomic Part 2 as well.)

Perhaps not “new” but certainly an uncommon approach to introducing users to a database system.

Carin has a “conversation” with Datomic that starts from the very beginning of creating a database and goes forward.

Rewarding and a fun read!


SEC To Accept Inline XBRL?

August 20th, 2015

SEC commits to human- and machine-readable format that could fix agency’s open data problems by Justin Duncan.

The gist of the story is that the SEC has, unfortunately, been accepting both plain text and XBRL filings since 2009. Since XBRL results in open data, you can imagine the effort put into that data by filers.

At the urging of Congress, the SEC has stated in writing:

SEC staff is currently developing recommendations for the Commission’s consideration to allow filers to submit XBRL data inline as part of their core filings, rather than filing XBRL data in an exhibit.

Before you celebrate too much, note that the SEC didn’t offer any dates for acceptance of inline XBRL filings.

Still, better an empty promise than no promise at all.

If you are interested in making sure that inline XBRL data does result in meaningful disclosure, if and when it is used for SEC filings, consult the following:

Inline XBRL 1.1 (standard)

An Integrator’s Guide to Inline XBRL

Plus you may want to consider how you would use XQuery to harvest and combine XBRL data with other data sources. It’s not too early to be thinking about “enhanced” results.
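To give a feel for what harvesting might look like, here is a rough sketch in Python rather than XQuery, using only the standard library. It assumes the facts of interest are `ix:nonFraction` elements in the Inline XBRL 1.1 namespace. The sample document, the `ex:` taxonomy, the employee counts, and the function name `harvest_facts` are all made up for illustration. A real harvester would also have to honor the `format`, `scale`, and `sign` attributes, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Namespace used by Inline XBRL 1.1 documents.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# Hypothetical filing fragment; the ex: taxonomy and figures are invented.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL"
      xmlns:ex="http://example.com/taxonomy">
  <body>
    <p>Revenue was <ix:nonFraction name="ex:Revenue" contextRef="FY2014"
       unitRef="USD" decimals="0">1,250</ix:nonFraction> thousand.</p>
    <p>Net income was <ix:nonFraction name="ex:NetIncome" contextRef="FY2014"
       unitRef="USD" decimals="0">310</ix:nonFraction> thousand.</p>
  </body>
</html>"""


def harvest_facts(xhtml_text):
    """Return {(concept name, context): numeric value} for each tagged number."""
    root = ET.fromstring(xhtml_text)
    facts = {}
    for element in root.iter("{%s}nonFraction" % IX_NS):
        # Strip thousands separators; real documents declare this via @format.
        text = "".join(element.itertext()).replace(",", "")
        facts[(element.get("name"), element.get("contextRef"))] = float(text)
    return facts


if __name__ == "__main__":
    facts = harvest_facts(SAMPLE)
    # Combine with another (hypothetical) data source keyed by the same period.
    employees = {"FY2014": 50}
    revenue = facts[("ex:Revenue", "FY2014")]
    print("Revenue per employee:", revenue / employees["FY2014"])
```

The point is that the numbers a human reads and the numbers a machine harvests are the same characters in the same document, which is what makes combining them with outside data attractive.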

Censorship of Google Spreads to the UK

August 20th, 2015

Google ordered to remove links to ‘right to be forgotten’ removal stories by Samuel Gibbs.

From the post:

Google has been ordered by the Information Commissioner’s office to remove nine links to current news stories about older reports which themselves were removed from search results under the ‘right to be forgotten’ ruling.

The search engine had previously removed links relating to a 10 year-old criminal offence by an individual after requests made under the right to be forgotten ruling. Removal of those links from Google’s search results for the claimant’s name spurred new news posts detailing the removals, which were then indexed by Google’s search engine.

Google refused to remove links to these later news posts, which included details of the original criminal offence, despite them forming part of search results for the claimant’s name, arguing that they are an essential part of a recent news story and in the public interest.

Google now has 35 days from the 18 August to remove the links from its search results for the claimant’s name. Google has the right to appeal to the General Regulatory Chamber against the notice.

It is spectacularly sad that this wasn’t the gnomes that run the EU bureaucracy, looking for something pointless to occupy their time and the time of others.

No, this was the Information Commissioner’s Office:

The UK’s independent authority set up to uphold information rights in the public interest, promoting openness by public bodies and data privacy for individuals.

Despite this being a story of public interest and conceding that the public has an interest in finding stories about delisted searches:

27. Journalistic context — The Commissioner accepts that the search results in this case relate to journalistic content. Further, the Commissioner does not dispute that journalistic content relating to decisions to delist search results may be newsworthy and in the public interest. However, that interest can be adequately and properly met without a search made on the basis of the complaint’s name providing links to articles which reveal information about the complainant’s spent conviction.

The decision goes on to give Google 35 days from the 18th of August to delist websites which appear in search results on the basis of a censored name. And of course, the links are censored as well.

Although Google has yet to fix the StageFright vulnerability, which impacts 950 million Android users, the Information Commissioner’s Office wants it to fine-tune the search results for a given name to exclude particular websites.

In the not too distant future, the search results displayed in Google will represent a vetting by regimes ranging from the most oppressive in the world to the silliest.

Google should not appeal this decision but simply ignore it.

It is an illegal and illegitimate intrusion both on the public’s right to search by any means or manner it chooses and Google’s right to truthfully report the results of searches.

Free Packtpub Books (Legitimate Ones)

August 20th, 2015

Packtpub Books is running a “free book per day” event. Most of you know Packtpub already so I won’t belabor the quality of their publications, etc.

The important news is that each day in August, Packtpub Books is offering a different book as a free download for 24 hours! The current free book offer appears to expire at the end of August, 2015.

Packtpub Books – Free Learning

This is a great way to introduce non-Packtpub customers to Packtpub publications.

Please share this news widely (and with other publishers). ;-)

950 Million Users – Scoped and Bracketed – StageFright

August 20th, 2015

Summary: StageFright patch flawed – 950 Million Android users still vulnerable.

Jordan Gruskovnjak / @jgrusko (technical details) and Aaron Portnoy / @aaronportnoy (commentary) in Stagefright: Mission Accomplished? offer these findings on the StageFright patch from Google:

  • The flaw was initially reported over 120 days ago to Google, which exceeds even their own 90-day disclosure deadline
  • The patch is 4 lines of code and was (presumably) reviewed by Google engineers prior to shipping. The public at large believes the current patch protects them when it in fact does not.
  • The flaw affects an estimated 950 million Google customers.
  • Despite our notification (and their confirmation), Google is still currently distributing the faulty patch to Android devices via OTA updates
  • There has been an inordinate amount of attention drawn to the bug–we believe we are likely not the only ones to have noticed it is flawed. Others may have malicious intentions.
  • Google has not given us any indication of a timeline for correcting the faulty patch, despite our queries.
  • The Stagefright Detector application released by Zimperium (the company behind the initial discovery) reports “Congratulations! Your device is not affected by vulnerabilities in Stagefright!” when in fact it is, leading to a false sense of security among users.

Read the full post by Jordan Gruskovnjak and Aaron Portnoy for technical details and commentary on this failure to patch StageFright.

The Gremlin Graph Traversal Language (slides)

August 19th, 2015

The Gremlin Graph Traversal Language by Marko Rodriguez.

Forty-five (45) out of fifty (50) slides have working Gremlin code!

Ninety percent (90%) of the slides have code you can enter!

It isn’t as complete as The Gremlin Graph Traversal Machine and Language, but on the other hand, it is a hell of a lot easier to follow along.


“True” Size?

August 19th, 2015

This interactive map shows how ‘wrong’ other maps are by Adam Taylor.

From the post:

Given how popular the Mercator projection is, it’s wise to question how it makes us view the world. Many have noted, for example, how the distortion around the poles makes Africa look smaller than Greenland, when in reality Africa is about 14.5 times as big. In 2010, graphic artist Kai Krause made a map to illustrate just how big the African continent is. He found that he was able to fit the United States, India and much of Europe inside the outline of the African continent.

Inspired by Krause’s map, James Talmage, and Damon Maneice, two computer developers based out of Detroit, created an interactive graphic that really puts the distortion caused by the Mercator map into perspective. The tool, dubbed “The True Size” allows you to type in the name of any country and move the outline around to see how the scale of the country gets distorted the closer it gets to the poles.

Of course, one thing the map shows well is the sheer size of Africa. Here it is compared with the United States, China and India.


This is a great resource for anyone who wants to learn more about the physical size of countries, but it is also an illustration that no map is “wrong”; some simply display the information you seek better than others.
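The polar distortion described in the quoted post can be put in numbers. On a Mercator map, lengths at latitude φ are stretched by 1/cos(φ), so areas are inflated by 1/cos²(φ). A small Python sketch follows; the function name is hypothetical and the latitudes are illustrative, not measurements of any particular country.

```python
import math


def mercator_area_inflation(latitude_deg):
    """Factor by which Mercator inflates a small area at the given latitude.

    Linear scale is 1/cos(latitude), so area scale is its square.
    """
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


if __name__ == "__main__":
    # Illustrative latitudes from the equator toward the poles.
    for latitude in (0, 30, 60, 75):
        print(latitude, round(mercator_area_inflation(latitude), 2))
```

At the equator nothing is inflated, while at 60 degrees a small region is drawn at about four times its true area. That is why an equatorial continent like Africa looks small next to high-latitude Greenland.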

For another interesting take on world maps, see WorldMapper where you will find gems like:

GDP Wealth


Absolute Poverty


Or you can rank countries by their contributions to science:

Science Research


None of these maps is more “true” than the others.

Which one you choose depends on the cause you want to advance.


August 19th, 2015



From the announcement of Sketchy:

One of the features we wanted to see in Scumblr was the ability to collect screenshots and text content from potentially malicious sites – this allows security analysts to preview Scumblr results without the risk of visiting the site directly. We wanted this collection system to be isolated from Scumblr and also resilient to sites that may perform malicious actions. We also decided it would be nice to build an API that we could use in other applications outside of Scumblr.

Although a variety of tools and frameworks exist for taking screenshots, we discovered a number of edge cases that made taking reliable screenshots difficult – capturing screenshots from AJAX-heavy sites, cut-off images with virtual X drivers, and SSL and compression issues in the PhantomJS driver for Selenium, to name a few. In order to solve these challenges, we decided to leverage the best possible tools and create an API framework that would allow for reliable, scalable, and easy to use screenshot and text scraping capabilities. Sketchy to the rescue!

Sketchy wiki

Docker for Sketchy

An interesting companion to Scumblr, especially since an action from Scumblr may visit an “unsafe” site in your absence.

Ways to monitor malicious sites once they are discovered? Suggestions?


August 19th, 2015



If, like me, you missed the Ashley Madison dump to an .onion site on Tuesday, 18 August 2015, one that is now thought to be authentic, you need a tool like Scumblr!

From the Netflix description of Scumblr:

What is Scumblr?

Scumblr is a web application that allows performing periodic searches and storing / taking actions on the identified results. Scumblr uses the Workflowable gem to allow setting up flexible workflows for different types of results.

How do I use Scumblr?

Scumblr is a web application based on Ruby on Rails. In order to get started, you’ll need to setup / deploy a Scumblr environment and configure it to search for things you care about. You’ll optionally want to setup and configure workflows so that you can track the status of identified results through your triage process.

What can Scumblr look for?

Just about anything! Scumblr searches utilize plugins called Search Providers. Each Search Provider knows how to perform a search via a certain site or API (Google, Bing, eBay, Pastebin, Twitter, etc.). Searches can be configured from within Scumblr based on the options available by the Search Provider. What are some things you might want to look for? How about:

  • Compromised credentials
  • Vulnerability / hacking discussion
  • Attack discussion
  • Security relevant social media discussion

These are just a few examples of things that you may want to keep an eye on!

Scumblr found stuff, now what?

Up to you! You can create simple or complex workflows to be used along with your results. This can be as simple as marking results as “Reviewed” once they’ve been looked at, or much more complex involving multiple steps with automated actions occurring during the process.

Sounds great! How do I get started?

Take a look at the wiki for detailed instructions on setup, configuration, and use!

This looks particularly useful if you are watching for developments and/or discussions in software forums, blogs, etc.

dgol – Distributed Game Of Life

August 19th, 2015

dgol – Distributed Game Of Life by Mirko Bonadei and Gabriele Lana.

From the webpage:

This project is an implementation of the Game of life done by Gabriele Lana and me during the last months.

We took it as a “toy project” to explore all the nontrivial decisions that need to be made when you have to program a distributed system (eg: choose the right supervision strategy, how to make sub-systems communicate each other, how to store data to make it fault tolerant, ecc…).

It is inspired by the Torben Hoffman’s version and on the talk Thinking like an Erlanger.

The project is still under development, at the moment we are doing a huge refactoring of the codebase because we are reorganizing the supervision strategy.

Don’t just nod at the Thinking like an Erlanger link. Part of its description reads:

If you find Erlang is a bit tough, or if testing gives you headaches, this webinar is for you. We will spend most of this intensive session looking at how to design systems with asynchronous message passing between processes that do not share any memory.

Definitely watch the video and follow the progress of this project!

Non-News: Algorithms Are Biased

August 19th, 2015

Programming and prejudice

From the post:

Software may appear to operate without bias because it strictly uses computer code to reach conclusions. That’s why many companies use algorithms to help weed out job applicants when hiring for a new position.

But a team of computer scientists from the University of Utah, University of Arizona and Haverford College in Pennsylvania have discovered a way to find out if an algorithm used for hiring decisions, loan approvals and comparably weighty tasks could be biased like a human being.

The researchers, led by Suresh Venkatasubramanian, an associate professor in the University of Utah’s School of Computing, have discovered a technique to determine if such software programs discriminate unintentionally and violate the legal standards for fair access to employment, housing and other opportunities. The team also has determined a method to fix these potentially troubled algorithms.

Venkatasubramanian presented his findings Aug. 12 at the 21st Association for Computing Machinery’s Conference on Knowledge Discovery and Data Mining in Sydney, Australia.

“There’s a growing industry around doing resume filtering and resume scanning to look for job applicants, so there is definitely interest in this,” says Venkatasubramanian. “If there are structural aspects of the testing process that would discriminate against one community just because of the nature of that community, that is unfair.”

It’s a puff piece and therefore misses that all algorithms are biased, but some algorithms are biased in ways not permitted under current law.

The paper, which this piece avoids citing for some reason, is Certifying and removing disparate impact by Michael Feldman, Sorelle Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian.

The abstract for the paper does a much better job of setting the context for this research:

What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender, religious practice) and an explicit description of the process.

When the process is implemented using computers, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the algorithm, we propose making inferences based on the data the algorithm uses.

We make four contributions to this problem. First, we link the legal notion of disparate impact to a measure of classification accuracy that while known, has received relatively little attention. Second, we propose a test for disparate impact based on analyzing the information leakage of the protected class from the other data attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.
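The abstract’s “widely different outcomes for different groups” can be made concrete with a simple ratio of selection rates. The sketch below is not the paper’s test, which works from information leakage in the data. It only illustrates the outcome comparison, using the “four-fifths” rule of thumb from U.S. employment guidelines as an assumed threshold. The function names and the applicant counts are invented.

```python
from collections import Counter


def selection_rates(records):
    """records: iterable of (group, selected) pairs -> {group: selection rate}."""
    totals, chosen = Counter(), Counter()
    for group, selected in records:
        totals[group] += 1
        if selected:
            chosen[group] += 1
    return {group: chosen[group] / totals[group] for group in totals}


def disparate_impact_ratio(records, protected, reference):
    """Selection rate of the protected group divided by the reference group's."""
    rates = selection_rates(records)
    return rates[protected] / rates[reference]


# Invented data: 10 applicants per group, 6 selected from A and 3 from B.
APPLICANTS = ([("A", True)] * 6 + [("A", False)] * 4 +
              [("B", True)] * 3 + [("B", False)] * 7)

if __name__ == "__main__":
    ratio = disparate_impact_ratio(APPLICANTS, protected="B", reference="A")
    # Assumed threshold: a ratio below 0.8 is commonly treated as a warning sign.
    print(ratio, "flagged" if ratio < 0.8 else "not flagged")
```

Note that this check needs only the outcomes, not the algorithm that produced them, which is in the spirit of the paper’s proposal to reason from data when the process itself cannot be disclosed.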

If you are a bank, you want a loan algorithm to be biased against people with a poor history of paying their debts. The distinction is that a poor payment history is a legitimate basis for discrimination among loan applicants.

The lesson here is that all algorithms are biased; the question is whether the bias is in your favor or not.

Suggestion: Only bet when using your own dice (algorithm).