Seldon wants to make life easier for data scientists, with a new open-source platform by Martin Bryant.
From the post:
It feels that these days we live our whole digital lives according to mysterious algorithms that predict what we’ll want from apps and websites. A new open-source product could help those building the products we use worry less about writing those algorithms in the first place.
As increasing numbers of companies hire in-house data science teams, there’s a growing need for tools they can work with so they don’t need to build new software from scratch. That’s the gambit behind the launch of Seldon, a new open-source predictions API launching early in the new year.
Seldon is designed to make it easy to plug in the algorithms needed for predictions that can recommend content to customers, offer app personalization features and the like. Aimed primarily at media and e-commerce companies, it will be available both as a free-to-use self-hosted product and a fully hosted, cloud-based version.
If you think Inadvertent Algorithmic Cruelty is a problem, just wait until people who don’t understand the data or the algorithms start using them in prepackaged form.
Packaged predictive analytics are about as safe as arming school crossing guards with .600 Nitro Express rifles to ward off speeders. As attractive as the second suggestion sounds, there would be numerous safety concerns.
Different but no less pressing safety concerns abound with packaged predictive analytics. If enterprises are disconnected from the actual algorithms, can they claim immunity for discrimination based on race, gender or sexual orientation? It is hard to prove “intent” when the answers in question were generated in complete ignorance of the algorithmic choices that drove the results.
At least Seldon is open source, so the algorithms can be examined, should you be interested in how results are calculated. But open-source algorithms are only one aspect of the problem. What of the data? Blind application of algorithms, even neutral ones, can lead to any number of results. If you let me supply the data, I can guarantee the results from any known algorithm. “Untouched by human hands” as they say.
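To make the point about data concrete, here is a minimal sketch in Python. It has nothing to do with Seldon's actual code or API; the `recommend` and `rig_data` functions and the sample purchases are hypothetical, invented only for illustration. The "algorithm" is a neutral most-popular-item rule, and whoever supplies the data decides what it recommends.

```python
from collections import Counter


def recommend(purchases):
    """Neutral rule (hypothetical): recommend the most frequently purchased item."""
    counts = Counter(purchases)
    return counts.most_common(1)[0][0]


def rig_data(purchases, target):
    """Pad the data with just enough copies of `target` to make it the top item."""
    counts = Counter(purchases)
    top = max(counts.values(), default=0)
    needed = top - counts[target] + 1  # copies required to strictly lead
    return list(purchases) + [target] * max(needed, 0)


# Made-up sample data, for illustration only.
honest = ["book", "book", "book", "lamp", "mug"]

print(recommend(honest))                   # the algorithm on the original data
print(recommend(rig_data(honest, "mug")))  # same algorithm, data chosen by me
```

The rule never changes between the two calls. Only the data does, and auditing the open-source algorithm alone would never reveal that.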
When you are given recommendations based on predictive analytics, do you ask for the data and/or algorithms? Who in your enterprise can do due diligence to verify the results? Who is on the line for bad decisions based on poor predictive analytics?
I first saw this in a tweet by Gregory Piatetsky.