Archive for the ‘Machine Learning’ Category

If You Don’t Think “Working For The Man” Is All That Weird

Saturday, June 17th, 2017

J.P.Morgan’s massive guide to machine learning and big data jobs in finance by Sara Butcher.

From the post:

Financial services jobs go in and out of fashion. In 2001 equity research for internet companies was all the rage. In 2006, structuring collateralised debt obligations (CDOs) was the thing. In 2010, credit traders were popular. In 2014, compliance professionals were it. In 2017, it’s all about machine learning and big data. If you can get in here, your future in finance will be assured.

J.P. Morgan’s quantitative investing and derivatives strategy team, led Marko Kolanovic and Rajesh T. Krishnamachari, has just issued the most comprehensive report ever on big data and machine learning in financial services.

Titled, ‘Big Data and AI Strategies’ and subheaded, ‘Machine Learning and Alternative Data Approach to Investing’, the report says that machine learning will become crucial to the future functioning of markets. Analysts, portfolio managers, traders and chief investment officers all need to become familiar with machine learning techniques. If they don’t they’ll be left behind: traditional data sources like quarterly earnings and GDP figures will become increasingly irrelevant as managers using newer datasets and methods will be able to predict them in advance and to trade ahead of their release.

At 280 pages, the report is too long to cover in detail, but we’ve pulled out the most salient points for you below.

How important is Sarah’s post and the report by J.P. Morgan?

Let put it this way: Sarah’s post is the first business type post I have saved as a complete webpage so I can clean it up and print without all the clutter. This year. Perhaps last year as well. It’s that important.

Sarah’s post is a quick guide to the languages, talents and tools you will need to start “working for the man.”

It that catches your interest, then Sarah’s post is pure gold.


PS: I’m still working on a link for the full 280 page report. The switchboard is down for the weekend so I will be following up with J.P. Morgan on Monday next.

Deep Learning – Dodging The NSA

Monday, May 29th, 2017

The $1700 great Deep Learning box: Assembly, setup and benchmarks by Slav Ivanov.

Ivanov’s motivation for local deep learning hardware came from monthly AWS bills.

You may suffer from those or be training on data sets you’d rather not share with the NSA.

For whatever reason, follow these detailed descriptions to build your own deep learning box.

Caution: If more than a month or more has lapsed from this post and your starting to build a system, check all the update links. Hardware and prices change rapidly.

Virtual Jihadists (Bots)

Saturday, March 4th, 2017

Chip Huyen, who teaches CS 20SI: “TensorFlow for Deep Learning Research” @Standford, has posted code examples for the class, along with a chatbot, developed for one of the assignments.

The readme for the chatbot reads in part:

A neural chatbot using sequence to sequence model with attentional decoder. This is a fully functional chatbot.

This is based on Google Translate Tensorflow model

Sequence to sequence model by Cho et al.(2014)

Created by Chip Huyen as the starter code for assignment 3, class CS 20SI: “TensorFlow for Deep Learning Research”

The detailed assignment handout and information on training time can be found at

Dialogue is lacking but this chatbot could be trained to appear to government forces as a live “jihadist” following and conversing with other “jihadists.” Who may themselves be chatbots.

Unlike the expense of pilots for a fleet of drones, a single user could “pilot” a group of chatbots, creating an over-sized impression in cyberspace. The deeper the modeling of human jihadists, the harder it will be to distinguish virtual jihadists.

I say “jihadists” for headline effect. You could create interacting chatbots for right/left wing hate groups, gun owners, churches, etc., in short, anyone seeking to dilute surveillance.

(Unlike the ACLU or EFF, I don’t concede there are any legitimate reasons for government surveillance. The dangers of government surveillance far exceed any possible crime it could prevent. Government surveillance is the question. The answer is NO.)

CS 20SI: Tensorflow for Deep Learning Research

From the webpage:

Tensorflow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. Tensorflow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the Tensorflow library for deep learning research. We aim to help students understand the graphical computational model of Tensorflow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use Tensorflow to build models of different complexity, from simple linear/logistic regression to convolutional neural network and recurrent neural networks with LSTM to solve tasks such as word embeddings, translation, optical character recognition. Students will also learn best practices to structure a model and manage research experiments.


Meet Fenton (my data crunching machine)

Saturday, February 25th, 2017

Meet Fenton (my data crunching machine) by Alex Staravoitau.

From the post:

As you might be aware, I have been experimenting with AWS as a remote GPU-enabled machine for a while, configuring Jupyter Notebook to use it as a backend. It seemed to work fine, although costs did build over time, and I had to always keep in mind to shut it off, alongside with a couple of other limitations. Long story short, around 3 months ago I decided to build my own machine learning rig.

My idea in a nutshell was to build a machine that would only act as a server, being accessible from anywhere to me, always ready to unleash its computational powers on whichever task I’d be working on. Although this setup did take some time to assess, assemble and configure, it has been working flawlessly ever since, and I am very happy with it.

This is the most crucial part. After serious consideration and leveraging the budget I decided to invest into EVGA GeForce GTX 1080 8GB card backed by Nvidia GTX 1080 GPU. It is really snappy (and expensive), and in this particular case it only takes 15 minutes to run — 3 times faster than a g2.2xlarge AWS machine! If you still feel hesitant, think of it this way: the faster your model runs, the more experiments you can carry out over the same period of time.
… (emphasis in original)

Total for this GPU rig? £1562.26

You now know the fate of your next big advance. 😉

If you are interested in comparing the performance of a Beowulf cluster, see: A Homemade Beowulf Cluster: Part 1, Hardware Assembly and A Homemade Beowulf Cluster: Part 2, Machine Configuration.

Either way, you are going to have enough processing power that your skill and not hardware limits are going to be the limiting factor.

Aerial Informatics and Robotics Platform [simulator]

Thursday, February 16th, 2017

Aerial Informatics and Robotics Platform (Microsoft)

From the webpage:

Machine learning is becoming an increasingly important artificial intelligence approach to building autonomous and robotic systems. One of the key challenges with machine learning is the need for many samples — the amount of data needed to learn useful behaviors is prohibitively high. In addition, the robotic system is often non-operational during the training phase. This requires debugging to occur in real-world experiments with an unpredictable robot.

The Aerial Informatics and Robotics platform solves for these two problems: the large data needs for training, and the ability to debug in a simulator. It will provide realistic simulation tools for designers and developers to seamlessly generate the copious amounts of training data they need. In addition, the platform leverages recent advances in physics and perception computation to create accurate, real-world simulations. Together, this realism, based on efficiently generated ground truth data, enables the study and execution of complex missions that might be time-consuming and/or risky in the real-world. For example, collisions in a simulator cost virtually nothing, yet provide actionable information for improving the design.

Open source simulator from Microsoft for drones.

How very cool!

Imagine training your drone to search for breaches of the Dakota Access pipeline.

Or how to react when it encounters hostile drones.


DeepBach: a Steerable Model for Bach chorales generation

Wednesday, December 14th, 2016

DeepBach: a Steerable Model for Bach chorales generation by Gaëtan Hadjeres and François Pachet.


The composition of polyphonic chorale music in the style of J.S Bach has represented a major challenge in automatic music composition over the last decades. The art of Bach chorales composition involves combining four-part harmony with characteristic rhythmic patterns and typical melodic movements to produce musical phrases which begin, evolve and end (cadences) in a harmonious way. To our knowledge, no model so far was able to solve all these problems simultaneously using an agnostic machine-learning approach. This paper introduces DeepBach, a statistical model aimed at modeling polyphonic music and specifically four parts, hymn-like pieces. We claim that, after being trained on the chorale harmonizations by Johann Sebastian Bach, our model is capable of generating highly convincing chorales in the style of Bach. We evaluate how indistinguishable our generated chorales are from existing Bach chorales with a listening test. The results corroborate our claim. A key strength of DeepBach is that it is agnostic and flexible. Users can constrain the generation by imposing some notes, rhythms or cadences in the generated score. This allows users to reharmonize user-defined melodies. DeepBach’s generation is fast, making it usable for interactive music composition applications. Several generation examples are provided and discussed from a musical point of view.

Take this with you on January 20, 2017 in case you tire of playing #DisruptJ20 Twitter Game (guessing XQuery/XPath definitions). Unlikely I know but anything can happen.

Deeply impressive work.

You can hear samples at:

Download the code:

Makes me curious about the composition of “like” works for composers who left smaller corpora.

Comparing Symbolic Deep Learning Frameworks

Thursday, December 8th, 2016

Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks by Anusua Trivedi.

From the post:

This blog series is based on my upcoming talk on re-usability of Deep Learning Models at the Hadoop+Strata World Conference in Singapore. This blog series will be in several parts – where I describe my experiences and go deep into the reasons behind my choices.

Deep learning is an emerging field of research, which has its application across multiple domains. I try to show how transfer learning and fine tuning strategy leads to re-usability of the same Convolution Neural Network model in different disjoint domains. Application of this model across various different domains brings value to using this fine-tuned model.

In this blog (Part1), I describe and compare the commonly used open-source deep learning frameworks. I dive deep into different pros and cons for each framework, and discuss why I chose Theano for my work.

Your mileage may vary but a great starting place!

Four Experiments in Handwriting with a Neural Network

Tuesday, December 6th, 2016

Four Experiments in Handwriting with a Neural Network by Shan Carter, David Ha, Ian Johnson, and Chris Olah.

While the handwriting experiments are compelling and entertaining, the author’s have a more profound goal for this activity:

The black box reputation of machine learning models is well deserved, but we believe part of that reputation has been born from the programming context into which they have been locked into. The experience of having an easily inspectable model available in the same programming context as the interactive visualization environment (here, javascript) proved to be very productive for prototyping and exploring new ideas for this post.

As we are able to move them more and more into the same programming context that user interface work is done, we believe we will see richer modes of human-ai interactions flourish. This could have a marked impact on debugging and building models, for sure, but also in how the models are used. Machine learning research typically seeks to mimic and substitute humans, and increasingly it’s able to. What seems less explored is using machine learning to augment humans. This sort of complicated human-machine interaction is best explored when the full capabilities of the model are available in the user interface context.

Setting up a search alert for future work from these authors!

Andrew Ng – Machine Learning – Lecture Notes

Tuesday, November 1st, 2016

CS 229 Machine Learning Course Materials.

If your hand writing is as bad as mine, lecture notes are a great read-along with the video lectures or to use for review.

As you might expect, these notes are of exceptional quality.


Boosting (in Machine Learning) as a Metaphor for Diverse Teams [A Quibble]

Sunday, October 23rd, 2016

Boosting (in Machine Learning) as a Metaphor for Diverse Teams by Renee Teate.

Renee’s summary:

tl;dr: Boosting ensemble algorithms in Machine Learning use an approach that is similar to assembling a diverse team with a variety of strengths and experiences. If machines make better decisions by combining a bunch of “less qualified opinions” vs “asking one expert”, then maybe people would, too.

Very much worth your while to read at length but to setup my quibble:

What a Random Forest does is build up a whole bunch of “dumb” decision trees by only analyzing a subset of the data at a time. A limited set of features (columns) from a portion of the overall records (rows) is used to generate each decision tree, and the “depth” of the tree (and/or size of the “leaves”, the number of examples that fall into each final bin) is limited as well. So the trees in the model are “trained” with only a portion of the available data and therefore don’t individually generate very accurate classifications.

However, it turns out that when you combine the results of a bunch of these “dumb” trees (also known as “weak learners”), the combined result is usually even better than the most finely-tuned single full decision tree. (So you can see how the algorithm got its name – a whole bunch of small trees, somewhat randomly generated, but used in combination is a random forest!)

All true but “weak learners” in machine learning are easily reconfigured, combined with different groups of other “weak learners,” or even discarded.

None of which is true for people who are hired to be part of a diverse team.

I don’t mean to discount Renee’s metaphor because I think it has much to recommend it, but diverse “weak learners” make poor decisions too.

Don’t take my word for it, watch the 2016 congressional election results.

Be sure to follow Renee on @BecomingDataSci. I’m interested to see how she develops this metaphor and where it leads.


Python and Machine Learning in Astronomy (Rejuvenate Your Emotional Health)

Saturday, October 22nd, 2016

Python and Machine Learning in Astronomy (Episode #81) (Jack VanderPlas)

From the webpage:

The advances in Astronomy over the past century are both evidence of and confirmation of the highest heights of human ingenuity. We have learned by studying the frequency of light that the universe is expanding. By observing the orbit of Mercury that Einstein’s theory of general relativity is correct.

It probably won’t surprise you to learn that Python and data science play a central role in modern day Astronomy. This week you’ll meet Jake VanderPlas, an astrophysicist and data scientist from University of Washington. Join Jake and me while we discuss the state of Python in Astronomy.

Links from the show:

Jake on Twitter: @jakevdp

Jake on the web:

Python Data Science Handbook:

Python Data Science Handbook on GitHub:

Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data:

PyData Talk:

eScience Institue: @UWeScience

Large Synoptic Survey Telescope:

AstroML: Machine Learning and Data Mining for Astronomy:

Astropy project:

altair package:

If you social media feeds have been getting you down, rejoice! This interview with Jake VanderPlas covers Python, machine learning and astronomy.

Nary a mention of current social dysfunction around the globe!

Replace an hour of TV this weekend with this podcast. (Or more hours with others.)

Not only will you have more knowledge, you will be in much better emotional shape to face the coming week!

Deep-Fried Data […money laundering for bias…]

Tuesday, October 4th, 2016

Deep-Fried Data by Maciej Ceglowski. (paper) (video of same presentation) Part of Collections as Data event at the Library of Congress.

If the “…money laundering for bias…” quote doesn’t capture your attention, try:

I find it helpful to think of algorithms as a dim-witted but extremely industrious graduate student, whom you don’t fully trust. You want a concordance made? An index? You want them to go through ten million photos and find every picture of a horse? Perfect.

You want them to draw conclusions on gender based on word use patterns? Or infer social relationships from census data? Now you need some adult supervision in the room.

Besides these issues of bias, there’s also an opportunity cost in committing to computational tools. What irks me about the love affair with algorithms is that they remove a lot of the potential for surprise and serendipity that you get by working with people.

If you go searching for patterns in the data, you’ll find patterns in the data. Whoop-de-doo. But anything fresh and distinctive in your digital collections will not make it through the deep frier.

We’ve seen entire fields disappear down the numerical rabbit hole before. Economics came first, sociology and political science are still trying to get out, bioinformatics is down there somewhere and hasn’t been heard from in a while.

A great read and equally enjoyable presentation.


Introducing the Open Images Dataset

Friday, September 30th, 2016

Introducing the Open Images Dataset by Ivan Krasin and Tom Duerig.

From the post:

In the last few years, advances in machine learning have enabled Computer Vision to progress rapidly, allowing for systems that can automatically caption images to apps that can create natural language replies in response to shared photos. Much of this progress can be attributed to publicly available image datasets, such as ImageNet and COCO for supervised learning, and YFCC100M for unsupervised learning.

Today, we introduce Open Images, a dataset consisting of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. We tried to make the dataset as practical as possible: the labels cover more real-life entities than the 1000 ImageNet classes, there are enough images to train a deep neural network from scratch and the images are listed as having a Creative Commons Attribution license*.

The image-level annotations have been populated automatically with a vision model similar to Google Cloud Vision API. For the validation set, we had human raters verify these automated labels to find and remove false positives. On average, each image has about 8 labels assigned. Here are some examples:

Impressive data set, if you want to recognize a muffin, gherkin, pebble, etc., see the full list at dict.csv.

Hopeful the techniques you develop with these images will lead to more focused image recognition. 😉

I lightly searched the list and no “non-safe” terms jumped out at me. Suitable for family image training.

Reinforcement Learning: An Introduction

Tuesday, September 27th, 2016

Reinforcement Learning: An Introduction, Second edition by Richard S. Sutton and Andrew G. Barto.

From Chapter 1:

The idea that we learn by interacting with our environment is probably the first to occur to us when we think about the nature of learning. When an infant plays, waves its arms, or looks about, it has no explicit teacher, but it does have a direct sensorimotor connection to its environment. Exercising this connection produces a wealth of information about cause and effect, about the consequences of actions, and about what to do in order to achieve goals. Throughout our lives, such interactions are undoubtedly a major source of knowledge about our environment and ourselves. Whether we are learning to drive a car or to hold a conversation, we are acutely aware of how our environment responds to what we do, and we seek to influence what happens through our behavior. Learning from interaction is a foundational idea underlying nearly all theories of learning and intelligence.

In this book we explore a computational approach to learning from interaction. Rather than directly theorizing about how people or animals learn, we explore idealized learning situations and evaluate the effectiveness of various learning methods. That is, we adopt the perspective of an artificial intelligence researcher or engineer. We explore designs for machines that are effective in solving learning problems of scientific or economic interest, evaluating the designs through mathematical analysis or computational experiments. The approach we explore, called reinforcement learning, is much more focused on goal-directed learning from interaction than are other approaches to machine learning.

When this draft was first posted, it was so popular a download that the account was briefly suspended.

Consider that as an indication of importance.



Text To Image Synthesis Using Thought Vectors

Sunday, August 28th, 2016

Text To Image Synthesis Using Thought Vectors by Paarth Neekhara.


This is an experimental tensorflow implementation of synthesizing images from captions using Skip Thought Vectors. The images are synthesized using the GAN-CLS Algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow. The following is the model architecture. The blue bars represent the Skip Thought Vectors for the captions.

OK, that didn’t grab my attention, but this did:


Full size image.

Not quite “Tea, Earl Grey, Hot,” but a step in that direction!

“Why Should I Trust You?”…

Tuesday, August 23rd, 2016

“Why Should I Trust You?”: Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin.


Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one.

In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

LIME software at Github.

For a quick overview consider: Introduction to Local Interpretable Model-Agnostic Explanations (LIME) (blog post).

Or what originally sent me in this direction: Trusting Machine Learning Models with LIME at Data Skeptic, a podcast described as:

Machine learning models are often criticized for being black boxes. If a human cannot determine why the model arrives at the decision it made, there’s good cause for skepticism. Classic inspection approaches to model interpretability are only useful for simple models, which are likely to only cover simple problems.

The LIME project seeks to help us trust machine learning models. At a high level, it takes advantage of local fidelity. For a given example, a separate model trained on neighbors of the example are likely to reveal the relevant features in the local input space to reveal details about why the model arrives at it’s conclusion.

Data Science Renee finds deeply interesting material such as this on a regular basis and should follow her account on Twitter.

I do have one caveat on a quick read of these materials. The authors say in the paper, under 4. Submodular Pick For Explaining Models:

Even though explanations of multiple instances can be insightful, these instances need to be selected judiciously, since users may not have the time to examine a large number of explanations. We represent the time/patience that humans have by a budget B that denotes the number of explanations they are willing to look at in order to understand a model. Given a set of instances X, we define the pick step as the task of selecting B instances for the user to inspect.

The pick step is not dependent on the existence of explanations – one of the main purpose of tools like Modeltracker [1] and others [11] is to assist users in selecting instances themselves, and examining the raw data and predictions. However, since looking at raw data is not enough to understand predictions and get insights, the pick step should take into account the explanations that accompany each prediction. Moreover, this method should pick a diverse, representative set of explanations to show the user – i.e. non-redundant explanations that represent how the model behaves globally.

The “judicious” selection of instances, in models of any degree of sophistication, based upon large data sets seems problematic.

The focus on the “non-redundant coverage intuition” is interesting but based on the assumption that changes in factors don’t lead to “redundant explanations.” In the cases presented that’s true, but I lack confidence that will be true in every case.

Still, a very important area of research and an effort that is worth tracking.

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

Friday, August 19th, 2016

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning? by Michael Copeland.

From the post:

Artificial intelligence is the future. Artificial intelligence is science fiction. Artificial intelligence is already part of our everyday lives. All those statements are true, it just depends on what flavor of AI you are referring to.

For example, when Google DeepMind’s AlphaGo program defeated South Korean Master Lee Se-dol in the board game Go earlier this year, the terms AI, machine learning, and deep learning were used in the media to describe how DeepMind won. And all three are part of the reason why AlphaGo trounced Lee Se-Dol. But they are not the same things.

The easiest way to think of their relationship is to visualize them as concentric circles with AI — the idea that came first — the largest, then machine learning — which blossomed later, and finally deep learning — which is driving today’s AI explosion — fitting inside both.

If you are confused by the mix of artificial intelligence, machine learning, and deep learning, floating around, Copeland will set you straight.

It’s a fun read and one you can recommend to non-technical friends.

Re-Use, Re-Use! Using Weka within Lisp

Friday, August 19th, 2016

Suggesting code re-use, as described by Paul Homer in The Myth of Code Reuse, provokes this reaction from most programmers (substitute re-use for refund):


Atabey Kaygun demonstrates he isn’t one of those programmers in Using Weka within Lisp:

From the post:

As much as I like implementing machine learning algorithms from scratch within various languages I like using, in doing serious research one should not take the risk of writing error-prone code. Most likely somebody already spent many thousand hours writing, debugging and optimizing code you can use with some effort. Re-use people, re-use!

In any case, today I am going to describe how one can use weka libraries within ABCL implementation of common lisp. Specifically, I am going to use the k-means implementation of weka.

As usual, well written and useful guide to using Weka and Lisp.

The issues of code re-use aren’t confined to programmers.

Any stats you can suggest on re-use of database or XML schemas?

Weka MOOCs – Self-Paced Courses

Friday, July 8th, 2016

All three Weka MOOCs available as self-paced courses

From the post:

All three MOOCs (“Data Mining with Weka”, “More Data Mining with Weka” and “Advanced Data Mining with Weka”) are now available on a self-paced basis. All the material, activities and assessments are available from now until 24th September 2016 at:

The Weka software and MOOCs are great introductions to machine learning!

…possibly biased? Try always biased.

Friday, June 24th, 2016

Artificial Intelligence Has a ‘Sea of Dudes’ Problem by Jack Clark.

From the post:

Much has been made of the tech industry’s lack of women engineers and executives. But there’s a unique problem with homogeneity in AI. To teach computers about the world, researchers have to gather massive data sets of almost everything. To learn to identify flowers, you need to feed a computer tens of thousands of photos of flowers so that when it sees a photograph of a daffodil in poor light, it can draw on its experience and work out what it’s seeing.

If these data sets aren’t sufficiently broad, then companies can create AIs with biases. Speech recognition software with a data set that only contains people speaking in proper, stilted British English will have a hard time understanding the slang and diction of someone from an inner city in America. If everyone teaching computers to act like humans are men, then the machines will have a view of the world that’s narrow by default and, through the curation of data sets, possibly biased.

“I call it a sea of dudes,” said Margaret Mitchell, a researcher at Microsoft. Mitchell works on computer vision and language problems, and is a founding member—and only female researcher—of Microsoft’s “cognition” group. She estimates she’s worked with around 10 or so women over the past five years, and hundreds of men. “I do absolutely believe that gender has an effect on the types of questions that we ask,” she said. “You’re putting yourself in a position of myopia.”

Margaret Mitchell makes a pragmatic case for diversity int the workplace, at least if you want to avoid male biased AI.

Not that a diverse workplace results in an “unbiased” AI, it will be a biased AI that isn’t solely male biased.

It isn’t possible to escape bias because some person or persons has to score “correct” answers for an AI. The scoring process imparts to the AI being trained, the biases of its judge of correctness.

Unless someone wants to contend there are potential human judges without biases, I don’t see a way around imparting biases to AIs.

By being sensitive to evidence of biases, we can in some cases choose the biases we want an AI to possess, but an AI possessing no biases at all, isn’t possible.

AIs are, after all, our creations so it is only fair that they be made in our image, biases and all.

Bots, Won’t You Hide Me?

Thursday, June 23rd, 2016

Emerging Trends in Social Network Analysis of Terrorism and Counterterrorism, How Police Are Scanning All Of Twitter To Detect Terrorist Threats, Violent Extremism in the Digital Age: How to Detect and Meet the Threat, Online Surveillance: …ISIS and beyond [Social Media “chaff”] are just a small sampling of posts on the detection of “terrorists” on social media.

The last one is my post illustrating how “terrorist” at one time = “anti-Vietnam war,” “civil rights,” and “gay rights.” Due to the public nature of social media, avoiding government surveillance isn’t possible.

I stole the title, Bots, Won’t You Hide Me? from Ben Bova’s short story, Stars, Won’t You Hide Me?. It’s not very long and if you like science fiction, you will enjoy it.

Bova took verses in the short story from Sinner Man, a traditional African spiritual, which was recorded by a number of artists.

All of that is a very round about way to introduce you to a new Twitter account: ConvJournalism:

All you need to know about Conversational Journalism, (journalistic) bots and #convcomm by @martinhoffmann.

Surveillance of groups on social media isn’t going to succeed, The White House Asked Social Media Companies to Look for Terrorists. Here’s Why They’d #Fail by Jenna McLaughlin bots can play an important role in assisting in that failure.

Imagine not only having bots that realistically mimic the chatter of actual human users but who follow, unfollow, etc., and engage in apparent conspiracies, with other bots. Entirely without human direction or very little.

Follow ConvJournalism and promote bot research/development that helps all of us hide. (I’d rather have the bots say yes than Satan.)

Machine Learning Yearning [New Book – Free Draft – Signup By Friday June 24th (2016)

Monday, June 20th, 2016

Machine Learning Yearning by Andrew Ng.

About Andrew Ng:

Andrew Ng is Associate Professor of Computer Science at Stanford; Chief Scientist of Baidu; and Chairman and Co-founder of Coursera.

In 2011 he led the development of Stanford University’s main MOOC (Massive Open Online Courses) platform and also taught an online Machine Learning class to over 100,000 students, leading to the founding of Coursera. Ng’s goal is to give everyone in the world access to a great education, for free. Today, Coursera partners with some of the top universities in the world to offer high quality online courses, and is the largest MOOC platform in the world.

Ng also works on machine learning with an emphasis on deep learning. He founded and led the “Google Brain” project which developed massive-scale deep learning algorithms. This resulted in the famous “Google cat” result, in which a massive neural network with 1 billion parameters learned from unlabeled YouTube videos to detect cats. More recently, he continues to work on deep learning and its applications to computer vision and speech, including such applications as autonomous driving.

Haven’t you signed up yet?

OK, What You Will Learn:

The goal of this book is to teach you how to make the numerous decisions needed with organizing a machine learning project. You will learn:

  • How to establish your dev and test sets
  • Basic error analysis
  • How you can use Bias and Variance to decide what to do
  • Learning curves
  • Comparing learning algorithms to human-level performance
  • Debugging inference algorithms
  • When you should and should not use end-to-end deep learning
  • Error analysis by parts

Free drafts of a new book on machine learning projects, not just machine learning, by one of the leading world experts on machine learning.

Now are you signed up?

If you are interested in machine learning, following Andrew Ng on Twitter isn’t a bad place to start.

Be aware, however, that even machine learning experts can be mistaken. For example, Andrew tweeted, favorably, How to make a good teacher from the Economist.

Instilling these techniques is easier said than done. With teaching as with other complex skills, the route to mastery is not abstruse theory but intense, guided practice grounded in subject-matter knowledge and pedagogical methods. Trainees should spend more time in the classroom. The places where pupils do best, for example Finland, Singapore and Shanghai, put novice teachers through a demanding apprenticeship. In America high-performing charter schools teach trainees in the classroom and bring them on with coaching and feedback.

Teacher-training institutions need to be more rigorous—rather as a century ago medical schools raised the calibre of doctors by introducing systematic curriculums and providing clinical experience. It is essential that teacher-training colleges start to collect and publish data on how their graduates perform in the classroom. Courses that produce teachers who go on to do little or nothing to improve their pupils’ learning should not receive subsidies or see their graduates become teachers. They would then have to improve to survive.

The author conflates “demanding apprenticeship” with “teacher-training colleges start to collect and publish data on how their graduates perform in the classroom,” as though whatever data we collect has some meaningful relationship with teaching and/or the training of teachers.

A “demanding apprenticeship” no doubt weeds out people who are not well suited to be teachers, there is no evidence that it can make a teacher out of someone who isn’t suited for the task.

The collection of data is one of the ongoing fallacies about American education. Simply because you can collect data is no indication that it is useful and/or has any relationship to what you are attempting to measure.

Follow Andrew for his work on machine learning, not so much for his opinions on education.

Are Non-AI Decisions “Open to Inspection?”

Thursday, June 16th, 2016

Ethics in designing AI Algorithms — part 1 by Michael Greenwood.

From the post:

As our civilization becomes more and more reliant upon computers and other intelligent devices, there arises specific moral issue that designers and programmers will inevitably be forced to address. Among these concerns is trust. Can we trust that the AI we create will do what it was designed to without any bias? There’s also the issue of incorruptibility. Can the AI be fooled into doing something unethical? Can it be programmed to commit illegal or immoral acts? Transparency comes to mind as well. Will the motives of the programmer or the AI be clear? Or will there be ambiguity in the interactions between humans and AI? The list of questions could go on and on.

Imagine if the government uses a machine-learning algorithm to recommend applications for student loan approvals. A rejected student and or parent could file a lawsuit alleging that the algorithm was designed with racial bias against some student applicants. The defense could be that this couldn’t be possible since it was intentionally designed so that it wouldn’t have knowledge of the race of the person applying for the student loan. This could be the reason for making a system like this in the first place — to assure that ethnicity will not be a factor as it could be with a human approving the applications. But suppose some racial profiling was proven in this case.

If directed evolution produced the AI algorithm, then it may be impossible to understand why, or even how. Maybe the AI algorithm uses the physical address data of candidates as one of the criteria in making decisions. Maybe they were born in or at some time lived in poverty‐stricken regions, and that in fact, a majority of those applicants who fit these criteria happened to be minorities. We wouldn’t be able to find out any of this if we didn’t have some way to audit the systems we are designing. It will become critical for us to design AI algorithms that are not just robust and scalable, but also easily open to inspection.

While I can appreciate the desire to make AI algorithms that are “…easily open to inspection…,” I feel compelled to point out that human decision making has resisted such openness for thousands of years.

There are the tales we tell each other about “rational” decision making but those aren’t how decisions are made, rather they are how we justify decisions made to ourselves and others. Not exactly the same thing.

Recall the parole granting behavior of israeli judges that depended upon the proximity to their last meal. Certainly all of those judges would argue for their “rational” decisions but meal time was a better predictor than any other. (Extraneous factors in judicial decisions)

My point being that if we struggle to even articulate the actual basis for non-AI decisions, where is our model for making AI decisions “open to inspection?” What would that look like?

You could say, for example, no discrimination based on race. OK, but that’s not going to work if you want to purposely setup scholarships for minority students.

When you object, “…that’s not what I meant! You know what I mean!…,” well, I might, but try convincing an AI that has no social context of what you “meant.”

The openness of AI decisions to inspection is an important issue but the human record in that regard isn’t encouraging.

AI Cultist On Justice System Reform

Wednesday, June 8th, 2016

White House Challenges Artificial Intelligence Experts to Reduce Incarceration Rates by Jason Shueh.

From the post:

The U.S. spends $270 billion on incarceration each year, has a prison population of about 2.2 million and an incarceration rate that’s spiked 220 percent since the 1980s. But with the advent of data science, White House officials are asking experts for help.

On Tuesday, June 7, the White House Office of Science and Technology Policy’s Lynn Overmann, who also leads the White House Police Data Initiative, stressed the severity of the nation’s incarceration crisis while asking a crowd of data scientists and artificial intelligence specialists for aid.

“We have built a system that is too large, and too unfair and too costly — in every sense of the word — and we need to start to change it,” Obermann said, speaking at a Computing Community Consortium public workshop.

She argued that the U.S., a country that has the highest amount incarcerated citizens in the world, is in need of systematic reforms with both data tools to process alleged offenders and at the policy level to ensure fair and measured sentences. As a longtime counselor, advisor and analyst for the Justice Department and at the city and state levels, Overman said she has studied and witnessed an alarming number of issues in terms of bias and unwarranted punishments.

For instance, she said that statistically, while drug use is about equal between African Americans and Caucasians, African Americans are more likely to be arrested and convicted. They also receive longer prison sentences compared to Caucasian inmates convicted of the same crimes.

Other problems, Oberman said, are due to inflated punishments that far exceed the severity of crimes. She recalled her years spent as an assistant public defender for Florida’s Miami-Dade County Public Defender’s Office as an example.

“I represented a client who was looking at spending 40 years of his life in prison because he stole a lawnmower and a weedeater from a shed in a backyard,” Obermann said, “I had another person who had AIDS and was offered a 15-year sentence for stealing mangos.”

Data and digital tools can help curb such pitfalls by increasing efficiency, transparency and accountability, she said.
… (emphasis added)

Spotting a cultist tip: Before specifying criteria for success or even understanding a problem, a cultist announces the approach that will succeed.

Calls like this one are a disservice to legitimate artificial intelligence research, to say nothing of experts in criminal justice (unlike Lynn Overmann), who have struggled for decades to improve the criminal justice system.

Yes, Overmann has experience in the criminal justice system, both in legal practice and at a policy level, but that makes her no more of an expert on criminal justice reform than having multiple flat tires makes me an expert on tire design.

Data is not, has not been, nor will it ever be a magic elixir that solves undefined problems posed to it.

White House sponsored AI cheer leading is a disservice to AI practitioners, experts in the field of criminal justice reform and more importantly, to those impacted by the criminal justice system.

Substitute meaningful problem definitions for the AI pom-poms if this is to be more than resume padding and currying favor with contractors project.

The anatomy of online deception:… [ Statistics Can Be Deceiving – Even In Academic Papers]

Wednesday, June 8th, 2016

The anatomy of online deception: what makes automated text convincing? by Richard M. Everett, Jason R. C. Nurse, Arnau Erola.


Technology is rapidly evolving, and with it comes increasingly sophisticated bots (i.e. software robots) which automatically produce content to inform, influence, and deceive genuine users. This is particularly a problem for social media networks where content tends to be extremely short, informally written, and full of inconsistencies. Motivated by the rise of bots on these networks, we investigate the ease with which a bot can deceive a human. In particular, we focus on deceiving a human into believing that an automatically generated sample of text was written by a human, as well as analysing which factors affect how convincing the text is. To accomplish this, we train a set of models to write text about several distinct topics, to simulate a bot’s behaviour, which are then evaluated by a panel of judges. We find that: (1) typical Internet users are twice as likely to be deceived by automated content than security researchers; (2) text that disagrees with the crowd’s opinion is more believably human; (3) light-hearted topics such as Entertainment are significantly easier to deceive with than factual topics such as Science; and (4) automated text on Adult content is the most deceptive regardless of a user’s background.

The statistics presented are impressive:

We found that automated text is twice as likely to deceive Internet users than security researchers. Also, text that disagrees with the Crowd’s opinion increases the likelihood of deception by up to 78%, while text on light-hearted Topics such as Entertainment increases the likelihood by up to 85%. Notably, we found that automated text on Adult content is the most deceptive for both typical Internet users and security researchers, increasing the likelihood of deception by at least 30% compared to other Topics on average. Together, this shows that it is feasible for a party with technical resources and knowledge to create an environment populated by bots that could successfully deceive users.
… (at page 1120)

To evaluate those statistics consider the judges panels that create the supporting data:

To evaluate this test dataset, a panel of judges is used where every judge receives the entire test set with no other accompanying data such as Topic and Crowd opinion. Then, each judge evaluates the comments based solely on their text and labels each as either human or bot, depending who they believe wrote it. To fill this panel, three judges were selected – in keeping with the average procedure of the work highlighted by Bailey et al. [2] – for two distinct groups:

  • Group 1: Three cyber security researchers who are actively involved in security work with an intimate knowledge of the Internet and its threats.
  • Group 2: Three typical Internet users who browse social media daily but are not experienced with technology or security, and therefore less aware of the threats.

… (pages 1117-1118)

The paper reports human vs. generation evaluations of topics by six (6) people.

I’m suddenly less impressed than I hoped to be from reading the abstract.

A more informative title would have been: 6 People Classify Machine/Human Generated Reddit Comments.

To their credit, the authors were explicit about the judging panels in their study.

I am forced to conclude peer review wasn’t used for the SAC 2016 31st ACM Symposium on Applied Computing or its peer reviewers left a great deal to be desired.

As a conference goer, would you be interested in human/machine judgments of six unknown panelists?

Deep Learning Trends @ ICLR 2016 (+ Shout-Out to arXiv)

Friday, June 3rd, 2016

Deep Learning Trends @ ICLR 2016 by Tomasz Malisiewicz.

From the post:

Started by the youngest members of the Deep Learning Mafia [1], namely Yann LeCun and Yoshua Bengio, the ICLR conference is quickly becoming a strong contender for the single most important venue in the Deep Learning space. More intimate than NIPS and less benchmark-driven than CVPR, the world of ICLR is arXiv-based and moves fast.

Today’s post is all about ICLR 2016. I’ll highlight new strategies for building deeper and more powerful neural networks, ideas for compressing big networks into smaller ones, as well as techniques for building “deep learning calculators.” A host of new artificial intelligence problems is being hit hard with the newest wave of deep learning techniques, and from a computer vision point of view, there’s no doubt that deep convolutional neural networks are today’s “master algorithm” for dealing with perceptual data.

Information packed review of the conference and if that weren’t enough, this shout-out to arXiv:

ICLR Publishing Model: arXiv or bust
At ICLR, papers get posted on arXiv directly. And if you had any doubts that arXiv is just about the single awesomest thing to hit the research publication model since the Gutenberg press, let the success of ICLR be one more data point towards enlightenment. ICLR has essentially bypassed the old-fashioned publishing model where some third party like Elsevier says “you can publish with us and we’ll put our logo on your papers and then charge regular people $30 for each paper they want to read.” Sorry Elsevier, research doesn’t work that way. Most research papers aren’t good enough to be worth $30 for a copy. It is the entire body of academic research that provides true value, for which a single paper just a mere door. You see, Elsevier, if you actually gave the world an exceptional research paper search engine, together with the ability to have 10-20 papers printed on decent quality paper for a $30/month subscription, then you would make a killing on researchers and I would endorse such a subscription. So ICLR, rightfully so, just said fuck it, we’ll use arXiv as the method for disseminating our ideas. All future research conferences should use arXiv to disseminate papers. Anybody can download the papers, see when newer versions with corrections are posted, and they can print their own physical copies. But be warned: Deep Learning moves so fast, that you’ve gotta be hitting refresh or arXiv on a weekly basis or you’ll be schooled by some grad students in Canada.

Is your publishing < arXiv?

Do you hit arXiv every week?

Bias? What Bias? We’re Scientific!

Monday, May 23rd, 2016

This ProPublica story by Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, isn’t short but it is worth your time to not only read, but to download the data and test their analysis for yourself.

Especially if you have the mis-impression that algorithms can avoid bias. Or that clients will apply your analysis with the caution that it deserves.

Finding a bias in software, like finding a bug, is a good thing. But that’s just one, there is no estimate of how many others may exist.

And as you will find, clients may not remember your careful explanation of the limits to your work. Or apply it in ways you don’t anticipate.

Machine Bias – There’s software used across the country to predict future criminals. And it’s biased against blacks.

Here’s the first story to try to lure you deeper into this study:

ON A SPRING AFTERNOON IN 2014, Brisha Borden was running late to pick up her god-sister from school when she spotted an unlocked kid’s blue Huffy bicycle and a silver Razor scooter. Borden and a friend grabbed the bike and scooter and tried to ride them down the street in the Fort Lauderdale suburb of Coral Springs.

Just as the 18-year-old girls were realizing they were too big for the tiny conveyances — which belonged to a 6-year-old boy — a woman came running after them saying, “That’s my kid’s stuff.” Borden and her friend immediately dropped the bike and scooter and walked away.

But it was too late — a neighbor who witnessed the heist had already called the police. Borden and her friend were arrested and charged with burglary and petty theft for the items, which were valued at a total of $80.

Compare their crime with a similar one: The previous summer, 41-year-old Vernon Prater was picked up for shoplifting $86.35 worth of tools from a nearby Home Depot store.

Prater was the more seasoned criminal. He had already been convicted of armed robbery and attempted armed robbery, for which he served five years in prison, in addition to another armed robbery charge. Borden had a record, too, but it was for misdemeanors committed when she was a juvenile.

Yet something odd happened when Borden and Prater were booked into jail: A computer program spat out a score predicting the likelihood of each committing a future crime. Borden — who is black — was rated a high risk. Prater — who is white — was rated a low risk.

Two years later, we know the computer algorithm got it exactly backward. Borden has not been charged with any new crimes. Prater is serving an eight-year prison term for subsequently breaking into a warehouse and stealing thousands of dollars’ worth of electronics.

This analysis demonstrates that malice isn’t required for bias to damage lives. Whether the biases are in software, in its application, in the interpretation of its results, the end result is the same, damaged lives.

I don’t think bias in software is avoidable but here, here no one was even looking.

What role do you think budget justification/profit making played in that blindness to bias?

Deep Learning: Image Similarity and Beyond (Webinar, May 10, 2016)

Friday, May 6th, 2016

Deep Learning: Image Similarity and Beyond (Webinar, May 10, 2016)

From the registration page:

Deep Learning is a powerful machine learning method for image tagging, object recognition, speech recognition, and text analysis. In this demo, we’ll cover the basic concept of deep learning and walk you through the steps to build an application that finds similar images using an already-trained deep learning model.

Recommended for:

  • Data scientists and engineers
  • Developers and technical team managers
  • Technical product managers

What you’ll learn:

  • How to leverage existing deep learning models
  • How to extract deep features and use them using GraphLab Create
  • How to build and deploy an image similarity service using Dato Predictive Services

What we’ll cover:

  • Using an already-trained deep learning model
  • Extracting deep features
  • Building and deploying an image similarity service for pictures 

Deep learning has difficulty justifying its choices, just like human judges of similarity, but could it play a role in assisting topic map authors in constructing explicit decisions for merging?

Once trained, could deep learning suggest properties and/or values to consider for merging it has not yet experienced?

I haven’t seen any webinars recently so I am ready to gamble on this being an interesting one.


Peda(bot)bically Speaking:…

Monday, April 25th, 2016

Peda(bot)bically Speaking: Teaching Computational and Data Journalism with Bots by Nicholas Diakopoulos.

From the post:

Bots can be useful little creatures for journalism. Not only because they help us automate tasks like alerting and filtering, but also because they encapsulate how data and computing can work together, in service of automated news. At the University of Maryland, where I’m a professor of journalism, my students are using the power of news bots to learn concepts and skills in computational journalism—including both editorial thinking and computational thinking.

Hmmm, bot that filters all tweets that don’t contain a URL? (To filter cat pics and the like.) 😉

Or retweets tweets with #’s that trigger creation of topics/associations?

I don’t think there is a requirement that hashtags be meaningful to others. Yes?

Sounds like a great class!

Hello World – Machine Learning Recipes #1

Saturday, April 16th, 2016

Hello World – Machine Learning Recipes #1 by Josh Gordon.

From the description:

Six lines of Python is all it takes to write your first machine learning program! In this episode, we’ll briefly introduce what machine learning is and why it’s important. Then, we’ll follow a recipe for supervised learning (a technique to create a classifier from examples) and code it up.

The first in a promised series on machine learning using scikit learn and TensorFlow.

The quality of video that you wish was available to intermediate and advanced treatments.

Quite a treat! Pass onto anyone interested in machine learning.