## Archive for the ‘Deep Learning’ Category

### Deep Learning for NLP Best Practices

Wednesday, July 26th, 2017

From the introduction:

This post is a collection of best practices for using neural networks in Natural Language Processing. It will be updated periodically as new insights become available and in order to keep track of our evolving understanding of Deep Learning for NLP.

There has been a running joke in the NLP community that an LSTM with attention will yield state-of-the-art performance on any task. While this has been true over the course of the last two years, the NLP community is slowly moving away from this now standard baseline and towards more interesting models.

However, we as a community do not want to spend the next two years independently (re-)discovering the next LSTM with attention. We do not want to reinvent tricks or methods that have already been shown to work. While many existing Deep Learning libraries already encode best practices for working with neural networks in general, such as initialization schemes, many other details, particularly task or domain-specific considerations, are left to the practitioner.

This post is not meant to keep track of the state-of-the-art, but rather to collect best practices that are relevant for a wide range of tasks. In other words, rather than describing one particular architecture, this post aims to collect the features that underlie successful architectures. While many of these features will be most useful for pushing the state-of-the-art, I hope that wider knowledge of them will lead to stronger evaluations, more meaningful comparison to baselines, and inspiration by shaping our intuition of what works.

I assume you are familiar with neural networks as applied to NLP (if not, I recommend Yoav Goldberg’s excellent primer [43]) and are interested in NLP in general or in a particular task. The main goal of this article is to get you up to speed with the relevant best practices so you can make meaningful contributions as soon as possible.

I will first give an overview of best practices that are relevant for most tasks. I will then outline practices that are relevant for the most common tasks, in particular classification, sequence labelling, natural language generation, and neural machine translation.

Certainly a resource to bookmark while you read A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg (76 pages), and to consult frequently as you move beyond the primer stage.

Enjoy and pass it on!

### Deep Learning – Dodging The NSA

Monday, May 29th, 2017

Ivanov’s motivation for local deep learning hardware came from monthly AWS bills.

You may suffer from those or be training on data sets you’d rather not share with the NSA.

For whatever reason, follow these detailed descriptions to build your own deep learning box.

Caution: If a month or more has elapsed since this post and you’re starting to build a system, check all the update links. Hardware and prices change rapidly.

### Weaponizing GPUs (Terrorism)

Monday, May 22nd, 2017

ETH Zurich scientists leveraged deep learning to automatically stitch together millions of public images and video into a three-dimensional, living model of the city of Zurich.

The platform called “VarCity” combines a variety of different image sources: aerial photographs, 360-degree panoramic images taken from vehicles, photos published by tourists on social networks and video material from YouTube and public webcams.

“The more images and videos the platform can evaluate, the more precise the model becomes,” says Kenneth Vanhoey, a postdoc in the group led by Luc Van Gool, a Professor at ETH Zurich’s Computer Vision Lab. “The aim of our project was to develop the algorithms for such 3D city models, assuming that the volume of available images and videos will also increase dramatically in the years ahead.”

Using a cluster of GPUs including Tesla K40s with cuDNN to train their deep learning models, the technology recognizes image content such as buildings, windows and doors, streets, bodies of water, people, and cars. Without human assistance, the 3D model “knows”, for example, what pavements are and – by evaluating webcam data – which streets are one-way only.

The data/information gap between nation states and non-nation state groups grows narrower every day. Here, GPUs and deep learning produce planning data terrorists could only have dreamed about twenty years ago.

Technical advances make precautions such as:

Federal, state, and local law enforcement let people know that if they take pictures or notes around monuments and critical infrastructure facilities, they could be subject to an interrogation or an arrest; in addition to the See Something, Say Something awareness campaign, DHS also has broader initiatives such as the Buffer Zone Protection Program, which teach local police and security how to spot potential terrorist activities. (DHS focus on suspicious activity at critical infrastructure facilities)

sound old fashioned and quaint.

Such measures annoy tourists, but unless potential terrorists are as dumb as the underwear bomber, they do not do much against a skilled adversary.

I guess that’s the question isn’t it?

Are you planning to fight terrorists from the shallow end of the gene pool or someone a little more challenging?

### DeepSketch2Face

Tuesday, May 16th, 2017

DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling by Xiaoguang Han, Chang Gao, and Yizhou Yu.

Abstract:

Face modeling has been paid much attention in the field of visual computing. There exist many scenarios, including cartoon characters, avatars for social media, 3D face caricatures as well as face-related art and design, where low-cost interactive face modeling is a popular approach especially among amateur users. In this paper, we propose a deep learning based sketching system for 3D face and caricature modeling. This system has a labor-efficient sketching interface, that allows the user to draw freehand imprecise yet expressive 2D lines representing the contours of facial features. A novel CNN based deep regression network is designed for inferring 3D face models from 2D sketches. Our network fuses both CNN and shape based features of the input sketch, and has two independent branches of fully connected layers generating independent subsets of coefficients for a bilinear face representation. Our system also supports gesture based interactions for users to further manipulate initial face models. Both user studies and numerical results indicate that our sketching system can help users create face models quickly and effectively. A significantly expanded face database with diverse identities, expressions and levels of exaggeration is constructed to promote further research and evaluation of face modeling techniques.

Deep learning assisted drawing, here with faces or drawing more generally, is rife with possibilities for humor.

Realistic caricatures/avatars are nearly within the reach of even art-challenged users.

Saturday, March 4th, 2017

Chip Huyen, who teaches CS 20SI: “TensorFlow for Deep Learning Research” @Stanford, has posted code examples for the class, along with a chatbot developed for one of the assignments.

A neural chatbot using sequence to sequence model with attentional decoder. This is a fully functional chatbot.

This is based on Google Translate Tensorflow model https://github.com/tensorflow/models/blob/master/tutorials/rnn/translate/

Sequence to sequence model by Cho et al.(2014)

Created by Chip Huyen as the starter code for assignment 3, class CS 20SI: “TensorFlow for Deep Learning Research” cs20si.stanford.edu

The detailed assignment handout and information on training time can be found at http://web.stanford.edu/class/cs20si/assignments/a3.pdf
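If “attentional decoder” is new to you, here is a minimal sketch of the attention step in plain Python. I am assuming simple dot-product scoring, which may not be what the assignment code uses, and the function names and toy numbers are mine, not Huyen’s.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Dot-product attention (an assumed scoring function).

    Scores each encoder state against the decoder state, then returns
    the attention weights and the weighted average (context vector).
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy, made-up vectors: three encoder states and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

The decoder state lines up with the first and third encoder states, so those two receive most of the weight and dominate the context vector.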

Dialogue is lacking but this chatbot could be trained to appear to government forces as a live “jihadist” following and conversing with other “jihadists.” Who may themselves be chatbots.

Unlike the expense of pilots for a fleet of drones, a single user could “pilot” a group of chatbots, creating an over-sized impression in cyberspace. The deeper the modeling of human jihadists, the harder it will be to distinguish virtual jihadists.

I say “jihadists” for headline effect. You could create interacting chatbots for right/left wing hate groups, gun owners, churches, etc., in short, anyone seeking to dilute surveillance.

(Unlike the ACLU or EFF, I don’t concede there are any legitimate reasons for government surveillance. The dangers of government surveillance far exceed any possible crime it could prevent. Government surveillance is the question. The answer is NO.)

CS 20SI: Tensorflow for Deep Learning Research

From the webpage:

Tensorflow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. Tensorflow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the Tensorflow library for deep learning research. We aim to help students understand the graphical computational model of Tensorflow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use Tensorflow to build models of different complexity, from simple linear/logistic regression to convolutional neural network and recurrent neural networks with LSTM to solve tasks such as word embeddings, translation, optical character recognition. Students will also learn best practices to structure a model and manage research experiments.

Enjoy!

### AI Podcast: Winning the Cybersecurity Cat and Mouse Game with AI

Wednesday, February 22nd, 2017

AI Podcast: Winning the Cybersecurity Cat and Mouse Game with AI. Brian Caulfield interviews Eli David of Deep Instinct.

From the description:

Cybersecurity is a cat-and-mouse game. And the mouse always has the upper hand. That’s because it’s so easy for new malware to go undetected.

Eli David, an expert in computational intelligence, wants to use AI to change that. He’s CTO of Deep Instinct, a security firm with roots in Israel’s defense industry, that is bringing the GPU-powered deep learning techniques underpinning modern speech and image recognition to the vexing world of cybersecurity.

“It’s exactly like Tom and Jerry, the cat and the mouse, with the difference being that, in this case, Jerry the mouse always has the upper hand,” David said in a conversation on the AI Podcast with host Michael Copeland. He notes that more than 1 million new pieces of malware are created every day.

Interesting take on detection of closely similar malware using deep learning.

Directed in part at detecting smallish modifications that evade current malware detection techniques.

OK, but who is working on using deep learning to discover flaws in software code?

### Deep Learning (MIT Press Book) – Published (and still online)

Monday, February 13th, 2017

Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville.

From the introduction:

1.1 Who Should Read This Book?

This book can be useful for a variety of readers, but we wrote it with two main target audiences in mind. One of these target audiences is university students (undergraduate or graduate) learning about machine learning, including those who are beginning a career in deep learning and artificial intelligence research. The other target audience is software engineers who do not have a machine learning or statistics background, but want to rapidly acquire one and begin using deep learning in their product or platform. Deep learning has already proven useful in many software disciplines including computer vision, speech and audio processing, natural language processing, robotics, bioinformatics and chemistry, video games, search engines, online advertising and finance.

This book has been organized into three parts in order to best accommodate a variety of readers. Part I introduces basic mathematical tools and machine learning concepts. Part II describes the most established deep learning algorithms that are essentially solved technologies. Part III describes more speculative ideas that are widely believed to be important for future research in deep learning.

Readers should feel free to skip parts that are not relevant given their interests or background. Readers familiar with linear algebra, probability, and fundamental machine learning concepts can skip part I, for example, while readers who just want to implement a working system need not read beyond part II. To help choose which chapters to read, figure 1.6 provides a flowchart showing the high-level organization of the book.

We do assume that all readers come from a computer science background. We assume familiarity with programming, a basic understanding of computational performance issues, complexity theory, introductory level calculus and some of the terminology of graph theory.

This promises to be a real delight, whether read for an application space or to get a better handle on deep learning.

### Comparing Symbolic Deep Learning Frameworks

Thursday, December 8th, 2016

Deep Learning Part 1: Comparison of Symbolic Deep Learning Frameworks by Anusua Trivedi.

From the post:

This blog series is based on my upcoming talk on re-usability of Deep Learning Models at the Hadoop+Strata World Conference in Singapore. This blog series will be in several parts – where I describe my experiences and go deep into the reasons behind my choices.

Deep learning is an emerging field of research, which has its application across multiple domains. I try to show how transfer learning and fine tuning strategy leads to re-usability of the same Convolution Neural Network model in different disjoint domains. Application of this model across various different domains brings value to using this fine-tuned model.

In this blog (Part1), I describe and compare the commonly used open-source deep learning frameworks. I dive deep into different pros and cons for each framework, and discuss why I chose Theano for my work.

Your mileage may vary but a great starting place!

### srez: Image super-resolution through deep learning

Sunday, August 28th, 2016

From the webpage:

Image super-resolution through deep learning. This project uses deep learning to upscale 16×16 images by a 4x factor. The resulting 64×64 images display sharp features that are plausible based on the dataset that was used to train the neural net.

Here’s a random, non cherry-picked, example of what this network can do. From left to right, the first column is the 16×16 input image, the second one is what you would get from a standard bicubic interpolation, the third is the output generated by the neural net, and on the right is the ground truth.
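To get a feel for the arithmetic: a 4x factor on each side of a 16×16 image means 64×64 = 4,096 output pixels from 256 input pixels, sixteen outputs for every input. Here is a minimal sketch of the crudest possible upscaler, nearest-neighbour, which is simpler than the bicubic baseline mentioned above and is not what srez does. The function name and the toy image are my own.

```python
def upscale_nearest(image, factor):
    """Nearest-neighbour upscaling: repeat every pixel `factor` times
    horizontally and every row `factor` times vertically."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# A made-up 16x16 grayscale "image" (values 0..255).
small = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]

big = upscale_nearest(small, 4)
print(len(big), len(big[0]))                          # rows, columns
print(len(big) * len(big[0]) // (len(small) * len(small[0])))  # pixel ratio
```

Nothing here adds information: each block of sixteen output pixels is a copy of one input pixel. The interest of the neural net is that it fills those sixteen pixels with plausible detail learned from the training set.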

Once you have collected names, you are likely to need image processing.

Here’s an interesting technique using deep learning. Face-on images only at the moment, but you can expect that to improve.

### What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

Friday, August 19th, 2016

From the post:

Artificial intelligence is the future. Artificial intelligence is science fiction. Artificial intelligence is already part of our everyday lives. All those statements are true, it just depends on what flavor of AI you are referring to.

For example, when Google DeepMind’s AlphaGo program defeated South Korean Master Lee Se-dol in the board game Go earlier this year, the terms AI, machine learning, and deep learning were used in the media to describe how DeepMind won. And all three are part of the reason why AlphaGo trounced Lee Se-Dol. But they are not the same things.

The easiest way to think of their relationship is to visualize them as concentric circles with AI — the idea that came first — the largest, then machine learning — which blossomed later, and finally deep learning — which is driving today’s AI explosion — fitting inside both.

If you are confused by the mix of artificial intelligence, machine learning, and deep learning floating around, Copeland will set you straight.

It’s a fun read and one you can recommend to non-technical friends.

### Grokking Deep Learning

Wednesday, August 17th, 2016

Grokking Deep Learning by Andrew W. Trask.

From the description:

Artificial Intelligence is the most exciting technology of the century, and Deep Learning is, quite literally, the “brain” behind the world’s smartest Artificial Intelligence systems out there. Loosely based on neuron behavior inside of human brains, these systems are rapidly catching up with the intelligence of their human creators, defeating the world champion Go player, achieving superhuman performance on video games, driving cars, translating languages, and sometimes even helping law enforcement fight crime. Deep Learning is a revolution that is changing every industry across the globe.

Grokking Deep Learning is the perfect place to begin your deep learning journey. Rather than just learn the “black box” API of some library or framework, you will actually understand how to build these algorithms completely from scratch. You will understand how Deep Learning is able to learn at levels greater than humans. You will be able to understand the “brain” behind state-of-the-art Artificial Intelligence. Furthermore, unlike other courses that assume advanced knowledge of Calculus and leverage complex mathematical notation, if you’re a Python hacker who passed high-school algebra, you’re ready to go. And at the end, you’ll even build an A.I. that will learn to defeat you in a classic Atari game.

In the Manning Early Access Program (MEAP) with three (3) chapters presently available.

A much more plausible undertaking than DARPA’s quest for “Explainable AI” or “XAI.” (DARPA WANTS ARTIFICIAL INTELLIGENCE TO EXPLAIN ITSELF) DARPA reasons that:

Potential applications for defense are endless—autonomous aerial and undersea war-fighting or surveillance, among others—but humans won’t make full use of AI until they trust it won’t fail, according to the Defense Advanced Research Projects Agency. A new DARPA effort aims to nurture communication between machines and humans by investing in AI that can explain itself as it works.

If non-failure is the criterion for trust, U.S. troops should refuse to leave their barracks in view of the repeated failures of military strategy since the end of WWII.

DARPA should choose a less stringent criterion for trusting an AI. However, failing less often than the Joint Chiefs of Staff may be too low a bar to set.

### Deep Learning Trends @ ICLR 2016 (+ Shout-Out to arXiv)

Friday, June 3rd, 2016

From the post:

Started by the youngest members of the Deep Learning Mafia [1], namely Yann LeCun and Yoshua Bengio, the ICLR conference is quickly becoming a strong contender for the single most important venue in the Deep Learning space. More intimate than NIPS and less benchmark-driven than CVPR, the world of ICLR is arXiv-based and moves fast.

Today’s post is all about ICLR 2016. I’ll highlight new strategies for building deeper and more powerful neural networks, ideas for compressing big networks into smaller ones, as well as techniques for building “deep learning calculators.” A host of new artificial intelligence problems is being hit hard with the newest wave of deep learning techniques, and from a computer vision point of view, there’s no doubt that deep convolutional neural networks are today’s “master algorithm” for dealing with perceptual data.

An information-packed review of the conference and, if that weren’t enough, this shout-out to arXiv:

ICLR Publishing Model: arXiv or bust
At ICLR, papers get posted on arXiv directly. And if you had any doubts that arXiv is just about the single awesomest thing to hit the research publication model since the Gutenberg press, let the success of ICLR be one more data point towards enlightenment. ICLR has essentially bypassed the old-fashioned publishing model where some third party like Elsevier says “you can publish with us and we’ll put our logo on your papers and then charge regular people $30 for each paper they want to read.” Sorry Elsevier, research doesn’t work that way. Most research papers aren’t good enough to be worth $30 for a copy. It is the entire body of academic research that provides true value, for which a single paper is just a mere door. You see, Elsevier, if you actually gave the world an exceptional research paper search engine, together with the ability to have 10-20 papers printed on decent quality paper for a $30/month subscription, then you would make a killing on researchers and I would endorse such a subscription. So ICLR, rightfully so, just said fuck it, we’ll use arXiv as the method for disseminating our ideas. All future research conferences should use arXiv to disseminate papers. Anybody can download the papers, see when newer versions with corrections are posted, and they can print their own physical copies. But be warned: Deep Learning moves so fast, that you’ve gotta be hitting refresh on arXiv on a weekly basis or you’ll be schooled by some grad students in Canada.

Do you hit arXiv every week?

### Deep Learning: Image Similarity and Beyond (Webinar, May 10, 2016)

Friday, May 6th, 2016

Deep Learning: Image Similarity and Beyond (Webinar, May 10, 2016)

From the registration page:

Deep Learning is a powerful machine learning method for image tagging, object recognition, speech recognition, and text analysis. In this demo, we’ll cover the basic concept of deep learning and walk you through the steps to build an application that finds similar images using an already-trained deep learning model.

Recommended for:

• Data scientists and engineers
• Developers and technical team managers
• Technical product managers

What you’ll learn:

• How to leverage existing deep learning models
• How to extract deep features and use them using GraphLab Create
• How to build and deploy an image similarity service using Dato Predictive Services

What we’ll cover:

• Using an already-trained deep learning model
• Extracting deep features
• Building and deploying an image similarity service for pictures
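The core of such a service is easy to sketch once the deep features exist. What follows is a minimal illustration in plain Python, not GraphLab Create or Dato Predictive Services code: I assume an already-trained model has turned each image into a feature vector, and the file names, three-dimensional vectors, and function names are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, features, k=2):
    """Return the names of the k images whose features best match the query."""
    ranked = sorted(
        features,
        key=lambda name: cosine_similarity(query, features[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical "deep features"; real ones have hundreds or thousands of dimensions.
features = {
    "beach_1.jpg": [0.9, 0.1, 0.0],
    "beach_2.jpg": [0.8, 0.2, 0.1],
    "forest_1.jpg": [0.1, 0.9, 0.2],
    "city_1.jpg": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]  # features of the image we want matches for

print(most_similar(query, features))
```

A brute-force scan like this is fine for a handful of images. A real service would need an index for nearest-neighbour search, which is presumably part of what the webinar’s tooling provides.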

Deep learning has difficulty justifying its choices, just like human judges of similarity, but could it play a role in assisting topic map authors in constructing explicit decisions for merging?

Once trained, could deep learning suggest properties and/or values to consider for merging it has not yet experienced?

I haven’t seen any webinars recently so I am ready to gamble on this being an interesting one.

Enjoy!

### Revealing the Hidden Patterns of News Photos:… [Uncovers Anti-Sanders Bias]

Saturday, March 26th, 2016

Abstract:

In this work, we analyze more than two million news photos published in January 2016. We demonstrate i) which objects appear the most in news photos; ii) what the sentiments of news photos are; iii) whether the sentiment of news photos is aligned with the tone of the text; iv) how gender is treated; and v) how differently political candidates are portrayed. To our best knowledge, this is the first large-scale study of news photo contents using deep learning-based vision APIs.

Not that bias-free news is possible, but deep learning appears to be useful in foregrounding bias against particular candidates:

We then conducted a case study of assessing the portrayal of Democratic and Republican party presidential candidates in news photos. We found that all the candidates but Sanders had a similar proportion of being labeled as an athlete, which is typically associated with a victory pose or a sharp focus on a face with blurred background. Pro-Clinton media recognized by their endorsements show the same tendency; their Sanders photos are not labeled as an athlete at all. Furthermore, we found that Clinton expresses joy more than Sanders does in the six popular news media. Similarly, pro-Clinton media shows a higher proportion of Clinton expressing joy than Sanders.

If the requirement is an “appearance” of lack of bias, the same techniques enable the monitoring/shaping of your content to prevent your bias from being discovered by others.

Data scientists who can successfully wield this framework will be in high demand for political campaigns.

### Automating Amazon/Hotel/Travel Reviews (+ Human Intelligence Test (HIT))

Sunday, February 28th, 2016

The Neural Network That Remembers by Zachary C. Lipton & Charles Elkan.

From the post:

On tap at the brewpub. A nice dark red color with a nice head that left a lot of lace on the glass. Aroma is of raspberries and chocolate. Not much depth to speak of despite consisting of raspberries. The bourbon is pretty subtle as well. I really don’t know that find a flavor this beer tastes like. I would prefer a little more carbonization to come through. It’s pretty drinkable, but I wouldn’t mind if this beer was available.

Besides the overpowering bouquet of raspberries in this guy’s beer, this review is remarkable for another reason. It was produced by a computer program instructed to hallucinate a review for a “fruit/vegetable beer.” Using a powerful artificial-intelligence tool called a recurrent neural network, the software that produced this passage isn’t even programmed to know what words are, much less to obey the rules of English syntax. Yet, by mining the patterns in reviews from the barflies at BeerAdvocate.com, the program learns how to generate similarly coherent (or incoherent) reviews.

The neural network learns proper nouns like “Coors Light” and beer jargon like “lacing” and “snifter.” It learns to spell and to misspell, and to ramble just the right amount. Most important, the neural network generates reviews that are contextually relevant. For example, you can say, “Give me a 5-star review of a Russian imperial stout,” and the software will oblige. It knows to describe India pale ales as “hoppy,” stouts as “chocolatey,” and American lagers as “watery.” The neural network also learns more colorful words for lagers that we can’t put in print.

This particular neural network can also run in reverse, taking any review and recognizing the sentiment (star rating) and subject (type of beer). This work, done by one of us (Lipton) in collaboration with his colleagues Sharad Vikram and Julian McAuley at the University of California, San Diego, is part of a growing body of research demonstrating the language-processing capabilities of recurrent networks. Other related feats include captioning images, translating foreign languages, and even answering e-mail messages. It might make you wonder whether computers are finally able to think.

(emphasis in original)
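To see how text can come out of a program that has no notion of words, here is a minimal sketch. Be warned that it swaps in a much cruder technique than the authors use: a character-level Markov chain, not a recurrent neural network. The toy corpus borrows a few phrases from the review above, and the function names are mine.

```python
import random
from collections import defaultdict


def train(text, order=3):
    """Record which character follows each run of `order` characters."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(model, seed_text, length, rng):
    """Extend seed_text one character at a time by sampling from the model.

    len(seed_text) must equal the order the model was trained with.
    """
    order = len(seed_text)
    out = seed_text
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        out += rng.choice(followers)
    return out


# Toy corpus; a real experiment would use thousands of reviews.
corpus = (
    "a nice dark red color with a nice head. "
    "aroma is of raspberries and chocolate. "
    "a nice aroma with a dark chocolate color. "
)

model = train(corpus, order=3)
sample = generate(model, "a n", 60, random.Random(0))
print(sample)
```

Every four-character window of the output occurs somewhere in the corpus, yet the whole can wander off into sentences nobody wrote. A recurrent network carries far more context than three characters, which is why its reviews hang together so much better.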

An enthusiastic introduction and projection of the future of recurrent neural networks! Quite a bit so.

My immediate thought was what a time saver a recurrent neural network would be for “evaluation” requests that appear in my inbox with alarming regularity.

What about a service that accepts forwarded emails and generates a review for the book, seller, hotel, travel, etc., which is returned to you for cut-n-paste?

That would be about as “intelligent” as the amount of attention most of us devote to such requests.

You could set the service to mimic highly followed reviewers so over time you would move up the ranks of reviewers.

I mention Amazon, hotel, travel reviews but those are just low-hanging fruit. You could do journal book reviews with a different data set.

Near the end of the post the authors write:

In this sense, the computer-science community is evaluating recurrent neural networks via a kind of Turing test. We try to teach a computer to act intelligently by training it to imitate what people produce when faced with the same task. Then we evaluate our thinking machine by seeing whether a human judge can distinguish between its output and what a human being might come up with.

While the very fact that we’ve come this far is exciting, this approach may have some fundamental limitations. For instance, it’s unclear how such a system could ever outstrip the capabilities of the people who provide the training data. Teaching a machine to learn through imitation might never produce more intelligence than was present collectively in those people.

One promising way forward might be an approach called reinforcement learning. Here, the computer explores the possible actions it can take, guided only by some sort of reward signal. Recently, researchers at Google DeepMind combined reinforcement learning with feed-forward neural networks to create a system that can beat human players at 31 different video games. The system never got to imitate human gamers. Instead it learned to play games by trial and error, using its score in the video game as a reward signal.
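For anyone who has not seen “trial and error, guided only by a reward signal” written down, here is a minimal sketch. It is a three-armed bandit with an epsilon-greedy rule, far simpler than the DeepMind system described in the quote, and the payout probabilities, parameter values, and function name are all my own assumptions.

```python
import random


def run_bandit(reward_probs, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a multi-armed bandit.

    The learner never sees reward_probs; it only observes the 0/1
    reward returned after each action it tries.
    """
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)       # times each action was tried
    estimates = [0.0] * len(reward_probs)  # running mean reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore at random
        else:
            # exploit: pick the action with the best estimate so far
            action = max(range(len(estimates)), key=estimates.__getitem__)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # incremental update of the mean reward for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return counts, estimates


counts, estimates = run_bandit([0.2, 0.5, 0.8])
print(counts)
print([round(e, 2) for e in estimates])
```

No examples of good play are ever shown to the learner. Whatever preference it ends up with comes entirely from the rewards it happened to collect.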

Instead of asking whether computers can think, the more provocative question is “whether people think for a large range of daily activities?”

Consider it as the Human Intelligence Test (HIT).

How much “intelligence” does it take to win a video game?

Eye/hand coordination to be sure, attention, but what “intelligence” is involved?

Computers may “eclipse” human beings at non-intelligent activities, as a shovel “eclipses” our ability to dig with our bare hands.

But I’m not overly concerned.

Are you?

### Webinar: Image Similarity: Deep Learning and Beyond (January 12th/Register for Recording)

Monday, January 11th, 2016

From the webpage:

In this talk, we will extract features from the convolutional networks applied to real estate images to build a similarity graph and then do label propagation on the images to label different images in our dataset.

Recommended for:

• Data scientists and engineers
• Developers and technical team managers
• Technical product managers

What you’ll learn:

• How to extract features from a convolutional network using GraphLab Create
• How to build similarity graphs using nearest neighbors
• How to implement graph algorithms such as PageRank using GraphLab Create

What we’ll cover:

• Extracting features from convolutional networks
• Building similarity graphs using nearest neighbors
• Clustering: kmeans and beyond
• Graph algorithms: PageRank and label propagation
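The similarity-graph and label-propagation steps can be sketched in a few lines of plain Python. This is not GraphLab Create code: I assume the convolutional features have already been extracted, I use a simplified majority-vote form of label propagation, and the two-dimensional points, room labels, and function names are invented for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbours."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: squared_distance(p, points[j]))
        graph[i] = others[:k]
    return graph


def propagate_labels(graph, seeds, rounds=10):
    """Give each unlabeled node the majority label among its labeled
    neighbours, repeating until nothing changes or rounds run out."""
    labels = dict(seeds)
    for _ in range(rounds):
        updates = {}
        for node, neighbours in graph.items():
            if node in seeds:
                continue  # hand-labeled nodes never change
            votes = {}
            for n in neighbours:
                if n in labels:
                    votes[labels[n]] = votes.get(labels[n], 0) + 1
            if votes:
                # sorted() makes tie-breaking deterministic
                updates[node] = max(sorted(votes), key=votes.get)
        if all(labels.get(n) == label for n, label in updates.items()):
            break
        labels.update(updates)
    return labels


# Hypothetical image features: two well-separated clusters.
points = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
          [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]]
seeds = {0: "kitchen", 3: "garden"}  # the only hand-labeled images

graph = knn_graph(points, k=2)
labels = propagate_labels(graph, seeds)
print(labels)
```

Two hand-applied labels end up covering all six images, because each unlabeled image takes its label from its nearest neighbours in feature space.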

I had mixed results with webinars in 2015.

Looking forward to this one because of the coverage of similarity graphs.

From a subject identity perspective, how much similarity do you need to be the “same” subject?

If I have two books, one printed within the copyright period and another copy printed after the work came into the public domain, are they the same subject?

For some purposes yes and for other purposes not.

The strings we give web browsers, usually starting with “https://” these days, are crude measures of subject identity, don’t you think?

I say “the strings we give web browsers” because TBL and his cronies, who use popularity as a measure of success, continue their efforts to conflate URI, IRI, and URL into only URL. https://url.spec.whatwg.org/ The simplification doesn’t bother me as much as the attempts to conceal it.

It’s one way to bolster a claim to have always been right: just rewrite the records that anyone is likely to remember. I prefer my history with warts and all.

### Awesome Deep Learning – Value-Add Curation?

Monday, December 28th, 2015

Tweeted by Gregory Piatetsky as:

Awesome Curated #DeepLearning resources on #GitHub: books, courses, lectures, researchers…

What will you find there? (As of 28 December 2015):

• Courses – 15
• Datasets – 114
• Free Online Books – 8
• Frameworks – 35
• Miscellaneous – 26
• Papers – 32
• Researchers – 96
• Tutorials – 13
• Videos and Lectures – 16
• Websites – 24

By my count, that’s 359 resources.

We know from detailed analysis of PubMed search logs, that 80% of searchers choose a link from the first twenty “hits” returned for a search.

You could assume that out of “23 million user sessions and more than 58 million user queries” PubMed searchers and/or PubMed itself or both transcend the accuracy of searching observed in other contexts. That seems rather unlikely.

The authors note:

Two interesting phenomena are observed: first, the number of clicks for the documents in the later pages degrades exponentially (Figure 8). Second, PubMed users are more likely to click the first and last returned citation of each result page (Figure 9). This suggests that rather than simply following the retrieval order of PubMed, users are influenced by the results page format when selecting returned citations.

Result page format seems like a poor basis for choosing search results, in addition to being in the top twenty (20) results.

Eliminating all the cruft from search results to give you 359 resources is a value-add, but what value-add should be added to this list of resources?

Serious question, because we have tools far beyond what was available to curators in the 1960’s, but there is little (if any) curation to match that of the Reader’s Guide to Periodical Literature.

Here is a screen-shot of some of its contents:

If you can, tell me what search you would use to return that sort of result for “abortion” as a subject.

Nothing comes to mind?

Just to get you started, would pointing to algorithms across these 359 resources be helpful? Would you want to know more than algorithm N occurs in resource Y? Some of the more popular ones may occur in every resource. How helpful is that?

So I repeat my earlier question:

Please forward, repost, reblog, tweet. Thanks!

### The Limitations of Deep Learning in Adversarial Settings [The other type of setting would be?]

Tuesday, November 24th, 2015

Abstract:

Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs. In an application to computer vision, we show that our algorithms can reliably produce samples correctly classified by human subjects but misclassified in specific targets by a DNN with a 97% adversarial success rate while only modifying on average 4.02% of the input features per sample. We then evaluate the vulnerability of different sample classes to adversarial perturbations by defining a hardness measure. Finally, we describe preliminary work outlining defenses against adversarial samples by defining a predictive measure of distance between a benign input and a target classification.

I recommended deep learning for parsing lesser known languages earlier today. The utility of deep learning isn’t in doubt, but its vulnerability to “adversarial” input should give us pause.
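
To see how little of an input may need to change, here is a toy sketch. It is not the paper's algorithm: it uses a hypothetical linear classifier in place of a DNN, and the weights, bias, input, and step size are made-up values. The idea it shares with the abstract is ranking input features by their influence on the output and modifying only the most influential ones.

```python
def score(weights, bias, x):
    """Linear decision score; positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0

def craft_adversarial(weights, bias, x, step=1.0, max_changes=None):
    """Greedily flip the prediction by changing as few features as possible.

    Features are ranked by |weight| (for a linear model, the gradient of the
    score with respect to each input), and the most influential ones are
    pushed against the current prediction until the label flips.
    """
    original = predict(weights, bias, x)
    direction = -1.0 if original == 1 else 1.0
    ranked = sorted(range(len(x)), key=lambda i: abs(weights[i]), reverse=True)
    adversarial = list(x)
    changed = []
    for i in ranked[:max_changes]:
        if predict(weights, bias, adversarial) != original:
            break  # label already flipped, stop modifying features
        if weights[i] == 0:
            continue  # this feature has no influence on the score
        sign = 1.0 if weights[i] > 0 else -1.0
        adversarial[i] += direction * sign * step
        changed.append(i)
    return adversarial, changed

# Hypothetical model and input: ten features, classified as class 1.
weights = [0.2, -0.1, 3.0, 0.05, -2.5, 0.1, 0.0, 0.3, -0.2, 0.15]
bias = -0.5
x = [1.0] * 10

adversarial, changed = craft_adversarial(weights, bias, x)
```

In this toy case a change to one feature out of ten flips the label, and nothing in the modified input marks it as tampered with. The function may return without flipping the label if `max_changes` is too small, so callers should re-check the prediction.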

Adversarial input isn’t likely to be labeled as such. In fact, it may be concealed in ordinary open data that is freely available for download.

As the authors note, the more prevalent deep learning becomes, the greater the incentive for the manipulation of input into a deep neural network (DNN).

Although phrased as “adversaries,” the manipulation of input into DNNs isn’t limited to the implied “bad actors.” The choice or “cleaning” of input could be considered manipulation of input, from a certain point of view.

This paper is notice that input into a DNN is as important in evaluating its results as any other factor, if not more so.

Or to put it more bluntly, no disclosure of DNN data = no trust of DNN results.

### Deep Learning and Parsing

Sunday, November 22nd, 2015

Jason Baldridge tweets that the work of James Henderson (Google Scholar) should get more cites for deep learning and parsing.

Jason points to the following two works (early 1990’s) in particular:

Description Based Parsing in a Connectionist Network by James B. Henderson.

Abstract:

Recent developments in connectionist architectures for symbolic computation have made it possible to investigate parsing in a connectionist network while still taking advantage of the large body of work on parsing in symbolic frameworks. This dissertation investigates syntactic parsing in the temporal synchrony variable binding model of symbolic computation in a connectionist network. This computational architecture solves the basic problem with previous connectionist architectures, while keeping their advantages. However, the architecture does have some limitations, which impose computational constraints on parsing in this architecture. This dissertation argues that, despite these constraints, the architecture is computationally adequate for syntactic parsing, and that these constraints make significant linguistic predictions. To make these arguments, the nature of the architecture’s limitations are first characterized as a set of constraints on symbolic computation. This allows the investigation of the feasibility and implications of parsing in the architecture to be investigated at the same level of abstraction as virtually all other investigations of syntactic parsing. Then a specific parsing model is developed and implemented in the architecture. The extensive use of partial descriptions of phrase structure trees is crucial to the ability of this model to recover the syntactic structure of sentences within the constraints. Finally, this parsing model is tested on those phenomena which are of particular concern given the constraints, and on an approximately unbiased sample of sentences to check for unforeseen difficulties. The results show that this connectionist architecture is powerful enough for syntactic parsing. They also show that some linguistic phenomena are predicted by the limitations of this architecture. In particular, explanations are given for many cases of unacceptable center embedding, and for several significant constraints on long distance dependencies. These results give evidence for the cognitive significance of this computational architecture and parsing model. This work also shows how the advantages of both connectionist and symbolic techniques can be unified in natural language processing applications. By analyzing how low level biological and computational considerations influence higher level processing, this work has furthered our understanding of the nature of language and how it can be efficiently and effectively processed.

Connectionist Syntactic Parsing Using Temporal Variable Binding by James Henderson.

Abstract:

Recent developments in connectionist architectures for symbolic computation have made it possible to investigate parsing in a connectionist network while still taking advantage of the large body of work on parsing in symbolic frameworks. The work discussed here investigates syntactic parsing in the temporal synchrony variable binding model of symbolic computation in a connectionist network. This computational architecture solves the basic problem with previous connectionist architectures, while keeping their advantages. However, the architecture does have some limitations, which impose constraints on parsing in this architecture. Despite these constraints, the architecture is computationally adequate for syntactic parsing. In addition, the constraints make some significant linguistic predictions. These arguments are made using a specific parsing model. The extensive use of partial descriptions of phrase structure trees is crucial to the ability of this model to recover the syntactic structure of sentences within the constraints imposed by the architecture.

Enjoy!

### Hopping on the Deep Learning Bandwagon

Thursday, November 5th, 2015

Hopping on the Deep Learning Bandwagon by Yanir Seroussi.

From the post:

I’ve been meaning to get into deep learning for the last few years. Now, the stars have finally aligned and I have the time and motivation to work on a small project that will hopefully improve my understanding of the field. This is the first in a series of posts that will document my progress on this project.

As mentioned in a previous post on getting started as a data scientist, I believe that the best way of becoming proficient at solving data science problems is by getting your hands dirty. Despite being familiar with high-level terminology and having some understanding of how it all works, I don’t have any practical experience applying deep learning. The purpose of this project is to fix this experience gap by working on a real problem.

#### The problem: Inferring genre from album covers

Deep learning has been very successful at image classification. Therefore, it makes sense to work on an image classification problem for this project. Rather than using an existing dataset, I decided to make things a bit more interesting by building my own dataset. Over the last year, I’ve been running BCRecommender – a recommendation system for Bandcamp music. I’ve noticed that album covers vary by genre, though it’s hard to quantify exactly how they vary. So the question I’ll be trying to answer with this project is how accurately can genre be inferred from Bandcamp album covers?

As the goal of this project is to learn about deep learning rather than make a novel contribution, I didn’t do a comprehensive search to see whether this problem has been addressed before. However, I did find a recent post by Alexandre Passant that describes his use of Clarifai’s API to tag the content of Spotify album covers (identifying elements such as men, night, dark, etc.), and then using these tags to infer the album’s genre. Another related project is Karayev et al.’s Recognizing image style paper, in which the authors classified datasets of images from Flickr and Wikipedia by style and art genre, respectively. In all these cases, the results are pretty good, supporting my intuition that the genre inference task is feasible.

Yanir continues this adventure into deep learning with: Learning About Deep Learning Through Album Cover Classification. And you will want to look over his list of Deep Learning Resources.

Yanir’s observation that the goal of the project was “…to learn about deep learning rather than make a novel contribution…” is an important one.

The techniques and lessons you learn may be known to others but they will be new to you.

### How to build and run your first deep learning network

Thursday, October 29th, 2015

From the post:

When I first became interested in using deep learning for computer vision I found it hard to get started. There were only a couple of open source projects available, they had little documentation, were very experimental, and relied on a lot of tricky-to-install dependencies. A lot of new projects have appeared since, but they’re still aimed at vision researchers, so you’ll still hit a lot of the same obstacles if you’re approaching them from outside the field.

In this article — and the accompanying webcast — I’m going to show you how to run a pre-built network, and then take you through the steps of training your own. I’ve listed the steps I followed to set up everything toward the end of the article, but because the process is so involved, I recommend you download a Vagrant virtual machine that I’ve pre-loaded with everything you need. This VM lets us skip over all the installation headaches and focus on building and running the neural networks.

I have been unable to find the posts that were to follow in this series.

Even by itself this will be enough to get you going on deep learning but the additional posts would be nice.

Pointers anyone?

### Teaching Deep Convolutional Neural Networks to Play Go [Networks that can’t explain their play]

Sunday, October 18th, 2015

Abstract:

Mastering the game of Go has remained a long standing challenge to the field of AI. Modern computer Go systems rely on processing millions of possible future positions to play well, but intuitively a stronger and more ‘humanlike’ way to play the game would be to rely on pattern recognition abilities rather then brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to ‘hard code’ symmetries that are expect to exist in the target function, and demonstrate in an ablation study they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction programs have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go, indicating it is state of the art among programs that do not use Monte Carlo Tree Search. It is also able to win some games against state of the art Go playing program Fuego while using a fraction of the play time. This success at playing Go indicates high level principles of the game were learned.
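
The abstract mentions tying weights to hard-code symmetries but does not spell out the scheme. One plausible reading, sketched below purely as an assumption, is to force a filter to take the same value at positions that map onto each other under the eight rotations and reflections of the board. The kernel values and the 3x3 patch (1 for a black stone, -1 for a white stone) are made up.

```python
def rotate90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]

def symmetries(grid):
    """Return the 8 rotations and reflections of a square grid."""
    out = []
    g = grid
    for _ in range(4):
        out.append(g)
        out.append(flip(g))
        g = rotate90(g)
    return out

def tie_weights(kernel):
    """Average a kernel over its 8 symmetries so symmetric positions share one weight."""
    variants = symmetries(kernel)
    n = len(kernel)
    return [[sum(v[r][c] for v in variants) / len(variants) for c in range(n)]
            for r in range(n)]

def response(kernel, patch):
    """Filter response: elementwise product of kernel and patch, summed."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

kernel = [[1, 0, 2],
          [0, 4, 3],
          [0, 0, 5]]
tied = tie_weights(kernel)

patch = [[1, 0, 0],
         [0, 1, -1],
         [0, 0, 0]]

untied_responses = [response(kernel, p) for p in symmetries(patch)]
tied_responses = [response(tied, p) for p in symmetries(patch)]
```

With the tied kernel, every rotated or mirrored copy of the same local position produces the same response, so the network does not have to learn each orientation separately. The untied kernel gives different responses for the same shape in different orientations.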

The last line of the abstract caught my eye:

This success at playing Go indicates high level principles of the game were learned.

That statement is expanded in 4.3 Playing Go:

The results are very promising. Even though the networks are playing using a ‘zero step look ahead’ policy, and using a fraction of the computation time as their opponents, they are still able to play better then GNU Go and take some games away from Fuego. Under these settings GNU Go might play at around a 6-8 kyu ranking and Fuego at 2-3 kyu, which implies the networks are achieving a ranking of approximately 4-5 kyu. For a human player reaching this ranking would normally require years of study. This indicates that sophisticated knowledge of the game was acquired. This also indicates great potential for a Go program that integrates the information produced by such a network.

An interesting limitation is that the network can’t communicate what it has learned. It can only produce an answer for a given situation. In gaming situations that opaqueness isn’t immediately objectionable.

But what if the situation was fire/don’t fire in a combat situation? Would the limitation that the network can only say yes or no, with no way to explain its answer, be acceptable?

Is that any worse than humans inventing explanations for decisions that weren’t the result of any rational thinking process?

Some additional Go resources you may find useful: American Go Association, Go Game Guru (with a printable Go board and stones), GoBase.org (has a Japanese dictionary). Those sites will lead you to many other Go sites.

### Neural Networks and Deep Learning

Wednesday, June 3rd, 2015

From the webpage:

Neural Networks and Deep Learning is a free online book. The book will teach you about:

• Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data
• Deep learning, a powerful set of techniques for learning in neural networks

Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you the core concepts behind neural networks and deep learning.

The book is currently an incomplete beta draft. More chapters will be added over the coming months. For now, you can:

Michael starts off with a task that we all mastered as small children, recognizing handwritten digits. Along the way, you will learn not just the mechanics of how the characters are recognized but why neural networks work the way they do.

Great introductory material to pass along to a friend.

### Deep Learning (MIT Press Book) – Update

Friday, May 22nd, 2015

Deep Learning (MIT Press Book) by Yoshua Bengio, Ian Goodfellow and Aaron Courville.

I last mentioned this book last August and wanted to point out that a new draft appeared on 19/05/2015.

Typos and opportunities for improvement still exist! Now is your chance to help the authors make this a great book!

Enjoy!

### Summer DIY: Combination Lock Cracker

Monday, May 18th, 2015

Amusing account of Samy Kamkar and his hacking history up to and including:

…an open-source 3D-printed robot that can crack a combination lock in just 30 seconds by twiddling the dial all by itself.

Paul includes some insights into opening combination locks.

Good opportunity to learn about 3D printing and fundamentals of combination locks.

If that seems too simple, try safe locks with the 3D-printed robot (adjust for the size/torque required to turn the dial). The robot will turn the dial more consistently than any human hand. Use very sensitive vibration detectors to pick up the mechanical movement of the lock and capture that vibration as a digital file. From knowledge of the lock, you know the turns, directions, etc.

Then use deep learning over several passes on the lock to discover the opening sequence. Need a stand for the robot to isolate its vibrations from the safe housing and for it to reach the combination dial.

Or you can call a locksmith and pay big bucks to open a safe.

The DIY way has you learning some mechanics, a little physics and deep learning.

If you are up for a real challenge, consider the X-09™ lock (NSN #5340-01-498-2758), which is certified to meet FF-L-2740A, “the US Government’s highest security standard for container locks and doors.”

The factory default combination is 50-25-50, so try that first. 😉

### Practical Text Analysis using Deep Learning

Friday, May 1st, 2015

From the post:

Deep Learning has become a household buzzword these days, and I have not stopped hearing about it. In the beginning, I thought it was another rebranding of Neural Network algorithms or a fad that will fade away in a year. But then I read Piotr Teterwak’s blog post on how Deep Learning can be easily utilized for various image analysis tasks. A powerful algorithm that is easy to use? Sounds intriguing. So I decided to give it a closer look. Maybe it will be a new hammer in my toolbox that can later assist me to tackle new sets of interesting problems.

After getting up to speed on Deep Learning (see my recommended reading list at the end of this post), I decided to try Deep Learning on NLP problems. Several years ago, Professor Moshe Koppel gave a talk about how he and his colleagues succeeded in determining an author’s gender by analyzing his or her written texts. They also released a dataset containing 681,288 blog posts. I found it remarkable that one can infer various attributes about an author by analyzing the text, and I’ve been wanting to try it myself. Deep Learning sounded very versatile. So I decided to use it to infer a blogger’s personal attributes, such as age and gender, based on the blog posts.

If you haven’t gotten into deep learning, here’s another opportunity focused on natural language processing. You can follow Michael’s general directions to learn on your own or follow more detailed instructions in his IPython notebook.

Enjoy!

### Deep Space Navigation With Deep Learning

Saturday, April 18th, 2015

Well, that’s not exactly the title but the paper does describe a better than 99% accuracy when compared to human recognition of galaxy images by type. I assume galaxy type is going to be a question on deep space navigation exams in the distant future. 😉

Abstract:

Measuring the morphological parameters of galaxies is a key requirement for studying their formation and evolution. Surveys such as the Sloan Digital Sky Survey (SDSS) have resulted in the availability of very large collections of images, which have permitted population-wide analyses of galaxy morphology. Morphological analysis has traditionally been carried out mostly via visual inspection by trained experts, which is time-consuming and does not scale to large (≳10⁴) numbers of images.

Although attempts have been made to build automated classification systems, these have not been able to achieve the desired level of accuracy. The Galaxy Zoo project successfully applied a crowdsourcing strategy, inviting online users to classify images by answering a series of questions. Unfortunately, even this approach does not scale well enough to keep up with the increasing availability of galaxy images.

We present a deep neural network model for galaxy morphology classification which exploits translational and rotational symmetry. It was developed in the context of the Galaxy Challenge, an international competition to build the best model for morphology classification based on annotated images from the Galaxy Zoo project.

For images with high agreement among the Galaxy Zoo participants, our model is able to reproduce their consensus with near-perfect accuracy (>99%) for most questions. Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation. This approach greatly reduces the experts’ workload without affecting accuracy. The application of these algorithms to larger sets of training data will be critical for analysing results from future surveys such as the LSST.

I particularly like the line:

Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation.
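
The filtering idea in that line can be sketched in a few lines of Python. The image identifiers, class names, probabilities, and the 0.9 threshold below are hypothetical; the paper's actual cut-off and question structure are not given in the abstract.

```python
def triage(predictions, threshold=0.9):
    """Split items into auto-accepted labels and a queue for expert review.

    `predictions` maps an item id to a dict of class -> probability.
    An item is accepted when its top class reaches the threshold.
    """
    accepted, for_experts = {}, []
    for item, probs in predictions.items():
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            accepted[item] = label
        else:
            for_experts.append(item)
    return accepted, for_experts

# Made-up model outputs for three images.
predictions = {
    "img_001": {"spiral": 0.97, "elliptical": 0.02, "irregular": 0.01},
    "img_002": {"spiral": 0.48, "elliptical": 0.45, "irregular": 0.07},
    "img_003": {"spiral": 0.03, "elliptical": 0.95, "irregular": 0.02},
}
accepted, for_experts = triage(predictions)
```

Only the ambiguous image ends up in the expert queue. Raising the threshold sends more work to experts and lowers the risk of accepting a wrong label, so the right value depends on how costly each kind of mistake is.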

It reminds me of a suggestion I made for doing something quite similar, where the uncertainty of crowd classifiers on a particular letter (as in a manuscript) would trigger the forwarding of that portion to an expert for a “definitive” read. You would be surprised at the resistance you can encounter to the suggestion that no special skills are needed to read Greek manuscripts, which are in many cases as clear as when they were written in the early Christian era. Some aren’t and some aspects of them require expertise, but that isn’t to say they all require expertise.

Of course, if successful, such a venture could quite possibly result in papers that cite the images of all extant biblical witnesses and all of the variant texts, as opposed to those that cite a fragment entrusted to them for publication. The difference being whether you want to engage in scholarship, the act of interpreting witnesses or whether you wish to tell the proper time and make a modest noise while doing so.

### Recommending music on Spotify with deep learning

Friday, April 17th, 2015

From the post:

This summer, I’m interning at Spotify in New York City, where I’m working on content-based music recommendation using convolutional neural networks. In this post, I’ll explain my approach and show some preliminary results.

#### Overview

This is going to be a long post, so here’s an overview of the different sections. If you want to skip ahead, just click the section title to go there.

If you are interested in the details of deep learning and recommendation for music, you have arrived at the right place!

Walking through Sander’s post will take some time but it will repay your efforts handsomely.

Not to mention Spotify having the potential to broaden your musical horizons!

I first saw this in a tweet by Mica McPeeters.

Monday, March 30th, 2015

A collection of all the Google DeepMind publications to date.

Twenty-two (22) papers so far!

Enjoy!

### Classifying Plankton With Deep Neural Networks

Monday, March 23rd, 2015

Classifying Plankton With Deep Neural Networks by Sander Dieleman.

From the post:

The National Data Science Bowl, a data science competition where the goal was to classify images of plankton, has just ended. I participated with six other members of my research lab, the Reservoir lab of prof. Joni Dambre at Ghent University in Belgium. Our team finished 1st! In this post, we’ll explain our approach.

The ≋ Deep Sea ≋ team consisted of Aäron van den Oord, Ira Korshunova, Jeroen Burms, Jonas Degrave, Lionel Pigou, Pieter Buteneers and myself. We are all master students, PhD students and post-docs at Ghent University. We decided to participate together because we are all very interested in deep learning, and a collaborative effort to solve a practical problem is a great way to learn.

There were seven of us, so over the course of three months, we were able to try a plethora of different things, including a bunch of recently published techniques, and a couple of novelties. This blog post was written jointly by the team and will cover all the different ingredients that went into our solution in some detail.

## Overview

This blog post is going to be pretty long! Here’s an overview of the different sections. If you want to skip ahead, just click the section title to go there.

## Introduction

### The problem

The goal of the competition was to classify grayscale images of plankton into one of 121 classes. They were created using an underwater camera that is towed through an area. The resulting images are then used by scientists to determine which species occur in this area, and how common they are. There are typically a lot of these images, and they need to be annotated before any conclusions can be drawn. Automating this process as much as possible should save a lot of time!

The images obtained using the camera were already processed by a segmentation algorithm to identify and isolate individual organisms, and then cropped accordingly. Interestingly, the size of an organism in the resulting images is proportional to its actual size, and does not depend on the distance to the lens of the camera. This means that size carries useful information for the task of identifying the species. In practice it also means that all the images in the dataset have different sizes.

Participants were expected to build a model that produces a probability distribution across the 121 classes for each image. These predicted distributions were scored using the log loss (which corresponds to the negative log likelihood or equivalently the cross-entropy loss).

This loss function has some interesting properties: for one, it is extremely sensitive to overconfident predictions. If your model predicts a probability of 1 for a certain class, and it happens to be wrong, the loss becomes infinite. It is also differentiable, which means that models trained with gradient-based methods (such as neural networks) can optimize it directly – it is unnecessary to use a surrogate loss function.

Interestingly, optimizing the log loss is not quite the same as optimizing classification accuracy. Although the two are obviously correlated, we paid special attention to this because it was often the case that significant improvements to the log loss would barely affect the classification accuracy of the models.
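
A small worked example may help with the log loss properties described above. This is a sketch using three classes instead of 121, with made-up predictions; the `eps` clipping parameter is a common practical safeguard added here as an assumption, not something the quoted post describes.

```python
import math

def log_loss(true_labels, predicted, eps=0.0):
    """Mean negative log-probability assigned to the true class.

    `eps` optionally clips probabilities away from 0 to keep the loss finite.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted):
        p = max(probs[label], eps)
        total += -math.log(p) if p > 0 else math.inf
    return total / len(true_labels)

def accuracy(true_labels, predicted):
    """Fraction of items whose highest-probability class is the true class."""
    hits = sum(1 for label, probs in zip(true_labels, predicted)
               if max(range(len(probs)), key=probs.__getitem__) == label)
    return hits / len(true_labels)

true_labels = [0, 1, 2]

# Both models pick the right class every time, with different confidence.
hesitant = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]
confident = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]

# This model assigns probability 1 everywhere and is wrong on the last item.
overconfident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

hesitant_loss = log_loss(true_labels, hesitant)
confident_loss = log_loss(true_labels, confident)
overconfident_loss = log_loss(true_labels, overconfident)
clipped_loss = log_loss(true_labels, overconfident, eps=1e-15)
```

The hesitant and confident models have identical accuracy, yet the confident one has a much lower log loss, which is the gap between the two metrics that the team describes. The overconfident model's single certain-but-wrong prediction makes its unclipped loss infinite.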

This rocks!

Code is coming soon to GitHub!

Certainly of interest to marine scientists but also to anyone in bio-medical imaging.

The problem of too much data and too few experts is a common one.

What I don’t recall seeing are releases of pre-trained classifiers. Is the art developing too quickly for that to be a viable product? Just curious.

I first saw this in a tweet by Angela Zutavern.