Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

February 20, 2015

Deep Learning Track at GTC

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 8:22 pm

Deep Learning Track at GTC

March 17-20, 2015 | San Jose, California

From the webpage:

The Deep Learning Track at GTC features over 40 sessions from industry experts on topics ranging from visual object recognition to the next generation of speech.

Just the deep learning sessions.

Keynote Speakers in Deep Learning track:

Jeff Dean – Google, Senior Fellow

Jen-Hsun Huang – NVIDIA, CEO & Co-Founder

Andrew Ng – Baidu, Chief Scientist

Featured Speakers:

John Canny – UC Berkeley, Professor

Dan Ciresan – IDSIA, Senior Researcher

Rob Fergus – Facebook, Research Scientist

Yangqing Jia – Google, Research Scientist

Ian Lane – Carnegie Mellon University, Assistant Research Professor

Ren Wu – Baidu, Distinguished Scientist

Have you registered yet? If not, why not? 😉

Expecting lots of blog posts covering presentations at the conference.

February 18, 2015

Efficient Estimation of Word Representations in Vector Space

Filed under: Machine Learning — Patrick Durusau @ 3:59 pm

Efficient Estimation of Word Representations in Vector Space by Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean.

Abstract:

We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.

This paper is the technical side of Learning the meaning behind words, where we reported the open sourcing of Google’s word2vec toolkit.
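
If you want to experiment with the architectures the paper describes, the gensim library implements both skip-gram and CBOW. A minimal sketch, assuming a corpus of tokenized sentences (the toy corpus and parameter values are illustrative; older gensim releases use size instead of vector_size):

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
sentences = [
    ["word", "vectors", "capture", "semantic", "similarity"],
    ["similar", "words", "get", "similar", "vectors"],
    ["word", "similarity", "tasks", "measure", "vector", "quality"],
]

# sg=1 selects the skip-gram architecture; sg=0 selects CBOW.
model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=1)

# Nearest neighbors in the learned vector space.
print(model.wv.most_similar("similarity", topn=3))
```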

A must read.

I first saw this in a tweet by onepaperperday.

Dato Updates Machine Learning Platform…

Filed under: Dato,GraphLab,Machine Learning — Patrick Durusau @ 2:51 pm

Dato Updates Machine Learning Platform, Puts Spotlight on Data Engineering Automation, Spark and Hadoop Integrations

From the post:

Today at Strata + HadoopWorld San Jose, Dato (formerly known as GraphLab) announced new updates to its machine learning platform, GraphLab Create, that allow data science teams to wrangle terabytes of data on their laptops at interactive speeds so that they can build intelligent applications faster. With Dato, users leverage machine learning to build prototypes, tune them, deploy in production and even offer them as a predictive service, all in minutes. These are the intelligent applications that provide predictions for a myriad of use cases including recommenders, sentiment analysis, fraud detection, churn prediction and ad targeting.

Continuing with its commitment to the Open Source community, Dato is also announcing the Open Source release of its core engine, including the out of core machine learning (ML)-optimized SFrame and SGraph data structures which make ML tasks blazing fast. Commercial and non-commercial versions of the full GraphLab Create platform are available for download at www.dato.com/download.

New features available in the GraphLab Create platform include:

  • Predictive Service Deployment Enhancements:
    enables easy integrations of Dato predictive services with applications regardless of development environment and allows administrators to view information about deployed models and statistics on requests and latency on a per predictive object basis.
  • Data Science Task Automation:
    a new Data Matching Toolkit allows for automatic tagging of data from a reference dataset and automatic deduplication of lists. In addition, the new Feature Engineering pipeline makes it easy to chain together multiple feature transformations, a vast simplification for the data engineering stage.
  • Open Source Version of GraphLab Create:
    Dato is offering an open-source release of GraphLab Create’s core code. Included in this version is the source for the SFrame and SGraph, along with many machine learning models, such as triangle counting, pagerank and more. Using this code, it is easy to build a new machine learning toolkit or a connector from the Dato SFrame to a data store. The source code can be found on Dato’s GitHub page.
  • New Pricing and Packaging Options:
    updated pricing and packaging include a non-commercial, free offering with the same features as the GraphLab Create commercial version. The free version allows data science enthusiasts to interact with and prototype on a leading machine learning platform. Also available is a new 30-day, no obligation evaluation license of the full-feature, commercial version of Dato’s product line.
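
For a sense of what the SFrame side of the platform looks like in practice, here is a minimal sketch using the GraphLab Create Python API of the time; the file name and column names are invented for illustration:

```python
import graphlab as gl

# Load a (hypothetical) ratings file into an out-of-core SFrame.
ratings = gl.SFrame.read_csv("ratings.csv")

# Train a default recommender on user/item/rating columns.
model = gl.recommender.create(ratings,
                              user_id="user_id",
                              item_id="item_id",
                              target="rating")

# Top-5 recommendations for every user.
print(model.recommend(k=5).head())
```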

Excellent news!

Now if we just had secure hardware to run it on.

On the other hand, it is open source so you can verify there are no backdoors in the software. That is a step in the right direction for security.

February 15, 2015

Frequently updated Machine Learning blogs

Filed under: Machine Learning — Patrick Durusau @ 10:29 am

Frequently updated Machine Learning blogs

From the webpage:

Are you looking for some of the frequently updated Machine Learning blogs to learn what’s happening in the world of Machine Learning and related areas that explore the construction and study of algorithms that can learn from data and make predictions or decisions?

Check out our list.

In the process of searching top frequently updated Machine Learning blogs, we’ve found plenty of Machine Learning blogs on the internet, but shortlisted only those which are active since 2014. If we’ve missed a blog which you think should be included in this list, please let us know.

Here we go…

I count forty-seven (47) blogs listed.

A great starting point if you want to try your hand at crawling blogs on machine learning.

I first saw this in a tweet by Gregory Piatetsky.

February 10, 2015

MS Deep Learning Beats Humans (and MS is modest about it)

Filed under: Artificial Intelligence,Deep Learning,Machine Learning — Patrick Durusau @ 7:51 pm

Microsoft researchers say their newest deep learning system beats humans — and Google

Two stories for the price of one! Microsoft’s deep learning project beats human recognition on a data set and Microsoft is modest about it. 😉

From the post:

The Microsoft creation got a 4.94 percent error rate for the correct classification of images in the 2012 version of the widely recognized ImageNet data set, compared with a 5.1 percent error rate among humans, according to the paper. The challenge involved identifying objects in the images and then correctly selecting the most accurate categories for the images, out of 1,000 options. Categories included “hatchet,” “geyser,” and “microwave.”

[modesty]
“While our algorithm produces a superior result on this particular dataset, this does not indicate that machine vision outperforms human vision on object recognition in general,” they wrote. “On recognizing elementary object categories (i.e., common objects or concepts in daily lives) such as the Pascal VOC task, machines still have obvious errors in cases that are trivial for humans. Nevertheless, we believe that our results show the tremendous potential of machine algorithms to match human-level performance on visual recognition.”

You can grab the paper here.

Hoping that Microsoft sets a trend in reporting breakthroughs in big data and machine learning. Stating the achievement but also its limitations may lead to more accurate reporting of technical news. Not holding my breath but I am hopeful.

I first saw this in a tweet by GPUComputing.

February 4, 2015

All Models of Learning have Flaws

Filed under: Artificial Intelligence,Machine Learning — Patrick Durusau @ 5:55 pm

All Models of Learning have Flaws by John Langford.

From the post:

Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning.

Quite dated (2007) but still a handy chart of what is “right” and “wrong” about machine learning models.

Would be even more useful with smallish data sets that illustrate what is “right” and “wrong” about each model.

Anything you would add or take away?

I first saw this in a tweet by Computer Science.

January 27, 2015

Business Analytics Error: Learn from Uber’s Mistake During the Sydney Terror Attack

Filed under: Algorithms,Business Intelligence,Machine Learning — Patrick Durusau @ 2:17 pm

Business Analytics Error: Learn from Uber’s Mistake During the Sydney Terror Attack by RK Paleru.

From the post:

Recently, as a sad day of terror ended in Sydney, a bad case of Uber’s analytical approach to pricing came to light – an “algorithm based price surge.” Uber’s algorithm driven price surge started overcharging people fleeing the Central Business District (CBD) of Sydney following the terror attack.

I’m not sure the algorithm got it wrong. If you asked me to drive into a potential war zone to ferry strangers out, I suspect a higher fee than normal is to be expected.

The real dilemma for Uber is that not all ground transportation has surge price algorithms. When buses, subways, customary taxis, etc. all have surge price algorithms, the price hikes won’t appear to be abnormal.

One of the consequences of an algorithm/data-driven world is that factors known or unknown to you may be driving the price or service. To say it another way, your “expectations” of system behavior may be at odds with how the system will behave.

The inventory algorithm at my local drugstore thought a recent prescription was too unusual to warrant stocking. My drugstore had to order it from a regional warehouse. Just-in-time inventory I think they call it. That was five (5) days ago. That isn’t “just-in-time” for the customer (me) but that isn’t the goal of most cost/pricing algorithms. Particularly when the customer has little choice about the service.

I first saw this in a tweet by Kirk Borne.

January 26, 2015

Machine Learning Etudes in Astrophysics: Selection Functions for Mock Cluster Catalogs

Filed under: Astroinformatics,Classification,Machine Learning — Patrick Durusau @ 3:24 pm

Machine Learning Etudes in Astrophysics: Selection Functions for Mock Cluster Catalogs by Amir Hajian, Marcelo Alvarez, J. Richard Bond.

Abstract:

Making mock simulated catalogs is an important component of astrophysical data analysis. Selection criteria for observed astronomical objects are often too complicated to be derived from first principles. However the existence of an observed group of objects is a well-suited problem for machine learning classification. In this paper we use one-class classifiers to learn the properties of an observed catalog of clusters of galaxies from ROSAT and to pick clusters from mock simulations that resemble the observed ROSAT catalog. We show how this method can be used to study the cross-correlations of thermal Sunyaev-Zel’dovich signals with number density maps of X-ray selected cluster catalogs. The method reduces the bias due to hand-tuning the selection function and is readily scalable to large catalogs with a high-dimensional space of astrophysical features.

From the introduction:

In many cases the number of unknown parameters is so large that explicit rules for deriving the selection function do not exist. A sample of the objects does exist (the very objects in the observed catalog) however, and the observed sample can be used to express the rules for the selection function. This “learning from examples” is the main idea behind classification algorithms in machine learning. The problem of selection functions can be re-stated in the statistical machine learning language as: given a set of samples, we would like to detect the soft boundary of that set so as to classify new points as belonging to that set or not. (emphasis added)

Does the sentence:

In many cases the number of unknown parameters is so large that explicit rules for deriving the selection function do not exist.

sound like they could be describing people?

I mention this as a reason why you should read broadly in machine learning in particular and IR in general.

What if all the known data about known terrorists, sans all the idle speculation by intelligence analysts, were gathered into a data set? Machine learning on that data set could then be tested against a simulation of potential terrorists, to help avoid the biases of intelligence analysts.

Lest the undeserved fixation on Muslims blind security services to other potential threats, such as governments bent on devouring their own populations.
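
On the technical side, the one-class approach the authors describe maps onto off-the-shelf tools. A minimal sketch with scikit-learn’s OneClassSVM, where the feature matrices are random placeholders for catalog properties such as mass or luminosity:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_observed = rng.random((500, 3))   # placeholder features of observed clusters

# Learn the soft boundary of the observed set.
clf = OneClassSVM(kernel="rbf", nu=0.1, gamma=0.5)
clf.fit(X_observed)

# Keep only mock clusters that fall inside the learned boundary (+1 = inlier).
X_mock = rng.random((10000, 3))
selected = X_mock[clf.predict(X_mock) == 1]
print(len(selected), "mock clusters resemble the observed catalog")
```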

I first saw this in a tweet by Stat.ML.

January 17, 2015

Facebook open sources tools for bigger, faster deep learning models

Filed under: Artificial Intelligence,Deep Learning,Facebook,Machine Learning — Patrick Durusau @ 6:55 pm

Facebook open sources tools for bigger, faster deep learning models by Derrick Harris.

From the post:

Facebook on Friday open sourced a handful of software libraries that it claims will help users build bigger, faster deep learning models than existing tools allow.

The libraries, which Facebook is calling modules, are alternatives for the default ones in a popular machine learning development environment called Torch, and are optimized to run on Nvidia graphics processing units. Among the modules are those designed to rapidly speed up training for large computer vision systems (nearly 24 times, in some cases), to train systems on potentially millions of different classes (e.g., predicting whether a word will appear across a large number of documents, or whether a picture was taken in any city anywhere), and an optimized method for building language models and word embeddings (e.g., knowing how different words are related to each other).

“[T]here is no way you can use anything existing” to achieve some of these results, said Soumith Chintala, an engineer with Facebook Artificial Intelligence Research.

How very awesome! Keeping abreast of the latest releases and papers on deep learning is turning out to be a real chore. Enjoyable but a time sink none the less.

Derrick’s post and the release from Facebook have more details.

Apologies for the “lite” posting today but I have been proofing related specifications where one defines a term and the other uses the term without citing the first specification’s definition or giving its own. Do those mean the same thing? Probably, but users outside the process may or may not realize that, particularly in translation.

I first saw this in a tweet by Kirk Borne.

January 13, 2015

Deep Learning: Methods and Applications

Filed under: Deep Learning,Indexing,Information Retrieval,Machine Learning — Patrick Durusau @ 7:01 pm

Deep Learning: Methods and Applications by Li Deng and Dong Yu. (Li Deng and Dong Yu (2014), “Deep Learning: Methods and Applications”, Foundations and Trends® in Signal Processing: Vol. 7: No. 3–4, pp 197-387. http://dx.doi.org/10.1561/2000000039)

Abstract:

This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen with the following three criteria in mind: (1) expertise or knowledge of the authors; (2) the application areas that have already been transformed by the successful use of deep learning technology, such as speech recognition and computer vision; and (3) the application areas that have the potential to be impacted significantly by deep learning and that have been experiencing research growth, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.

Keywords:

Deep learning, Machine learning, Artificial intelligence, Neural networks, Deep neural networks, Deep stacking networks, Autoencoders, Supervised learning, Unsupervised learning, Hybrid deep networks, Object recognition, Computer vision, Natural language processing, Language models, Multi-task learning, Multi-modal processing

If you are looking for another rich review of the area of deep learning, you have found the right place. Resources, conferences, primary materials, etc. abound.

Don’t be thrown off by the pagination. This is issues 3 and 4 of the periodical Foundations and Trends® in Signal Processing. You are looking at the complete text.

Be sure to read Selected Applications in Information Retrieval (Section 9, pages 308-319), where 9.2 starts with:

Here we discuss the “semantic hashing” approach for the application of deep autoencoders to document indexing and retrieval as published in [159, 314]. It is shown that the hidden variables in the final layer of a DBN not only are easy to infer after using an approximation based on feed-forward propagation, but they also give a better representation of each document, based on the word-count features, than the widely used latent semantic analysis and the traditional TF-IDF approach for information retrieval. Using the compact code produced by deep autoencoders, documents are mapped to memory addresses in such a way that semantically similar text documents are located at nearby addresses to facilitate rapid document retrieval. The mapping from a word-count vector to its compact code is highly efficient, requiring only a matrix multiplication and a subsequent sigmoid function evaluation for each hidden layer in the encoder part of the network.
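
The encoding step that passage describes, one matrix multiplication and one sigmoid per hidden layer, is simple enough to sketch in NumPy. The weights below are random stand-ins for a trained deep autoencoder:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Random stand-ins for a trained 2000-500-128 encoder (2000-word vocabulary).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2000, 500)), np.zeros(500)
W2, b2 = rng.normal(size=(500, 128)), np.zeros(128)

def encode(word_counts):
    """Map a word-count vector to a compact binary code, layer by layer."""
    h = sigmoid(word_counts @ W1 + b1)
    code = sigmoid(h @ W2 + b2)
    return (code > 0.5).astype(int)   # threshold into a semantic hash

doc = rng.poisson(0.01, size=2000)    # toy word-count vector
print(encode(doc))
```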

That is only one of the applications detailed in this work. I do wonder if this will be the approach that breaks the “document” (as in this work for example) model of information retrieval? If I am searching for “deep learning” and “information retrieval,” a search result that returns these pages would be a great improvement over the entire document. (At the user’s option.)

Before the literature on deep learning gets much more out of hand, now would be a good time to start building not only a corpus of the literature but a sub-document level topic map to ideas and motifs as they develop. That would be particularly useful as patents start to appear for applications of deep learning. (Not a volunteer or charitable venture.)

I first saw this in a tweet by StatFact.

January 12, 2015

A Comparison of Two Unsupervised Table Recognition Methods from Digital Scientific Articles

Filed under: Machine Learning,PDF,Tables,Text Mining — Patrick Durusau @ 8:17 pm

A Comparison of Two Unsupervised Table Recognition Methods from Digital Scientific Articles by Stefan Klampfl, Kris Jack, Roman Kern.

Abstract:

In digital scientific articles tables are a common form of presenting information in a structured way. However, the large variability of table layouts and the lack of structural information in digital document formats pose significant challenges for information retrieval and related tasks. In this paper we present two table recognition methods based on unsupervised learning techniques and heuristics which automatically detect both the location and the structure of tables within an article stored as PDF. For both algorithms the table region detection first identifies the bounding boxes of individual tables from a set of labelled text blocks. In the second step, two different tabular structure detection methods extract a rectangular grid of table cells from the set of words contained in these table regions. We evaluate each stage of the algorithms separately and compare performance values on two data sets from different domains. We find that the table recognition performance is in line with state-of-the-art commercial systems and generalises to the non-scientific domain.

Excellent article if you have ever struggled with the endless tables in government documents.

I first saw this in a tweet by Anita de Waard.

Open-Source projects: Computer Security Group at the University of Göttingen, Germany.

Filed under: Cybersecurity,Machine Learning,Malware,Security — Patrick Durusau @ 8:03 pm

Open-Source projects: Computer Security Group at the University of Göttingen, Germany.

I mentioned Joern March 2014 but these other projects may be of interest as well:

Joern: A Robust Tool for Static Code Analysis

Joern is a platform for robust analysis of C/C++ code. It generates code property graphs, a novel graph representation of code that exposes the code’s syntax, control-flow, data-flow and type information. Code property graphs are stored in a Neo4J graph database. This allows code to be mined using search queries formulated in the graph traversal language Gremlin. (Paper1, Paper2, Paper3)

Harry: A Tool for Measuring String Similarity

Harry is a tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some exotic similarity measures. The focus lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein and Jaro-Winkler distance.
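
As a toy illustration of the kind of measure Harry computes (Harry itself is a C tool; this is only a sketch of the concept), here is the textbook dynamic-programming Levenshtein distance:

```python
def levenshtein(a, b):
    """Minimum insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))   # 3
```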

Adagio: Structural Analysis and Detection of Android Malware

Adagio is a collection of Python modules for analyzing and detecting Android malware. These modules allow you to extract labeled call graphs from Android APKs or DEX files and apply an explicit feature map that captures their structural relationships. Additional modules provide classes for designing binary or multiclass classification experiments and applying machine learning for detection of malicious structure. (Paper1, Paper2)

Salad: A Content Anomaly Detector based on n-Grams

Letter Salad, or Salad for short, is an efficient and flexible implementation of the anomaly detection method Anagram. The method uses n-grams (substrings of length n) maintained in a Bloom filter for efficiently detecting anomalies in large sets of string data. Salad extends the original method by supporting n-grams of bytes as well as n-grams of words and tokens. (Paper)

Sally: A Tool for Embedding Strings in Vector Spaces

Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining for analysis of string data. Sally can be applied to several types of string data, such as text documents, DNA sequences or log files, where it can handle common formats such as directories, archives and text files. (Paper)

Malheur: Automatic Analysis of Malware Behavior

Malheur is a tool for the automatic analysis of program behavior recorded from malware. It has been designed to support the regular analysis of malware and the development of detection and defense measures. Malheur allows for identifying novel classes of malware with similar behavior and assigning unknown malware to discovered classes using machine learning. (Paper)

Prisma: Protocol Inspection and State Machine Analysis

Prisma is an R package for processing and analyzing huge text corpora. In combination with the tool Sally the package provides testing-based token selection and replicate-aware, highly tuned non-negative matrix factorization and principal component analysis. Prisma allows for analyzing very big data sets even on desktop machines. (Paper)

Derrick: A Simple Network Stream Recorder

Derrick is a simple tool for recording data streams of TCP and UDP traffic. It shares similarities with other network recorders, such as tcpflow and wireshark, where it is more advanced than the former and clearly inferior to the latter. Derrick has been specifically designed to monitor application-layer communication. In contrast to other tools the application data is logged in a line-based ASCII format. Common UNIX tools, such as grep, sed & awk, can be directly applied.

There are days when malware is a relief from thinking about present and proposed government policies.

I first saw this in a tweet by Kirk Borne.

NASA is using machine learning to predict the characteristics of stars

Filed under: Astroinformatics,Machine Learning — Patrick Durusau @ 7:42 pm

NASA is using machine learning to predict the characteristics of stars by Nick Summers.

From the post:

With so many stars in our galaxy to discover and catalog, NASA is adopting new machine learning techniques to speed up the process. Even now, telescopes around the world are capturing countless images of the night sky, and new projects such as the Large Synoptic Survey Telescope (LSST) will only increase the amount of data available at NASA’s fingertips. To give its analysis a helping hand, the agency has been using some of its prior research and recordings to essentially “teach” computers how to spot patterns in new star data.

NASA’s Jet Propulsion Laboratory started with 9,000 stars and used their individual wavelengths to identify their size, temperature and other basic properties. The data was then cross-referenced with light curve graphs, which measure the brightness of the stars, and fed into NASA’s machines. The combination of the two, combined with some custom algorithms, means that NASA’s computers should be able to make new predictions based on light curves alone. Of course, machine learning isn’t new to NASA, but this latest approach is a little different because it can identify specific star characteristics. Once the LSST is fully operational in 2023, it could reduce the number of astronomers pulling all-nighters.

[Image credit: NASA/JPL-Caltech, Flickr]
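
The workflow described, learning star properties from labelled examples and then predicting from light curves alone, fits a standard supervised pattern. A minimal sketch with scikit-learn, using made-up light-curve summary features (real features would be quantities like amplitude and period):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Made-up features per star: mean brightness, variability amplitude, period.
rng = np.random.default_rng(1)
X = rng.random((9000, 3))
y = rng.integers(0, 4, size=9000)   # toy labels for four star classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```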

Do they have a merit badge in machine learning yet? Thinking that would make a great summer camp project!

Whatever field or hobby you learn machine learning in, the skills can be reused in many others. Good investment.

January 10, 2015

Use Google’s Word2Vec for movie reviews

Filed under: Deep Learning,Machine Learning,Vectors — Patrick Durusau @ 4:33 pm

Use Google’s Word2Vec for movie reviews Kaggle Tutorial.

From the webpage:

In this tutorial competition, we dig a little “deeper” into sentiment analysis. Google’s Word2Vec is a deep-learning inspired method that focuses on the meaning of words. Word2Vec attempts to understand meaning and semantic relationships among words. It works in a way that is similar to deep approaches, such as recurrent neural nets or deep neural nets, but is computationally more efficient. This tutorial focuses on Word2Vec for sentiment analysis.

Sentiment analysis is a challenging subject in machine learning. People express their emotions in language that is often obscured by sarcasm, ambiguity, and plays on words, all of which could be very misleading for both humans and computers. There’s another Kaggle competition for movie review sentiment analysis. In this tutorial we explore how Word2Vec can be applied to a similar problem.

Mark Needham mentions this Kaggle tutorial in Thoughts on Software Development: Python NLTK/Neo4j.

The description also mentions:

Since deep learning is a rapidly evolving field, large amounts of the work has not yet been published, or exists only as academic papers. Part 3 of the tutorial is more exploratory than prescriptive — we experiment with several ways of using Word2Vec rather than giving you a recipe for using the output.

To achieve these goals, we rely on an IMDB sentiment analysis data set, which has 100,000 multi-paragraph movie reviews, both positive and negative.
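
One simple strategy the tutorial explores is representing each review as the average of its word vectors and handing that to an ordinary classifier. A rough, self-contained sketch (the toy reviews and labels are invented):

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Toy tokenized reviews with sentiment labels (1 = positive, 0 = negative).
reviews = [["great", "fun", "film"], ["boring", "awful", "plot"],
           ["great", "plot"], ["awful", "film"]]
labels = [1, 0, 1, 0]

model = Word2Vec(reviews, vector_size=20, min_count=1, seed=0)

def review_vector(tokens):
    """Average the vectors of in-vocabulary words; zeros if none are known."""
    vecs = [model.wv[w] for w in tokens if w in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

X = np.vstack([review_vector(r) for r in reviews])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```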

Movie, book, TV, etc., reviews are fairly common.

Where would you look for a sentiment analysis data set on contemporary U.S. criminal proceedings?

Deep Learning in Neural Networks: An Overview

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 2:01 pm

Deep Learning in Neural Networks: An Overview by Jürgen Schmidhuber.

Abstract:

In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

A godsend for any graduate student working in deep learning! Not only does Jürgen cover recent literature but he also traces the ideas back into history. Fortunately for all of us interested in the history of ideas in computer science, both the LaTeX source, DeepLearning8Oct2014.tex, and the BibTeX file deep.bib are available.

Be forewarned that deep.bib has 2944 entries.

This is what was termed “European” scholarship, scholarship that traces ideas across disciplines and time. As opposed to more common American scholarship in the sciences (both social and otherwise), which has a discipline focus and shorter time point of view. There are exceptions both ways but I point out this difference to urge you to take a broader and longer range view of ideas.

January 9, 2015

Machine Learning (Andrew Ng) – Jan. 19th

Filed under: Computer Science,Education,Machine Learning — Patrick Durusau @ 6:00 pm

Machine Learning (Andrew Ng) – Jan. 19th

From the course page:

Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. In this class, you will learn about the most effective machine learning techniques, and gain practice implementing them and getting them to work for yourself. More importantly, you’ll learn about not only the theoretical underpinnings of learning, but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems. Finally, you’ll learn about some of Silicon Valley’s best practices in innovation as it pertains to machine learning and AI.

This course provides a broad introduction to machine learning, datamining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). The course will also draw from numerous case studies and applications, so that you’ll also learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.

I could have just posted Machine Learning, Andrew Ng and 19 Jan. but there are people who have not heard of this course before. Hard to believe but I have been assured that is in fact the case.

So the prose stuff is for them. Why are you reading this far? Go register for the course!

I have heard rumors the first course had an enrollment of over 100,000! I wonder if this course will break current records?

Enjoy!

January 8, 2015

Simple Pictures That State-of-the-Art AI Still Can’t Recognize

Filed under: Artificial Intelligence,Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 3:58 pm

Simple Pictures That State-of-the-Art AI Still Can’t Recognize by Kyle VanHemert.

I encountered this non-technical summary of Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, which I covered earlier today as Deep Neural Networks are Easily Fooled.

While I am sure you have read the fuller explanation, I wanted to replicate the top 40 images for your consideration:

[Image: the top 40 fooling images]

Select the image to see a larger, readable version.

Enjoy the images and pass the Wired article along to friends.

January 6, 2015

Google Maps For The Genome

Filed under: Deep Learning,Genomics,Machine Learning — Patrick Durusau @ 2:28 pm

This man is trying to build Google Maps for the genome by Daniela Hernandez.

From the post:

The Human Genome Project was supposed to unlock all of life’s secrets. Once we had a genetic roadmap, we’d be able to pinpoint why we got ill and figure out how to fix our maladies.

That didn’t pan out. Ten years and more than $4 billion dollars later, we got the equivalent of a medieval hand-drawn map when what we needed was Google Maps.

“Even though we had the text of the genome, people didn’t know how to interpret it, and that’s really puzzled scientists for the last decade,” said Brendan Frey, a computer scientist and medical researcher at the University of Toronto. “They have no idea what it means.”

For the past decade, Frey has been on a quest to build scientists a sort of genetic step-by-step navigation system for the genome, powered by some of the same artificial-intelligence systems that are now being used by big tech companies like Google, Facebook, Microsoft, IBM and Baidu for auto-tagging images, processing language, and showing consumers more relevant online ads.

Today Frey and his team are unveiling a new artificial intelligence system in the top-tier academic journal Science that’s capable of predicting how mutations in the DNA affect something called gene splicing in humans. That’s important because many genetic diseases–including cancers and spinal muscular atrophy, a leading cause of infant mortality–are the result of gene splicing gone wrong.

“It’s a turning point in the field,” said Terry Sejnowski, a computational neurobiologist at the Salk Institute in San Diego and a long-time machine learning researcher. “It’s bringing to bear a completely new set of techniques, and that’s when you really make advances.”

Those leaps could include better personalized medicine. Imagine you have a rare disease doctors suspect might be genetic but that they’ve never seen before. They could sequence your genome, feed the algorithm your data, and, in theory, it would give doctors insights into what’s gone awry with your genes–maybe even how to fix things.

For now, the system can only detect one minor genetic pathway for diseases, but the platform can be generalized to other areas, says Frey, and his team is already working on that.

I really like the line:

Ten years and more than $4 billion dollars later, we got the equivalent of a medieval hand-drawn map when what we needed was Google Maps.

Daniela gives a high level view of deep learning and its impact on genomic research. There is still much work to be done but it sounds very promising.

I tried to find a non-paywall copy of Frey’s most recent publication in Science but to no avail. After all, the details of such a breakthrough couldn’t possibly interest anyone other than subscribers to Science.

In lieu of the details, I did find an image on the Frey Lab. Probabilistic and Statistical Inference Group, University of Toronto page:

[Image: Frey Lab genomics diagram]

I am very sympathetic to publishers making money. At one time I worked for a publisher; they have to pay for staff and that involves making money. However, hoarding information to which publishers contribute so little isn’t a good model. Leaving public access to one side, specialty publishers have a fragile economic position based on their subscriber base.

An alternative model to managing individual and library subscriptions would be to site license their publications to national governments over the WWW. Their publications would become expected resources in every government library and used by everyone who had an interest in the subject. A stable source of income (governments), becoming part of the expected academic infrastructure, much wider access to a broader audience, with additional revenue from anyone who wanted a print copy.

Sorry, a diversion from the main point, which is an important success story about deep learning.

I first saw this in a tweet by Nikhil Buduma.

Deep Learning in a Nutshell

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 1:51 pm

Deep Learning in a Nutshell by Nikhil Buduma.

From the post:

Deep learning. Neural networks. Backpropagation. Over the past year or two, I’ve heard these buzz words being tossed around a lot, and it’s something that has definitely seized my curiosity recently. Deep learning is an area of active research these days, and if you’ve kept up with the field of computer science, I’m sure you’ve come across at least some of these terms at least once.

Deep learning can be an intimidating concept, but it’s becoming increasingly important these days. Google’s already making huge strides in the space with the Google Brain project and its recent acquisition of the London-based deep learning startup DeepMind. Moreover, deep learning methods are beating out traditional machine learning approaches on virtually every single metric.

So what exactly is deep learning? How does it work? And most importantly, why should you even care?

One of the more accessible introductions to deep learning that I have seen recently.

There are hints of more posts to come on deep learning topics.

Enjoy!

Deep learning Reading List

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 1:36 pm

Deep learning Reading List by J Mohamed Zahoor.

Fifty-two unannotated links but enough to keep you busy for a while. 😉

Update – 7 January 2015

The dataset link Berkeley Segmentation Dataset 500 was broken; it is working now. Thanks Berkeley!

I first saw this in a tweet by Alexander Beck.

January 4, 2015

AdversariaLib

Filed under: Algorithms,Machine Learning,Python — Patrick Durusau @ 5:35 pm

AdversariaLib

Speaking of combat machine learning environments:

AdversariaLib is an open-source python library for the security evaluation of machine learning (ML)-based classifiers under adversarial attacks. It comes with a set of powerful features:

  • Easy-to-use. Running sophisticated experiments is as easy as launching a single script. Experimental settings can be defined through a single setup file.
  • Wide range of supported ML algorithms. All supervised learning algorithms supported by scikit-learn are available, as well as Neural Networks (NNs), by means of our scikit-learn wrapper for FANN. In the current implementation, the library allows for the security evaluation of SVMs with linear, rbf, and polynomial kernels, and NNs with one hidden layer, against evasion attacks.
  • Fast Learning and Evaluation. Thanks to scikit-learn and FANN, all supported ML algorithms are optimized and written in C/C++ language.
  • Built-in attack algorithms. Evasion attacks based on gradient-descent optimization.
  • Extensible. Other attack algorithms can be easily added to the library.
  • Multi-processing. Do you want to further save time? The built-in attack algorithms can run concurrently on multiple processors.

Last, but not least, AdversariaLib is free software, released under the GNU General Public License version 3!

The “full documentation” link on the homepage returns a “no page.” I puzzled over it until I realized that the failing link reads:

http://comsec.diee.unica.it/adversarialib/

and the successful link reads:

https://comsec.diee.unica.it/adversarialib/advlib.html

I have pinged the site owners.

The sourceforge link for the code: http://sourceforge.net/projects/adversarialib/ still works.

The full documentation page notes:

However, learning algorithms typically assume data stationarity: that is, both the data used to train the classifier and the operational data it classifies are sampled from the same (though possibly unknown) distribution. Meanwhile, in adversarial settings such as the above mentioned ones, intelligent and adaptive adversaries may purposely manipulate data (violating stationarity) to exploit existing vulnerabilities of learning algorithms, and to impair the entire system.

Not quite the case of reactive data that changes representations depending upon the source of a query but certainly a move in that direction.

Do you have a data stability assumption?
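
The gradient-descent evasion idea itself is easy to sketch outside the library: nudge a sample along the gradient that lowers the classifier’s score for its true class. A minimal illustration against a logistic-regression model (this is a sketch of the general technique, not AdversariaLib’s own code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two Gaussian blobs: class 0 ("benign") and class 1 ("malicious").
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (200, 2)), rng.normal(1, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
clf = LogisticRegression().fit(X, y)

# Take a malicious point and walk it against the decision gradient.
x = X[200].copy()
w = clf.coef_[0]                      # gradient of the decision function wrt x
for _ in range(100):
    x -= 0.05 * w                     # descend: lower the class-1 score
    if clf.predict([x])[0] == 0:      # evasion succeeded
        break

print("evading point:", x, "classified as:", clf.predict([x])[0])
```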

Linear Algebra for Machine Learning

Filed under: Machine Learning,Mathematics — Patrick Durusau @ 4:55 pm

Linear Algebra for Machine Learning by Jason Brownlee.

From the post:

You do not need to learn linear algebra before you get started in machine learning, but at some time you may wish to dive deeper.

In fact, if there was one area of mathematics I would suggest improving before the others, it would be linear algebra. It will give you the tools to help you with the other areas of mathematics required to understand and build better intuitions for machine learning algorithms.

In this post we take a closer look at linear algebra and why you should make the time to improve your skills and knowledge in linear algebra if you want to get more out of machine learning.

If you already know your way around eigenvectors and SVD decompositions, this post is probably not for you.

Another great collection of resources from Jason!

As usual, a great collection of resources is only the starting point for learning. The next step requires effort from the user. Sorry, wish I had better news. 😉

On the upside though, rather than thinking of it as boring mathematics, imagine how you can manipulate machine learning if you know linear algebra.
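
For instance, two of the workhorses Jason’s resources cover, the eigendecomposition and the SVD, are one NumPy call each, and between them they underpin PCA, spectral methods and low-rank approximations:

```python
import numpy as np

# Eigendecomposition of a symmetric matrix.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)   # [2. 4.]

# Singular value decomposition of an arbitrary matrix.
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])
U, s, Vt = np.linalg.svd(M)
print(s)             # singular values, largest first
```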

Embedding linear algebra in a machine learning book that is written from a battle perspective between different camps could be quite engaging. For that matter, if online, exercises could be part of an e-warfare environment.

Something to think about.

January 3, 2015

Talking Machines

Filed under: Machine Learning — Patrick Durusau @ 4:49 pm

Talking Machines: Human Conversation about Machine Learning. Episode 1: Hello World.

From the webpage:

In the first episode of Talking Machines we meet our hosts, Katherine Gorman (nerd, journalist) and Ryan Adams (nerd, Harvard computer science professor), and explore some of the interviews you’ll be able to hear this season. Today we hear some short clips on big issues, we’ll get technical, but the today is all about introductions.

We start with Kevin Murphy of Google talking about his textbook that has become a standard in the field. Then we turn to Hanna Wallach of Microsoft Research NYC and UMass Amherst and hear about the founding of WiML (Women in Machine Learning). Next we discuss academia’s relationship with business with Max Welling from the University of Amsterdam, program co-chair of the 2013 NIPS conference (Neural Information Processing Systems). Finally, we sit down with three pillars of the field, Yann LeCun, Yoshua Bengio, and Geoff Hinton, to hear about where the field has been and where it might be headed.

If you are trying to attract students into machine learning, this podcast series has a lot of potential. Machine learning isn’t as dry as some texts make it appear. 😉

Episodes are promised every two weeks.

I first saw this in a tweet from Hanna Wallach.

January 2, 2015

H2O World 2014

Filed under: H2O,Machine Learning,R — Patrick Durusau @ 7:11 pm

H2O World 2014

From the H2O homepage:

H2O is for data scientists and application developers who need fast, in-memory scalable machine learning for smarter applications. H2O is an open source parallel processing engine for machine learning. Unlike traditional analytics tools, H2O provides a combination of extraordinary math, a high performance parallel architecture, and unrivaled ease of use.

Videos and docs from two days of presentations on H2O.
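
A minimal sketch of driving H2O from Python; the calls follow more recent h2o releases than the 2014 version these talks cover, and the file and column layout are invented:

```python
import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()   # start or connect to a local H2O cluster

frame = h2o.import_file("churn.csv")             # hypothetical data set
train, test = frame.split_frame(ratios=[0.8])

# Predict the last column from all the others.
model = H2OGradientBoostingEstimator(ntrees=50)
model.train(x=frame.columns[:-1], y=frame.columns[-1], training_frame=train)
print(model.model_performance(test))
```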

I first saw this in Video: H2O Talks by Trevor Hastie and John Chambers by Joseph Rickert.

January 1, 2015

Show and Tell (C-suite version)

Filed under: Algorithms,Deep Learning,Machine Learning — Patrick Durusau @ 5:02 pm

How Google “Translates” Pictures into Words Using Vector Space Mathematics

From the post:

Translating one language into another has always been a difficult task. But in recent years, Google has transformed this process by developing machine translation algorithms that change the nature of cross cultural communications through Google Translate.

Now that company is using the same machine learning technique to translate pictures into words. The result is a system that automatically generates picture captions that accurately describe the content of images. That’s something that will be useful for search engines, for automated publishing and for helping the visually impaired navigate the web and, indeed, the wider world.

One of the best C-suite level explanations I have seen of Show and Tell: A Neural Image Caption Generator.

May be useful to you in obtaining support/funding for similar efforts in your domain.

Take particular note of the decision not to worry overmuch about the meaning of words. I would never make that simplifying assumption. It just runs against the grain for the meaning of the words not to matter. However, I am very glad that Oriol Vinyals and colleagues made that assumption!

That assumption enables the processing of images at a large scale.

I started to write that I would not use such an assumption for more precise translation tasks, say the translation of cuneiform tablets. But as a rough finding aid for untranslated cuneiform or hieroglyphic texts, this could be the very thing. Doesn’t have to be 100% precise or accurate, just enough that the vast archives of ancient materials becomes easier to use.

Is there an analogy for topic maps here? That topic maps need not be final production quality materials when released but can be refined over time by authors, editors and others?

Like Wikipedia but not quite so eclectic and more complete. Imagine a Solr reference manual that inlines or at least links to the most recent presentations and discussions on a particular topic. And incorporates information from such sources into the text.

Is Google offering us “good enough” results with data, expectations that others will refine the data further? Perhaps a value-add economic model where the producer of the “good enough” content has an interest in the further refinement of that data by others?

December 27, 2014

A Common Logic to Seeing Cats and Cosmos

Filed under: Machine Learning — Patrick Durusau @ 5:40 pm

A Common Logic to Seeing Cats and Cosmos by Natalie Wolchover.

From the post:

There may be a universal logic to how physicists, computers and brains tease out important features from among other irrelevant bits of data.

When in 2012 a computer learned to recognize cats in YouTube videos and just last month another correctly captioned a photo of “a group of young people playing a game of Frisbee,” artificial intelligence researchers hailed yet more triumphs in “deep learning,” the wildly successful set of algorithms loosely modeled on the way brains grow sensitive to features of the real world simply through exposure.

Using the latest deep-learning protocols, computer models consisting of networks of artificial neurons are becoming increasingly adept at image, speech and pattern recognition — core technologies in robotic personal assistants, complex data analysis and self-driving cars. But for all their progress training computers to pick out salient features from other, irrelevant bits of data, researchers have never fully understood why the algorithms or biological learning work.

Now, two physicists have shown that one form of deep learning works exactly like one of the most important and ubiquitous mathematical techniques in physics, a procedure for calculating the large-scale behavior of physical systems such as elementary particles, fluids and the cosmos.

The new work, completed by Pankaj Mehta of Boston University and David Schwab of Northwestern University, demonstrates that a statistical technique called “renormalization,” which allows physicists to accurately describe systems without knowing the exact state of all their component parts, also enables the artificial neural networks to categorize data as, say, “a cat” regardless of its color, size or posture in a given video.

“They actually wrote down on paper, with exact proofs, something that people only dreamed existed,” said Ilya Nemenman, a biophysicist at Emory University. “Extracting relevant features in the context of statistical physics and extracting relevant features in the context of deep learning are not just similar words, they are one and the same.”

As for our own remarkable knack for spotting a cat in the bushes, a familiar face in a crowd or indeed any object amid the swirl of color, texture and sound that surrounds us, strong similarities between deep learning and biological learning suggest that the brain may also employ a form of renormalization to make sense of the world.

“Maybe there is some universal logic to how you can pick out relevant features from data,” said Mehta. “I would say this is a hint that maybe something like that exists.”

The finding formalizes what Schwab, Mehta and others saw as a philosophical similarity between physicists’ techniques and the learning procedure behind object or speech recognition. Renormalization is “taking a really complicated system and distilling it down to the fundamental parts,” Schwab said. “And that’s what deep neural networks are trying to do as well. And what brains are trying to do.”

If you weren’t already planning on learning/catching up on deep learning in 2015, this article should tip the balance towards deep learning. Not simply because it appears to be “the” idea for 2015 but because you are likely to be called upon to respond to analysis/conclusions based upon deep learning techniques.

Unlike Stephen Hawking, I don’t fear the rise of artificial intelligence. What I fear is the uncritical acceptance of machine learning results, whether artificial intelligence ever arrives or not.

Critical discussion of deep learning results and techniques is going to require people as informed as the advocates of deep learning on all sides. How can you oppose a policy that is justified by an algorithm considering far more factors than any person and that has no racial prejudice? How can it? It is simply an algorithm.

Saying that a result or algorithm is racist isn’t very scientific. What opposition to the policies of tomorrow will require is detailed analysis of both data and algorithms so as to leave little or no doubt that a racist outcome was an intentional one.

Here’s a concrete example of where greater knowledge allows someone to deceive the general public while claiming to be completely open. In the Michael Brown case, Prosecutor McCulloch claims to have allowed everyone who claimed to have knowledge of the case to testify. Which is true, as far as it went. What he failed to say was that every witness that supported a theory that Darren Wilson was guilty of murdering Michael Brown, had their prior statements presented to the grand jury and were heavily cross-examined by the prosecutors. On the surface fair, just beneath, extremely unfair. But you have to know the domain to see the unfairness.

The same is going to be the case when results of deep learning are presented. How much do you trust the person presenting the results? And the people they trusted with the data and analysis?

The Inductive Biases of Various Machine Learning Algorithms

Filed under: Induction,Machine Learning — Patrick Durusau @ 2:37 pm

The Inductive Biases of Various Machine Learning Algorithms by Laura Diane Hamilton.

From the post:

Every machine learning algorithm with any ability to generalize beyond the training data that it sees has, by definition, some type of inductive bias.

That is, there is some fundamental assumption or set of assumptions that the learner makes about the target function that enables it to generalize beyond the training data.

Below is a chart that shows the inductive biases for various machine learning algorithms:

Inductive reasoning has a checkered history (Hume) but is widely relied upon in machine learning.

Consider this a starter set of biases for classes of machine learning algorithms.
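
A quick way to see inductive bias in action is to fit two learners with different assumptions to the same points and compare what they do outside the training range; a minimal sketch:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Ten points on the line y = 2x + 1.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1

linear = LinearRegression().fit(X, y)     # bias: a single global linear trend
tree = DecisionTreeRegressor().fit(X, y)  # bias: piecewise-constant regions

# Extrapolate well beyond the training data.
X_new = np.array([[20.0]])
print(linear.predict(X_new))   # ~[41.]: extends the trend
print(tree.predict(X_new))     # [19.]: stuck at the last training value
```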

There may be entire monographs on the subject but I haven’t seen a treatment at length on how to manipulate data sets so they take advantage of known biases in the better known machine learning algorithms.

You could take the position that misleading data sets test the robustness of machine learning algorithms and so the principles of their generation and use have the potential to improve machine learning.

That may well be the case but I would be interested in such a treatment so that such manipulation of data could be detected.

Either way, it would be an interesting effort, assuming it doesn’t exist already.

Pointers anyone?

I first saw this in a tweet by Alex Hall.

December 24, 2014

DL4J: Deep Learning for Java

Filed under: Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 9:26 am

DL4J: Deep Learning for Java

From the webpage:

Deeplearning4j is the first commercial-grade, open-source deep-learning library written in Java. It is meant to be used in business environments, rather than as a research tool for extensive data exploration. Deeplearning4j is most helpful in solving distinct problems, like identifying faces, voices, spam or e-commerce fraud.

Deeplearning4j integrates with GPUs and includes a versatile n-dimensional array class. DL4J aims to be cutting-edge plug and play, more convention than configuration. By following its conventions, you get an infinitely scalable deep-learning architecture suitable for Hadoop and other big-data structures. This Java deep-learning library has a domain-specific language for neural networks that serves to turn their multiple knobs.

Deeplearning4j includes a distributed deep-learning framework and a normal deep-learning framework (i.e. it runs on a single thread as well). Training takes place in the cluster, which means it can process massive amounts of data. Nets are trained in parallel via iterative reduce, and they are equally compatible with Java, Scala and Clojure, since they’re written for the JVM.

This open-source, distributed deep-learning framework is made for data input and neural net training at scale, and its output should be highly accurate predictive models.

By following the links at the bottom of each page, you will learn to set up, and train with sample data, several types of deep-learning networks. These include single- and multithread networks, Restricted Boltzmann machines, deep-belief networks, Deep Autoencoders, Recursive Neural Tensor Networks, Convolutional Nets and Stacked Denoising Autoencoders.

For a quick introduction to neural nets, please see our overview.

There are a lot of knobs to turn when you’re training a deep-learning network. We’ve done our best to explain them, so that Deeplearning4j can serve as a DIY tool for Java, Scala and Clojure programmers. If you have questions, please join our Google Group; for premium support, contact us at Skymind. ND4J is the Java scientific computing engine powering our matrix manipulations.

And you thought I write jargon-laden prose. 😉

This looks both exciting (as a technology) and challenging (as in needing accessible documentation).

Are you going to be “…turn[ing] their multiple knobs” over the holidays?

GitHub Repo

Tweets

#deeplearning4j @IRC

Google Group

I first saw this in a tweet by Gregory Piatetsky.

December 23, 2014

Abridged List of Machine Learning Topics

Filed under: Machine Learning — Patrick Durusau @ 11:46 am

Abridged List of Machine Learning Topics

Covers:

  • Computer Vision
  • Deep Learning
  • Ensemble Methods
  • GPU Learning
  • Graphical Models
  • Graphs
  • Hadoop/Spark
  • Hyper-Parameter Optimization
  • Julia
  • Kernel Methods
  • Natural Language Processing
  • Online Learning
  • Optimization
  • Robotics
  • Structured Predictions
  • Visualization

Great resource that lists software and one or two reading references for each area. Not all you will want but a nice way to explore areas unfamiliar to you.

Bookmark and return often.

December 21, 2014

$175K to Identify Plankton

Filed under: Classification,Data Science,Machine Learning — Patrick Durusau @ 10:20 am

Oregon marine researchers offer $175,000 reward for ‘big data’ solution to identifying plankton by Kelly House.

From the post:

The marine scientists at Oregon State University need to catalog tens of millions of plankton photos, and they’re willing to pay good money to anyone willing to do the job.

The university’s Hatfield Marine Science Center on Monday announced the launch of the National Data Science Bowl, a competition that comes with a $175,000 reward for the best “big data” approach to sorting through the photos.

It’s a job that, done by human hands, would take two lifetimes to finish.

Data crunchers have 90 days to complete their task. Authors of the top three algorithms will share the $175,000 purse and Hatfield will gain ownership of their algorithms.

From the competition description:

The 2014/2015 National Data Science Bowl challenges you to create an image classification algorithm to automatically classify plankton species. This challenge is not easy — there are 100 classes of plankton, the images may contain non-plankton organisms and particles, and the plankton can appear in any orientation within three-dimensional space. The winning algorithms will be used by Hatfield Marine Science Center for simpler, faster population assessment. They represent a $1 million in-kind donation by the data science community!
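
As a feel for the task (not a competitive entry), a baseline classifier can treat each image’s raw pixels as a feature vector; the data here are random placeholders for the real plankton images:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data: 1,000 grayscale 32x32 images across 100 plankton classes.
rng = np.random.default_rng(0)
images = rng.random((1000, 32, 32))
labels = rng.integers(0, 100, size=1000)

# Baseline: flatten pixels into feature vectors and fit a forest.
X = images.reshape(len(images), -1)
clf = RandomForestClassifier(n_estimators=200)
print(cross_val_score(clf, X, labels, cv=3).mean())
```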

There is a comprehensive tutorial to get you started and weekly blog posts on the contest.

You may also see this billed as the first National Data Science Bowl.

The contest runs from December 15, 2014 until March 16, 2015.

Competing is free and even if you don’t win the big prize, you will have gained valuable experience from the tutorials and discussions during the contest.

I first saw this in a tweet by Gregory Piatetsky.
