Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

April 7, 2015

Exploring the Unknown Frontier of the Brain

Filed under: Neural Information Processing,Neural Networks,Neuroinformatics,Science — Patrick Durusau @ 1:33 pm

Exploring the Unknown Frontier of the Brain by James L. Olds.

From the post:

To a large degree, your brain is what makes you… you. It controls your thinking, problem solving and voluntary behaviors. At the same time, your brain helps regulate critical aspects of your physiology, such as your heart rate and breathing.

And yet your brain — a nonstop multitasking marvel — runs on only about 20 watts of energy, the same wattage as an energy-saving light bulb.

Still, for the most part, the brain remains an unknown frontier. Neuroscientists don’t yet fully understand how information is processed by the brain of a worm that has several hundred neurons, let alone by the brain of a human that has 80 billion to 100 billion neurons. The chain of events in the brain that generates a thought, behavior or physiological response remains mysterious.

Building on these and other recent innovations, President Barack Obama launched the Brain Research through Advancing Innovative Neurotechnologies Initiative (BRAIN Initiative) in April 2013. Federally funded in 2015 at $200 million, the initiative is a public-private research effort to revolutionize researchers’ understanding of the brain.

James reviews currently funded efforts under the BRAIN Initiative, each of which is pursuing possible ways to explore, model and understand brain activity. Exploration in its purest sense. The researchers don’t know what they will find.

I suspect the leap from not understanding the 302 or so neurons in a worm to understanding the 80 to 100 billion neurons in each person isn’t going to happen anytime soon. Just as well: think of all the papers, conferences and publications along the way!

March 23, 2015

Classifying Plankton With Deep Neural Networks

Filed under: Bioinformatics,Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 3:46 pm

Classifying Plankton With Deep Neural Networks by Sander Dieleman.

From the post:

The National Data Science Bowl, a data science competition where the goal was to classify images of plankton, has just ended. I participated with six other members of my research lab, the Reservoir lab of prof. Joni Dambre at Ghent University in Belgium. Our team finished 1st! In this post, we’ll explain our approach.

The ≋ Deep Sea ≋ team consisted of Aäron van den Oord, Ira Korshunova, Jeroen Burms, Jonas Degrave, Lionel Pigou, Pieter Buteneers and myself. We are all master students, PhD students and post-docs at Ghent University. We decided to participate together because we are all very interested in deep learning, and a collaborative effort to solve a practical problem is a great way to learn.

There were seven of us, so over the course of three months, we were able to try a plethora of different things, including a bunch of recently published techniques, and a couple of novelties. This blog post was written jointly by the team and will cover all the different ingredients that went into our solution in some detail.

Overview

This blog post is going to be pretty long! Here’s an overview of the different sections. If you want to skip ahead, just click the section title to go there.

Introduction

The problem

The goal of the competition was to classify grayscale images of plankton into one of 121 classes. They were created using an underwater camera that is towed through an area. The resulting images are then used by scientists to determine which species occur in this area, and how common they are. There are typically a lot of these images, and they need to be annotated before any conclusions can be drawn. Automating this process as much as possible should save a lot of time!

The images obtained using the camera were already processed by a segmentation algorithm to identify and isolate individual organisms, and then cropped accordingly. Interestingly, the size of an organism in the resulting images is proportional to its actual size, and does not depend on the distance to the lens of the camera. This means that size carries useful information for the task of identifying the species. In practice it also means that all the images in the dataset have different sizes.

Participants were expected to build a model that produces a probability distribution across the 121 classes for each image. These predicted distributions were scored using the log loss (which corresponds to the negative log likelihood or equivalently the cross-entropy loss).

This loss function has some interesting properties: for one, it is extremely sensitive to overconfident predictions. If your model predicts a probability of 1 for a certain class, and it happens to be wrong, the loss becomes infinite. It is also differentiable, which means that models trained with gradient-based methods (such as neural networks) can optimize it directly – it is unnecessary to use a surrogate loss function.

Interestingly, optimizing the log loss is not quite the same as optimizing classification accuracy. Although the two are obviously correlated, we paid special attention to this because it was often the case that significant improvements to the log loss would barely affect the classification accuracy of the models.
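
To see the metric concretely, here is a minimal numpy sketch of the multi-class log loss (my own illustration, not the team’s code); the clipping step is the usual guard against the infinite loss on overconfident mistakes described above:

    import numpy as np

    def log_loss(y_true, y_prob, eps=1e-15):
        """Multi-class log loss (negative log likelihood / cross-entropy).

        y_true: (n_samples,) integer class labels
        y_prob: (n_samples, n_classes) predicted probabilities, rows summing to 1
        """
        y_prob = np.clip(y_prob, eps, 1 - eps)               # avoid log(0) -> infinity
        y_prob = y_prob / y_prob.sum(axis=1, keepdims=True)  # renormalize after clipping
        picked = y_prob[np.arange(len(y_true)), y_true]      # probability of the true class
        return -np.mean(np.log(picked))

    # A confident wrong answer is punished far harder than a cautious one:
    y_true = np.array([0])
    print(log_loss(y_true, np.array([[0.01, 0.99]])))  # ~4.61
    print(log_loss(y_true, np.array([[0.40, 0.60]])))  # ~0.92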

This rocks!

Code is coming soon to Github!

Certainly of interest to marine scientists but also to anyone in bio-medical imaging.

The problem of too much data and too few experts is a common one.

What I don’t recall seeing are releases of pre-trained classifiers. Is the art developing too quickly for that to be a viable product? Just curious.

I first saw this in a tweet by Angela Zutavern.

March 20, 2015

Convolutional Neural Networks for Visual Recognition

Filed under: Deep Learning,Image Recognition,Machine Learning,Neural Networks — Patrick Durusau @ 7:29 pm

Convolutional Neural Networks for Visual Recognition by Fei-Fei Li and Andrej Karpathy.

From the description:

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

Be sure to check out the course notes!

A very nice companion for your DIGITS experiments over the weekend.

I first saw this in a tweet by Lasse.

March 18, 2015

Use The Code Luke!

Filed under: Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 2:41 pm

Hacker’s guide to Neural Networks by Andrej Karpathy.

From the post:

Hi there, I'm a CS PhD student at Stanford. I've worked on Deep Learning for a few years as part of my research and among several of my related pet projects is ConvNetJS – a Javascript library for training Neural Networks. Javascript allows one to nicely visualize what's going on and to play around with the various hyperparameter settings, but I still regularly hear from people who ask for a more thorough treatment of the topic. This article (which I plan to slowly expand out to lengths of a few book chapters) is my humble attempt. It's on web instead of PDF because all books should be, and eventually it will hopefully include animations/demos etc.

My personal experience with Neural Networks is that everything became much clearer when I started ignoring full-page, dense derivations of backpropagation equations and just started writing code. Thus, this tutorial will contain very little math (I don't believe it is necessary and it can sometimes even obfuscate simple concepts). Since my background is in Computer Science and Physics, I will instead develop the topic from what I refer to as hackers's perspective. My exposition will center around code and physical intuitions instead of mathematical derivations. Basically, I will strive to present the algorithms in a way that I wish I had come across when I was starting out.

"…everything became much clearer when I started writing code."

You might be eager to jump right in and learn about Neural Networks, backpropagation, how they can be applied to datasets in practice, etc. But before we get there, I'd like us to first forget about all that. Let's take a step back and understand what is really going on at the core. Lets first talk about real-valued circuits.

I won’t say you don’t need more formal methods as well, but everyone learns in different ways. If doing the code first is better for you, here’s a treatment of deep learning from that perspective.
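
In that spirit, here is a tiny sketch of the “real-valued circuit” idea the guide opens with (my own toy code, not Karpathy’s): one multiply gate, a forward pass, and a backward pass that applies the chain rule.

    def forward_multiply(x, y):
        return x * y

    def backward_multiply(x, y, grad_out):
        # chain rule for a multiply gate: d(x*y)/dx = y, d(x*y)/dy = x
        return grad_out * y, grad_out * x

    x, y = -2.0, 3.0
    out = forward_multiply(x, y)             # -6.0
    dx, dy = backward_multiply(x, y, 1.0)    # gradients of the output w.r.t. x and y

    # nudging the inputs along their gradients increases the output
    step = 0.01
    print(forward_multiply(x + step * dx, y + step * dy))   # a bit above -6.0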

The last comments were approximately four (4) months ago. I am hopeful this work will continue.

March 15, 2015

Distilling the Knowledge in a Neural Network

Filed under: Machine Learning,Neural Networks — Patrick Durusau @ 7:19 pm

Distilling the Knowledge in a Neural Network by Geoffrey Hinton, Oriol Vinyals, Jeff Dean.

Abstract:

A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.
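
The recipe is compact enough to sketch. Here is an illustrative version of the distillation loss (my own reading of the paper, not the authors’ code): the small “student” model is trained to match the large model’s temperature-softened probabilities as well as the hard labels.

    import numpy as np

    def softmax(logits, T=1.0):
        z = logits / T
        z = z - z.max(axis=-1, keepdims=True)          # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, y_true, T=3.0, alpha=0.5):
        """Weighted sum of cross-entropy against the teacher's soft targets at
        temperature T and ordinary cross-entropy against the hard labels."""
        soft_teacher = softmax(teacher_logits, T)
        soft_student = softmax(student_logits, T)
        hard_student = softmax(student_logits)
        soft_ce = -np.mean(np.sum(soft_teacher * np.log(soft_student + 1e-12), axis=-1))
        hard_ce = -np.mean(np.log(hard_student[np.arange(len(y_true)), y_true] + 1e-12))
        # the paper scales the soft term by T^2 so its gradients stay comparable in size
        return alpha * (T ** 2) * soft_ce + (1 - alpha) * hard_ce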

The technique described appears very promising but I suspect the paper’s importance lies in another discovery by its authors:

Many insects have a larval form that is optimized for extracting energy and nutrients from the environment and a completely different adult form that is optimized for the very different requirements of traveling and reproduction. In large-scale machine learning, we typically use very similar models for the training stage and the deployment stage despite their very different requirements: For tasks like speech and object recognition, training must extract structure from very large, highly redundant datasets but it does not need to operate in real time and it can use a huge amount of computation. Deployment to a large number of users, however, has much more stringent requirements on latency and computational resources. The analogy with insects suggests that we should be willing to train very cumbersome models if that makes it easier to extract structure from the data.

The sparse results of machine learning haven’t been due to the difficulty of machine learning but to our limited conceptions of it.

Consider the recent rush of papers and promising results with deep learning. Compare that to years of labor spent on trying to specify rules and logic for machine reasoning. The verdict isn’t in, yet, but I suspect that formal logic is too sparse and pinched to support robust machine reasoning.

Like Google’s Pinball Wizard with Atari games, so long as it wins, does its method matter? What if it isn’t expressible in first-order logic?

It will be very ironic after the years of debate over “logical” entities if computers must become less logical and more like us in order to advance machine reasoning projects.

I first saw this in a tweet by Andrew Beam.

Artificial Neurons and Single-Layer Neural Networks…

Artificial Neurons and Single-Layer Neural Networks – How Machine Learning Algorithms Work Part 1 by Sebastian Raschka.

From the post:

This article offers a brief glimpse of the history and basic concepts of machine learning. We will take a look at the first algorithmically described neural network and the gradient descent algorithm in context of adaptive linear neurons, which will not only introduce the principles of machine learning but also serve as the basis for modern multilayer neural networks in future articles.

Machine learning is one of the hottest and most exciting fields in the modern age of technology. Thanks to machine learning, we enjoy robust email spam filters, convenient text and voice recognition, reliable web search engines, challenging chess players, and, hopefully soon, safe and efficient self-driving cars.

Without any doubt, machine learning has become a big and popular field, and sometimes it may be challenging to see the (random) forest for the (decision) trees. Thus, I thought that it might be worthwhile to explore different machine learning algorithms in more detail by not only discussing the theory but also by implementing them step by step.

To briefly summarize what machine learning is all about: “[Machine learning is the] field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959). Machine learning is about the development and use of algorithms that can recognize patterns in data in order to make decisions based on statistics, probability theory, combinatorics, and optimization.

The first article in this series will introduce perceptrons and the adaline (ADAptive LINear NEuron), which fall into the category of single-layer neural networks. The perceptron is not only the first algorithmically described learning algorithm [1], but it is also very intuitive, easy to implement, and a good entry point to the (re-discovered) modern state-of-the-art machine learning algorithms: Artificial neural networks (or “deep learning” if you like). As we will see later, the adaline is a consequent improvement of the perceptron algorithm and offers a good opportunity to learn about a popular optimization algorithm in machine learning: gradient descent.
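
To make the starting point concrete, a minimal sketch of the perceptron learning rule (my own toy code; Sebastian’s article builds it up step by step):

    import numpy as np

    def train_perceptron(X, y, lr=0.1, epochs=10):
        """X: (n_samples, n_features); y: labels in {-1, +1}."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, target in zip(X, y):
                pred = 1 if np.dot(w, xi) + b >= 0 else -1
                update = lr * (target - pred)   # zero when the sample is already correct
                w += update * xi
                b += update
        return w, b

    # Linearly separable toy data: the class is the sign of the first feature.
    X = np.array([[2.0, 1.0], [1.5, -0.5], [-1.0, 0.5], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    w, b = train_perceptron(X, y)
    print([1 if np.dot(w, x) + b >= 0 else -1 for x in X])   # [1, 1, -1, -1]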

Starting point for what appears to be a great introduction to neural networks.

While you are at Sebastian’s blog, it is very much worthwhile to look around. You will be pleasantly surprised.

March 3, 2015

Understanding Natural Language with Deep Neural Networks Using Torch

Filed under: GPU,Natural Language Processing,Neural Networks — Patrick Durusau @ 7:00 pm

Understanding Natural Language with Deep Neural Networks Using Torch by Soumith Chintala and Wojciech Zaremba.

This is a deeply impressive article and a good introduction to Torch (a scientific computing package with neural network, optimization and other libraries).

In the preliminary materials, the authors illustrate one of the difficulties of natural language processing by machine:

For a machine to understand language, it first has to develop a mental map of words, their meanings and interactions with other words. It needs to build a dictionary of words, and understand where they stand semantically and contextually, compared to other words in their dictionary. To achieve this, each word is mapped to a set of numbers in a high-dimensional space, which are called “word embeddings”. Similar words are close to each other in this number space, and dissimilar words are far apart. Some word embeddings encode mathematical properties such as addition and subtraction (For some examples, see Table 1).

Word embeddings can either be learned in a general-purpose fashion before-hand by reading large amounts of text (like Wikipedia), or specially learned for a particular task (like sentiment analysis). We go into a little more detail on learning word embeddings in a later section.
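
A rough sketch of what “close to each other in this number space” means in practice, using made-up three-dimensional vectors (real embeddings are learned and have hundreds of dimensions):

    import numpy as np

    def cosine(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    emb = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "queen": np.array([0.9, 0.1, 0.8]),
        "man":   np.array([0.5, 0.9, 0.0]),
        "woman": np.array([0.5, 0.1, 0.9]),
    }

    # "king" sits closer to "queen" than to "woman" in this toy space
    print(cosine(emb["king"], emb["queen"]), cosine(emb["king"], emb["woman"]))

    # the additive property: king - man + woman lands nearest to queen
    target = emb["king"] - emb["man"] + emb["woman"]
    print(max(emb, key=lambda w: cosine(emb[w], target)))   # queen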

You can already see the problem, but just to call it out: the language usage in Wikipedia, for example, may or may not match the domain of interest. You could certainly use it as a general case, but it will produce very odd results when the text to be “understood” is in a regional version of a language where common words have meanings other than those you will find in Wikipedia.

Slang is a good example. In the 17th century, “cab” was a term for a brothel. A more recent example: to take a “hit” means something quite different from being struck by a boxer.

“Understanding” natural language with machines is a great leap forward but one should never leap without looking.

January 8, 2015

Simple Pictures That State-of-the-Art AI Still Can’t Recognize

Filed under: Artificial Intelligence,Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 3:58 pm

Simple Pictures That State-of-the-Art AI Still Can’t Recognize by Kyle VanHemert.

I encountered this non-technical summary of Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, which I covered as: Deep Neural Networks are Easily Fooled:… earlier today.

While I am sure you have read the fuller explanation, I wanted to replicate the top 40 images for your consideration:

[Image: grid of the top 40 fooling images from the paper]

Select the image to see a larger, readable version.

Enjoy the images and pass the Wired article along to friends.

December 24, 2014

DL4J: Deep Learning for Java

Filed under: Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 9:26 am

DL4J: Deep Learning for Java

From the webpage:

Deeplearning4j is the first commercial-grade, open-source deep-learning library written in Java. It is meant to be used in business environments, rather than as a research tool for extensive data exploration. Deeplearning4j is most helpful in solving distinct problems, like identifying faces, voices, spam or e-commerce fraud.

Deeplearning4j integrates with GPUs and includes a versatile n-dimensional array class. DL4J aims to be cutting-edge plug and play, more convention than configuration. By following its conventions, you get an infinitely scalable deep-learning architecture suitable for Hadoop and other big-data structures. This Java deep-learning library has a domain-specific language for neural networks that serves to turn their multiple knobs.

Deeplearning4j includes a distributed deep-learning framework and a normal deep-learning framework (i.e. it runs on a single thread as well). Training takes place in the cluster, which means it can process massive amounts of data. Nets are trained in parallel via iterative reduce, and they are equally compatible with Java, Scala and Clojure, since they’re written for the JVM.

This open-source, distributed deep-learning framework is made for data input and neural net training at scale, and its output should be highly accurate predictive models.

By following the links at the bottom of each page, you will learn to set up, and train with sample data, several types of deep-learning networks. These include single- and multithread networks, Restricted Boltzmann machines, deep-belief networks, Deep Autoencoders, Recursive Neural Tensor Networks, Convolutional Nets and Stacked Denoising Autoencoders.

For a quick introduction to neural nets, please see our overview.

There are a lot of knobs to turn when you’re training a deep-learning network. We’ve done our best to explain them, so that Deeplearning4j can serve as a DIY tool for Java, Scala and Clojure programmers. If you have questions, please join our Google Group; for premium support, contact us at Skymind. ND4J is the Java scientific computing engine powering our matrix manipulations.

And you thought I write jargon-laden prose. 😉

This looks both exciting (as a technology) and challenging (as in needing accessible documentation).

Are you going to be “…turn[ing] their multiple knobs” over the holidays?

GitHub Repo

Tweets

#deeplearning4j @IRC

Google Group

I first saw this in a tweet by Gregory Piatetsky.

December 12, 2014

Deep Neural Networks are Easily Fooled:…

Filed under: Deep Learning,Machine Learning,Neural Networks — Patrick Durusau @ 7:47 pm

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images by Anh Nguyen, Jason Yosinski, Jeff Clune.

Abstract:

Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects. Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision.

This is a great paper for weekend reading, even if computer vision isn’t your field. In part because the results were unexpected. Computer science is moving towards being an experimental science, at least in some situations.

Before you read the article, spend a few minutes thinking about how DNNs and human vision differ.

I haven’t run it to ground yet but I wonder if the authors have stumbled upon a way to deceive deep neural networks outside of computer vision applications? If so, does that suggest experiments that could identify ways to deceive other classification algorithms? And how would you detect such means if they were employed? Still confident about your data processing results?
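
For a feel of how little machinery such an attack needs, here is a hedged sketch of the simplest variant: random hill-climbing on noise to maximize the confidence a model assigns to one class. classify_fn is a stand-in for any function returning class probabilities; the paper itself uses evolutionary algorithms and gradient ascent.

    import numpy as np

    def hill_climb_fooling_image(classify_fn, target_class, shape=(28, 28),
                                 iters=5000, sigma=0.1, seed=0):
        """Greedy random search: keep any perturbation that raises the probability
        the model assigns to target_class. classify_fn(image) -> class probabilities."""
        rng = np.random.default_rng(seed)
        img = rng.uniform(0.0, 1.0, size=shape)
        best = classify_fn(img)[target_class]
        for _ in range(iters):
            candidate = np.clip(img + rng.normal(0.0, sigma, size=shape), 0.0, 1.0)
            score = classify_fn(candidate)[target_class]
            if score > best:
                img, best = candidate, score
        return img, best   # often unrecognizable to a human, yet high confidence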

I first saw this in a tweet by Gregory Piatetsky.

October 9, 2014

Intriguing properties of neural networks [Gaming Neural Networks]

Filed under: Data Analysis,Neural Networks — Patrick Durusau @ 4:48 pm

Intriguing properties of neural networks by Christian Szegedy, et al.

Abstract:

Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties.

First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.

Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. Specifically, we find that we can cause the network to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.

Both findings are of interest, but the discovery of “adversarial examples” that can cause a trained network to misclassify images is the more intriguing of the two.

How do you validate a result from a neural network? Possessing the same network and data isn’t going to help if it contains “adversarial examples.” I suppose you could “spot” a misclassification but one assumes a neural network is being used because physical inspection by a person isn’t feasible.

What “adversarial examples” work best against particular neural networks? How to best generate such examples?

How do users of off-the-shelf neural networks guard against “adversarial examples?” (One of those cases where “shrink-wrap” data services may not be a good choice.)
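
As a rough sketch of the attack side (not the authors’ method, which uses a box-constrained optimizer), a small perturbation can be found by repeatedly stepping the input in the direction that increases the classification loss. grad_loss_fn here is a stand-in for whatever framework supplies the gradient of the loss with respect to the input pixels:

    import numpy as np

    def adversarial_perturbation(image, true_label, grad_loss_fn,
                                 step=0.01, iters=10, max_change=0.05):
        """grad_loss_fn(image, label) -> gradient of the loss w.r.t. the input."""
        adv = image.copy()
        for _ in range(iters):
            g = grad_loss_fn(adv, true_label)
            adv = adv + step * np.sign(g)                                # move to increase the loss
            adv = np.clip(adv, image - max_change, image + max_change)  # stay imperceptible
            adv = np.clip(adv, 0.0, 1.0)                                 # stay a valid image
        return adv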

I first saw this in a tweet by Xavier Amatriain.

September 8, 2014

Accelerate Machine Learning with cuDNN Deep Neural Network Library

Filed under: GPU,Neural Networks,NVIDIA — Patrick Durusau @ 10:23 am

Accelerate Machine Learning with the cuDNN Deep Neural Network Library by Larry Brown.

From the post:

Introducing cuDNN

NVIDIA cuDNN is a GPU-accelerated library of primitives for DNNs. It provides tuned implementations of routines that arise frequently in DNN applications, such as:

  • convolution
  • pooling
  • softmax
  • neuron activations, including:
    • Sigmoid
    • Rectified linear (ReLU)
    • Hyperbolic tangent (TANH)

Of course these functions all support the usual forward and backward passes. cuDNN’s convolution routines aim for performance competitive with the fastest GEMM-based (matrix multiply) implementations of such routines while using significantly less memory.

cuDNN features customizable data layouts, supporting flexible dimension ordering, striding and subregions for the 4D tensors used as inputs and outputs to all of its routines. This flexibility allows easy integration into any neural net implementation and avoids the input/output transposition steps sometimes necessary with GEMM-based convolutions.

cuDNN is thread safe, and offers a context-based API that allows for easy multithreading and (optional) interoperability with CUDA streams. This allows the developer to explicitly control the library setup when using multiple host threads and multiple GPUs, and ensure that a particular GPU device is always used in a particular host thread (for example).

cuDNN allows DNN developers to easily harness state-of-the-art performance and focus on their application and the machine learning questions, without having to write custom code. cuDNN works on Windows or Linux OSes, and across the full range of NVIDIA GPUs, from low-power embedded GPUs like Tegra K1 to high-end server GPUs like Tesla K40. When a developer leverages cuDNN, they can rest assured of reliable high performance on current and future NVIDIA GPUs, and benefit from new GPU features and capabilities in the future.
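
If the list of primitives above is unfamiliar, a minimal numpy sketch of the three activations and their backward passes may help (this is only an illustration of the math, not cuDNN’s API):

    import numpy as np

    def sigmoid(x):             return 1.0 / (1.0 + np.exp(-x))
    def sigmoid_backward(y, g): return g * y * (1.0 - y)      # y is the forward output

    def relu(x):                return np.maximum(0.0, x)
    def relu_backward(x, g):    return g * (x > 0)

    def tanh(x):                return np.tanh(x)
    def tanh_backward(y, g):    return g * (1.0 - y ** 2)     # y is the forward output

    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    g = np.ones_like(x)                      # upstream gradient
    print(relu(x), relu_backward(x, g))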

I didn’t quote the background and promotional material on machine learning or deep neural networks (DNN’s), assuming that if you are interested at all, you will read the original post to pick up that material. Attention has been paid to making cuDNN “easy” to use. “Easy” is a relative term but I think you will appreciate the effort.

BTW, cuDNN is free for any purpose but does require a registered CUDA developer account. If you are already a registered CUDA developer, or once you are, see: http://developer.nvidia.com/cuDNN

Caffe, a deep learning framework, has support for cuDNN in its current development branch.

I first saw this in a tweet by Mark Harris.

July 14, 2014

Quoc Le’s Lectures on Deep Learning

Filed under: Machine Learning,Neural Networks — Patrick Durusau @ 1:32 pm

Quoc Le’s Lectures on Deep Learning by Gaurav Trivedi.

From the post:

Dr. Quoc Le from the Google Brain project team (yes, the one that made headlines for creating a cat recognizer) presented a series of lectures at the Machine Learning Summer School (MLSS ’14) in Pittsburgh this week. This is my favorite lecture series from the event till now and I was glad to be able to attend them.

The good news is that the organizers have made available the entire set of video lectures in 4K for you to watch. But since Dr. Le did most of them on the board and did not provide any accompanying slides, I decided to put the contents of the lectures along with the videos here.

I like Gaurav’s “enhanced” version over the straight YouTube version.

I need to go back and look at the cat recognizer. Particularly if I can use it as a filter on a Twitter stream. 😉

I first saw this in Nat Torkington’s Four short links: 14 July 2014.

April 9, 2014

clortex

Filed under: Clojure,Neural Information Processing,Neural Networks,Neuroinformatics — Patrick Durusau @ 7:24 pm

clortex – Clojure Library for Jeff Hawkins’ Hierarchical Temporal Memory

From the webpage:

Hierarchical Temporal Memory (HTM) is a theory of the neocortex developed by Jeff Hawkins in the early-mid 2000’s. HTM explains the working of the neocortex as a hierarchy of regions, each of which performs a similar algorithm. The algorithm performed in each region is known in the theory as the Cortical Learning Algorithm (CLA).

Clortex is a reimagining and reimplementation of the Numenta Platform for Intelligent Computing (NuPIC), which is also an Open Source project released by Grok Solutions (formerly Numenta), the company founded by Jeff to make his theories a practical and commercial reality. NuPIC is a mature, excellent and useful software platform, with a vibrant community, so please join us at Numenta.org.

Warning: pre-alpha software. This project is only beginning, and everything you see here will eventually be thrown away as we develop better ways to do things. The design and the APIs are subject to drastic change without a moment’s notice.

Clortex is Open Source software, released under the GPL Version 3 (see the end of the README). You are free to use, copy, modify, and redistribute this software according to the terms of that license. For commercial use of the algorithms used in Clortex, please contact Grok Solutions, where they’ll be happy to discuss commercial licensing.

An interesting project both in terms of learning theory but also for the requirements for the software implementing the theory.

The first two requirements capture the main points:

2.1 Directly Analogous to HTM/CLA Theory

In order to be a platform for demonstration, exploration and experimentation of Jeff Hawkins’ theories, the system must at all levels of relevant detail match the theory directly (ie 1:1). Any optimisations introduced may only occur following an effectively mathematical proof that this correspondence is maintained under the change.

2.2 Transparently Understandable Implementation in Source Code

All source code must at all times be readable by a non-developer. This can only be achieved if a person familiar with the theory and the models (but not a trained programmer) can read any part of the source code and understand precisely what it is doing and how it is implementing the algorithms.

This requirement is again deliberately very stringent, and requires the utmost discipline on the part of the developers of the software. Again, there are several benefits to this requirement.

Firstly, the extreme constraint forces the programmer to work in the model of the domain rather than in the model of the software. This constraint, by being adhered to over the lifecycle of the project, will ensure that the only complexity introduced in the software comes solely from the domain. Any other complexity introduced by the design or programming is known as incidental complexity and is the cause of most problems in software.

Secondly, this constraint provides a mechanism for verifying the first requirement. Any expert in the theory must be able to inspect the code for an aspect of the system and verify that it is transparently analogous to the theory.

Despite my misgivings about choosing the domain in which you stand, I found it interesting that the project recognizes that the domain of its theory and the domain of the software implementing that theory are separate and distinct.

How closely two distinct domains can be mapped one to the other should be an interesting exercise.

BTW, some other resources you will find helpful:

NuPIC: Numenta Platform for Intelligent Computing

Cortical Learning Algorithm (CLA) white paper in eight languages.

Real Machine Intelligence with Clortex and NuPIC (book)

March 26, 2014

Deep Belief in Javascript

Filed under: Image Recognition,Image Understanding,Javascript,Neural Networks,WebGL — Patrick Durusau @ 1:34 pm

Deep Belief in Javascript

From the webpage:

It’s an implementation of the Krizhevsky convolutional neural network architecture for object recognition in images, running entirely in the browser using Javascript and WebGL!

I built it so people can easily experiment with a classic deep belief approach to image recognition themselves, to understand both its limitations and its power, and to demonstrate that the algorithms are usable even in very restricted client-side environments like web browsers.

A very impressive demonstration of the power of Javascript, to say nothing of neural networks.

You can submit your own images for “recognition.”

I first saw this in Nat Torkington’s Four short links: 24 March 2014.

July 9, 2013

…Recursive Neural Networks

Filed under: Natural Language Processing,Neural Networks — Patrick Durusau @ 1:37 pm

Parsing Natural Scenes and Natural Language with Recursive Neural Networks by Richard Socher; Cliff Chiung-Yu Lin; Andrew Ng; and Chris Manning.

Description:

Recursive structure is commonly found in the inputs of different modalities such as natural scene images or natural language sentences. Discovering this recursive structure helps us to not only identify the units that an image or sentence contains but also how they interact to form a whole. We introduce a max-margin structure prediction architecture based on recursive neural networks that can successfully recover such structure both in complex scene images as well as sentences. The same algorithm can be used both to provide a competitive syntactic parser for natural language sentences from the Penn Treebank and to outperform alternative approaches for semantic scene segmentation, annotation and classification. For segmentation and annotation our algorithm obtains a new level of state-of-the-art performance on the Stanford background dataset (78.1%). The features from the image parse tree outperform Gist descriptors for scene classification by 4%.

Video of Richard Socher’s presentation at ICML 2011.

PDF of the paper: http://nlp.stanford.edu/pubs/SocherLinNgManning_ICML2011.pdf

According to one popular search engine the paper has 51 citations (as of today).

What caught my attention was the mapping of phrases into vector spaces, which makes it possible to calculate nearest neighbors on phrases.

Both for syntactic and semantic similarity.

If you need more than a Boolean test for similarity (Yes/No), then you are likely to be interested in this work.
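
A hedged sketch of what “nearest neighbors on phrases” looks like once phrases have been mapped to vectors (toy vectors below; the paper learns them with a recursive neural network):

    import numpy as np

    def cosine(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    def nearest_phrases(query, phrase_vecs, k=2):
        """Rank the other phrases by cosine similarity to the query phrase."""
        q = phrase_vecs[query]
        others = [p for p in phrase_vecs if p != query]
        return sorted(others, key=lambda p: cosine(phrase_vecs[p], q), reverse=True)[:k]

    phrase_vecs = {
        "the cat sat":       np.array([0.9, 0.1, 0.2]),
        "a dog ran":         np.array([0.8, 0.2, 0.1]),
        "stock prices fell": np.array([0.1, 0.9, 0.7]),
    }
    print(nearest_phrases("the cat sat", phrase_vecs))   # 'a dog ran' ranks first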

Later work by Socher at his homepage.

April 7, 2013

Advances in Neural Information Processing Systems (NIPS)

Filed under: Decision Making,Inference,Machine Learning,Neural Networks,Neuroinformatics — Patrick Durusau @ 5:47 am

Advances in Neural Information Processing Systems (NIPS)

From the homepage:

The Neural Information Processing Systems (NIPS) Foundation is a non-profit corporation whose purpose is to foster the exchange of research on neural information processing systems in their biological, technological, mathematical, and theoretical aspects. Neural information processing is a field which benefits from a combined view of biological, physical, mathematical, and computational sciences.

Links to videos from NIPS 2012 meetings are featured on the homepage. The topics are as wide ranging as the foundation’s description.

A tweet from Chris Diehl, wondering what to do with “old hardbound NIPS proceedings (NIPS 11),” led me to: Advances in Neural Information Processing Systems (NIPS) [Online Papers], which has the papers from 1987 to 2012 by volume and a search interface to the same.

Quite a remarkable collection just from a casual skim of some of the volumes.

Unless you need to fill bookshelf space, I suggest you bookmark the NIPS Online Papers.

November 23, 2012

Course on Information Theory, Pattern Recognition, and Neural Networks

Filed under: CS Lectures,Information Theory,Neural Networks,Pattern Recognition — Patrick Durusau @ 11:27 am

Course on Information Theory, Pattern Recognition, and Neural Networks by David MacKay.

From the description:

A series of sixteen lectures covering the core of the book “Information Theory, Inference, and Learning Algorithms (Cambridge University Press, 2003)” which can be bought at Amazon, and is available free online. A subset of these lectures used to constitute a Part III Physics course at the University of Cambridge. The high-resolution videos and all other course material can be downloaded from the Cambridge course website.

Excellent lectures on information theory, the probability that a message sent is the one received.

Makes me wonder if there is a similar probability theory for the semantics of a message sent being the semantics of the message as received?

July 27, 2012

Information Theory, Pattern Recognition, and Neural Networks

Filed under: Inference,Information Theory,Neural Networks,Pattern Recognition — Patrick Durusau @ 11:13 am

Information Theory, Pattern Recognition, and Neural Networks by David MacKay.

David MacKay’s lectures with slides on information theory, inference and neural networks. Spring/Summer of 2012.

Just in time for the weekend!

I saw this in Christophe Lalanne’s Bag of Tweets for July 2012.

June 4, 2012

Predictive Analytics: NeuralNet, Bayesian, SVM, KNN [part 4]

Filed under: Bayesian Data Analysis,Neural Networks,Support Vector Machines — Patrick Durusau @ 4:29 pm

Predictive Analytics: NeuralNet, Bayesian, SVM, KNN by Ricky Ho.

From the post:

Continuing from my previous blog in walking down the list of Machine Learning techniques. In this post, we’ll be covering Neural Network, Support Vector Machine, Naive Bayes and Nearest Neighbor. Again, we’ll be using the same iris data set that we prepared in the last blog.

Ricky continues his march through machine learning techniques. This post promises one more to go.
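
If you would like to cover the same ground in Python, here is a minimal scikit-learn sketch of the four techniques on the iris data (my own, not Ricky’s code):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    models = {
        "neural net":  MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
        "svm":         SVC(),
        "naive bayes": GaussianNB(),
        "knn":         KNeighborsClassifier(n_neighbors=5),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        print(name, model.score(X_test, y_test))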

May 9, 2012

Structural Abstractions in Brains and Graphs

Filed under: Graphs,Neural Networks,Neuroinformatics — Patrick Durusau @ 10:31 am

Structural Abstractions in Brains and Graphs.

Marko Rodriguez compares the brain to a graph, saying (in part):

A graph database is a software system that persists and represents data as a collection of vertices (i.e. nodes, dots) connected to one another by a collection of edges (i.e. links, lines). These databases are optimized for executing a type of process known as a graph traversal. At various levels of abstraction, both the structure and function of a graph yield a striking similarity to neural systems such as the human brain. It is posited that as graph systems scale to encompass more heterogenous data, a multi-level structural understanding can help facilitate the study of graphs and the engineering of graph systems. Finally, neuroscience may foster an appreciation and understanding of the various structural abstractions that exist within the graph.
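
As a toy sketch of the two pieces the quote leans on, here is a vertex/edge structure and a traversal over it (plain Python, not a graph database):

    from collections import deque

    # adjacency-list graph: vertex -> set of neighboring vertices (its edges)
    graph = {
        "a": {"b", "c"},
        "b": {"d"},
        "c": {"d"},
        "d": set(),
    }

    def traverse(graph, start):
        """Breadth-first traversal: visit vertices in order of hops from start."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for n in graph[v]:
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        return order

    print(traverse(graph, "a"))   # ['a', 'b', 'c', 'd'] (neighbor order may vary)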

It is a very suggestive post for thinking about graphs and I commend it to you for reading, close reading.

May 2, 2012

Natural Language Processing (almost) from Scratch

Filed under: Artificial Intelligence,Natural Language Processing,Neural Networks,SENNA — Patrick Durusau @ 2:18 pm

Natural Language Processing (almost) from Scratch by Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa.

Abstract:

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.

In the introduction the authors remark:

The overwhelming majority of these state-of-the-art systems address a benchmark task by applying linear statistical models to ad-hoc features. In other words, the researchers themselves discover intermediate representations by engineering task-specific features. These features are often derived from the output of preexisting systems, leading to complex runtime dependencies. This approach is effective because researchers leverage a large body of linguistic knowledge. On the other hand, there is a great temptation to optimize the performance of a system for a specific benchmark. Although such performance improvements can be very useful in practice, they teach us little about the means to progress toward the broader goals of natural language understanding and the elusive goals of Artificial Intelligence.

I am not an AI enthusiast but I agree that pre-judging linguistic behavior (based on our own) in a data set will find no more (or less) linguistic behavior than our judgment allows. Reliance on the research of others just adds more opinions to our own. Have you ever wondered on what basis we accept the judgments of others?

A very deep and annotated dive into NLP approaches (the authors’ and others’) with pointers to implementations, data sets and literature.

In case you are interested, the source code is available at: SENNA (Semantic/syntactic Extraction using a Neural Network Architecture)

February 24, 2012

A Well-Woven Study of Graphs, Brains, and Gremlins

Filed under: Graphs,Gremlin,Neural Networks,Neuroinformatics — Patrick Durusau @ 4:53 pm

A Well-Woven Study of Graphs, Brains, and Gremlins by Marko Rodriguez.

From the post:

What do graphs and brains have in common? First, they both share a relatively similar structure: Vertices/neurons are connected to each other by edges/axons. Second, they both share a similar process: traversers/action potentials propagate to effect some computation that is a function of the topology of the structure. If there exists a mapping between two domains, then it is possible to apply the processes of one domain (the brain) to the structure of the other (the graph). The purpose of this post is to explore the application of neural algorithms to graph systems.
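
Here is a rough sketch of that mapping in miniature: activation treated as a traverser that propagates from vertex to vertex along weighted edges, the way potentials propagate along axons (toy code, not Marko’s Gremlin examples):

    # weighted adjacency map: vertex -> {neighbor: edge weight}
    graph = {
        "v1": {"v2": 0.8, "v3": 0.3},
        "v2": {"v3": 0.5},
        "v3": {"v1": 0.2},
    }

    def spread(graph, activation, steps=3, decay=0.5):
        """At each step every vertex passes a decayed share of its activation
        along its outgoing edges."""
        for _ in range(steps):
            nxt = {v: 0.0 for v in graph}
            for v, a in activation.items():
                for n, w in graph.get(v, {}).items():
                    nxt[n] += a * w * decay
            activation = nxt
        return activation

    print(spread(graph, {"v1": 1.0, "v2": 0.0, "v3": 0.0}))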

Entertaining and informative post by Marko Rodriguez comparing graphs, brains and the graph query language Gremlin.

I agree with Marko on the potential of graphs but am less certain than I read him to be on how well we understand the brain. Both the brain and graphs have many dark areas yet to be explored. As we shine new light on one place, more unknown places are just beyond the reach of our light.

November 28, 2011

Interesting papers coming up at NIPS’11

Filed under: Biomedical,Conferences,Neural Networks — Patrick Durusau @ 7:13 pm

Interesting papers coming up at NIPS’11

Yaroslav Bulatov has tracked down papers that have been accepted for NIPS’11. Not abstracts or summaries but the actual papers.

Well worth a visit to take advantage of his efforts.

While looking at the NIPS’11 site (will post that tomorrow) I ran across a paper on a proposal for a “…array/matrix/n-dimensional base object implementations for GPUs.” Will post that tomorrow as well.

November 1, 2011

Natural Language Processing from Scratch

Filed under: Natural Language Processing,Neural Networks — Patrick Durusau @ 3:32 pm

Natural Language Processing from Scratch

From the post:

Ronan's masterpiece, "Natural Language Processing (Almost) from Scratch", has been published in JMLR. This paper describes how to use a unified neural network architecture to solve a collection of natural language processing tasks with near state-of-the-art accuracies and ridiculously fast processing speed. A couple thousand lines of C code processes English sentences at more than 10,000 words per second and outputs part-of-speech tags, named entity tags, chunk boundaries, semantic role labeling tags, and, in the latest version, syntactic parse trees. Download SENNA!

This looks very cool! Check out the paper along with the software!

October 24, 2011

Fast Deep/Recurrent Nets for AGI Vision

Filed under: Artificial Intelligence,Neural Networks,Pattern Recognition — Patrick Durusau @ 6:43 pm

Fast Deep/Recurrent Nets for AGI Vision

Jürgen Schmidhuber at AGI-2011 delivers a deeply amusing presentation promoting neural networks, particularly deep/recurrent networks pioneered by his lab.

The jargon falls fast and furious so you probably want to visit his homepage for pointers to more information.

A wealth of information awaits! Suggestions on what looks the most promising for assisted topic map authoring welcome!

July 25, 2011

Interesting Neural Network Papers at ICML 2011

Filed under: Machine Learning,Neural Networks — Patrick Durusau @ 6:39 pm

Interesting Neural Network Papers at ICML 2011 by Richard Socher.

Brief comments on eight (8) papers and the ICML 2011 conference.

Highly recommended, particularly if you are interested in neural networks and/or machine learning in connection with your topic maps.

The conference website: The 28th International Conference on Machine Learning, has pointers to the complete proceedings as well as videos of all Session A talks.

Kudos to the conference and its organizers for making materials from the conference available!

February 17, 2011

Encog Java and DotNet Neural Network Framework

Filed under: .Net,Encog,Java,Machine Learning,Neural Networks,Silverlight — Patrick Durusau @ 6:56 am

Encog Java and DotNet Neural Network Framework

From the website:

Encog is an advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation. Encog can also make use of a GPU to further speed processing time. A GUI based workbench is also provided to help model and train neural networks. Encog has been in active development since 2008.

Encog is available for Java, .Net and Silverlight.

An important project for at least two reasons.

First, the obvious applicability to the creation of topic maps using machine learning techniques.

Second, it demonstrates that supporting Java, .Net and Silverlight, isn’t, you know, all that weird.

The world is changing and becoming somewhat more interoperable.

Topic maps has a role to play in that process, both in terms of semantic interoperability of the infrastructure as well as the data it contains.

February 3, 2011

Twenty-Fourth Annual Conference on Neural Information Processing Systems (NIPS) 2010

Filed under: Biomedical,Conferences,Neural Networks — Patrick Durusau @ 3:18 pm

Twenty-Fourth Annual Conference on Neural Information Processing Systems (NIPS) 2010

Another treasure trove of conference presentations, tutorials and other materials of interest to anyone working on information systems.

From the website:

You are invited to participate in the Twenty-Fourth Annual Conference on Neural Information Processing Systems, which is the premier scientific meeting on Neural Computation.

A one-day Tutorial Program offered a choice of six two-hour tutorials by leading scientists. The topics span a wide range of subjects including Neuroscience, Learning Algorithms and Theory, Bioinformatics, Image Processing, and Data Mining.

The NIPS Conference featured a single track program, with contributions from a large number of intellectual communities. Presentation topics include: Algorithms and Architectures; Applications; Brain Imaging; Cognitive Science and Artificial Intelligence; Control and Reinforcement Learning; Emerging Technologies; Learning Theory; Neuroscience; Speech and Signal Processing; and Visual Processing.

There were two Posner Lectures named in honor of Ed Posner who founded NIPS. Ed worked on communications and information theory at Caltech and was an early pioneer in neural networks. He organized the first NIPS conference and workshop in Denver in 1989 and incorporated the NIPS Foundation in 1992. He was an inspiring teacher and an effective leader. His untimely death in a bicycle accident in 1993 was a great loss to our community. Posner Lecturers were Josh Tenenbaum and Michael Jordan.

The Poster Sessions offered high-quality posters and an opportunity for researchers to share their work and exchange ideas in a collegial setting. The majority of contributions accepted at NIPS were presented as posters.

The Demonstrations enabled researchers to highlight scientific advances, systems, and technologies in ways that go beyond conventional poster presentations. It provided a unique forum for demonstrating advanced technologies — both hardware and software — and fostering the direct exchange of knowledge.

October 21, 2010

A Survey of Genetics-based Machine Learning

Filed under: Evoluntionary,Learning Classifier,Machine Learning,Neural Networks — Patrick Durusau @ 5:15 am

A Survey of Genetics-based Machine Learning by Tim Kovacs.

Abstract:

This is a survey of the field of Genetics-based Machine Learning (GBML): the application of evolutionary algorithms to machine learning. We assume readers are familiar with evolutionary algorithms and their application to optimisation problems, but not necessarily with machine learning. We briefly outline the scope of machine learning, introduce the more specific area of supervised learning, contrast it with optimisation and present arguments for and against GBML. Next we introduce a framework for GBML which includes ways of classifying GBML algorithms and a discussion of the interaction between learning and evolution. We then review the following areas with emphasis on their evolutionary aspects: GBML for sub-problems of learning, genetic programming, evolving ensembles, evolving neural networks, learning classifier systems, and genetic fuzzy systems.

The author’s preprint has 322 references, plus slides and bibliographies in BibTeX.

If you are interested in augmented topic map authoring using GBML, this would be a good starting place.

Questions:

  1. Pick 3 subject areas. What arguments would you make in favor of GBML for augmenting authoring of a topic map for those subject areas?
  2. Same subject areas, but what arguments would you make against the use of GBML for augmenting authoring of a topic map for those subject areas?
  3. Design an experiment to test one of your arguments for and against GBML. (project, use of the literature encouraged)
  4. Convert the BibTeX formatted bibliographies into a topic map. (project)
