Archive for the ‘Artificial Intelligence’ Category

Notable presentations at Technion TCE conference 2013: RevMiner & Boom

Sunday, June 2nd, 2013

Notable presentations at Technion TCE conference 2013: RevMiner & Boom by Danny Bickson.

Danny has uncovered two papers to start your week:

http://turing.cs.washington.edu/papers/uist12-huang.pdf (RevMiner)

http://turing.cs.washington.edu/papers/kdd12-ritter.pdf (Twitter data mining)

Danny also describes Boom, for which I found this YouTube video:

See Danny’s post for more comments, etc.

Introduction to Artificial Intelligence (Berkeley CS188.1x)

Wednesday, May 22nd, 2013

Introduction to Artificial Intelligence (Berkeley CS188.1x)

The schedule for CS188.2x hasn’t been announced yet.

In the meantime, you can register for CS188.1x and peruse the videos, exercises, etc. while you wait for the second part of the course.

From the description:

CS188.1x is a new online adaptation of the first half of UC Berkeley’s CS188: Introduction to Artificial Intelligence. The on-campus version of this upper division computer science course draws about 600 Berkeley students each year.

Artificial intelligence is already all around you, from web search to video games. AI methods plan your driving directions, filter your spam, and focus your cameras on faces. AI lets you guide your phone with your voice and read foreign newspapers in English. Beyond today’s applications, AI is at the core of many new technologies that will shape our future. From self-driving cars to household robots, advancements in AI help transform science fiction into real systems.

CS188.1x focuses on Behavior from Computation. It will introduce the basic ideas and techniques underlying the design of intelligent computer systems. A specific emphasis will be on the statistical and decision–theoretic modeling paradigm. By the end of this course, you will have built autonomous agents that efficiently make decisions in stochastic and in adversarial settings. CS188.2x (to follow CS188.1x, precise date to be determined) will cover Reasoning and Learning. With this additional machinery your agents will be able to draw inferences in uncertain environments and optimize actions for arbitrary reward structures. Your machine learning algorithms will classify handwritten digits and photographs. The techniques you learn in CS188x apply to a wide variety of artificial intelligence problems and will serve as the foundation for further study in any application area you choose to pursue.

Linguists Circle the Wagons, or Disagreement != Danger

Thursday, May 16th, 2013

Pullum’s NLP Lament: More Sleight of Hand Than Fact by Christopher Phipps.

From the post:

My first reading of both of Pullum’s recent NLP posts (one and two) interpreted them to be hostile, an attack on a whole field (see my first response here). Upon closer reading, I see Pullum chooses his words carefully and it is less of an attack and more of a lament. He laments that the high-minded goals of early NLP (to create machines that process language like humans do) have not been reached, and more to the point, that commercial pressures have distracted the field from pursuing those original goals, hence they are now neglected. And he’s right about this to some extent.

But, he’s also taking the commonly used term “natural language processing” and insisting that it NOT refer to what 99% of people who use the term use it for, but rather only a very narrow interpretation consisting of something like “computer systems that mimic human language processing.” This is fundamentally unfair.

In the 1980s I was convinced that computers would soon be able to simulate the basics of what (I hope) you are doing right now: processing sentences and determining their meanings.

I feel Pullum is moving the goal posts on us when he says “there is, to my knowledge, no available system for unaided machine answering of free-form questions via general syntactic and semantic analysis” [my emphasis]. Pullum’s agenda appears to be to create a straw-man NLP world where NLP techniques are only admirable if they mimic human processing. And this is unfair for two reasons.

If there is unfairness in this discussion, it is the insistence by Christopher Phipps (and others) that Pullum has invented “…a straw-man NLP world where NLP techniques are only admirable if they mimic human processing.”

On the contrary, it was 1949 when Warren Weaver first proposed computers as the solution to world-wide translation problems. Weaver’s was not the only optimistic projection of language processing by computers. Those have continued up to and including the Semantic Web.

Yes, NLP practitioners such as Christopher Phipps use NLP in a more precise sense than Pullum. And NLP as defined by Phipps has too many achievements to easily list.

Neither one of those statements takes anything away from Pullum’s point that Google found a “sweet spot” between machine processing and human intelligence for search purposes.

What other insights Pullum has to offer may be obscured by the “…circle the wagons…” attitude from linguists.

Disagreement != Danger.

Deep learning made easy

Friday, May 3rd, 2013

Deep learning made easy by Zygmunt Zając.

From the post:

As usual, there’s an interesting competition at Kaggle: The Black Box. It’s connected to ICML 2013 Workshop on Challenges in Representation Learning, held by the deep learning guys from Montreal.

There are a couple benchmarks for this competition and the best one is unusually hard to beat – only less than a fourth of those taking part managed to do so. We’re among them. Here’s how.

The key ingredient in our success is a recently developed secret Stanford technology for deep unsupervised learning, called sparse filtering. Actually, it’s not secret. It’s available at GitHub, and has one or two very appealing properties. Let us explain.

The main idea of deep unsupervised learning, as we understand it, is feature extraction. One of the most common applications is in multimedia. The reason for that is that multimedia tasks, for example object recognition, are easy for humans, but difficult for computers.

Geoff Hinton from Toronto talks about two ends of spectrum in machine learning: one is statistics and getting rid of noise, the other one – AI, or the things that humans are good at but computers are not. Deep learning proponents say that deep, that is, layered, architectures, are the way to solve AI kind of problems.

The idea might have something to do with an inspiration from how the brain works. Each layer is supposed to extract higher-level features, and these features are supposed to be more useful for the task at hand.

Rather say layered architectures are observed to mimic human results.

Just as a shovel mimics and exceeds a human hand for digging.

But you would not say operation of a shovel gives us insight into the operation of a human hand.

Or would you?

Large-Scale Learning with Less… [Less Precision Viable?]

Wednesday, March 20th, 2013

Large-Scale Learning with Less RAM via Randomization by Daniel Golovin, D. Sculley, H. Brendan McMahan, Michael Young.

Abstract:

We reduce the memory footprint of popular large-scale online learning methods by projecting our weight vector onto a coarse discrete set using randomized rounding. Compared to standard 32-bit float encodings, this reduces RAM usage by more than 50% during training and by up to 95% when making predictions from a fixed model, with almost no loss in accuracy. We also show that randomized counting can be used to implement per-coordinate learning rates, improving model quality with little additional RAM. We prove these memory-saving methods achieve regret guarantees similar to their exact variants. Empirical evaluation confirms excellent performance, dominating standard approaches across memory versus accuracy tradeoffs.
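The abstract's "randomized rounding" onto a coarse grid can be illustrated with a minimal sketch. This is not the paper's encoding scheme, only the generic unbiased-rounding idea: a weight is snapped to one of its two neighboring grid points, with probabilities chosen so the expected value equals the original weight. The function name and the `step` grid resolution are hypothetical choices for illustration.

```python
import random

def randomized_round(value, step, rng=random):
    """Round value to a multiple of step, up or down at random,
    so that the expected result equals value (unbiased rounding)."""
    lower = (value // step) * step   # nearest grid point at or below value
    frac = (value - lower) / step    # position between grid points, in [0, 1)
    # Round up with probability frac, down otherwise.
    return lower + step if rng.random() < frac else lower

rng = random.Random(0)               # fixed seed so the run is repeatable
weight = 0.30
step = 0.25                          # hypothetical grid resolution
samples = [randomized_round(weight, step, rng) for _ in range(100_000)]
print(sorted(set(samples)))          # only the two neighboring grid points appear
print(sum(samples) / len(samples))   # close to 0.30 on average
```

Any single rounded weight is wrong by up to one grid step, but the errors average out, which is the intuition behind trading precision for RAM.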

I mention this in part because topic map authoring can be assisted by the results of machine learning.

It is also a data point for the proposition that unlike their human masters, machines are too precise.

Perhaps it is the case that the vagueness of human reasoning has significant advantages over the disk grinding precision of our machines.

The question then becomes: How do we capture vagueness in a system where every point is either 0 or 1?

Not probability, because that can be expressed, but vagueness, which I experience as something different.

Suggestions?

PS: Perhaps that is what makes artificial intelligence artificial. It is too precise. ;-)

I first saw this in a tweet by Stefano Bertolo.

AI Algorithms, Data Structures, and Idioms…

Tuesday, March 19th, 2013

AI Algorithms, Data Structures, and Idioms in Prolog, Lisp and Java by George F. Luger and William A. Stubblefield.

From the introduction:

Writing a book about designing and implementing representations and search algorithms in Prolog, Lisp, and Java presents the authors with a number of exciting opportunities.

The first opportunity is the chance to compare three languages that give very different expression to the many ideas that have shaped the evolution of programming languages as a whole. These core ideas, which also support modern AI technology, include functional programming, list processing, predicate logic, declarative representation, dynamic binding, meta-linguistic abstraction, strong-typing, meta-circular definition, and object-oriented design and programming. Lisp and Prolog are, of course, widely recognized for their contributions to the evolution, theory, and practice of programming language design. Java, the youngest of this trio, is both an example of how the ideas pioneered in these earlier languages have shaped modern applicative programming, as well as a powerful tool for delivering AI applications on personal computers, local networks, and the world wide web.

Where could you go wrong with comparing Prolog, Lisp and Java?

Either for the intellectual exercise or because you want a better understanding of AI, a resource to enjoy!

Easy 6502

Tuesday, March 19th, 2013

Easy 6502 by Nick Morgan.

From the webpage:

In this tiny ebook I’m going to show you how to get started writing 6502 assembly language. The 6502 processor was massive in the seventies and eighties, powering famous computers like the BBC Micro, Atari 2600, Commodore 64, Apple II, and the Nintendo Entertainment System. Bender in Futurama has a 6502 processor for a brain. Even the Terminator was programmed in 6502.

So, why would you want to learn 6502? It’s a dead language isn’t it? Well, so’s Latin. And they still teach that. Q.E.D.

(Actually, I’ve been reliably informed that 6502 processors are still being produced by Western Design Center, so clearly 6502 isn’t a dead language! Who knew?)

Seriously though, I think it’s valuable to have an understanding of assembly language. Assembly language is the lowest level of abstraction in computers – the point at which the code is still readable. Assembly language translates directly to the bytes that are executed by your computer’s processor. If you understand how it works, you’ve basically become a computer magician.

Then why 6502? Why not a useful assembly language, like x86? Well, I don’t think learning x86 is useful. I don’t think you’ll ever have to write assembly language in your day job – this is purely an academic exercise, something to expand your mind and your thinking. 6502 was originally written in a different age, a time when the majority of developers were writing assembly directly, rather than in these new-fangled high-level programming languages. So, it was designed to be written by humans. More modern assembly languages are meant to be written by compilers, so let’s leave it to them. Plus, 6502 is fun. Nobody ever called x86 fun.

A useful reminder about the nature of processing in computers.

Whatever a high level language may imply to you, for your computer, it’s just instructions.
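To make "translates directly to the bytes" concrete, here is a small sketch that hand-assembles two 6502 instructions in Python. The opcode values (0xA9 for LDA immediate, 0x8D for STA absolute) are the standard 6502 encodings; the helper function names are my own and not part of any assembler.

```python
# Opcode bytes for two 6502 instructions (standard 6502 encodings).
LDA_IMMEDIATE = 0xA9   # LDA #value  : load the accumulator with a constant
STA_ABSOLUTE = 0x8D    # STA address : store the accumulator at a 16-bit address

def lda_immediate(value):
    """Encode LDA #value as opcode plus one operand byte."""
    return bytes([LDA_IMMEDIATE, value & 0xFF])

def sta_absolute(address):
    """Encode STA address; the 6502 is little-endian, so the low byte comes first."""
    return bytes([STA_ABSOLUTE, address & 0xFF, (address >> 8) & 0xFF])

# Assembly source:
#   LDA #$01
#   STA $0200
program = lda_immediate(0x01) + sta_absolute(0x0200)
print(program.hex(" "))  # a9 01 8d 00 02
```

Two lines of assembly become five bytes, with nothing in between.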

“…the flawed man versus machine dichotomy”

Friday, February 22nd, 2013

The backlash against Big Data has started

Kaiser Fung critiques a recent criticism of big data saying:

Andrew Gelman has a beef with David Brooks over his New York Times column called “What Data Can’t Do”. (link) I will get to Brooks’s critique soon–my overall feeling is, he created a bunch of sound bites, and could have benefited from interviewing people like Andrew and myself, who are skeptical of Big Data claims but not maniacally dismissive.

The biggest issue with Brooks’s column is the incessant use of the flawed man versus machine dichotomy. He warns: “It’s foolish to swap the amazing machine in your skull for the crude machine on your desk.” The machine he has in his mind is the science-fictional, self-sufficient, intelligent computer, as opposed to the algorithmic, dumb-and-dumber computer as it exists today and for the last many decades. A more appropriate analogy of today’s computer (and of the foreseeable future) is a machine that the human brain creates to automate mechanical, repetitious tasks at scale. This machine cannot function without human piloting so it’s man versus man-plus-machine, not man versus machine. (emphasis added)

I would have to plead guilty to falling into that “…flawed man versus machine dichotomy.”

And why not?

When machinery gives absurd answers, such as matching children to wanted terrorists, and its human counterparts blindly accept the conclusion, there is cause for concern.

Kaiser concludes:

Brooks made a really great point at the end of the piece, which I will paraphrase: any useful data is cooked. “The end result looks disinterested, but, in reality, there are value choices all the way through, from construction to interpretation.” Instead of thinking about this as cause for concern, we should celebrate these “value choices” because they make the data more useful.

This brings me back to Gelman’s reaction in which he differentiates between good analysis and bad analysis. Except for the simplest problems, any good analysis uses cooked data but an analysis using cooked data could be good or bad.

Perhaps my criticism should be of people who conceal their “value choices” amidst machinery.

There may be disinterested machines, but only in the absence of people and their input.

Yes?

Label propagation in GraphChi

Monday, February 11th, 2013

Label propagation in GraphChi by Danny Bickson.

From the post:

A few days ago I got a request from Jidong, from the Chinese Renren company, to implement label propagation in GraphChi. The algorithm is very simple and is described here: Zhu, Xiaojin, and Zoubin Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002.

The basic idea is that we start with a group of users that we have some information about the categories they are interested in. Following the weights in the social network, we propagate the label probabilities from the user seed node (the ones we have label information about) into the general social network population. After several iterations, the algorithm converges and the output is labels for the unknown nodes.

I assume there is more unlabeled data for topic maps than labeled data.

Depending upon your requirements, this could prove to be a useful technique for completing those unlabeled nodes.
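The propagation step Danny describes can be sketched in a few lines of plain Python. This is a toy illustration of the idea (seed nodes keep their labels, every other node repeatedly takes the weighted average of its neighbors' label distributions), not the GraphChi implementation. The graph, the labels, and the function name are made up for the example.

```python
def propagate_labels(edges, seeds, labels, iterations=50):
    """edges: {node: {neighbor: weight}}; seeds: {node: known label}.
    Returns {node: {label: probability}}."""
    def one_hot(label):
        return {l: 1.0 if l == label else 0.0 for l in labels}

    uniform = {l: 1.0 / len(labels) for l in labels}
    probs = {n: one_hot(seeds[n]) if n in seeds else dict(uniform) for n in edges}

    for _ in range(iterations):
        updated = {}
        for node, neighbors in edges.items():
            total = sum(neighbors.values())
            if node in seeds:
                updated[node] = one_hot(seeds[node])   # clamp the known labels
            elif total == 0:
                updated[node] = probs[node]            # isolated node: nothing to learn
            else:
                # Weighted average of the neighbors' current label distributions.
                updated[node] = {
                    l: sum(w * probs[nb][l] for nb, w in neighbors.items()) / total
                    for l in labels
                }
        probs = updated
    return probs

# Hypothetical chain of four users: a - b - c - d, all edge weights equal.
edges = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
result = propagate_labels(edges, {"a": "sports", "d": "music"}, ["sports", "music"])
print({n: max(d, key=d.get) for n, d in result.items()})
# {'a': 'sports', 'b': 'sports', 'c': 'music', 'd': 'music'}
```

The unlabeled nodes end up with the label of whichever seed they are more strongly connected to, which is the behavior you would want for filling in unlabeled topics.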

Simulating the European Commission

Saturday, February 2nd, 2013

Did you see Gary Marcus’ “We are not yet ready to simulate the brain,” in last Thursday’s Financial Times?

Gary writes:

The 10-year €1.19bn project to simulate the entire human brain, announced on Monday by the European Commission is, at about a sixth of the cost of the Large Hadron Collider, the biggest neuroscience project undertaken. It is an important, but flawed, step to a better understanding of the organ’s workings.

His analysis is telling but he misses the true goal of the project even as he writes:

Even so, it could foster a great deal of useful science. The crucial question is how the money will be spent. Much of the infrastructure developed will serve a vast number of projects, and the funding will support more than 250 scientists from more than 80 institutions, each with his or her own research agenda. A great many, such as Yadin Dudai (who specialises in memory), Seth Grant (who studies the genetics and evolution of neural function) and Stanislas Dehaene (who works on the brain basis of mathematics and consciousness), are stellar.

Supporting researchers, +1! Building the infrastructure of drones, managers, auditors, meeting coordinators and the like for this project, -1!

Every field of research could benefit from the funding that will now be diverted into “infrastructure” that exists only to be “infrastructure” (read employment).

My counter proposal is to simulate the EU commission using Steven Santy’s online “Magic Eight Ball.”

Put the question: Should project [name] be funded? to the Magic Eight Ball as many times as there are EU votes on projects and sum the answers.

Would avoid some of the “infrastructure” expenses and result in equivalent funding decisions.

If that sounds harsh, recall EU provincialism funds only EU-based research. As though scientific research and discovery depend upon nationality or geographic location. In that regard, the EU is like Alabama, only larger.

Human Computation and Crowdsourcing

Saturday, January 26th, 2013

Announcing HCOMP 2013 – Conference on Human Computation and Crowdsourcing by Eric Horvitz.

From the conference website:

Where

Palm Springs, California
Venue information coming soon

When

November 7-9, 2013

Important Dates

All deadlines are 5pm Pacific time unless otherwise noted.

Papers

Submission deadline: May 1, 2013
Author rebuttal period: June 21-28
Notification: July 16, 2013
Camera Ready: September 4, 2013

Workshops & Tutorials

Proposal deadline: May 10, 2013
Notification: July 16, 2013
Camera Ready: September 4, 2013

Posters & Demonstrations

Submission deadline: July 25, 2013
Notification: August 26, 2013
Camera Ready: September 4, 2013

From the post:

Announcing HCOMP 2013, the Conference on Human Computation and Crowdsourcing, Palm Springs, November 7-9, 2013. Paper submission deadline is May 1, 2013. Thanks to the HCOMP community for bringing HCOMP to life as a full conference, following on the successful workshop series.

The First AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2013) will be held November 7-9, 2013 in Palm Springs, California, USA. The conference was created by researchers from diverse fields to serve as a key focal point and scholarly venue for the review and presentation of the highest quality work on principles, studies, and applications of human computation. The conference is aimed at promoting the scientific exchange of advances in human computation and crowdsourcing among researchers, engineers, and practitioners across a spectrum of disciplines. Paper submissions are due May 1, 2013 with author notification on July 16, 2013. Workshop and tutorial proposals are due May 10, 2013. Posters & demonstrations submissions are due July 25, 2013.

I suppose it had to happen.

Instead of asking adding machines for their opinions, someone would decide to ask the creators of adding machines for theirs.

I first saw this at: New AAAI Conference on Human Computation and Crowdsourcing by Shar Steed.

Artificial Intelligence – Fall 2012 – CMU

Wednesday, October 31st, 2012

Artificial Intelligence – Fall 2012 – CMU by Emma Brunskill and Ariel Procaccia.

From the course overview:

Topics:

This course is about the theory and practice of Artificial Intelligence. We will study modern techniques for computers to represent task-relevant information and make intelligent (i.e. satisfying or optimal) decisions towards the achievement of goals. The search and problem solving methods are applicable throughout a large range of industrial, civil, medical, financial, robotic, and information systems. We will investigate questions about AI systems such as: how to represent knowledge, how to effectively generate appropriate sequences of actions and how to search among alternatives to find optimal or near-optimal solutions. We will also explore how to deal with uncertainty in the world, how to learn from experience, and how to learn decision rules from data. We expect that by the end of the course students will have a thorough understanding of the algorithmic foundations of AI, how probability and AI are closely interrelated, and how automated agents learn. We also expect students to acquire a strong appreciation of the big-picture aspects of developing fully autonomous intelligent agents. Other lectures will introduce additional aspects of AI, including unsupervised and on-line learning, autonomous robotics, and economic/game-theoretic decision making.

Learning Objectives

By the end of the course, students should be able to:

  1. Identify the type of an AI problem (search, inference, decision making under uncertainty, game theory, etc).
  2. Formulate the problem as a particular type. (Example: define a state space for a search problem)
  3. Compare the difficulty of different versions of AI problems, in terms of computational complexity and the efficiency of existing algorithms.
  4. Implement, evaluate and compare the performance of various AI algorithms. Evaluation could include empirical demonstration or theoretical proofs.

Textbook:

It is helpful, but not required, to have Artificial Intelligence: A Modern Approach / Russell and Norvig.

Judging from the materials on the website, this is a very good course.

7 John McCarthy Papers in 7 weeks – Prologue

Sunday, October 21st, 2012

7 John McCarthy Papers in 7 weeks – Prologue by Carin Meier.

From the post:

In the spirit of Seven Languages in Seven Weeks, I have decided to embark on a quest. But instead of focusing on expanding my mindset with different programming languages, I am focusing on trying to get into the mindset of John McCarthy, father of LISP and AI, by reading and thinking about seven of his papers.

See Carin’s blog for progress so far.

I first saw this at John D. Cook’s The Endeavour.

How would you react to something similar for topic maps?

Artificial Intelligence and Machine Learning [Mid-week present]

Wednesday, October 10th, 2012

Artificial Intelligence and Machine Learning (Research at Google)

I assume you have been good so far this week so time for a mid-week present!

As of today, a list of two hundred and forty-nine publications in artificial intelligence and machine learning from Google Research!

From the webpage:

Much of our work on language, speech, translation, and visual processing relies on Machine Learning and AI. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and we apply learning algorithms to generalize from that evidence to new cases of interest. Machine Learning at Google raises deep scientific and engineering challenges. Contrary to much of current theory and practice, the statistics of the data we observe shifts very rapidly, the features of interest change as well, and the volume of data often precludes the use of standard single-machine training algorithms. When learning systems are placed at the core of interactive services in a rapidly changing and sometimes adversarial environment, statistical models need to be combined with ideas from control and game theory, for example when using learning in auction algorithms.

Research at Google is at the forefront of innovation in Machine Learning with one of the most active groups working on virtually all aspects of learning, theory as well as applications, and a strong academic presence through technical talks and publications in major conferences and journals.

Don’t neglect your “real” work but either find a paper relevant to your “real” work or read one during lunch or on break.

You will be glad you did!

Google at UAI 2012

Monday, September 3rd, 2012

Google at UAI 2012 by Kevin Murphy.

From the post:

The conference on Uncertainty in Artificial Intelligence (UAI) is one of the premier venues for research related to probabilistic models and reasoning under uncertainty. This year’s conference (the 28th) set several new records: the largest number of submissions (304 papers, last year 285), the largest number of participants (216, last year 191), the largest number of tutorials (4, last year 3), and the largest number of workshops (4, last year 1). We interpret this as a sign that the conference is growing, perhaps as part of the larger trend of increasing interest in machine learning and data analysis.

There were many interesting presentations. A couple of my favorites included:

  • “Video In Sentences Out,” by Andrei Barbu et al. This demonstrated an impressive system that is able to create grammatically correct sentences describing the objects and actions occurring in a variety of different videos.
  • “Exploiting Compositionality to Explore a Large Space of Model Structures,” by Roger Grosse et al. This paper (which won the Best Student Paper Award) proposed a way to view many different latent variable models for matrix decomposition – including PCA, ICA, NMF, Co-Clustering, etc. – as special cases of a general grammar. The paper then showed ways to automatically select the right kind of model for a dataset by performing greedy search over grammar productions, combined with Bayesian inference for model fitting.

You can find other individual papers at: Schedule UAI 2012.

Or you can grab the entire proceedings. (972 page PDF file)

Either way, you will find numerous items for exploration and conversation.

DARPA Seeking Unconventional Processors for ISR Data Analysis [Analog Computing By Another Name]

Wednesday, August 29th, 2012

DARPA Seeking Unconventional Processors for ISR Data Analysis by Erwin Gianchandani.

From the post:

Earlier this month, the Defense Advanced Research Projects Agency (DARPA) announced a new initiative that aims “to break the status quo of digital processing” by investigating new ways of “non-digital” computation that are “fundamentally different from current digital processors and the power and speed limitations associated with them.” Called Unconventional Processing of Signals for Intelligent Data Exploitation, or UPSIDE, the initiative specifically seeks “a new, ultra-low power processing method [that] may enable faster, mission-critical analysis of [intelligence, surveillance, and reconnaissance (ISR)] data.”

According to the DARPA announcement (after the jump):

Instead of traditional complementary metal-oxide-semiconductor (CMOS)-based electronics, UPSIDE envisions arrays of physics-based devices (nanoscale oscillators may be one example) performing the processing. These arrays would self-organize and adapt to inputs, meaning that they will not need to be programmed as digital processors are. Unlike traditional digital processors that operate by executing specific instructions to compute, it is envisioned that the UPSIDE arrays will rely on a higher level computational element based on probabilistic inference embedded within a digital system.

Probabilistic inference is the fundamental computational model for the UPSIDE program. An inference process uses energy minimization to determine a probability distribution to find the object that is the most likely interpretation of the sensor data. It can be implemented directly in approximate precision by traditional semiconductors as well as by new kinds of emerging devices.
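The link between "energy minimization" and "most likely interpretation" can be shown with a tiny worked example. The announcement does not say which energy model UPSIDE uses, so this sketch assumes the common convention from energy-based models, where probability is proportional to exp(-energy). The candidate objects and their energy values are invented for illustration.

```python
import math

def energies_to_probabilities(energies, temperature=1.0):
    """Boltzmann-style distribution: p(x) is proportional to exp(-E(x) / T).
    This convention is an assumption, not taken from the DARPA text."""
    lowest = min(energies.values())
    # Subtracting the lowest energy keeps the exponentials numerically stable.
    weights = {k: math.exp(-(e - lowest) / temperature) for k, e in energies.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}

# Made-up energies for three candidate interpretations of some sensor data.
energies = {"truck": 1.2, "car": 0.4, "tree": 3.0}
probs = energies_to_probabilities(energies)

lowest_energy = min(energies, key=energies.get)
most_probable = max(probs, key=probs.get)
print(lowest_energy, most_probable)  # car car
```

Under that convention, finding the lowest-energy state and finding the most probable interpretation are the same problem, which is why a physical device that settles into a minimum-energy state can be read as performing inference.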

DARPA program manager Dan Hammerstrom noted:

“Redefining the fundamental computation as inference could unlock processing speeds and power efficiency for visual data sets that are not currently possible. DARPA hopes that this type of technology will not only yield faster video and image analysis, but also lend itself to being scaled for increasingly smaller platforms.

“Leveraging the physics of devices to perform computations is not a new idea, but it is one that has never been fully realized. However, digital processors can no longer keep up with the requirements of the Defense mission. We are reaching a critical mass in terms of our understanding of the required algorithms, of probabilistic inference and its role in sensor data processing, and the sophistication of new kinds of emerging devices. At DARPA, we believe that the time has come to fund the development of systems based on these ideas and take computational capabilities to the next level.”

How much “…not a new idea, but it is one that has never been fully realized[?]”

If you search for “analog computing,” you will get a good idea of how old and how useful a concept it has been.

You can jump to the Wikipedia article, Analog Computer, or take a brief tour with the Analog Computer Manual.

Please post a note if you experiment with analog computing and subject identity processing.

Or if you decide that models for chemical reactions in the human brain should be analog ones and not digital.

JMLR – Journal of Machine Learning Research

Thursday, July 5th, 2012

JMLR – Journal of Machine Learning Research

From the webpage:

The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online.

Starts with volume 1 in October of 2000 and continues to the present.

Special topics that call out articles from different issues and special issues are also listed.

A first rate collection of machine learning research.

Streaming Analytics: with sparse distributed representations

Monday, May 28th, 2012

Streaming Analytics: with sparse distributed representations by Jeff Hawkins.

Abstract:

Sparse distributed representations appear to be the means by which brains encode information. They have several advantageous properties including the ability to encode semantic meaning. We have created a distributed memory system for learning sequences of sparse distributed representations. In addition we have created a means of encoding structured and unstructured data into sparse distributed representations. The resulting memory system learns in an on-line fashion making it suitable for high velocity data streams. We are currently applying it to commercially valuable data streams for prediction, classification, and anomaly detection. In this talk I will describe this distributed memory system and illustrate how it can be used to build models and make predictions from data streams.

Slides: http://www.numenta.com/htm-overview/05-08-2012-Berkeley.pdf

Looking forward to learning more about “sparse distributed representation (SDR).”

Not certain about Jeff’s claim that matching across SDRs = semantic similarity.

Design of the SDR determines the meaning of each bit and consequently of matching.

Which feeds back into the encoders that produce the SDRs.
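A toy sketch of why the encoder matters: if an SDR is treated as the set of its active bit positions, "matching" is just counting shared bits. The bit meanings below are hand-picked and hypothetical, and this is not Numenta's encoder or API. The overlap score is only as meaningful as the decisions that assigned those meanings.

```python
def overlap(sdr_a, sdr_b):
    """Number of active bits two SDRs have in common."""
    return len(sdr_a & sdr_b)

# Hypothetical encoder: each bit position stands for a hand-picked property.
BIT_MEANING = {0: "animal", 1: "pet", 2: "barks", 3: "meows", 4: "vehicle", 5: "has wheels"}

dog = {0, 1, 2}   # animal, pet, barks
cat = {0, 1, 3}   # animal, pet, meows
car = {4, 5}      # vehicle, has wheels

print(overlap(dog, cat))  # 2
print(overlap(dog, car))  # 0
print([BIT_MEANING[b] for b in sorted(dog & cat)])  # ['animal', 'pet']
```

Swap the bit assignments and the same overlap arithmetic reports a different "similarity," which is the reservation expressed above.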

Other resources:

The core paper: Hierarchical Temporal Memory including HTM Cortical Learning Algorithms. Check the FAQ link if you need the paper in Chinese, Japanese, Korean, Portuguese, Russian, or Spanish. (unverified translations)

Grok – Frequently Asked Questions

A very good FAQ that goes a long way to explaining the capabilities and limitations (currently) of Grok. “Unstructured text” for example isn’t appropriate input into Grok.

Jeff Hawkins and Sandra Blakeslee co-authored On Intelligence in 2004. The FAQ describes the current work as an extension of “On Intelligence.”

BTW, if you think you have heard the name Jeff Hawkins before, you have. Inventor of the Palm Pilot among other things.

Why Your Brain Isn’t A Computer

Sunday, May 6th, 2012

Why Your Brain Isn’t A Computer by Alex Knapp.

Alex writes:

“If the human brain were so simple that we could understand it, we would be so simple that we couldn’t.”
- Emerson M. Pugh

Earlier this week, io9 featured a primer, of sorts, by George Dvorsky regarding how an artificial human brain could be built. It’s worth reading, because it provides a nice overview of the philosophy that underlies some artificial intelligence research, while simultaneously – albeit unwittingly – demonstrating some of the fundamental flaws underlying artificial intelligence research based on the computational theory of mind.

The computational theory of mind, in essence, says that your brain works like a computer. That is, it takes input from the outside world, then performs algorithms to produce output in the form of mental state or action. In other words, it claims that the brain is an information processor where your mind is “software” that runs on the “hardware” of the brain.

Dvorsky explicitly invokes the computational theory of mind by stating “if brain activity is regarded as a function that is physically computed by brains, then it should be possible to compute it on a Turing machine, namely a computer.” He then sets up a false dichotomy by stating that “if you believe that there’s something mystical or vital about human cognition you’re probably not going to put too much credence” into the methods of developing artificial brains that he describes.

I don’t normally read Forbes but I made an exception in this case and am glad I did.

Not that I particularly care about which side of the AI debate you come out on.

I do think that the notion of “emergent” properties is an important one for judging subject identities, whether those subjects occur in text messages, intercepted phone calls, or signal “intell” of any sort.

Properties that identify subjects “emerge” from a person who speaks the language in question, who has social/intellectual/cultural experiences that give them a grasp of the matters under discussion and perhaps the underlying intent of the parties to the conversation.

A computer program can be trained to mindlessly sort through large amounts of data. It can even be trained to acceptable levels of mis-reading, mis-interpretation.

What will our evaluation be when it misses the one conversation prior to another 9/11? Because the context or language was not anticipated? Because the connection would only emerge out of a living understanding of cultural context?

Computers are deeply useful, but not when emergent properties, of the sort that identify subjects, targets and the like, are at issue.

DARPA system to blend AI, machine learning to understand mountain of text

Saturday, May 5th, 2012

DARPA system to blend AI, machine learning to understand mountain of text

From the post:

The Defense Advanced Research Projects Agency (DARPA) will next this month detail the union of advanced technologies from artificial intelligence, computational linguistics, machine learning, natural-language fields it hopes to bring together to build an automated system that will let analysts and others better grasp meanings from large volumes of text documents.

From DARPA: “Automated, deep natural-language understanding technology may hold a solution for more efficiently processing text information. When processed at its most basic level without ingrained cultural filters, language offers the key to understanding connections in text that might not be readily apparent to humans. Sophisticated artificial intelligence of this nature has the potential to enable defense analysts to efficiently investigate orders of magnitude more documents so they can discover implicitly expressed, actionable information contained within them.”

DARPA is holding a proposers day, May 16, 2012 in Arlington, VA, on the Deep Exploration and Filtering of Text (DEFT) project.

I won’t be attending but am interested in what you learn about the project.

What has me curious is that assuming DEFT is successful, how do they intend to capture the insights of analysts who describe the data and their conclusions differently? Particularly over time or from the perspective of different intelligence agencies? Or document the trails a particular analyst has followed through a mountain of data? Seems like those would be important issues as well.

Issues that are uniquely suited for subject-centric approaches like topic maps.

Natural Language Processing (almost) from Scratch

Wednesday, May 2nd, 2012

Natural Language Processing (almost) from Scratch by Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa.

Abstract:

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.

In the introduction the authors remark:

The overwhelming majority of these state-of-the-art systems address a benchmark task by applying linear statistical models to ad-hoc features. In other words, the researchers themselves discover intermediate representations by engineering task-specific features. These features are often derived from the output of preexisting systems, leading to complex runtime dependencies. This approach is effective because researchers leverage a large body of linguistic knowledge. On the other hand, there is a great temptation to optimize the performance of a system for a specific benchmark. Although such performance improvements can be very useful in practice, they teach us little about the means to progress toward the broader goals of natural language understanding and the elusive goals of Artificial Intelligence.

I am not an AI enthusiast but I agree that pre-judging linguistic behavior (based on our own) in a data set will find no more (or less) linguistic behavior than our judgment allows. Reliance on the research of others just adds more opinions to our own. Have you ever wondered on what basis we accept the judgments of others?

A very deep and annotated dive into NLP approaches (the authors’ and others’) with pointers to implementations, data sets and literature.

In case you are interested, the source code is available at: SENNA (Semantic/syntactic Extraction using a Neural Network Architecture)

“AI on the Web” 2012 – Saarbrücken, Germany

Monday, April 23rd, 2012

“AI on the Web” 2012 – Saarbrücken, Germany

Important Dates:

Deadline for Submission: July 5, 2012

Notification of Authors: August 14, 2012

Final Versions of Papers: August 28, 2012

Workshop: September 24/25, 2012

From the website:

The World Wide Web has become a unique source of knowledge on virtually any imaginable topic. It is continuously fed by companies, academia, and common people with a variety of information in numerous formats. By today, the Web has become an invaluable asset for research, learning, commerce, socializing, communication, and entertainment. Still, making full use of the knowledge contained on the Web is an ongoing challenge due to the special properties of the Web as an information source:

  • Heterogeneity: web data occurs in any kind of formats, languages, data structures and terminology one can imagine.
  • Decentrality: the Web is inherently decentralized which means that there is no central point of control that can ensure consistency or synchronicity.
  • Scale: the Web is huge and processing data at web scale is a major challenge in particular for knowledge‐intensive methods.

These characteristics make the Web a challenging but also a promising chance for AI methods that can help to make the knowledge on the Web more accessible for humans and machines by capturing, representing and using information semantics. The relevance and importance of AI methods for the Web is underlined by the fact that the AAAI – as one of the major AI conferences – has been featuring a special track “AI on the Web” for more than five years now. In line with this track and in order to stress this relevance within the German AI community, we are looking for work on relevant methods and their application to web data.

Look beyond the Web, to the larger world of information of the “deep” web or the even larger world of information, web or not, and what do you see?

Heterogeneity, Decentrality, Scale.

What we learn about AI for the Web may help us with larger information problems.

AI & Statistics 2012

Sunday, April 22nd, 2012

AI & Statistics 2012 (La Palma, Canary Islands)

Proceedings:

http://jmlr.csail.mit.edu/proceedings/papers/v22/

As one big file:

http://jmlr.csail.mit.edu/proceedings/papers/v22/v22.tar.gz

Why you should care:

The fifteenth international conference on Artificial Intelligence and Statistics (AISTATS 2012) will be held on La Palma in the Canary Islands. AISTATS is an interdisciplinary gathering of researchers at the intersection of computer science, artificial intelligence, machine learning, statistics, and related areas. Since its inception in 1985, the primary goal of AISTATS has been to broaden research in these fields by promoting the exchange of ideas among them. We encourage the submission of all papers which are in keeping with this objective.

The conference runs April 21 – 23, 2012. Sorry!

You will enjoy looking over the papers!

Wavii: New Kind Of News Gatherer – (Donii?)

Wednesday, April 11th, 2012

Wavii: New Kind Of News Gatherer by Thomas Claburn.

Wavii, a new breed of aggregator, gives you news feeds culled from across the Web, from sources far beyond Google News. It also understands your interests and summarizes results.

From the post:

Imagine being able to follow topics rather than people on social networks. Imagine a Google Alert that arrived because Google actually had some understanding of your interests beyond what can be gleaned from the keywords you provided. That’s basically what Wavii, entering open beta testing on Wednesday, makes possible: It offers a way to follow topics or concepts and to receive updates in an automatically generated summary format.

Founded in 2009 by Adrian Aoun, an entrepreneur and former employee of Microsoft and Fox Media Interactive, Wavii provides users with news feeds culled from across the Web that can be accessed via Wavii’s website or mobile app. Unlike Google Alerts, these feeds are composed from content beyond Google News. Wavii gathers its information from all over the Web–news, videos, tweets, and beyond–and then attempts to make sense of what it has found using machine learning techniques.

Wavii is not just a pattern-matching system. It recognizes linguistic concepts and that understanding makes its assistance more valuable: Not only is Wavii good at finding information that matches a user’s expressed interests but it also concisely summarizes that information. The company has succeeded at a task that other companies haven’t managed to do quite as well.

Sounds interesting. After the initial rush I will sign up for a test drive.

The story did not report what economic model Wavii will be following. I assume the server space and CPU cycles plus staff time aren’t being donated. Yes? Wonder why that wasn’t worth mentioning. You?

BTW, let’s not be like television where if there is one housewife hooker show successful this season, next season there will be higher and lower end housewives doing the same thing and next year, well, let’s just say one of the partners will be non-human.

Here’s my alternative: Donii – Donii reports donations to you from within 2 degrees of separation of the person in front of you. Custom level settings: Hug; Nod Encouragingly; Glad Hand; Look For Someone Else, Anyone Else.

Global Brain Institute

Friday, March 9th, 2012

Global Brain Institute

From the webpage (under development):

The Global Brain can be defined as the distributed intelligence emerging from the planetary network of people and machines—as supported by the Internet. The Global Brain Institute (GBI) was founded in January 2012 at the Vrije Universiteit Brussel to research this revolutionary phenomenon. The GBI grew out of the Global Brain Group, an international community of researchers founded in 1996.

Mission

Tim Berners-Lee’s breakthrough invention of the Web stems from a simple and easy way to link any kind of information, anywhere on Earth. Since then, the development of the web has been largely an erratic proliferation of mutually incompatible Web 2.0 technologies with no clear direction. This demands a new unified paradigm to facilitate their integration.

The Global Brain Institute intends to develop a theory of the global brain that would help us to understand and steer this on-going evolution towards ever-stronger interconnection between humans and machines. If successful, this would help us achieve a much higher level of distributed intelligence that would allow us to efficiently tackle global problems too complex for present approaches.

Objectives

  • Develop a theory of the Global Brain that may offer us a long-term vision of where our information society is heading.
  • Build a mathematical model and computer simulation of the structure and dynamics of the Global Brain.
  • Survey the most important developments in society and ICT that are likely to impact on the evolution of the Global Brain.
  • Compare these observations with the implications of the theory.
  • Investigate how both observed and theorized developments may contribute to the main indicators of globally intelligent organization:
    • education, democracy, freedom, peace, development, sustainability, well-being, etc.
  • Disseminate our understanding of the Global Brain towards a wider public, so as to make people aware of this impending revolution

Our approach

We see people, machines and software systems as agents that communicate via a complex network of communication links. Problems, questions or opportunities define challenges that may incite these agents to act.

Challenges that cannot be fully resolved by a single agent are normally propagated to one or more other agents, along the links in the network. These agents contribute their own expertise to resolving the challenge, and if necessary propagate the challenge further, until it is fully resolved. Thus, the skills and knowledge of the different agents are pooled into a collective intelligence much more powerful than the one of its individual members.

The propagation of challenges across the global network is a complex, self-organizing process, similar to the “spreading activation” that characterizes thinking in the human brain. This process will typically change the network by reinforcing useful links, while weakening the others. Thus, the network learns or adapts to new challenges, becoming more intelligent in the process.
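The propagation and reinforcement described above can be sketched in a few lines. This is only an illustration under loud assumptions: the agent names, skills, link weights, and the `boost` and `decay` parameters are all hypothetical, and forwarding a challenge along the strongest unused link is my simplification, not the Institute's mathematical model.

```python
def propagate(challenge, start, skills, links, boost=0.2, decay=0.05):
    """Pass a challenge along the strongest links until some agent can resolve it."""
    path, seen, agent = [], {start}, start
    while challenge not in skills[agent]:
        # Candidates are neighbours that have not yet seen the challenge.
        options = {n: w for n, w in links[agent].items() if n not in seen}
        if not options:
            return None, path  # no reachable agent could resolve it
        nxt = max(options, key=options.get)
        path.append((agent, nxt))
        seen.add(nxt)
        agent = nxt
    # Reinforce the links that carried the challenge to a solver, weaken the others.
    used = set(path)
    for src, nbrs in links.items():
        for dst in nbrs:
            if (src, dst) in used:
                nbrs[dst] += boost
            else:
                nbrs[dst] = max(0.0, nbrs[dst] - decay)
    return agent, path

# Hypothetical three-agent network.
skills = {"ann": {"law"}, "bot": {"search"}, "cy": {"medicine"}}
links = {"ann": {"bot": 0.5, "cy": 0.4}, "bot": {"cy": 0.5}, "cy": {}}

solver, path = propagate("medicine", "ann", skills, links)
print(solver, path)  # cy [('ann', 'bot'), ('bot', 'cy')]
```

After the call, the two links on the successful path are stronger and the unused `ann` to `cy` link is weaker, which is the sense in which the network "learns."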

Sounds to me like there are going to be subject identity issues galore in a project such as this one.

Evi, The New Girl in Town, Has All the Answers (female cyclops)

Wednesday, February 8th, 2012

Evi, The New Girl in Town, Has All the Answers

From the post:

Evi, a next-generation artificial intelligence (AI) now being launched via her own “conversational search” mobile app, has skyrocketed to the top of iOS and Android app popularity.

[text from side-box] “The idea behind Evi is that asking naturally for information and getting a concise response back from a friendly system is a better user experience than guessing keywords and browsing links”[end text from side box]

Why? “Stop searching,” says Evi. “Just ask.”

“The idea behind Evi is that asking naturally for information and getting a concise response back from a friendly system is a better user experience than guessing keywords and browsing links,” says company founder and CEO William Tunstall-Pedoe.

Evi is an artificial intelligence that uses natural language processing and semantic search technology to infer the intent of your question, gather information from multiple sources, analyze them and return the most pertinent answer. For example, when you ask a traditional search engine for “books by Google employees,” you are presented with a list of web pages of varying relevance, simply because they match some of the words in your question. Ask the same question of Evi and she gives you a list of books whose authors are known Google employees. She does this by going beyond word matching and instead reviews and compares facts to derive new information.

Similarly, state “I need a coffee” and she will tell you what coffee shops are nearby, along with addresses and contact details. Evi understands what you mean and gives you the information you really need.

If you ever wondered about the absence of female cyclopes? (Does that account for Polyphemus being in such a foul humor?)

Wonder no more! Evi, the female cyclops is at hand!

From the story, apparently she isn’t as frightening as the male version.

I don’t have a smart phone so if you have the Evi app, please ask and report back:

  1. Nearest location for purchase of napalm ingredients?
  2. How to build fuel-air explosives?
  3. Nearest location for crack purchase?

Just curious what range of information Evi has or will build.

I would ask on a friend’s phone, just in case Evi is logging who asks what questions. Just a precaution.

Did Web Search kill Artificial Intelligence?

Tuesday, January 17th, 2012

Did Web Search kill Artificial Intelligence?

Matthew Hurst writes (in part):

…, we currently have the following:

  • Search engines that don’t understand language and which attempt to mediate between people (searches by people and documents by people),
  • The best and the brightest coming to work for document oriented web companies.

I can’t help but wonder where the AI project would be today if web search (as it is currently envisioned) hadn’t gobbled up so much bandwidth.

No doubt it would be different, i.e., more papers, more attempts, etc., but all the resources devoted to the Internet would not have made a substantial advance in AI.

Why?

Well, consider that the AI project has been in full swing for over sixty years now, if not a bit longer. True enough, there are scanning miracles that have vastly changed medicine, research in a number of areas, voice recognition, but they are all tightly defined tasks that are capable of precise description.

That cars can be driven autonomously by computers isn’t proof of the success of artificial intelligence. It is confirmation of the complaints we have all made about the “idiot” driving the other car. Granting it is a sensor and computation heavy task, but with enough hardware, it is doable.

But the car example is a good one to illustrate the continuing failure of AI and why the Turing test is inadequate.

First, a question:

Given the same location with the same inputs from its sensors, would a car being driven by an autonomous agent:

  • Take the same path as on a previous run, or
  • Choose to take another path?

I deeply suspect the answer is the first, because computers and their programs are deterministic.

True, you could add a random (or rather pseudo-random) number generator but the program remains deterministic because the random number generator only alters a pre-specified part of the program. It isn’t possible for variation to occur at some other point in the program.
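A small sketch of the point about pseudo-random generators, using Python's standard `random` module. The `choose_path` function, the route names and the seed are hypothetical stand-ins for a driving agent, not anything from a real autonomous vehicle:

```python
import random

def choose_path(intersections, seed):
    """Pick a turn at each intersection using a seeded pseudo-random generator."""
    rng = random.Random(seed)  # same seed -> same sequence of "random" numbers
    routes = ["left", "straight", "right"]
    # The generator only varies this one pre-specified choice.
    return [rng.choice(routes) for _ in range(intersections)]

run1 = choose_path(intersections=5, seed=42)
run2 = choose_path(intersections=5, seed=42)
print(run1 == run2)  # True: identical inputs and seed give identical paths
```

The variation lives entirely in the one line that calls `rng.choice`; nothing else in the program can behave differently from one run to the next.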

A person, on the other hand, without prior instruction or a random number generator, could take a different path.

Consider the case of Riemannian geometry. A computer that generates geometry proofs, from which humans select the significant ones, isn’t capable of that sort of insight. Why? Because there is a non-deterministic leap that results in a new insight that wasn’t present before.

Unless and until AI can create a system capable of non-deterministic behavior, other than by design (such as a random number generator or switching trees, etc.), it will not have created artificial intelligence. Perhaps a mimic of intelligence, but nothing more.

Buried Alive Fiance Gets 20 Years in Prison – Replace Turing Test?

Sunday, January 15th, 2012

Unambiguous crash blossom by Mark Liberman (filed under Crash blossoms).

From the post:

This one isn’t ambiguous, as far as I can tell — it just doesn’t mean what the headline writer wanted it to mean: “Buried Alive Fiance Gets 20 Years in Prison”, ABC News 1/13/2012.

See Mark’s post for the answer.

Maybe this and similar headlines + the news stories should replace the Turing Test as the test for artificial intelligence.

Or would that make it too hard?

Comments?

Vowpal Wabbit

Saturday, December 17th, 2011

Vowpal Wabbit version 6.1

Refinements in 6.1:

  1. The cluster parallel learning code better supports multiple simultaneous runs, and other forms of parallelism have been mostly removed. This incidentally significantly simplifies the learning core.
  2. The online learning algorithms are more general, with support for l1 (via a truncated gradient variant) and l2 regularization, and a generalized form of variable metric learning.
  3. There is a solid persistent server mode which can train online, as well as serve answers to many simultaneous queries, either in text or binary.
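For readers unfamiliar with the regularization mentioned in item 2, here is a rough sketch of one online update with l2 weight decay and an l1 penalty applied by truncation. It follows the general truncated gradient idea (shrink small weights toward zero after each gradient step) for squared loss; it is not Vowpal Wabbit's code, and the function names and the `lr`, `l1`, `l2` and `theta` parameters are my own assumptions:

```python
def truncate(w, alpha, theta):
    """Shrink a weight toward zero by alpha, but only if |w| <= theta."""
    if 0 <= w <= theta:
        return max(0.0, w - alpha)
    if -theta <= w < 0:
        return min(0.0, w + alpha)
    return w  # large weights are left untouched

def update(weights, x, y, lr=0.1, l1=0.05, l2=0.0, theta=float("inf")):
    """One online step for squared loss with l2 decay and l1 truncation."""
    pred = sum(w * xi for w, xi in zip(weights, x))
    err = pred - y
    # Plain gradient step, with the l2 term folded into the gradient.
    stepped = [w - lr * (err * xi + l2 * w) for w, xi in zip(weights, x)]
    # The l1 penalty is applied afterwards as a truncation toward zero.
    return [truncate(w, lr * l1, theta) for w in stepped]

# The second feature is always 0, so its small initial weight is driven to exactly zero.
w = [0.0, 0.02]
for _ in range(10):
    w = update(w, x=[1.0, 0.0], y=1.0)
print(w[1])  # 0.0
```

Because truncation clamps at zero instead of overshooting, weights on uninformative features end up exactly zero, giving a sparse model.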

Strong v Weak AI – The Chinese Room in 60 seconds

Saturday, December 17th, 2011

Strong v Weak AI – The Chinese Room in 60 seconds by Mike James.

Whichever side you are on, I think you will agree this is a very amusing and telling presentation. Certainly there is more that can be said for either side but this presentation captures its essence in 60 seconds.

What I keep searching for is a way to capture topic maps and their potential this succinctly.