Archive for the ‘Kernel Methods’ Category

Better table search through Machine Learning and Knowledge

Friday, August 24th, 2012

Better table search through Machine Learning and Knowledge by Johnny Chen.

From the post:

The Web offers a trove of structured data in the form of tables. Organizing this collection of information and helping users find the most useful tables is a key mission of Table Search from Google Research. While we are still a long way away from the perfect table search, we made a few steps forward recently by revamping how we determine which tables are “good” (one that contains meaningful structured data) and which ones are “bad” (for example, a table that holds the layout of a Web page). In particular, we switched from a rule-based system to a machine learning classifier that can tease out subtleties from the table features and enables rapid quality improvement iterations. This new classifier is a support vector machine (SVM) that makes use of multiple kernel functions which are automatically combined and optimized using training examples. Several of these kernel combining techniques were in fact studied and developed within Google Research [1,2].

Important work on tables from Google Research.
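To make "multiple kernel functions which are automatically combined" a little more concrete, here is a minimal sketch of one simple combination heuristic: weight each base kernel by its alignment with the training labels. This is not the method from the Google post, whose details are not given here. The toy data, the choice of RBF and polynomial kernels, and the helper names (`gram`, `alignment`, `combine`) are all my own assumptions.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def poly(x, y, degree=2, c=1.0):
    """Polynomial kernel (x . y + c)^degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def gram(kernel, X):
    """Gram matrix of `kernel` over the points in X."""
    return [[kernel(xi, xj) for xj in X] for xi in X]

def alignment(K, y):
    """Uncentered alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(v * v for row in K for v in row))
    norm_t = math.sqrt(sum((y[i] * y[j]) ** 2 for i in range(n) for j in range(n)))
    return inner / (norm_k * norm_t)

def combine(grams, y):
    """Weight each Gram matrix by its alignment with the labels, then sum.

    Weights are clamped at zero and normalized to sum to one, so the result
    is a convex combination of the base kernels (and therefore still a kernel).
    Assumes at least one base kernel has positive alignment.
    """
    scores = [max(alignment(K, y), 0.0) for K in grams]
    total = sum(scores)
    weights = [s / total for s in scores]
    n = len(y)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, grams))
                 for j in range(n)] for i in range(n)]
    return weights, combined

# Toy data, purely illustrative: two clusters labelled -1 and +1.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.2)]
y = [-1, -1, -1, 1, 1, 1]

weights, K = combine([gram(rbf, X), gram(poly, X)], y)
print(weights)
```

A combined Gram matrix like `K` can be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC` with `kernel="precomputed"`. Real multiple kernel learning optimizes the weights jointly with the classifier rather than fixing them up front as this sketch does.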

Important in part because you can compare your efforts on accessible tables to theirs, to gain insight into what you are, or aren’t, doing “right.”

For any particular domain, you should be able to do better than a general solution.

BTW, I disagree on the “good” versus “bad” table distinction. I suspect that tables that hold the layout of web pages, say for a CMS, are more consistent than database tables of comparable size. And that data may or may not be important to you.

Important versus non-important data for a particular set of requirements is a defensible distinction.

“Good” versus “bad” tables is not.

kernel-machine library

Saturday, December 24th, 2011

kernel-machine library

From the webpage:

The Kernel-Machine Library is a free (released under the LGPL) C++ library to promote the use of and progress of kernel machines. It is intended for use in research as well as in practice. The library is known to work with a recent C++ compiler on GNU/Linux, on Mac OS, and on several flavours of Windows.

Below, we would like to give you the choice to either install, use, or improve the library.

The documentation seems a bit slim but perhaps this is an area where contributions would be welcome.

Kernel Perceptron in Python

Tuesday, November 1st, 2011

Kernel Perceptron in Python

From the post:

The Perceptron (Rosenblatt, 1957) is one of the oldest and simplest Machine Learning algorithms. It’s also trivial to kernelize, which makes it an ideal candidate to gain insights on kernel methods.

The original paper is F. Rosenblatt, The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain, Psychological Review, Vol. 65, No. 6, 1958.

Good way to learn more about kernel methods.
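To show why the perceptron is "trivial to kernelize," here is a minimal sketch in plain Python. It is not the code from the linked post. The RBF kernel, the XOR toy data and the function names are my own assumptions. The only change from the ordinary perceptron is that the weight vector is never formed: the model keeps a mistake count per training example and scores new points through the kernel.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def train_kernel_perceptron(X, y, kernel=rbf, epochs=20):
    """Return alpha, the number of mistakes made on each training example."""
    n = len(X)
    # Precompute the Gram matrix once; training only ever needs kernel values.
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * score <= 0:   # misclassified (or on the boundary)
                alpha[i] += 1       # kernelized version of w += y_i * x_i
                mistakes += 1
        if mistakes == 0:           # a full clean pass: converged
            break
    return alpha

def predict(x, X, y, alpha, kernel=rbf):
    """Sign of the kernel expansion over examples with nonzero alpha."""
    score = sum(a * t * kernel(xi, x) for a, t, xi in zip(alpha, y, X) if a)
    return 1 if score > 0 else -1

# XOR-style data: not linearly separable, so a linear perceptron cannot fit it.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [1, 1, -1, -1]

alpha = train_kernel_perceptron(X, y)
predictions = [predict(x, X, y, alpha) for x in X]
print(alpha, predictions)
```

Swapping `rbf` for a plain dot product recovers the ordinary linear perceptron, which is a handy way to see what the kernel is buying you.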

I have included a link to the original paper by Rosenblatt.

  1. What do you make of Rosenblatt’s choice to not use symbolic or Boolean logic?
  2. What do you make of the continued efforts (think Cyc/SUMO) to use symbolic or Boolean logic?
  3. Is knowledge/information probabilistic?

There are no certain answers to these questions; I am interested in how you approach discussing them.

Kernel Methods and Support Vector Machines de-Mystified

Sunday, October 9th, 2011

Kernel Methods and Support Vector Machines de-Mystified

From the post:

We give a simple explanation of the interrelated machine learning techniques called kernel methods and support vector machines. We hope to characterize and de-mystify some of the properties of these methods. To do this we work some examples and draw a few analogies. The familiar no matter how wonderful is not perceived as mystical.

Did the authors succeed in their goal of a “simple explanation”?

You might want to compare the Wikipedia entry they cite on support vector machines before making your comment. Success is often a relative term.

Learning Discriminative Metrics via Generative Models and Kernel Learning

Tuesday, September 27th, 2011

Learning Discriminative Metrics via Generative Models and Kernel Learning by Yuan Shi, Yung-Kyun Noh, Fei Sha, and Daniel D. Lee.


Metrics specifying distances between data points can be learned in a discriminative manner or from generative models. In this paper, we show how to unify generative and discriminative learning of metrics via a kernel learning framework. Specifically, we learn local metrics optimized from parametric generative models. These are then used as base kernels to construct a global kernel that minimizes a discriminative training criterion. We consider both linear and nonlinear combinations of local metric kernels. Our empirical results show that these combinations significantly improve performance on classification tasks. The proposed learning algorithm is also very efficient, achieving order of magnitude speedup in training time compared to previous discriminative baseline methods.

Combination of machine learning techniques within a framework.
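For readers who want to see what "metrics used as base kernels" can look like, here is a rough sketch. It is not the paper's algorithm: the two metric matrices and the mixing weights are made up, where the paper learns both. It only illustrates the general pattern of turning each metric into a Gaussian-style kernel and then combining those kernels linearly.

```python
import math

def metric_kernel(M):
    """Kernel k(x, y) = exp(-(x - y)^T M (x - y)) for a symmetric PSD matrix M."""
    def k(x, y):
        d = [a - b for a, b in zip(x, y)]
        quad = sum(d[i] * M[i][j] * d[j]
                   for i in range(len(d)) for j in range(len(d)))
        return math.exp(-quad)
    return k

def weighted_sum(kernels, weights):
    """Linear combination of kernels; nonnegative weights keep it a kernel."""
    def k(x, y):
        return sum(w * kern(x, y) for w, kern in zip(weights, kernels))
    return k

# Hypothetical "local" metrics: M_a cares about axis 0, M_b about axis 1.
M_a = [[1.0, 0.0], [0.0, 0.01]]
M_b = [[0.01, 0.0], [0.0, 1.0]]
k_a, k_b = metric_kernel(M_a), metric_kernel(M_b)

# Hypothetical fixed weights; the paper learns the combination discriminatively.
global_k = weighted_sum([k_a, k_b], [0.7, 0.3])

p, q = (0.0, 0.0), (0.0, 2.0)   # the two points differ only along axis 1
print(k_a(p, q), k_b(p, q), global_k(p, q))
```

Under `M_a` the two points look nearly identical, while under `M_b` they look far apart. The global kernel blends the two views according to the weights.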

It may be some bias in my reading patterns, but I don’t recall any explicit combination of human + machine learning techniques. I don’t take analysis of search logs to be an explicit human contribution since the analysis is guessing as to why a particular link and not another was chosen. I suppose time on the resource chosen might be an indication, but a search log per se isn’t going to give that level of detail.

For that level of detail you would need browsing history. It would be interesting to see if a research library or perhaps an employer (fewer “consent” issues) would permit browsing history collection over some long period of time, say 3 to 6 months, so that not only the search log but the entire browsing history is captured.

Hard to say if that would result in enough increased accuracy on search results to be worth the trouble.

An interesting paper about combining purely machine learning techniques, and one that promises significant gains. What these plus human learning would produce remains a subject for future research papers.

Advanced Topics in Machine Learning

Thursday, June 23rd, 2011

Advanced Topics in Machine Learning

A course by Andreas Krause and Daniel Golovin at Caltech. The lecture notes and readings will keep you entertained for some time.


How can we gain insights from massive data sets?

Many scientific and commercial applications require us to obtain insights from massive, high-dimensional data sets. In particular, in this course we will study:

  • Online learning: How can we learn when we cannot fit the training data into memory? We will cover no regret online algorithms; bandit algorithms; sketching and dimension reduction.
  • Active learning: How should we choose few expensive labels to best utilize massive unlabeled data? We will cover active learning algorithms, learning theory and label complexity.
  • Nonparametric learning on large data: How can we let complexity of classifiers grow in a principled manner with data set size? We will cover large-scale kernel methods; Gaussian process regression, classification, optimization and active set methods.

Why would a non-strong AI person list so much machine learning stuff?

Two reasons:

1) Machine learning techniques are incredibly useful in appropriate cases.

2) You have to understand machine learning to pick out the appropriate cases.

Shogun – Google Summer of Code 2011

Sunday, April 3rd, 2011

Shogun – Google Summer of Code 2011

Students! Here is your chance to work on a cutting-edge software library for machine learning!

Pick from the posted ideas, or submit your own.

From the website:

SHOGUN is a machine learning toolbox, which is designed for unified large-scale learning for a broad range of feature types and learning settings. It offers a considerable number of machine learning models such as support vector machines for classification and regression, hidden Markov models, multiple kernel learning, linear discriminant analysis, linear programming machines, and perceptrons. Most of the specific algorithms are able to deal with several different data classes, including dense and sparse vectors and sequences using floating point or discrete data types. We have used this toolbox in several applications from computational biology, some of them coming with no less than 10 million training examples and others with 7 billion test examples. With more than a thousand installations worldwide, SHOGUN is already widely adopted in the machine learning community and beyond.

SHOGUN is implemented in C++ and interfaces to MATLAB, R, Octave, Python, and has a stand-alone command line interface. The source code is freely available under the GNU General Public License, Version 3 at

This summer we are looking to extend the library in four different ways: Improving interfaces to other machine learning libraries or integrating them when appropriate, improved i/o support, framework improvements and new machine learning algorithms. Here is listed a set of suggestions for projects.

A prior post on Shogun.

Learning Kernel Classifiers

Saturday, February 12th, 2011

Learning Kernel Classifiers by Ralf Herbrich.

A bit dated (2001) to be on the web in partial form but may still be a useful work.

The source code listings appear to be complete and are mostly written in R.

Interested in how anyone sees this versus more recent works on kernel classifiers.

Normalized Kernels as Similarity Indices (and algorithm bias)

Wednesday, November 17th, 2010

Normalized Kernels as Similarity Indices by Julien Ah-Pine. Keywords: kernels normalization, similarity indices, kernel PCA based clustering.


Measuring similarity between objects is a fundamental issue for numerous applications in data-mining and machine learning domains. In this paper, we are interested in kernels. We particularly focus on kernel normalization methods that aim at designing proximity measures that better fit the definition and the intuition of a similarity index. To this end, we introduce a new family of normalization techniques which extends the cosine normalization. Our approach aims at refining the cosine measure between vectors in the feature space by considering another geometrical based score which is the mapped vectors’ norm ratio. We show that the designed normalized kernels satisfy the basic axioms of a similarity index unlike most unnormalized kernels. Furthermore, we prove that the proposed normalized kernels are also kernels. Finally, we assess these different similarity measures in the context of clustering tasks by using a kernel PCA based clustering approach. Our experiments employing several real-world datasets show the potential benefits of normalized kernels over the cosine normalization and the Gaussian RBF kernel.

Points out that some methods don’t result in an object being found to be most similar to…itself. What an odd result.

Moreover, it is possible for vectors that represent different scores to be treated as identical.
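Both oddities are easy to reproduce. The sketch below is my own illustration, not code from the paper. It uses a plain linear kernel and the standard cosine normalization k(x, y) / sqrt(k(x, x) k(y, y)), with two made-up vectors that point the same way but differ in length. It does not implement the new normalization family the paper proposes.

```python
import math

def linear(x, y):
    """Plain dot-product kernel."""
    return sum(a * b for a, b in zip(x, y))

def cosine_normalized(kernel):
    """Wrap a kernel as k(x, y) / sqrt(k(x, x) * k(y, y))."""
    def k(x, y):
        return kernel(x, y) / math.sqrt(kernel(x, x) * kernel(y, y))
    return k

x = (1.0, 1.0)
z = (3.0, 3.0)   # same direction as x, three times the norm

# Oddity 1: with the unnormalized kernel, x scores higher against z
# than against itself, so x is not "most similar" to x.
self_sim = linear(x, x)
other_sim = linear(x, z)

# Oddity 2: cosine normalization repairs self-similarity, but it now gives
# x and z the maximum score even though they are different vectors.
cos = cosine_normalized(linear)
cos_self = cos(x, x)
cos_other = cos(x, z)

print(self_sim, other_sim, cos_self, cos_other)
```

Per the abstract, the paper's answer to the second oddity is to bring the ratio of the mapped vectors' norms back into the score, so vectors that differ only in length are no longer indistinguishable.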


  1. What axioms of similarity indexes should we take notice of? (3-5 pages, citations)
  2. What methods treat vectors with different scores as identical? (3-5 pages, citations)
  3. Are geometric based similarity indices measuring semantic or geometric similarity? Are those the same concepts or different concepts? (10-15 pages, citations, you can make this a final paper if you like.)

Shogun – A Large Scale Machine Learning Toolbox

Thursday, October 21st, 2010

Shogun – A Large Scale Machine Learning Toolbox

Not for the faint of heart but an excellent resource for those interested in large scale kernel methods.

Offers several Support Vector Machine (SVM) implementations and implementations of the latest kernels. Has interfaces to MATLAB(tm), R, Octave and Python.


  1. Pick any one of the methods. How would you integrate it into augmented authoring for a topic map?
  2. What aspect(s) of this site would you change using topic maps?
  3. What augmented authoring techniques would help you apply topic maps to this site?
  4. Apply topic maps to this site. (project)

Integrating Biological Data – Not A URL In Sight!

Wednesday, October 20th, 2010

Actual title: Kernel methods for integrating biological data by Dick de Ridder, The Delft Bioinformatics Lab, Delft University of Technology.

Biological data integration to improve protein expression (read: hugely profitable industrial processes based on biology).

Need to integrate biological data, including “prior knowledge.”

In case kernel methods aren’t your “thing,” one important point:

There are vast seas of economically important data unsullied by URLs.

Kernel methods are one way to integrate some of that data.


  1. How to integrate kernel methods into topic maps? (research project)
  2. Subjects in a kernel method? (research paper, limit to one method)
  3. Modeling specific uses of kernels in topic maps. (research project)
  4. Edges of kernels? Are there subject limits to kernels? (research project)