k-means Approach to the Karhunen-Loeve Transform by Krzysztof Misztal, Przemyslaw Spurek, and Jacek Tabor.
Abstract:
We present a simultaneous generalization of the well-known Karhunen-Loeve (PCA) and k-means algorithms. The basic idea lies in approximating the data with k affine subspaces of a given dimension n. In the case n=0 we obtain the classical k-means, while for k=1 we obtain PCA algorithm.
We show that for some data exploration problems this method gives better result then either of the classical approaches.
I know, it is a very forbidding title but once you look at the paper you will be glad you did.
First, the authors begin with a graphic illustration of the goal of their technique (no, you have to look at the paper to see it), which even the most “lay” reader can appreciate.
Second, the need for topic maps strikes again as in the third paragraph we learn: “…Karhunen-Loéve transform (called also PCA – Principle Component Analysis)….”
Third, some of the uses of this technique:
- data mining – we can detect important coordinates and subsets with similar properties;
- clustering – our modification of k-means can detect different, high dimensional relation in data;
- image compression and image segmentation;
- pattern recognition – thanks to detection of relation in data we can use it to assign data to defined before classes.
A sample implementation is available at: http://www.ii.uj.edu.pl/~misztalk.