Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 1, 2012

A Very Simple Explanation of Spectral Clustering

Filed under: Clustering,Spectral Clustering — Patrick Durusau @ 7:11 pm

A Very Simple Explanation of Spectral Clustering

Akshay writes:

I’ve been wanting to write a post about some of the research I’ve been working on but I realized that my work has gotten really technical and not very accessible to someone not in machine learning. So I thought as a precursor I would write about spectral clustering at a very high level and with lots of pictures. I hope you can follow along. For a more technical introduction to spectral clustering, this tutorial is very nice and much more complete than this article.

Clustering

To start out, it is important to understand what the clustering problem is and why it is important. Clustering is an extremely useful tool in data analysis whereby a set of objects are organized into groups based on their similarity. We typically want to form groups so that objects in the same group are highly similar and objects in different groups are less similar. This is typically known as unsupervised learning because the data set is not labeled (differentiating it from other tasks like regression and classification).

I think the point that eigenvectors are “stable under noise” could be made earlier and without a lot of technical detail.

I would insert/change:

The changes to the rows and columns introduce “noise” into the example.

The eigenvectors are $(1\ 0\ 1\ 0)^T$ and $(0\ 1\ 0\ 1)^T$, with the first and third objects in one cluster and the second and fourth in the other. The eigenvectors correctly identify the clusters, despite the introduction of noise.

Why eigenvectors perform well in the presence of noise is beyond the scope of this post.
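To make the suggested wording concrete, here is a minimal sketch in plain Python. The similarity matrix `A` is an assumption on my part (Akshay's matrix is not reproduced in this post), chosen so that the first and third objects are similar to each other and the second and fourth are similar to each other. The helper names are hypothetical. The sketch only checks that the two vectors above are eigenvectors of that matrix and reads the clusters off their nonzero entries; it does not compute eigenvectors from scratch.

```python
# Assumed 4x4 similarity matrix: entry [i][j] is 1 when objects i and j are
# similar. Objects 0 and 2 form one group, objects 1 and 3 the other, so the
# blocks are interleaved rather than sitting neatly on the diagonal.
A = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_eigenvector(matrix, vector, eigenvalue):
    """Check that matrix * vector equals eigenvalue * vector, entry by entry."""
    return matvec(matrix, vector) == [eigenvalue * x for x in vector]


def cluster_members(vector):
    """Zero-based indices of the objects with a nonzero entry in the vector."""
    return [i for i, x in enumerate(vector) if x != 0]


v1 = [1, 0, 1, 0]
v2 = [0, 1, 0, 1]

for v in (v1, v2):
    print(v, "eigenvector for eigenvalue 2:", is_eigenvector(A, v, 2),
          "cluster:", cluster_members(v))
```

Indices are zero-based, so cluster `[0, 2]` is the first and third objects and `[1, 3]` is the second and fourth.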

Still, highly recommended as an introduction to spectral clustering.

Building Web apps with Python and Neo4j

Filed under: Neo4j,Python,Web Applications — Patrick Durusau @ 7:10 pm

Building Web apps with Python and Neo4j

Robert Rees writes:

Neo4J’s provision of an easy to use REST-based graph datastore gives us the chance to explore new ways of storing and organising the information we want to use on the web in the language of our choice; in my case Python.

In this talk I would like to present a few ideas about how we can use graph storage to closely mirror real world structures in web applications that in turn only need to lightly reflect the underlying data.

I will be offering examples through the medium of fantasy games like Morrowind, Fallout and Skyrim. Through a mixture of gaming geekery, reactive programming and data modelling I hope to offer a vision of a graph-based, more webby future.

Robert makes a good argument that graphs are better for capturing the flow of actions or information on a website than the traditional page model.
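As a rough illustration of the idea (not Robert's code, and not the Neo4j API), a flow can be modelled as a directed graph in which each node is a step and each edge is a move a visitor can make. The step names below are hypothetical, and the graph is an in-memory dictionary rather than a graph database.

```python
from collections import deque

# Hypothetical flow for a question-and-answer site: each key is a step and
# each value lists the steps a visitor can move to next.
flow = {
    "ask_question": ["await_answers"],
    "await_answers": ["read_answer"],
    "read_answer": ["accept_answer", "ask_follow_up"],
    "ask_follow_up": ["await_answers"],
    "accept_answer": [],
}


def reachable(graph, start):
    """Return every step reachable from start, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        step = queue.popleft()
        for nxt in graph.get(step, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


print(reachable(flow, "ask_question"))
```

The loop from `ask_follow_up` back to `await_answers` is the kind of structure that is awkward to express as a fixed sequence of pages but is just another edge in a graph.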

The source code for the flow-based web question and answer: Flow web demo

The source code for the Heroku version: Flow web demo (Heroku version)

Intro to Distributed Erlang (screencast)

Filed under: Distributed Systems,Erlang — Patrick Durusau @ 7:10 pm

Intro to Distributed Erlang (screencast) by Bryan Hunter.

From the description:

Here’s an introduction to distribution in Erlang. This screencast demonstrates creating three Erlang nodes on a Windows box and one on a Linux box and then connecting them using the one-liner “net_adm:ping” to form a mighty compute cluster.

Topics covered:

  • Using erl to start an Erlang node (an instance of the Erlang runtime system).
  • How to use net_adm:ping to connect four Erlang nodes (three on Windows, one on Linux).
  • Using rpc:call to RickRoll a Linux box from an Erlang node running on a Windows box.
  • Using nl to load (deploy) a module from one node to all connected nodes.

Not the most powerful cluster but a good way to learn distributed Erlang.

Neo4j Spring Data & Scala

Filed under: Neo4j,Scala,Spring Data — Patrick Durusau @ 7:10 pm

Neo4j Spring Data & Scala by Jan Machacek.

From the post:

Spring Data is an excellent tool that generates implementations of repositories using the naming conventions similar to the convention used in the dynamic language runtimes such as Grails and Ruby on Rails. In this post, I am going to show you how to use Spring Data in your Scala code.

In this post, we will construct a trivial application that uses Spring Data Neo4j to persist simple User objects. The only difference is that we’ll use Scala throughout and highlight some of the sticky points of Spring Data in Scala.

The post seeks to illustrate that Spring remains relevant, even after the advent of Scala.

It does that, but code adoption, like the application of security patches, is a mixed bag. Some people are using (read: advocating) the latest releases, some are using useful (read: stable) software, and still others are using older (read: unsupported) software. You are likely to find Neo4j in one or more of those environments. Documentation for each of them would promote usage of Neo4j.

10 Steps to run Spring Data Neo4j at OpenShift

Filed under: Neo4j — Patrick Durusau @ 7:10 pm

10 Steps to run Spring Data Neo4j at OpenShift.

Tomás Augusto Müller writes:

It’s very easy to get Neo4j running on RedHat’s OpenShift Cloud Platform. Note that at the time of writing this, there is no Neo4j cartridge available at OpenShift.

If you are familiar with Heroku, think that a cartridge is like a Heroku Add-on. It plugs functionality into the PaaS environment.

So, how can we use Neo4j at OpenShift? First, remember to always read a README file, if there’s one.

When you create a project at OpenShift, a README file can be located at the root directory of your app.

The post continues with the steps you need to get running at OpenShift.

Community question:

Should the Neo4j documentation have a chapter titled “Neo4j in the Cloud(s)”? If so, which clouds?
