Archive for the ‘JRuby’ Category

Pig as Hadoop Connector, Part Two: HBase, JRuby and Sinatra

Monday, August 27th, 2012

Pig as Hadoop Connector, Part Two: HBase, JRuby and Sinatra by Russell Jurney.

From the post:

Hadoop is about freedom as much as scale: providing you disk spindles and processor cores together to process your data with whatever tool you choose. Unleash your creativity. Pig as duct tape facilitates this freedom, enabling you to connect distributed systems at scale in minutes, not hours. In this post we’ll demonstrate how you can turn raw data into a web service using Hadoop, Pig, HBase, JRuby and Sinatra. In doing so we will demonstrate yet another way to use Pig as connector to publish data you’ve processed on Hadoop.

When (not if) the next big cache of emails or other “sensitive” documents drops, everyone who has followed this and similar tutorials should be ready.

Jogger: almost like named_scopes

Sunday, March 4th, 2012

Jogger: almost like named_scopes

From the post:

We talked about graph databases in this and this blog post. As you might have read, we’re big fans of a graph database called neo4j, and we’re using it together with JRuby. In this post we’ll share a little piece of code we created to make expressing graph traversals super easy and fun.

Jogger – almost like named_scopes

Jogger is a JRuby gem that enables lazy people to do very expressive graph traversals with the great pacer gem. If you don’t know what the pacer gem is, you should probably check pacer out first. (And don’t miss the pacer section at the end of the post.)

Remember the named_scopes from back in the days when you were using Rails? Jogger gives you named traversals and is a little bit like named scopes. Jogger groups multiple pacer traversals together and gives them a name. Pacer traversals are like pipes. What are pipes? Pipes are great!!

The most important conceptual difference is that the order in which named traversals are called matters, while it usually doesn’t matter in which order you call named scopes.

A nice way to make common traversals accessible by name.
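To make the point about call order concrete, here is a minimal plain-Ruby sketch. It does not use the Jogger or pacer APIs. The `NamedTraversal` class, the `FOLLOWS` graph and the traversal names `followed` and `short_names` are all hypothetical, invented only to show that chaining the same two named steps in a different order can give a different result.

```ruby
# Plain-Ruby illustration only; this is NOT the Jogger or pacer API.
# Hypothetical toy graph: each person maps to the people they follow.
FOLLOWS = {
  "alice" => ["bob", "carol"],
  "bob"   => ["dave"],
  "carol" => ["dave", "erin"],
  "dave"  => [],
  "erin"  => ["alice"]
}.freeze

# Hypothetical wrapper that exposes traversal steps by name.
class NamedTraversal
  # Each named step takes a list of nodes and returns a new list of nodes.
  TRAVERSALS = {
    # Move from each node to the nodes it follows.
    followed:    ->(nodes) { nodes.flat_map { |n| FOLLOWS.fetch(n, []) }.uniq },
    # Keep only nodes whose name has at most three characters.
    short_names: ->(nodes) { nodes.select { |n| n.length <= 3 } }
  }.freeze

  attr_reader :nodes

  def initialize(nodes)
    @nodes = nodes
  end

  # Define one chainable method per named step.
  TRAVERSALS.each do |name, step|
    define_method(name) { NamedTraversal.new(step.call(@nodes)) }
  end
end

start = NamedTraversal.new(["alice", "bob"])

followed_first = start.followed.short_names.nodes  # => ["bob"]
filtered_first = start.short_names.followed.nodes  # => ["dave"]

puts followed_first.inspect
puts filtered_first.inspect
```

Each step here changes the set of nodes the next step starts from, which is why swapping the calls changes the answer. Filters that only narrow a single result set, as named scopes usually do, give the same result in either order.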

Does the “order of calling” scopes in topic maps matter? At least for the current TMDM, I think not, because scopes are additive. That is, the value covered by a set of scopes must be valid in each scope individually.