Archive for the ‘Redis’ Category

The god Architecture

Saturday, March 9th, 2013

The god Architecture

From the overview:

god is a scalable, performant, persistent, in-memory data structure server. It allows massively distributed applications to update and fetch common data in a structured and sorted format.

Its main inspirations are Redis and Chord/DHash. Like Redis it focuses on performance, ease of use and a small, simple yet powerful feature set, while from the Chord/DHash projects it inherits scalability, redundancy, and transparent failover behaviour.
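For readers who have not met Chord before, the core idea is a hash ring: nodes and keys are hashed onto the same circle, and a key belongs to the first node at or after its position. The sketch below is my own illustration of that idea, not god's actual code; the `Ring` class, the 32-bit ring size and the `replicas` parameter are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a node name or key onto a hypothetical 32-bit ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** 32)


class Ring:
    """Illustrative hash ring; not the data structure god itself uses."""

    def __init__(self, nodes):
        # Sorted (position, node) pairs form the ring.
        self.points = sorted((ring_position(n), n) for n in nodes)

    def owners(self, key: str, replicas: int = 2):
        """Return the successor node for `key`, followed by backup nodes."""
        positions = [p for p, _ in self.points]
        # First node at or after the key's position, wrapping around the ring.
        start = bisect.bisect_left(positions, ring_position(key)) % len(self.points)
        count = min(replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]
```

If the first owner disappears, the next node on the ring is already holding a copy, which is one way to read "transparent failover."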

This is a general architectural overview aimed at somewhat technically inclined readers interested in how and why god does what it does.

To try it out right now, install Go, git, Mercurial and gcc, go get github.com/zond/god/god_server, run god_server, browse to http://localhost:9192/.

For API documentation, go to http://go.pkgdoc.org/github.com/zond/god.

For the source, go to https://github.com/zond/god

I know, “in memory” means it’s not “web scale,” but to be honest, I have a lot of data needs that aren’t “web scale.”

There, I’ve said it. Some (most?) important data is not “web scale.”

And when it is, I only have to check my spam filter for options to deal with “web scale” data.

The set operations in particular look quite interesting.

Enjoy!

I first saw this in Nat Torkington’s Four short links: 1 March 2013.

Redis Data Structure Cheatsheet

Tuesday, February 26th, 2013

Redis Data Cheatsheet by Brian P O’Rourke.

From the post:

Redis data structures are simple – none of them are likely to be a perfect match for the problem you’re trying to solve. But if you pick the right initial structure for your data, Redis commands can guide you toward efficient ways to get what you need.

Here’s our standard reference table for Redis datatypes, their most common uses, and their most common misuses. We’ll have follow-up posts with more details, specific use-cases (and code), but this is a handy reference:

I created a PDF version of the Redis Datatypes — Uses and Misuses.

I thought it would be easier to reference than a bookmarked post. Any errors introduced are solely my responsibility.

I first saw this at: Alex Popescu’s Redis – Pick the Right Data Structure.

Apache Camel meets Redis

Saturday, February 23rd, 2013

Apache Camel meets Redis by Bilgin Ibryam.

From the post:

The Lamborghini of Key-Value stores

Camel is the best-of-breed integration framework and in this post I’m going to show you how to make it even more powerful by leveraging another great project – Redis. Camel 2.11 is on its way to be released soon with lots of new features, bug fixes and components. A couple of these new components are authored by me, redis-component being my favourite one. Redis, a light key/value store, is an amazing piece of Italian software designed for speed (same as Lamborghini – a two-seater Italian car designed for speed). Written in C and having an in-memory, closer-to-the-metal nature, Redis performs extremely well (Lamborghini’s motto is “Closer to the Road”). Redis is often referred to as a data structure server since keys can contain strings, hashes, lists and sorted sets. A fast and light data structure server is like a super sports car for software engineers – it just flies. If you want to find out more about Redis’ and Lamborghini’s unique performance characteristics, google around and you will see for yourself.

Idempotent Repository

The term idempotent is used in mathematics to describe a function that produces the same result if it is applied to itself. In messaging this concept translates into a message that has the same effect whether it is received once or multiple times. In Camel this pattern is implemented using the IdempotentConsumer class, which uses an Expression to calculate a unique message ID string for a given message exchange; this ID can then be looked up in the IdempotentRepository to see if it has been seen before; if it has, the message is consumed (swallowed as a duplicate); if it has not, the message is processed and the ID is added to the repository. RedisIdempotentRepository uses a set structure to store and check for existing IDs.
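The pattern is small enough to sketch. What follows is a minimal Python illustration, not Camel's Java API: the `InMemoryIdempotentRepository` class and the `consume` function are hypothetical names, and a plain Python set stands in for the Redis set.

```python
class InMemoryIdempotentRepository:
    """Stand-in for a Redis set; add() reports whether the ID was new."""

    def __init__(self):
        self._seen = set()

    def add(self, message_id: str) -> bool:
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


def consume(messages, repository, message_id_of, handler):
    """Hand each message to `handler` once; swallow repeats of a seen ID."""
    for message in messages:
        if repository.add(message_id_of(message)):
            handler(message)
```

With a real Redis server behind it, the membership check and the insert would collapse into a single set-add call, which is presumably why a set is the natural structure here.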

If you have or are considering a message passing topic map application, this may be of interest.

Pig, ToJson, and Redis to publish data with Flask

Saturday, February 16th, 2013

Pig, ToJson, and Redis to publish data with Flask by Russell Jurney.

From the post:

Pig can easily stuff Redis full of data. To do so, we’ll need to convert our data to JSON. We’ve previously talked about pig-to-json in JSONize anything in Pig with ToJson. Once we convert our data to json, we can use the pig-redis project to load redis.

What do you think?

Something “lite” to test a URI dictionary locally?

Redis on Windows Azure

Monday, January 21st, 2013

One step closer to full support for Redis on Windows, MS Open Tech releases 64-bit and Azure installer by Claudio Caldato.

From the post:

I’m happy to report new updates today for Redis on Windows Azure: the open-source, networked, in-memory, key-value data store. We’ve released a new 64-bit version that gives developers access to the full benefits of an extended address space. This was an important step in our journey toward full Windows support. You can download it from the Microsoft Open Technologies github repository.

Last April we announced the release of an important update for Redis on Windows: the ability to mimic the Linux Copy On Write feature, which enables your code to serve requests while simultaneously saving data on disk.

Along with 64-bit support, we are also releasing a Windows Azure installer that enables deployment of Redis on Windows Azure as a PaaS solution using a single command line tool. Instructions on using the tool are available on this page and you can find a step-by-step tutorial here. This is another important milestone in making Redis work great on the Windows and Windows Azure platforms.

We are happy to communicate that we are now using the Microsoft Open Technologies public github repository as our main go-to SCM so the community will be able to follow what is happening more closely and get involved in our project.

Is it just me or does it seem like technology is getting easier to deploy?

Perhaps my view is jaded by doing Linux installs with raw write 1.44 MB floppies and editing boot sectors at the command line. 😉

If you like Redis or Azure, either way this is welcome news!

Autocomplete Search with Redis

Sunday, December 9th, 2012

Autocomplete Search with Redis

From the post:

When we launched GetGlue HD, we built a faster and more powerful search to help users find the titles they were looking for when they want to check-in to their favorite shows and movies as they typed into the search box. To accomplish that, we used the in-memory data structures of the Redis data store to build an autocomplete search index.

Search Goals

The results we wanted to autocomplete for are a little different than the usual result types. The Auto complete with Redis writeup by antirez explores using the lexicographical ordering behavior of sorted sets to autocomplete for names. This is a great approach for things like usernames, where the prefix typed by the user is also the prefix of the returned results: typing mar could return Mara, Marabel, and Marceline. The deal-breaking limitation is that it will not return Teenagers From Mars, which is what we want our autocomplete to be able to do when searching for things like show and movie titles. To do that, we decided to roll our own autocomplete engine to fit our requirements. (Updated the link to the “Auto complete with Redis” post.)
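To make the difference concrete, here is a small sketch that indexes the prefixes of every word in a title rather than only the prefix of the whole title. It is my own illustration with plain Python dictionaries, not GetGlue's engine; `build_index` and `autocomplete` are hypothetical names.

```python
from collections import defaultdict


def build_index(titles, min_prefix=1):
    """Map every prefix of every word in a title to the titles containing it."""
    index = defaultdict(set)
    for title in titles:
        for word in title.lower().split():
            for end in range(min_prefix, len(word) + 1):
                index[word[:end]].add(title)
    return index


def autocomplete(index, typed):
    """Return matching titles in alphabetical order."""
    return sorted(index.get(typed.lower(), set()))
```

The price is index size: every word contributes one entry per prefix, which is the sort of trade an in-memory store makes tolerable.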

Rather like the idea of autocomplete being more than just string completion.

What if while typing a name, “autocompletion” returns one or more choices for what it thinks you may be talking about? With additional properties/characteristics, you can disambiguate your usage by allowing your editor to tag the term.

Perhaps another way to ease the burden of authoring a topic map.

Redis 2.6.2 Released!

Friday, October 26th, 2012

Redis 2.6.2 Released!

From the introduction to Redis:

Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.

You can run atomic operations on these types, like appending to a string; incrementing the value in a hash; pushing to a list; computing set intersection, union and difference; or getting the member with highest ranking in a sorted set.

In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log.

Redis also supports trivial-to-setup master-slave replication, with very fast non-blocking first synchronization, auto-reconnection on net split and so forth.

Other features include a simple check-and-set mechanism, pub/sub and configuration settings to make Redis behave like a cache.

You can use Redis from most programming languages out there.

Redis is written in ANSI C and works in most POSIX systems like Linux, *BSD, OS X without external dependencies. Linux and OSX are the two operating systems where Redis is developed and more tested, and we recommend using Linux for deploying. Redis may work in Solaris-derived systems like SmartOS, but the support is best effort. There is no official support for Windows builds, although you may have some options.

The “in-memory” nature of Redis will be a good excuse for more local RAM. 😉

I noticed the most recent release of Redis at Alex Popescu’s myNoSQL.

Memobot

Friday, October 5th, 2012

Memobot

From the webpage:

Memobot is a data structure server written in clojure. It speaks Redis protocol, so any standard redis client can work with it.

Of interest if you care about data structures, Clojure or both.

r3 redistribute reduce reuse

Monday, August 6th, 2012

r3 redistribute reduce reuse

From the project homepage:

r³ is a map-reduce engine written in python using redis as a backend

r³ is a map reduce engine written in python using a redis backend. Its purpose is to be simple.

r³ has only three concepts to grasp: input streams, mappers and reducers.
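Those three concepts fit in a few lines. The sketch below is a single-process word count that only illustrates the roles of an input stream, a mapper and a reducer; it does not use r³'s API or Redis, and every name in it is made up for the example.

```python
from collections import defaultdict


def input_stream(lines):
    """Input stream: yields one unit of work at a time."""
    for line in lines:
        yield line


def mapper(line):
    """Mapper: turns one unit of work into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1


def reducer(key, values):
    """Reducer: folds all values collected for one key."""
    return sum(values)


def run(stream, mapper, reducer):
    """Group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for item in stream:
        for key, value in mapper(item):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}
```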

You need to visit this project. It is simple, efficient and effective.

I found this by following “r³ – A quick demo of usage,” which I found at: Demoing the Python-Based Map-Reduce R3 Against GitHub Data, Alex Popescu’s myNoSQL.

Masstree – Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached

Tuesday, May 1st, 2012

Masstree – Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached

From the post:

The EuroSys 2012 system conference has an excellent live blog summary of their talks for: Day 1, Day 2, Day 3 (thanks Henry at the Paper Trail blog). Summaries for each of the accepted papers are here.

One of the more interesting papers from a NoSQL perspective was Cache Craftiness for Fast Multicore Key-Value Storage, a wonderfully detailed description of the low level techniques used to implement Masstree:

A storage system specialized for key-value data in which all data fits in memory, but must persist across server restarts. It supports arbitrary, variable-length keys. It allows range queries over those keys: clients can traverse subsets of the database, or the whole database, in sorted order by key. On a 16-core machine Masstree achieves six to ten million operations per second on parts A–C of the Yahoo! Cloud Serving Benchmark benchmark, more than 30x as fast as VoltDB [5] or MongoDB [2].

An inspiration for anyone pursuing pure performance in the key-value space.

As the authors note when comparing Masstree to other systems:

Many of these systems support features that Masstree does not, some of which may bottleneck their performance. We disable other systems’ expensive features when possible.

The lesson here is to not buy expensive features unless you need them.

First Light – MS Open Tech: Redis on Windows

Saturday, April 28th, 2012

First Light – MS Open Tech: Redis on Windows

Claudio Caldato writes:

The past few weeks have been very busy in our offices as we announced the creation of Microsoft Open Technologies, Inc. Now that the dust has settled it’s time for us to resume our regular cadence in releasing code, and we are happy to share with you the very first deliverable from our new company: a new and significant iteration of our work on Redis on Windows, the open-source, networked, in-memory, key-value data store.

The major improvements in this latest version involve the process of saving data on disk. Redis on Linux uses an OS feature called Fork/Copy On Write. This feature is not available on Windows, so we had to find a way to be able to mimic the same behavior without changing completely the save on disk process so as to avoid any future integration issues with the Redis code.

Excellent news!

BTW, Microsoft Open Technologies has a presence on Github. Just the one project (Redis on Windows) but I am sure more will follow.

Related

Thursday, March 29th, 2012

Related

From the webpage:

Related

Related is a Redis-backed high performance distributed graph database.

Raison d’être

Related is meant to be a simple graph database that is fun, free and easy to use. The intention is not to compete with “real” graph databases like Neo4j, but rather to be a replacement for a relational database when your data is better described as a graph. For example when building social software. Related is very similar in scope and functionality to Twitter’s FlockDB, but is among other things designed to be easier to set up and use. Related also has better documentation and is easier to hack on. The intention is to be web scale, but we ultimately rely on the ability of Redis to scale (using Redis Cluster for example). Read more about the philosophy behind Related in the Wiki.

Well, which is it?

A “Redis-backed high performance distributed graph database,”

or

“…not to compete with “real” graph databases like Neo4j….?”

If the intent is to have a “web scale” distributed graph database, then it will be competing with other graph database products.

If you are building a graph database, keep an eye on René Pickhardt’s blog for notices about the next meeting of his graph reading club.

Stash

Thursday, March 8th, 2012

Stash by Nate Kohari.

From the post:

Stash is a graph-based cache for Node.js powered by Redis.

Warning! Stash is just a mental exercise at this point. Feedback is very much appreciated, but using it in production may cause you to contract ebola or result in global thermonuclear war.

Overview

“There are only two hard things in computer science: cache invalidation and naming things.”
— Phil Karlton

One of the most difficult parts about caching is managing dependencies between cache entries. In order to reap the benefits of caching, you typically have to denormalize the data that’s stored in the cache. Since data from child items is then stored within parent items, it can be challenging to figure out what entries to invalidate in the cache in response to changes in data.
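The dependency problem is easier to see in code. This is a minimal in-memory sketch of the idea of tracking which entries embed which others; it is not Stash's API (Stash targets Node.js and Redis), and the `GraphCache` class and its method names are hypothetical.

```python
from collections import defaultdict


class GraphCache:
    """Toy cache that remembers which entries embed data from which others."""

    def __init__(self):
        self.values = {}
        # child key -> keys of parent entries that contain the child's data
        self.dependents = defaultdict(set)

    def set(self, key, value, depends_on=()):
        self.values[key] = value
        for child in depends_on:
            self.dependents[child].add(key)

    def get(self, key):
        return self.values.get(key)

    def invalidate(self, key):
        """Drop `key` and every entry that transitively depends on it."""
        pending = [key]
        visited = set()
        while pending:
            current = pending.pop()
            if current in visited:
                continue
            visited.add(current)
            self.values.pop(current, None)
            pending.extend(self.dependents.get(current, ()))
        return visited
```

Changing one user record then invalidates every post and feed that embedded it, without anyone having to remember the list by hand.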

As Nate says, a thought experiment but an interesting one.

From a topic map perspective, I don’t know that I would consider cache invalidation and naming things as two distinct problems. Or rather, the same problem under different constraints.

If you don’t think “cache invalidation” is related to naming, what sort of problem is it when a person’s name changes upon marriage? Isn’t a stored record “cached?” It may not be a cache in the sense of the cache in an online service or chip, but those are the special cases, aren’t they?

Redis (and Jedis) – delightfully simple and focused NoSQL

Sunday, March 4th, 2012

Redis (and Jedis) – delightfully simple and focused NoSQL by Ashwin Jayaprakash.

A very nice reminder that Redis may be the solution you need.

Redis is an open source NoSQL project that I had not paid much attention to. Largely because it didn’t seem very special at the time nor did it have a good persistence and paging story. Also, there is/was so much noise out there and the loudest among them being Memcached, Hadoop, Cassandra, Voldemort, Riak, MongoDB etc that it slipped my mind.

Last weekend I thought I’d give Redis another try. This time I just wanted to see Redis for what it is and not compare it with other solutions. So, as it says on the site:

Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.

Seemed interesting enough to warrant another look. There are so many projects that need:

  • Simple, fast, light
  • In-memory (with optional checkpointing)
  • Fault tolerant / Sharded / Distributed
  • Shared access from many processes and machines
  • Some real data structures instead of just wimpy key-value
  • Flexible storage format – without needing crummy layers to hide/overcome limitations
  • Clean Java API

So, I downloaded the Windows port of Redis and Jedis JAR for the Java API.

  1. Unzip Redis Windows zip file 
  2. Copy the Jedis JAR file
  3. Go to the 64bit or 32bit folder and start “redis-server.exe”
  4. Write a simple Java program that uses Jedis to talk to the Redis server
  5. That’s it

The Little Redis Book

Tuesday, January 24th, 2012

The Little Redis Book by Karl Seguin.

Weighs in at 29 pages and does a good job of creating an interest in knowing more about Redis.

Seguin is also the author of The Little MongoDB Book (which comes in at 32 pages).

Redis in Practice: Who’s Online?

Friday, December 9th, 2011

Redis in Practice: Who’s Online?

From the post:

Redis is one of the most interesting of the NOSQL solutions. It goes beyond a simple key-value store in that keys’ values can be simple strings, but can also be data structures. Redis currently supports lists, sets and sorted sets. This post provides an example of using Redis’ Set data type in a recent feature I implemented for Weplay, our social youth sports site.
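One common way to answer "who's online?" with sets is to keep one set of user IDs per minute and take the union of the most recent few. The sketch below shows that approach with Python sets standing in for Redis sets; I am assuming this design for illustration, it is not taken from the Weplay post, and `OnlineTracker` is a hypothetical name.

```python
from collections import defaultdict


class OnlineTracker:
    """One set of user IDs per minute; 'online' is the union of recent sets."""

    def __init__(self, window_minutes=5):
        self.window_minutes = window_minutes
        self.buckets = defaultdict(set)  # minute number -> user IDs seen

    def touch(self, user_id, now_seconds):
        """Record activity for a user at the given time (in seconds)."""
        self.buckets[int(now_seconds // 60)].add(user_id)

    def online(self, now_seconds):
        """Union of the sets for the current minute and the minutes before it."""
        current = int(now_seconds // 60)
        recent = (self.buckets.get(current - i, set())
                  for i in range(self.window_minutes))
        return set().union(*recent)
```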

See, having complex key values isn’t all that weird.

Seven Databases in Seven Weeks now in Beta

Thursday, December 1st, 2011

Seven Databases in Seven Weeks now in Beta

From the webpage:

Redis, Neo4J, Couch, Mongo, HBase, Riak, and Postgres: with each database, you’ll tackle a real-world data problem that highlights the concepts and features that make it shine. You’ll explore the five data models employed by these databases: relational, key/value, columnar, document, and graph. See which kinds of problems are best suited to each, and when to use them.

You’ll learn how MongoDB and CouchDB, both JavaScript powered, document oriented datastores, are strikingly different. Learn about the Dynamo heritage at the heart of Riak and Cassandra. Understand MapReduce and how to use it to solve Big Data problems.

Build clusters of servers using scalable services like Amazon’s Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that’s more than the sum of its parts, or find one that meets all your needs at once.

Seven Databases in Seven Weeks will give you a broad understanding of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.

Now in beta, in non-DRM PDF, epub, and mobi from pragprog.com/book/rwdata.

If you know the Seven Languages in Seven Weeks by Bruce Tate, no further recommendation is necessary for the approach.

I haven’t read the book, yet, but will be getting the electronic beta tonight. More to follow.

Mneme: Scalable Duplicate Filtering Service

Saturday, November 12th, 2011

Mneme: Scalable Duplicate Filtering Service

From the post:

Detecting and dealing with duplicates is a common problem: sometimes we want to avoid performing an operation based on this knowledge, and at other times, like in a case of a database, we may want to only permit an operation based on a hit in the filter (ex: skip disk access on a cache miss). How do we build a system to solve the problem? The solution will depend on the amount of data, frequency of access, maintenance overhead, language, and so on. There are many ways to solve this puzzle.

In fact, that is the problem – there are too many ways. Having reimplemented at least half a dozen solutions in various languages and with various characteristics at PostRank, we arrived at the following requirements: we want a system that is able to scale to hundreds of millions of keys, we want it to be as space efficient as possible, have minimal maintenance, provide low latency access, and impose no language barriers. The tradeoff: we will accept a certain (customizable) degree of error, and we will not persist the keys forever.

Mneme: Duplicate filter & detection

Mneme is an HTTP web-service for recording and identifying previously seen records – aka, duplicate detection. To achieve the above requirements, it is implemented via a collection of bloomfilters. Each bloomfilter is responsible for efficiently storing the set membership information about a particular key for a defined period of time. Need to filter your keys for the trailing 24 hours? Mneme can create and automatically rotate 24 hourly filters on your behalf – no maintenance required.
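A rough sketch of the rotating-filter idea follows. It is a toy, in-process version written for illustration: the filter size, the number of hash functions and the class names are my assumptions, and Mneme itself is an HTTP service rather than anything shown here.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=8192, hashes=4):
        self.size_bits = size_bits  # assumed to be a multiple of 8
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive several bit positions from seeded SHA-256 digests of the key.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class RotatingFilter:
    """Keeps one Bloom filter per period and forgets the oldest on rotation."""

    def __init__(self, periods=24):
        self.filters = [BloomFilter() for _ in range(periods)]

    def rotate(self):
        """Call once per period: drop the oldest filter, start a fresh one."""
        self.filters.pop(0)
        self.filters.append(BloomFilter())

    def record(self, key):
        self.filters[-1].add(key)

    def seen(self, key):
        return any(key in f for f in self.filters)
```

Rotation is what gives "for a defined period of time" its meaning: a key is forgotten once the filter it was recorded in falls off the end.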

Interesting in several respects:

  1. Duplicate detection
  2. Duplicate detection for a defined period of time
  3. Duplicate detection for a defined period of time with “customizable” degree of error

Would depend on your topic map project requirements. Assuming absolute truth forever and ever isn’t one of them, detecting duplicate subject representatives for some time period at a specified error rate may be the concepts you are looking for.

Enables a discussion of how much certainty (error rate) for how long (time period) for detection of duplicates (subject representatives) on what basis? All of those are going to impact project complexity and duration.

Interesting as well as a solution that will work quite well for some duplicate detection requirements.

Redis: Zero to Master in 30 minutes – Part 1

Wednesday, November 9th, 2011

Redis: Zero to Master in 30 minutes – Part 1

From the post:

More than once, I’ve said that learning Redis is the most efficient way a programmer can spend 30 minutes. This is a testament to both how useful Redis is and how easy it is to learn. But, is it true, can you really learn, and even master, Redis in 30 minutes?

Let’s try it. In this part we’ll go over what Redis is. In the next, we’ll look at a simple example. Whatever time we have left will be for you to set up and play with Redis.

This is a nice post. Introduces enough of Redis for you to get some idea of its power without overwhelming you with details. Continues with Part 2 by the way.

Redis for processing payments

Saturday, September 3rd, 2011

Redis for processing payments

Not a complete payment or even work-flow system but enough to make you think about how to use Redis in such a situation.

How You Should Go About Learning NoSQL

Thursday, August 18th, 2011

How You Should Go About Learning NoSQL

Interesting post that expands on three rules for learning NoSQL:

1: Use MongoDB.
2: Take 20 minutes to learn Redis.
3: Watch this video to understand Dynamo.

The Beauty of Simplicity: Mastering Database Design Using Redis

Saturday, July 23rd, 2011

The Beauty of Simplicity: Mastering Database Design Using Redis by Ryan Briones.

Not so much teaching database design as illustrating how Redis forces you to think about the structure of the data you are storing.

Covers some Redis commands; others can be found at http://redis.io, along with the Redis distribution.

Use Cases Solved in Redis
(TM Use Cases?)

Thursday, July 7th, 2011

11 Common Web Use Cases Solved in Redis

From the webpage:

In How to take advantage of Redis just adding it to your stack Salvatore ‘antirez’ Sanfilippo shows how to solve some common problems in Redis by taking advantage of its unique data structure handling capabilities. Common Redis primitives like LPUSH, LTRIM, and LREM are used to accomplish tasks programmers need to get done, but that can be hard or slow in more traditional stores. A very useful and practical article. How would you accomplish these tasks in your framework?
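A typical use of LPUSH with LTRIM is a "latest N items" list: push the new item on the front, then trim the list back to a fixed length. The sketch below imitates those two commands on a plain Python list so the behavior can be seen without a server. The helper functions are stand-ins I wrote for illustration (handling non-negative indexes only), not a Redis client.

```python
def lpush(items, value):
    """Like LPUSH: insert at the head and return the new length."""
    items.insert(0, value)
    return len(items)


def ltrim(items, start, stop):
    """Like LTRIM with non-negative indexes: keep items[start..stop], inclusive."""
    items[:] = items[start:stop + 1]


def remember_latest(items, value, limit=3):
    """Keep only the `limit` most recent values, newest first."""
    lpush(items, value)
    ltrim(items, 0, limit - 1)
```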

Good post about Redis and common web use cases.

Occurs to me that I don’t have a similar list for topic maps (whatever software you use) as a technology.

Sure, topic maps apply when you need to have a common locus for information about a subject or need better modeling of relationships, but that’s all rather vague and hand-wavy.

Here are two examples that are more concrete:

The small office supply store on the town square (this is a true story) had its own internal inventory system with numbers, etc. The small store ordered from several larger suppliers, who all had their own names and internal numbers for the same items. A stable mapping wasn’t an option because the numbers used both by the large suppliers (as well as the descriptions) and the manufacturers were subject to change and reuse.

The small office supply store could see the value in a topic map but the cost in employee time to match up the inventory numbers was less than the cost of constructing and maintaining a topic map on top of their internal system. I would say that dynamic inventory control is a topic maps use case.

The other use case involves medical terminology. A doctor I know covers the hospital for an entire local medical practice. He isn’t a specialist in any of the fields covered by the practice so he has to look up the latest medical advances in several fields. Like all of us, he has terms that he learned for particular conditions, which aren’t the ones in the medical databases. So he has trouble searching from time to time.

He recognized the value of a topic map being able to create a mapping between his terminology and the terminology used by the medical database. It would enable him to search more quickly and effectively. Unfortunately the problem, in these economic times, isn’t pinching enough to result in a project. Personalized search interfaces are another topic map use case.

What’s yours?

Writing a Simple Keyword Search Engine Using Haskell and Redis

Saturday, June 11th, 2011

Writing a Simple Keyword Search Engine Using Haskell and Redis

Alex Popescu says this is a good guide to “…translat[ing] logical operators in Redis set commands” which is true, but it is also an entertaining post on writing a search engine.
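The translation is short: AND becomes set intersection (SINTER in Redis), OR becomes union (SUNION), and NOT becomes difference (SDIFF). Here is a sketch in Python with built-in sets standing in for Redis sets; it is my own illustration rather than the Haskell code from the post, and `build_index` and `search` are hypothetical names.

```python
def build_index(documents):
    """Map each word to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, all_of=(), any_of=(), none_of=()):
    """Combine keyword sets the way SINTER, SUNION and SDIFF would."""

    def lookup(word):
        return index.get(word.lower(), set())

    if all_of:
        # AND: intersection of the sets for every required word.
        result = set.intersection(*(lookup(w) for w in all_of))
    else:
        # No required words: start from every indexed document.
        result = set().union(*index.values())
    if any_of:
        # OR: keep documents that contain at least one of the words.
        result &= set().union(*(lookup(w) for w in any_of))
    for word in none_of:
        # NOT: remove documents that contain an excluded word.
        result -= lookup(word)
    return result
```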

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison

Thursday, May 12th, 2011

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison

Good thumbnail comparison of the major features of all six (6) NoSQL databases by Kristóf Kovács.

Sorry to see that Neo4J didn’t make the comparison.

TMDM to Redis Schema (paper)

Thursday, April 14th, 2011

Yet another mapping of the Topic Maps Data Model to Redis schema

By Johannes Schmidt:

In this document another mapping of the Topic Maps Data Model (TMDM) [3] to Redis key-value store [8] schema is drafted. An initial mapping [5] of the TMDM to Redis schema has been provided by the Topic Maps Lab of the University of Leipzig [9]. The main motivation is not to design a “better” schema but to simply do a mapping of the TMDM to a key-value store schema. Some valuable enhancements for the Topic Maps Lab schema are created, though.

Possible guide to mapping the TMDM to key-value store databases.

Something to consider would be mapping the TMDM to a graph database.

Do topic, association, and occurrence become nodes?

Redis, from the ground up

Tuesday, March 15th, 2011

Redis, from the ground up

Mark J. Russo:

A deep dive into Redis’ origins, design decisions, feature set, and a look at a few potential applications.

Not all that you would want to know about Redis but enough to develop an appetite for more!

Summify’s Technology Examined

Tuesday, March 8th, 2011

Summify’s Technology Examined

Phil Whelan writes an interesting review of the underlying technology for Summify.

Many of those same components are relevant to the construction of topic map based services.

Interesting that Summify uses MySQL, Redis and MongoDB.

I rather like the idea of using the best tool for a particular job.

Worth a close read.

NoSQL Databases: Why, what and when

Tuesday, March 1st, 2011

NoSQL Databases: Why, what and when by Lorenzo Alberton.

When I posted RDBMS in the Social Networks Age I did not anticipate returning the very next day with another slide deck from Lorenzo. But, after viewing this slide deck, I just had to post it.

It is a very good overview of NoSQL databases and their underlying principles, with useful graphics as well (as opposed to the other kind).

I am going to have to study his graphic technique in hopes of applying it to the semantic issues that are at the core of topic maps.

Auto Completion

Tuesday, February 15th, 2011

Auto-completion is a feature that I find useful in a number of applications.

I suspect users would find that to be the case for topic map authoring and navigation software.

One article to look at is: Auto Complete with Redis.

Which was cited by: Announcing Soulmate, A Redis-Backed Service For Fast Autocompleting

The second item being an application complete with an interface.

From the Soulmate announcement:

Inspired by Auto Complete with Redis, Soulmate uses sorted sets to build an index of partially completed words and the corresponding top matching items, and provides a simple sinatra app to query them.

Here’s a quick overview of what the initial version of Soulmate supports:

  • Provide suggestions for multiple types of items in a single query (at SeatGeek we’re autocompleting for performers, events, and venues)
  • Results are ordered by a user-specified score
  • Arbitrary metadata for each item (at SeatGeek we’re storing both a url and a subtitle)
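The three features in that list (several item types per query, ordering by score, metadata per item) can be sketched as below. This is an in-memory illustration only: the real Soulmate stores its index in Redis sorted sets, and the function names, key layout and sample data here are my assumptions.

```python
from collections import defaultdict


def add_item(index, item_type, term, score, metadata=None):
    """Index `term` under every prefix of each of its words, per item type."""
    for word in term.lower().split():
        for end in range(1, len(word) + 1):
            index[(item_type, word[:end])][term] = (score, metadata or {})


def suggest(index, item_types, typed, limit=5):
    """Return the top-scoring matches for each requested item type."""
    results = {}
    for item_type in item_types:
        matches = index.get((item_type, typed.lower()), {})
        # Highest score first; ties broken alphabetically by term.
        ranked = sorted(matches.items(), key=lambda kv: (-kv[1][0], kv[0]))
        results[item_type] = [(term, meta) for term, (_, meta) in ranked[:limit]]
    return results


# Hypothetical sample data, loosely modeled on the SeatGeek description.
index = defaultdict(dict)
add_item(index, "venue", "Yankee Stadium", 85, {"url": "/yankee-stadium"})
add_item(index, "performer", "New York Yankees", 100)
add_item(index, "performer", "Yanni", 60)
```

The metadata rides along with each suggestion, which is the part that interests me for topic map authoring.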

I rather like the idea of arbitrary metadata.

Could be a utility that presents snippets to paste into a topic map?