NoSQL « Another Word For It

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 3, 2011

Redis for processing payments

Filed under: NoSQL,Redis — Patrick Durusau @ 6:46 pm

Not a complete payment or even work-flow system but enough to make you think about how to use Redis in such a situation.

September 2, 2011

Groonga

Filed under: Column-Oriented,NoSQL,Search Engines,Searching — Patrick Durusau @ 7:54 pm

Groonga

From the webpage:

Groonga is an open-source fulltext search engine and column store. It lets you write high-performance applications that requires fulltext search.

The latest release is 1.2.5, released 2011-08-29.

Most of the documentation is in Japanese so I can’t comment on it.

Think of this as an opportunity to (hopefully) learn some Japanese. Given the rate of computer science research in Japan it will not be wasted effort.

PS: If you already read Japanese, feel free to contribute some comments on Groonga.

August 31, 2011

Couchbase Server 2.0 – Up and Running

Filed under: Couchbase,NoSQL — Patrick Durusau @ 7:46 pm

Couchbase Server 2.0 – Up and Running

A 5 minute video to get Couchbase Server 2.0 up and running.

Almost makes me wish I was a sysadmin again. They never met some of my users. 😉 Note I said almost made me wish. (shudder)

You are all brighter than that and so should have no problems with the five minute limit.

Don’t overlook the sign-up for the tech webinar series.

Curious about Couchbase Server 2.0?

Couchbase Server 2.0 Developer Release is now available! This new release combines the unmatched elastic data management capabilities of Membase Server with the distributed indexing and querying capabilities of Apache CouchDB to deliver the industry’s most powerful, bullet-proof NoSQL database technology.

Come to a series of weekly 30-minute webinars to learn more about the technical details of Couchbase Server 2.0.

This nine-week webinar series will cover:

-Couchbase Server 2.0 overview
-Indexing and querying basics
-SDKs/client libraries (including Moxi Server)
-Development/production View usage
-Advanced indexing and querying
-Clustering and monitoring
-Auto compaction
-Upgrading to 2.0 from Membase Server
-Cross data center replication

Whether you are currently running Couchbase or not, this could be interesting.

Approaching and evaluating NoSQL

Filed under: NoSQL,Use Cases — Patrick Durusau @ 7:37 pm

Approaching and evaluating NoSQL by Mårten Gustafson.

From the webpage:

Brown bag lunch presentation at TUI / Fritidsresor about approaching and evaluating the NoSQL area. Embedded presentation below, downloadable as PDF and Keynote.

A “brown bag lunch” presentation that balances detail with ideas so listeners will follow it with interesting conversations and research.

The “use case” approach lends itself to exploring “why” someone would want to use a NoSQL database as opposed to the usual mantra that NoSQL databases are flexible, scalable and fast.

So? If my data format is fixed, not all that large (under a few terabytes), and I run batch reports, I may not need a NoSQL database. I could; it depends on the facts and use cases. This presentation gets high marks for its “use case” approach.

August 30, 2011

MongoDB 2.0.0-rc0

Filed under: MongoDB,NoSQL — Patrick Durusau @ 7:10 pm

MongoDB 2.0.0-rc0 was released 25 August 2011.

Check out the latest release or download a stable version at:

MongoDB homepage

August 22, 2011

OrientDB v1.0rc5 New

Filed under: NoSQL,OrientDB — Patrick Durusau @ 7:40 pm

OrientDB v1.0rc5: improved index and transactions, better crossing of trees and graphs

Just quickly:

SQL engine: new [] operator to extract items from lists, sets, maps and arrays

SQL engine: ORDER BY works with projection alias

SQL engine: Cross trees and graphs in projections

SQL engine: IN operator uses Index when available

Fixed all known bugs on transaction recovery

Rewritten the memory management of MVRB-Tree: now it’s faster and uses much less RAM

Java 5 compatibility of common and core subprojects

16 issues fixed in total

Full list: http://code.google.com/p/orient/issues/list?can=1&q=label%3Av1.0rc5

August 21, 2011

NoSQL Patterns

Filed under: NoSQL — Patrick Durusau @ 7:08 pm

NoSQL Patterns by Ricky Ho.

Ricky has put together a great summary of what NoSQL solutions have in common. It ranges from the API model to consistent hashing and other NoSQL earmarks. Recommended if you are new to the area.
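
If consistent hashing is new to you, a toy version may help before reading the summary. What follows is my own minimal sketch, not code from Ricky's post: the class name HashRing, the use of SHA-256 and the 64 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a position on the ring (SHA-256 is used for spread, not security)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """A toy consistent-hash ring with virtual nodes (hypothetical, for illustration)."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas  # virtual points per physical node (assumed value)
        self._ring = []           # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each physical node is placed on the ring at several positions.
        for i in range(self.replicas):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # Find the first point at or after the key's position, wrapping at the end.
        idx = bisect.bisect_left(self._ring, (ring_hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

The point of the design: when a node is removed, only the keys that lived on that node move to a new home. Keys on the surviving nodes stay where they were, which is what distinguishes this from hashing modulo the number of servers.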

YeSQL?

Filed under: NoSQL,SQL — Patrick Durusau @ 7:07 pm

Perspectives on NoSQL by Gavin M. Roy.

I don’t remember how I found this presentation but it is quite interesting.

It starts with a review of NoSQL database options, with one-slide summaries of each.

Compares them to PostgreSQL 9.0b1 using KVPBench, http://github.com/gmr/kvpbench.

Concludes that SQL databases perform as well as, if not better than, NoSQL databases.

That really depends on the benchmark or, more importantly, on the use case at hand. Use the most appropriate technology, SQL or not.

Still, I like the slide with database administrators running with scissors. I have always wondered what that would look like. Now I know. It isn’t pretty.

August 18, 2011

How You Should Go About Learning NoSQL

Filed under: Dynamo,MongoDB,NoSQL,Redis — Patrick Durusau @ 6:46 pm

How You Should Go About Learning NoSQL

Interesting post that expands on three rules for learning NoSQL:

1: Use MongoDB.
2: Take 20 minutes to learn Redis.
3: Watch this video to understand Dynamo.

Getting Started with Riak and .Net

Filed under: NoSQL,Riak — Patrick Durusau @ 6:46 pm

Getting Started with Riak and .Net by Adrian Hills.

Short “getting started” guide. The installation was on Ubuntu and then he connects to the server with a .Net client.

I wondered about the statement that Riak would not run on Windows (there are no pre-compiled binaries for Windows). Stack Overflow has a thread, Riak on Windows, with several options for running Riak on a Windows system: compile under Windows, use Cygwin, or run VMware or VirtualBox and run Riak inside a Linux VM.

August 17, 2011

What’s New in MySQL 5.6 – Part 1: Overview – Webinar 18 August 2011

Filed under: MySQL,NoSQL,SQL — Patrick Durusau @ 6:54 pm

What’s New in MySQL 5.6 – Part 1: Overview

From the webpage:

MySQL 5.6 builds on Oracle’s investment in MySQL by adding improvements to Performance, InnoDB, Replication, Instrumentation and flexibility with NoSQL (Not Only SQL) access. In the first session of this 5-part Webinar series, we’ll cover the highlights of those enhancements to help you begin the development and testing efforts around the new features and improvements that are now available in the latest MySQL 5.6 Development Milestone and MySQL Labs releases.

OK, I’ll ‘fess up, I haven’t kept up with MySQL like I did when I was a sysadmin and running it every day in a production environment. So, maybe it’s time to do some catching up.

Besides, when you read:

We will also explore how you can now use MySQL 5.6 as a “Not Only SQL” data source for high performance key-value operations by leveraging the new Memcached Plug-in to InnoDB, running simultaneously with SQL for more complex queries, all across the same data set.

“…SQL for more complex queries,…” you almost have to look. 😉

So, get up early tomorrow and throw a recent copy of MySQL on a box.

August 15, 2011

Index With Performance Close to Linear (was: Index With Linear Performance Close to Constant)

Filed under: NoSQL,OrientDB — Patrick Durusau @ 7:32 pm

I don’t have a link (yet) but @lgarulli reports that OrientDB’s new index has a measured growth factor of 0,000006 per entry stored.

Will update when more information becomes available.

See OrientDB.

Although I suspect you will need the sources for the new index: OrientDB sources.

Lars suggested the title correction when this post appeared but I never quite got around to changing it. I have preserved the original as I dislike content that changes underfoot.

August 14, 2011

Planet Cassandra

Filed under: Cassandra,NoSQL — Patrick Durusau @ 7:13 pm

Planet Cassandra

Aggregation of feeds on Cassandra. If you need to follow Cassandra closely, this would be among your first stops.

Graph Databases, NoSQL and Neo4j

Filed under: Graphs,Neo4j,NoSQL — Patrick Durusau @ 7:11 pm

Graph Databases, NoSQL and Neo4j by Peter Neubauer.

From the post:

Of the many different datamodels, the relational model has been dominating since the 80s, with implementations like Oracle, MySQL and MSSQL – also known as Relational Database Management System (RDBMS). Lately, however, in an increasing number of cases the use of relational databases leads to problems both because of Deficits and problems in the modeling of data and constraints of horizontal scalability over several servers and big amounts of data. There are two trends that bringing these problems to the attention of the international software community:

The exponential growth of the volume of data generated by users, systems and sensors, further accelerated by the concentration of large part of this volume on big distributed systems like Amazon, Google and other cloud services.

The increasing interdependency and complexity of data, accelerated by the Internet, Web2.0, social networks and open and standardized access to data sources from a large number of different systems.

The relational databases have increasing problems to cope with these trends. This has led to a number of different technologies targeting special aspects of these problems, which can be used together or alternatively to the existing RDBMS – also know as Polyglot Persistence. Alternative databases are nothing new, they have been around for a long time in the form of e.g. Object Databases (OODBMS), Hierarchical Databases (e.g. LDAP) and many more. But during the last few years a large number of new projects have been started which together are known under the name NOSQL-databases.

This article aims to give an overview of the position of Graph Databases in the NOSQL-movement. The second part is an introduction to Neo4j, a Java-based Graph Database.

Excellent article with heavy cross-linking to additional information.

August 11, 2011

Cassandra: Introduction for System Administrators

Filed under: Cassandra,NoSQL — Patrick Durusau @ 6:32 pm

Cassandra: Introduction for System Administrators by Nathan Milford.

Introductory slide deck for administrators interested in Cassandra (or being asked to participate in its use).

August 10, 2011

LevelDB – Update

Filed under: leveldb,NoSQL — Patrick Durusau @ 7:16 pm

LevelDB – Fast and Lightweight Key/Value Database From the Authors of MapReduce and BigTable

From the post:

LevelDB is an exciting new entrant into the pantheon of embedded databases, notable both for its pedigree, being authored by the makers of the now mythical Google MapReduce and BigTable products, and for its emphasis on efficient disk based random access using log-structured-merge (LSM) trees.

The plan is to keep LevelDB fairly low-level. The intention is that it will be a useful building block for higher-level storage systems. Basho is already investigating using LevelDB as one if its storage engines.

Includes a great summary of information from the LevelDB mailing list.

A must read if you are interested in LevelDB.

August 5, 2011

A Storm is coming: more details and plans for release

Filed under: NoSQL,Storm — Patrick Durusau @ 7:07 pm

A Storm is coming: more details and plans for release

Storm is going to be released at Strange Loop on September 19!

From the post:

Here’s a recap of the three broad use cases for Storm:

Stream processing: Storm can be used to process a stream of new data and update databases in realtime. Unlike the standard approach of doing stream processing with a network of queues and workers, Storm is fault-tolerant and scalable.

Continuous computation: Storm can do a continuous query and stream the results to clients in realtime. An example is streaming trending topics on Twitter into browsers. The browsers will have a realtime view on what the trending topics are as they happen.

Distributed RPC: Storm can be used to parallelize an intense query on the fly. The idea is that your Storm topology is a distributed function that waits for invocation messages. When it receives an invocation, it computes the query and sends back the results. Examples of Distributed RPC are parallelizing search queries or doing set operations on large numbers of large sets.

The beauty of Storm is that it’s able to solve such a wide variety of use cases with just a simple set of primitives.

The really exciting part about all the current frenzy of development is imagining where it is going to be five (5) years from now.

August 3, 2011

Optimizing Distributed Read Operations in VoltDB

Filed under: NoSQL,VoltDB — Patrick Durusau @ 7:37 pm

Optimizing Distributed Read Operations in VoltDB

From the post:

Many VoltDB applications, such as gaming leader boards and real-time analytics, use multi-partition procedures to compute consistent global aggregates (and other interesting statistics). It’s challenging to efficiently process distributed reads operations, especially for performance sensitive applications. Based on feedback from our users, we in VoltDB engineering have been enhancing the VoltDB SQL planner over the last few releases to improve this capability.

Executing global aggregates efficiently requires calculating sub-results at each partition replica and combining the sub-results at a coordinating partition to produce the final result. For example, to calculate a total sum, the VoltDB planner should produce a sub-total at each partition and then sum the sub-totals at the coordinator node. All of this work must be transparent to the application, of course.

Hmmm, “global aggregates,” doesn’t that sound familiar? I realize that here it means summing up the number of “kills,” “votes,” etc., simple number stuff, but in principle, what you return and how you sum it is, I would think, application specific. Yes?
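
To make the sub-result idea concrete, here is a small sketch in plain Python. It is not VoltDB code and does not use the VoltDB API; the names Partial, partial_aggregate and combine are my own inventions. It assumes each partition returns a (total, count) pair rather than a finished answer, which is the sort of application-specific choice I have in mind: an average cannot be rebuilt from per-partition averages alone.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Partial:
    """What one partition sends to the coordinator (hypothetical shape)."""
    total: int
    count: int


def partial_aggregate(rows: Sequence[int]) -> Partial:
    """Runs at each partition: reduce local rows to a small sub-result."""
    return Partial(total=sum(rows), count=len(rows))


def combine(partials: Iterable[Partial]) -> Tuple[int, int, Optional[float]]:
    """Runs at the coordinator: merge sub-results into the global answer."""
    total = 0
    count = 0
    for p in partials:
        total += p.total
        count += p.count
    # The mean is derived only after merging; averaging the partitions'
    # own means would give the wrong answer when partitions differ in size.
    mean = total / count if count else None
    return total, count, mean


if __name__ == "__main__":
    partitions = [[3, 5, 7], [10], []]  # made-up per-partition scores
    print(combine(partial_aggregate(rows) for rows in partitions))
```

Swap in a different sub-result (a set of identifiers, say, merged by union) and the same two-step shape still holds, but what you return and how you combine it changes with the application.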

Consistency or Bust: Breaking a Riak Cluster

Filed under: NoSQL,Riak — Patrick Durusau @ 7:36 pm

Consistency or Bust: Breaking a Riak Cluster by Jeff Kirkell.

Not your usual slidedeck.

Has enough examples and working instructions for you to actually learn something separate from the presentation.

Perhaps the one time you will be glad someone broke the rule about not putting text for the audience to read on a slide.

July 31, 2011

Riak and Python – 2nd August 2011

Filed under: NoSQL,Riak — Patrick Durusau @ 7:49 pm

Riak and Python

A free webinar sponsored by Basho.

Dates and times:

Tuesday, August 2, 2011 2:00 pm, Eastern Daylight Time (New York, GMT-04:00)
Tuesday, August 2, 2011 11:00 am, Pacific Daylight Time (San Francisco, GMT-07:00)
Tuesday, August 2, 2011 8:00 pm, Europe Summer Time (Berlin, GMT+02:00)

Includes:

Building and Deploying a Simple imgur.com Clone Using Riak, Luwak, and Riak Search

You know where to get Riak and Riak Search. Luwak is at: https://github.com/basho/luwak.

NoSQL NOW! August 23-25 San Jose

Filed under: Conferences,NoSQL — Patrick Durusau @ 7:48 pm

NoSQL NOW! August 23-25 San Jose

OK, it’s August in San Jose, CA and not at the DoubleTree Hotel (the usual Unicode conference site).

I think those are the only two negatives you can find about this conference!

Take a look at the program if you don’t want to take my word for it.

It isn’t clear if conference presentations will be posted and maintained as informal proceedings. This would be a good opportunity to start collecting that sort of thing.

July 30, 2011

Hypertable 0.9.5.0 Binary Packages

Filed under: Hypertable,NoSQL — Patrick Durusau @ 9:11 pm

Hypertable 0.9.5.0 Binary Packages (download)

New release of Hypertable!

Change notes.

Couchbase Server 2.0

Filed under: CouchDB,NoSQL — Patrick Durusau @ 9:09 pm

Couchbase Releases Flagship NoSQL Database, Couchbase Server 2.0

From the release:

SAN FRANCISCO, Calif. – CouchConf San Francisco – July 29, 2011 – Couchbase, the leading NoSQL database company, today released a developer preview of Couchbase Server 2.0, the company’s high-performance, highly scalable, document-oriented NoSQL database. Couchbase Server 2.0 combines the unmatched elastic data management capabilities of Membase Server with the distributed indexing, querying and mobile synchronization capabilities of Apache CouchDB, the most widely deployed open source document database, to deliver the industry’s most powerful, bullet-proof NoSQL database technology.

The database world just gets more interesting with each passing day!

July 29, 2011

MongoDB Schema Design Basics

Filed under: MongoDB,NoSQL,Schema — Patrick Durusau @ 7:46 pm

MongoDB Schema Design Basics

From Alex Popescu’s myNoSQL:

For NoSQL databases there are no clear rules like the Boyce-Codd Normal Form database normalization. Data modeling and analysis of data access patterns are two fundamental activities. While over the last 2 years we’ve gather some recipes, it’s always a good idea to check what are the recommended ways to model your data with your choice of NoSQL database.

After the break, watch 10gen’s Richard Kreuter’s presentation on MongoDB schema design.

A must see video!

State of HBase

Filed under: HBase,NoSQL — Patrick Durusau @ 7:43 pm

State of HBase by Michael Stack (StumbleUpon).

From the abstract:

Attendees will learn about the current state of the HBase project. We’ll review what the community is contributing, some of the more interesting production installs, killer apps on HBase, the on-again, off-again HBase+HDFS love affair, and what the near-future promises. A familiarity with BigTable concepts and Hadoop is presumed.

Catch the latest news on HBase!

July 27, 2011

NoSQL @ Netflix, Part 2

Filed under: Cassandra,NoSQL,SQL — Patrick Durusau @ 2:17 pm

NoSQL @ Netflix, Part 2 by Sid Anand.

OSCON 2011 presentation.

I think the “RDBMS Concepts to Key-Value Store Concepts” section was the best part of the slide deck.

What do you think?

July 24, 2011

MongoDB and the Democratic Party

Filed under: MongoDB,NoSQL — Patrick Durusau @ 6:46 pm

MongoDB and the Democratic Party – A Case Study by Pramod Sadalage.

Interesting case study for an application that managed contacts of the Democratic Party (US) for fund raising and voter turnout efforts on election day.

Talks about elimination of duplicate records but given the breadth of the talk, the speaker doesn’t go into any detail.

Pay particular attention to the data structure that is created for this project.

Note that any organization can have a different ID for any particular person. That is, a local organization can query by its own identifier and its ID for a person, and it gets back the information on that person. (I assume the IDs used by other organizations are filtered out of the return.)

Granted, it isn’t aggregation of unbounded information for any particular voter from an unknown number of sources, but it is a low-cost solution to the problem of keeping a national ID (for this data set) while providing access via local IDs. That “pattern” could prove to be useful in other cases.
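
The pattern is simple enough to sketch in a few lines of Python. This is not the schema from the talk and it does not touch MongoDB; the class ContactStore, its field names and the sample identifiers are all hypothetical, and the filtering of other organizations' IDs is my assumption about the behavior, as noted above.

```python
from typing import Dict, Optional, Tuple


class ContactStore:
    """Toy in-memory model: one national record, many organization-local IDs."""

    def __init__(self) -> None:
        self._people: Dict[str, dict] = {}            # national_id -> record
        self._index: Dict[Tuple[str, str], str] = {}  # (org, local_id) -> national_id

    def add_person(self, national_id: str, name: str, local_ids: Dict[str, str]) -> None:
        self._people[national_id] = {"name": name, "local_ids": dict(local_ids)}
        for org, local_id in local_ids.items():
            self._index[(org, local_id)] = national_id

    def find(self, org: str, local_id: str) -> Optional[dict]:
        """Look a person up by an organization's own ID for them."""
        national_id = self._index.get((org, local_id))
        if national_id is None:
            return None
        record = self._people[national_id]
        # Return only the caller's local ID; IDs that other organizations
        # use for the same person are left out (assumed behavior).
        return {"national_id": national_id, "name": record["name"], "local_id": local_id}


if __name__ == "__main__":
    store = ContactStore()
    store.add_person("N-1", "Pat Example", {"org-ga": "GA-17", "org-ny": "NY-204"})
    print(store.find("org-ga", "GA-17"))
```

Two organizations can hold different IDs for the same person and still land on the same record, without either one seeing the other's identifier.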

Real World CouchDB

Filed under: CouchDB,NoSQL,Web Applications — Patrick Durusau @ 6:46 pm

Real World CouchDB by John Wood.

Very good overview of CouchDB, including its limitations.

Two parts really caught my attention:

First, the “crash only” design. CouchDB doesn’t shut down; its process is killed. There’s a data integrity test!

Second, the “scale down architecture.” You can run CouchDB plus data on a mobile device. It syncs up when connectivity is restored, but in the meantime an application based on CouchDB can continue working. CouchDB supports delivery of HTML and JavaScript, so it supports basic web apps.

CouchDB looks like a good candidate for delivery of topic map content.

I wanted to include a link to a CouchDB app for the Afghan War Diaries but the site isn’t responding. You can see the source code for the app at: https://github.com/benoitc/afgwardiary.

July 23, 2011

The Beauty of Simplicity: Mastering Database Design Using Redis

Filed under: NoSQL,Redis — Patrick Durusau @ 3:07 pm

The Beauty of Simplicity: Mastering Database Design Using Redis by Ryan Briones.

Not so much teaching database design as illustrating how Redis forces you to think about the structure of the data you are storing.

Covers some Redis commands; others can be found at http://redis.io, along with the Redis distribution.

July 20, 2011

Voldemort V0.9 Released: NIO, Pipelined FSM, Hinted Handoff

Filed under: Key-Value Stores,NoSQL,Voldemort — Patrick Durusau @ 12:54 pm

Voldemort V0.9 Released: NIO, Pipelined FSM, Hinted Handoff

From Alex Popescu’s myNoSQL, links to commentary on the latest release.