Archive for the ‘CQL – Cassandra Query Language’ Category

CQL Under the Hood

Saturday, September 13th, 2014

CQL Under the Hood by Robbie Strickland.

Description:

As a reformed CQL critic, I’d like to help dispel the myths around CQL and extol its awesomeness. Most criticism comes from people like me who were early Cassandra adopters and are concerned about the SQL-like syntax, the apparent lack of control, and the reliance on a defined schema. I’ll pop open the hood, showing just how the various CQL constructs translate to the underlying storage layer–and in the process I hope to give novices and old-timers alike a reason to love CQL.

Slides from CassandraSummit 2014

Best viewed with a running instance of Cassandra.
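To make the "under the hood" idea concrete, here is a minimal sketch in plain Python (not Cassandra code, and not taken from the slides) of how the pre-3.0 storage engine laid out a CQL table with a compound primary key: the partition key becomes the storage row key, and each regular column becomes a cell whose name is prefixed with the clustering column value. The table, the column names, and the `to_storage_cells` helper are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical CQL table, for illustration only:
#   CREATE TABLE readings (sensor text, ts int, temp int, humidity int,
#                          PRIMARY KEY (sensor, ts));
PARTITION_KEY = "sensor"
CLUSTERING_KEY = "ts"


def to_storage_cells(cql_rows):
    """Group CQL rows into one wide storage row per partition key.

    Each regular column becomes a cell named (clustering value, column name),
    and the cells are sorted by name within their partition.
    """
    partitions = defaultdict(dict)
    for row in cql_rows:
        for column, value in row.items():
            if column in (PARTITION_KEY, CLUSTERING_KEY):
                continue  # key columns are encoded in the row key / cell name
            cell_name = (row[CLUSTERING_KEY], column)
            partitions[row[PARTITION_KEY]][cell_name] = value
    return {key: sorted(cells.items()) for key, cells in partitions.items()}


rows = [
    {"sensor": "s1", "ts": 2, "temp": 21, "humidity": 40},
    {"sensor": "s1", "ts": 1, "temp": 20, "humidity": 41},
    {"sensor": "s2", "ts": 1, "temp": 18, "humidity": 55},
]
layout = to_storage_cells(rows)
for key, cells in layout.items():
    print(key, cells)
```

Three CQL rows collapse into two storage rows here, which is the sense in which a CQL "row" is a slice of a wider partition rather than a storage row of its own.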

Cassandra – A Decentralized Structured Storage System [Annotated]

Monday, September 16th, 2013

Cassandra – A Decentralized Structured Storage System by Avinash Lakshman, Facebook and Prashant Malik, Facebook.

Abstract:

Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

Annotated version of the original 2009 Cassandra paper.

Not a guide to future technology but a very interesting read about how Cassandra arrived at the present.

Become a Super Modeler

Thursday, May 9th, 2013

Become a Super Modeler (Webinar)

Thursday, May 16th
11am PDT / 2pm EDT / 7pm BST / 8pm CEST

Sure you can do some time series modeling. Maybe some user profiles. What’s going to make you a super modeler? Let’s take a look at some great techniques taken from real world applications where we exploit the Cassandra big table model to its fullest advantage. We’ll cover some of the new features in CQL 3 as well as some tried and true methods. In particular, we will look at fast indexing techniques to get data faster at scale. You’ll be jet setting through your data like a true super modeler in no time.

Speaker: Patrick McFadin, Principal Solutions Architect at DataStax

Looks interesting and I have neglected to look closely at CQL 3.

Could be some incentive to read up before the webinar.

What’s New in Cassandra 1.2 (Notes)

Saturday, January 12th, 2013

What’s New in Cassandra 1.2

From the description:

Apache Cassandra Project Chair, Jonathan Ellis, looks at all the great improvements in Cassandra 1.2, including Vnodes, Parallel Leveled Compaction, Collections, Atomic Batches and CQL3.

There is only so much you can cover in an hour but Jonathan did a good job of hitting the high points of virtual nodes (rebuild failed drives/nodes faster), atomic batches (fewer requirements on clients, new default btw), CQL improvements, and tracing.

Enough to make you interested in running (not watching) the examples plus your own.
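Why virtual nodes speed up rebuilds can be shown with a toy token ring. The sketch below is a simplified model in plain Python, not Cassandra's actual replication logic: it treats the owner of the next token on the ring as the peer that would help rebuild a range, and it assigns tokens at random. The function names, the node names, and the counts (10 nodes, 64 tokens per node) are assumptions made for the example.

```python
import random


def build_ring(nodes, tokens_per_node, seed=42):
    """Assign random 64-bit tokens to each node and return the sorted ring."""
    rng = random.Random(seed)
    ring = [
        (rng.getrandbits(64), node)
        for node in nodes
        for _ in range(tokens_per_node)
    ]
    return sorted(ring)


def rebuild_peers(ring, failed):
    """Return the distinct nodes that own the token following each of the
    failed node's tokens (a stand-in for the peers that stream data back)."""
    peers = set()
    for i, (_, owner) in enumerate(ring):
        if owner != failed:
            continue
        j = (i + 1) % len(ring)
        while ring[j][1] == failed:  # skip the failed node's own tokens
            j = (j + 1) % len(ring)
        peers.add(ring[j][1])
    return peers


nodes = [f"node{i}" for i in range(10)]
single = rebuild_peers(build_ring(nodes, 1), "node0")
vnodes = rebuild_peers(build_ring(nodes, 64), "node0")
print(len(single), len(vnodes))
```

With one token per node, a single neighbor carries the whole rebuild; with many small ranges per node, the work is spread across several peers, which is the effect behind the faster rebuilds mentioned in the talk.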

The slides: http://www.slideshare.net/DataStax/college-credit-whats-new-in-apache-cassandra-12

Cassandra homepage.

CQL 3 Language Reference.

Cassandra Radical NoSQL Scalability

Monday, February 27th, 2012

Cassandra Radical NoSQL Scalability by Tim Berglund.

From the description:

Cassandra is a scalable, highly available, column-oriented data store in use at Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco, OpenX, Digg, CloudKick, Ooyala and more companies that have large, active data sets. The largest known Cassandra cluster has over 300 TB of data in over 400 machines.

This open source project managed by the Apache foundation offers a compelling combination of a rich data model, a robust deployment track record, and a sound architecture. This video presents Cassandra’s data model, works through its API in Java and Groovy, talks about how to deploy it and looks at use cases in which it is an appropriate data storage solution.

It explores the Amazon Dynamo project and Google’s BigTable and explains how its architecture helps us achieve the gold standard of scalability: horizontal scalability on commodity hardware. You will be ready to begin experimenting with Cassandra immediately and planning its adoption in your next project.

Take some time to look at CQL – Cassandra Query Language.

BTW, Berglund is a good presenter.