Archive for the ‘DataStax’ Category

A Practical Guide to Graph Databases

Wednesday, January 20th, 2016

A Practical Guide to Graph Databases by Matthias Broecheler.

Slides from Graph Day 2016 @ Austin.

If you notice any of the “trash talking” on social media about graphs and graph databases, you will find slide 15 quite amusing.

Not everyone agrees on the relative position of graph products. 😉

I haven’t seen a video of Matthias’ presentation. If you happen across one, give me a ping. Thanks!

Query the Northwind Database as a Graph Using Gremlin

Wednesday, October 21st, 2015

Query the Northwind Database as a Graph Using Gremlin by Mark Kromer.

From the post:

One of the most popular and interesting topics in the world of NoSQL databases is graph. At DataStax, we have invested in graph computing through the acquisition of Aurelius, the company behind TitanDB, and are especially committed to ensuring the success of the Gremlin graph traversal language. Gremlin is part of the open source Apache TinkerPop graph framework project and is a graph traversal language used by many different graph databases.

I wanted to introduce you to a superb web site that our own Daniel Kuppitz maintains called “SQL2Gremlin” ( which I think is great way to start learning how to query graph databases for those of us who come from the traditional relational database world. It is full of excellent sample SQL queries from the popular public domain RDBMS dataset Northwind and demonstrates how to produce the same results by using Gremlin. For me, learning by example has been a great way to get introduced to graph querying and I think that you’ll find it very useful as well.

I’m only going to walk through a couple of examples here as an intro to what you will find at the full site. But if you are new to graph databases and Gremlin, then I highly encourage you to visit the sql2gremlin site for the rest of the complete samples. There is also a nice example of an interactive visualization / filtering, search tool here that helps visualize the Northwind data set as it has been converted into a graph model.

I’ve worked with (and worked for) Microsoft SQL Server for a very long time. Since Daniel’s examples use T-SQL, we’ll stick with SQL Server for this blog post as an intro to Gremlin and we’ll use the Northwind samples for SQL Server 2014. You can download the entire Northwind sample database here. Load that database into your SQL Server if you wish to follow along.

When I first saw the title to this post,

Query the Northwind Database as a Graph Using Gremlin (emphasis added)

I thought this was something else. A database about the Northwind album.

Little did I suspect that the Northwind Database is a test database for SQL Server 2005 and SQL Server 2008. Yikes!

Still, I thought some of you might have access to such legacy software and so I am pointing you to this post. 😉


Support for SQL Server 2005 ends April 16, 2016 (that’s next April)

Support for SQL Server 2008 ended July 8, 2014 Ouch! You are more than a year into a dangerous place. Upgrade, migrate or get another job. Hard times are coming and blame will be assigned.

DataStax – New for 2015 – Free Online Instructor Led Training

Wednesday, February 25th, 2015

DataStax – New for 2015 – Free Online Instructor Led Training

I count six (6) online free courses in March 2015:

As of today, both:

report being “sold out” and you can join a waiting list.

If you take one or more of these courses, don’t keep your attendance a secret. Provide feedback to DataStax and post your comments about the experience online.

High quality online training isn’t cheap and positive feedback will strengthen the hand of those responsible for these free training classes.

Marko Reassures the TinkerPop Community

Tuesday, February 3rd, 2015

How the DataStax Acquisition of Aurelius Will Effect TinkerPop by Marko A. Rodriguez.

From the post:

As you may already know, Aurelius has been acquired by DataStax — Aurelius is the graph computing company behind Titan which also provides a good number of contributors to TinkerPop. DataStax is the distributed database company behind Cassandra. Matthias and I are very excited about this acquisition. With DataStax’s resources, the graph community is going to see a powerful, rock-solid, Titan-inspired distributed graph database in the near future — enterprise support, focused large-team development, and 1000+ node cluster testing for each release. A great thing for the graph community, indeed. Also, a great thing for TinkerPop —

Looking forward to seeing how a pairing of DataStax resources and Marko’s vision for graph computing expresses itself. This could be a lot of fun!

DevCenter 1.2 delivers support for Cassandra 2.1 and query tracing

Friday, October 17th, 2014

DevCenter 1.2 delivers support for Cassandra 2.1 and query tracing by Alex Popescu.

From the post:

We’re very pleased to announce the availability of DataStax DevCenter 1.2, which you can download now. We’re excited to see how DevCenter has already become the defacto query and development tool for those of you working with Cassandra and DataStax Enterprise, and now with version 1.2, we’ve added additional support and options to make your development work even easier.

Version 1.2 of DevCenter delivers full support for the many new features in Apache Cassandra 2.1, including user defined types and tuples. DevCenter’s built-in validations, quick fix suggestions, the updated code assistance engine and the new snippets can greatly simplify your work with all the new features of Cassandra 2.1.

The download page offers the DataStax Sandbox if you are interested in a VM version.


Cassandra and OpsCenter from Datastax

Tuesday, August 7th, 2012

Cassandra and OpsCenter from Datastax

Istvan Szegedi details installation of both Cassandra and OpsCenter along with some basic operations.

From the post:

Cassandra – originally developed at Facebook – is another popular NoSQL database that combines Amazon’s Dynamo distributed systems technologies and Google’s Bigtable data model based on Column Families. It is designed for distributed data at large scale.Its key components are as follows:

Keyscape: it acts as a container for data, similar to RDBMS schema. This determines the replication parameters such as replication factor and replication placement strategy as we will see it later in this post. More details on replication placement strategy can be read here.

Column Familiy: within a keyscape you can have one or more column families. This is similar to tables in RDBMS world. They contain multiple columns which are referenced by row keys.

Column: it is the smallest increment of data. It is a tuple having a name, a value and and a timestamp.

Another information center you are likely to encounter.

DataStax Enterprise and DataStax Community Edition

Friday, November 11th, 2011

DataStax Enterprise and DataStax Community Edition

From the announcement:

BURLINGAME, Calif. – Nov.1, 2011 –DataStax, the commercial leader in Apache Cassandra™, today announced that DataStax Enterprise, the industry’s first distributed, scalable, and highly available database platform powered by Apache Cassandra™ 1.0, is now available.

“The ability to manage both real-time and analytic data in a simple, massively scalable, integrated solution is at the heart of challenges faced by most businesses with legacy databases,” said Billy Bosworth, CEO, DataStax. “Our goal is to ensure businesses can conquer these challenges with a modern application solution that provides operational simplicity, optimal performance and incredible cost savings.”

“Apache Cassandra is the scalable, high-impact, comprehensive data platform that is well-suited to the rapidly-growing real-time data needs of our social media platform,” said Christian Carollo, Senior Manager, Mobile for GameFly. “We leveraged the expertise of DataStax to deploy our new social media platform, and were able to complete the project without worrying about scale or distribution – we simply built a great application and Apache Cassandra took care of the rest.”

BTW, DataStax just added its 100th customer. You might recognize some of them, Netflix, Cisco, etc.