Archive for the ‘Aerospike’ Category

Open Source Aerospike NoSQL Database Scales To 1M TPS For $1.68 Per Hour…

Tuesday, November 11th, 2014

Open Source Aerospike NoSQL Database Scales To 1M TPS For $1.68 Per Hour On A Single Amazon Web Services Instance at AWS re:Invent 2014

From the post:

Aerospike – the first flash-optimized open source database and the world’s fastest in-memory NoSQL database – will be at Amazon Web Services (AWS) re:Invent 2014 conference in Las Vegas, Nev.

An ultra low latency Key-Value Store, Aerospike can operate in pure RAM backed by Amazon Elastic Block Store (EBS) for persistence as well as in a hybrid mode using RAM and SSDs. Aerospike engineers have documented the performance of different AWS EC2 instances and described the best techniques to achieve 1 Million transactions per second on one instance with sub-millisecond latency.

The Aerospike AMI in the Amazon Marketplace comes with cloud formation scripts for simple, single click deployments. The open source Aerospike Community Edition is free and the Aerospike Enterprise Edition with certified binaries and Cross Data Center Replication (XDR) is also free for startups in the startup special program. Aerospike is priced simply based on the volume of unique data managed, with no charge for replicated data, for Transactions Per Second (TPS) or number of servers in a cluster.

Aerospike is popularly used as a session store, cookie store, user profile store, id-mapping store, for fraud detection, dynamic pricing, real-time product recommendations and personalization of cross channel user experiences on websites, mobile apps, e-commerce portals, travel portals, financial services portals and real-time bidding platforms. To ensure 24x7x365 operations, data in Aerospike is replicated synchronously with immediate consistency within a cluster and asynchronously across clusters in different availability zones using Aerospike Cross Data Center Replication (XDR).

This is not a plug for or against Aerospike. I am posting this as a reminder, to me as much as to you, that cloud data prices can be remarkably sane. Even $1.68 per hour could add up over a week, but if you develop locally and test in the cloud, you should be able to meet your budget targets.
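To put "add up over a week" in numbers, here is a rough sketch of the arithmetic. The only inputs taken from the post are the $1.68 hourly rate and the 1 million TPS figure. Running the instance around the clock and treating a month as 30 days are my assumptions, and storage, bandwidth and other AWS charges are ignored.

```python
# Rough cost arithmetic for the instance price quoted in the post title.
# Assumptions (mine, not the post's): the instance runs continuously,
# a month is 30 days, and storage/bandwidth charges are ignored.

HOURLY_RATE_USD = 1.68   # from the post title
TPS = 1_000_000          # from the post title


def cost_for_hours(hours, rate=HOURLY_RATE_USD):
    """Return the cost in USD of running for `hours` hours, rounded to cents."""
    return round(hours * rate, 2)


day_cost = cost_for_hours(24)
week_cost = cost_for_hours(24 * 7)
month_cost = cost_for_hours(24 * 30)

# Transactions in one hour at the advertised rate, and the cost per
# billion transactions if that rate were sustained for the whole hour.
tx_per_hour = TPS * 3600
cost_per_billion_tx = HOURLY_RATE_USD / (tx_per_hour / 1e9)

print(f"per day:   ${day_cost:.2f}")    # per day:   $40.32
print(f"per week:  ${week_cost:.2f}")   # per week:  $282.24
print(f"per month: ${month_cost:.2f}")  # per month: $1209.60
print(f"per billion transactions: ${cost_per_billion_tx:.2f}")
```

A week of continuous testing comes to roughly $282, which is why shutting the instance down between test runs matters more than the hourly rate itself.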

You can pass the cloud hosting fees on to any paying client (with an upfront deposit and one month in advance).

Other examples of reasonable cloud pricing?

Aerospike goes Open Source

Wednesday, July 2nd, 2014

Aerospike goes Open Source

From the post:

We are excited to announce that the Aerospike database is now open source.

Aerospike’s mission is to disrupt the entire field of databases by offering an addictive proposition: a database literally ten times faster than existing NoSQL solutions, and one hundred times faster than existing SQL solutions. By offering as open source the battle-tested operational database that powers the largest and highest scale companies on the planet, Aerospike will change how applications are architected, and solutions are created!

The code for Aerospike clients and the Aerospike server is published on github. Clients are available under the Apache 2 license so you can use and modify with no restrictions. The server is available under AGPL V3 to protect the long term interests of the community – you are free to use it with no restrictions but if you change the server code, then those code changes must be contributed back.

Aerospike Community Edition has no limits on the number of servers, tps or terabytes of data, and is curated by Aerospike. Use is unlimited and the code is open. We cannot wait to see what you will do with it!

You will have to read the details to decide if Aerospike is appropriate for your requirements.

Among other things, I would focus on statements like:

This layer [Distribution Layer] scales linearly and implements many of the ACID guarantees.

That’s like reading a poorly written standards document. 😉 What does “many of the ACID guarantees” mean exactly?

From the ACID article at Wikipedia I read:

In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.

Jim Gray defined these properties of a reliable transaction system in the late 1970s and developed technologies to achieve them automatically.

I don’t think four (4) requirements count as “many” but my first question would be:

Which of the “many” ACID guarantees does Aerospike not implement? How hard can this be? It has to be one of the four. Yes?

Second question: So, more than three decades after Jim Gray demonstrated how to satisfy all four ACID guarantees, Aerospike doesn’t? Yes?

I’m not denying there may be valid reasons to ignore one or more of the ACID guarantees. But let’s be clear about which ones and the trade-offs that justify it.

I first saw this in a tweet by Charles Ditzel.

Aerospike 3

Tuesday, September 10th, 2013

Aerospike 3 by Alex Popescu.

From the post:

Aerospike 3 database builds off of Aerospike’s legacy of speed, scale, and reliability, adding an extensible data model that supports complex data types, large data types, queries using secondary indexes, user defined functions (UDFs) and distributed aggregations. Process more data faster to create the richest, most relevant real-time interactions.

Aerospike 3 Community Edition is a free unlimited license designed for a single cluster of up to two nodes and storage of up to 200GB of data. Enterprise version is available upon request.

Try the FREE version now.

Alex has picked up a new sponsor that merits your attention!

From the community download page:

Free Aerospike 3 Community Edition is a full copy of Aerospike Database, in a 2-node cluster configuration that supports a database up to 200 GB in size. For example, if you have 125 million records at 1.5 K bytes/object, you can do 16k reads/sec and 8k/writes/sec with data on SSD. Or, if you are deploying an in-memory database, you can handle 60k reads/sec and 30k writes/sec. This product includes:

  • Unlimited license to use the software forever. No fees, no strings attached.
  • Access to online forums and documentation
  • Tools for setting up and managing two Aerospike Servers in a single Aerospike Cluster
  • Aerospike Server software and Aerospike SDK for developing your database client application
  • When scale demands, easy upgrade to the Enterprise Edition without stopping your service!

The in-memory performance numbers look particularly impressive!
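The sizing example quoted from the download page can be checked with a little arithmetic. This is only a sketch: the quote does not say whether "1.5 K bytes" means 1,500 or 1,536 bytes, so both are tried, and I assume 1 GB is 10^9 bytes and ignore replication and index overhead.

```python
# Sanity check of the quoted sizing example: 125 million records at
# "1.5 K bytes/object" against the 200 GB Community Edition limit.
# Assumptions (mine): 1 GB = 10**9 bytes; replication and index
# overhead are ignored; both readings of "1.5 K" are tried.

RECORDS = 125_000_000
LIMIT_GB = 200


def dataset_gb(records, bytes_per_record):
    """Return the raw dataset size in decimal gigabytes."""
    return records * bytes_per_record / 1e9


decimal_gb = dataset_gb(RECORDS, 1500)   # "K" read as 1000 bytes
binary_gb = dataset_gb(RECORDS, 1536)    # "K" read as 1024 bytes

# Ratio of the quoted in-memory figures to the quoted SSD figures.
read_speedup = 60_000 / 16_000
write_speedup = 30_000 / 8_000

print(decimal_gb, decimal_gb <= LIMIT_GB)  # 187.5 True
print(binary_gb, binary_gb <= LIMIT_GB)    # 192.0 True
print(read_speedup, write_speedup)         # 3.75 3.75
```

Either reading fits under the 200 GB limit, and the quoted in-memory figures are 3.75 times the SSD figures for both reads and writes.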

Aerospike

Friday, April 19th, 2013

Aerospike

From the architecture overview:

Aerospike is a fast Key Value Store or Distributed Hash Table architected to be a flexible NoSQL platform for today’s high scale Apps. Designed to meet the reliability or ACID requirements of traditional databases, there is no single point of failure (SPOF) and data is never lost. Aerospike can be used as an in-memory database and is uniquely optimized to take advantage of the dramatic cost benefits of flash storage. Written in C, Aerospike runs on Linux.

Based on our own experiences developing mission-critical applications with high scale databases and our interactions with customers, we’ve developed a general philosophy of operational efficiency that guides product development. Three principles drive Aerospike architecture: NoSQL flexibility, traditional database reliability, and operational efficiency.

Technical details were first published in the Proceedings of the VLDB (Very Large Databases): Citrusleaf: A Real-Time NoSQL DB which Preserves ACID by V. Srinivasan and Brian Bulkowski.

You can guess why they changed the name. 😉

There is a free community edition, along with an SDK and documentation.

Relies on RAM and SSDs.

Timo Elliott was speculating about entirely RAM-based computing in: In-Memory Computing.

Imagine losing all the special coding tricks to get performance despite disk storage.

Simpler code and fewer operations should result in higher speed.

How to Compare NoSQL Databases

Friday, April 19th, 2013

How to Compare NoSQL Databases by Ben Engber. (video)

From the description:

Ben Engber, CEO and founder of Thumbtack Technology, will discuss how to perform tuned benchmarking across a number of NoSQL solutions (Couchbase, Aerospike, MongoDB, Cassandra, HBase, others) and to do so in a way that does not artificially distort the data in favor of a particular database or storage paradigm. This includes hardware and software configurations, as well as ways of measuring to ensure repeatable results.

We also discuss how to extend benchmarking tests to simulate different kinds of failure scenarios to help evaluate the maintainability and recoverability of different systems. This requires carefully constructed tests and significant knowledge of the underlying databases — the talk will help evaluators overcome the common pitfalls and time sinks involved in trying to measure this.

Lastly we discuss the YCSB benchmarking tool, its significant limitations, and the significant extensions and supplementary tools Thumbtack has created to provide distributed load generation and failure simulation.

Ben makes a very good case for understanding the details of your use case versus the characteristics of particular NoSQL solutions.

Where you will find “better” performance depends on non-obvious details.

Watch the use of terms like “consistency” in this presentation.

The paper Ben refers to: Ultra-High Performance NoSQL Benchmarking: Analyzing Durability and Performance Tradeoffs.

Forty-three pages of analysis and charts.

Slow but interesting reading, if you are into the details of performance and NoSQL databases.