RICON East 2013 [videos, slides, resources]
I have sorted (by author) and included the abstracts for the RICON East presentations. The RICON East webpage has links to blog entries about the conference.
Enjoy!
Brian Akins, Large Scale Data Service as a Service
Slides | Video
Turner Broadcasting hosts several large sites that need to serve “data” to millions of clients over HTTP. A couple of years ago, we started building a generic service to solve this and to retire several legacy systems. We will discuss the general architecture, the growing pains, and why we decided to use Riak. We will also share some implementation details and the use of the service for a few large internet events.
Neil Conway, Bloom: Big Systems from Small Programs
Slides | Video
Distributed systems are ubiquitous, but distributed programs remain stubbornly hard to write. While many distributed algorithms can be concisely described, implementing them requires large amounts of code–often, the essence of the algorithm is obscured by low-level concerns like exception handling, task scheduling, and message serialization. This results in programs that are hard to write and even harder to maintain. Can we do better?
Bloom is a new programming language we’ve developed at UC Berkeley that takes two important steps towards improving distributed systems development. First, Bloom programs are designed to be declarative and concise, aided by a new philosophy for reasoning about state and time. Second, Bloom can analyze distributed programs for their consistency requirements and either certify that eventual consistency is sufficient, or identify program locations where stronger consistency guarantees are needed. In this talk, I’ll introduce the language, and also suggest how lessons from Bloom can be adopted in other distributed programming stacks.
Sean Cribbs, Just Open a Socket – Connecting Applications to Distributed Systems
Slides | Video
Client-server programming is a discipline as old as computer networks and well-known. Just connect socket to the server and send some bytes back and forth, right?
Au contraire! Building reliable, robust client libraries and applications is actually quite difficult, and exposes a lot of classic distributed and concurrent programming problems. From understanding and manipulating the TCP/IP network stack, to multiplexing connections across worker threads, to handling partial failures, to juggling protocols and encodings, there are many different angles one must cover.
In this talk, we’ll discuss how Basho has addressed these problems and others in our client libraries and server-side interfaces for Riak, and how being a good client means being a participant in the distributed system, rather than just a spectator.
Reid Draper, Advancing Riak CS
Slides | Video
Riak CS has come a long way since it was first released in 2012, and then open sourced in March 2013. We’ll take a look at some of the features and improvements in the recently released Riak CS 1.3.0, and planned for the future, like better integration with CloudStack and OpenStack. Next, we’ll go over some of the Riak CS guts that deployers should understand in order to successfully deploy, monitor and scale Riak CS.
Camille Fournier, ZooKeeper for the Skeptical Architect
Slides | Video
ZooKeeper is everywhere these days. It’s a core component of the Hadoop ecosystem. It provides the glue that enables high availability for systems like Redis and Solr. Your favorite startup probably uses it internally. But as every good skeptic knows, just because something is popular doesn’t mean you should use it. In this talk I will go over the core uses of ZooKeeper in the wild and why it is suited to these use cases. I will also talk about systems that don’t use ZooKeeper and why that can be the right decision. Finally I will discuss the common challenges of running ZooKeeper as a service and things to look out for when architecting a deployment.
Sathish Gaddipati, Building a Weather Data Services Platform on Riak
Slides | Video
In this talk Sathish will discuss the size, complexity and use cases surrounding weather data services and analytics, which will entail an overview of the architecture of such systems and the role of Riak in these patterns.
Sunny Gleason, Riak Techniques for Advanced Web & Mobile Application Development
Slides | Video
In recent years, there have been tremendous advances in high-performance, high-availability data storage for scalable web and mobile application development. Often times, these NoSQL solutions are portrayed as sacrificing the crispness and rapid application development features of relational database alternatives. In this presentation, we show the amazing things that are possible using a variety of techniques to apply Riak’s advanced features such as map-reduce, search, and secondary indexes. We review each feature in the context of a demanding real-world Ruby & Javascript “Pinterest clone” application with advanced features such as real-time updates via Websocket, comment feeds, content quarantining, permissions, search and social graph modeling. We pay specific attention to explaining the ‘why’ of these Riak techniques for high-performance, high availability applications, not just the ‘how’.
Andy Gross, Lessons Learned and Questions Raised from Building Distributed Systems
Slides | Video
Shawn Gravelle and Sam Townsend, High Availability with Riak and PostgreSQL
Slides | Videos
This talk will cover work to build out an internal cloud offering using Riak and PostgreSQL as a data layer, architectural decisions made to achieve high availability, and lessons learned along the way.
Rich Hickey, Using Datomic with Riak
Video
Rich Hickey, the author of Clojure and designer of Datomic, is a software developer with over 20 years of experience in various domains. Rich has worked on scheduling systems, broadcast automation, audio analysis and fingerprinting, database design, yield management, exit poll systems, and machine listening, in a variety of languages.
James Hughes, Revolution in Storage
Slides | Video
The trends of technology are rocking the storage industry. Fundamental changes in basic technology, combined with massive scale, new paradigms, and fundamental economics leads to predictions of a new storage programming paradigm. The growth of low cost/GB disk is continuing with technologies such as Shingled Magnetic Recording. Flash and RAM are continuing to scale with roadmaps, some argue, down to atom scale. These technologies do not come without a cost. It is time to reevaluate the interface that we use to all kinds of storage, RAM, Flash and Disk. The discussion starts with the unique economics of storage (as compared to processing and networking), discusses technology changes, posits a set of open questions and ends with predictions of fundamental shifts across the entire storage hierarchy.
Kyle Kingsbury, Call Me Maybe: Carly Rae Jepsen and the Perils of Network Partitions
Slides | Code | Video
Network partitions are real, but their practical consequences on complex applications are poorly understood. I want to talk about some of the neat ways I’ve found to lose important data, the challenge of building systems which are reliable under partitions, and what it means for you, an application developer.
Hilary Mason, Realtime Systems for Social Data Analysis
Slides | Video
It’s one thing to have a lot of data, and another to make it useful. This talk explores the interplay between infrastructure, algorithms, and data necessary to design robust systems that produce useful and measurable insights for realtime data products. We’ll walk through several examples and discuss the design metaphors that bitly uses to rapidly develop these kinds of systems.
Michajlo Matijkiw, Firefighting Riak at Scale
Slides | Video
Managing a business critical Riak instance in an enterprise environment takes careful planning, coordination, and the willingness to accept that no matter how much you plan, Murphy’s law will always win. At CIM we’ve been running Riak in production for nearly 3 years, and over those years we’ve seen our fair share of failures, both expected and unexpected. From disk melt downs to solar flares we’ve managed to recover and maintain 100% uptime with no customer impact. I’ll talk about some of these failures, how we dealt with them, and how we managed to keep our clients completely unaware.
Neha Narula, Why Is My Cache So Dumb? Smarter Caching with Pequod
Slides | Video
Pequod is a key/value cache we’re developing at MIT and Harvard that automatically updates the cache to keep data fresh. Pequod exploits a common pattern in these computations: different kinds of cached data are often related to each other by transformations equivalent to simple joins, filters, and aggregations. Pequod allows applications to pre-declare these transformations with a new abstraction, the cache join. Pequod then automatically applies the transformations and tracks relationships to materialize data and keep the cache up to date, and in many cases improves performance by reducing client/cacheserver communication. Sound like a database? We use abstractions from databases like joins and materialized views, while still maintaining the performance of an in-memory key/value cache.
In this talk, I’ll describe the challenges caching solves, the problems that still exist, and how tools like Pequod can make the space better.
Alex Payne, Nobody ever got fired for picking Java: evaluating emerging programming languages for business-critical systems
Slides | Video
When setting out to build greenfield systems, engineers today have a broader choice of programming language than ever before. Over the past decade, language development has accelerated dramatically thanks to mature runtimes like the JVM and CLR, not to mention the prevalence of near-universal targets for cross-compilation like JavaScript. With strong technological foundations to build on and an active open source community, modern languages can evolve from rough hobbyist projects into capable tools in a stunningly short period of time. With so many strong contenders emerging every day, how do you decide what language to bet your business on? We’ll explore the landscape of new languages and provide a decision-making framework you can use to narrow down your choices.
Theo Schlossnagle and Robert Treat, How Do You Eat An Elephant?
Slides | Video
When OmniTI first set out to build a next generation monitoring system, we turned to one of our most trusted tools for data management; Postgres. While this worked well for developing the initial Open Source application, as we continued to grow the Circonus public monitoring service, we eventually ran into scaling issues. This talk will cover some of the changes we made to make the original Postgres system work better, talk about some of the other systems we evaluated, and discuss the eventual solution to our problem; building our own time series database. Of course, that’s only half the story. We’ll also go into how we swapped out these backend data storage pieces in our production environment, all the while capturing and reporting on millions of metrics, without downtime or customer interruption.
Dr. Margo Seltzer, Automatically Scalable Computation
Slides | Video
As our computational infrastructure races gracefully forward into increasingly parallel multi-core and blade-based systems, our ability to easily produce software that can successfully exploit such systems continues to stumble. For years, we’ve fantasized about the world in which we’d write simple, sequential programs, add magic sauce, and suddenly have scalable, parallel executions. We’re not there. We’re not even close. I’ll present trajectory-based execution, a radical, potentially crazy, approach for achieving automatic scalability. To date, we’ve achieved surprisingly good speedup in limited domains, but the potential is tantalizingly enormous.
Chris Tilt, Riak Enterprise Revisited
Slides | Video
Riak Enterprise has undergone an overhaul since it’s 1.2 days, mostly around Mult-DataCenter replication. We’ll talk about the “Brave New World” of replication in depth, how it manages concurrent TCP/IP connections, Realtime Sync, and the technology preview of Active Anti-Entropy Fullsync. Finally, we’ll peek over the horizon at new features such as chaining of Realtime sync messages across multiple clusters.
Sam Townsend, High Availability with Riak and PostgreSQL
Slides | Video
Mark Wunsch, Scaling Happiness Horizontally
Slides | Video
This talk will discuss how Gilt has grown its technology organization to optimize for engineer autonomy and happiness and how that optimization has affected its software. Conway’s Law states that an organization that designs systems will inevitably produce systems that are copies of the communication structures of the organization. This talk will work its way between both the (gnarly) technical details of Gilt’s application architecture (something we internally call “LOSA”) and the Gilt Tech organization structure. I’ll discuss the technical challenges we came up against, and how these often pointed out areas of contention in the organization. I’ll discuss quorums, failover, and latency in the context of building a distributed, decentralized, peer-to-peer technical organization.
Matthew Von-Maszewski, Optimizing LevelDB for Performance and Scale
Slides | Video
LevelDB is a flexible key-value store written by Google and open sourced in August 2011. LevelDB provides an ordered mapping of binary keys to binary values. Various companies and individuals utilize LevelDB on cell phones and servers alike. The problem, however, is it does not run optimally on either as shipped.
This presentation outlines the basic internal mechanisms of LevelDB and then proceeds to discuss the tuning opportunities in the source code for each mechanism. This talk will draw heavily from our experiences optimizing LevelDB for use in Riak, which is handy for running sufficiently large clusters.
Ryan Zezeski, Yokozuna: Distributed Search You Don’t Think About
Slides | Video
Allowing users to run arbitrary and complex searches against your data is a feature required by most consumer facing applications. For example, the ability to get ranked results based on free text search and subsequently drill down on that data based on secondary attributes is at the heart of any good online retail shop. Not only must your application support complex queries such as “doggy treats in a 2 mile radius, broken down by popularity” but it must also return in hundreds of milliseconds or less to keep users happy. This is what systems like Solr are built for. But what happens when the index is too big to fit on a single node? What happens when replication is needed for availability? How do you give correct answers when the index is partitioned across several nodes? These are the problems of distributed search. These are some of the problems Yokozuna solves for you without making you think about it.
In this talk Ryan will explain what search is, why it matters, what problems distributed search brings to the table, and how Yokozuna solves them. Yokozuna provides distributed and available search while appearing to be a single-node Solr instance. This is very powerful for developers and ops professionals.
I first saw this in a tweet by Alex Popescu.
PS: If more videos go up and I miss it, please ping me. Thanks!