
Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics

Friday, June 12th, 2015

Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics by Kishore Gopalakrishna.

From the post:

Last fall we introduced Pinot, LinkedIn’s real-time analytics infrastructure, that we built to allow us to slice and dice across billions of rows in real-time across a wide variety of products. Today we are happy to announce that we have open sourced Pinot. We’ve had a lot of interest in Pinot and are excited to see how it is adopted by the open source community.

We’ve been using it at LinkedIn for more than two years, and in that time, it has established itself as the de facto online analytics platform to provide valuable insights to our members and customers. At LinkedIn, we have a large deployment of Pinot storing 100’s of billions of records and ingesting over a billion records every day. Pinot serves as the backend for more than 25 analytics products for our customers and members. This includes products such as Who Viewed My Profile, Who Viewed My Posts and the analytics we offer on job postings and ads to help our customers be as effective as possible and get a better return on their investment.

In addition, more than 30 internal products are powered by Pinot. This includes XLNT, our A/B testing platform, which is crucial to our business – we run more than 400 experiments in parallel daily on it.

I am intrigued by:

For ease of use we decided to provide a SQL like interface. We support most SQL features including a SQL-like query language and a rich feature set such as filtering, aggregation, group by, order by, distinct. Currently we do not support joins in order to ensure predictable latency.

“SQL-like” always seems a bit vague to me. Will be looking at the details of the query language.
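To make the quoted feature list concrete, here is a rough sketch of the kind of query it describes: a filter, an aggregation, a group by and an order by over a single table, with no joins. It runs against SQLite from Python's standard library, not against Pinot, and it says nothing about Pinot's actual syntax. The table `profile_views`, its columns, and the function `top_viewer_regions` are all hypothetical names invented for the illustration.

```python
import sqlite3


def top_viewer_regions(rows, min_day):
    """Count views per region on or after min_day, most-viewed first.

    Uses an in-memory SQLite table as a stand-in; this is NOT Pinot.
    The schema (day, region, viewer_id) is a made-up example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profile_views (day INTEGER, region TEXT, viewer_id INTEGER)"
    )
    conn.executemany("INSERT INTO profile_views VALUES (?, ?, ?)", rows)

    # Single-table query: filtering, aggregation, group by, order by, no joins.
    query = """
        SELECT region, COUNT(*) AS views
        FROM profile_views
        WHERE day >= ?
        GROUP BY region
        ORDER BY views DESC, region ASC
    """
    result = conn.execute(query, (min_day,)).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (day, region, viewer_id)
SAMPLE_ROWS = [
    (1, "US", 10),
    (2, "US", 11),
    (2, "IN", 12),
    (3, "IN", 13),
    (3, "IN", 14),
    (3, "DE", 15),
]

if __name__ == "__main__":
    for region, views in top_viewer_regions(SAMPLE_ROWS, 2):
        print(region, views)
```

Whether Pinot accepts a query shaped like this one, and with what restrictions, is exactly what the documentation should settle.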

Grab the code and/or see the documentation.