Airbnb open sources SQL tool built on Facebook’s Presto database by Derrick Harris.
From the post:
Apartment-sharing startup Airbnb has open sourced a tool called Airpal that the company built to give more of its employees access to the data they need for their jobs. Airpal is built atop the Presto SQL engine that Facebook created in order to speed access to data stored in Hadoop.
Airbnb built Airpal about a year ago so that employees across divisions and roles could get fast access to data rather than having to wait for a data analyst or data scientist to run a query for them. According to product manager James Mayfield, it’s designed to make it easier for novices to write SQL queries by giving them access to a visual interface, previews of the data they’re accessing, and the ability to share and reuse queries.
At this point, Mayfield said, “Over a third of all the people working at Airbnb have issued a query through Airpal.” He added, “The learning curve for SQL doesn’t have to be that high.”
From the GitHub page:
Airpal is a web-based, query execution tool which leverages Facebook’s PrestoDB to make authoring queries and retrieving results simple for users. Airpal provides the ability to find tables, see metadata, browse sample rows, write and edit queries, then submit queries all in a web interface. Once queries are running, users can track query progress and when finished, get the results back through the browser as a CSV (download it or share it with friends). The results of a query can be used to generate a new Hive table for subsequent analysis, and Airpal maintains a searchable history of all queries run within the tool.
Features:
- Optional Access Control
- Syntax highlighting
- Results exported to a CSV for download or a Hive table
- Query history for self and others
- Saved queries
- Table finder to search for appropriate tables
- Table explorer to visualize schema of table and first 1000 rows
Requirements:
- Java 7 or higher
- MySQL database
- Presto 0.77 or higher
- S3 bucket (to store CSVs)
- Gradle 2.2 or higher
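To make the table explorer item concrete (schema of a table plus its first 1000 rows), here is a minimal sketch of the two Presto statements such a feature could issue. I have not checked what Airpal actually sends; the helper `explorer_statements` and the schema and table names are hypothetical, and the only things assumed about Presto are its `DESCRIBE` statement and ordinary `SELECT ... LIMIT`.

```python
import re

# Hypothetical helper, not part of Airpal: builds the kind of statements
# a table explorer could send to Presto for a schema view and a row preview.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def explorer_statements(schema, table, limit=1000):
    """Return the metadata and sample-row statements for one table."""
    for name in (schema, table):
        # The names are interpolated directly into the SQL text, so accept
        # only plain identifiers.
        if not _IDENTIFIER.match(name):
            raise ValueError("not a plain identifier: %r" % name)
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    qualified = "%s.%s" % (schema, table)
    return {
        "schema": "DESCRIBE %s" % qualified,
        "sample": "SELECT * FROM %s LIMIT %d" % (qualified, limit),
    }


if __name__ == "__main__":
    # "analytics" and "listings" are made-up names for illustration.
    for label, sql in sorted(explorer_statements("analytics", "listings").items()):
        print(label, "->", sql)
```

Both statements return structure and sample values: column names, types, and rows. Neither says what a column is supposed to mean.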
I understand, to some degree, the need to make SQL “simpler,” but I fail to see how simpler controls translate into a solution. The controls may be obvious enough, but if I don’t know the semantics of the column headers, the simplicity of the interface won’t be terribly helpful.
Or to put it another way, Airpal seems to assume that users already know the semantics of the tables they encounter. True/False?