Archive for the ‘Google BigQuery’ Category

Got big JSON? BigQuery expands data import for large scale web apps

Tuesday, October 2nd, 2012

Got big JSON? BigQuery expands data import for large scale web apps by Ryan Boyd, Developer Advocate.

From the post:

JSON is the data format of the web. JSON is used to power most modern websites, is a native format for many NoSQL databases hosting top web applications, and provides the primary data format in many REST APIs. Google BigQuery, our cloud service for ad-hoc analytics on big data, has now added support for JSON and the nested/repeated structure inherent in the data format.

JSON opens the door to a more object-oriented view of your data compared to CSV, the original data format supported by BigQuery. It removes the need for duplication of data required when you flatten records into CSV. Here are some examples of data you might find a JSON format useful for:

  • Log files, with multiple headers and other name-value pairs.
  • User session activities, with information about each activity occurring nested beneath the session record.
  • Sensor data, with variable attributes collected in each measurement.

Nested/repeated data support is one of our most requested features. And while BigQuery’s underlying infrastructure supports it, we’d only enabled it in a limited fashion through M-Lab’s test data. Today, however, developers can use JSON to get any nested/repeated data into and out of BigQuery.
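
To make the duplication point concrete, here is a minimal Python sketch. The session record, its field names, and the helper functions are all hypothetical and not taken from the post. It flattens one nested session into CSV rows, where the session fields repeat on every row, and then writes the same record as a single line of JSON, the one-record-per-line layout that, as far as I know, BigQuery's JSON import expects.

```python
import csv
import io
import json

# Hypothetical session record: activities are nested beneath the session.
session = {
    "session_id": "s-001",
    "user": "alice",
    "activities": [
        {"action": "view", "page": "/home"},
        {"action": "click", "page": "/pricing"},
        {"action": "view", "page": "/signup"},
    ],
}


def flatten(record):
    """Return one flat row per activity, repeating the session-level fields."""
    return [
        {
            "session_id": record["session_id"],
            "user": record["user"],
            "action": activity["action"],
            "page": activity["page"],
        }
        for activity in record["activities"]
    ]


def to_csv(rows):
    """Serialize flat rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["session_id", "user", "action", "page"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json_line(record):
    """Serialize the nested record as one line of JSON."""
    return json.dumps(record, sort_keys=True)


flat_rows = flatten(session)
csv_text = to_csv(flat_rows)
json_line = to_json_line(session)

print(csv_text)   # session_id and user appear on every data row
print(json_line)  # session_id and user appear once
```

In the CSV form the user name is stored once per activity; in the JSON form it is stored once per session, which is the saving the post describes.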

It had to happen. “Big JSON,” that is.

My question is: when is “Bigger Data” going to catch on?

If you got far enough ahead, say six to nine months, you could copyright something like “Biggest Data” and start collecting fees when it comes into common usage.

Qlikview and Google BigQuery…

Sunday, September 23rd, 2012

Qlikview and Google BigQuery – Data Visualization for Big Data by Istvan Szegedi.

From the post:

Google have launched its BigQuery cloud service in May to support interactive analysis of massive datasets up to billions of rows. Shortly after this launch Qliktech, one of the market leaders in BI solutions who is known for its unique associative architecture based on column store, in-memory database demonstrated a Qlikview Google BigQuery application that provided data visualization using BigQuery as backend. This post is about how Qlikview and Google BigQuery can be integrated to provide an easy-to-use data analytics application for business users who work on large datasets.

A “big data” offering to limber you up for the coming week!

A Look At Google BigQuery

Monday, May 21st, 2012

A Look At Google BigQuery

Chris Webb writes:

Over the years I’ve written quite a few posts about Google’s BI capabilities. Google never seems to get mentioned much as a BI tools vendor but to me it’s clear that it’s doing a lot in this area and is consciously building up its capabilities; you only need to look at things like Fusion Tables (check out these recently-added features), Google Refine and of course Google Docs to see that it’s pursuing a self-service, information-worker-led vision of BI that’s very similar to the one that Microsoft is pursuing with PowerPivot and Data Explorer.

Earlier this month Google announced the launch of BigQuery and I decided to take a look. Why would a Microsoft BI loyalist like me want to do this, you ask? Well, there are a number of reasons:

Looks like an even-handed report to me.

See what you think about it and BigQuery.

Google BigQuery and the Github Data Challenge

Wednesday, May 2nd, 2012

Google BigQuery and the Github Data Challenge

Deadline May 21, 2012

From the post:

Github has made data on its code repositories, developer updates, forks etc. from the public GitHub timeline available for analysis, and is offering prizes for the most interesting visualization of the data. Sounds like a great challenge for R programmers! The R language is currently the 26th most popular on GitHub (up from #29 in December), and it would be interesting to visualize the usage of R compared to other languages, for example. The deadline for submissions to the contest is May 21.

Interestingly, GitHub has made this data available on the Google BigQuery service, which is available to the public today. BigQuery was free to use while it was in beta test, but Google is now charging for storage of the data: $0.12 per gigabyte per month, up to $240/month (the service is limited to 2TB of storage – although there is a Premier offering that supports larger data sizes … at a price to be negotiated). While members of the public can run SQL-like queries on the GitHub data for free, Google is charging subscribers to the service 3.5 cents per GB processed in the query: this is measured by the source data accessed (although columns of data not referenced aren't counted); the size of the result set doesn't matter.
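
For a rough feel of what those rates mean, here is a small Python sketch of the arithmetic. The rates come from the quoted post; the function names, column names, and column sizes are hypothetical, and I am assuming 1TB = 1000GB so that the 2TB limit lines up with the quoted $240/month cap.

```python
# Rates as quoted in the post (May 2012); check current pricing before relying on them.
STORAGE_USD_PER_GB_MONTH = 0.12
QUERY_USD_PER_GB_PROCESSED = 0.035
STORAGE_LIMIT_GB = 2000  # assumption: 2TB taken as 2000GB, matching the $240 cap


def monthly_storage_cost(stored_gb):
    """Monthly storage charge in USD for the given number of gigabytes."""
    if stored_gb > STORAGE_LIMIT_GB:
        raise ValueError("above the 2TB limit; Premier pricing is negotiated")
    return round(stored_gb * STORAGE_USD_PER_GB_MONTH, 2)


def query_cost(column_sizes_gb, referenced_columns):
    """Query charge in USD: only the columns a query references are counted."""
    processed_gb = sum(column_sizes_gb[name] for name in referenced_columns)
    return round(processed_gb * QUERY_USD_PER_GB_PROCESSED, 4)


# Hypothetical table: per-column sizes in GB (not real GitHub timeline figures).
columns = {"repository_language": 4.0, "created_at": 6.0, "payload": 90.0}

print(monthly_storage_cost(sum(columns.values())))               # storing all 100GB
print(query_cost(columns, ["repository_language", "created_at"]))  # 10GB scanned
print(query_cost(columns, list(columns)))                        # all 100GB scanned
```

The gap between the last two figures is the practical lesson: name only the columns you need, because the size of the result set does not enter into the charge.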

Watch your costs, but any thoughts on how you would visualize the data?

Google BigQuery Service: Big data analytics at Google speed

Tuesday, November 22nd, 2011

Google BigQuery Service: Big data analytics at Google speed

From the post:

Rapidly crunching terabytes of big data can lead to better business decisions, but this has traditionally required tremendous IT investments. Imagine a large online retailer that wants to provide better product recommendations by analyzing website usage and purchase patterns from millions of website visits. Or consider a car manufacturer that wants to maximize its advertising impact by learning how its last global campaign performed across billions of multimedia impressions. Fortune 500 companies struggle to unlock the potential of data, so it’s no surprise that it’s been even harder for smaller businesses.

We developed Google BigQuery Service for large-scale internal data analytics. At Google I/O last year, we opened a preview of the service to a limited number of enterprises and developers. Today we’re releasing some big improvements, and putting one of Google’s most powerful data analysis systems into the hands of more companies of all sizes.

  • We’ve added a graphical user interface for analysts and developers to rapidly explore massive data through a web application.
  • We’ve made big improvements for customers accessing the service programmatically through the API. The new REST API lets you run multiple jobs in the background and manage tables and permissions with more granularity.
  • Whether you use the BigQuery web application or API, you can now write even more powerful queries with JOIN statements. This lets you run queries across multiple data tables, linked by data that tables have in common.
  • It’s also now easy to manage, secure, and share access to your data tables in BigQuery, and export query results to the desktop or to Google Cloud Storage.
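
As an illustration of the JOIN bullet above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is not BigQuery, and BigQuery's own SQL dialect may differ. The tables, columns, and rows are hypothetical; the point is only to show two tables linked by a column they have in common.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE visits (visit_id INTEGER, product_id INTEGER);
    CREATE TABLE products (product_id INTEGER, name TEXT);
    INSERT INTO visits VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO products VALUES (10, 'kettle'), (20, 'toaster');
    """
)

# Link the two tables on the shared product_id column and count visits per product.
rows = conn.execute(
    """
    SELECT p.name, COUNT(*) AS visit_count
    FROM visits AS v
    JOIN products AS p ON v.product_id = p.product_id
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()

print(rows)
conn.close()
```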

Did I remember to mention that this service is free? ;-) Customers will get 30 days’ notice when that is about to end.

Sorta like an early present, isn’t it?

What did you do with Google BigQuery?