Archive for the ‘Google Correlate’ Category

Nearest Neighbor Search in Google Correlate

Tuesday, October 15th, 2013

Nearest Neighbor Search in Google Correlate by Dan Vanderkam, Rob Schonberger, Henry Rowley, Sanjiv Kumar.

Abstract:

This paper presents the algorithms which power Google Correlate, a tool which finds web search terms whose popularity over time best matches a user-provided time series. Correlate was developed to generalize the query-based modeling techniques pioneered by Google Flu Trends and make them available to end users.

Correlate searches across millions of candidate query time series to find the best matches, returning results in less than 200 milliseconds. Its feature set and requirements present unique challenges for Approximate Nearest Neighbor (ANN) search techniques. In this paper, we present Asymmetric Hashing (AH), the technique used by Correlate, and show how it can be adapted to the specific needs of the product.

We then develop experiments to test the throughput and recall of Asymmetric Hashing as compared to a brute-force search. For “full” search vectors, we achieve a 10x speedup over brute force search while maintaining 97% recall. For search vectors which contain holdout periods, we achieve a 4x speedup over brute force search, also with 97% recall. (I followed the paragraphing in the PDF for the abstract.)

Ten-x speedups on Google Flu Trends size data sets are non-trivial.

If you are playing in that size sand box, this is a must read.

Google Correlate expands to 49 additional countries

Wednesday, January 4th, 2012

Google Correlate expands to 49 additional countries

Matt Mohebbi, Software Engineer, writes:

From the post:

In May of this year we launched Google Correlate on Google Labs. This system enables a correlation search between a user-provided time series and millions of time series of Google search traffic. Since our initial launch, we’ve graduated to Google Trends and we’ve seen a number of great applications of Correlate in several domains, including economics (consumer spending, unemployment rate and housing inventory), sociology and meteorology. The correspondence of gas prices and search activity for fuel efficient cars was even briefly discussed in a Fox News presidential debate and NPR recently covered correlations related to political commentators.

Google has added 49 countries for use with Correlate, bring the total to 50.

Just in case you are curious:

Country Table for Google Correlate – 4 Jan. 2012
  • Argentina
  • Australia
  • Austria
  • Belgium
  • Brazil
  • Bulgaria
  • Canada
  • Chile
  • China
  • Colombia
  • Croatia
  • Czech Republic
  • Denmark
  • Egypt
  • Finland
  • France
  • Germany
  • Greece
  • Hungary
  • India
  • Indonesia
  • Ireland
  • Israel
  • Italy
  • Japan
  • Malaysia
  • Mexico
  • Morocco
  • Netherlands
  • New Zealand
  • Norway
  • Peru
  • Philippines
  • Poland
  • Portugal
  • Romania
  • Russian Federation
  • Saudi Arabia
  • Singapore
  • Spain
  • Sweden
  • Switzerland
  • Taiwan
  • Thailand
  • Turkey
  • Ukraine
  • United Kingdom
  • United States
  • Venezuela
  • Viet Nam

What correlations are you going to find? (Bearing in mind that correlation is not causation.)