Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 2, 2013

Hellerstein: Humans are the Bottleneck [Not really]

Filed under: Data,Subject Identity,Topic Maps — Patrick Durusau @ 5:06 pm

Hellerstein: Humans are the Bottleneck by Isaac Lopez.

From the post:

Humans are the bottleneck right now in the data space, commented database systems luminary, Joe Hellerstein during an interview this week at Strata 2013.

“As Moore’s law drives the cost of computing down, and as data becomes more prevalent as a result, what we see is that the remaining bottleneck in computing costs is the human factor,” says Hellerstein, one of the fathers of adaptive query processing and a half dozen other database technologies.

Hellerstein says that recent research studies conducted at Stanford and Berkeley have found that 50-80 percent of a data analyst’s time is being used for the data grunt work (with the rest left for custom coding, analysis, and other duties).

“Data prep, data wrangling, data munging are words you hear over and over,” says Hellerstein. “Even with very highly skilled professionals in the data analysis space, this is where they’re spending their time, and it really is a big bottleneck.”

Just because humans gather at a common location, in “data prep, data wrangling, data munging,” doesn’t mean they “are the bottleneck.”

The question to ask is: Why are people spending so much time at location X in data processing?

Answer: poor data quality or, rather, the inability of machines to effectively process data from different origins. That’s the bottleneck.

It is a problem that management of subject identities for data and its containers is uniquely poised to solve.
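To make the point concrete, here is a minimal sketch, not a topic map implementation, of what managing subject identities for data and its containers might look like. Everything in it is an assumption for illustration: the two sources, their field names, the `FIELD_SUBJECTS` mapping, and the `normalize` and `merge_by_identity` helpers are all hypothetical. The idea is that once a human records, one time, that `cust_id` and `customer` identify the same subject, a machine can merge data from different origins without anyone repeating the wrangling.

```python
from collections import defaultdict

# Hypothetical records from two origins that name the same subjects differently.
source_a = [{"cust_id": "42", "e-mail": "ann@example.com", "name": "Ann Lee"}]
source_b = [{"customer": "42", "email_address": "ANN@example.com", "phone": "555-0100"}]

# Hypothetical mapping from each container (field name) to a shared subject identifier.
# This is the part a person records once instead of re-deriving on every project.
FIELD_SUBJECTS = {
    "cust_id": "customer-id",
    "customer": "customer-id",
    "e-mail": "email",
    "email_address": "email",
    "name": "name",
    "phone": "phone",
}


def normalize(record):
    """Rename each field to its subject identifier and lowercase email values."""
    out = {}
    for field, value in record.items():
        subject = FIELD_SUBJECTS[field]
        out[subject] = value.lower() if subject == "email" else value
    return out


def merge_by_identity(records, key="customer-id"):
    """Merge records that share the same value for the identifying subject."""
    merged = defaultdict(dict)
    for record in records:
        normalized = normalize(record)
        merged[normalized[key]].update(normalized)
    return dict(merged)


merged = merge_by_identity(source_a + source_b)
print(merged)
```

The mapping table is the reusable asset here. When a third source shows up, only its field names need to be added to the mapping; the merging logic stays the same.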
