Archive for the ‘Daytona’ Category

From Big Data to New Insights

Monday, July 25th, 2011

From the Office of Science and Technology Policy:

Today [18 July 2011], Microsoft is announcing the availability of a new tool called Daytona that will make it easier for researchers to harness the power of “cloud computing” to discover insights in huge quantities of data.

Daytona, which will be freely available to the research community, builds on an existing cloud computing collaboration between the National Science Foundation and Microsoft. In April, NSF announced that it was funding 13 teams to take advantage of Microsoft’s offer to provide free access to its Windows Azure cloud. Among other things, these projects will improve our understanding of large watersheds such as the Savannah River Basin, enable more and better use of renewable energy through improved weather forecasting, predict the interactions between proteins, and make cloud computing more secure, reliable, and accessible over mobile devices.

The new partnership, along with NSF collaborations with other leading IT companies, will help researchers access the computing power and storage capacity they need to tackle the big questions in their field. That’s important because researchers in a growing number of fields are generating extremely large data sets, commonly referred to as “Big Data.” For example, the size of DNA sequencing databases is increasing by a factor of 10 every 18 months! Researchers need better tools to help them store, index, search, visualize, and analyze these data, allowing them to discover new patterns and connections.

So far as I know, the issues of heterogeneous data remain largely unexplored in connection with Big Data. Since heterogeneous data has proven problematic with “Small Data,” I have no doubt it will prove equally if not more difficult with Big Data.

This is one of the offices to contact in the United States on such issues. Other US offices?

Similar offices in other countries?

Whiz-Kid on Hadoop

Monday, July 25th, 2011

Cloudera Whiz-Kid Lipcon Talks Hadoop, Big Data with SiliconANGLE’s Furrier

From the post:

Hadoop, the Big Data processing and analytics framework, isn’t your average open source project.

“If you look at a lot of the open source software that’s been popular out of Apache and elsewhere, it’s sort of like an open source replacement for something you can already get elsewhere,” said Todd Lipcon, a senior software engineer at Cloudera. “I think Hadoop is kind of unique in that it’s the only option for doing this kind of analysis.”

Lipcon is right. Open Office is an open source office suite alternative to Microsoft Office. MySQL is an open source database alternative to Oracle. Hadoop is an open source Big Data framework alternative for …. Well, there is no alternative.

Now that Daytona has been released by MS along with Excel DataScope, it would be interesting to know how Todd Lipcon sees the ease-of-use issue.

Powerful technology (LaTeX anyone?) may far exceed the capabilities of (insert your favorite word processor), but if the difficulty-of-use factor is too high, poorer alternatives will occupy most of the field.

That may give people with the more powerful technology a righteous feeling, but I am not interested in feeling righteous.

I am interested in winning, which means having a powerful technology that can be used by a wide variety of users of varying skill levels.

Some will use it poorly, barely invoking its capabilities. Others will make good but unimaginative use of it. Still others will push the envelope in terms of what it can do. All are legitimate and all are valuable in their own way.

Microsoft Research Releases Another Hadoop Alternative for Azure

Monday, July 18th, 2011

From the post:

Today Microsoft Research announced the availability of a free technology preview of Project Daytona MapReduce Runtime for Windows Azure. Using a set of tools for working with big data based on Google’s MapReduce paper, it provides an alternative to Apache Hadoop.

Daytona was created by the eXtreme Computing Group at Microsoft Research. It’s designed to help scientists take advantage of Azure for working with large, unstructured data sets. Daytona is also being used to power a data-analytics-as-a-service offering the team calls Excel DataScope.

Excellent coverage of this latest release along with information about related software from Microsoft.

I don’t think anyone disputes that Hadoop is difficult to use effectively, so why not offer an MS product that makes Apache Hadoop easier to use? Even with all the consumer software skills at Microsoft, it would still be a challenge, but one that Microsoft would be the most likely candidate to overcome.

And that would give Microsoft a window (sorry) into non-Azure environments as well as an opportunity to promote an Excel-like interface. (Hard to argue against the familiar.)

The fewer times we stop to build product silos, the more quickly we will reach the future of computing.

Products, yes; product silos, no.