Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 25, 2013

Big Data Sets you can use with R

Filed under: BigData,R — Patrick Durusau @ 7:36 pm

Big Data Sets you can use with R by Joseph Rickert.

From the post:

The world may indeed be awash with data, however, it is not always easy to find a suitable data set when you need one. As the number of people becoming involved with R and data science increases so does the need for interesting data sets for creating examples, showcasing machine learning algorithms and developing statistical analyses. The most difficult data sets to find are those that would provide the foundation for impressive big data examples: data sets with a 100 million rows and hundreds of variables.

The problem with big data, however, is that most of it is proprietary and locked away. Consequently, when constructing examples it is often necessary [to] “make do” with data sets that are considerably smaller than an analyst is likely to be faced with in practice. To help with this problem, we have added some new data sets to lists of data sets on inside-r.org that we began keeping since almost two years ago. So, if you are looking for a sample data set or if you are the kind of person who enjoys browsing data repositories as some people enjoy browsing bookstores have a look at what is available there. The following presents some of the highlights.

Joseph highlights airline, Medicare, and Australian weather data sets.

The lists include a number of other data sets, but inside-r.org would welcome more.
