Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 26, 2013

Massive online data stream mining with R

Filed under: Data Mining,Data Streams,R — Patrick Durusau @ 5:31 am

From the post:

A few weeks ago, the stream package was released on CRAN. It allows you to do real-time analytics on data streams. This can be very useful if you are working with large datasets that are already hard to fit into RAM completely, let alone to build a statistical model on without running into memory problems.

The stream package is currently focussed on clustering algorithms available in MOA (http://moa.cms.waikato.ac.nz/details/stream-clustering/) and also eases interfacing with some clustering algorithms already available in R that are suited for data stream clustering. Classification algorithms based on MOA are on the todo list. Currently available clustering algorithms are BIRCH, CluStream, ClusTree, DBSCAN, DenStream, Hierarchical, Kmeans and Threshold Nearest Neighbor.

What if data were always encountered as a stream?

You could request a “re-streaming” of the data, but it is best to do the analysis in a single pass over the stream.

How would that impact your notion of subject identity?

How would you compensate for information learned later in the stream?
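To make the single-pass idea concrete, here is a minimal sketch of sequential (online) k-means in Python. It is not the stream package's API and not any of the MOA algorithms listed above. The names `online_kmeans` and `synthetic_stream` are hypothetical, and the two-cluster test data is invented purely for illustration. The sketch assumes the first k points are acceptable starting centers.

```python
import random


def online_kmeans(stream, k):
    """Sequential k-means: one pass over the stream, O(k) memory."""
    centers = []  # current cluster centers
    counts = []   # number of points absorbed by each center so far
    for point in stream:
        if len(centers) < k:
            # Assumption: seed the centers with the first k points seen.
            centers.append([float(x) for x in point])
            counts.append(1)
            continue
        # Find the nearest center by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda i: sum((c - x) ** 2 for c, x in zip(centers[i], point)),
        )
        counts[nearest] += 1
        # Running-mean update: move the center toward the new point by 1/count.
        rate = 1.0 / counts[nearest]
        centers[nearest] = [
            c + rate * (x - c) for c, x in zip(centers[nearest], point)
        ]
    return centers, counts


def synthetic_stream(n, seed=0):
    """Hypothetical data source: n points drawn around two invented centers."""
    rng = random.Random(seed)
    for _ in range(n):
        cx, cy = rng.choice([(0.0, 0.0), (10.0, 10.0)])
        yield (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))


# Each point is seen exactly once and then discarded.
centers, counts = online_kmeans(synthetic_stream(2000), k=2)
print(centers, counts)
```

Each point is read once and thrown away, so nothing needs to be "re-streamed". Information learned later is absorbed only by nudging the existing centers. An early, poor choice of centers is never revisited, which is one way of seeing why the last question above is a hard one.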
