Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 27, 2013

Data Quality, Feature Engineering, GraphBuilder

Filed under: Data Quality,Design,ETL,GraphBuilder,Pig — Patrick Durusau @ 3:06 pm

Avoiding Cluster-Scale Headaches with Better Tools for Data Quality and Feature Engineering by Ted Willke.

Ted’s second slide reads:

Machine Learning may nourish the soul…

…but Data Preparation will consume it.

Ted starts off talking about the problems of data preparation but fairly quickly focuses in on property graphs and using Pig ETL.

He also outlines outstanding problems with Pig ETL (slides 29-32).

Nothing surprising but good news that Graph Builder 2 Alpha is due out in Dec’ 13.

BTW, GraphBuilder 1.0 can be found at: https://01.org/graphbuilder/

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress