Archive for the ‘Data Contest’ Category

How to enter a data contest – machine learning for newbies like me

Saturday, November 5th, 2011

From the post:

I’ve not had much experience with machine learning, most of my work has been a struggle just to get data sets that are large enough to be interesting! That’s a big reason why I turned to the Kaggle community when I needed a good prediction algorithm for my current project. I wasn’t completely off the hook though, I still needed to create an example of our current approach, limited as it is, to serve as a benchmark for the teams. While I was at it, it seemed worthwhile to open up the code too, so I’ve created a new Github project:

https://github.com/petewarden/MLloWorld

It actually produces very poor results, but does demonstrate the basics of how to pull in the data and apply one of scikit-learn’s great collection of algorithms. If you get the itch there’s lots of room for improvement, and the contest has another two weeks to run!
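For readers who want a feel for what "pull in the data and apply one of scikit-learn's algorithms" might look like, here is a minimal sketch. It is not taken from the MLloWorld project: the inline CSV, the column names, the helper names load_table and run_benchmark, and the choice of logistic regression are all assumptions made for illustration.

```python
import io

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a contest file: two feature columns and a 0/1 label.
CSV_TEXT = """f1,f2,label
0.1,1.2,0
0.3,0.9,0
0.2,1.1,0
0.4,1.0,0
0.0,1.3,0
0.5,0.8,0
1.1,0.2,1
1.3,0.1,1
0.9,0.4,1
1.2,0.3,1
1.0,0.0,1
1.4,0.5,1
"""


def load_table(text):
    """Parse CSV text into a feature matrix X and an integer label vector y."""
    data = np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1)
    return data[:, :-1], data[:, -1].astype(int)


def run_benchmark(text, seed=0):
    """Fit a simple classifier and return its accuracy on held-out rows."""
    X, y = load_table(text)
    # Hold out a quarter of the rows so the score is not measured on training data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = LogisticRegression()
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)
```

Calling run_benchmark(CSV_TEXT) returns an accuracy between 0 and 1. In a contest setting that number would be the benchmark to beat, and swapping LogisticRegression for another scikit-learn estimator is usually a one-line change.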

There is a case to be made for machine learning in the production of topic maps, and what better motivation for learning it than a contest?

Which makes me wonder: how would we structure something similar for topic maps, that is, contests for creating topic maps from one or more data sets? Coming up with funding for a meaningful prize would not be as hard as setting up a task that is neither too easy nor too hard, at least for the early contests. ;-)

For the early ones, pride of first place might be enough.

Suggestions/Comments?