Using Clouds for MapReduce Measurement Assignments by Ariel Rabkin, Charles Reiss, Randy Katz, and David Patterson. (ACM Trans. Comput. Educ. 13, 1, Article 2 (January 2013), 18 pages. DOI = 10.1145/2414446.2414448)
Abstract:
We describe our experiences teaching MapReduce in a large undergraduate lecture course using public cloud services and the standard Hadoop API. Using the standard API, students directly experienced the quality of industrial big-data tools. Using the cloud, every student could carry out scalability benchmarking assignments on realistic hardware, which would have been impossible otherwise. Over two semesters, over 500 students took our course. We believe this is the first large-scale demonstration that it is feasible to use pay-as-you-go billing in the cloud for a large undergraduate course. Modest instructor effort was sufficient to prevent students from overspending. Average per-pupil expenses in the Cloud were under $45. Students were excited by the assignment: 90% said they thought it should be retained in future course offerings.
With properly structured assignments, I can see this technique being used to introduce library graduate students to data mining and similar topics on non-trivial data sets.
Getting “hands-on” experience should make them more than a match for the sales types from information vendors.
Data mining also flourishes when it is practiced with an understanding of the underlying semantics of the data set.
I first saw this at: On Teaching MapReduce via Clouds