## Archive for the ‘CS Lectures’ Category

### Harvard Stat 221

Friday, May 10th, 2013

Harvard Stat 221 “Statistical Computing and Visualization”: by Sergiy Nesterko.

From the post:

Stat 221 is Statistical Computing and Visualization. It’s a graduate class on analyzing data without losing scientific rigor, and communicating your work. Topics span the full cycle of a data-driven project including project setup, design, implementation, and creating interactive user experiences to communicate ideas and results. We covered current theory and philosophy of building models for data, computational methods, and tools such as d3js, parallel computing with MPI, R.

See Sergiy’s post for the lecture slides from this course.

### Virtual School summer courses…

Monday, May 6th, 2013

Virtual School summer courses on data-intensive and many-core computing

From the webpage:

Graduate students, post-docs and professionals from academia, government, and industry are invited to sign up now for two summer school courses offered by the Virtual School of Computational Science and Engineering.

These Virtual School courses will be delivered to sites nationwide using high-definition videoconferencing technologies, allowing students to participate at a number of convenient locations where they will be able to work with a cohort of fellow computational scientists, have access to local experts, and interact in real time with course instructors.

The Data Intensive Summer School focuses on the skills needed to manage, process, and gain insight from large amounts of data. It targets researchers from the physical, biological, economic, and social sciences who need to deal with large collections of data. The course will cover the nuts and bolts of data-intensive computing, common tools and software, predictive analytics algorithms, data management, and non-relational database models.

(…)

For more information about the Data-Intensive Summer School, including pre-requisites and course topics, visit http://www.vscse.org/summerschool/2013/bigdata.html.

The Proven Algorithmic Techniques for Many-core Processors summer school will present students with the seven most common and crucial algorithm and data optimization techniques to support successful use of GPUs for scientific computing.

Studying many current GPU computing applications, the course instructors have learned that the limits of an application’s scalability are often related to some combination of memory bandwidth saturation, memory contention, imbalanced data distribution, or data structure/algorithm interactions. Successful GPU application developers often adjust their data structures and problem formulation specifically for massive threading and execute their threads leveraging shared on-chip memory resources for bigger impact. The techniques presented in the course can improve performance of applicable kernels by 2-10X in current processors while improving future scalability.
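None of those data-layout concerns is GPU-only. Here is a schematic Python sketch (my own illustration, not course material) of the array-of-structs vs. struct-of-arrays choice the course description alludes to; Python lists are not actually contiguous memory, so treat this as shape-of-access only, not a benchmark:

```python
# Array-of-structs: one list of (x, y, z) records; reading all x values
# touches every record. Struct-of-arrays: one list per field; reading
# all x values scans a single sequence. GPU kernels want the second
# layout so that adjacent threads load adjacent addresses ("coalesced"
# access), one of the optimization patterns the course covers.
n = 10_000
aos = [(float(i), 0.0, 0.0) for i in range(n)]            # interleaved fields
soa = {"x": [float(i) for i in range(n)], "y": [0.0] * n, "z": [0.0] * n}

sum_aos = sum(rec[0] for rec in aos)   # strided access pattern
sum_soa = sum(soa["x"])                # unit-stride access pattern

# Same reduction either way; only the memory traffic pattern differs.
assert sum_aos == sum_soa == n * (n - 1) / 2
```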

(…)

For more information about the Proven Algorithmic Techniques for Many-core Processors course, including pre-requisites and course topics, visit http://www.vscse.org/summerschool/2013/manycore.html.

Think of it as summer camp. For $100 (waived at some locations), it would be hard to do better.

### Vote for Web Science MOOC!

Wednesday, May 1st, 2013

Please help me to realize my Web science massive open online course by René Pickhardt.

René has designed a Web Science MOOC but needs your vote at: https://moocfellowship.org/submissions/web-science to get the course funded.

Details on the course are at: Please help me to realize my Web science massive open online course.

The Web is important but, to be honest, I am hopeful success here will encourage René to do a MOOC on graphs. So I have an ulterior motive for promoting this particular MOOC.

### Spring/Summer Reading – 2013

Monday, April 8th, 2013

The ACM has released: Best Reviews (2012) and Notable Computing Books and Articles of 2012.

Before you hit the summer conference or vacation schedule, visit your local bookstore or load up your ebook reader!

I first saw this at Best Reviews & Notable Books and Articles of 2012 by Shar Steed.

### Introduction to C and C++

Monday, March 25th, 2013

Introduction to C and C++

Description:

This course provides a fast-paced introduction to the C and C++ programming languages. You will learn the required background knowledge, including memory management, pointers, preprocessor macros, object-oriented programming, and how to find bugs when you inevitably use any of those incorrectly. There will be daily assignments and a small-scale individual project. This course is offered during the Independent Activities Period (IAP), which is a special 4-week term at MIT that runs from the first week of January until the end of the month.

Just in case you want a deeper understanding of bugs that enable hacking or how to avoid creating such bugs in the first place.

### Training a New Generation of Data Scientists

Thursday, March 21st, 2013

Training a New Generation of Data Scientists by Ryan Goldman.

From the post:

Data scientists drive data as a platform to answer previously unimaginable questions. These multi-talented data professionals are in demand like never before because they identify or create some of the most exciting and potentially profitable business opportunities across industries. However, a scarcity of existing external talent will require companies of all sizes to find, develop, and train their people with backgrounds in software engineering, statistics, or traditional business intelligence as the next generation of data scientists.

Join us for the premiere of Training a New Generation of Data Scientists on Tuesday, March 26, at 2pm ET/11am PT. In this video, Cloudera’s Senior Director of Data Science, Josh Wills, will discuss what data scientists do, how they think about problems, the relationship between data science and Hadoop, and how Cloudera training can help you join this increasingly important profession. Following the video, Josh will answer your questions about data science, Hadoop, and Cloudera’s Introduction to Data Science: Building Recommender Systems course.

This could be fun! And if nothing else, it will give you the tools to distinguish legitimate training, like Cloudera’s, from the “How to make $millions in real estate” sort of training, sold by the guy who makes his money selling lectures and books.

As “hot” as data science is, you don’t have to look far to find that sort of training.

### UW Courses in Computer Science and Engineering

Monday, February 25th, 2013

University of Washington Courses in Computer Science and Engineering

When I noticed the 2008 date on CSE 321: Discrete Structures 2008, I checked for a later offering of the course. The most recent offering was in 2010.

That’s still not terribly recent for a fundamental course so I ended up at the general courses page you see above.

By my count, two hundred and twenty-eight (228) courses, many of those I checked offering video lectures and other materials.

I never did discover the “official” successor for CSE 321, but given the wealth of course materials, that is a small matter.

### TCS online series- could this work?

Saturday, February 2nd, 2013

TCS online series- could this work? by Bill Gasarch.

From the post:

Oded Regev, Anindya De and Thomas Vidick are about to start an online TCS seminar series. See here for details, though I have the first few paragraphs below.

It’s an interesting idea: we can’t all get to conferences, so this is a good way to get information out there. Wave of the future? We’ll see how it goes.

Here are the first few paragraphs:

Ever wished you could attend that talk if only you didn’t have to hike the Rockies, or swim across the Atlantic, to get there; if only it could have been scheduled the following week, because this week is finals; if only you could watch it from your desk, or for that matter directly from your bed?

Starting this semester TCS+ will solve all your worries. We are delighted to announce the initiation of a new series of *online* seminars in theoretical computer science. The seminars will be run using the hangout feature of Google+. The speaker and slides will be broadcast live as well as recorded and made available online. Anyone with a computer (and a decent browser) can watch; anyone with a webcam can join the live audience and participate.

For updates, see the TCS webpage: https://sites.google.com/site/plustcs/

Keep a watch on this for ideas to stay ahead of your competition.

### NYU Large Scale Machine Learning Class [Not a MOOC]

Tuesday, January 8th, 2013

NYU Large Scale Machine Learning Class by John Langford.

From the post:

Yann LeCun and I are coteaching a class on Large Scale Machine Learning starting late January at NYU. This class will cover many tricks to get machine learning working well on datasets with many features, examples, and classes, along with several elements of deep learning and support systems enabling the previous.

This is not a beginning class—you really need to have taken a basic machine learning class previously to follow along. Students will be able to run and experiment with large scale learning algorithms since Yahoo! has donated servers which are being configured into a small scale Hadoop cluster. We are planning to cover the frontier of research in scalable learning algorithms, so good class projects could easily lead to papers.

For me, this is a chance to teach on many topics of past research. In general, it seems like researchers should engage in at least occasional teaching of research, both as a proof of teachability and to see their own research through that lens. More generally, I expect there is quite a bit of interest: figuring out how to use data to make predictions well is a topic of growing interest to many fields. In 2007, this was true, and demand is much stronger now. Yann and I also come from quite different viewpoints, so I’m looking forward to learning from him as well.

We plan to videotape lectures and put them (as well as slides) online, but this is not a MOOC in the sense of online grading and class certificates. I’d prefer that it was, but there are two obstacles: NYU is still figuring out what to do as a University here, and this is not a class that has ever been taught before. Turning previous tutorials and class fragments into coherent subject matter for the 50 students we can support at NYU will be pretty challenging as is. My preference, however, is to enable external participation where it’s easily possible.

Not a MOOC but videos of the lectures will be available. Details under development.

Note the request for suggestions on the class.
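A standard trick behind “datasets with many features” at this scale, used notably in John Langford’s own Vowpal Wabbit, is feature hashing: map feature names into a fixed-size weight vector so the learner never needs a feature dictionary. A minimal sketch (my example, not class material; `hashlib.md5` is used only to get a deterministic hash):

```python
import hashlib

def feature_index(name: str, dim: int = 2 ** 20) -> int:
    """Deterministically hash a feature name into [0, dim)."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % dim

def hashed_bow(tokens, dim=2 ** 20):
    """Sparse bag-of-words vector indexed by hashed feature names.
    Memory is bounded by dim no matter how many distinct features appear."""
    vec = {}
    for tok in tokens:
        i = feature_index(tok, dim)
        vec[i] = vec.get(i, 0.0) + 1.0
    return vec

v = hashed_bow("big data big models".split())
assert sum(v.values()) == 4.0   # four tokens counted
assert len(v) <= 3              # at most three distinct features
```

The occasional hash collision merges two features into one weight; in practice the model tolerates this in exchange for constant memory.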

### Coursera’s Data Analysis with R course starts Jan 22

Monday, December 24th, 2012

Coursera’s Data Analysis with R course starts Jan 22 by David Smith.

From the post:

Following on from Coursera’s popular course introducing the R language, a new course on data analysis with R starts on January 22. The simply-titled Data Analysis course will provide practically-oriented instruction on how to plan, carry out, and communicate analyses of real data sets with R.

See also: Computing for Data Analysis course, which starts January 2nd.

Being sober by January 2nd is going to be a challenge but worth the effort.

### edX – Spring 2013

Thursday, December 20th, 2012

edX – Spring 2013

Of particular interest:

This spring also features Harvard’s Copyright, taught by Harvard Law School professor William Fisher III, former law clerk to Justice Thurgood Marshall and expert on the hotly debated U.S. copyright system, which will explore the current law of copyright and the ongoing debates concerning how that law should be reformed. Copyright will be offered as an experimental course, taking advantage of different combinations and uses of teaching materials, educational technologies, and the edX platform. 500 learners will be selected through an open application process that will run through January 3rd 2013.

An opportunity to use a topic map with complex legal issues and sources.

But CS topics are not being neglected:

In addition to these new courses, edX is bringing back several courses from the popular fall 2012 semester: Introduction to Computer Science and Programming; Introduction to Solid State Chemistry; Introduction to Artificial Intelligence; Software as a Service I; Software as a Service II; Foundations of Computer Graphics.

### Analyzing Big Data With Twitter

Friday, December 14th, 2012

UC Berkeley Course Lectures: Analyzing Big Data With Twitter by Marti Hearst.

Marti gives a summary of this excellent class, with links to videos, slides and high level notes for the course.

If you enjoyed these materials, make a post about them, recommend them to others or even send Marti a note of appreciation.

Prof. Marti Hearst, ude.yelekreb.loohcsi@tsraeh

### FutureLearn [MOOCs from Open University, UK]

Friday, December 14th, 2012

Futurelearn

From the webpage:

Futurelearn will bring together a range of free, open, online courses from leading UK universities, in the same place and under the same brand.

The Company will be able to draw on The Open University’s unparalleled expertise in delivering distance learning and in pioneering open education resources. These will enable Futurelearn to present a single, coherent entry point for students to the best of the UK’s online education content.

Futurelearn will increase the accessibility of higher education, opening up a wide range of new online courses and learning materials to students across the UK and the rest of the world.

More details in 2013.

If you want to know more, now, try:

Open University launches British Mooc platform to rival US providers

or,

OU Launches FutureLearn Ltd

Have you noticed that the more players in a space the greater the semantic diversity?

Makes me suspect that semantic diversity is a characteristic of humanity.

Are there any counter examples?

PS: MOOCs should be fertile grounds for mapping across different vocabularies for the same content.

PPS: In case you are wondering why the Open University has the .com domain, consider that futurelearn.org was taken. Oh! There are those damned re-use of name issues!

### Practical Foundations for Programming Languages

Saturday, December 8th, 2012

PFPL is out! by Robert Harper.

From the post:

Practical Foundations for Programming Languages, published by Cambridge University Press, is now available in print! It can be ordered from the usual sources, and maybe some unusual ones as well. If you order directly from Cambridge using this link, you will get a 20% discount on the cover price (pass it on).

Since going to press I have, inevitably, been informed of some (so far minor) errors that are corrected in the online edition. These corrections will make their way into the second printing. If you see something fishy-looking, compare it with the online edition first to see whether I may have already corrected the mistake. Otherwise, send your comments to me: rwh@cs.cmu.edu.

By the way, the cover artwork is by Scott Draves, a former student in my group, who is now a professional artist as well as a researcher at Google in NYC. Thanks, Scott!

Update: The very first author’s copy hit my desk today!

Congratulations to Robert!

The holidays are upon us so order early and often!

### Introduction to Databases [MOOC, Stanford, January 2013]

Thursday, December 6th, 2012

Introduction to Databases (info/registration link) – Starts January 15, 2013.

From the webpage:

“Introduction to Databases” had a very successful public offering in fall 2011, as one of Stanford’s inaugural three massive open online courses. Since then, the course materials have been improved and expanded, and we’re excited to be launching a second public offering of the course in winter 2013. The course includes video lectures and demos with in-video quizzes to check understanding, in-depth standalone quizzes, a wide variety of automatically-checked interactive programming exercises, midterm and final exams, a discussion forum, optional additional exercises with solutions, and pointers to readings and resources. Taught by Professor Jennifer Widom, the curriculum draws from Stanford’s popular Introduction to Databases course.

Why Learn About Databases?

Databases are incredibly prevalent — they underlie technology used by most people every day if not every hour. Databases reside behind a huge fraction of websites; they’re a crucial component of telecommunications systems, banking systems, video games, and just about any other software system or electronic device that maintains some amount of persistent information. In addition to persistence, database systems provide a number of other properties that make them exceptionally useful and convenient: reliability, efficiency, scalability, concurrency control, data abstractions, and high-level query languages. Databases are so ubiquitous and important that computer science graduates frequently cite their database class as the one most useful to them in their industry or graduate-school careers.

Course Syllabus

This course covers database design and the use of database management systems for applications. It includes extensive coverage of the relational model, relational algebra, and SQL. It also covers XML data including DTDs and XML Schema for validation, and the query and transformation languages XPath, XQuery, and XSLT. The course includes database design in UML, and relational design principles based on dependencies and normal forms. Many additional key database topics from the design and application-building perspective are also covered: indexes, views, transactions, authorization, integrity constraints, triggers, on-line analytical processing (OLAP), JSON, and emerging NoSQL systems. Working through the entire course provides comprehensive coverage of the field, but most of the topics are also well-suited for “a la carte” learning.

Biography

Jennifer Widom is the Fletcher Jones Professor and Chair of the Computer Science Department at Stanford University. She received her Bachelors degree from the Indiana University School of Music in 1982 and her Computer Science Ph.D. from Cornell University in 1987. She was a Research Staff Member at the IBM Almaden Research Center before joining the Stanford faculty in 1993. Her research interests span many aspects of nontraditional data management. She is an ACM Fellow and a member of the National Academy of Engineering and the American Academy of Arts & Sciences; she received the ACM SIGMOD Edgar F. Codd Innovations Award in 2007 and was a Guggenheim Fellow in 2000; she has served on a variety of program committees, advisory boards, and editorial boards.

Another reason to take the course:

The structure and capabilities of databases shape the way we create solutions.

Consider normalization. An investment of time and effort that may be needed, for some problems, but not others.

Absent alternative approaches, you see every data problem as requiring normalization.

(You may anyway after taking this course. Education cannot impart imagination.)
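To make the normalization trade-off concrete, here is a toy schema (my own, not from the course) in SQLite: the flat table repeats the instructor’s office on every course row, so an office change must touch many rows, while the normalized pair stores it once and joins it back when needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized: office repeated per course row (update anomaly waiting to happen).
cur.execute("CREATE TABLE courses_flat (course TEXT, instructor TEXT, office TEXT)")
cur.executemany("INSERT INTO courses_flat VALUES (?, ?, ?)",
                [("CS145", "Widom", "Gates 420"),
                 ("CS245", "Widom", "Gates 420")])

# Normalized on the dependencies course -> instructor, instructor -> office.
cur.execute("CREATE TABLE courses (course TEXT, instructor TEXT)")
cur.execute("CREATE TABLE instructors (instructor TEXT PRIMARY KEY, office TEXT)")
cur.executemany("INSERT INTO courses VALUES (?, ?)",
                [("CS145", "Widom"), ("CS245", "Widom")])
cur.execute("INSERT INTO instructors VALUES ('Widom', 'Gates 420')")

# The join recovers exactly the flat view.
rows = cur.execute("""SELECT c.course, i.office
                      FROM courses c JOIN instructors i USING (instructor)
                      ORDER BY c.course""").fetchall()
assert rows == [("CS145", "Gates 420"), ("CS245", "Gates 420")]
```

Whether the extra join is worth the saved redundancy is exactly the judgment call the parenthetical above is about.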

### Algorithms: Design and Analysis, Part 2

Wednesday, December 5th, 2012

Algorithms: Design and Analysis, Part 2 by Tim Roughgarden. (Coursera)

From the course description:

In this course you will learn several fundamental principles of advanced algorithm design: greedy algorithms and applications; dynamic programming and applications; NP-completeness and what it means for the algorithm designer; the design and analysis of heuristics; and more.

The course started December 3, 2012 so if you are going to join, best do so soon.
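As a taste of the greedy half of that syllabus, the classic interval-scheduling algorithm (sort by finish time, take whatever fits) runs in a few lines. My sketch, not course code:

```python
def max_nonoverlapping(intervals):
    """Greedy interval scheduling: sort by finish time, then take each
    interval whose start is no earlier than the last chosen finish.
    The standard exchange argument proves this maximizes the count."""
    chosen, last_finish = [], float("-inf")
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen

picked = max_nonoverlapping([(1, 3), (3, 5), (4, 7), (5, 8)])
assert len(picked) == 3   # (1,3), (3,5), (5,8)
```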

### Course on Information Theory, Pattern Recognition, and Neural Networks

Friday, November 23rd, 2012

From the description:

A series of sixteen lectures covering the core of the book “Information Theory, Inference, and Learning Algorithms (Cambridge University Press, 2003)” which can be bought at Amazon, and is available free online. A subset of these lectures used to constitute a Part III Physics course at the University of Cambridge. The high-resolution videos and all other course material can be downloaded from the Cambridge course website.

Excellent lectures on information theory, the probability that a message sent is the one received.

Makes me wonder if there is a similar probability theory for the semantics of a message sent being the semantics of the message as received?
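The core quantity behind that “probability the message received is the message sent” view is channel capacity. For the binary symmetric channel it reduces to one formula, C = 1 − H(p), where H is the binary entropy; a small sketch of my own:

```python
import math

def binary_entropy(p):
    """H(p) in bits; H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Shannon capacity of a channel that flips each bit with probability p."""
    return 1.0 - binary_entropy(p)

assert bsc_capacity(0.0) == 1.0              # noiseless: 1 bit per bit
assert abs(bsc_capacity(0.5)) < 1e-12        # pure noise carries nothing
assert abs(bsc_capacity(0.11) - 0.5) < 1e-3  # 11% flips halve the channel
```

Note the symmetry: a channel that flips 89% of bits is as good as one that flips 11%, since you can just invert the output.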

### Spark: Making Big Data Analytics Interactive and Real-­Time (Video Lecture)

Wednesday, November 14th, 2012

Spark: Making Big Data Analytics Interactive and Real-­Time by Matei Zaharia. Post from Marti Hearst.

From the post:

Spark is the hot next thing for Hadoop / MapReduce, and yesterday Matei Zaharia, a PhD student in UC Berkeley’s AMP Lab, gave us a terrific lecture about how it works and what’s coming next. The key idea is to make analysis of big data interactive and able to respond in real time. Matei also gave a live demo.

Spark: Lightning-Fast Cluster Computing (website).

Another great lecture from Marti’s class on Twitter and Big Data.
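The canonical Spark demo is word count via the `flatMap`/`map`/`reduceByKey` pipeline. Here is a pure-Python analogue of that dataflow shape (my sketch; real Spark distributes and caches each stage across the cluster, which this obviously does not):

```python
from collections import defaultdict

# Mimics: rdd.flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(add)
def flat_map(f, data):
    return [y for x in data for y in f(x)]

def map_pairs(words):
    return [(w, 1) for w in words]

def reduce_by_key(pairs):
    acc = defaultdict(int)
    for key, value in pairs:
        acc[key] += value
    return dict(acc)

lines = ["spark makes big data interactive", "big data in real time"]
counts = reduce_by_key(map_pairs(flat_map(str.split, lines)))
assert counts["big"] == 2 and counts["data"] == 2 and counts["spark"] == 1
```

Spark’s contribution is keeping the intermediate datasets in cluster memory between such stages, which is what makes the iterative and interactive use Matei describes feasible.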

### Introduction to Complexity [Santa Fe Institute]

Saturday, November 3rd, 2012

Introduction to Complexity [Santa Fe Institute]

From the webpage:

Santa Fe Institute will be launching a series of MOOCs (Massive Open On-line Courses), covering the field of complex systems science. Our first course, Introduction to Complexity, will be an accessible introduction to the field, with no pre-requisites. You don’t need a science or math background to take this introductory course; it simply requires an interest in the field and the willingness to participate in a hands-on approach to the subject.

In this ten-week course, you’ll learn about the tools used by complex systems scientists to understand, and sometimes to control, complex systems. The topics you’ll learn about include dynamics, chaos, fractals, information theory, computation theory, evolution and adaptation, agent-based modeling, and networks. You’ll also get a sense of how these topics fit together to help explain how complexity arises and evolves in nature, society, and technology.

Introduction to Complexity will be free and open to anyone. The instructor is Melanie Mitchell, External Professor at SFI, Professor of Computer Science at Portland State University, and author of the award-winning book, Complexity: A Guided Tour. The course will begin in early 2013.
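Of the topics mentioned (dynamics, chaos, fractals), chaos has perhaps the shortest demonstration: the logistic map, where two trajectories starting a billionth apart soon disagree completely. My illustration; the course may use different examples:

```python
# The logistic map x -> r*x*(1-x). At r = 4 it is chaotic: sensitive
# dependence on initial conditions makes nearby orbits diverge.
def logistic_orbit(x0, r=4.0, steps=200):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-9)   # a one-part-in-a-billion nudge

# The tiny initial difference is amplified until the orbits decorrelate.
assert max(abs(x - y) for x, y in zip(a, b)) > 0.1
```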

You can subscribe to course announcements at this page.

If you don’t know the Santa Fe Institute, you should.

### Coming soon on JAXenter: videos from JAX London [What Does Hardware Know?]

Wednesday, October 31st, 2012

Coming soon on JAXenter: videos from JAX London by Elliot Bentley.

From the post:

Can you believe it’s only been two weeks since JAX London? We’re already planning for the next one at JAX Towers (yes, really).

Yet if you’re already getting nostalgic, never fear – JAXenter is on hand to help you relive those glorious yet fleeting days, and give a taste of what you may have missed.

For a start, we’ve got videos of almost every session in the main room, including keynotes from Doug Cutting, Patrick Debois, Steve Poole and Martijn Verburg & Kirk Pepperdine, which we’ll be releasing gradually onto the site over the coming weeks. Slides for the rest of JAX London’s sessions are already freely available on SlideShare.

Pepperdine and Verburg, “Java and the Machine,” remark:

There’s no such thing as a process as far as the hardware is concerned.

A riff I need to steal to say:

There’s no such thing as semantics as far as the hardware is concerned.

We attribute semantics to data for input, we attribute semantics to processing of data by hardware, we attribute semantics to computational results.

I didn’t see a place for hardware in that statement. Do you?

### Artificial Intelligence – Fall 2012 – CMU

Wednesday, October 31st, 2012

From the course overview:

Topics:

This course is about the theory and practice of Artificial Intelligence. We will study modern techniques for computers to represent task-relevant information and make intelligent (i.e. satisfying or optimal) decisions towards the achievement of goals. The search and problem solving methods are applicable throughout a large range of industrial, civil, medical, financial, robotic, and information systems. We will investigate questions about AI systems such as: how to represent knowledge, how to effectively generate appropriate sequences of actions and how to search among alternatives to find optimal or near-optimal solutions. We will also explore how to deal with uncertainty in the world, how to learn from experience, and how to learn decision rules from data. We expect that by the end of the course students will have a thorough understanding of the algorithmic foundations of AI, how probability and AI are closely interrelated, and how automated agents learn. We also expect students to acquire a strong appreciation of the big-picture aspects of developing fully autonomous intelligent agents. Other lectures will introduce additional aspects of AI, including unsupervised and on-line learning, autonomous robotics, and economic/game-theoretic decision making.

Learning Objectives

By the end of the course, students should be able to:

1. Identify the type of an AI problem (search, inference, decision making under uncertainty, game theory, etc).
2. Formulate the problem as a particular type. (Example: define a state space for a search problem)
3. Compare the difficulty of different versions of AI problems, in terms of computational complexity and the efficiency of existing algorithms.
4. Implement, evaluate and compare the performance of various AI algorithms. Evaluation could include empirical demonstration or theoretical proofs.
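Objective 2’s “define a state space for a search problem” can be made concrete with a toy grid world and breadth-first search (my sketch, not course code): states are cells, actions are the four moves, and BFS returns a shortest action sequence.

```python
from collections import deque

def bfs(start, goal, walls, width=4, height=4):
    """Shortest path in a grid state space; returns a list of moves."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if (x, y) == goal:
            return path
        for dx, dy, action in [(1, 0, "E"), (-1, 0, "W"), (0, 1, "N"), (0, -1, "S")]:
            nxt = (x + dx, y + dy)
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls and nxt not in seen):
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # goal unreachable

path = bfs((0, 0), (2, 0), walls={(1, 0)})
assert path is not None and len(path) == 4   # forced to detour around the wall
```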

Textbook:

It is helpful, but not required, to have Artificial Intelligence: A Modern Approach by Russell and Norvig.

Judging from the materials on the website, this is a very good course.

### Algorithms for Massive Data Sets

Sunday, October 28th, 2012

Algorithms for Massive Data Sets by Inge Li Gørtz and Philip Bille.

From the course description:

A student who has met the objectives of the course will be able to:

• Describe an algorithm in a comprehensible manner, i.e., accurately, concisely, and unambiguously.
• Prove correctness of algorithms.
• Analyze, evaluate, and compare the performance of algorithms in models of computation relevant to massive data sets.
• Analyze, evaluate, and compare the quality and reliability of solutions.
• Apply and extend relevant algorithmic techniques for massive data sets.
• Design algorithms for problems related to massive data sets.
• Look up and apply relevant research literature for problems related to massive data sets.
• Systematically identify and analyze problems and make informed choices for solving the problems based on the analysis.
• Argue clearly for the choices made when solving a problem.

Papers, slides and exercises provided for these topics:

Week 1: Introduction and Hashing: Chained, Universal, and Perfect.

Week 2: Predecessor Data Structures: x-fast tries and y-fast tries.

Week 3: Decremental Connectivity in Trees: Cluster decomposition, Word-Level Parallelism.

Week 4: Nearest Common Ancestors: Distributed data structures, Heavy-path decomposition, alphabetic codes.

Week 5: Amortized analysis and Union-Find.

Week 6: Range Reporting: Range Trees, Fractional Cascading, and kD Trees.

Week 7: Persistent data structures.

Week 8: String matching.

Week 9: String Indexing: Dictionaries, Tries, Suffix trees, and Suffix Sorting.

Week 10: Introduction to approximation algorithms. TSP, k-center, vertex cover.

Week 11: Approximation algorithms: Set Cover, stable matching.

Week 12: External Memory: I/O Algorithms, Cache-Oblivious Algorithms, and Dynamic Programming
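From Week 1, universal hashing is small enough to show in full: pick h(x) = ((a·x + b) mod p) mod m with random a, b and a prime p larger than the key universe, and any two distinct keys collide with probability at most about 1/m, independent of the input. A sketch of my own:

```python
import random

class UniversalHash:
    """One random member of the Carter-Wegman family
    h(x) = ((a*x + b) mod p) mod m, for integer keys in [0, p)."""

    def __init__(self, m, p=2_147_483_647, rng=random):
        self.m, self.p = m, p
        self.a = rng.randrange(1, p)   # a != 0
        self.b = rng.randrange(0, p)

    def __call__(self, x):
        return ((self.a * x + self.b) % self.p) % self.m

h = UniversalHash(m=128)
buckets = [h(x) for x in range(1000)]
assert all(0 <= bkt < 128 for bkt in buckets)   # always lands in a bucket
```

The guarantee holds over the random choice of a and b, not over the keys, which is why chained hashing with a universal family gives expected O(1) lookups on any input.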

Just reading the papers will improve your big data skills.

### 100 most popular Machine Learning talks at VideoLectures.Net

Friday, October 26th, 2012

100 most popular Machine Learning talks at VideoLectures.Net by Davor Orlič.

A treasure trove of lectures on machine learning.

If there is a sort order to this collection (title, author, length, subject), it escapes me.

Even browsing you will find more than enough material to fill the coming weekend (and beyond).

### 7 John McCarthy Papers in 7 weeks – Prologue

Sunday, October 21st, 2012

7 John McCarthy Papers in 7 weeks – Prologue by Carin Meier.

From the post:

In the spirit of Seven Languages in Seven Weeks, I have decided to embark on a quest. But instead of focusing on expanding my mindset with different programming languages, I am focusing on trying to get into the mindset of John McCarthy, father of LISP and AI, by reading and thinking about seven of his papers.

See Carin’s blog for progress so far.

I first saw this at John D. Cook’s The Endeavour.

How would you react to something similar for topic maps?

### Splunk’s Software Architecture and GUI for Analyzing Twitter Data

Wednesday, September 26th, 2012

From the post:

Today we learned about an alternative software architecture for processing large data, getting the technical details from Splunk’s VP of Engineering, Stephen Sorkin. Splunk also has a really amazing GUI for analyzing Twitter and other data sources in real time; be sure to watch the last 15 minutes of the video to see the demo:

Someone needs to organize a “big data tool of the month” club!

Or at the rate of current development, would that be a “big data tool of the week” club?

### Introductory FP Course Materials

Saturday, September 15th, 2012

Introductory FP Course Materials by Robert Harper.

Deeply awesome body of material.

Enjoy!

### Coding to the Twitter API

Wednesday, September 12th, 2012

Coding to the Twitter API by Marti Hearst.

From the post:

Today Rion Snow saved us huge amounts of time by giving us a primo introduction to the Twitter API. We learned about both the RESTful API and the streaming API for both Java and Python.

A very cool set of slides!

Just the right amount of detail and amusement. Clearly an experienced presenter!

### Analyzing Big Data with Twitter

Tuesday, September 11th, 2012

Analyzing Big Data with Twitter

Not really with Twitter but with tools sponsored/developed/used by Twitter. Lecture series at the UC Berkeley School of Information.

Videos of lectures are posted online.

Check out the syllabus for assignments and current content.

Four (4) lectures so far!

• Big Data Analytics with Twitter – Marti Hearst & Gilad Mishne. Introduction to Twitter in general.
• Twitter Philosophy and Software Architecture – Othman Laraki & Raffi Krikorian.
• Introduction to Hadoop – Bill Graham.
• Apache Pig – Jon Coveney
• … more to follow.

### Grant Seeking/Funding As Computer Science Activity

Sunday, August 26th, 2012

Robert Harper writes in: Believing in Computer Science:

It’s not every day that I can say that I agree with Bertrand Meyer, but today is an exception. Meyer has written an opinion piece in the current issue of C.ACM about science funding that I think is worth amplifying. His main point is that funding agencies, principally the NSF and the ERC, are constantly pushing for “revolutionary” research, at the expense of “evolutionary” research. Yet we all (including the funding agencies) know full well that, in almost every case, real progress is made by making seemingly small advances on what is already known, and that whether a body of research is revolutionary or not can only be assessed with considerable hindsight. Meyer cites the example of Hoare’s formulation of his logic of programs, which was at the time a relatively small increment on Floyd’s method for proving properties of programs. For all his brilliance, Hoare didn’t just invent this stuff out of thin air, he built on and improved upon the work that had gone before, as of course have hundreds of others built on his in turn. This all goes without saying, or ought to, but as Meyer points out, we computer scientists are constantly bombarded with direct and indirect exhortations to abandon all that has gone before, and to make promises that no one can honestly keep.

Meyer’s rallying cry is for incrementalism. It’s a tough row to hoe. Who could possibly argue against fostering earth-shattering research that breaks new ground and summarily refutes all that has gone before? And who could possibly defend work that is obviously just another slice of the same salami, perhaps with a bit of mustard this time? And yet what he says is obviously true. Funding agencies routinely beg the very question under consideration by stipulating a priori that there is something wrong with a field, and that an entirely new approach is required. With all due respect to the people involved, I would say that calls such as these are both ill-informed and outrageously arrogant.

What Harper and Meyer write is true, but misses a critical point.

To illustrate: What do you think would happen if one or more of the “impossible” funding proposals succeeded?

Consider the funding agency and its staff. If even one of its perennial funding problems were actually solved, what would replace it next time? Can’t have a funding apparatus, with clerks, rule books, procedures, judges, etc., without some problem to be addressed. Solving any sufficiently large problem would be a nightmare for a funding agency.

On a par with the March of Dimes solving the problem of polio. It had the choice of finding a new mission or dissolving. Can you imagine a funding agency presented with that choice?

CS funding agencies avoid that dilemma by funding research that by definition is very unlikely to succeed.

And what of the grant seekers?

What if they could only accept graduate students who can solve nearly impossible CS problems? They would not have a very large department with that limitation. And consequently very small budgets, limited publication venues, conferences, etc.

I completely agree with Harper and Meyer, but CS departments should start teaching grant seeking/funding as a CS activity.

Perhaps even a Masters of CS/Grants&Funding? (There may be one already, check your local course catalog.)

“Real” CS will proceed incrementally, but then it always has.

I retained the link in Robert’s post, but you should forward Long Live Incremental Research!, http://cacm.acm.org/blogs/blog-cacm/109579-long-live-incremental-research/fulltext, so your non-ACM friends can enjoy Meyer’s post.

### Machine Learning [Andrew Ng]

Saturday, August 25th, 2012

Machine Learning [Andrew Ng]

The machine learning course by Andrew Ng started up on 20 August 2012, so there is time to enroll and catch up.

From the post:

What Is Machine Learning?

Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. In this class, you will learn about the most effective machine learning techniques, and gain practice implementing them and getting them to work for yourself. More importantly, you’ll learn about not only the theoretical underpinnings of learning, but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems. Finally, you’ll learn about some of Silicon Valley’s best practices in innovation as it pertains to machine learning and AI.
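For a flavor of the “practice implementing them” part, here is gradient descent on a one-variable least-squares fit, typically the first algorithm in an introductory machine learning course. A tiny version of my own:

```python
# Fit y = w*x by gradient descent on mean squared error.
# The data is generated from w = 3, so the fitted w should approach 3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]

w, lr = 0.0, 0.01
for _ in range(500):
    # d/dw of mean((w*x - y)^2) = mean(2*(w*x - y)*x)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

assert abs(w - 3.0) < 1e-3
```

Because the loss here is a simple quadratic bowl in w, the iterates shrink the error by a constant factor each step; the course generalizes exactly this update to many weights and nonlinear models.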