Foundations of Data Science « Another Word For It

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 29, 2013

Foundations of Data Science

Filed under: Algorithms,Clustering,Data Science,Graphical Models,Graphs,Hidden Markov Model,High Dimensionality,Singular Value Decomposition (SVD),Topic Models (LDA) — Patrick Durusau @ 3:17 pm

Foundations of Data Science by John Hopcroft and Ravindran Kannan.

From the introduction:

Computer science as an academic discipline began in the 60’s. Emphasis was on programming languages, compilers, operating systems, and the mathematical theory that supported these areas. Courses in theoretical computer science covered finite automata, regular expressions, context free languages, and computability. In the 70’s, algorithms was added as an important component of theory. The emphasis was on making computers useful. Today, a fundamental change is taking place and the focus is more on applications. There are many reasons for this change. The merging of computing and communications has played an important role. The enhanced ability to observe, collect and store data in the natural sciences, in commerce, and in other fields calls for a change in our understanding of data and how to handle it in the modern setting. The emergence of the web and social networks, which are by far the largest such structures, presents both opportunities and challenges for theory.

While traditional areas of computer science are still important and highly skilled individuals are needed in these areas, the majority of researchers will be involved with using computers to understand and make usable massive data arising in applications, not just how to make computers useful on specific well-defined problems. With this in mind we have written this book to cover the theory likely to be useful in the next 40 years, just as automata theory, algorithms and related topics gave students an advantage in the last 40 years. One of the major changes is the switch from discrete mathematics to more of an emphasis on probability, statistics, and numerical methods.

In draft form but impressive!

Current chapters:

Introduction
High-Dimensional Space
Random Graphs
Singular Value Decomposition (SVD)
Random Walks and Markov Chains
Learning and the VC-dimension
Algorithms for Massive Data Problems
Clustering
Topic Models, Hidden Markov Process, Graphical Models, and Belief Propagation
Other Topics [Rankings, Hare System for Voting, Compressed Sensing and Sparse Vectors]
Appendix

I am certain the authors would appreciate comments and suggestions concerning the text.
