Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

August 9, 2012

Teaching the World to Search

Filed under: CS Lectures,Searching — Patrick Durusau @ 3:50 pm

Teaching the World to Search by Maggie Johnson.

From the post:

For two weeks in July, we ran Power Searching with Google, a MOOC (Massive Open Online Course) similar to those pioneered by Stanford and MIT. We blended this format with our social and communication tools to create a community learning experience around search. The course covered tips and tricks for Google Search, like using the search box as a calculator, or color filtering to find images.

The course had interactive activities to practice new skills and reinforce learning, and many opportunities to connect with other students using tools such as Google Groups, Moderator and Google+. Two of our search experts, Dan Russell and Matt Cutts, moderated Hangouts on Air, answering dozens of questions from students in the course. There were pre-, mid- and post-class assessments that students were required to pass to receive a certificate of completion. The course content is still available.

Won’t be the same as taking the course but if you missed it, see the materials online.

As you learn new search techniques, consider what it is about the data (or the user) that makes those techniques effective.

Understanding the relationships between data and search techniques may make you a better searcher.

Understanding the relationship between tool and user may make you a better tool designer.

August 6, 2012

There and Back Again

Filed under: CS Lectures,Programming,Types — Patrick Durusau @ 3:18 pm

There and Back Again by Robert Harper.

From the post:

Last fall it became clear to me that it was “now or never” time for completing Practical Foundations for Programming Languages, so I put just about everything else aside and made the big push to completion. The copy editing phase is now complete, the cover design (by Scott Draves) is finished, and it’s now in the final stages of publication. You can even pre-order a copy on Amazon; it’s expected to be out in November.

I can already think of ways to improve it, but at some point I had to declare victory and save some powder for future editions. My goal in writing the book is to organize as wide a body of material as I could manage in a single unifying framework based on structural operational semantics and structural type systems. At over 600 pages the manuscript is at the upper limit of what one can reasonably consider a single book, even though I strived for concision throughout.

Quite a lot of the technical development is original, and does not follow along traditional lines. For example, I completely decouple the concepts of assignment, reference, and storage class (heap or stack) from one another, which makes clear that one may have references to stack-allocated assignables, or make use of heap-allocated assignables without having references to them. As another example, my treatment of concurrency, while grounded in the process calculus tradition, coheres with my treatment of assignables, but differs sharply from conventional accounts (and suffers none of their pathologies in the formulation of equivalence).

From the preface:

Types are the central organizing principle of the theory of programming languages. Language features are manifestations of type structure. The syntax of a language is governed by the constructs that define its types, and its semantics is determined by the interactions among those constructs. The soundness of a language design—the absence of ill-defined programs—follows naturally.

The purpose of this book is to explain this remark. A variety of programming language features are analyzed in the unifying framework of type theory. A language feature is defined by its statics, the rules governing the use of the feature in a program, and its dynamics, the rules defining how programs using this feature are to be executed. The concept of safety emerges as the coherence of the statics and the dynamics of a language.

In this way we establish a foundation for the study of programming languages. But why these particular methods? The main justification is provided by the book itself. The methods we use are both precise and intuitive, providing a uniform framework for explaining programming language concepts. Importantly, these methods scale to a wide range of programming language concepts, supporting rigorous analysis of their properties. Although it would require another book in itself to justify this assertion, these methods are also practical in that they are directly applicable to implementation and uniquely effective as a basis for mechanized reasoning. No other framework offers as much.
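
Not from the book, but a minimal sketch in Python of the statics/dynamics split described above, for a toy language of numbers, booleans, addition and conditionals (all names here are invented for illustration). The statics says which phrases are well formed, the dynamics says how they execute, and safety is the claim that anything the statics accepts the dynamics can run without getting stuck.

from dataclasses import dataclass
from typing import Union

@dataclass
class Num:
    val: int

@dataclass
class Bool:
    val: bool

@dataclass
class Plus:
    left: "Expr"
    right: "Expr"

@dataclass
class If:
    cond: "Expr"
    then: "Expr"
    els: "Expr"

Expr = Union[Num, Bool, Plus, If]

def statics(e):
    """Statics: the rules governing how each construct may be used (typing)."""
    if isinstance(e, Num):
        return "num"
    if isinstance(e, Bool):
        return "bool"
    if isinstance(e, Plus):
        if statics(e.left) == "num" and statics(e.right) == "num":
            return "num"
        raise TypeError("plus expects two numbers")
    if isinstance(e, If):
        if statics(e.cond) != "bool":
            raise TypeError("if expects a boolean condition")
        t_then, t_else = statics(e.then), statics(e.els)
        if t_then != t_else:
            raise TypeError("both branches of if must have the same type")
        return t_then
    raise TypeError("unknown expression")

def dynamics(e):
    """Dynamics: the rules defining how each construct is executed (evaluation)."""
    if isinstance(e, (Num, Bool)):
        return e.val
    if isinstance(e, Plus):
        return dynamics(e.left) + dynamics(e.right)
    if isinstance(e, If):
        return dynamics(e.then) if dynamics(e.cond) else dynamics(e.els)

# Safety as the coherence of statics and dynamics: any expression the
# statics accepts evaluates without getting stuck.
prog = If(Bool(True), Plus(Num(1), Num(2)), Num(0))
assert statics(prog) == "num"
print(dynamics(prog))  # prints 3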

Now that Robert has lunged across the author’s finish line, which one of us will incorporate his thinking into our own?

August 1, 2012

Balisage 2012 – Proceedings & Symposium

Filed under: Conferences,CS Lectures — Patrick Durusau @ 8:07 pm

The Balisage Proceedings and Symposium materials are online! (before the conference/symposium):

Balisage 2012

cover: http://www.balisage.net/Proceedings/vol8/cover.html
table of contents: http://www.balisage.net/Proceedings/vol8/contents.html

Symposium

cover: http://www.balisage.net/Proceedings/vol9/cover.html
table of contents: http://www.balisage.net/Proceedings/vol9/contents.html

As of tomorrow, you have 4 days (starts August 6th) to make the Symposium and 5 days (starts August 7th) to make Balisage.

Same day ticket purchase/travel is still possible but why risk it? Besides, I’m sure Greece can’t afford Interpol fees anymore. 😉

Your choices are:

Attend or,

Spend the rest of the year making up lame excuses for not being at Balisage in Montreal.

Choice is yours!

July 29, 2012

OSCON 2012

OSCON 2012

Over 4,000 photographs were taken at the MS booth.

I wonder how many of them include Doug?

Drop by the OSCON website after you count photos of Doug.

Your efforts at topic mapping will improve from the experience of the OSCON site visit. What you get from counting photos of Doug is unknown. 😉

March 29, 2012

Mathematics for Computer Science

Filed under: CS Lectures,Mathematics — Patrick Durusau @ 6:39 pm

Mathematics for Computer Science, by Eric Lehman, F Thomson Leighton, and Albert R Meyer.

Videos, slides, class problems, miniquizzes, and reading material, including the book by the same name. There are officially released parts of the book and a draft of the entire work. Has a nice section on graphs.

I saw the book mentioned in Christophe Lalanne’s Bag of Tweets for March 2012 and then backtracked to the class site.

March 18, 2012

Class Central

Filed under: CS Lectures — Patrick Durusau @ 8:53 pm

Class Central

From the webpage:

A complete list of free online courses offered by Stanford’s Coursera, MIT’s MITx, and Udacity

Well, except that the latest offering listed is from Caltech. 😉

Looks like a resource that is going to see a lot of traffic, as well as new content.

Learning from Data

Filed under: CS Lectures,Machine Learning — Patrick Durusau @ 8:53 pm

Learning from Data

Outline:

This is an introductory course on machine learning that covers the basic theory, algorithms and applications. Machine learning (ML) uses data to recreate the system that generated the data. ML techniques are widely applied in engineering, science, finance, and commerce to build systems for which we do not have full mathematical specification (and that covers a lot of systems). The course balances theory and practice, and covers the mathematical as well as the heuristic aspects. Detailed topics are listed below.

From the webpage:

Real Caltech course, not watered-down version
Broadcast live from the lecture hall at Caltech

And so, the competition of online course offerings begins. 😉

March 6, 2012

Stanford – Delayed Classes – Enroll Now!

If you have been waiting for notices about the delayed Stanford courses for Spring 2012, your wait is over!

Even if you signed up for more information, you must register at the course webpage to take the course.

Details as I have them on 6 March 2012 (check course pages for official information):

Cryptography Starts March 12th.

Design and Analysis of Algorithms Part 1 Starts March 12th.

Game Theory Starts March 19th.

Natural Language Processing Starts March 12th.

Probabilistic Graphical Models Starts March 19th.

You may be asking yourself, “Are all these courses useful for topic maps?”

I would answer by pointing out that librarians and indexers rely on a broad knowledge of the world to make information more accessible to users.

By way of contrast, “big data” and Google have made it less accessible.

Something to think about while you are registering for one or more of these courses!

February 19, 2012

EECS Course WEB Sites

Filed under: CS Lectures — Patrick Durusau @ 8:37 pm

EECS Course WEB Sites

Archives of EE and CS classes at Berkeley.

Some with more resources than others. But interesting none the less.

February 13, 2012

MITx Experimental Course Announced

Filed under: CS Lectures,Education,MIT — Patrick Durusau @ 8:18 pm

MITx Experimental Course Announced by Sue Gee.

A free online course in electronics, the “prototype” for future courses being offered in MIT’s online curriculum, MITx, is now open for enrollment and will begin in March.

The first MITx course, 6.002x – Circuits and Electronics begins on March 5 and runs through till June 8. It is being taught by Anant Agarwal, Director of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), with Gerald Sussman, professor of Electrical Engineering and CSAIL Research Scientist Piotr Mitros. An on-line adaptation of 6.002, MIT’s undergraduate analog design course, it is designed to serve as a first course in an undergraduate electrical engineering (EE), or electrical engineering and computer science (EECS) curriculum.

As important as the course content itself, this course will serve as the experimental prototype for MITx, the Massachusetts Institute of Technology’s new online learning initiative which offers classes free of charge to students worldwide.

I know topic maps are used in Norway’s educational system.

In what way would you use topic maps to enhance an online course such as this one?

One way to find out would be to take the course and explore the potential of topic maps to enrich the experience.

January 28, 2012

Citogenesis in science and the importance of real problems

Filed under: Algorithms,CS Lectures — Patrick Durusau @ 10:54 pm

Citogenesis in science and the importance of real problems

Daniel Lemire writes:

Many papers in Computer Science tell the following story:

  • There is a pre-existing problem P.
  • There are a few relatively simple but effective solutions to problem P. Among them is solution X.
  • We came up with a new solution X+ which is a clever variation on X. It looks good on paper.
  • We ran some experiments and tweaked our results until X+ looked good. We found a clever way to avoid comparing X+ and X directly and fairly, as it might then become obvious that the gains are small, or even negative! We would gladly report negative results, but then our paper could not be published.

It is a very convenient story for reviewers: the story is simple and easy to assess superficially. The problem is that sometimes, especially if the authors are famous and the idea is compelling, the results will spread. People will adopt X+ and cite it in their work. And the more they cite it, the more enticing it is to use X+ as every citation becomes further validation for X+. And why bother with algorithm X given that it is older and X+ is the state-of-the-art?

Occasionally, someone might try both X and X+, and they may report results showing that the gains due to X+ are small, or negative. But they have no incentive to make a big deal of it because they are trying to propose yet another better algorithm (X++).
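
Daniel’s complaint turns on avoiding a direct, fair comparison. A minimal sketch of what such a comparison looks like in practice, in Python, using timeit on identical inputs and the same machine, and reporting the gain even when it is small or negative (solution_x and solution_x_plus are placeholder names, not from the post):

import random
import timeit

def solution_x(data):
    """Baseline X: the simple, existing solution (placeholder: built-in sort)."""
    return sorted(data)

def solution_x_plus(data):
    """Variant X+: the clever new variant (placeholder: sort with a key)."""
    return sorted(data, key=lambda v: v)

def fair_comparison(n=100_000, repeats=5):
    random.seed(0)  # identical inputs for both algorithms
    data = [random.random() for _ in range(n)]
    t_x = min(timeit.repeat(lambda: solution_x(list(data)), number=1, repeat=repeats))
    t_xp = min(timeit.repeat(lambda: solution_x_plus(list(data)), number=1, repeat=repeats))
    gain = 100 * (t_x - t_xp) / t_x
    print(f"X: {t_x:.4f}s   X+: {t_xp:.4f}s   gain: {gain:+.1f}%")

if __name__ == "__main__":
    fair_comparison()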

But don’t we see the same thing in blogs? Where writers say “some,” “many,” “often,” etc., but none are claims that can be evaluated by others?

Make no mistake, given the rate of mis-citation that I find in published proceedings, I really want to agree with Daniel but I think the matter is more complex than simply saying that “engineers” work with “real” tests.

One of my pet peeves is the lack of history that I find in most CS papers. They may go back ten years but what about thirty or even forty years ago?

But as far as engineers, why is there so little code re-use if they are so interested in being efficient? Is re-writing code really more efficient or just NWH (Not Written Here)?

January 26, 2012

Sixth Annual Machine Learning Symposium

Filed under: CS Lectures,Machine Learning — Patrick Durusau @ 6:55 pm

Sixth Annual Machine Learning Symposium sponsored by the New York Academy of Sciences.

There were eighteen (18) presentations and any attempt to summarize on my part would do an injustice to one or more of them.

Post your comments and suggestions for which ones I should watch first. Thanks!

January 24, 2012

CS 101: Build a Search Engine

Filed under: CS Lectures,Search Engines — Patrick Durusau @ 3:41 pm

CS 101: Build a Search Engine

David Evans and Sebastian Thrun teach CS 101 by teaching students how to build a search engine.

There is an outline syllabus but not any more detail at this time.

January 2, 2012

Best Paper Awards in Computer Science [2011]

Filed under: Conferences,CS Lectures — Patrick Durusau @ 11:08 am

Best Paper Awards in Computer Science [2011]

Jeff Huang’s list of the best paper awards from 21 CS conferences, from 1996 through 2011.

Just in case you are unfamiliar with the conference abbreviations, I have expanded them below and added links to the sponsoring organizations’ websites.

December 29, 2011

Read-through of ‘Gödel, Escher, Bach’

Filed under: CS Lectures — Patrick Durusau @ 9:15 pm

Read-through of ‘Gödel, Escher, Bach’

A read through of ‘Gödel, Escher, Bach’ by Douglas R. Hofstadter, starting 17 January 2012.

Are there other off-line or electronic books you would suggest as “read through” candidates?

December 28, 2011

400 Free Online Courses from Top Universities

Filed under: CS Lectures,Mathematics,Statistics — Patrick Durusau @ 9:37 pm

400 Free Online Courses from Top Universities

Just in case hard core math/cs stuff isn’t your cup of tea or you want to write topic maps about some other area of study, this may be a resource for you.

Oddly enough (?), every listing of free courses seems to be different from other listings of free courses.

If you happen to run across seminar lectures (graduate school) on Ancient or Medieval philosophy, drop me a line. Or even better, on individual figures.

I first saw this linked on John Johnson’s Realizations in Biostatistics. John was pointing to the statistics/math courses but there is a wealth of other material as well.

December 6, 2011

Lecture Fox

Filed under: CS Lectures,Mathematics — Patrick Durusau @ 8:07 pm

Lecture Fox

A nice collection of links to university lectures.

Has separate pages on computer science and math, but also physics and chemistry. The homepage is a varied collection of those subjects and others.

Good to see someone collecting links for lectures beyond the usual ones.

Trivia from one of the CS lectures: What language was started by the U.S. DoD in the mid to late 1970’s to consolidate more than 500 existing languages and dialects?

Try to answer before peeking! The answer comes from Computer Science 164, Spring 2011, Berkeley. BTW, the materials for Computer Science 164 are worth a look in their own right.

Game Theory

Filed under: CS Lectures,Game Theory,Games — Patrick Durusau @ 8:04 pm

Game Theory by Matthew Jackson and Yoav Shoham.

Another Stanford course for the Spring of 2012!

From the description:

Popularized by movies such as “A Beautiful Mind”, game theory is the mathematical modeling of strategic interaction among rational (and irrational) agents. Beyond what we call ‘games’ in common language, such as chess, poker, soccer, etc., it includes the modeling of conflict among nations, political campaigns, competition among firms, and trading behavior in markets such as the NYSE. How could you begin to model eBay, Google keyword auctions, and peer to peer file-sharing networks, without accounting for the incentives of the people using them? The course will provide the basics: representing games and strategies, the extensive form (which computer scientists call game trees), Bayesian games (modeling things like auctions), repeated and stochastic games, and more. We’ll include a variety of examples including classic games and a few applications.
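
Not course material, but a minimal sketch of “representing games and strategies” in normal form: the Prisoner’s Dilemma as a payoff table in Python, plus a brute-force check for pure-strategy Nash equilibria, where no player can gain by switching unilaterally.

# A 2x2 normal-form game (Prisoner's Dilemma). Payoffs are (row player, column player).
ACTIONS = ["cooperate", "defect"]
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"):    (-3,  0),
    ("defect",    "cooperate"): ( 0, -3),
    ("defect",    "defect"):    (-2, -2),
}

def is_nash(row, col):
    """Neither player can improve their payoff by unilaterally switching actions."""
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(r, col)][0] for r in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, c)][1] for c in ACTIONS)
    return row_ok and col_ok

equilibria = [(r, c) for r in ACTIONS for c in ACTIONS if is_nash(r, c)]
print(equilibria)  # [('defect', 'defect')]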

Just in time for an election year so you will be able to model what you think is rational or irrational behavior on the part of voters in the U.S. 😉

The requirements:

You must be comfortable with mathematical thinking and rigorous arguments. Relatively little specific math is required; you should be familiar with basic probability theory (for example, you should know what a conditional probability is) and with basic calculus (for instance, taking a derivative).

For those of you not familiar with game theory, I think the course will be useful in teaching you a different way to view the world. Not necessarily more or less accurate than other ways, just different.

Being able to adopt a different world view and see its intersections with other world views is a primary skill in crossing domain borders for new insights or information. The more world views you learn, the better you may become at seeing intersections of world views.

November 30, 2011

Model Thinking

Filed under: CS Lectures,Modeling — Patrick Durusau @ 8:35 pm

Model Thinking by Scott E. Page.

Marijane sent this link in a comment to my post on Stanford classes.

From the class description:

We live in a complex world with diverse people, firms, and governments whose behaviors aggregate to produce novel, unexpected phenomena. We see political uprisings, market crashes, and a never ending array of social trends. How do we make sense of it?

Models. Evidence shows that people who think with models consistently outperform those who don’t. And, moreover, people who think with lots of models outperform people who use only one.

Why do models make us better thinkers?

Models help us to better organize information – to make sense of that fire hose or hairball of data (choose your metaphor) available on the Internet. Models improve our abilities to make accurate forecasts. They help us make better decisions and adopt more effective strategies. They even can improve our ability to design institutions and procedures.

In this class, I present a starter kit of models: I start with models of tipping points. I move on to cover models that explain the wisdom of crowds, models that show why some countries are rich and some are poor, and models that help unpack the strategic decisions of firms and politicians.

The models covered in this class provide a foundation for future social science classes, whether they be in economics, political science, business, or sociology. Mastering this material will give you a huge leg up in advanced courses. They also help you in life.

Here’s how the course will work.

For each model, I present a short, easily digestible overview lecture. Then, I’ll dig deeper. I’ll go into the technical details of the model. Those technical lectures won’t require calculus but be prepared for some algebra. For all the lectures, I’ll offer some questions and we’ll have quizzes and even a final exam. If you decide to do the deep dive, and take all the quizzes and the exam, you’ll receive a certificate of completion. If you just decide to follow along for the introductory lectures to gain some exposure that’s fine too. It’s all free. And it’s all here to help make you a better thinker!

Hope you can join the course this January.
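
Not from the course, but a minimal sketch of the first model family mentioned, tipping points, in the spirit of Granovetter’s threshold model: each person participates once enough others already do, and changing a single threshold can tip the outcome from everyone participating to almost no one.

def cascade(thresholds):
    """Threshold model: a person participates once the current number of
    participants meets their personal threshold. Returns the final count."""
    participating = 0
    while True:
        new_total = sum(1 for t in thresholds if t <= participating)
        if new_total == participating:
            return participating
        participating = new_total

# Thresholds 0..9: each new participant triggers the next, so everyone joins.
print(cascade(list(range(10))))                    # 10

# Replace the person with threshold 1 by threshold 2: the cascade stops at 1.
print(cascade([0, 2, 2, 3, 4, 5, 6, 7, 8, 9]))     # 1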

As Marijane says, “…awfully relevant to Topic Maps!”

November 25, 2011

Stanford Courses

Filed under: CS Lectures — Patrick Durusau @ 4:24 pm

Stanford Courses

Kirk Lowery forwarded this link, which lists all the current (and, one supposes, future) Stanford courses that you can take for free online.

Ones that are of particular interest to the practice of topic maps I will continue to call out separately.

November 22, 2011

MIT OpenCourseware / OCW Scholar

Filed under: CS Lectures — Patrick Durusau @ 6:57 pm

MIT OpenCourseware / OCW Scholar

For some unknown reason I haven’t included a mention of these resources on my blog. Perhaps I assumed “everyone” knew about them or it was just oversight on my part.

MIT OpenCourseware is described as:

MIT OpenCourseWare (OCW) is a web-based publication of virtually all MIT course content. OCW is open and available to the world and is a permanent MIT activity.

What is MIT OpenCourseWare?

MIT OpenCourseWare is a free publication of MIT course materials that reflects almost all the undergraduate and graduate subjects taught at MIT.

  • OCW is not an MIT education.
  • OCW does not grant degrees or certificates.
  • OCW does not provide access to MIT faculty.
  • Materials may not reflect entire content of the course.

I would add: “You don’t have classmates working on the same material for discussion, etc.” but even with all those limitations, this is an incredible resource. Self-study is always more difficult but this is one of the best study aids on the Web!

OCW Scholar is described as:

OCW Scholar courses are designed for independent learners who have few additional resources available to them. The courses are substantially more complete than typical OCW courses and include new custom-created content as well as materials repurposed from MIT classrooms. The materials are also arranged in logical sequences and include multimedia such as video and simulations.

Only five courses listed but the two math courses (single variable and multivariable calculus) are fundamental to further CS work. And the courses include study groups.

Highly recommended and worthy of your support!

November 21, 2011

Cryptography (class)

Filed under: Cryptography,CS Lectures — Patrick Durusau @ 7:37 pm

Cryptography with Dan Boneh. (Stanford)

Looks like competition to have an online class is heating up at Stanford. 😉

From the description:

Cryptography is an indispensable tool for protecting information in computer systems. This course explains the inner workings of cryptographic primitives and how to correctly use them. Students will learn how to reason about the security of cryptographic constructions and how to apply this knowledge to real-world applications. The course begins with a detailed discussion of how two parties who have a shared secret key can communicate securely when a powerful adversary eavesdrops and tampers with traffic. We will examine many deployed protocols and analyze mistakes in existing systems. The second half of the course discusses public-key techniques that let two or more parties generate a shared secret key. We will cover the relevant number theory and discuss public-key encryption, digital signatures, and authentication protocols. Towards the end of the course we will cover more advanced topics such as zero-knowledge, distributed protocols such as secure auctions, and a number of privacy mechanisms. Throughout the course students will be exposed to many exciting open problems in the field.

The course will include written homeworks and programming labs. The course is self-contained, however it will be helpful to have a basic understanding of discrete probability theory.
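
Not course material, but a minimal sketch of one slice of the first half described above: two parties who share a secret key can detect tampering with traffic by attaching a message authentication code, here HMAC-SHA256 from the Python standard library. A real protocol also needs encryption, nonces and key management; this is only an illustration.

import hmac
import hashlib
import secrets

# The shared secret key, known only to the two communicating parties.
shared_key = secrets.token_bytes(32)

def send(message: bytes):
    """Sender attaches a MAC computed with the shared key."""
    tag = hmac.new(shared_key, message, hashlib.sha256).digest()
    return message, tag

def receive(message: bytes, tag: bytes) -> bytes:
    """Receiver recomputes the MAC; any tampering changes the tag and is rejected."""
    expected = hmac.new(shared_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("message was tampered with (or key mismatch)")
    return message

msg, tag = send(b"transfer 10 euros to account 42")
print(receive(msg, tag))                 # accepted
tampered = msg.replace(b"10", b"10000")
try:
    receive(tampered, tag)
except ValueError as e:
    print(e)                             # rejected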

I mention this because topic mappers are going to face security issues and they had better be ready to at least discuss them. Even if the details are handed off to experts in security, including cryptography. Like law, security/cryptography aren’t good areas for self-help.

BTW, if this interests you, see Bruce Schneier’s homepage. Really nice collection of resources and other information on cryptography.

November 10, 2011

Machine Learning (Carnegie Mellon University)

Filed under: Computer Science,CS Lectures,Machine Learning — Patrick Durusau @ 6:33 pm

Machine Learning 10-701/15-781, Spring 2011 Carnegie Mellon University by Tom Mitchell.

Course Description:

Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, Support Vector Machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam’s Razor. Short programming assignments include hands-on experiments with various learning algorithms, and a larger course project gives students a chance to dig into an area of their choice. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.
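
Not from the course, but a minimal sketch of one listed topic, decision tree learning, using scikit-learn on a toy dataset. It assumes scikit-learn is installed; the dataset and parameters are arbitrary.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Fit a small decision tree on part of the iris data and score it on the rest.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))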

I don’t know how other disciplines are faring but for a variety of CS topics, there are enough excellent online materials to complete the equivalent of an undergraduate, if not a master’s, degree in CS.

October 29, 2011

We Really Don’t Know How To Compute!

Filed under: Algorithms,CS Lectures,Parallel Programming — Patrick Durusau @ 7:20 pm

We Really Don’t Know How To Compute! by Gerald Jay Sussman.

This is a must watch video! Sussman tries to make the case that we need to think differently about computing. For example, being able to backtrack the provenance of data in a series of operations. Or being able to maintain inconsistent world views while maintaining locally consistent world views in a single system. Or being able to say where world views diverge, without ever claiming either one to be correct/incorrect.

Argues in general for systems that will be robust enough for massively parallel programming. Where inconsistencies and the like are going to abound when applied to say all known medical literature. Isn’t going to be helpful if our systems fall over on their sides when they encounter inconsistency.
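
Not from the talk, but a minimal sketch of the first idea above, backtracking the provenance of data through a series of operations: each value carries the history of the operations that produced it. The names are invented for illustration.

from dataclasses import dataclass

@dataclass
class Traced:
    """A value bundled with the history of the operations that produced it."""
    value: float
    history: tuple = ()

def source(value, label):
    """Introduce a raw input value, recording where it came from."""
    return Traced(value, ((label, value),))

def add(a, b):
    total = a.value + b.value
    return Traced(total, a.history + b.history + (("add", total),))

def scale(a, k):
    result = a.value * k
    return Traced(result, a.history + ((f"scale by {k}", result),))

x = source(3.0, "sensor A")
y = source(4.0, "sensor B")
z = scale(add(x, y), 2.0)

print(z.value)                      # 14.0
for step, value in z.history:       # backtrack how 14.0 was produced
    print(step, "->", value)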

A lot of what Sussman says I think is largely applicable to parallel processing of topic maps. Certainly will be looking up some of his class videos from MIT.

From the webpage:

Summary

Gerald Jay Sussman compares our computational skills with the genome, concluding that we are way behind in creating complex systems such as living organisms, and proposing a few areas of improvement.

Bio

Gerald Jay Sussman is the Panasonic Professor of EE at MIT. Sussman is a coauthor (with Hal Abelson and Julie Sussman) of the MIT computer science textbook “Structure and Interpretation of Computer Programs”. Sussman has made a number of important contributions to Artificial Intelligence, and with his former student, Guy L. Steele Jr., invented the Scheme programming language in 1975.

About the conference

Strange Loop is a multi-disciplinary conference that aims to bring together the developers and thinkers building tomorrow’s technology in fields such as emerging languages, alternative databases, concurrency, distributed systems, mobile development, and the web.

October 23, 2011

Notation as a Tool of Thought – Iverson – Turing Lecture

Filed under: CS Lectures,Language,Language Design — Patrick Durusau @ 7:22 pm

Notation as a Tool of Thought by Kenneth E. Iverson – 1979 Turing Award Lecture

I saw this lecture tweeted with a link to a poor photocopy of a double column printing of the lecture.

I think you will find the single column version from the ACM awards site much easier to read.

Not to mention that the ACM awards site has all the Turing as well as other award lectures for viewing.

I suspect that a CS class could be taught using only ACM award lectures as the primary material. Perhaps someone already has; I would appreciate a pointer if so.

October 21, 2011

Why are programmers smarter than runners?

Filed under: CS Lectures,Programming — Patrick Durusau @ 7:26 pm

Simple Made Easy by Rich Hickey.

A very impressive presentation that answers the burning question:

Why are programmers smarter than runners?

Summary:

Rich Hickey emphasizes simplicity’s virtues over easiness’, showing that while many choose easiness they may end up with complexity, and the better way is to choose easiness along the simplicity path.

This talk will give you the tools and rhetoric to argue for simplicity in programming. I was particularly impressed by the guardrail argument with regard to testing suites. Watch the video and you will see what I mean.

October 14, 2011

Hierarchical Temporal Memory

Filed under: CS Lectures,Hierarchical Temporal Memory (HTM),Machine Learning — Patrick Durusau @ 6:23 pm

Hierarchical Temporal Memory: How a Theory of the Neocortex May Lead to Truly Intelligent Machines by Jeff Hawkins.

Don’t skip because of the title!

Hawkins covers his theory of the neocortex, but however you feel about that, two-thirds of the presentation is on algorithms, completely new material.

Very cool presentation on “Fixed Sparsity Distributed Representation” and lots of neural science stuff. Need to listen to it again and then read the books/papers.

What I liked about it was the notion that, even in very noisy or missing-data contexts, highly reliable identifications can be made.

True enough, Hawkins was talking about vision, etc., but he didn’t bring up any reasons why that could not work in other data environments.

In other words, when can a program treat extra data about a subject as noise and recognize it anyway?

Or if some information about a subject is missing, can a program still reliably recognize it?

Or if we only want to store some of the information, can we still get reliable recognition?

Don’t know if any, some or all of those are possible but it is certainly worth finding out.
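
Not from the talk, but a minimal sketch of why the sparse-representation idea tolerates noisy or missing data: store each subject as a small set of active bits in a large binary vector and recognize by overlap, so a pattern is still matched when many of its bits are missing. The sizes below are arbitrary.

import random

random.seed(1)
N, ACTIVE = 2048, 40        # 2048-bit representation, about 2% of bits active

def random_sdr():
    """A sparse distributed representation: a small set of active bit positions."""
    return frozenset(random.sample(range(N), ACTIVE))

def best_match(query, stored):
    """Recognize by overlap: the stored pattern sharing the most active bits."""
    return max(stored, key=lambda name: len(query & stored[name]))

stored = {f"subject-{i}": random_sdr() for i in range(100)}

# Degrade one pattern: drop 15 of its 40 active bits ("missing information").
original = stored["subject-7"]
degraded = frozenset(random.sample(sorted(original), ACTIVE - 15))

print(best_match(degraded, stored))   # still 'subject-7'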

Description:

Jeff Hawkins (Numenta founder) presents as part of the UBC Department of Computer Science’s Distinguished Lecture Series, March 18, 2010.

Coaxing computers to perform basic acts of perception and robotics, let alone high-level thought, has been difficult. No existing computer can recognize pictures, understand language, or navigate through a cluttered room with anywhere near the facility of a child. Hawkins and his colleagues have developed a model of how the neocortex performs these and other tasks. The theory, called Hierarchical Temporal Memory, explains how the hierarchical structure of the neocortex builds a model of its world and uses this model for inference and prediction. To turn this theory into a useful technology, Hawkins has created a company called Numenta. In this talk Hawkins will describe the theory, its biological basis, and progress in applying Hierarchical Temporal Memory to machine learning problems.

Part of this theory was described in Hawkins’ 2004 book, On Intelligence. Further information can be found at www.Numenta.com

October 1, 2011

Bayesian Statistical Reasoning

Filed under: Bayesian Models,CS Lectures,Mathematics — Patrick Durusau @ 8:29 pm

DM SIG “Bayesian Statistical Reasoning” 5/23/2011 by Prof. David Draper, PhD.

I think you will be surprised at how interesting and even compelling this presentation becomes at points. Particularly his comments early in the presentation about needing an analogy machine, to find things not expressed in the way you usually look for them. And he has concrete examples of where that has been needed.

Title: Bayesian Statistical Reasoning: an inferential, predictive and decision-making paradigm for the 21st century

Professor Draper gives examples of Bayesian inference, prediction and decision-making in the context of several case studies from medicine and health policy. There will be points of potential technical interest for applied mathematicians, statisticians, and computer scientists. Broadly speaking, statistics is the study of uncertainty: how to measure it well, and how to make good choices in the face of it. Statistical activities are of four main types: description of a data set, inference about the underlying process generating the data, prediction of future data, and decision-making under uncertainty. The last three of these activities are probability based. Two main probability paradigms are in current use: the frequentist (or relative-frequency) approach, in which you restrict attention to phenomena that are inherently repeatable under “identical” conditions and define P(A) to be the limiting relative frequency with which A would occur in hypothetical repetitions, as n goes to infinity; and the Bayesian approach, in which the arguments A and B of the probability operator P(A|B) are true-false propositions (with the truth status of A unknown to you and B assumed by you to be true), and P(A|B) represents the weight of evidence in favor of the truth of A, given the information in B.

The Bayesian approach includes the frequentist paradigm as a special case, so you might think it would be the only version of probability used in statistical work today, but (a) in quantifying your uncertainty about something unknown to you, the Bayesian paradigm requires you to bring all relevant information to bear on the calculation; this involves combining information both internal and external to the data you’ve gathered, and (somewhat strangely) the external-information part of this approach was controversial in the 20th century, and (b) Bayesian calculations require approximating high-dimensional integrals (whereas the frequentist approach mainly relies on maximization rather than integration), and this was a severe limitation to the Bayesian paradigm for a long time (from the 1750s to the 1980s). The external-information problem has been solved by developing methods that separately handle the two main cases: (1) substantial external information, which is addressed by elicitation techniques, and (2) relatively little external information, which is covered by any of several methods for (in the jargon) specifying diffuse prior distributions. Good Bayesian work also involves sensitivity analysis: varying the manner in which you quantify the internal and external information across reasonable alternatives, and examining the stability of your conclusions.

Around 1990 two things happened roughly simultaneously that completely changed the Bayesian computational picture:

  • Bayesian statisticians belatedly discovered that applied mathematicians (led by Metropolis), working at the intersection between chemistry and physics in the 1940s, had used Markov chains to develop a clever algorithm for approximating integrals arising in thermodynamics that are similar to the kinds of integrals that come up in Bayesian statistics, and
  • desk-top computers finally became fast enough to implement the Metropolis algorithm in a feasibly short amount of time.

As a result of these developments, the Bayesian computational problem has been solved in a wide range of interesting application areas with small-to-moderate amounts of data; with large data sets, variational methods are available that offer a different approach to useful approximate solutions. The Bayesian paradigm for uncertainty quantification does appear to have one remaining weakness, which coincides with a strength of the frequentist paradigm: nothing in the Bayesian approach to inference and prediction requires you to pay attention to how often you get the right answer (this is a form of calibration of your uncertainty assessments), which is an activity that’s (i) central to good science and decision-making and (ii) natural to emphasize from the frequentist point of view. However, it has recently been shown that calibration can readily be brought into the Bayesian story by means of decision theory, turning the Bayesian paradigm into an approach that is (in principle) both logically internally consistent and well-calibrated. In this talk I’ll (a) offer some historical notes about how we have arrived at the present situation and (b) give examples of Bayesian inference, prediction and decision-making in the context of several case studies from medicine and health policy. There will be points of potential technical interest for applied mathematicians, statisticians and computer scientists.
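
Not from the talk, but a minimal sketch of the Metropolis algorithm the abstract refers to: a random-walk proposal plus an accept/reject step driven by the unnormalized posterior, so the long-run samples approximate the posterior without computing its normalizing integral. The toy model (normal prior, five normal observations) is invented for illustration.

import math
import random

def log_post(theta):
    """Unnormalized log-posterior: N(0,1) prior and observations ~ N(theta, 1)."""
    data = [1.2, 0.7, 1.9, 0.3, 1.1]
    log_prior = -0.5 * theta**2
    log_lik = sum(-0.5 * (x - theta)**2 for x in data)
    return log_prior + log_lik

def metropolis(n_samples=20000, step=0.5):
    random.seed(0)
    theta = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = theta + random.gauss(0, step)   # random-walk proposal
        # Accept with probability min(1, post(proposal) / post(theta)).
        if math.log(random.random()) < log_post(proposal) - log_post(theta):
            theta = proposal
        samples.append(theta)
    return samples

burned = metropolis()[2000:]                       # discard burn-in
print("posterior mean ~", sum(burned) / len(burned))   # close to 5.2 / 6 ≈ 0.87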

September 28, 2011

Practical Foundations for Programming Languages

Filed under: CS Lectures,Programming,Types — Patrick Durusau @ 7:33 pm

Practical Foundations for Programming Languages (pdf) by Robert Harper, Carnegie Mellon University.

From Chapter 1, page 3:

Programming languages are languages, a means of expressing computations in a form comprehensible to both people and machines. The syntax of a language specifies the means by which various sorts of phrases (expressions, commands, declarations, and so forth) may be combined to form programs. But what sort of thing are these phrases? What is a program made of?

The informal concept of syntax may be seen to involve several distinct concepts. The surface, or concrete, syntax is concerned with how phrases are entered and displayed on a computer. The surface syntax is usually thought of as given by strings of characters from some alphabet (say, ASCII or Unicode). The structural, or abstract, syntax is concerned with the structure of phrases, specifically how they are composed from other phrases. At this level a phrase is a tree, called an abstract syntax tree, whose nodes are operators that combine several phrases to form another phrase. The binding structure of syntax is concerned with the introduction and use of identifiers: how they are declared, and how declared identifiers are to be used. At this level phrases are abstract binding trees, which enrich abstract syntax trees with the concepts of binding and scope.

In this chapter we prepare the ground for all of our later work by defining precisely what are strings, abstract syntax trees, and abstract binding trees. The definitions are a bit technical, but are fundamentally quite simple and intuitive. It is probably best to skim this chapter on first reading, returning to it only as the need arises.
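
Not from the book, but a minimal sketch of the surface/abstract distinction just quoted: the same phrase as a string of characters and as an abstract syntax tree whose nodes are operators combining subphrases. (Abstract binding trees, which add identifiers and scope, are not shown.)

from dataclasses import dataclass
from typing import Union

# Abstract syntax: trees whose nodes are operators combining subphrases.
@dataclass
class Num:
    value: int

@dataclass
class Plus:
    left: "Expr"
    right: "Expr"

@dataclass
class Times:
    left: "Expr"
    right: "Expr"

Expr = Union[Num, Plus, Times]

def concrete(e):
    """Render an abstract syntax tree back into concrete (surface) syntax."""
    if isinstance(e, Num):
        return str(e.value)
    op = "+" if isinstance(e, Plus) else "*"
    return f"({concrete(e.left)} {op} {concrete(e.right)})"

# The surface string "1 + 2 * 3" corresponds (with the usual precedence) to this tree:
tree = Plus(Num(1), Times(Num(2), Num(3)))
print(concrete(tree))   # (1 + (2 * 3))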

I am always amused when authors counsel readers to “skim” an early chapter and to return to it when in need. That works for the author, who already knows the material in the first chapter cold, but works less well in my experience as a reader. How will I be aware that some future need could be satisfied by re-reading the first chapter? The first chapter is only nine (9) pages out of five hundred and seventy (570), so my suggestion would be to get the first chapter out of the way with a close reading.

From the preface:

This is a working draft of a book on the foundations of programming languages. The central organizing principle of the book is that programming language features may be seen as manifestations of an underlying type structure that governs its syntax and semantics. The emphasis, therefore, is on the concept of type, which codifies and organizes the computational universe in much the same way that the concept of set may be seen as an organizing principle for the mathematical universe. The purpose of this book is to explain this remark.

I think it is the view that “the concept of type…codifies and organizes the computational universe” that I find attractive. That being the case, we are free to construct computational universes that best fit our purposes, as opposed to fitting our purposes to particular computational universes.


Update: August 6, 2012 – First edition completed, see: There and Back Again

September 23, 2011

Oresoft Live Web Class

Filed under: CS Lectures,Data Mining — Patrick Durusau @ 6:11 pm

Oresoft Live Web Class YouTube Channel

I ran across this YouTube channel on a data mining alert I get from a search service. The data mining course looks like one of the more complete ones.

It stems from the Oresoft Academy, which conducts live virtual classes. If you have an interest in teaching, see the FAQ for what is required to contribute to this effort.

The Oresoft playlist offers (as of 22 September 2011):

  • Algorithms (101 sessions)
  • Compiler Design (42 sessions)
  • Computer Graphics (7 sessions)
  • Finite Automata (5 sessions)
  • Graph Theory (9 sessions)
  • Heap Sort (13 sessions)
  • Java Tutorials (16 sessions)
  • Non-Deterministic Finite Automata (14 sessions)
  • Oracle PL/SQL (27 sessions)
  • Oracle Server Concept (48 sessions, numbered to 49 due to a numbering error)
  • Oracle SQL (17 sessions)
  • Pumping Lemma (6 sessions)
  • Regular Expression (14 sessions)
  • Turing Machines (10 sessions)
  • Web Data Mining (127 sessions)
