## Archive for the ‘Mathematics’ Category

### Properties of Morphisms

Friday, May 17th, 2013

Properties of Morphisms by Jeremy Kun.

From the post:

This post is mainly mathematical. We left it out of our introduction to categories for brevity, but we should lay these definitions down and some examples before continuing on to universal properties and doing more computation. The reader should feel free to skip this post and return to it later when the words “isomorphism,” “monomorphism,” and “epimorphism” come up again. Perhaps the most important part of this post is the description of an isomorphism.

Isomorphisms, Monomorphisms, and Epimorphisms

Perhaps the most important paradigm shift in category theory is the focus on morphisms as the main object of study. In particular, category theory stipulates that the only knowledge one can gain about an object is in how it relates to other objects. Indeed, this is true in nearly all fields of mathematics: in groups we consider all isomorphic groups to be the same. In topology, homeomorphic spaces are not distinguished. The list goes on. The only way to determine if two objects are “the same” is by finding a morphism with special properties. Barry Mazur gives a more colorful explanation by considering the meaning of the number 5 in his essay, “When is one thing equal to some other thing?” The point is that categories, more than existing to be a “foundation” for all mathematics as a formal system (though people are working to make such a formal system), exist primarily to “capture the essence” of mathematical discourse, as Mazur puts it. A category defines objects and morphisms, but literally all of the structure of a category lies in its morphisms. And so we study them.

If you are looking for something challenging to read over the weekend, Jeremy’s latest post on morphisms should fit the bill.

The question of “equality” is easy enough to answer as lexical equivalence in UTF-8, but what if you need something more?

Bear in mind that “lexical equivalence in UTF-8” is a highly unreliable form of “equality.”
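A quick sketch of why byte-level comparison misleads, using Python's standard `unicodedata` module: two strings that render identically can still compare unequal, because Unicode allows more than one code-point sequence for the same character.

```python
import unicodedata

a = "caf\u00e9"   # "café" with é as a single precomposed code point
b = "cafe\u0301"  # "café" with e followed by a combining acute accent

print(a == b)  # False: the code-point sequences differ
print(a.encode("utf-8"))  # b'caf\xc3\xa9'
print(b.encode("utf-8"))  # b'cafe\xcc\x81'

# Normalizing both strings to the same form (here NFC) recovers equality:
print(unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b))  # True
```

Even normalization only handles canonical equivalence; deciding whether two strings name the same *subject* is a harder question still.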

Jeremy is easing us down the road where such discussions can happen with a great deal of rigor.

### Harvard Stat 221

Friday, May 10th, 2013

Harvard Stat 221 “Statistical Computing and Visualization” by Sergiy Nesterko.

From the post:

Stat 221 is Statistical Computing and Visualization. It’s a graduate class on analyzing data without losing scientific rigor, and communicating your work. Topics span the full cycle of a data-driven project including project setup, design, implementation, and creating interactive user experiences to communicate ideas and results. We covered current theory and philosophy of building models for data, computational methods, and tools such as d3js, parallel computing with MPI, and R.

See Sergiy’s post for the lecture slides from this course.

### Big Data and the Topologist

Monday, May 6th, 2013

Big Data and the Topologist by Jesse Johnson.

From the post:

As I mentioned in my previous post, I plan to write a series of posts about the study of large data sets, both the ways that high dimensional data has traditionally been studied and the topology that has recently been applied to this area. For anyone who has experience thinking about abstract geometric objects (as I assume most of the readers of this blog do) the concepts should seem pretty straightforward, and the difficulty is mostly in translation. So I will start with a post that focusses on defining terms. (Update: I’ve started a second blog The Shape of Data to look into these topics in more detail.)

The study of large data sets in the abstract generally goes by two names: Data mining is the field that grew out of statistics, which considers ways to organize and summarize high dimensional data so that it can be understood by humans. Machine Learning is the subfield of computer science (particularly artificial intelligence) that looks for ways to have computers organize and summarize data, with the goal of having the computer make decisions. These two fields have a lot in common and I will not try to distinguish between them. There are also names for the application of these methods in different sciences, such as Bioinformatics and Cheminformatics. There are also rather notorious applications in marketing, which allow stores to know what you’re going to buy before you know.

We are given a collection of data, usually a set of ordered n-tuples, that come from a science experiment or surveys, or the data that retailers collect about you every time you use your credit card, etc. Some of the entries can be thought of as labels – the code number for the particular experiment, for example. The remaining coordinates/dimensions are often called features. If these features are numerical, then we can think of them as defining vectors in a Euclidean space, and this gives us our first glimpse of geometry. However, for high dimensional data, the Euclidean metric turns out to be problematic, so we will often want to use a different metric. The Euclidean metric is also problematic for binary features such as the presence of different genes in an organism.
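A small sketch of that last point, with hypothetical binary gene-presence vectors: the Euclidean metric treats the 0/1 entries as coordinates, while a set-based metric such as Jaccard distance measures overlap of the genes actually present and ignores shared absences.

```python
import math

# Two organisms described by binary gene-presence features (made-up data).
x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 1, 1, 0, 0, 0, 1, 1]

def euclidean(u, v):
    # Standard Euclidean distance, treating each feature as a coordinate.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def jaccard_distance(u, v):
    # 1 minus the fraction of genes present in both out of genes present in either.
    both = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(u, v) if a == 1 or b == 1)
    return 1 - both / either

print(euclidean(x, y))         # sqrt(3): just the square root of the Hamming distance
print(jaccard_distance(x, y))  # 0.5: half the genes present in either are shared
```

Which metric is appropriate depends on what the features mean, which is exactly why the choice of metric becomes a central question for high dimensional data.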

This is a nice way to start the week!

### Categories as Types

Monday, May 6th, 2013

Categories as Types by Jeremy Kun.

From the post:

In this post we’ll get a quick look at two ways to define a category as a type in ML. The first way will be completely trivial: we’ll just write it as a tuple of functions. The second will involve the terribly-named “functor” expression in ML, which allows one to give a bit more structure on data types.

Jeremy gets us closer to application of category theory to programming with an introduction to types.
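The post defines its categories in ML, but the first, “completely trivial” idea — a category packaged as a tuple of functions — can be sketched by analogy in Python (a loose illustration, not the post's code):

```python
# A category presented as plain data: composition and identity functions,
# with Python's unary functions standing in for morphisms.

def identity(x):
    # The identity morphism on any object.
    return x

def compose(g, f):
    # Composition g ∘ f: apply f first, then g.
    return lambda x: g(f(x))

# Two concrete morphisms in the "category of sets":
def double(n):
    return n * 2

def to_str(n):
    return str(n)

h = compose(to_str, double)
print(h(21))  # "42"
```

ML's module system (the “terribly-named ‘functor’ expression” the post mentions) enforces this structure at the type level, which a bare Python tuple of functions cannot do.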

### Mathbabe, the book

Saturday, May 4th, 2013

Mathbabe, the book by Cathy O’Neil.

From the post:

Thanks to a certain friendly neighborhood mathbabe reader, I’ve created this mathbabe book, which is essentially all of my posts that I ever wrote (I think. Not sure about that.) bundled together mostly by date and stuck in a huge pdf. It comes to 1,243 pages.

I did it using leanpub.com, which charges \$0.99 per person who downloads the pdf. I’m not charging anything over that, because the way I look at it, it’s already free.

Speaking of that, I can see why I’d want a copy of this stuff, since it’s the best way I can think of to have a local version of a bunch of writing I’ve done over the past couple of years, but I don’t actually see why anyone else would. So please don’t think I’m expecting you to go buy this book! Even so, more than one reader has requested this, so here it is.

And one strange thing: I don’t think it required my password on WordPress.com to do it, I just needed the url for the RSS feed. So if you want to avoid paying 99 cents, I’m pretty sure you can go to leanpub or one of its competitors and create another, identical book using that same feed.

And for that matter you can also go build your own book about anything using these tools, which is pretty cool when you think about it. Readers, please tell me if there’s a way to do this that’s open source and free.

The Mathbabe “book” would be one that I would be interested in reading. I can think of several other blogs that fall into that category.

I hesitate to use the term “book” for such a collection.

Maybe I am confusing “monograph,” which is focused on a topic, with “book,” which applies to works beyond a certain length.

I think of my postings, once you remove the dated notice materials, as potential essays or chapters in a book.

But they would need fleshing out and polishing to qualify for more formal publication.

### Persistent Homology Talk at UIC: Slides

Friday, May 3rd, 2013

Persistent Homology Talk at UIC: Slides by Jeremy Kun.

From the post:

Today I gave a twenty-minute talk at UI Chicago as part of the first annual Chicago Area Student SIAM Conference. My talk was titled “Recent Developments in Persistent Homology,” and it foreshadows the theoretical foundations and computational implementations we’ll be laying out on this blog in the coming months. Here’s the abstract:

Persistent homology is a recently developed technique for analyzing the topology of data sets. We will give a rough overview of the technique and sample successful applications to areas such as natural image analysis & texture classification, breast and liver cancer classification, molecular dynamical systems, and others.

The talk was received very well — mostly, I believe, because I waved my hands on the theoretical aspects and spent most of my time talking about the applications.

The slides.

Jeremy doesn’t hold out much hope the slides will be useful sans the lecture surrounding them.

He does include references, so see what you think of the slides plus references.

### Introducing Categories

Friday, April 26th, 2013

Introducing Categories by Jeremy Kun.

From the post:

It is time for us to formally define what a category is, and to see a wealth of examples. In our next post we’ll see how the definitions laid out here translate to programming constructs. As we’ve said in our soft motivational post on categories, the point of category theory is to organize mathematical structures across various disciplines into a unified language. As such, most of this post will be devoted to laying down the definition of a category and the associated notation. We will be as clear as possible to avoid a notational barrier for newcomers, so if anything is unclear we will clarify it in the comments.

Definition of a Category

Let’s recall some examples of categories we’ve seen on this blog that serve to motivate the abstract definition of a category. We expect the reader to be comfortable with sets, and to absorb or glaze over the other examples as comfort dictates. The reader who is uncomfortable with sets and functions on sets should stop here. Instead, visit our primers on proof techniques, which doubles as a primer on set theory (or our terser primer on set theory from two years ago).

The go-to example of a category is that of sets: sets together with functions between sets form a category. We will state exactly what this means momentarily, but first some examples of categories of “sets with structure” and “structure-preserving maps.”

Not easy but not as difficult as some introductions to category theory.

Jeremy promises that the very next post jumps into code to show the “definition of a category as a type in ML.”

Plus some pros and cons of proceeding this way.

### Categories, What’s the Point?

Wednesday, April 17th, 2013

Categories, What’s the Point? by Jeremy Kun.

From the post:

Perhaps primarily due to the prominence of monads in the Haskell programming language, programmers are often curious about category theory. Proponents of Haskell and other functional languages can put category-theoretic concepts on a pedestal or in a Mexican restaurant, and their benefits can seem as mysterious as they are magical. For instance, the most common use of a monad in Haskell is to simulate the mutation of immutable data. Others include suspending and backtracking computations, and even untying tangled rope.

Category theory is often mocked (or praised) as the be-all and end-all of mathematical abstraction, and as such (and for other reasons I’ve explored on this blog) many have found it difficult to digest and impossible to master. However, in truth category theory arose from a need for organizing mathematical ideas based on their shared structure. In this post, I want to give a brief overview of what purpose category theory serves within mathematics, and emphasize what direction we’ll be going with it on this blog.

We should also note (writing this after the fact), that this article is meant to be a motivation to our future series on category theory. It is very difficult to explain what category theory is without going into very specific details, but we can explain by analogy what category theory achieves for us. That is the goal of this post.

I rather like the line:

However, in truth category theory arose from a need for organizing mathematical ideas based on their shared structure.

Which is one of the reasons why I keep trying to master category theory, for the ability to organize subject identifications based on their shared structures.

Not necessary for all subject identity cases but if topic maps extend into subjects like math, physics, etc., then it may be quite useful.

### Probabilistic Bounds — A Primer

Tuesday, April 16th, 2013

Probabilistic Bounds — A Primer by Jeremy Kun.

From the post:

Probabilistic arguments are a key tool for the analysis of algorithms in machine learning theory and probability theory. They also assume a prominent role in the analysis of randomized and streaming algorithms, where one imposes a restriction on the amount of storage space an algorithm is allowed to use for its computations (usually sublinear in the size of the input).

While a whole host of probabilistic arguments are used, one theorem in particular (or family of theorems) is ubiquitous: the Chernoff bound. In its simplest form, the Chernoff bound gives an exponential bound on the deviation of sums of random variables from their expected value.

This is perhaps most important to algorithm analysis in the following mindset. Say we have a program whose output is a random variable $X$. Moreover suppose that the expected value of $X$ is the correct output of the algorithm. Then we can run the algorithm multiple times and take a median (or some sort of average) across all runs. The probability that the algorithm gives a wildly incorrect answer is the probability that more than half of the runs give values which are wildly far from their expected value. Chernoff’s bound ensures this will happen with small probability.

So this post is dedicated to presenting the main versions of the Chernoff bound that are used in learning theory and randomized algorithms. Unfortunately the proof of the Chernoff bound in its full glory is beyond the scope of this blog. However, we will give short proofs of weaker, simpler bounds as a straightforward application of this blog’s previous work laying down the theory.

If the reader has not yet intuited it, this post will rely heavily on the mathematical formalisms of probability theory. We will assume our reader is familiar with the material from our first probability theory primer, and it certainly wouldn’t hurt to have read our conditional probability theory primer, though we won’t use conditional probability directly. We will refrain from using measure-theoretic probability theory entirely (some day my colleagues in analysis will like me, but not today).
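The median trick the post describes can be sketched in a few lines (a toy simulation with made-up numbers, not code from the post): a randomized “algorithm” that is correct in expectation but noisy run-to-run becomes reliable once we take the median over many runs, and the Chernoff bound is what quantifies how quickly.

```python
import random
import statistics

random.seed(0)  # fixed seed for reproducibility

TRUE_VALUE = 10.0

def noisy_algorithm():
    # Correct in expectation, but an individual run can be far off.
    return TRUE_VALUE + random.gauss(0, 5)

single_run_error = abs(noisy_algorithm() - TRUE_VALUE)

# Run many times and take the median; a wildly wrong median requires more
# than half the runs to be wildly wrong, which Chernoff makes exponentially unlikely.
runs = [noisy_algorithm() for _ in range(101)]
estimate = statistics.median(runs)

print(single_run_error)              # can easily be several units
print(abs(estimate - TRUE_VALUE))    # typically a small fraction of that
```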

Another heavy sledding post from Jeremy but if you persist, you will gain a deeper understanding of the algorithms of machine learning theory.

If that sounds esoteric, consider that it will help you question results produced by algorithms of machine learning.

Do you really want to take a machine’s “word” for something important?

Or do you want the chance to know why an answer is correct, questionable or incorrect?

### Visualization and Projection

Tuesday, April 16th, 2013

Visualization and Projection by Jesse Johnson.

From the post:

One of the common themes that I’ve emphasized so far on this blog is that we should try to analyze high dimensional data sets without being able to actually “see” them. However, it is often useful to visualize the data in some way, and having just introduced principal component analysis, this is probably a good time to start the discussion. There are a number of types of visualization that involve representing statistics about the data in different ways, but today we’ll look at ways of representing the actual data points in two or three dimensions.

In particular, what we want to do is to draw the individual data points of a high dimensional data set in a lower dimensional space so that the hard work can be done by the best pattern recognition machine there is: your brain. When you look at a two- or three-dimensional data set, you will naturally recognize patterns based on how close different points are to each other. Therefore, we want to represent the points so that the distances between them change as little as possible. In general, this is called projection, the term coming from the idea that we will do the same thing to the data as you do when you make shadow puppets: We project a high dimensional object (such as your three-dimensional hands) onto a lower dimensional object (such as the two-dimensional wall).

We’ve already used linear projection implicitly when thinking about higher dimensional spaces. For example, I suggested that you think about the three-dimensional space that we live in as being a piece of paper on the desk of someone living in a higher dimensional space. In the post about the curse of dimensionality, we looked at the three-dimensional cube from the side and saw that it was two-dimensional, then noted that a four-dimensional cube would look like a three-dimensional cube if we could look at it “from the side”.

When we look at an object from the side like this, we are essentially ignoring one of the dimensions. This is the simplest form of projection, and in general we can choose to ignore more than one dimension at a time. For example, if you have data points in five dimensions and you want to plot them in two dimensions, you could just pick the two dimensions that you thought were most important and plot the points based on those. It’s hard to picture how that works because you still have to think about the original five dimensional data. But this is similar to the picture if we were to take a two-dimensional data set and throw away one of the dimensions as in the left and middle pictures in the Figure below. You can see the shadow puppet analogy too: In the figure on the left, the light is to the right of the data, while in the middle, it’s above.

I hesitated over:

…then noted that a four-dimensional cube would look like a three-dimensional cube if we could look at it “from the side”

I barely remember Martin Gardner’s column on visualizing higher dimensions and its illustrations of projecting a four-dimensional box into three dimensions.

But the original post is describing a four-dimensional cube in a four-dimensional space being viewed “on its side” by a being that exists in four-dimensional space.

That works.

How would you choose which dimensions to project for a human observer to judge?
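As a toy version of that question, here is the “ignore some dimensions” projection from the post, with hypothetical 5-D points and a hand-picked pair of dimensions (the data and the choice of dimensions are both invented for illustration):

```python
# Hypothetical 5-D data points.
points_5d = [
    (1.0, 2.0, 0.5, 3.0, 1.5),
    (0.8, 1.9, 2.5, 2.7, 0.1),
    (4.2, 0.3, 1.1, 0.9, 2.2),
]

def project(point, dims):
    # Keep only the chosen dimensions, discarding the rest --
    # the "looking from the side" projection described in the post.
    return tuple(point[d] for d in dims)

# Suppose we judge dimensions 0 and 3 the most important:
points_2d = [project(p, (0, 3)) for p in points_5d]
print(points_2d)  # [(1.0, 3.0), (0.8, 2.7), (4.2, 0.9)]
```

Principal component analysis answers the same question differently: instead of picking existing dimensions, it picks new linear combinations of them that preserve as much of the variance as possible.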

### Homology Theory — A Primer

Tuesday, April 9th, 2013

Homology Theory — A Primer by Jeremy Kun.

From the post:

This series on topology has been long and hard, but we are quickly approaching the topics where we can actually write programs. For this and the next post on homology, the most important background we will need is a solid foundation in linear algebra, specifically in row-reducing matrices (and the interpretation of row-reduction as a change of basis of a linear operator).

Last time we engaged in a whirlwind tour of the fundamental group and homotopy theory. And we mean “whirlwind” as it sounds; it was all over the place in terms of organization. The most important fact that one should take away from that discussion is the idea that we can compute, algebraically, some qualitative features about a topological space related to “n-dimensional holes.” For one-dimensional things, a hole would look like a circle, and for two dimensional things, it would look like a hollow sphere, etc. More importantly, we saw that this algebraic data, which we called the fundamental group, is a topological invariant. That is, if two topological spaces have different fundamental groups, then they are “fundamentally” different under the topological lens (they are not homeomorphic, and not even homotopy equivalent).

Unfortunately the main difficulty of homotopy theory (and part of what makes it so interesting) is that these “holes” interact with each other in elusive and convoluted ways, and the algebra reflects it almost too well. Part of the problem with the fundamental group is that it deftly eludes our domain of interest: we don’t know a general method to compute the damn things!

Jeremy continues his series on topology and promises programs are not far ahead!

### Probability and Statistics Cookbook

Friday, April 5th, 2013

Probability and Statistics Cookbook by Matthias Vallentin.

From the webpage:

The cookbook contains a succinct representation of various topics in probability theory and statistics. It provides a comprehensive reference reduced to the mathematical essence, rather than aiming for elaborate explanations.

When Matthias says “succinct,” he is quite serious.

But by the time you master the twenty-seven pages of this “cookbook,” you will have a very good grounding in probability and statistics.

### 100 Savvy Sites on Statistics and Quantitative Analysis

Wednesday, April 3rd, 2013

100 Savvy Sites on Statistics and Quantitative Analysis

From the post:

Nate Silver’s unprecedentedly accurate prediction of state-by-state election results in the most recent presidential race was a watershed moment for the public awareness of statistics. While data gathering and analysis has become a massive industry in the past decade, it hasn’t always been as well covered in the press or publicly accessible as it is now. With more and more of our daily interactions being mediated through computers and the internet, it is easier than ever to gather detailed quantitative data, do statistical analysis on that data, and derive valuable information and predictions from it.

Knowledge of statistics and quantitative analysis techniques is more valuable than ever. From biostatisticians to politicians and economists, people in every field are using statistics to further their careers and knowledge. These sites are some of the most useful, informative, and comprehensive on the web covering stats and quantitative analysis.

Covers everything from Comprehensive Statistics Sites and Big Data to Data Visualization and Sports Stats.

I first saw this at 100 Savvy Sites on Statistics and Quantitative Analysis by Vincent Granville.

### An Applet for the Investigation of Simpson’s Paradox

Tuesday, April 2nd, 2013

An Applet for the Investigation of Simpson’s Paradox by Kady Schneiter and Jürgen Symanzik. (Journal of Statistics Education, Volume 21, Number 1 (2013))

Simpson’s paradox is best illustrated by the University of California, Berkeley sex discrimination case. Taken in the aggregate, admissions to the graduate school appeared to greatly favor men. Taken by department, no department discriminated against women and most favored admission of women. Same data, different level of examination. That is Simpson’s paradox.

Abstract:

This article describes an applet that facilitates investigation of Simpson’s Paradox in the context of a number of real and hypothetical data sets. The applet builds on the Baker-Kramer graphical representation for Simpson’s Paradox. The implementation and use of the applet are explained. This is followed by a description of how the applet has been used in an introductory statistics class and a discussion of student responses to the applet.

In probability and statistics, Simpson’s paradox, or the Yule–Simpson effect, is a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data. This result is often encountered in social-science and medical-science statistics,[1] and is particularly confounding when frequency data are unduly given causal interpretations.[2] Simpson’s Paradox disappears when causal relations are brought into consideration.

A cautionary tale about the need to understand data sets and how combining them may impact outcomes of statistical analysis.
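A toy illustration of the paradox (the numbers are made up for clarity, not the actual Berkeley figures): each department admits women at a rate at least as high as men, yet the aggregate favors men, because women applied disproportionately to the harder department.

```python
# dept: (men_admitted, men_applied, women_admitted, women_applied)
data = {
    "easy": (80, 100, 18, 20),    # high admission rate overall
    "hard": (10, 50, 40, 180),    # low admission rate overall
}

def rate(admitted, applied):
    return admitted / applied

# Per-department: women's rate is >= men's in each department.
for dept, (ma, mn, wa, wn) in data.items():
    print(dept, rate(ma, mn), rate(wa, wn))

# Aggregate: the trend reverses.
men = rate(sum(v[0] for v in data.values()), sum(v[1] for v in data.values()))
women = rate(sum(v[2] for v in data.values()), sum(v[3] for v in data.values()))
print(men, women)  # 0.6 vs 0.29: men's aggregate rate is higher
```

Same data, different level of examination, opposite conclusion — which is exactly the cautionary tale.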

### Journal of Statistics Education

Tuesday, April 2nd, 2013

Journal of Statistics Education

From the mission statement:

The Journal of Statistics Education (JSE) disseminates knowledge for the improvement of statistics education at all levels, including elementary, secondary, post-secondary, post-graduate, continuing, and workplace education. It is distributed electronically and, in accord with its broad focus, publishes articles that enhance the exchange of a diversity of interesting and useful information among educators, practitioners, and researchers around the world. The intended audience includes anyone who teaches statistics, as well as those interested in research on statistical and probabilistic reasoning. All submissions are rigorously refereed using a double-blind peer review process.

Manuscripts submitted to the journal should be relevant to the mission of JSE. Possible topics for manuscripts include, but are not restricted to: curricular reform in statistics, the use of cooperative learning and projects, innovative methods of instruction, assessment, and research (including case studies) on students’ understanding of probability and statistics, research on the teaching of statistics, attitudes and beliefs about statistics, creative and tested ideas (including experiments and demonstrations) for teaching probability and statistics topics, the use of computers and other media in teaching, statistical literacy, and distance education. Articles that provide a scholarly overview of the literature on a particular topic are also of interest. Reviews of software, books, and other teaching materials will also be considered, provided these reviews describe actual experiences using the materials.

In addition JSE also features departments called “Teaching Bits: A Resource for Teachers of Statistics” and “Datasets and Stories.” “Teaching Bits” summarizes interesting current events and research that can be used as examples in the statistics classroom, as well as pertinent items from the education literature. The “Datasets and Stories” department not only identifies interesting datasets and describes their useful pedagogical features, but enables instructors to download the datasets for further analysis or dissemination to students.

Associated with the Journal of Statistics Education is the JSE Information Service. The JSE Information Service provides a source of information for teachers of statistics that includes the archives of EDSTAT-L (an electronic discussion list on statistics education), information about the International Association for Statistical Education, and links to many other statistics education sources.

If you are going to talk about big data, of necessity you are also going to talk about statistics.

A very good free online resource on statistics.

### New Book Explores the P-NP Problem [Explaining Topic Maps]

Monday, April 1st, 2013

New Book Explores the P-NP Problem by Shar Steed.

From the post:

The Golden Ticket: P, NP, and the Search for the Impossible, written by CCC Council and CRA board member, Lance Fortnow is now available. The inspiration for the book came in 2009 when Fortnow published an article on the P-NP problem for Communications of the ACM. With more than 200,000 downloads, the article is one of the website’s most popular, which signals that this is an issue that people are interested in exploring. The P-NP problem is the most important open problem in computer science because it attempts to measure the limits of computation.

The book is written to appeal to readers outside of computer science and shed light on the fact that there are deep computational challenges that computer scientists face. To make it relatable, Fortnow developed the “Golden Ticket” analogy, comparing the P-NP problem to the search for the golden ticket in Charlie and the Chocolate Factory, a story many people can relate to. Fortnow avoids mathematical and technical terminology and even the formal definition of the P-NP problem, and instead uses examples to explain concepts.

“My goal was to make the book relatable by telling stories. It is a broad based book that does not require a math or computer science background to understand it.”

Fortnow also credits CRA and CCC for giving him inspiration to write the book.

Fortnow has explained the P-NP problem without using “…mathematical and technical terminology and even the formal definition of the P-NP problem….”

Now, weren’t we just saying how difficult it is to explain topic maps?

Suggest we all read this as a source of inspiration for better (more accessible) explanations and tutorials on topic maps.

(I just downloaded it to the Kindle reader on a VM running on my Ubuntu box. This promises to be a great read!)

### Mathematics Cannot Be Patented. Case Dismissed.

Friday, March 29th, 2013

Mathematics Cannot Be Patented. Case Dismissed. by Alan Schoenbaum.

From the post:

Score one for the good guys. Rackspace and Red Hat just defeated Uniloc, a notorious patent troll. This case never should have been filed. The patent never should have been issued. The ruling is historic because, apparently, it was the first time that a patent suit in the Eastern District of Texas has been dismissed prior to filing an answer in the case, on the grounds that the subject matter of the patent was found to be unpatentable. And was it ever unpatentable.

Red Hat indemnified Rackspace in the case. This is something that Red Hat does well, and kudos to them. They stand up for their customers and defend these Linux suits. The lawyers who defended us deserve a ton of credit. Bill Lee and Cynthia Vreeland of Wilmer Hale were creative and persuasive, and their strategy to bring the early motion to dismiss was brilliant.

The patent at issue is a joke. Uniloc alleged that a floating point numerical calculation by the Linux operating system violated U.S. Patent 5,892,697 – an absurd assertion. This is the sort of low quality patent that never should have been granted in the first place and which patent trolls buy up by the bushel full, hoping for fast and cheap settlements. This time, with Red Hat’s strong backing, we chose to fight.

The outcome was just what we had in mind. Chief Judge Leonard Davis found that the subject matter of the software patent was unpatentable under Supreme Court case law and, ruling from the bench, granted our motion for an early dismissal. The written order, which was released yesterday, is excellent and well-reasoned. It’s refreshing to see that the judiciary recognizes that many of the fundamental operations of a computer are pure mathematics and are not patentable subject matter. We expect, and hope, that many more of these spurious software patent lawsuits are dismissed on similar grounds.

A potential use case for a public topic map on patents?

At least on software patents?

Thinking that a topic map could be constructed of all the current patents that address mathematical operations, enabling academics and researchers to focus on factual analysis of the processes claimed by those patents.

From the factual analysis, other researchers, primarily lawyers and law students, could outline legal arguments, tailored for each patent, as to its invalidity.

A community resource, not unlike a patent bank, that would strengthen the community’s hand when dealing with patent trolls.

PS: I guess this means I need to stop working on my patent for addition.

### MapEquation.org

Tuesday, March 26th, 2013

MapEquation.org by Daniel Edler and Martin Rosvall.

What do we do?

We develop mathematics, algorithms and software to simplify and highlight important structures in complex systems.

What are our goals?

To navigate and understand big data like we navigate and understand the real world by maps.

Spend some time with Code and Publications as well.

I first saw this in a tweet by Chris@SocialTexture.

### Tensors and Their Applications…

Saturday, March 23rd, 2013

Tensors and Their Applications in Graph-Structured Domains by Maximilian Nickel and Volker Tresp. (Slides.)

Along with the slides, you will like the abstract and bibliography found at: Machine Learning on Linked Data: Tensors and their Applications in Graph-Structured Domains.

Abstract:

Machine learning has become increasingly important in the context of Linked Data as it is an enabling technology for many important tasks such as link prediction, information retrieval or group detection. The fundamental data structure of Linked Data is a graph. Graphs are also ubiquitous in many other fields of application, such as social networks, bioinformatics or the World Wide Web. Recently, tensor factorizations have emerged as a highly promising approach to machine learning on graph-structured data, showing both scalability and excellent results on benchmark data sets, while matching perfectly to the triple structure of RDF. This tutorial will provide an introduction to tensor factorizations and their applications for machine learning on graphs. By the means of concrete tasks such as link prediction we will discuss several factorization methods in-depth and also provide necessary theoretical background on tensors in general. Emphasis is put on tensor models that are of interest to Linked Data, which will include models that are able to factorize large-scale graphs with millions of entities and known facts or models that can handle the open-world assumption of Linked Data. Furthermore, we will discuss tensor models for temporal and sequential graph data, e.g. to analyze social networks over time.
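If tensor factorization is new to you, a toy numpy sketch may help fix the idea: stack each relation as an adjacency slice of a 3-way tensor, then score candidate links from a low-rank approximation. (This uses a plain truncated SVD of an unfolding for illustration only; the RESCAL-style models in the tutorial are more sophisticated.)

```python
import numpy as np

# Toy knowledge graph: 4 entities, 2 relations, stored as a 3-way
# tensor X with X[k, i, j] = 1 iff triple (entity_i, relation_k, entity_j) holds.
X = np.zeros((2, 4, 4))
X[0, 0, 1] = X[0, 1, 2] = X[0, 2, 3] = 1.0   # relation 0: a chain
X[1, 0, 2] = X[1, 1, 3] = 1.0                 # relation 1

# Mode-1 unfolding: entities x (relations * entities).
unfolded = X.transpose(1, 0, 2).reshape(4, -1)

# Rank-2 truncated SVD gives latent entity components.
U, s, Vt = np.linalg.svd(unfolded, full_matrices=False)
r = 2
approx = (U[:, :r] * s[:r]) @ Vt[:r, :]
X_hat = approx.reshape(4, 2, 4).transpose(1, 0, 2)

# Reconstructed scores can be read as link-prediction confidences.
print(np.round(X_hat[0], 2))
```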

Devising a system to deal with the heterogeneous nature of linked data.

Just from skimming the slides I could see, this looks very promising.

I first saw this in a tweet by Stefano Bertolo.

Update: I just got an email from Maximilian Nickel and he has altered the transition between slides. Working now!

From slide 53 forward is pure gold for topic map purposes.

Heavy sledding but let me give you one statement from the slides that should capture your interest:

Instance matching: Ranking of entities by their similarity in the entity-latent-component space.

What is more, Maximilian offers proof that the technique scales!

Complex, configurable, scalable determination of subject identity!

[Update: deleted note about issues with slides, which read: (Slides for ISWC 2012 tutorial, Chrome is your best bet. Even better bet, Chrome on Windows. Chrome on Ubuntu crashed every time I tried to go to slide #15. Windows gets to slide #46 before failing to respond. I have written to inquire about the slides.)]

### Methods of Proof — Induction

Saturday, March 23rd, 2013

Methods of Proof — Induction by Jeremy Kun.

Jeremy covers proof by induction in the final post for his “proof” series.

Induction is used to prove statements about natural numbers (positive integers).
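You cannot prove by experiment, but you can sanity-check both halves of an induction argument numerically. A quick sketch for the classic closed form 1 + 2 + … + n = n(n+1)/2:

```python
# Sanity-check the closed form 1 + 2 + ... + n = n(n+1)/2,
# the classic first example of proof by induction.
def closed_form(n):
    return n * (n + 1) // 2

# Base case: n = 1.
assert closed_form(1) == 1

# "Inductive step", checked numerically: if the formula holds for n,
# adding (n + 1) should give the formula's value at n + 1.
for n in range(1, 1000):
    assert closed_form(n) + (n + 1) == closed_form(n + 1)

print("base case and inductive step hold for n up to 1000")
```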

Lars Marius Garshol recently concluded slides on big data with:

• Vast potential
  • to both big data and machine learning
• Very difficult to realize that potential
  • requires mathematics, which nobody knows
• We need to wake up!

Big Data 101 by Lars Marius Garshol.

If you want to step up your game with big data, you will need to master mathematics.

Excel and other software can do mathematics but can’t choose the mathematics to apply.

That requires you.

### The Shape of Data

Friday, March 22nd, 2013

The Shape of Data by Jesse Johnson.

Whether your goal is to write data intensive software, use existing software to analyze large, high dimensional data sets, or to better understand and interact with the experts who do these things, you will need a strong understanding of the structure of data and how one can try to understand it. On this blog, I plan to explore and explain the basic ideas that underlie modern data analysis from a very intuitive and minimally technical perspective: by thinking of data sets as geometric objects.

When I began learning about machine learning and data mining, I found that the intuition I had formed while studying geometry was extremely valuable in understanding the basic concepts and algorithms. My main obstacle has been to figure out what types of problems others are interested in solving, and what types of solutions would make the most difference. I hope that by sharing what I know (and what I continue to learn) from my own perspective, others will help me to figure out what are the major questions that drive this field.

A new blog that addresses the topology of data, in an accessible manner.

### A Concise Course in Algebraic Topology

Tuesday, March 12th, 2013

A Concise Course in Algebraic Topology by J.P. May. (PDF)

From the introduction:

The first year graduate program in mathematics at the University of Chicago consists of three three-quarter courses, in analysis, algebra, and topology. The first two quarters of the topology sequence focus on manifold theory and differential geometry, including differential forms and, usually, a glimpse of de Rham cohomology. The third quarter focuses on algebraic topology. I have been teaching the third quarter off and on since around 1970. Before that, the topologists, including me, thought that it would be impossible to squeeze a serious introduction to algebraic topology into a one quarter course, but we were overruled by the analysts and algebraists, who felt that it was unacceptable for graduate students to obtain their PhDs without having some contact with algebraic topology.

This raises a conundrum. A large number of students at Chicago go into topology, algebraic and geometric. The introductory course should lay the foundations for their later work, but it should also be viable as an introduction to the subject suitable for those going into other branches of mathematics. These notes reflect my efforts to organize the foundations of algebraic topology in a way that caters to both pedagogical goals. There are evident defects from both points of view. A treatment more closely attuned to the needs of algebraic geometers and analysts would include Čech cohomology on the one hand and de Rham cohomology and perhaps Morse homology on the other. A treatment more closely attuned to the needs of algebraic topologists would include spectral sequences and an array of calculations with them. In the end, the overriding pedagogical goal has been the introduction of basic ideas and methods of thought.

Tough sledding, but insights like those found in the GraphLab project require a deeper than usual understanding of the issues at hand.

I first saw this in a tweet by Topology Fact.

### A Simple, Combinatorial Algorithm for Solving…

Tuesday, March 5th, 2013

A Simple, Combinatorial Algorithm for Solving SDD Systems in Nearly-Linear Time by Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, Zeyuan Allen Zhu.

Abstract:

In this paper, we present a simple combinatorial algorithm that solves symmetric diagonally dominant (SDD) linear systems in nearly-linear time. It uses very little of the machinery that previously appeared to be necessary for such an algorithm. It does not require recursive preconditioning, spectral sparsification, or even the Chebyshev Method or Conjugate Gradient. After constructing a “nice” spanning tree of a graph associated with the linear system, the entire algorithm consists of the repeated application of a simple (non-recursive) update rule, which it implements using a lightweight data structure. The algorithm is numerically stable and can be implemented without the increased bit-precision required by previous solvers. As such, the algorithm has the fastest known running time under the standard unit-cost RAM model. We hope that the simplicity of the algorithm and the insights yielded by its analysis will be useful in both theory and practice.

In one popular account, the importance of the discovery was described this way:

The real value of the MIT paper, Spielman says, is in its innovative theoretical approach. “My work and the work of the folks at Carnegie Mellon, we’re solving a problem in numeric linear algebra using techniques from the field of numerical linear algebra,” he says. “Jon’s paper is completely ignoring all of those techniques and really solving this problem using ideas from data structures and algorithm design. It’s substituting one whole set of ideas for another set of ideas, and I think that’s going to be a bit of a game-changer for the field. Because people will see there’s this set of ideas out there that might have application no one had ever imagined.”

Thirty-two pages of tough sledding but if the commentaries are correct, this paper may have a major impact on graph processing.
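For a feel of why diagonal dominance is such friendly territory, here is plain Jacobi iteration, which converges on strictly diagonally dominant systems. (This is a textbook method, not the Kelner et al. algorithm, which relies on a spanning tree and a clever data structure.)

```python
import numpy as np

def jacobi(A, b, iters=200):
    """Solve Ax = b by Jacobi iteration; converges when A is
    strictly diagonally dominant (a stronger cousin of SDD)."""
    D = np.diag(A)                # diagonal entries
    R = A - np.diagflat(D)        # off-diagonal remainder
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x = (b - R @ x) / D       # the simple, repeated update rule
    return x

# A symmetric, strictly diagonally dominant system.
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])

x = jacobi(A, b)
print(np.allclose(A @ x, b, atol=1e-8))   # True
```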

### Methods of Proof — Contradiction

Friday, March 1st, 2013

Methods of Proof — Contradiction by Jeremy Kun.

From the post:

In this post we’ll expand our toolbox of proof techniques by adding the proof by contradiction. We’ll also expand on our knowledge of functions on sets, and tackle our first nontrivial theorem: that there is more than one kind of infinity.

Impossibility and an Example Proof by Contradiction

Many of the most impressive results in all of mathematics are proofs of impossibility. We see these in lots of different fields. In number theory, plenty of numbers cannot be expressed as fractions. In geometry, certain geometric constructions are impossible with a straight-edge and compass. In computing theory, certain programs cannot be written. And in logic even certain mathematical statements can’t be proven or disproven.

In some sense proofs of impossibility are the hardest proofs, because it’s unclear to the layman how anyone could prove it’s not possible to do something. Perhaps this is part of human nature, that nothing is too impossible to escape the realm of possibility. But perhaps it’s more surprising that the main line of attack to prove something is impossible is to assume it’s possible, and see what follows as a result. This is precisely the method of proof by contradiction:

Assume the claim you want to prove is false, and deduce that something obviously impossible must happen.

There is a simple and very elegant example that I use to explain this concept to high school students in my guest lectures.

I hope you are following this series of posts but if not, at least read the example Jeremy has for proof by contradiction.

It’s a real treat.
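If you want the flavor before clicking through, a standard specimen (not necessarily the example Jeremy uses) is the irrationality of the square root of 2:

```latex
% Claim: $\sqrt{2}$ is irrational. Proof by contradiction.
Suppose $\sqrt{2} = a/b$ with $a, b$ integers sharing no common factor.
Then $a^2 = 2b^2$, so $a^2$ is even, hence $a$ is even; write $a = 2k$.
Substituting, $4k^2 = 2b^2$, so $b^2 = 2k^2$ and $b$ is even as well.
But then $a$ and $b$ share the factor $2$, contradicting our choice.
Hence no such fraction exists. \qed
```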

### Methods of Proof — Contrapositive

Friday, March 1st, 2013

Methods of Proof — Contrapositive by Jeremy Kun.

From the post:

In this post we’ll cover the second of the “basic four” methods of proof: the contrapositive implication. We will build off our material from last time and start by defining functions on sets.

Functions as Sets

So far we have become comfortable with the definition of a set, but the most common way to use sets is to construct functions between them. As programmers we readily understand the nature of a function, but how can we define one mathematically? It turns out we can do it in terms of sets, but let us recall the desired properties of a function:

• Every input must have an output.
• Every input can only correspond to one output (the function must be deterministic).
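Those two bullet points translate directly into a check on a set of ordered pairs. A small sketch (the names are mine, not Jeremy’s):

```python
def is_function(pairs, domain):
    """A set of (input, output) pairs is a function on `domain` iff
    every input appears (totality) and appears only once (determinism)."""
    inputs = [x for x, _ in pairs]
    total = set(inputs) == set(domain)               # every input has an output
    deterministic = len(inputs) == len(set(inputs))  # one output per input
    return total and deterministic

f = {(1, 'a'), (2, 'b'), (3, 'a')}            # fine: not injective, still a function
g = {(1, 'a'), (1, 'b'), (2, 'c'), (3, 'd')}  # 1 maps to two outputs
h = {(1, 'a'), (2, 'b')}                      # 3 has no output

print(is_function(f, {1, 2, 3}))   # True
print(is_function(g, {1, 2, 3}))   # False: not deterministic
print(is_function(h, {1, 2, 3}))   # False: not total
```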

Jeremy continues his series on proof techniques.

Required knowledge for reading formal CS papers.

### Discrete Structures (University of Washington 2008)

Monday, February 25th, 2013

I found the following material following a link in Christophe Lalanne’s A bag of tweets / February 2013 on “Common Mistakes in Discrete Mathematics.”

Clearly organized around a text but it wasn’t clear which text was being used.

Backing up the URI, I found the homepage for: CSE 321: Discrete Structures 2008, which listed the textbook as Rosen, Discrete Mathematics and Its Applications, McGraw-Hill, 6th Edition. (BTW, there is a 7th edition, Discrete Mathematics and Its Applications).

Homework

Lecture Slides

Recorded Lectures

Post-Section Notes (note and a problem correction)

and, the origin of my inquiry:

Common Mistakes in Discrete Mathematics

In this section of the Guide we list many common mistakes that people studying discrete mathematics sometimes make. The list is organized chapter by chapter, based on when they first occur, but sometimes mistakes made early in the course perpetuate in later chapters. Also, some of these mistakes are remnants of misconceptions from high school mathematics (such as the impulse to assume that every operation distributes over every other operation).

In most cases we describe the mistake, give a concrete example, and then offer advice about how to avoid it. Note that additional advice about common mistakes is given, implicitly or explicitly, in the solutions to the odd-numbered exercises, which constitute the bulk of this Guide.
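The distributivity remark is easy to demonstrate for yourself; one counterexample (mine, not the Guide’s) settles it:

```python
# A common mistake: assuming every operation distributes over every
# other, e.g. that 2**(a + b) == 2**a + 2**b. One counterexample suffices.
a, b = 3, 4
print(2 ** (a + b))        # 128
print(2 ** a + 2 ** b)     # 24
assert 2 ** (a + b) != 2 ** a + 2 ** b
```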

If 2008 sounds a bit old, you’re right. There is an update that requires a separate post. See: UW Courses in Computer Science and Engineering.

### Methods of Proof — Direct Implication

Saturday, February 16th, 2013

Methods of Proof — Direct Implication by Jeremy Kun.

From the post:

I recently posted an exploratory piece on why programmers who are genuinely interested in improving their mathematical skills can quickly lose stamina or be deterred. My argument was essentially that they don’t focus enough on mastering the basic methods of proof before attempting to read research papers that assume such knowledge. Also, there are a number of confusing (but in the end helpful) idiosyncrasies in mathematical culture that are often unexplained. Together these can cause enough confusion to stymie even the most dedicated reader. I have certainly experienced it enough to call the feeling familiar.

Now I’m certainly not trying to assert that all programmers need to learn mathematics to improve their craft, nor that learning mathematics will be helpful to any given programmer. All I claim is that someone who wants to understand why theorems are true, or how to tweak mathematical work to suit their own needs, cannot succeed without a thorough understanding of how these results are developed in the first place. Function definitions and variable declarations may form the scaffolding of a C program while the heart of the program may only be contained in a few critical lines of code. In the same way, the heart of a proof is usually quite small and the rest is scaffolding. One surely cannot understand or tweak a program without understanding the scaffolding, and the same goes for mathematical proofs.

And so we begin this series focusing on methods of proof, and we begin in this post with the simplest such methods. I call them the “basic four,” and they are:

• Proof by direct implication,
• Proof by contrapositive,
• Proof by contradiction, and
• Proof by induction.

This post will focus on the first one, while introducing some basic notation we will use in the future posts. Mastering these proof techniques does take some practice, and it helps to have some basic mathematical content with which to practice on. We will choose the content of set theory because it’s the easiest field in terms of definitions, and its syntax is the most widely used in all but the most pure areas of mathematics. Part of the point of this primer is to spend time demystifying notation as well, so we will cover the material at a leisurely (for an experienced mathematician: aggravatingly slow) pace.

Parallel processing, multi-core memory architectures, graphs and the like are a long way from the cookbook stage of programming.

If you want to be on the leading edge, some mathematics is going to be required.

This series can bring you one step closer to mathematical literacy.

I say “can” because whether it will or not, depends upon you.

### …no Hitchhiker’s Guide…

Saturday, February 9th, 2013

From the post:

Do you really want to get better at mathematics?

Remember when you first learned how to program? I do. I spent two years experimenting with Java programs on my own in high school. Those two years collectively contain the worst and most embarrassing code I have ever written. My programs absolutely reeked of programming no-nos. Hundred-line functions and even thousand-line classes, magic numbers, unreachable blocks of code, ridiculous code comments, a complete disregard for sensible object orientation, negligence of nearly all logic, and type-coercion that would make your skin crawl. I committed every naive mistake in the book, and for all my obvious shortcomings I considered myself a hot-shot programmer! At least I was learning a lot, and I was a hot-shot programmer in a crowd of high-school students interested in game programming.

Even after my first exposure and my commitment to get a programming degree in college, it was another year before I knew what a stack frame or a register was, two more before I was anywhere near competent with a terminal, three more before I fully appreciated functional programming, and to this day I still have an irrational fear of networking and systems programming (the first time I manually edited the call stack I couldn’t stop shivering with apprehension and disgust at what I was doing).

A must read post if you want to be on the cutting edge of programming.

### The Evolution of Regression Modeling… [Webinar]

Wednesday, February 6th, 2013

The Evolution of Regression Modeling: From Classical Linear Regression to Modern Ensembles by Mikhail Golovnya and Illia Polosukhin.

Dates/Times:

Part 1: Fri March 1, 10 am, PST

Part 2: Friday, March 15, 10 am, PST

Part 3: Friday, March 29, 10 am, PST

Part 4: Friday, April 12, 10 am, PST

From the webpage:

Class Description: Regression is one of the most popular modeling methods, but the classical approach has significant problems. This webinar series addresses those problems. Are you working with larger datasets? Is your data challenging? Does your data include missing values, nonlinear relationships, local patterns and interactions? This webinar series is for you! We will cover improvements to conventional and logistic regression, and will include a discussion of classical, regularized, and nonlinear regression, as well as modern ensemble and data mining approaches. This series will be of value to any classically trained statistician or modeler.
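To see what “regularized regression” means in practice, here is textbook closed-form ridge regression in numpy. (A generic illustration, not Salford’s GPS implementation.)

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y.
    The penalty lam shrinks coefficients, taming ill-conditioned X'X."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)

w = ridge(X, y, lam=0.1)
print(np.round(w, 2))   # close to [2.0, -1.0, 0.5]
```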

Details:

Part 1: March 1 – Regression methods discussed

•     Classical Regression
•     Logistic Regression
•     Regularized Regression: GPS Generalized Path Seeker
•     Nonlinear Regression: MARS Regression Splines

Part 2: March 15 – Hands-on demonstration of concepts discussed in Part 1

•     Step-by-step demonstration
•     Instructions for reproducing demo at your leisure
•     For the dedicated student: apply these methods to your own data (optional)

Part 3: March 29 – Regression methods discussed
*Part 1 is a recommended pre-requisite

•     Nonlinear Ensemble Approaches: TreeNet Gradient Boosting; Random Forests; Gradient Boosting incorporating RF
•     Ensemble Post-Processing: ISLE; RuleLearner

Part 4: April 12 – Hands-on demonstration of concepts discussed in part 3

•     Step-by-step demonstration
•     Instructions for reproducing demo at your leisure
•     For the dedicated student: apply these methods to your own data (optional)

Salford Systems offers other introductory videos, webinars, tutorials, and case studies.

Regression modeling is a tool you will encounter in data analysis and is likely to be an important part of your exploration toolkit.

I first saw this at KDNuggets.

### Information field theory

Sunday, January 27th, 2013

Information field theory

From the webpage:

Information field theory (IFT) is information theory, the logic of reasoning under uncertainty, applied to fields. A field can be any quantity defined over some space, e.g. the air temperature over Europe, the magnetic field strength in the Milky Way, or the matter density in the Universe. IFT describes how data and knowledge can be used to infer field properties. Mathematically it is a statistical field theory and exploits many of the tools developed for such. Practically, it is a framework for signal processing and image reconstruction.

IFT is fully Bayesian. How else can infinitely many field degrees of freedom be constrained by finite data?

It can be used without the knowledge of Feynman diagrams. There is a full toolbox of methods.

It reproduces many known well working algorithms. This should be reassuring.

And, there were certainly previous works in a similar spirit. See below for IFT publications and previous works.

Anyhow, in many cases IFT provides novel rigorous ways to extract information from data.

Please, have a look! The specific literature is listed below and more general highlight articles on the right hand side.
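In the simplest Gaussian setting, IFT’s posterior mean reduces to the classical Wiener filter. A toy 1-D sketch (my own setup, not code from the IFT group):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Prior: a smooth Gaussian random field with covariance S.
x = np.arange(n)
S = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 5.0) ** 2)
signal = np.linalg.cholesky(S + 1e-10 * np.eye(n)) @ rng.normal(size=n)

# Data: the field observed everywhere with white noise of covariance N.
noise_var = 0.1
N = noise_var * np.eye(n)
data = signal + rng.normal(scale=noise_var ** 0.5, size=n)

# Wiener filter (posterior mean): m = S (S + N)^-1 d.
m = S @ np.linalg.solve(S + N, data)

# The reconstruction should sit closer to the signal than the raw data does.
print(np.linalg.norm(m - signal) < np.linalg.norm(data - signal))
```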

Just in case you want to be on the cutting edge of information extraction.

And you might note that Feynman diagrams are graphic representations (maps) of complex mathematical equations.