Archive for the ‘Mathematics’ Category

The Matrix Calculus You Need For Deep Learning

Wednesday, February 7th, 2018

The Matrix Calculus You Need For Deep Learning by Terence Parr, Jeremy Howard.


This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don’t worry if you get stuck at some point along the way—just go back and reread the previous section, and try writing down and working through some examples. And if you’re still stuck, we’re happy to answer your questions in the Theory category at Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here.

Here’s a recommendation for reading the paper:

(We teach in University of San Francisco’s MS in Data Science program and have other nefarious projects underway. You might know Terence as the creator of the ANTLR parser generator. For more material, see Jeremy’s courses and University of San Francisco’s Data Institute in-person version of the deep learning course.)

Apologies to Jeremy but I recognize ANTLR more quickly than I do Jeremy’s courses. (Need to fix that.)

The paper runs thirty-three pages and, as the authors say, most of it is unnecessary unless you want to understand what’s happening under the hood with deep learning.

Think of it as the difference between knowing how to drive a sports car and being able to work on a sports car.

With the latter set of skills, you can:

  • tweak your sports car for maximum performance
  • tweak someone else’s sports car for less performance
  • detect someone tweaking your sports car
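For a taste of what “under the hood” means here, a minimal sketch in plain Python (not taken from the paper): the gradient of the squared error of a single linear neuron, derived with the chain rule and checked against finite differences. The function names, the sample values, and the choice of squared error are all assumptions made for illustration.

```python
def loss(w, b, x, y):
    """Squared error of a single linear neuron: (w . x + b - y)^2."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (activation - y) ** 2


def analytic_gradient(w, b, x, y):
    """Chain rule: dL/dw = 2 (w . x + b - y) x and dL/db = 2 (w . x + b - y)."""
    error = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    return [2 * error * xi for xi in x], 2 * error


def numeric_gradient(w, b, x, y, eps=1e-6):
    """Central finite differences, nudging one coordinate at a time."""
    grad_w = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        grad_w.append((loss(up, b, x, y) - loss(down, b, x, y)) / (2 * eps))
    grad_b = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
    return grad_w, grad_b


# Hypothetical weights, bias, input and target, chosen only for the demo.
w, b, x, y = [0.5, -1.0, 2.0], 0.1, [1.0, 2.0, 3.0], 1.0
print(analytic_gradient(w, b, x, y))
print(numeric_gradient(w, b, x, y))
```

If the two printed gradients agree to several decimal places, the hand derivation is probably right; if they disagree, you have found something to tweak.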

Read the paper, master the paper.

No test, just real world consequences that separate the prepared from the unprepared.

The vector algebra war: a historical perspective [Semantic Confusion in Engineering and Physics]

Tuesday, January 23rd, 2018

The vector algebra war: a historical perspective by James M. Chappell, Azhar Iqbal, John G. Hartnett, Derek Abbott.


There are a wide variety of different vector formalisms currently utilized in engineering and physics. For example, Gibbs’ three-vectors, Minkowski four-vectors, complex spinors in quantum mechanics, quaternions used to describe rigid body rotations and vectors defined in Clifford geometric algebra. With such a range of vector formalisms in use, it thus appears that there is as yet no general agreement on a vector formalism suitable for science as a whole. This is surprising, in that, one of the primary goals of nineteenth century science was to suitably describe vectors in three-dimensional space. This situation has also had the unfortunate consequence of fragmenting knowledge across many disciplines, and requiring a significant amount of time and effort in learning the various formalisms. We thus historically review the development of our various vector systems and conclude that Clifford’s multivectors best fulfills the goal of describing vectorial quantities in three dimensions and providing a unified vector system for science.

An image from the paper captures the “descent of the various vector systems:”

The authors contend for use of Clifford’s multivectors over the other vector formalisms described.

Assuming Clifford’s multivectors displace all other systems in use, the authors fail to say how readers will access the present and past legacy of materials in other formalisms.

If the goal is to eliminate “fragmenting knowledge across many disciplines, and requiring a significant amount of time and effort in learning the various formalisms,” that fails in the absence of a mechanism to access existing materials using Clifford’s multivector formalism.

Topic maps anyone?


Monday, December 11th, 2017

Mathwashing: How Algorithms Can Hide Gender and Racial Biases by Kimberley Mok.

From the post:

Scholars have long pointed out that the way languages are structured and used can say a lot about the worldview of their speakers: what they believe, what they hold sacred, and what their biases are. We know humans have their biases, but in contrast, many of us might have the impression that machines are somehow inherently objective. But does that assumption apply to a new generation of intelligent, algorithmically driven machines that are learning our languages and training from human-generated datasets? By virtue of being designed by humans, and by learning natural human languages, might these artificially intelligent machines also pick up on some of those same human biases too?

It seems that machines can and do indeed assimilate human prejudices, whether they are based on race, gender, age or aesthetics. Experts are now finding more evidence that supports this phenomenon of algorithmic bias. As sets of instructions that help machines to learn, reason, recognize patterns and perform tasks on their own, algorithms increasingly pervade our lives. And in a world where algorithms already underlie many of those big decisions that can change lives forever, researchers are finding that many of these algorithms aren’t as objective as we assume them to be.

If you have ever suffered from the delusion that any algorithm is “objective,” this post is a must-read. Or re-read it to remind yourself that “objectivity” is a claim used to put your position beyond question for self-interest. Nothing more.

For my part, I’m not sure what’s unclear about the data collected, the algorithms chosen, and the interpretation of results all being the results of bias.

There may be acceptable biases, or degrees of bias, but the goal of any measurement is a result, which automatically biases a measurer in favor of phenomena that can be measured by a convenient technique. Phenomena that cannot be easily measured, no matter how important, won’t be included.

By the same token, “bias-correction” is the introduction of an acceptable bias and/or the limiting of bias to what the person judging the presence of bias sees as an acceptable level.

Bias is omnipresent and while evaluating algorithms is important, always bear in mind you are choosing acceptable bias over unacceptable bias.

Or to mis-quote the Princess Bride: “Bias is everywhere. Anyone who says differently is selling something.”

(Sharing Mathematical Text on the Web) [Leading Feds Into Woods of Logicism]

Wednesday, November 1st, 2017

From About is a website meant for sharing snippets of mathematical text with others on the web. This is a pastebin for mathematics. This website was born out of a one night hack on Sunday 25, 2012.

Posting and sharing

A new post can be composed by visiting the home page and writing or pasting code in the box on the left hand pane of the page. Once a post is composed and submitted, the page is saved and it becomes accessible with a new unique URL. The new page looks similar to this page and it has a unique URL of its own. The URL can be shared with anyone on the web and he or she will be able to visit your post.


The post can be composed in a mixture of plain text, LaTeX, Markdown and HTML. HTML tags commonly used for formatting text elements are supported. For a demonstration on how LaTeX is rendered, see the demo page. To quickly get started with posting math, see the tutorial.

Bug reports and suggestions

If you come across any bugs, or if you have any suggestions, please email Susam Pal at or report an issue at

Your mileage will vary but drawing on Principia Mathematica without citation will leave any government agents tracking your posts in the wilds of 20th century logicism. Unlikely they will damage anything.

If your Principia notation skills are weak, consider The Notation in Principia Mathematica to translate proofs into late 20th century logic notation.

Launch of the PhilMath Archive

Monday, May 29th, 2017

Launch of the PhilMath Archive: preprint server specifically for philosophy of mathematics

From the post:

PhilSci-Archive is pleased to announce the launch of the PhilMath-Archive, a preprint server specifically for the philosophy of mathematics. The PhilMath-Archive is offered as a free service to the philosophy of mathematics community. Like the PhilSci-Archive, its goal is to promote communication in the field by the rapid dissemination of new work. We aim to provide an accessible repository in which scholarly articles and monographs can find a permanent home. Works posted here can be linked to from across the web and freely viewed without the need for a user account.

PhilMath-Archive invites submissions in all areas of philosophy of mathematics, including general philosophy of mathematics, history of mathematics, history of philosophy of mathematics, history and philosophy of mathematics, philosophy of mathematical practice, philosophy and mathematics education, mathematical applicability, mathematical logic and foundations of mathematics.

For your reference, the PhilSci-Archive.


immersive linear algebra

Sunday, May 21st, 2017

immersive linear algebra by J. Ström, K. Åström, and T. Akenine-Möller.

Billed as:

The world’s first linear algebra book with fully interactive figures.

From the preface:

“A picture says more than a thousand words” is a common expression, and for text books, it is often the case that a figure or an illustration can replace a large number of words as well. However, we believe that an interactive illustration can say even more, and that is why we have decided to build our linear algebra book around such illustrations. We believe that these figures make it easier and faster to digest and to learn linear algebra (which would be the case for many other mathematical books as well, for that matter). In addition, we have added some more features (e.g., popup windows for common linear algebra terms) to our book, and we believe that those features will make it easier and faster to read and understand as well.

After using linear algebra for 20 years times three persons, we were ready to write a linear algebra book that we think will make it substantially easier to learn and to teach linear algebra. In addition, the technology of mobile devices and web browsers have improved beyond a certain threshold, so that this book could be put together in a very novel and innovative way (we think). The idea is to start each chapter with an intuitive concrete example that practically shows how the math works using interactive illustrations. After that, the more formal math is introduced, and the concepts are generalized and sometimes made more abstract. We believe it is easier to understand the entire topic of linear algebra with a simple and concrete example cemented into the reader’s mind in the beginning of each chapter.

Please contact us if there are errors to report, things that you think should be improved, or if you have ideas for better exercises etc. We sincerely look forward to hearing from you, and we will continuously improve this book, and add contributing people to the acknowledgement.
… (popups omitted)

Unlike some standards I could mention, but won’t, the authors number just about everything, making it easy to reference equations, illustrations, etc.


Q&A Cathy O’Neil…

Wednesday, January 4th, 2017

Q&A Cathy O’Neil, author of ‘Weapons of Math Destruction,’ on the dark side of big data by Christine Zhang.

From the post:

Cathy O’Neil calls herself a data skeptic. A former hedge fund analyst with a PhD in mathematics from Harvard University, the Occupy Wall Street activist left finance after witnessing the damage wrought by faulty math in the wake of the housing crash.

In her latest book, “Weapons of Math Destruction,” O’Neil warns that the statistical models hailed by big data evangelists as the solution to today’s societal problems, like which teachers to fire or which criminals to give longer prison terms, can codify biases and exacerbate inequalities. “Models are opinions embedded in mathematics,” she writes.

Great interview that hits enough high points to leave you wanting to learn more about Cathy and her analysis.

On that score, try:

Read her mathbabe blog.

Follow @mathbabedotorg.

Read Weapons of math destruction : how big data increases inequality and threatens democracy.

Try her new business: ORCAA [O’Neil Risk Consulting and Algorithmic Auditing].

From the ORCAA homepage:

ORCAA’s mission is two-fold. First, it is to help companies and organizations that rely on time and cost-saving algorithms to get ahead of this wave, to understand and plan for their litigation and reputation risk, and most importantly to use algorithms fairly.

The second half of ORCAA’s mission is this: to develop rigorous methodology and tools, and to set rigorous standards for the new field of algorithmic auditing.

There are bright-line cases (sentencing, housing, hiring discrimination) where “fair” has a binding legal meaning. And legal liability for not being “fair.”

Outside such areas, the search for “fairness” seems quixotic. Clients are entitled to their definitions of “fair” in those areas.

Researchers found mathematical structure that was thought not to exist [Topic Map Epistemology]

Tuesday, November 15th, 2016

Researchers found mathematical structure that was thought not to exist

From the post:

Researchers found mathematical structure that was thought not to exist. The best possible q-analogs of codes may be useful in more efficient data transmission.

In the 1970s, a group of mathematicians started developing a theory according to which codes could be presented at a level one step higher than the sequences formed by zeros and ones: mathematical subspaces named q-analogs.

While “things thought to not exist” may pose problems for ontologies and other mechanical replicas of truth, topic maps are untroubled by them.

As the Topic Maps Data Model (TMDM) provides:

subject: anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever

A topic map can be constrained by its author to be as stunted as early 20th century logical positivism or have a more post-modernist approach, somewhere in between or elsewhere, but topic maps in general are amenable to any such choice.

One obvious advantage of topic maps is that characteristics of things “thought not to exist” can be captured as they are discussed, and those discussions can later be merged with the ones that follow the discovery that things “thought not to exist” really do exist.

The reverse is also true: topic maps can capture the characteristics of things “thought to exist” which are later “thought to not exist,” along with the transition from “existence” to being thought to be non-existent.

If existence to non-existence sounds difficult, imagine a police investigation where preliminary statements later change and/or are replaced by other statements. You may want to capture prior statements, no longer thought to be true, along with their relationships to later statements.
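The police investigation case can be sketched in a few lines of Python. This is not the TMDM and not any topic map API; the `Subject` and `Statement` classes, their fields, and the sample statements are hypothetical, invented only to show earlier statements being retained and linked to the later ones that replace them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Statement:
    """One assertion about a subject, kept even after it stops being believed."""
    speaker: str
    text: str
    believed: bool = True
    superseded_by: Optional["Statement"] = None


@dataclass
class Subject:
    """Anything whatsoever about which statements can be made."""
    name: str
    statements: list = field(default_factory=list)

    def assert_statement(self, speaker, text):
        statement = Statement(speaker, text)
        self.statements.append(statement)
        return statement

    def revise(self, old, speaker, text):
        """Record a later statement without deleting the earlier one."""
        new = self.assert_statement(speaker, text)
        old.believed = False
        old.superseded_by = new
        return new

    def current(self):
        """Statements still thought to be true."""
        return [s for s in self.statements if s.believed]


case = Subject("whereabouts of witness A")
first = case.assert_statement("witness A", "I was at home all evening.")
second = case.revise(first, "witness A", "I left the house around nine.")

print(len(case.statements))        # both statements are retained
print(first.superseded_by.text)    # the link from the old statement to the new
```

Nothing is overwritten: the first statement stays on record, marked as no longer believed, with a pointer to what replaced it.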

In “real world” situations, you need epistemological assumptions in your semantic paradigm that adapt to the world as experienced and are not limited to the world as imagined by others.

Topic maps offer an open epistemological assumption.

Does your semantic paradigm do the same?

Encyclopedia of Distances

Thursday, November 3rd, 2016

Encyclopedia of Distances (4th edition) by Michel Marie Deza and Elena Deza.

Springer description:

This 4-th edition of the leading reference volume on distance metrics is characterized by updated and rewritten sections on some items suggested by experts and readers, as well a general streamlining of content and the addition of essential new topics. Though the structure remains unchanged, the new edition also explores recent advances in the use of distances and metrics for e.g. generalized distances, probability theory, graph theory, coding theory, data analysis.

New topics in the purely mathematical sections include e.g. the Vitanyi multiset-metric, algebraic point-conic distance, triangular ratio metric, Rossi-Hamming metric, Taneja distance, spectral semimetric between graphs, channel metrization, and Maryland bridge distance. The multidisciplinary sections have also been supplemented with new topics, including: dynamic time wrapping distance, memory distance, allometry, atmospheric depth, elliptic orbit distance, VLBI distance measurements, the astronomical system of units, and walkability distance.

Leaving aside the practical questions that arise during the selection of a ‘good’ distance function, this work focuses on providing the research community with an invaluable comprehensive listing of the main available distances.

As well as providing standalone introductions and definitions, the encyclopedia facilitates swift cross-referencing with easily navigable bold-faced textual links to core entries. In addition to distances themselves, the authors have collated numerous fascinating curiosities in their Who’s Who of metrics, including distance-related notions and paradigms that enable applied mathematicians in other sectors to deploy research tools that non-specialists justly view as arcane. In expanding access to these techniques, and in many cases enriching the context of distances themselves, this peerless volume is certain to stimulate fresh research.

Ransomed for $149 (US) per digital copy, this remarkable work should have a broad readership.

From the introduction to the 2009 edition:

Distance metrics and distances have now become an essential tool in many areas of Mathematics and its applications including Geometry, Probability, Statistics, Coding/Graph Theory, Clustering, Data Analysis, Pattern Recognition, Networks, Engineering, Computer Graphics/Vision, Astronomy, Cosmology, Molecular Biology, and many other areas of science. Devising the most suitable distance metrics and similarities, to quantify the proximity between objects, has become a standard task for many researchers. Especially intense ongoing search for such distances occurs, for example, in Computational Biology, Image Analysis, Speech Recognition, and Information Retrieval.

Often the same distance metric appears independently in several different areas; for example, the edit distance between words, the evolutionary distance in Biology, the Levenstein distance in Coding Theory, and the Hamming+Gap or shuffle-Hamming distance.

(emphasis added)

I highlighted that last sentence to emphasize that Encyclopedia of Distances is a static and undisclosed topic map.

While readers familiar with the concepts:

edit distance between words, the evolutionary distance in Biology, the Levenstein distance in Coding Theory, and the Hamming+Gap or shuffle-Hamming distance.

could enumerate why those merit being spoken of as being “the same distance metric,” no indexing program can accomplish the same feat.

If each of those concepts had enumerated properties, which could be compared by an indexing program, readers could not only discover those “same distance metrics” but could also discover new rediscoveries of that same metric.
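A minimal sketch of what “enumerated properties, which could be compared by an indexing program” might look like, in Python. The property names below are invented for illustration and are not drawn from the Encyclopedia; the rule that two entries are “the same” when their property sets are equal is likewise an assumption, and a real topic map could use a looser or stricter merging rule.

```python
from itertools import combinations

# Hypothetical property sets, one per entry as it might appear in different fields.
entries = {
    "edit distance (words)": frozenset(
        {"compares strings", "counts insertions", "counts deletions", "counts substitutions"}
    ),
    "Levenshtein distance (coding theory)": frozenset(
        {"compares strings", "counts insertions", "counts deletions", "counts substitutions"}
    ),
    "Hamming distance": frozenset(
        {"compares strings", "counts substitutions", "requires equal length"}
    ),
}


def same_subject(props_a, props_b):
    """Assumed merging rule: identical property sets identify the same metric."""
    return props_a == props_b


def merge_candidates(entries):
    """Return every pair of entry names whose disclosed properties match."""
    return [
        (name_a, name_b)
        for (name_a, props_a), (name_b, props_b) in combinations(entries.items(), 2)
        if same_subject(props_a, props_b)
    ]


print(merge_candidates(entries))
```

With the properties disclosed, a newly submitted entry only has to be added to `entries` to find out whether it is a rediscovery of something already listed.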

As it stands, readers must rely upon the undisclosed judgments of the Dezas and hope they continue to revise and extend this work.

When they cease to do so, successive editors will be forced to re-acquire the basis for adding new/re-discovered metrics to it.

PS: Suggestions of similar titles that deal with non-metric distances? I’m familiar with works that impose metrics on non-metric distances but that’s not what I have in mind. That’s an arbitrary and opaque mapping from non-metric to metric.

Wild Maths – explore, imagine, experiment, create!

Wednesday, November 2nd, 2016

Wild Maths – explore, imagine, experiment, create!

From the webpage:

Mathematics is a creative subject. It involves spotting patterns, making connections, and finding new ways of looking at things. Creative mathematicians play with ideas, draw pictures, have the courage to experiment and ask good questions.

Wild Maths is a collection of mathematical games, activities and stories, encouraging you to think creatively. We’ve picked out some of our favourites below – have a go at anything that catches your eye. If you want to explore games, challenges and investigations linked by some shared mathematical areas, click on the Pathways link in the top menu.

The line:

It involves spotting patterns, making connections, and finding new ways of looking at things.

is true of data science as well.

I’m going to print out Can you traverse it?, to keep myself honest, if nothing else. 😉


Weapons of Math Destruction:… [Constructive Knowledge of Discriminatory Impact?]

Saturday, September 10th, 2016

Weapons of Math Destruction: invisible, ubiquitous algorithms are ruining millions of lives by Cory Doctorow.

From the post:

I’ve been writing about the work of Cathy “Mathbabe” O’Neil for years: she’s a radical data-scientist with a Harvard PhD in mathematics, who coined the term “Weapons of Math Destruction” to describe the ways that sloppy statistical modeling is punishing millions of people every day, and in more and more cases, destroying lives. Today, O’Neil brings her argument to print, with a fantastic, plainspoken, call to arms called (what else?) Weapons of Math Destruction.


I’ve followed Cathy’s posts long enough to recommend Weapons of Math Destruction sight unseen. (Publication date September 6, 2016.)

Warning: If you read Weapons of Math Destruction, unlike executives who choose models based on their “gut,” or “instinct,” you may be charged with constructive knowledge of how your model discriminates against group X or Y.

If, like a typical Excel user, you can honestly say “I type in the numbers here and the output comes out there,” it’s going to be hard to prove any intent to discriminate.

You are no more responsible for a result than a pump handle is responsible for cholera.

Doctorow’s conclusion:

O’Neil’s book is a vital crash-course in the specialized kind of statistical knowledge we all need to interrogate the systems around us and demand better.

depends upon your definition of “better.”

“Better” depends on your goals or those of a client.


PS: It is important to understand models/statistics/data so you can shape results to be your definition of “better.” But acknowledge that all results are shaped. The critical question is “What shape do you want?”

Category Theory 1.2

Tuesday, August 30th, 2016

Category Theory 1.2 by Bartosz Milewski.

Brief notes on the first couple of minutes:

Our toolset includes:

Abstraction – lose the details – things that were different are now the same

Composition –


Identity – what is identical or considered to be identical

Composition and Identity define category theory.
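As a rough illustration (my own sketch in Python, not code from the lecture), composition and identity can be tried out with ordinary functions standing in for arrows. The helper names `compose` and `identity` and the three small sample functions are assumptions; the checks only spot-test the laws on a handful of inputs and prove nothing.

```python
def identity(x):
    """The arrow that does nothing."""
    return x


def compose(g, f):
    """Return 'g after f', the function taking x to g(f(x))."""
    return lambda x: g(f(x))


# Three small arrows: str -> int -> bool -> bool.
def length(s):
    return len(s)


def is_even(n):
    return n % 2 == 0


def negate(b):
    return not b


samples = ["", "a", "ab", "abc"]

# Associativity: the grouping of compositions does not matter.
left = compose(negate, compose(is_even, length))
right = compose(compose(negate, is_even), length)
print(all(left(s) == right(s) for s in samples))

# Identity: composing with identity on either side changes nothing.
print(all(compose(length, identity)(s) == length(s) for s in samples))
print(all(compose(identity, length)(s) == length(s) for s in samples))
```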

Despite the bad press about category theory, I was disappointed when the video ended after approximately 48 minutes.

Yes, it was that entertaining!

If you ever shied away from category theory, start with Category Theory 1.1 and follow on!

Or try Category Theory for Programmers: The Preface, also by Bartosz Milewski.

Category Theory 1.1

Thursday, August 25th, 2016

Motivation and philosophy.

Bartosz Milewski is the author of the category series: Category Theory for Programmers.


Category theory definition dependencies

Friday, August 5th, 2016

Category theory definition dependencies by John D. Cook.

From the post:

The diagram below shows how category theory definitions build on each other. Based on definitions in The Joy of Cats.


You will need John’s full size image for this to really be useful.

Prints to 8 1/2 x 11 paper.

There’s a test of your understanding of category theory.

Use John’s dependency graph and on (several) separate pages, jot down your understanding of each term.
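If you want the pages in an order where no term comes before the terms it builds on, a topological sort will do it. The sketch below uses Python’s standard `graphlib` module (Python 3.9 or later); the four terms and their dependencies are a small hand-picked subset chosen for illustration, not a transcription of John’s diagram.

```python
from graphlib import TopologicalSorter

# Each term maps to the terms its definition relies on.
# Illustrative subset only; substitute the edges from the full diagram.
depends_on = {
    "category": set(),
    "functor": {"category"},
    "natural transformation": {"functor"},
    "adjunction": {"functor", "natural transformation"},
}

# static_order() yields every term after all of its prerequisites.
study_order = list(TopologicalSorter(depends_on).static_order())

for page, term in enumerate(study_order, start=1):
    print(f"page {page}: {term}")
```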

Functors, Applicatives, and Monads in Plain English

Saturday, April 23rd, 2016

Functors, Applicatives, and Monads in Plain English by Russ Bishop.

From the post:

Let’s learn what Monads, Applicatives, and Functors are, only instead of relying on obscure functional vocabulary or category theory we’ll just, you know, use plain english instead.

See what you think.

I say Russ was successful.
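For readers who want something to poke at, here is a minimal sketch of the three ideas in Python. It is not taken from Russ’s post; the `Maybe` class, its method names (`map`, `apply`, `bind`), and `safe_reciprocal` are my own assumptions, modeled on a box that either holds a value or holds nothing.

```python
class Maybe:
    """A box that holds either one value or nothing."""

    def __init__(self, value=None, present=False):
        self.value = value
        self.present = present

    @classmethod
    def just(cls, value):
        return cls(value, True)

    @classmethod
    def nothing(cls):
        return cls()

    def map(self, func):
        """Functor: apply a plain function to the value inside the box."""
        return Maybe.just(func(self.value)) if self.present else Maybe.nothing()

    def apply(self, boxed_value):
        """Applicative: this box holds a function; apply it to another box."""
        if self.present and boxed_value.present:
            return Maybe.just(self.value(boxed_value.value))
        return Maybe.nothing()

    def bind(self, func):
        """Monad: func already returns a box, so the result is not wrapped again."""
        return func(self.value) if self.present else Maybe.nothing()

    def __eq__(self, other):
        return isinstance(other, Maybe) and (self.present, self.value) == (other.present, other.value)

    def __repr__(self):
        return f"Just({self.value!r})" if self.present else "Nothing"


def safe_reciprocal(x):
    """Return a boxed 1/x, or an empty box when x is zero."""
    return Maybe.just(1 / x) if x != 0 else Maybe.nothing()


print(Maybe.just(3).map(lambda n: n + 1))                 # Just(4)
print(Maybe.just(lambda n: n * 2).apply(Maybe.just(5)))   # Just(10)
print(Maybe.just(4).bind(safe_reciprocal))                # Just(0.25)
print(Maybe.just(0).bind(safe_reciprocal))                # Nothing
```

The difference between `map` and `bind` is the whole story: `map` takes a function that knows nothing about boxes, while `bind` takes one that returns a box of its own.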


Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

Thursday, April 7th, 2016

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil.


From the description at Amazon:

We live in the age of the algorithm. Increasingly, the decisions that affect our lives—where we go to school, whether we get a car loan, how much we pay for health insurance—are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated. But as Cathy O’Neil reveals in this shocking book, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his race or neighborhood), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data.

Tracing the arc of a person’s life, from college to retirement, O’Neil exposes the black box models that shape our future, both as individuals and as a society. Models that score teachers and students, sort resumes, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health—all have pernicious feedback loops. They don’t simply describe reality, as proponents claim, they change reality, by expanding or limiting the opportunities people have. O’Neil calls on modelers to take more responsibility for how their algorithms are being used. But in the end, it’s up to us to become more savvy about the models that govern our lives. This important book empowers us to ask the tough questions, uncover the truth, and demand change.

Even if you have qualms about Cathy’s position, you have to admit that is a great book cover!

When I was in law school, I had F. Hodge O’Neal for corporation law. He is the O’Neal in O’Neal and Thompson’s Oppression of Minority Shareholders and LLC Members, Rev. 2d.

The publisher’s blurb is rather generous in saying:

Cited extensively, O’Neal and Thompson’s Oppression of Minority Shareholders and LLC Members shows how to take appropriate steps to protect minority shareholder interests using remedies, tactics, and maneuvers sanctioned by federal law. It clarifies the underlying cause of squeeze-outs and suggests proven arrangements for avoiding them.

You could read Oppression of Minority Shareholders and LLC Members that way but when corporate law is taught with war stories from the antics of the robber barons forward, you get the impression that isn’t why people read it.

Not that I doubt Cathy’s sincerity; on the contrary, I think she is very sincere about her warnings.

Where I disagree with Cathy is in thinking democracy is under greater attack now or that inequality is any greater problem than before.

If you read The Half Has Never Been Told: Slavery and the Making of American Capitalism by Edward E. Baptist carefully, you will leave it with deep uncertainty about the relationship of American government (federal, state and local) to any recognizable concept of democracy. Or for that matter to the “equality” of its citizens.

Unlike Cathy as well, I don’t expect that shaming people is going to result in “better” or more “honest” data analysis.

What you can do is arm yourself to do battle on behalf of your “side,” both in terms of exposing data manipulation by others and concealing your own.

Perhaps there is room in the marketplace for a book titled: Suppression of Unfavorable Data. More than hiding data, what data to not collect? How to explain non-collection/loss? How to collect data in the least useful ways?

You would have to write it as a how-to on avoiding these very bad practices, but everyone would know what you meant. Could be the next business management best seller.

Overlay Journal – Discrete Analysis

Saturday, March 5th, 2016

The arXiv overlay journal Discrete Analysis has launched by Christian Lawson-Perfect.

From the post:

Discrete Analysis, a new open-access journal for articles which are “analytical in flavour but that also have an impact on the study of discrete structures”, launched this week. What’s interesting about it is that it’s an arXiv overlay journal founded by, among others, Timothy Gowers.

What that means is that you don’t get articles from Discrete Analysis – it just arranges peer review of papers held on the arXiv, cutting out almost all of the expensive parts of traditional journal publishing. I wasn’t really prepared for how shallow that makes the journal’s website – there’s a front page, and when you click on an article you’re shown a brief editorial comment with a link to the corresponding arXiv page, and that’s it.

But that’s all it needs to do – the opinion of Gowers and co. is that the only real value that journals add to the papers they publish is the seal of approval gained by peer review, so that’s the only thing they’re doing. Maths papers tend not to benefit from the typesetting services traditional publishers provide (or, more often than you’d like, are actively hampered by it).

One way the journal is adding value beyond a “yes, this is worth adding to the list of papers we approve of” is by providing an “editorial introduction” to accompany each article. These are brief notes, written by members of the editorial board, which introduce the topics discussed in the paper and provide some context, to help you decide if you want to read the paper. That’s a good idea, and it makes browsing through the articles – and this is something unheard of on the internet – quite pleasurable.

It’s not difficult to imagine “editorial introductions” with underlying mini-topic maps that could be explored on their own or that as you reach the “edge” of a particular topic map, it “unfolds” to reveal more associations/topics.

Not unlike a traditional street map for New York, which you can unfold to find general areas but can then fold up to focus more tightly on a particular area.

I hesitate to say “zoom” because in the application I have seen (important qualification), “zoom” uniformly reduces your field of view.

A more nuanced notion of “zoom,” for a topic map and perhaps for other maps as well, would be to hold portions of the current view stationary, say a starting point on an interstate highway and to “zoom” only a portion of the current view to show a detailed street map. That would enable the user to see a particular location while maintaining its larger context.

Pointers to applications that “zoom” but also maintain different levels of “zoom” in the same view? Given the fascination with “hairy” presentations of graphs, that would have to be a real winner.

Dimpl: An Efficient and Expressive DSL for Discrete Mathematics

Sunday, February 28th, 2016

Dimpl: An Efficient and Expressive DSL for Discrete Mathematics by Ronit Jha.


This paper describes the language DIMPL, a domain-specific language (DSL) for discrete mathematics. Based on Haskell, DIMPL carries all the advantages of a purely functional programming language. Besides containing a comprehensive library of types and efficient functions covering the areas of logic, set theory, combinatorics, graph theory, number theory and algebra, the DSL also has a notation akin to one used in these fields of study. This paper also demonstrates the benefits of DIMPL by comparing it with C, Fortran, MATLAB and Python – languages that are commonly used in mathematical programming.

From the comparison, solving simultaneous linear equations:


Much more is promised in the future for DIMPL:

Future versions of DIMPL will have an extended library comprising of modules for lattices, groups, rings, monoids and other discrete structures. They will also contain additional functions for the existing modules such as Graph and Tree. Moreover, incorporating Haskell’s support for pure parallelism and explicit concurrency in the library functions could significantly improve the efficiency of some functions on multi-core machines.

Can you guess the one thing that Ronit left out of his paper?

You guessed it!

Discrete Mathematics Programming Language – A Domain-Specific Language for Discrete Mathematics.

The GitHub URL for the repository. 😉

You should check out his homepage as well.

I have only touched the edges of this paper but it looks important.


I first saw this in a tweet by José A. Alonso

Google Embeds 2D & 3D Plots on Search!

Friday, February 12th, 2016

Jake Vanderplas tweeted:

Whoah… @google now embeds interactive 2D & 3D plots when you search for a function!

Seeing is believing (with controls no less):



or, sin(x+y)


In case you want to know more:

Go to: Calculator & unit converter and select How to graph equations and Geometry Calculator.

If your browser supports WebGL, Google will render 3D graphs.

What functions have you used Google to render?

A Gentle Introduction to Category Theory (Feb 2016 version)

Monday, February 8th, 2016

A Gentle Introduction to Category Theory (Feb 2016 version) by Peter Smith.

From the preface:

This Gentle Introduction is work in progress, developing my earlier ‘Notes on Basic Category Theory’ (2014–15).

The gadgets of basic category theory fit together rather beautifully in multiple ways. Their intricate interconnections mean, however, that there isn’t a single best route into the theory. Different lecture courses, different books, can quite appropriately take topics in very different orders, all illuminating in their different ways. In the earlier Notes, I roughly followed the order of somewhat over half of the Cambridge Part III course in category theory, as given in 2014 by Rory Lucyshyn-Wright (broadly following a pattern set by Peter Johnstone; see also Julia Goedecke’s notes from 2013). We now proceed rather differently. The Cambridge ordering certainly has its rationale; but the alternative ordering I now follow has in some respects a greater logical appeal. Which is one reason for the rewrite.

Our topics, again in different arrangements, are also covered in (for example) Awodey’s good but uneven Category Theory and in Tom Leinster’s terrific – and appropriately titled – Basic Category Theory. But then, if there are some rightly admired texts out there, not to mention various sets of notes on category theory available online (see here), why produce another introduction to category theory?

I didn’t intend to! My goal all along has been to get to understand what light category theory throws on logic, set theory, and the foundations of mathematics. But I realized that I needed to get a lot more securely on top of basic category theory if I was eventually to pursue these more philosophical issues. So my earlier Notes began life as detailed jottings for myself, to help really fix ideas: and then – as can happen – the writing has simply taken on its own momentum. I am still concentrating mostly on getting the technicalities right and presenting them in a pleasing order: I hope later versions will contain more motivational/conceptual material.

What remains distinctive about this Gentle Introduction, for good or ill, is that it is written by someone who doesn’t pretend to be an expert who usually operates at the very frontiers of research in category theory. I do hope, however, that this makes me rather more attuned to the likely needs of (at least some) beginners. I go rather slowly over ideas that once gave me pause, spend more time than is always usual in motivating key ideas and constructions, and I have generally aimed to be as clear as possible (also, I assume rather less background mathematics than Leinster or even Awodey). We don’t get terribly far: however, I hope that what is here may prove useful to others starting to get to grips with category theory. My own experience certainly suggests that initially taking things at a rather gentle pace as you work into a familiarity with categorial ways of thinking makes later adventures exploring beyond the basics so very much more manageable.

Check the Category Theory – Reading List, also by Peter Smith, to make sure you have the latest version of this work.

Be an active reader!

If you spot issues with the text:

Corrections, please, to ps218 at cam dot ac dot uk.

At the category theory reading page Peter mentions having retired after forty years in academia.

Writing an introduction to category theory! What a great way to spend retirement!

(Well, different people have different tastes.)

Typesetting Mathematics According to the ISO Standard [Access to ISO 80000-2:2009]

Thursday, January 28th, 2016

Typesetting Mathematics According to the ISO Standard by Nick Higham.

From the post:

In The Princeton Companion to Applied Mathematics we used the conventions that the constants e (the base of the natural logarithm) and i (the imaginary unit), and the d in derivatives and integrals, are typeset in an upright font. These conventions are part of an ISO standard, ISO 80000-2:2009. The standard is little-known, though there is an excellent article about it in TUGboat by Claudio Beccari, and Kopka and Daly’s A Guide to LaTeX has a page on the standard (in section 7.4.10 of the fourth edition and section 5.4.10 of the third edition).
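
As a small illustration of the convention Nick describes, here is a minimal LaTeX sketch that sets e, i and the differential d upright with \mathrm. The two formulas are my own choice of examples, not taken from the standard or from Nick’s post, and no special packages are assumed.

```latex
\documentclass{article}
\begin{document}
% Constants e and i upright; variables stay italic.
\[ \mathrm{e}^{\mathrm{i}\pi} + 1 = 0 \]
% The d of derivatives and integrals upright as well.
\[ \frac{\mathrm{d}y}{\mathrm{d}x} = 2x, \qquad \int_0^1 2x \,\mathrm{d}x = 1 \]
\end{document}
```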

Nick mentions that you can get a copy of ISO 80000-2:2009 for about $150, and he also says:

However, it is easy to find a freely downloadable version via a Google search.

Let’s not be coy about this sort of thing: try

Every time an illegitimate privilege is acknowledged, it grows stronger.

I refuse to confer any legitimacy or recognition of legitimacy to restricted access to ISO 80000-2:2009.

And you?

Intuitionism and Constructive Mathematics 80-518/818 — Spring 2016

Saturday, January 9th, 2016

Intuitionism and Constructive Mathematics 80-518/818 — Spring 2016

From the course description:

In this seminar we shall read primary and secondary sources on the origins and developments of intuitionism and constructive mathematics from Brouwer and the Russian constructivists, Bishop, Martin-Löf, up to and including modern developments such as homotopy type theory. We shall focus both on philosophical and metamathematical aspects. Topics could include the Brouwer-Heyting-Kolmogorov (BHK) interpretation, Kripke semantics, topological semantics, the Curry-Howard correspondence with constructive type theories, constructive set theory, realizability, relations to topos theory, formal topology, meaning explanations, homotopy type theory, and/or additional topics according to the interests of participants.


  • Jean van Heijenoort (1967), From Frege to Gödel: A Source Book in Mathematical Logic 1879–1931, Cambridge, MA: Harvard University Press.
  • Michael Dummett (1977/2000), Elements of Intuitionism (Oxford Logic Guides, 39), Oxford: Clarendon Press, 1977; 2nd edition, 2000.
  • Michael Beeson (1985), Foundations of Constructive Mathematics, Heidelberg: Springer Verlag.
  • Anne Sjerp Troelstra and Dirk van Dalen (1988), Constructivism in Mathematics: An Introduction (two volumes), Amsterdam: North Holland.

Additional resources

Not online but a Spring course at Carnegie Mellon with a reading list that should exercise your mental engines!

Any subject with a two-volume “introduction” (Anne Sjerp Troelstra and Dirk van Dalen) is likely to be heavy sledding. 😉

But the immediate relevance to topic maps is evident from this statement by Rosalie Iemhoff:

Intuitionism is a philosophy of mathematics that was introduced by the Dutch mathematician L.E.J. Brouwer (1881–1966). Intuitionism is based on the idea that mathematics is a creation of the mind. The truth of a mathematical statement can only be conceived via a mental construction that proves it to be true, and the communication between mathematicians only serves as a means to create the same mental process in different minds.

I would recast that to say:

Language is a creation of the mind. The truth of a language statement can only be conceived via a mental construction that proves it to be true, and the communication between people only serves as a means to create the same mental process in different minds.

There are those who claim there is some correspondence between language and something they call “reality.” Since no one has experienced “reality” in the absence of language, I prefer to ask: Is X useful for purpose Y? rather than the doubtful metaphysics of “Is X true?”

Think of it as helping get down to what’s really important: what’s in this for you?

BTW, don’t be troubled by anyone who suggests this position removes all limits on discussion. What motivations do you think caused people to adopt the varying positions they have now?

It certainly wasn’t a detached and disinterested search for the truth, whatever people may pretend once they have found the “truth” they are presently defending. The same constraints will persist even if we are truthful with ourselves.

Math Translator Wanted/Topic Map Needed: Mochizuki and the ABC Conjecture

Monday, January 4th, 2016

What if you Discovered the Answer to a Famous Math Problem, but No One was able to Understand It? by Kevin Knudson.

From the post:

The conjecture is fairly easy to state. Suppose we have three positive integers a,b,c satisfying a+b=c and having no prime factors in common. Let d denote the product of the distinct prime factors of the product abc. Then the conjecture asserts roughly there are only finitely many such triples with c > d. Or, put another way, if a and b are built up from small prime factors then c is usually divisible only by large primes.

Here’s a simple example. Take a=16, b=21, and c=37. In this case, d = 2×3×7×37 = 1554, which is greater than c. The ABC conjecture says that this happens almost all the time. There is plenty of numerical evidence to support the conjecture, and most experts in the field believe it to be true. But it hasn’t been mathematically proven — yet.

Enter Mochizuki. His papers develop a subject he calls Inter-Universal Teichmüller Theory, and in this setting he proves a vast collection of results that culminate in a putative proof of the ABC conjecture. Full of definitions and new terminology invented by Mochizuki (there’s something called a Frobenioid, for example), almost everyone who has attempted to read and understand it has given up in despair. Add to that Mochizuki’s odd refusal to speak to the press or to travel to discuss his work and you would think the mathematical community would have given up on the papers by now, dismissing them as unlikely to be correct. And yet, his previous work is so careful and clever that the experts aren’t quite ready to give up.

It’s not clear what the future holds for Mochizuki’s proof. A small handful of mathematicians claim to have read, understood and verified the argument; a much larger group remains completely baffled. The December workshop reinforced the community’s desperate need for a translator, someone who can explain Mochizuki’s strange new universe of ideas and provide concrete examples to illustrate the concepts. Until that happens, the status of the ABC conjecture will remain unclear.

It’s hard to imagine a more classic topic map problem.

At some point, Shinichi Mochizuki shared a common vocabulary with his colleagues in number theory and arithmetic geometry but no longer.

As Kevin points out:

The December workshop reinforced the community’s desperate need for a translator, someone who can explain Mochizuki’s strange new universe of ideas and provide concrete examples to illustrate the concepts.

Taking Mochizuki’s present vocabulary and working backwards to where he shared a common vocabulary with colleagues is simple enough to say.

The crux of the problem is that discussions are going to be fragmented, distributed across a variety of formal and informal venues.

Combining those discussions to construct a path back to where most number theorists reside today would require something with as few starting assumptions as possible.

Something where you could describe as much or as little about new subjects and their relations to other subjects as is necessary for an expert audience to continue to fill in any gaps.

I’m not qualified to venture an opinion on the conjecture or Mochizuki’s proof but the problem of mapping from new terminology that has its own context back to “standard” terminology is a problem uniquely suited to topic maps.
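
The arithmetic in Kevin’s worked example is easy to check for yourself. The sketch below is mine, not from his post: `radical` and `abc_triple` are hypothetical helper names, and trial division is only sensible for small numbers. It computes d, the product of the distinct prime factors of abc, and reports whether c > d.

```python
from math import gcd


def radical(n: int) -> int:
    """Product of the distinct prime factors of n, by trial division."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # whatever is left is a single prime factor
        rad *= n
    return rad


def abc_triple(a: int, b: int) -> tuple[int, int, bool]:
    """Return (c, d, c > d) for coprime a, b with c = a + b, d = rad(abc)."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must have no prime factors in common")
    c = a + b
    d = radical(a * b * c)
    return c, d, c > d


print(abc_triple(16, 21))  # (37, 1554, False): the example from the post
print(abc_triple(1, 8))    # (9, 6, True): one of the rare triples with c > d
```

The second call shows the unusual case the conjecture is about: 1 + 8 = 9, where d = 2×3 = 6 is smaller than c.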

Street-Fighting Mathematics – Free Book – Lesson For Semanticists?

Friday, January 1st, 2016

Street-Fighting Mathematics: The Art of Educated Guessing and Opportunistic Problem Solving by Sanjoy Mahajan.

From the webpage:


In problem solving, as in street fighting, rules are for fools: do whatever works—don’t just stand there! Yet we often fear an unjustified leap even though it may land us on a correct result. Traditional mathematics teaching is largely about solving exactly stated problems exactly, yet life often hands us partly defined problems needing only moderately accurate solutions. This engaging book is an antidote to the rigor mortis brought on by too much mathematical rigor, teaching us how to guess answers without needing a proof or an exact calculation.

In Street-Fighting Mathematics, Sanjoy Mahajan builds, sharpens, and demonstrates tools for educated guessing and down-and-dirty, opportunistic problem solving across diverse fields of knowledge—from mathematics to management. Mahajan describes six tools: dimensional analysis, easy cases, lumping, picture proofs, successive approximation, and reasoning by analogy. Illustrating each tool with numerous examples, he carefully separates the tool—the general principle—from the particular application so that the reader can most easily grasp the tool itself to use on problems of particular interest. Street-Fighting Mathematics grew out of a short course taught by the author at MIT for students ranging from first-year undergraduates to graduate students ready for careers in physics, mathematics, management, electrical engineering, computer science, and biology. They benefited from an approach that avoided rigor and taught them how to use mathematics to solve real problems.

I have just started reading Street-Fighting Mathematics but I wonder if there is a parallel between mathematics and the semantics that everyone talks about capturing from information systems.

Consider this line:

Traditional mathematics teaching is largely about solving exactly stated problems exactly, yet life often hands us partly defined problems needing only moderately accurate solutions.

And re-cast it for semantics:

Traditional semantics (Peirce, FOL, SUMO, RDF) is largely about solving exactly stated problems exactly, yet life often hands us partly defined problems needing only moderately accurate solutions.

What if the semantics we capture and apply are sufficient for your use case? Complete with ROI for that use case.

Is that sufficient?

Estimating “known unknowns”

Saturday, December 12th, 2015

Estimating “known unknowns” by Nick Berry.

From the post:

There’s a famous quote from former Secretary of Defense Donald Rumsfeld:

“ … there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.”

I write this blog. I’m an engineer. Whilst I do my best and try to proof read, often mistakes creep in. I know there are probably mistakes in just about everything I write! How would I go about estimating the number of errors?

The idea for this article came from a book I recently read by Paul J. Nahin, entitled Duelling Idiots and Other Probability Puzzlers (In turn, referencing earlier work by the eminent mathematician George Pólya).

Proof Reading2

Imagine I write a (non-trivially short) document and give it to two proof readers to check. These two readers (independently) proof read the manuscript looking for errors, highlighting each one they find.

Just like me, these proof readers are not perfect. They, also, are not going to find all the errors in the document.

Because they work independently, there is a chance that reader #1 will find some errors that reader #2 does not (and vice versa), and there could be errors that are found by both readers. What we are trying to do is get an estimate for the number of unseen errors (errors detected by neither of the proof readers).*

*An alternate way of thinking of this is to get an estimate for the total number of errors in the document (from which we can subtract the distinct number of errors found to give an estimate to the number of unseen errors).

A highly entertaining post on estimating “known unknowns,” such as the number of errors in a paper that has been proofed by two independent proof readers.

Of more than passing interest to me because I am involved in a New Testament Greek Lexicon project that is an XML encoding of a 500+ page Greek lexicon.

The working text is in XML, but not every feature of the original lexicon was captured in markup and even if that were true, we would still want to improve upon features offered by the lexicon. All of which depend upon the correctness of the original markup.

You will find Nick’s analysis interesting and more than that, memorable. Just in case you are asked about “estimating ‘known unknowns’” in a data science interview.
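
For the interview-minded, here is a minimal sketch of one standard way to make the estimate, the capture-recapture (Lincoln-Petersen) formula. I am assuming it matches the approach in Nick’s post, so read his derivation for the details. It assumes the two readers work independently and that every error is equally likely to be caught. The function name and the counts (20, 15 and 10) are made up for illustration.

```python
def estimate_errors(found_by_1: int, found_by_2: int, found_by_both: int) -> dict:
    """Estimate total and unseen errors from two independent proof readers.

    If reader 1 catches a fraction p1 of all N errors and reader 2 a fraction
    p2, then about N*p1*p2 errors are caught by both, which rearranges to
    N ~ (found_by_1 * found_by_2) / found_by_both.
    """
    if found_by_both <= 0:
        raise ValueError("need at least one error found by both readers")
    total = found_by_1 * found_by_2 / found_by_both
    distinct_found = found_by_1 + found_by_2 - found_by_both
    return {
        "total": total,
        "distinct_found": distinct_found,
        "unseen": total - distinct_found,
    }


# Hypothetical counts: reader 1 finds 20, reader 2 finds 15, 10 are shared.
print(estimate_errors(20, 15, 10))
# {'total': 30.0, 'distinct_found': 25, 'unseen': 5.0}
```

The less the two readers overlap, the larger the estimate of what both of them missed.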

Only Rumsfeld could tell you how to estimate “unknown unknowns.” I think it goes: “Watch me pull a number out of my ….”


I found this post by following another post at this site, which was cited by Data Science Renee.

‘Outsiders’ Crack A 50-Year-Old Math Problem

Saturday, December 5th, 2015

‘Outsiders’ Crack A 50-Year-Old Math Problem by Erica Klarreich.

From the post:

In 2008, Daniel Spielman told his Yale University colleague Gil Kalai about a computer science problem he was working on, concerning how to “sparsify” a network so that it has fewer connections between nodes but still preserves the essential features of the original network.

Network sparsification has applications in data compression and efficient computation, but Spielman’s particular problem suggested something different to Kalai. It seemed connected to the famous Kadison-Singer problem, a question about the foundations of quantum physics that had remained unsolved for almost 50 years.

Over the decades, the Kadison-Singer problem had wormed its way into a dozen distant areas of mathematics and engineering, but no one seemed to be able to crack it. The question “defied the best efforts of some of the most talented mathematicians of the last 50 years,” wrote Peter Casazza and Janet Tremain of the University of Missouri in Columbia, in a 2014 survey article.

As a computer scientist, Spielman knew little of quantum mechanics or the Kadison-Singer problem’s allied mathematical field, called C*-algebras. But when Kalai, whose main institution is the Hebrew University of Jerusalem, described one of the problem’s many equivalent formulations, Spielman realized that he himself might be in the perfect position to solve it. “It seemed so natural, so central to the kinds of things I think about,” he said. “I thought, ‘I’ve got to be able to prove that.’” He guessed that the problem might take him a few weeks.

Instead, it took him five years. In 2013, working with his postdoc Adam Marcus, now at Princeton University, and his graduate student Nikhil Srivastava, now at the University of California, Berkeley, Spielman finally succeeded. Word spread quickly through the mathematics community that one of the paramount problems in C*-algebras and a host of other fields had been solved by three outsiders — computer scientists who had barely a nodding acquaintance with the disciplines at the heart of the problem.

Why all the excitement?

The proof of the Kadison-Singer problem implies that all the constructions in its dozen incarnations can, in principle, be carried out—quantum knowledge can be extended to full quantum systems, networks can be decomposed into electrically similar ones, matrices can be broken into simpler chunks. The proof won’t change what quantum physicists do, but it could have applications in signal processing, since it implies that collections of vectors used to digitize signals can be broken down into smaller frames that can be processed faster. The theorem “has potential to affect some important engineering problems,” Casazza said.

Just so you know, the people who are saying it will be years before practical results emerge from this breakthrough are the same ones who assumed the answer to this problem was negative. 😉

I’m not saying techniques based on this work will be in JavaScript libraries next year but without trying, they never will be.


I first saw this in a post by Lars Marius Garshol

‘Not a Math Person’: [Teaching/Communication Strategies]

Wednesday, December 2nd, 2015

‘Not a Math Person’: How to Remove Obstacles to Learning Math by Katrina Schwartz.

From the post:

Stanford math education professor Jo Boaler spends a lot of time worrying about how math education in the United States traumatizes kids. Recently, a colleague’s 7-year-old came home from school and announced he didn’t like math anymore. His mom asked why and he said, “math is too much answering and not enough learning.”

This story demonstrates how clearly kids understand that unlike their other courses, math is a performative subject, where their job is to come up with answers quickly. Boaler says that if this approach doesn’t change, the U.S. will always have weak math education.

“There’s a widespread myth that some people are math people and some people are not,” Boaler told a group of parents and educators gathered at the 2015 Innovative Learning Conference. “But it turns out there’s no such thing as a math brain.” Unfortunately, many parents, teachers and students believe this myth and it holds them up every day in their math learning.

Intriguing article that suggests the solution to the lack of students in computer science and mathematics may well be to work on changing the attitudes of students…about themselves as computer science or mathematics students.

Something to remember when users are having a hard time grasping your explanation of semantics and/or topic maps.

Oh, another high point in the article, our brains physically swell and shrink:

Neuroscientists now know that the brain has the ability to grow and shrink. This was demonstrated in a study of taxi drivers in London who must memorize all the streets and landmarks in downtown London to earn a license. On average it takes people 12 tries to pass the test. Researchers found that the hippocampus of drivers studying for the test grew tremendously. But when those drivers retired, the brain shrank. Before this, no one knew the brain could grow and shrink like that.

It is only year two of the Human Brain Project and now we know that one neuron can have thousands of synapses and that the infrastructure of the brain grows and shrinks. Information that wasn’t available at its start.

How do you succeed when the basic structure to be modeled keeps changing?

Perhaps that is why the Human Brain Project has no defined measure of “success”, other than spending all the allotted funds over a ten-year period. That I am sure they will accomplish.

Graphical Linear Algebra

Tuesday, November 24th, 2015

Graphical Linear Algebra by Pawel Sobocinski.

From Episode 1, Makélélé and Linear Algebra.

Linear algebra is the Claude Makélélé of science and mathematics. Makélélé is a well-known, retired football player, a French international. He played in the famous Real Madrid team of the early 2000s. That team was full of “galácticos” — the most famous and glamorous players of their generation. Players like Zidane, Figo, Ronaldo and Roberto Carlos. Makélélé was hardly ever in the spotlight, he was paid less than his more celebrated colleagues and was frequently criticised by fans and journalists. His style of playing wasn’t glamorous. To the casual fan, there wasn’t much to get excited about: he didn’t score goals, he played boring, unimaginative, short sideways passes, he hardly ever featured in match highlights. In 2003 he signed for Chelsea for relatively little money, and many Madrid fans cheered. But their team started losing matches.

The importance of Makélélé’s role was difficult to appreciate for the non-specialist. But football insiders regularly described him as the work-horse, the engine room, the battery of the team. He sat deep in midfield, was always in the right place to disrupt opposition attacks, recovered possession, and got the ball out quickly to his teammates, turning defence into attack. Without Makélélé, the galácticos didn’t look quite so galactic.

Similarly, linear algebra does not get very much time in the spotlight. But many galáctico subjects of modern scientific research: e.g. artificial intelligence and machine learning, control theory, solving systems of differential equations, computer graphics, “big data”, and even quantum computing have a dirty secret: their engine rooms are powered by linear algebra.

Linear algebra is not very glamorous. It is normally taught to science undergraduates in their first year, to prepare for the more exciting stuff ahead. It is background knowledge. Everyone has to learn what a matrix is, and how to add and multiply matrices.

I have only read the first three or four posts but Pawel’s posts look like a good way to refresh or acquire a “background” in linear algebra.

Math is important for “big data” and as Renee Teate reminded us in A Challenge to Data Scientists, bias can be lurking anywhere, data, algorithms, us, etc.

Or as I am fond of saying, “if you let me pick the data or the algorithm, I can produce a specified result, every time.”

Bear that in mind when someone tries to hurry past your questions about data, its acquisition, processing before you saw it, and/or wanting to know the details of an algorithm and how it was applied.

There’s a reason why people want to gloss over such matters and the answer isn’t a happy one, at least from the questioner’s perspective.

Refresh or get a background in linear algebra!

The more you know, the less vulnerable you will be to manipulation and/or fraud.

I first saw this in a tweet by Algebra Fact.

What’s the significance of 0.05 significance?

Tuesday, November 24th, 2015

What’s the significance of 0.05 significance? by Carl Anderson.

From the post:

Why do we tend to use a statistical significance level of 0.05? When I teach statistics or mentor colleagues brushing up, I often get the sense that a statistical significance level of α = 0.05 is viewed as some hard and fast threshold, a publishable / not publishable step function. I’ve seen grad students finish up an empirical experiment and groan to find that p = 0.052. Depressed, they head for the pub. I’ve seen the same grad students extend their experiment just long enough for statistical variation to swing in their favor to obtain p = 0.049. Happy, they head for the pub.

Clearly, 0.05 is not the only significance level used. 0.1, 0.01 and some smaller values are common too. This is partly related to field. In my experience, the ecological literature and other fields that are often plagued by small sample sizes are more likely to use 0.1. Engineering and manufacturing where larger samples are easier to obtain tend to use 0.01. Most people in most fields, however, use 0.05. It is indeed the default value in most statistical software applications.

This “standard” 0.05 level is typically associated with Sir R. A. Fisher, a brilliant biologist and statistician that pioneered many areas of statistics, including ANOVA and experimental design. However, the true origins make for a much richer story.

One of the best history/explanations of 0.05 significance I have ever read. Highly recommended!

In part because in the retelling of this story Carl includes references that will allow you to trace the story in even greater detail.

What is dogma today, 0.05 significance, started as a convention among scientists, without theory, without empirical proof, without any of the gatekeepers associated with scientific publishing today.

Over time 0.05 significance has proved its utility. The question for you is what other dogmas of today rely on the chance practices of yesteryear?
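
Carl’s grad students who extend an experiment until p dips under 0.05 are doing more damage than they may realize. The simulation below is my own rough sketch, not from his post. It assumes the simplest possible setting: a two-sided z-test on standard normal data with known unit variance, where the null hypothesis is true. It compares testing once at n = 100 with peeking after every 10 observations and stopping at the first p < 0.05. The function names, trial count and seed are arbitrary choices.

```python
import math
import random


def p_value_two_sided(sample):
    """Two-sided z-test p-value for mean 0, assuming known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(trials, peek_every, max_n, alpha=0.05, seed=0):
    """Share of null experiments declared 'significant' at some peek."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = []
        while len(sample) < max_n:
            # Collect another batch, then look at the p-value.
            sample.extend(rng.gauss(0.0, 1.0) for _ in range(peek_every))
            if p_value_two_sided(sample) < alpha:
                hits += 1
                break  # stop the experiment and head for the pub
    return hits / trials


fixed = false_positive_rate(trials=2000, peek_every=100, max_n=100)
peeking = false_positive_rate(trials=2000, peek_every=10, max_n=100)
print(f"test once at n=100:   {fixed:.3f}")
print(f"peek every 10 points: {peeking:.3f}")
```

Testing once should report a rate near the nominal 0.05, while repeated peeking should report a noticeably higher one, even though nothing real is there to be found.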

I first saw this in a tweet by Kirk Borne.

How to teach gerrymandering…

Tuesday, October 13th, 2015

How to teach gerrymandering and its many subtle, hard problems by Cory Doctorow.

From the post:

Ben Kraft teaches a unit on gerrymandering — rigging electoral districts to ensure that one party always wins — to high school kids in his open MIT Educational Studies Program course. As he describes the problem and his teaching methodology, I learned that district-boundaries have a lot more subtlety and complexity than I’d imagined at first, and that there are some really chewy math and computer science problems lurking in there.

Kraft’s pedagogy is lively and timely and extremely relevant. It builds from a quick set of theoretical exercises and then straight into contemporary, real live issues that matter to every person in every democracy in the world. This would be a great unit to adapt for any high school civics course — you could probably teach it in middle school, too.

Certainly timely considering that congressional elections are ahead (in the United States) in 2016.

Also a reminder that in real life situations, mathematics, algorithms, computers, etc., are never neutral.

The choices you make determine who will serve and who will eat.

It was ever thus and those who pretend otherwise are trying to hide their hand on the scale.
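
A toy example makes the point. The sketch below is mine, not from Ben Kraft’s unit: 15 voters, nine for party A and six for party B, are split into three districts of five in three different ways. The plan names and the `seats` helper are hypothetical. Every plan contains exactly the same votes, and only the boundaries differ.

```python
from collections import Counter


def seats(plan):
    """Count seats per party; each district goes to its plurality winner."""
    winners = Counter()
    for district in plan:
        party, _ = Counter(district).most_common(1)[0]
        winners[party] += 1
    return dict(winners)


# Each string is one district; each letter is one voter's party.
plans = {
    "A sweeps": ["AAABB", "AAABB", "AAABB"],
    "B majority": ["AAAAA", "AABBB", "AABBB"],
    "proportional": ["AAAAA", "AAAAB", "BBBBB"],
}

for name, plan in plans.items():
    # Same electorate every time: 9 votes for A, 6 for B.
    assert Counter("".join(plan)) == Counter(A=9, B=6)
    print(f"{name}: {seats(plan)}")
```

With 40% of the vote, party B ends up with zero, one or two of the three seats depending only on who draws the lines.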