Archive for the ‘Bugs’ Category

Cryptic genetic variation in software:…

Tuesday, October 14th, 2014

Cryptic genetic variation in software: hunting a buffered 41 year old bug by Sean Eddy.

From the post:

In genetics, cryptic genetic variation means that a genome can contain mutations whose phenotypic effects are invisible because they are suppressed or buffered, but under rare conditions they become visible and subject to selection pressure.

In software code, engineers sometimes also face the nightmare of a bug in one routine that has no visible effect because of a compensatory bug elsewhere. You fix the other routine, and suddenly the first routine starts failing for an apparently unrelated reason. Epistasis sucks.

I’ve just found an example in our code, and traced the origin of the problem back 41 years to the algorithm’s description in a 1973 applied mathematics paper. The algorithm — for sampling from a Gaussian distribution — is used worldwide, because it’s implemented in the venerable RANLIB software library still used in lots of numerical codebases, including GNU Octave. It looks to me that the only reason code has been working is that a compensatory “mutation” has been selected for in everyone else’s code except mine.


A bug hunting story to read and forward! Sean just bagged a forty-one (41) year old bug. What’s the oldest bug you have ever found?

When you reach the crux of the problem, you will understand why ambiguous, vague, incomplete and poorly organized standards annoy me to no end.

No guarantees of unambiguous results but if you need extra eyes on IT standards you know where to find me.

I first saw this in a tweet by Neil Saunders.

How to find bugs in MySQL

Sunday, April 20th, 2014

How to find bugs in MySQL by Roel Van de Paar.

From the post:

Finding bugs in MySQL is not only fun, it’s also something I have been doing the last four years of my life.

Whether you want to become the next Shane Bester (who is generally considered the most skilled MySQL bug hunter worldwide), or just want to prove you can outsmart some of the world’s best programmers, finding bugs in MySQL is a skill not reserved anymore to top QA engineers armed with a loads of scripts, expensive flash storage and top-range server hardware. Off course, for professionals that’s still the way to go, but now anyone with an average laptop and a standard HDD can have a lot of fun trying to find that elusive crash…

If you follow this post carefully, you may well be able to find a nice crashing bug (or two) running RQG (an excellent database QA tool). Linux would be the preferred testing OS, but if you are using Windows as your main OS, I would recommend getting Virtual Box and running a Linux guest in a suitably sized (i.e. large) VM. In terms of the acronym “RQG”, this stands for “Random Query Generator,” also named “randgen.”

If you’re not just after finding any bug out there (“bug hunting”), you can tune the RQG grammars (files that define what sort of SQL RQG executes) to more or less match your “issue area.” For example, if you are always running into a situation where the server crashes on a DELETE query (as seen at the end of the mysqld error log for example), you would want an SQL grammar that definitely has a variety of DELETE queries in it. These queries should be closely matched with the actual crashing query – crashes usually happen due to exactly the same, or similar statements with the same clauses, conditions etc.

Just in case you feel a bit old for an Easter egg hunt today, consider going on a MySQL bug hunt.

Curious, do you know of RQG-like suites for noSQL databases?

PS: RQG Documentation (github)

Bugs, features, and risk

Thursday, January 12th, 2012

Bugs, features, and risk by John D. Cook.

All software has bugs. Someone has estimated that production code has about one bug per 100 lines. Of course there’s some variation in this number. Some software is a lot worse, and some is a little better.

But bugs-per-line-of-code is not very useful for assessing risk. The risk of a bug is the probability of running into it multiplied by its impact. Some lines of code are far more likely to execute than others, and some bugs are far more consequential than others.

Devoting equal effort to testing all lines of code would be wasteful. You’re not going to find all the bugs anyway, so you should concentrate on the parts of the code that are most likely to run and that would produce the greatest harm if they were wrong.

Has anyone done error studies on RDF/OWL/LinkedData? Asking because obviously topic maps, Semantic Web, and other semantic applications are going to have errors.

Some obvious questions:

  • How does your application respond to bad data (errors)?
  • What data is most critical to be correct?
  • What is your acceptable error rate? (0 is not an acceptable answer)
  • What is the error rate for data entry with your application?

If you are interested in error correction, in semantic contexts or otherwise, start with General Error Detection, a set of pages maintained by Roy Panko.

From General Error Detection homepage:

Proofreading catches about 90% of all nonword spelling errors and about 70% of all word spelling errors. The table below shows that error detection varies widely by the type of task being done.

In general, our error detection rate only approaches 90% for simple mechanical errors, such as mistyping a number.

For logic errors, error detection is far worse, often 50% or less.

For omission errors, where we have left something out, correction rates are very low.