Bugs, features, and risk by John D. Cook.
All software has bugs. A common estimate is that production code has about one bug per 100 lines. Of course there’s some variation in this number. Some software is a lot worse, and some is a little better.
But bugs-per-line-of-code is not very useful for assessing risk. The risk of a bug is the probability of running into it multiplied by its impact. Some lines of code are far more likely to execute than others, and some bugs are far more consequential than others.
Devoting equal effort to testing all lines of code would be wasteful. You’re not going to find all the bugs anyway, so you should concentrate on the parts of the code that are most likely to run and that would produce the greatest harm if they were wrong.
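The risk calculation above can be sketched in a few lines. This is a minimal illustration, not a real tool: the module names, execution probabilities, and impact scores are all hypothetical.

```python
# Risk of a bug = probability of running into it x its impact.
# Rank hypothetical modules by expected risk to decide where to test first.

def expected_risk(prob_executed: float, impact: float) -> float:
    """Expected harm from a bug in this code region."""
    return prob_executed * impact

modules = {
    "payment_processing": (0.30, 100.0),  # hot path, severe consequences
    "report_formatting":  (0.50, 5.0),    # runs often, low-stakes output
    "legacy_importer":    (0.01, 50.0),   # rarely runs, moderate harm
}

# Highest expected risk first: that's where testing effort pays off most.
ranked = sorted(modules, key=lambda m: expected_risk(*modules[m]), reverse=True)
print(ranked)  # payment_processing outranks the others despite equal bug density
```

Note that raw bug counts would rank these modules very differently: a rarely executed importer can carry many bugs and still pose less risk than one bug on a payment path.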
Has anyone done error studies on RDF/OWL/Linked Data? I ask because topic maps, Semantic Web, and other semantic applications are obviously going to have errors.
Some obvious questions:
- How does your application respond to bad data (errors)?
- What data is most critical to be correct?
- What is your acceptable error rate? (0 is not an acceptable answer)
- What is the error rate for data entry with your application?
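The last two questions presume you can actually measure your error rate. Here is a minimal sketch of one way to do that for triple-shaped data; the records and the single validation rule (reject omissions) are hypothetical stand-ins for whatever checks your application needs.

```python
# Measure the observed error rate for entered data, assuming records are
# (subject, predicate, object) triples and an omission counts as an error.

def is_valid(triple):
    """Reject triples with a blank part -- an omission error."""
    return all(part.strip() for part in triple)

entered = [
    ("alice", "knows", "bob"),
    ("bob", "", "carol"),        # omitted predicate: bad data
    ("carol", "worksFor", ""),   # omitted object: bad data
    ("dave", "knows", "alice"),
]

errors = [t for t in entered if not is_valid(t)]
rate = len(errors) / len(entered)
print(f"error rate: {rate:.0%}")  # 2 of 4 records fail validation
```

A measured rate like this is only a floor: it catches the errors your rules can see, and (per Panko's numbers below) omission and logic errors are exactly the ones such checks miss most often.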
If you are interested in error correction, in semantic contexts or otherwise, start with General Error Detection, a set of pages maintained by Ray Panko.
From the General Error Detection homepage:
Proofreading catches about 90% of all nonword spelling errors and about 70% of all word spelling errors. Error detection varies widely by the type of task being done.
In general, our error detection rate only approaches 90% for simple mechanical errors, such as mistyping a number.
For logic errors, error detection is far worse, often 50% or less.
For omission errors, where we have left something out, correction rates are very low.