Day Nine of a Predictive Coding Narrative: A scary search for false-negatives, a comparison of my CAR with the Griswold’s, and a moral dilemma by Ralph Losey.
From the post:
In this sixth installment I continue my description, this time covering day nine of the project. Here I do a quality control review of a random sample to evaluate my decision in day eight to close the search.
Ninth Day of Review (4 Hours)
I began by generating a random sample of 1,065 documents from the entire null set (95% confidence level, ±3% confidence interval) of all documents not reviewed. I was going to review this sample as a quality control test of the adequacy of my search and review project. I would personally review all of them to see if any were False Negatives, that is, relevant documents that had been excluded, and, if relevant, whether any were especially significant or Highly Relevant.
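The 1,065 figure is consistent with the standard (Cochran) sample-size formula at a 95% confidence level and a ±3% margin of error, assuming maximum variability (p = 0.5); the exact result depends on rounding and on any finite-population correction applied for the size of the null set. A minimal sketch, in which the 500,000-document null set is a hypothetical figure, not one taken from the post:

```python
import math

def sample_size(z=1.96, margin=0.03, p=0.5, population=None):
    """Cochran's sample-size formula for estimating a proportion.

    z: z-score for the confidence level (1.96 for 95%)
    margin: desired margin of error (0.03 for +/-3%)
    p: assumed proportion; 0.5 maximizes required sample size
    population: if given, apply the finite-population correction
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(sample_size())                    # -> 1068 (no population correction)
print(sample_size(population=500_000))  # -> 1065 for a hypothetical 500,000-doc null set
```

With no correction the formula yields about 1,067–1,068; a finite-population correction for a null set in the hundreds of thousands of documents brings it down to roughly 1,065, matching the sample size reported in the post.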
I was looking to see if there were any documents left on the table that should have been produced. Remember that I had already personally reviewed all of the documents that the computer had predicted were likely to be relevant (51%+ probability). I considered the upcoming random sample review of the excluded documents to be a good way to check the accuracy of reliance on the computer's predictions of relevance.
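Sampling the null set this way is often called an elusion test: if k relevant documents turn up in a sample of n excluded documents, k/n estimates the fraction of relevant material left behind, and a confidence interval bounds that estimate. A sketch using the normal approximation, with a purely hypothetical count of 2 false negatives (the post does not report a number here):

```python
import math

def elusion_interval(found_relevant, n, z=1.96):
    """Point estimate and normal-approximation 95% CI for the elusion rate.

    For very small counts, an exact binomial (Clopper-Pearson)
    interval would be more defensible than this approximation.
    """
    p = found_relevant / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half), p + half

# Hypothetical: 2 false negatives found in the 1,065-document sample
rate, lo, hi = elusion_interval(2, 1065)
print(f"elusion rate {rate:.4f}, 95% CI [{lo:.4f}, {hi:.4f}]")
```

Note that a low elusion rate on its own does not guarantee high recall; if relevant documents are rare overall, even a small elusion percentage can represent a meaningful number of missed documents.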
I know it is not the only way, and there are other quality control measures that could be followed, but this one makes the most sense to me. Readers are invited to leave comments on the adequacy of this method and other methods that could be employed instead. I have yet to see a good discussion of this issue, so maybe we can have one here.
I can appreciate Ralph's apprehension at a hindsight review of decisions already made. In legal proceedings, decisions are made and the case moves forward. Some judgments or mistakes can be corrected; others simply become case history.