Archive for the ‘Skepticism’ Category

Dear “Skeptics,”… [Attn: All Data Scientists]

Tuesday, May 24th, 2016

Dear “Skeptics,” Bash Homeopathy and Bigfoot Less, Mammograms and War More by John Horgan.


Strings and multiverses can’t be experimentally detected. The theories aren’t falsifiable, which makes them pseudo-scientific, like astrology and Freudian psychoanalysis. Credit: parameter_bond/Flickr

The caption is from Horgan’s post. In case anyone asks, I retrieved and re-sized my own copy of the image.

From the post:

I hate preaching to the converted. If you were Buddhists, I’d bash Buddhism. But you’re skeptics, so I have to bash skepticism.

I’m a science journalist. I don’t celebrate science, I criticize it, because science needs critics more than cheerleaders. I point out gaps between scientific hype and reality. That keeps me busy, because, as you know, most peer-reviewed scientific claims are wrong.

So I’m a skeptic, but with a small S, not capital S. I don’t belong to skeptical societies. I don’t hang out with people who self-identify as capital-S Skeptics. Or Atheists. Or Rationalists.

When people like this get together, they become tribal. They pat each other on the back and tell each other how smart they are compared to those outside the tribe. But belonging to a tribe often makes you dumber.

Here’s an example involving two idols of Capital-S Skepticism: biologist Richard Dawkins and physicist Lawrence Krauss. Krauss recently wrote a book, A Universe from Nothing. He claims that physics is answering the old question, Why is there something rather than nothing?

Krauss’s book doesn’t come close to fulfilling the promise of its title, but Dawkins loved it. He writes in the book’s afterword: "If On the Origin of Species was biology’s deadliest blow to supernaturalism, we may come to see A Universe From Nothing as the equivalent from cosmology."

Just to be clear: Dawkins is comparing Lawrence Krauss to Charles Darwin. Why would Dawkins say something so foolish? Because he hates religion so much that it impairs his scientific judgment. He succumbs to what you might call “The Science Delusion.”

“The Science Delusion” is common among Capital-S Skeptics. You don’t apply your skepticism equally. You are extremely critical of belief in God, ghosts, heaven, ESP, astrology, homeopathy and Bigfoot. You also attack disbelief in global warming, vaccines and genetically modified food.

These beliefs and disbeliefs deserve criticism, but they are what I call “soft targets.” That’s because, for the most part, you’re bashing people outside your tribe, who ignore you. You end up preaching to the converted.

Meanwhile, you neglect what I call hard targets. These are dubious and even harmful claims promoted by major scientists and institutions. In the rest of this talk, I’ll give you examples of hard targets from physics, medicine and biology. I’ll wrap up with a rant about war, the hardest target of all.

To get the full flavor of what it means to be a skeptic, read this post and John’s accounts of the reactions to both his presentation and this post.

The “tell” of a target

Whether you are being skeptical of a popular (read “soft”) target like Bigfoot or skeptical of a “hard” target like psychiatric drugs, the reaction from believers is nearly universal: anger, denial and, fairly rapidly, denunciation of you as unreasonable.

Try being skeptical of a soft/hard target in your work.

Ask whether there is racial bias in the algorithms you use day to day. Gender bias? If the answer is no, ask how they know. Ask them to confirm it for you using data. Watch their hands closely during the demonstration.

After all, you are a data scientist and questions should be settled based on data and understanding the algorithms applied to them.
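One way to ask for that confirmation in data is to compare outcome rates across groups. The sketch below is only an illustration under stated assumptions: the record layout (`group`, `approved`) and the sample decisions are hypothetical, and the ratio is a rough screening number (the "four-fifths" rule of thumb treats a ratio below 0.8 as worth a closer look), not proof of bias or of its absence.

```python
from collections import defaultdict


def selection_rates(records, group_key="group", outcome_key="approved"):
    """Return the share of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        totals[row[group_key]] += 1
        if row[outcome_key]:
            positives[row[group_key]] += 1
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest group rate to the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a positive outcome
    return min(rates.values()) / highest


# Hypothetical decisions, made up purely for illustration.
decisions = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)
print(rates, round(ratio, 2))
```

A low ratio is a reason to ask more questions about the algorithm and its training data; a ratio near 1.0 on one outcome does not rule out bias elsewhere.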


Being a skeptic with a small “s” is a hard job. But your project, department, enterprise will be better for you being that skeptic.

Imagine one effective White House skeptic prior to the second war on Iraq. No $trillions spent, no countless lives lost, no instability in the region, etc. Skeptics with a small “s” can make all the difference in the world.

Experts, Sources, Peer Review, Bad Poetry and Flint, Michigan.

Sunday, January 31st, 2016

Red faces at National Archive after Baldrick poem published with WW1 soldiers’ diaries.

From the post:

Officials behind the launch of a major initiative detailing lives of ordinary soldiers during the First World War were embarrassed by the discovery that they had mistakenly included the work of Blackadder character, Baldrick, in the archive release.

The work, entitled ‘The German Guns’ and attributed to Private S.O. Baldrick, was actually written by the sitcom’s writers Richard Curtis and Ben Elton some 70 years after the end of the conflict. Elton was reported to be “delighted at the news” and friends said he was already checking to see if royalty payments may be due.

Although the archive release was scrutinised by experts, it is understood that the Baldrick poem was approved after a clerk recalled hearing Education Secretary Michael Gove referring to Baldrick in relation to the Great War, and assumed that he was of contemporary cultural significance.

Another illustration that experts and peer review aren’t the gold standards of correctness.

Or to put it differently: Mistakes happen, especially without sources.

If the only surviving information was Education Secretary Michael Gove referring to Baldrick, not only would the mistake be perpetuated but it would be immune to correction.

Citing and/or pointing to a digital resource that was the origin of the poem would be more likely to trip warnings (by date of publication) or contain a currently recognizable reference, such as Blackadder.
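The date-of-publication warning is easy to mechanize once a cited resource carries a first-publication year. A minimal sketch, assuming a hypothetical `Source` record and taking the war years as 1914 to 1918; the poem's year is approximated from the quoted "some 70 years after the end of the conflict" and is not an archival date.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    attributed_to: str
    first_published: int  # year the cited resource first appeared


def is_anachronistic(source, period_start, period_end):
    """True when the source first appeared outside the claimed period."""
    return not (period_start <= source.first_published <= period_end)


WAR_START, WAR_END = 1914, 1918

# Year approximated from the quoted "some 70 years after the end of
# the conflict"; an assumption for illustration only.
poem = Source("The German Guns", "Private S.O. Baldrick", WAR_END + 70)

if is_anachronistic(poem, WAR_START, WAR_END):
    print(f"Check '{poem.title}': first published {poem.first_published}")
```

The check only works if the source is recorded in the first place, which is the point of the post.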

The same lesson should be applied to reports such as Michael Moore’s claim:

1. While the Children in Flint Were Given Poisoned Water to Drink, General Motors Was Given a Special Hookup to the Clean Water. A few months after Gov. Snyder removed Flint from the clean fresh water we had been drinking for decades, the brass from General Motors went to him and complained that the Flint River water was causing their car parts to corrode when being washed on the assembly line. The governor was appalled to hear that GM property was being damaged, so he jumped through a number of hoops and quietly spent $440,000 to hook GM back up to the Lake Huron water, while keeping the rest of Flint on the Flint River water. Which means that while the children in Flint were drinking lead-filled water, there was one—and only one—address in Flint that got clean water: the GM factory.

Verification is especially important for me because I think Michael Moore is right and that predisposes me to accept his statements, without evidence.

In no particular order:

  • What “brass” from GM? Names, addresses, contact details. Links to statements?
  • What evidence did the “brass” present? Documents? Minutes of the meeting? Date?
  • What hoops did the Governor jump through? Who else in state government was aware of the request?
  • Where is the disbursement order for the $440,000 and related work orders?
  • Who was aware of any or all of these steps, in and out of government?

Those are some of the questions to ask to verify Michael Moore’s claim and, just as importantly, to lay a trail of knowledge and responsibility for the damage to the citizens of Flint.

Just because it was your job to hook GM back up to clean water, knowing that the citizens of Flint would be drinking water that corrodes auto parts, doesn’t make it right.

There are obligations that transcend personal interests or those of government.

Not poisoning innocents is one of those.

If there were sources for Michael’s account, people could start to be brought to justice. (See, sources really are important.)

A Timeline of Terrorism Warning: Incomplete Data

Wednesday, November 18th, 2015

A Timeline of Terrorism by Trevor Martin.

From the post:

The recent terrorist attacks in Paris have unfortunately once again brought terrorism to the front of many people’s minds. While thinking about these attacks and what they mean in a broad historical context I’ve been curious about if terrorism really is more prevalent today (as it feels), and if data on terrorism throughout history can offer us perspective on the terrorism of today.

In particular:

  • Have incidents of terrorism been increasing over time?
  • Does the amount of attacks vary with the time of year?
  • What type of attack and what type of target are most common?
  • Are the terrorist groups committing attacks the same over decades long time scales?

In order to perform this analysis I’m using a comprehensive data set on 141,070 terrorist attacks from 1970-2014 compiled by START.

Trevor writes a very good post and the visualizations are ones that you will find useful for this and other data.

However, there is a major incompleteness in Trevor’s data. If you follow the link for “comprehensive data set” and the FAQ you find there, you will find excluded from this data set:

Criterion III: The action must be outside the context of legitimate warfare activities.

So that excludes the equivalent of five Hiroshimas dropped on rural Cambodia (1969-1973), the first and second Iraq wars, the invasion of Afghanistan, numerous other acts of terrorism using cruise missiles and drones, all by the United States, to say nothing of the atrocities committed by Russia against a variety of opponents and other governments since 1970.

Depending on how you count separate acts, I would say the comprehensive data set is short by several orders of magnitude in accounting for all the acts of terrorism between 1970 and 2014.

If that additional data were added to the data set, I suspect (don’t know because the data set is incomplete) that who is responsible for more deaths and more terror would have a quite different result from that offered by Trevor.

So I don’t just idly complain, I will contact the United States Air Force to see if there are public records on how many bombing missions and how many bombs were dropped on Cambodia and in subsequent campaigns. That could be a very interesting data set all on its own.

Creating a genetic algorithm for beginners

Wednesday, September 16th, 2015

Creating a genetic algorithm for beginners by Lee Jacobson.

From the post:

A genetic algorithm (GA) is great for finding solutions to complex search problems. They’re often used in fields such as engineering to create incredibly high quality products thanks to their ability to search through a huge combination of parameters to find the best match. For example, they can search through different combinations of materials and designs to find the perfect combination of both which could result in a stronger, lighter and overall, better final product. They can also be used to design computer algorithms, to schedule tasks, and to solve other optimization problems. Genetic algorithms are based on the process of evolution by natural selection which has been observed in nature. They essentially replicate the way in which life uses evolution to find solutions to real world problems. Surprisingly although genetic algorithms can be used to find solutions to incredibly complicated problems, they are themselves pretty simple to use and understand.

How they work

As we now know they’re based on the process of natural selection, this means they take the fundamental properties of natural selection and apply them to whatever problem it is we’re trying to solve.

The basic process for a genetic algorithm is:

  1. Initialization – Create an initial population. This population is usually randomly generated and can be any desired size, from only a few individuals to thousands.
  2. Evaluation – Each member of the population is then evaluated and we calculate a ‘fitness’ for that individual. The fitness value is calculated by how well it fits with our desired requirements. These requirements could be simple, ‘faster algorithms are better’, or more complex, ‘stronger materials are better but they shouldn’t be too heavy’.
  3. Selection – We want to be constantly improving our population’s overall fitness. Selection helps us to do this by discarding the bad designs and only keeping the best individuals in the population. There are a few different selection methods but the basic idea is the same, make it more likely that fitter individuals will be selected for our next generation.
  4. Crossover – During crossover we create new individuals by combining aspects of our selected individuals. We can think of this as mimicking how sex works in nature. The hope is that by combining certain traits from two or more individuals we will create an even ‘fitter’ offspring which will inherit the best traits from each of its parents.
  5. Mutation – We need to add a little bit of randomness into our populations’ genetics otherwise every combination of solutions we can create would be in our initial population. Mutation typically works by making very small changes at random to an individual’s genome.
  6. And repeat! – Now we have our next generation we can start again from step two until we reach a termination condition.


There are a few reasons why you would want to terminate your genetic algorithm from continuing its search for a solution. The most likely reason is that your algorithm has found a solution which is good enough and meets a predefined minimum criteria. Other reasons for terminating could be constraints such as time or money.

A bit old, 2012, but it is a good introduction to genetic algorithms and if you read the comments (lots of those), you will find ports into multiple languages.

The important point to remember: when presented with genetic algorithm results, be sure to ask for the fitness criteria, selection method, termination condition and the number of generations run.

Personally I would ask for the starting population and code as well.

There are any number of ways to produce an “objective” result from simply running a genetic algorithm so adopt that Heinlein adage: “Always cut cards.”

Applies in data science as it does in moon colonies.
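To make those questions concrete, the six quoted steps can be sketched in a few lines. This is a toy under assumptions of my own choosing, not Lee Jacobson's code: the problem (evolve a bit string of all ones), tournament selection, single-point crossover, carrying the best individual forward, and every parameter value are illustrative. The fixed seed is what makes the starting population, and so the whole run, repeatable.

```python
import random


def run_ga(genome_len=20, pop_size=30, mutation_rate=0.02,
           max_generations=200, seed=0):
    """Evolve bit strings toward all ones; returns (best, generations)."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated

    def fitness(genome):
        return sum(genome)  # fitness criterion: count of 1 bits

    def tournament(population, k=3):
        # selection method: best of k randomly chosen individuals
        return max(rng.sample(population, k), key=fitness)

    # 1. Initialization: random starting population
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]

    for generation in range(max_generations):
        # 2. Evaluation
        best = max(population, key=fitness)
        # termination condition: a perfect genome was found
        if fitness(best) == genome_len:
            return best, generation
        next_population = [best]  # carry the best individual forward
        while len(next_population) < pop_size:
            # 3. Selection
            mother, father = tournament(population), tournament(population)
            # 4. Crossover: single cut point
            cut = rng.randint(1, genome_len - 1)
            child = mother[:cut] + father[cut:]
            # 5. Mutation: flip each bit with small probability
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        # 6. Repeat with the new generation
        population = next_population

    # termination condition: generation budget exhausted
    return max(population, key=fitness), max_generations


best, generations = run_ga()
print(sum(best), "of", len(best), "bits set after", generations, "generations")
```

Change the seed, the mutation rate or the generation limit and the reported "result" can change, which is why all of them belong in any write-up.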

New York City Subway Anthrax/Plague

Tuesday, May 5th, 2015

Spoiler Alert: This paper discusses a possible find of anthrax and plague DNA in the New York Subway. It concludes that either a related but harmless strain wasn’t considered and/or there was random sequencing error. In either case, it is a textbook example of the need for data skepticism.

Searching for anthrax in the New York City subway metagenome by Robert A Petit, III, Matthew Ezewudo, Sandeep J. Joseph, Timothy D. Read.

From the introduction:

In February 2015 Chris Mason and his team published an in-depth analysis of metagenomic data (environmental shotgun DNA sequence) from samples isolated from public surfaces in the New York City (NYC) subway system. Along with a ton of really interesting findings, the authors claimed to have detected DNA from the bacterial biothreat pathogens Bacillus anthracis (which causes anthrax) and Yersinia pestis (causes plague) in some of the samples. This predictably led to a huge interest from the press and scientists on social media. The authors followed up with a re-analysis of the data, where they showed some results that suggested the tools that they were using for species identification overcalled anthrax and plague.

The NYC subway metagenome study raised very timely questions about using unbiased DNA sequencing for pathogen detection. We were interested in this dataset as soon as the publication appeared and started looking deeper into why the analysis software gave false positive results and indeed what exactly was found in the subway samples. We decided to wrap up the results of our preliminary analysis and put it on this site. This report focuses on the results for B. anthracis but we also did some preliminary work on Y.pestis and may follow up on this later.

The article gives a detailed accounting of the tools and issues involved in the identification of DNA fragments from pathogens. It is hard core science but it also illustrates how iffy hard core science can be. Sure, you have the data, that doesn’t mean you will reach the correct conclusion from it.

The authors also mention a followup study by Chris Mason, the author of the original paper, entitled:

The long road from Data to Wisdom, and from DNA to Pathogen by Christopher Mason.

From the introduction:

There is an oft-cited progression: Data –>Information –>Knowledge –>Wisdom (DIKW). Just because you have data, it takes some processing to get quality information, and even good information is not necessarily knowledge, and knowledge often requires context or application to become wisdom.

And from his conclusion:

But, perhaps the bigger issue is social. I confess I grossly underestimated how the press would sensationalize these results, and even the Department of Health (DOH) did not believe it, claiming it simply could not be true. We sent the MTA and the DOH our first draft upon submission in October 2014, the raw and processed data, as well as both of our revised drafts in December 2014 and January 2015, and we did get some feedback, but they also had other concerns at the time (Ebola guy in the subway). This is also different from what they normally do (PCR for specific targets), so we both learned from each other. Yet, upon publication, it was clear that Twitter and blogs provided some of the same scrutiny as the three reviewers during the two rounds of peer review. But, they went even deeper and dug into the raw data, within hours of the paper coming online, and I would argue that online reviewers have become an invaluable part of scientific publishing. Thus, published work is effectively a living entity before (bioRxiv), during (online), and after publication (WSJ, Twitter, and others), and online voices constitute a critical, ensemble 4th reviewer.

Going forward, the transparency of the methods, annotations, algorithms, and techniques has never been more essential. To this end, we have detailed our work in the supplemental methods, but we have also posted complete pipelines in this blog post on how to go from raw data to annotated species on Galaxy. Even better, the precise virtual machines and instantiation of what was run on a server needs to be tracked and guaranteed to be 100% reproducible. For example, for our .vcf characterizations of the human alleles, we have set up our entire pipeline on Arvados/Curoverse, free to use, so that anyone can take a .vcf file and run the exact same ancestry analyses and get the exact same results. Eventually, tools like this can automate and normalize computational aspects of metagenomics work, which is an ever-increasingly important component of genomics.


Data –>Information –>Knowledge –>Wisdom (DIKW).

sounds like:

evidence based data science.

to me.

Another quick point: note that Chris Mason and team made all their data available for others to review, and Chris states that informal review was a valuable contributor to the scientific process.

That is an illustration of the value of transparency. Contrast that with the Obama Administration’s default position of opacity. Which one do you think serves a fact finding process better?

Perhaps that is the answer. The Obama administration isn’t interested in a fact finding process. It has found the “facts” that it wants and reaches its desired conclusions. What is there left to question or discuss?

Most misinformation inserted into Wikipedia may persist [Read Responsibly]

Tuesday, April 14th, 2015

Experiment concludes: Most misinformation inserted into Wikipedia may persist by Gregory Kohs.

A months-long experiment to deliberately insert misinformation into thirty different Wikipedia articles has been brought to an end, and the results may surprise you. In 63% of cases, the phony information persisted not for minutes or hours, but for weeks and months. Have you ever heard of Ecuadorian students dressed in formal three-piece suits, leading hiking tours of the Galapagos Islands? Did you know that during the testing of one of the first machines to make paper bags, two thumbs and a toe were lost to the cutting blade? And would it surprise you to learn that pain from inflammation is caused by the human body’s release of rhyolite, an igneous, volcanic rock?

None of these are true, but Wikipedia has been presenting these “facts” as truth now for more than six weeks. And the misinformation isn’t buried on seldom-viewed pages, either. Those three howlers alone have been viewed by over 125,000 Wikipedia readers thus far.

The second craziest thing of all may be that when I sought to roll back the damage I had caused Wikipedia, after fixing eight of the thirty articles, my User account was blocked by a site administrator. The most bizarre thing is what happened next: another editor set himself to work restoring the falsehoods, following the theory that a blocked editor’s edits must be reverted on sight.

Alex Brown tweeted this story along with the comment:

Wikipedia’s purported “self-correcting” prowess is more myth than reality

True, but not to pick on Wikipedia, the same is true for the benefits of peer review in general. A cursory survey of the posts at Retraction Watch will leave you wondering what peer reviewers are doing because it certainly isn’t reading assigned papers. At least not closely.

For historical references on peer review, see: Three myths about scientific peer review by Michael Nielsen.

Peer review is also used in grant processes, prompting the Wall Street Journal to call for lotteries to award NIH grants.

There are literally hundreds of other sources and accounts that demonstrate that whatever functions peer review may have, quality assurance isn’t one of them. I suspect “gate keeping,” by academics who are only “gate keepers,” is its primary function.

The common thread running through all of these accounts is that you and only you can choose to read responsibly.

As a reader: Read critically! Do the statements in an article, post, etc., fit with what you know about the subject? Or with general experience? What sources did the author cite? Simply citing Pompous Work I does not mean Pompous Work I said anything about the subject. Check the citations by reading the citations. (You will be very surprised in some cases.) After doing your homework, if you still have doubts, such as with reported experiments, contact the author and explain what you have done thus far and your questions (nicely).

Even agreement between Pompous Work I and the author doesn’t mean you don’t have a good question. Pompous works are corrected year in and year out.

As an author: Do not cite papers you have not read. Do not cite a paper merely because another author reported what it said. Verify that your citations do exist and that they in fact support your claims. Post all of your data publicly. (No caveats, claims without supporting evidence are simply noise.)

Data Checking: Charlie Hebdo March

Tuesday, January 13th, 2015

I won’t reproduce the photographs because newspapers are picky about that sort of information but be aware that the photos of dignitaries “marching” in Paris aren’t what they appear to be.

First, in Spot the difference: Female world leaders ‘Photoshopped’ out of Paris rally picture, Claire Cohen reports that an Israeli newspaper, The Announcer (HaMevaser), photoshopped out all the women in the original image.

But the “march” was fakery from the outset. They were assembled on an empty street with lots of police presence. See: Paris march: TV wide shots reveal a different perspective on world leaders at largest demonstration in France’s history by Adam Withnall.

A photograph of a faked march that was further falsified by the Israeli newspaper The Announcer (HaMevaser). Closer to being accurate?

University Administrations and Data Checking

Wednesday, January 7th, 2015

Axel Brennicke and Björn Brembs posted the following about university administrations in Germany.

Noam Chomsky, writing about the Death of American Universities, recently reminded us that reforming universities using a corporate business model leads to several easy to understand consequences. The increase of the precariat of faculty without benefits or tenure, a growing layer of administration and bureaucracy, or the increase in student debt. In part, this well-known corporate strategy serves to increase labor servility. The student debt problem is particularly obvious in countries with tuition fees, especially in the US where a convincing argument has been made that the tuition system is nearing its breaking point. The decrease in tenured positions is also quite well documented (see e.g., an old post). So far, and perhaps as may have been expected, Chomsky was dead on with his assessment. But how about the administrations?

To my knowledge, nobody has so far checked if there really is any growth in university administration and bureaucracy, apart from everybody complaining about it. So Axel Brennicke and I decided to have a look at the numbers. In Germany, employment statistics can be obtained from the federal statistics registry, Destatis. We sampled data from 2005 (the year before the Excellence Initiative and the Higher Education Pact) and the latest year we were able to obtain, 2012.

I’m sympathetic to the authors and their position, but that doesn’t equal verification of their claims about the data.

They have offered the data to anyone who wants to check: Raw Data for Axel Brennicke and Björn Brembs.

Granting the article doesn’t detail their analysis, after downloading the data, what’s next? How would you go about verifying statements made in the article?

If people get in the habit of offering data for verification and no one looks, what guarantee of correctness will that bring?

The data passes the first test: it is actually present at the download site. Don’t laugh, the NSA has trouble making that commitment.

Do note that the files have underscores in their names, which makes them appear to have spaces in their names. HINT: Don’t use underscores in file names. Ever.

The files are old style .xls files so just about anything recent should read them. Do be aware the column headers are in German.

The only description reads:

Employment data from DESTATIS about German university employment in 2005 and 2012

My first curiosity is the data being from two years only, 2005 and 2012. Just note that for now. What steps would you take with the data sets as they are?
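One possible first step, sketched under assumptions: once the two spreadsheets have been read into per-category head counts, compute the percent change for categories present in both years and list the categories that appear in only one, since a renamed or dropped category can quietly distort a comparison. The category labels and numbers below are placeholders invented for illustration; they are NOT taken from the Destatis files, whose German column headers are not reproduced here.

```python
def growth_by_category(earlier, later):
    """Percent change per category present in both years."""
    changes = {}
    for category in earlier.keys() & later.keys():
        base = earlier[category]
        if base == 0:
            continue  # percent change is undefined from a zero base
        changes[category] = 100.0 * (later[category] - base) / base
    return changes


def unmatched_categories(earlier, later):
    """Categories that appear in only one year and need a closer look."""
    return sorted(earlier.keys() ^ later.keys())


# Placeholder head counts, NOT taken from the Destatis files.
counts_2005 = {"group_x": 1000, "group_y": 500, "group_z": 50}
counts_2012 = {"group_x": 1200, "group_y": 650, "group_w": 80}

growth = growth_by_category(counts_2005, counts_2012)
for category, pct in sorted(growth.items()):
    print(f"{category}: {pct:+.1f}%")
print("only in one year:", unmatched_categories(counts_2005, counts_2012))
```

With only two sample years, a percent change says nothing about what happened in between, which is the curiosity noted above.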

I first saw this in a tweet by David Colquhoun.

Christmas Day: 1833

Thursday, December 25th, 2014

Charles Darwin’s voyage on Beagle unfolds online in works by ship’s artist by Maev Kennedy.


Slinging the monkey, Port Desire sketch by Conrad Martens on Christmas Day 1833 from Sketchbook III Photograph: Cambridge University Library

From the post:

On Christmas Day 1833, Charles Darwin and the crew of HMS Beagle were larking about at Port Desire in Patagonia, under the keen gaze of the ship’s artist, Conrad Martens.

The crew were mostly young men – Darwin himself, a recent graduate from Cambridge University, was only 22 – and had been given shore leave. Martens recorded them playing a naval game called Slinging the Monkey, which looks much more fun for the observers than the main participant. It involved a man being tied by his feet from a frame, swung about and jeered by his shipmates, until he manages to hit one of them with a stick, whereupon they change places.

Alison Pearn, of the Darwin Correspondence Project – which is seeking to assemble every surviving letter from and to the naturalist into a digital archive – said the drawings vividly brought to life one of the most famous voyages in the world. “It’s wonderful that everyone has the chance now to flick through these sketch books, in their virtual representation at the Cambridge digital library, and to follow the journey as Martens and Darwin actually saw it unfold.”

It would be a further 26 years before Darwin published his theory of evolution, On the Origin of Species by Means of Natural Selection, based partly on wildlife observations he made on board the Beagle. The voyage, and many of the people he met and the places he saw can be traced in scores of tiny lightning sketches made in pencil and watercolour by Martens – although unfortunately he joined the ship too late to record the weeping and hungover sailors in their chains – which have been placed online by Cambridge University library.

Anyone playing “slinging the monkey” at your house today?

If captured today, there would be megabytes if not gigabytes of cellphone video. But cellphone video would lack the perspective of the artist that captured a much broader scene than simply the game itself.

Video would give us greater detail about the game but at the loss of the larger context. What does that say about how to interpret body camera video? Does video capture “…what really happened?”

I first saw this in a tweet by the IHR, U. of London.

Michael Brown – Grand Jury Witness Index – Part 1

Wednesday, December 17th, 2014

I have completed the first half of the grand jury witness index for the Michael Brown case, covering volumes 1 – 12. (The index for volumes 13 – 24 is forthcoming.)

The properties with each witness, along with others, will be used to identify that witness using a topic map.

Donate here to support this ongoing effort.

  1. Volume 1 Page 25 Line: 7 – Medical legal investigator – His report is Exhibit #1. (in released documents, 2014-5143-narrative-report-01.pdf)
  2. Volume 2 Page 20 Line: 6 – Crime Scene Detective with St. Louis County Police
  3. Volume 3 Page 7 Line: 7 – Crime Scene Detective with St. Louis County Police – 22 years with St. Louis – 14 years as crime scene detective
  4. Volume 3 Page 51 Line: 12 – Forensic Pathologist – St Louis City Medical Examiner’s Office (assistant medical examiner)
  5. Volume 4 Page 17 Line: 7 – Dorian Johnson
  6. Volume 5 Page 12 Line: 9 – Police Sergeant – Ferguson Police – Since December 2001 (Volume 5 Page 14 – Prepared no written report)
  7. Volume 5 Page 75 Line: 11 – Detective St. Louis Police Department Two and 1/2 years
  8. Volume 5 Page 140 Line: 11 – Female FBI agent three and one-half years
  9. Volume 5 Page 196 Line: 23 – Darren Wilson (Volume 5 Page 197 talked to prosecutor before appearing)
  10. Volume 6 Page 149 Line: 18 – Witness #10
  11. Volume 6 Page 232 Line: 5 – Witness with marketing firm
  12. Volume 7 Page 9 Line: 1 – Canfield Green Apartments (female, no #)
  13. Volume 7 Page 153 Line: 9 – coming from a young lady’s house, passenger in white Monte Carlo
  14. Volume 8 Page 97 Line: 14 – Canfield Green Apartments, second floor, collecting Social Security, brother and his wife come over
  15. Volume 8 Page 173 Line: 9 – Detective St. Louis County Police Department – Since March 2008 (as detective) **primary case officer**
  16. Volume 8 Page 196 Line: 2 – Previously testified on Sept. 9th, page 7 Crime Scene Detective with St. Louis County Police – 22 years with St. Louis – 14 years as crime scene detective
  17. Volume 9 Page 7 Line: 7 – Sales consultant – Canfield Drive
  18. Volume 9 Page 68 Line: 15 – Visitor to Canfield Green Apartment Complex with wife
  19. Volume-10 Page 7 Line: 10 – Wife of witness in volume 9? visitor to complex
  20. Volume-10 Page 68 Line: 24 – Police officer, St. Louis County Police Department, assigned as a firearm and tool mark examiner in the crime laboratory.
  21. Volume-10 Page 128 Line: 8 – Detective, Crime Scene Unit for St. Louis County, 18 years as police officer, 3 years with crime scene – photographed Darren Wilson
  22. Volume-11 Page 6 Line: 21 – Canfield Apartment Complex, top floor, Living with girlfriend
  23. Volume-11 Page 59 Line: 7 – Girlfriend of witness at volume 11, page 6 – prosecutor has her renounce prior statements
  24. Volume-11 Page 80 Line: 7 – Drug chemist – crime lab
  25. Volume-11 Page 111 Line: 7 – Latent (fingerprint) examiner for the St. Louis County Police Department.
  26. Volume-11 Page 137 Line: 7 – Canfield Green Apartment Complex, fiancee for 3 1/2 to 4 years, south end of building, one floor above them, has children (boys)
  27. Volume-11 Page 169 Line: 16 – Doesn’t live at the Canfield Apartments, returning on August 9th to return?, in a van with husband, two daughters and granddaughter
  28. Volume-12 Page 11 Line: 7 – Husband of the witness driving the van, volume 11, page 169
  29. Volume-12 Page 51 Line: 15 – Special agent with the FBI assigned to the St. Louis field office, almost 24 years
  30. Volume-12 Page 102 Line: 18 – Lives in Northwinds Apartments, white ’99 Monte Carlo
  31. Volume-12 Page 149 Line: 6 – Contractor, retaining wall and brick patios
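Since every entry above follows the same "Volume N Page N Line: N – description" shape, the index can be read by a program as well as by a person. What follows is only a minimal sketch in Python, not part of the original index: the names IndexEntry, ENTRY_RE and parse_entry are hypothetical, and the pattern assumes that the two spellings seen above ("Volume 5" and "Volume-10") are the only variants.

```python
import re
from typing import NamedTuple, Optional


class IndexEntry(NamedTuple):
    """One witness entry: where the testimony starts, plus the description."""
    volume: int
    page: int
    line: int
    description: str


# Hypothetical pattern; accepts both "Volume 5 ..." and "Volume-10 ...".
# The separator before the description may be an en dash or a hyphen.
ENTRY_RE = re.compile(
    r"Volume[ -](\d+)\s+Page\s+(\d+)\s+Line:\s*(\d+)\s+[–-]\s+(.*)"
)


def parse_entry(text: str) -> Optional[IndexEntry]:
    """Return the parsed entry, or None if the text is not an index entry."""
    match = ENTRY_RE.search(text)
    if match is None:
        return None
    volume, page, line, description = match.groups()
    return IndexEntry(int(volume), int(page), int(line), description.strip())


# Two entries copied from the list above.
entries = [
    parse_entry("5. Volume 4 Page 17 Line: 7 – Dorian Johnson"),
    parse_entry("24. Volume-11 Page 80 Line: 7 – Drug chemist – crime lab"),
]

for entry in entries:
    print(entry)
```

Keeping volume, page and line as integers means the entries can be sorted or matched against transcript page headers without further string handling.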

Caution: This list presents witnesses as they appeared and does not include the playing of prior statements and interviews. Those will be included in a separate index of statements because they play a role in identifying the witnesses who appeared before the grand jury.

The outcome of the Michael Brown grand jury was not the fault of the members of the grand jury. It was a result that was engineered by departing from usual and customary practices, distortion of evidence and misleading the grand jury about applicable law, among other things. All of that is hiding in plain sight in the grand jury transcripts.

Other Michael Brown Posts

Missing From Michael Brown Grand Jury Transcripts December 7, 2014. (The witness index I propose to replace.)

New recordings, documents released in Michael Brown case [LA Times Asks If There’s More?] Yes! December 9, 2014 (before the latest document dump on December 14, 2014).

Michael Brown Grand Jury – Presenting Evidence Before Knowing the Law December 10, 2014.

How to Indict Darren Wilson (Michael Brown Shooting) December 12, 2014.

More Missing Evidence In Ferguson (Michael Brown) December 15, 2014.

Michael Brown – Grand Jury Witness Index – Part 1 December 17, 2014. (above)

More Missing Evidence In Ferguson (Michael Brown)

Monday, December 15th, 2014

Saturday’s data dump from St. Louis County Prosecutor Robert McCulloch is still short at least two critical pieces of evidence. There is no copy of the “documents that we gave you to help in your deliberation.” And, there is no copy of the police map to “…guide the grand jury.”

I. The “documents that we gave you to help in your deliberation”:

The prosecutors gave the grand jury written documents that supplemented their various oral misstatements of the law in this case.

From Volume 24 - November 21, 2014 - Page  138: 

2 You have all the information you need in 

3 those documents that we gave you to help in your 

4 deliberation. 

That follows a verbal misstatement of the law by Ms. Whirley:

Volume 24 - November 21, 2014 - Page  137


13 	    MS. WHIRLEY: Is that in order to vote 

14 true bill, you also must consider whether you 

15 believe Darren Wilson, you find probable cause, 

16 that's the standard to believe that Darren Wilson 

17 committed the offense and the offenses are what is 

18 in the indictment and you must find probable cause 

19 to believe that Darren Wilson did not act in lawful 

20 self-defense, and you've got the last sheet talks 

21 about self-defense and talks about officer's use of 

22 force, because then you must also have probable 

23 cause to believe that Darren Wilson did not use 

24 lawful force in making an arrest. So you are 

25 considering self-defense and use of force in making 

Volume 24 - November 21, 2014 - Page  138 

Grand Jury — Ferguson Police Shooting Grand Jury 11/21/2014 

1 an arrest.

Where are the “documents that we gave you to help in your deliberation?”

Have you seen those documents? I haven’t.

And consider this additional misstatement of the law:

Volume 24 - November 21, 2014 - Page  139 

8 And the one thing that Sheila has 

9 explained as far as what you must find and as she 

10 said, it is kind of in Missouri it is kind of, the 

11 State has to prove in a criminal trial, the State 

12 has to prove that the person did not act in lawful 

13 self-defense or did not use lawful force in making, 

14 it is kind of like we have to prove the negative. 

15 So in this case because we are talking 

16 about probable cause, as we've discussed, you must 

17 find probable cause to believe that he committed the 

18 offense that you're considering and you must find 

19 probable cause to believe that he did not act in 

20 lawful self-defense. Not that he did, but that he 

21 did not and that you find probable cause to believe 

22 that he did not use lawful force in making the 

23 arrest. 

Just for emphasis:

the State has to prove that the person did not act in lawful self-defense or did not use lawful force in making, it is kind of like we have to prove the negative.

How hard is it to prove a negative? James Randi, in James Randi Lecture @ Caltech – Cant Prove a Negative, points out that proving a negative is a logical impossibility.

The grand jury was given a logically impossible task in order to indict Darren Wilson.

What choice did the grand jury have but to return a “no true bill?”

More Misguidance: The police map, Grand Jury 101

A police map was created to guide the jury in its deliberations, a map that reflected the police view of the location of witnesses.

Volume 24 - November 21, 2014 - Page  26 

Grand Jury — Ferguson Police Shooting Grand Jury 11/21/2014 


10	 Q (By Ms. Alizadeh) Extra, okay, that's 

11 right. And you indicated that you, along with other 

12 investigators prepared this, which is your 

13 interpretation based upon the statements made of 

14 witnesses as to where various eyewitnesses were 

15 during, when I say shooting, obviously, there was a 

16 time period that goes along, the beginning of the 

17 time of the beginning of the incident until after 

18 the shooting had been done. And do you still feel 

19 that this map accurately reflects where witnesses 

20 said they were? 

21 A I do. 

22	 Q And just for your instruction, this just, 

23 this map is for your purposes in your deliberations 

24 and if you disagree with anything that's on the map, 

25 these little sticky things come right off. So 

Volume 24 - November 21, 2014 - Page  27 

Grand Jury — Ferguson Police Shooting Grand Jury 11/21/2014 

1 supposedly they come right off. 

2 A They do. 

3	 Q If you feel that this witness is not in 

4 the right place, you can move any of these stickers 

5 that you want and put them in the places where you 

6 think they belong. 

7 This is just something that is 

8 representative of what this witness believes where 

9 people were. If you all do with this what you will. 

10 Also there was a legend that was 

11 provided for all of you regarding the numbers 

12 because the numbers that were assigned witnesses are 

13 not the same numbers as the witnesses testimony in 

14 this grand jury. 


Two critical statements:


11... And you indicated that you, along with other 

12 investigators prepared this, which is your 

13 interpretation based upon the statements made of 

14 witnesses as to where various eyewitnesses were 

15 during, when I say shooting,

So the map represents the detective’s opinion about other witnesses, and:

3	 Q If you feel that this witness is not in 

4 the right place, you can move any of these stickers 

5 that you want and put them in the places where you 

6 think they belong.

The witness gave the grand jury a map to guide its deliberations, but we will never know what map that was, because the stickers can be moved.

Pretty neat trick, giving the grand jury guidance that can never be disclosed to others.


You have seen the quote from the latest data dump from the prosecutor’s office:

McCulloch apologized in a written statement for any confusion that may have occurred by failing to initially release all of the interview transcripts. He said he believes he has now released all of the grand jury evidence, except for photos of Brown’s body and anything that could lead to witnesses being identified.

The written instructions to the grand jury and the now unknowable map (Grand Jury 101) aren’t pictures of Brown’s body or anything that could identify a witness. Where are they?

Please make a donation to support further research on the grand jury proceedings concerning Michael Brown. Future work will include:

  • A witness index to the grand jury transcripts
  • An exhibit index to the grand jury transcripts
  • Analysis of the grand jury transcript for patterns by the prosecuting attorneys, both expected and unexpected
  • A concordance of the grand jury transcripts
  • Suggestions?

Donations will enable continued analysis of the grand jury transcripts, which, along with other evidence, may establish a pattern of conduct that was not happenstance or coincidence but was, in fact, enemy action.

Thanks for your support!

Other Michael Brown Posts

Missing From Michael Brown Grand Jury Transcripts December 7, 2014. (The witness index I propose to replace.)

New recordings, documents released in Michael Brown case [LA Times Asks If There’s More?] Yes! December 9, 2014 (before the latest document dump on December 14, 2014).

Michael Brown Grand Jury – Presenting Evidence Before Knowing the Law December 10, 2014.

How to Indict Darren Wilson (Michael Brown Shooting) December 12, 2014.

More Missing Evidence In Ferguson (Michael Brown) December 15, 2014. (above)

How to Indict Darren Wilson (Michael Brown Shooting)

Friday, December 12th, 2014

The Missouri Attorney General’s office needs to remove St. Louis County Prosecuting Attorney Robert P. McCulloch from the Michael Brown case. Then it needs to convene a grand jury that is led to represent the public’s interest and not that of Darren Wilson.

As we saw in Michael Brown Grand Jury – Presenting Evidence Before Knowing the Law, an indictment of Darren Wilson for second degree murder in the death of Michael Brown only requires probable cause (“a reasonable belief that a person has committed a crime”) to find that:

  1. Darren Wilson (a person)
  2. intentionally shot (knowingly causes)
  3. to kill Michael Brown (another person) or
  4. to inflict serious injury on Michael Brown (another person)
  5. and Michael Brown dies (death)

It need not be a long and drawn out grand jury like the first one.

Just in case the Missouri Attorney General takes my advice (yeah, right), here is a thumbnail sketch to avoid a repetition of the prior defective grand jury process.

First witness, the chief investigating officer. Establish a scale map of the area and the locations of Darren Wilson’s vehicle, Darren Wilson’s claimed position and the final location of Michael Brown.

A map something like:

Michael Brown map

(See this map in full at:, it was authored by Richard Johnson.)

Elicit the following facts from the chief investigating officer:

  1. Michael Brown was in fact unarmed.
  2. Officer Darren Wilson said that he shot Michael Brown. (hearsay is admissible in grand jury proceedings)
  3. Officer Darren Wilson was also armed with police issued Mace at the time of the shooting.
  4. Officer Darren Wilson had pursued Michael Brown for over 100 feet from any initial contact.
  5. Michael Brown’s body had no traces of Mace on it.
  6. Officer Darren Wilson’s issued Mace was unused.
  7. Michael Brown was shot eight (8) times, three of them in the head.
  8. The medical examiner concluded that Michael Brown died as a result of gunshot wounds on 9 August 2014.

It is unnecessary, but to give the grand jury the human side of the story, call the witness from the second floor of the apartment building who testified to the grand jury:

Volume 8 – September 30, 2014 – Page 114

23 A Okay. Then my brother noticed, he said

24 wait a minute, looks like they’re struggling. We

25 are looking at the car, we can see them tussling,

Volume 8 – September 30, 2014 – Page 115

1 all right. His head was above the truck for a

2 moment and then it went below it.

3 Q Okay.

4 A All right. And it was still tussling.

5 His friend had backed up a step back on the

6 sidewalk, then we heard a shot. His friend ran this

7 direction, Michael ran to this driveway right here,

8 beside this building.

9 Q Just so we can be clear, this street is

10 Copper Creek Court?

11 A Right.

12 Q So you are saying, you had the pointer,

13 the little laser ——

14 A Right, right here.

15 Q —— at the corner of Canfield Drive and

16 Copper Creek Court?

17 A Right, he had ran towards this way. As

18 he’s running ——

19 Q He’s running east down Canfield?

20 A As he’s running this way, the officer got

21 out of his truck, came around from the back, got to

22 this side where he was now on the driver’s side

23 because he had a clear line of Michael over here.

24 Then he assumed his position with the

25 pistol. As he turned around, as he came around, he

Volume 8 – September 30, 2014 – Page 116

1 was coming up with the gun. He held the gun up like

2 this. (indicating) When he got to here, Michael was

3 standing right on the grass and he was like looking

4 down at his body.

5 Q Okay. Let me stop you here. At this

6 point have you seen anything in Michael’s hands?

7 A No.

8 Q When he was stopped, when they were

9 talking down the street, did you see anything in his

10 hands?

11 A No.

12 Q How about the other boy, anything in his

13 hands?

14 A No.

15 Q They weren’t carrying anything that you

16 saw?

17 A No.

18 Q And then you said, you know how important

19 some of this gesturing has been, right?

20 A Uh-huh, right.

21 Q So they are here to actually witness what

22 you are going to do. And so you say when Michael

23 Brown gets to, is he in the grass actually?

24 A He’s is standing at the very edge. Okay.

25 The driveways are blacktop, he is stopped right at

Volume 8 – September 30, 2014 – Page 117

1 the blacktop right, at the very edge.

2 Q Okay.

3 A His back was turned to the officer.

4 Q Okay.

5 A And he had his hands like this, like he’s

6 looking down at his body to see.

7 Q Okay. Can I ask you to stand up that will

8 really help them to see what you’re doing and he’s

9 stopped now?

10 A He’s stopped with his back towards the

11 officer and he stopped and he was doing this. As he

12 was trying to see where he was shot.

13 Q Okay.

14 A All right.

15 Q Uh-huh.

16 A As he was turning, at that time the

17 officer had already been around to the back of his

18 truck and got into his spot. By the time he got

19 there, while Michael was there, he was slowly

20 turning around and the officer said stop. When

21 Michael turned around, he just put his hands up like

22 this. They were shoulder high, they weren’t above

23 his head, but he did have them up. He had them out

24 like this, all right, palms facing him like this.

25 The officer said stop again. Michael

Volume 8 – September 30, 2014 – Page 118

1 then took a step, a few steps it took for him to get

2 from that blacktop to the street. When he stepped

3 out on the street, the officer said stop one more

4 time and then he fired. He fired three to four

5 shots. When he hit him, he went back. Can I stand?

6 Q Sure.

7 A When he hit him he, did like this, and he

8 went like, like his balance —— he started staggering

9 and he looked up at the officer like why.

10 Q Now, just to be clear, you can’t hear him

11 say anything?

12 A I can’t hear him say that, but he’s

13 looking at him and he is doing, you know. So then

14 as he’s stopped, he’s trying to steady, he starts

15 staggering, my brother says, he’s not going to stand

16 up, he’s getting ready to fall, he’s getting ready

17 to fall.

18 He looks like he was trying to stay

19 on his feet, and he started staggering toward the

20 police officer and he still had his hands up.

21 At some point between the officer’s

22 truck, which by that time this is about 30, 35 feet,

23 when he reached out into the street, he started

24 walking toward the officer, the officer took three

25 steps back and he yelled out stop to Michael again

Volume 8 – September 30, 2014 – Page 119

1 three times.

2 Michael’s steadily walking toward

3 him. More or less to me and to my brothers, he was

4 staggering.

5 Q Okay. To your brothers, did you have more

6 than one brother?

7 A Well, I mean my brother. I didn’t mean to

8 say brothers, my brother. He was staggering, you

9 know. And as he was staggering forward, his head,

10 his body kind of went down at an angle. He was like

11 this, more or less fighting to stay up. You could

12 see his legs wobbling.

13 Q Were his hands the way you had them?

14 A His hands were coming down like this, all

15 right. And he had his head up and he’s facing the

16 officer like this and he is steadily moving, and the

17 officer was moving back, stop. He yelled stop the

18 third time, he let off four more shops, but as he

19 was firing, Michael was falling. After he stopped

20 firing, Michael, he went down face first, smack.

What do you think? Probable cause for:

  1. Darren Wilson (a person)
  2. intentionally shot – 8 times (knowingly causes)
  3. to kill Michael Brown (another person) or
  4. to inflict serious injury on Michael Brown (another person)
  5. and Michael Brown dies (death)

Unless you think a police officer yelling “stop” is a license to kill, there is more than enough evidence for probable cause to indict for second degree murder. Total grand jury time, perhaps a day or a day and a half.

Should the grand jury ask about self-defense, lawful arrest, etc., the proper response is that all of those are great questions, but under Missouri law the responsibility to answer those questions resides with the trier of fact, whether it is a judge or jury. In a trial, both sides are represented, with a judge to ensure that each has an opportunity to present its side. In a grand jury proceeding, only the State is represented, so it would be unfair for the State to attempt to represent both sides.

Don’t be fooled into “accepting” the grand jury’s decision. Another grand jury can and should be chosen to properly consider the Michael Brown shooting. Even more importantly, all those connected to the first grand jury should be investigated to determine who decided to throw the first grand jury. I can’t believe that an assistant prosecutor made that decision all on their own.

Michael Brown Grand Jury – Presenting Evidence Before Knowing the Law

Wednesday, December 10th, 2014

News coverage of the Michael Brown grand jury has proceeded like the prosecution in the case. It has been “look at this,” “now look at that,” with no rhyme or reason to the presentation. Big mistakes were made but in context, a pattern emerges that does not appear to be the result of chance or incompetence.

That pattern includes things that are missing but are expected in any grand jury proceeding.

For example, did you know the grand jurors were never told what laws might apply to this case until the very end? And even there we don’t know what was said to the jurors.

4 GRAND JUROR: So you are going to give us 

5 those guidelines for us? 

6 	    MS. WHIRLEY: Right . 

7 	    MS. ALIZADEH: We're not going to give you 

8 the facts and say if he did this and then this, if 

9 you believe this, then this. But we're going to 

10 give you what the law says when a law officer can 

11 use force to affect an arrest and when that force 

12 can be deadly. And then also when a person can use 

13 force to defend themselves and when that force can 

14 be deadly. 

15 There is all kind of things about whether 

16 or not the person is an initial aggressor, you know. 

17 And under the law, a law enforcement officer can be 

18 an initial aggressor, unless his arrest is unlawful. 

This exchange happens in Volume 24, page 108, lines 4-18. Problem is, we don’t know what “laws” were actually given to the grand jury or in what form. More missing “evidence.”

Notice that the prosecutors deviated from the normal pattern of grand jury proceedings.

When the Grand Jury meets, the district attorney or an assistant district attorney designated by the district attorney will either read or explain the proposed Indictment (sometimes referred to as a Bill of Indictment) to the Grand Jury and will acquaint them with the witnesses who will testify. This is done to allow the Grand Jurors to familiarize themselves with the parties involved in case one or more members are disqualified to serve (see p. 21 and 22). (Grand Jury Handbook, page 25) (To the same effect but in federal grand juries, Antitrust Division Grand Jury Practice Manual page IV-2)

Outlining the law to a grand jury sets a context in which they place evidence and separate the important from the trivial or even irrelevant. You can scour all twenty-four volumes, volume one in particular, and you will find no such assistance for the grand jury in this case.

I will outline the laws that should have been given to the grand jury at the outset of this investigation and then the consequences of not having those laws all along will be more evident.

Please shout if I fail to give specific references and/or hyperlinks to resources that I cite. I am less interested in your hearing my summary than I am in providing you with the ability to see the primary materials for yourself. (Another characteristic of a well authored topic map.)

There are two possible charges that could have been given to the grand jury (well, to a properly assisted grand jury) in this case: first and second degree murder. Let’s look at the laws in both cases.

First Degree Murder

First degree murder, penalty–person under sixteen years of age not to receive death penalty.

565.020. 1. A person commits the crime of murder in the first degree if he knowingly causes the death of another person after deliberation upon the matter.

The elements of first degree murder are:

  • person commits
  • knowingly causes
  • death of another person
  • after deliberation on the matter

You may have heard the term “premeditated” murder before. Essentially, it means that someone plans to murder another person and then carries it out. There’s no specific time limit required for deliberation.

As a tactical matter, a prosecutor would not give the grand jury a first degree murder indictment in this case because there is no evidence of deliberation. The only reason for giving it in this case is to get the grand jury accustomed to the idea of not returning a true bill on any charge.

For the Michael Brown grand jury, absent some evidence that Darren Wilson knew and had some plan to murder Michael Brown, I would leave this one out.

Second Degree Murder

Until December 31, 2016–Second degree murder, penalty.

565.021. 1. A person commits the crime of murder in the second degree if he:

(1) Knowingly causes the death of another person or, with the purpose of causing serious physical injury to another person, causes the death of another person; or

(omitted language on murder in the course of commission of a felony as irrelevant)

The elements of second degree murder are:

  • person commits
  • knowingly causes
  • death of another person
  • or with purpose of serious injury
  • causes the death of another person

This illustrates the reason for instructing the grand jury on the law before they start hearing evidence. It enables them to sort out useful from non-useful testimony and evidence.

For example, do you see anything in the elements of second degree murder that allows killing of another person if the other person has been smoking marijuana? Or does it permit killing of another person for jaywalking? Or if a jaywalker runs away? What if you “tussle” with a police officer? Fair game? No, it doesn’t say any of those things.

Think about reading the grand jury transcripts and marking out witnesses and evidence that aren’t relevant to the elements:

  • person commits
  • knowingly causes
  • death of another person
  • or with purpose of serious injury
  • causes the death of another person

Not today, but I will be annotating that list with points in the transcript that provide “probable cause” for each of those elements.
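One way to hold such annotations is a table keyed by element, with a list of transcript citations under each. What follows is only a minimal sketch in Python, not the annotation itself: Citation, new_annotations, annotate and unsupported are hypothetical names, and the single citation in the usage lines is a placeholder chosen for illustration, not a judgment about which testimony supports which element.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Citation:
    """A span of transcript lines on a single page (hypothetical structure)."""
    volume: int
    page: int
    first_line: int
    last_line: int

    def __str__(self) -> str:
        return (f"Volume {self.volume}, page {self.page}, "
                f"lines {self.first_line}-{self.last_line}")


# The elements of second degree murder, as listed in this post.
ELEMENTS = [
    "person commits",
    "knowingly causes",
    "death of another person",
    "or with purpose of serious injury",
    "causes the death of another person",
]


def new_annotations() -> Dict[str, List[Citation]]:
    """Start with every element present and no citations attached."""
    return {element: [] for element in ELEMENTS}


def annotate(annotations: Dict[str, List[Citation]],
             element: str, citation: Citation) -> None:
    """Attach a citation to an element; reject anything not in ELEMENTS."""
    if element not in annotations:
        raise KeyError(f"not an element of the offense: {element!r}")
    annotations[element].append(citation)


def unsupported(annotations: Dict[str, List[Citation]]) -> List[str]:
    """List the elements that still have no citation."""
    return [element for element in ELEMENTS if not annotations[element]]


# Usage: the citation below is illustrative only.
annotations = new_annotations()
annotate(annotations, "knowingly causes", Citation(8, 118, 2, 5))
print(unsupported(annotations))
```

Rejecting unknown keys is the point of the design: testimony that fits none of the listed elements has nowhere to go in the table, which is the sorting of useful from non-useful evidence described above.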

You will have noticed from the quoted portion of the transcript that defense counsel ALIZADEH gives the jury instructions on use of force in self-defense and by a police officer.

That’s not a typo, I really mean defense counsel ALIZADEH. Why? I have appended the full statute provisions at the end of this post but in part:


Use of force in defense of persons provides in part:

5. The defendant shall have the burden of injecting the issue of justification under this section.

Who raised it? Defense counsel ALIZADEH.

Force by a police officer

Until December 31, 2016–Law enforcement officer’s use of force in making an arrest provides in part:

4. The defendant shall have the burden of injecting the issue of justification under this section.

Who raised it? Defense counsel ALIZADEH.

Voluntary Manslaughter

I suspect the jury was also instructed on voluntary manslaughter, which was also inappropriate because like the other statutes, Until December 31, 2016–Voluntary manslaughter, penalty–under influence of sudden passion, defendant’s burden to inject provides that:

2. The defendant shall have the burden of injecting the issue of influence of sudden passion arising from adequate cause under subdivision (1) of subsection 1 of this section.

“Sudden passion from adequate cause” under Missouri law is a defense to second degree murder. What that means is that if you are charged with second degree murder, the trial jury (not the grand jury) can find you guilty of voluntary manslaughter as a responsive verdict. See: Until December 31, 2016–Lesser degree offenses of first and second degree murder–instruction on lesser offenses, when. (And for your convenience, below.)

Again, must be raised by and probably was raised by Defense counsel ALIZADEH.


The only facts that the grand jury had to find probable cause for in its hearings and deliberations were:

  • person commits (Darren Wilson)
  • knowingly causes (not accidental, on purpose)
  • death of another person (Michael Brown’s death)
  • or with purpose of serious injury (multiple wounds)
  • causes the death of another person (Michael Brown’s death)

That’s it in a nutshell.

The trial jury or judge alone reaches decisions on self-defense, force by a police officer, “sudden passion from adequate cause,” and other issues. Not a grand jury.

Knowing the law, review the transcripts to say whether there was probable cause or not.

PS: Sorry, almost forgot:

The best-known definition of probable cause is “a reasonable belief that a person has committed a crime”

From Probable Cause at Princeton University.

If you think shooting an unarmed person eight times leads to a reasonable belief a crime has been committed, then you would return a true bill for second degree murder.

Supplemental Missouri statutes

Use of force in defense of persons (563.031), Law enforcement officer’s use of force in making an arrest (563.046), Voluntary manslaughter (565.023), and Lesser degree offenses of first and second degree murder (565.025), below.

Use of force in defense of persons.

563.031. 1. A person may, subject to the provisions of subsection 2 of this section, use physical force upon another person when and to the extent he or she reasonably believes such force to be necessary to defend himself or herself or a third person from what he or she reasonably believes to be the use or imminent use of unlawful force by such other person, unless:

(1) The actor was the initial aggressor; except that in such case his or her use of force is nevertheless justifiable provided:

(a) He or she has withdrawn from the encounter and effectively communicated such withdrawal to such other person but the latter persists in continuing the incident by the use or threatened use of unlawful force; or

(b) He or she is a law enforcement officer and as such is an aggressor pursuant to section 563.046; or

(c) The aggressor is justified under some other provision of this chapter or other provision of law;

(2) Under the circumstances as the actor reasonably believes them to be, the person whom he or she seeks to protect would not be justified in using such protective force;

(3) The actor was attempting to commit, committing, or escaping after the commission of a forcible felony.

2. A person may not use deadly force upon another person under the circumstances specified in subsection 1 of this section unless:

(1) He or she reasonably believes that such deadly force is necessary to protect himself, or herself or her unborn child, or another against death, serious physical injury, or any forcible felony;

(2) Such force is used against a person who unlawfully enters, remains after unlawfully entering, or attempts to unlawfully enter a dwelling, residence, or vehicle lawfully occupied by such person; or

(3) Such force is used against a person who unlawfully enters, remains after unlawfully entering, or attempts to unlawfully enter private property that is owned or leased by an individual claiming a justification of using protective force under this section.

3. A person does not have a duty to retreat from a dwelling, residence, or vehicle where the person is not unlawfully entering or unlawfully remaining. A person does not have a duty to retreat from private property that is owned or leased by such individual.

4. The justification afforded by this section extends to the use of physical restraint as protective force provided that the actor takes all reasonable measures to terminate the restraint as soon as it is reasonable to do so.

5. The defendant shall have the burden of injecting the issue of justification under this section. If a defendant asserts that his or her use of force is described under subdivision (2) of subsection 2 of this section, the burden shall then be on the state to prove beyond a reasonable doubt that the defendant did not reasonably believe that the use of such force was necessary to defend against what he or she reasonably believed was the use or imminent use of unlawful force.

Until December 31, 2016–Law enforcement officer’s use of force in making an arrest.

563.046. 1. A law enforcement officer need not retreat or desist from efforts to effect the arrest, or from efforts to prevent the escape from custody, of a person he reasonably believes to have committed an offense because of resistance or threatened resistance of the arrestee. In addition to the use of physical force authorized under other sections of this chapter, he is, subject to the provisions of subsections 2 and 3, justified in the use of such physical force as he reasonably believes is immediately necessary to effect the arrest or to prevent the escape from custody.

2. The use of any physical force in making an arrest is not justified under this section unless the arrest is lawful or the law enforcement officer reasonably believes the arrest is lawful.

3. A law enforcement officer in effecting an arrest or in preventing an escape from custody is justified in using deadly force only

(1) When such is authorized under other sections of this chapter; or

(2) When he reasonably believes that such use of deadly force is immediately necessary to effect the arrest and also reasonably believes that the person to be arrested

(a) Has committed or attempted to commit a felony; or

(b) Is attempting to escape by use of a deadly weapon; or

(c) May otherwise endanger life or inflict serious physical injury unless arrested without delay.

4. The defendant shall have the burden of injecting the issue of justification under this section.

Until December 31, 2016–Voluntary manslaughter, penalty–under influence of sudden passion, defendant’s burden to inject.

565.023. 1. A person commits the crime of voluntary manslaughter if he:

(1) Causes the death of another person under circumstances that would constitute murder in the second degree under subdivision (1) of subsection 1 of section 565.021, except that he caused the death under the influence of sudden passion arising from adequate cause; or

(2) Knowingly assists another in the commission of self-murder.

2. The defendant shall have the burden of injecting the issue of influence of sudden passion arising from adequate cause under subdivision (1) of subsection 1 of this section.

Until December 31, 2016–Lesser degree offenses of first and second degree murder–instruction on lesser offenses, when.

565.025. 1. With the exceptions provided in subsection 3 of this section and subsection 3 of section 565.021, section 556.046 shall be used for the purpose of consideration of lesser offenses by the trier in all homicide cases.

2. The following lists shall comprise, in the order listed, the lesser degree offenses:

(1) The lesser degree offenses of murder in the first degree are:

(a) Murder in the second degree under subdivisions (1) and (2) of subsection 1 of section 565.021;

(b) Voluntary manslaughter under subdivision (1) of subsection 1 of section 565.023; and

(c) Involuntary manslaughter under subdivision (1) of subsection 1 of section 565.024;

(2) The lesser degree offenses of murder in the second degree are:

(a) Voluntary manslaughter under subdivision (1) of subsection 1 of section 565.023; and

(b) Involuntary manslaughter under subdivision (1) of subsection 1 of section 565.024.

3. No instruction on a lesser included offense shall be submitted unless requested by one of the parties or the court.

New recordings, documents released in Michael Brown case [LA Times Asks If There’s More?] Yes!

Tuesday, December 9th, 2014

Ferguson, Mo.: New recordings, documents released in Michael Brown case

James Queally and Maria L. La Ganga write for the Los Angeles Times:

It remains unclear whether all of the documents and transcripts connected to the grand jury investigation have been made public. Emails and phone calls to the St. Louis County Prosecutor’s Office late Monday were not immediately returned. Grand jury proceedings are usually secret, but McCulloch had pledged to release the evidence if Wilson was not indicted. (emphasis added)

I can answer that question without asking the St. Louis County Prosecutor’s Office.


For example and only as an example:

Read Grand Jury Volume 24, at page 69, lines 21-25:

21 now you've completed your police report in this
22 case; is that right?
23 A I have.
24 Q How many pages is your police report?
25 A I don't know exactly, 1,100, 1,200

So, where are the 1,100 to 1,200 pages of report by the crime scene detective who testified three (3) times before the grand jury?

Not present in the documents released thus far.

There are other documents missing, some of them even more critical than this report but I will cover those in other posts.

Data Skepticism: Citations

Sunday, December 7th, 2014

There are two recent posts on citation practices that merit comparison.

The first is Citations for sale by Megan Messerly, which reads in part:

The U.S. News and World Report rankings have long been regarded as the Bible of university reputation metrics.

But when the outlet released its first global rankings in October, many were surprised. UC Berkeley, which typically hovers in the twenties in the national pecking order, shot to third in the international arena. The university also placed highly in several subjects, including first place in math.

Even more surprising, though, was that a little-known university in Saudi Arabia, King Abdulaziz University, or KAU, ranked seventh in the world in mathematics — despite the fact that it didn’t have a doctorate program in math until two years ago.

“I thought this was really bizarre,” said UC Berkeley math professor Lior Pachter. “I had never heard of this university and never heard of it in the context of mathematics.”

As he usually does when rankings are released, Pachter received a round of self-congratulatory emails from fellow faculty members. He, too, was pleased that his math department had ranked first. But he was also surprised that his school had edged out other universities with reputable math departments, such as MIT, which did not even make the top 10.

For the sake of ranking

It was enough to inspire Pachter to conduct his own review of the newly minted rankings. His inquiry revealed that KAU had aggressively recruited professors from a list of top scientists with the most frequently referenced papers, often referred to as highly cited researchers.

“The more I’ve learned, the more shocked and disgusted I’ve been,” Pachter said.

Citations are an indicator of academic clout, but they are also a crucial metric used in compiling several university rankings. There may be many reasons for hiring highly cited researchers, but rankings are one clear result of KAU’s investment. The worry, some researchers have said, is that citations and, ultimately, rankings may be KAU’s primary aim. KAU did not respond to repeated requests for comment via phone and email for this article.

On Halloween, Pachter published his findings about KAU’s so-called “highly-cited researcher program” in a post on his blog. It elicited many responses from his colleagues in the comment section, some of whom had experience working with KAU.

Pachter refers to earlier work of his own that makes claims about ranking universities highly suspect, so one wonders: why bother?

I first saw this in a tweet by Lior Pachter.

In any event, you should also consider: Best Papers vs. Top Cited Papers in Computer Science (since 1996)

From the post:

The score in the bracket after each conference represents its average MAP score. MAP (Mean Average Precision) is a measure to evaluate the ranking performance. The MAP score of a conference in a year is calculated by viewing best papers of the conference in the corresponding year as the ground truth and the top cited papers as the ranking results.

Check the numbers out (the hyperlinks take you to the section in question):

AAAI (0.16) | ACL (0.13) | ACM MM (0.17) | ACSAC (0.27) | ALT (0.07) | APSEC (0.33) | ASIACRYPT (0.16) | CHI (0.2) | CIKM (0.19) | COMPSAC (0.6) | CONCUR (0.09) | CVPR (0.25) | CoNEXT (0.16) | DAC (0.07) | DASFAA (0.27) | DATE (0.11) | ECAI (0.0) | ECCV (0.42) | ECOOP (0.22) | EMNLP (0.14) | ESA (0.4) | EUROCRYPT (0.07) | FAST (0.18) | FOCS (0.07) | FPGA (0.59) | FSE (0.4) | HPCA (0.31) | HPDC (0.59) | ICALP (0.2) | ICCAD (0.13) | ICCV (0.07) | ICDE (0.48) | ICDM (0.13) | ICDT (0.25) | ICIP (0.0) | ICME (0.43) | ICML (0.12) | ICRA (0.16) | ICSE (0.24) | IJCAI (0.11) | INFOCOM (0.18) | IPSN (0.69) | ISMAR (0.57) | ISSTA (0.33) | KDD (0.33) | LICS (0.26) | LISA (0.07) | MOBICOM (0.09) | MobiHoc (0.02) | MobiSys (0.06) | NIPS (0.0) | NSDI (0.13) | OSDI (0.24) | PACT (0.37) | PLDI (0.3) | PODS (0.13) | RTAS (0.03) | RTSS (0.29) | S&P (0.09) | SC (0.14) | SCAM (0.5) | SDM (0.18) | SEKE (0.09) | SIGCOMM (0.1) | SIGIR (0.14) | SIGMETRICS (0.14) | SIGMOD (0.08) | SODA (0.12) | SOSP (0.41) | SOUPS (0.24) | SPAA (0.14) | STOC (0.21) | SenSys (0.4) | UIST (0.32) | USENIX ATC (0.1) | USENIX Security (0.18) | VLDB (0.18) | WSDM (0.2) | WWW (0.09) |
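To make the MAP scores above concrete, here is a minimal sketch of the calculation as the quoted post describes it: best papers are the ground truth and the citation ranking is the result list. The paper identifiers are hypothetical, and the normalization (dividing by the number of best papers, so a best paper missing from the ranking counts as zero) is an assumption; the original post may use a cutoff or a different convention.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant items.

    ranked_ids: paper ids ordered by citation count (most cited first),
                assumed to contain no duplicates.
    relevant_ids: ids of the papers that won a best paper award.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, paper_id in enumerate(ranked_ids, start=1):
        if paper_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a best paper appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(years):
    """Mean of the per-year average precision scores for one conference."""
    scores = [average_precision(ranked, best) for ranked, best in years]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: two years of one conference.
example_years = [
    (["p1", "p2", "p3", "p4", "p5"], {"p2", "p5"}),  # best papers at ranks 2 and 5
    (["q1", "q2", "q3"], {"q1"}),                    # best paper is the most cited
]

if __name__ == "__main__":
    print(mean_average_precision(example_years))
```

A score near 1.0 means the best papers are also the most cited; a score near 0.0 means the award committee and the citation counts disagree almost completely.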

Universities and their professors conferred validity on the capricious ratings of U.S. News and World Report. Pachter’s own research has shown the ratings to be nearly fictional for comparison purposes. Yet at the same time, Pachter decries what he sees as gaming of the rating system.

Crying “foul” in a game of capricious ratings, a game that favors one’s own university, seems quite odd. Social practices at KAU may differ from universities in the United States but being ethnocentric about university education isn’t a good sign for university education in general.

Missing From Michael Brown Grand Jury Transcripts

Sunday, December 7th, 2014

What’s missing from the Michael Brown grand jury transcripts? Index pages. For 22 out of 24 volumes of grand jury transcripts, the index page is missing. Here’s the list:

  • volume 1 – page 4 missing
  • volume 2 – page 4 missing
  • volume 3 – page 4 missing
  • volume 4 – page 4 missing
  • volume 5 – page 4 missing
  • volume 6 – page 4 missing
  • volume 7 – page 4 missing
  • volume 8 – page 4 missing
  • volume 9 – page 4 missing
  • volume 10 – page 4 missing
  • volume 11 – page 4 missing
  • volume 12 – page 4 missing
  • volume 13 – page 4 missing
  • volume 14 – page 4 missing
  • volume 15 – page 4 missing
  • volume 16 – page 4 missing
  • volume 17 – page 4 missing
  • volume 18 – page 4 missing
  • volume 19 – page 4 missing
  • volume 20 – page 4 missing
  • volume 21 – page 4 present
  • volume 22 – page 4 missing
  • volume 23 – page 4 missing
  • volume 24 – page 4 present

As you can see from the indexes in volumes 21 and 24, they are not terribly useful but better than combing twenty-four volumes (4799 pages of text) to find where a witness testifies.

Someone (court reporter?) made a conscious decision to take action that makes the transcripts harder to use.

Perhaps this is, as they say, “chance.”

Stay tuned for posts later this week that upgrade that to “coincidence” and beyond.

Is prostitution really worth £5.7 billion a year? [Data Skepticism]

Monday, November 10th, 2014

Is prostitution really worth £5.7 billion a year? by David Spiegelhalter.

From the post:

The EU has demanded rapid payment of £1.7 billion from the UK because our economy has done better than predicted, and some of this is due to the prostitution market now being considered as part of our National Accounts and contributing an extra £5.3 billion to GDP at 2009 prices, which is 0.35% of GDP, half that of agriculture. But is this a reasonable estimate?

This £5.3 billion figure was assessed by the Office of National Statistics in May 2014 based on the following assumptions, derived from this analysis. To quote the ONS:

  • Number of prostitutes in UK: 61,000
  • Average cost per visit: £67
  • Clients per prostitute per week: 25
  • Number of weeks worked per year: 52

Multiply these up and you get £5.3 billion at 2009 prices, around £5.7 billion now.

An excellent example of data skepticism. Taking commonly available data, David demonstrates the “£5.7 billion a year” claim depends on 400,000 Englishmen visiting prostitutes every three (3) days. Existing data on use of prostitutes suggests that figure is far too high.
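The multiplication behind the ONS figure quoted above is easy to check for yourself. A minimal sketch, using only the four assumptions listed in the quote (the function and variable names are mine, not the ONS’s):

```python
def annual_market_value(prostitutes, price_per_visit, clients_per_week, weeks_per_year):
    """Multiply the four ONS assumptions to get annual spending in pounds."""
    return prostitutes * price_per_visit * clients_per_week * weeks_per_year


# Figures as quoted from the ONS (2009 prices).
PROSTITUTES = 61_000
PRICE_PER_VISIT_GBP = 67
CLIENTS_PER_WEEK = 25
WEEKS_PER_YEAR = 52

total_gbp = annual_market_value(
    PROSTITUTES, PRICE_PER_VISIT_GBP, CLIENTS_PER_WEEK, WEEKS_PER_YEAR
)
# Paid visits per week implied by the same assumptions.
visits_per_week = PROSTITUTES * CLIENTS_PER_WEEK

if __name__ == "__main__":
    print(f"Annual value: £{total_gbp:,}")
    print(f"Implied visits per week: {visits_per_week:,}")
```

The product comes to £5,313,100,000, which matches the £5.3 billion in the quote. The implied 1,525,000 paid visits every week is the number to test against survey data.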

There are other problems with the data. See David’s post for the details.

BTW, there was some quibbling about the price for prostitutes, as in being too low. Perhaps the authors of the original estimate were accustomed to government subsidized prostitutes. 😉

Should prostitution pricing come up in your data analysis, one source (not necessarily a reliable one) is Havocscope Prostitution Prices. The price for a UK street prostitute is listed in U.S. dollars at $20.00, even lower than the original estimate. Holding the total fixed, that would dramatically increase the number of required visits, by about a factor of five (5).
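The factor of five can be sketched as follows. If total spending is held fixed, the number of visits scales inversely with price. The exchange rate below is an assumption on my part (roughly 1.6 U.S. dollars per pound, typical of late 2014), not a figure from either source:

```python
ONS_PRICE_GBP = 67.0          # average cost per visit, from the ONS assumptions
HAVOCSCOPE_PRICE_USD = 20.0   # UK street price as listed by Havocscope
ASSUMED_USD_PER_GBP = 1.6     # ASSUMPTION: approximate exchange rate, late 2014


def visit_multiplier(ons_price_gbp, alt_price_usd, usd_per_gbp):
    """How many times more visits are needed at the lower price
    to reach the same total spending."""
    ons_price_usd = ons_price_gbp * usd_per_gbp
    return ons_price_usd / alt_price_usd


multiplier = visit_multiplier(ONS_PRICE_GBP, HAVOCSCOPE_PRICE_USD, ASSUMED_USD_PER_GBP)

if __name__ == "__main__":
    print(f"Visits would need to rise by a factor of about {multiplier:.1f}")
```

Under that assumed rate the multiplier is a little over five. A different exchange rate moves the number, but not the conclusion.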

Core Econ: a free economics textbook

Wednesday, November 5th, 2014

Core Econ: a free economics textbook by Cathy O’Neil.

From the post:

Today I want to tell you guys about, a free (although you do have to register) textbook my buddy Suresh Naidu is using this semester to teach out of and is also contributing to, along with a bunch of other economists.

(image omitted)

It’s super cool, and I wish a class like that had been available when I was an undergrad. In fact I took an economics course at UC Berkeley and it was a bad experience – I couldn’t figure out why anyone would think that people behaved according to arbitrary mathematical rules. There was no discussion of whether the assumptions were valid, no data to back it up. I decided that anybody who kept going had to be either religious or willing to say anything for money.

Not much has changed, and that means that Econ 101 is a terrible gateway for the subject, letting in people who are mostly kind of weird. This is a shame because, later on in graduate level economics, there really is no reason to use toy models of society without argument and without data; the sky’s the limit when you get through the bullshit at the beginning. The goal of the Core Econ project is to give students a taste for the good stuff early; the subtitle on the webpage is teaching economics as if the last three decades happened.

Skepticism of government economic forecasts and data requires knowledge of the lingo and assumptions of economics. This introduction won’t get you to that level but it is a good starting place.


Suppressing Authentic Information

Monday, November 3rd, 2014

In my continuing search for information on the authenticity of Dabiq (see: Dabiq, ISIS and Data Skepticism) I encountered Slick, agile and modern – the IS media machine by Mina Al-Lami.

Mina makes it clear that IS (ISIL/ISIS) has been the target of a campaign to shut down all authentic outlets for news from the group:

IS has always relied heavily on hordes of online supporters to amplify its message. But their role has become increasingly important in recent months as the group’s official presence on a variety of social media platforms has been shut down and moved underground.

The group’s ability to keep getting its message out in the face of intensive counter-measures is due to the agility, resilience and adaptability of this largely decentralized force.

Until July this year, IS, like most jihadist groups, had a very strong presence on Twitter, with all its central and regional media outlets officially active on the platform. However, its military successes on the ground in Iraq and Syria in June triggered a concerted and sustained clampdown on the group’s accounts.

IS was initially quick to replace these accounts, in what became a game of whack-a-mole between IS and the Twitter administration. But by July the group appeared to have abandoned any attempt to maintain an official open presence there.

Instead, IS began experimenting with a string of less known social media platforms. These included the obscure Friendica, Quitter and Diaspora – all of which promise better privacy and data-protection than Twitter – as well as the popular Russian VKontakte.

Underground channels

While accounts on Friendica and Quitter were shut down within days, the official IS presence on Diaspora and VKontakte lasted several weeks before their involvement in the distribution of high profile beheading videos caused them too to be shut down.

Since the accounts on VKontakte were closed in September, IS appears to have resorted to underground channels to surface its material, making no attempt to advertise an official social media presence. Perhaps surprisingly, this has not yet caused any problems for the group in terms of authenticating its output.

Once a message has surfaced – via channels that are currently difficult to pin down – it is disseminated by loosely affiliated media groups who are capable of mobilizing a vast network of individual supporters on social media to target specific audiences.

Unfortunately, Mina misses the irony of reporting in one breath that IS has no authentic outlets and relying in the next breath on non-authentic materials (such as Dabiq) to talk about the group’s social media prowess.

Suppression of authentic content outlets for IS leaves an interested reader at the mercy of governments, news organizations and others who have a variety of motives for attributing content to IS.

As I mentioned in my last post:

Debates about international and national policy should not be based on faked evidence (such as “yellow cake uranium“) or faked publications.

I have heard the argument that IS content recruits support for terrorism. I have read propaganda attributed to IS, the Khmer Rouge, the KKK and terrorists sponsored by Western governments. I can report not the slightest interest in supporting or participating with any of them.

The recruitment argument is a variation of the fear that allowing gays, drug use, drinking, etc., on television would result in children growing up to be gay drug addicts with drinking problems. I can report that no sane person credits that fear today. (If you have that fear, contact your local mental health service for an appointment.)

Why is IS attractive? Hard to say given the lack of authentic information on its goals and platform, perhaps its reported opposition to corrupt governments in the Middle East?

If I weren’t concerned with corrupt Western governments I might be more concerned with governments in the Middle East. But, as they say, best to start cleaning your own house before complaining about the state of another’s.

Dabiq, ISIS and Data Skepticism

Wednesday, October 22nd, 2014

If you are following the Middle East, no doubt you have heard that ISIS/ISIL publishes Dabiq, a magazine that promotes its views. It isn’t hard to find articles quoting from Dabiq, but I wanted to find copies of Dabiq itself.

Clarion Project (Secondary Source for Dabiq)

After a bit of searching, I found that the Clarion Project is posting every issue of Dabiq as it appears.

The hosting site, Clarion Project, is a well known anti-Muslim hate group. The founders of the Clarion Project just happened to be full time employees of Aish Hatorah, a pro-Israel organization.

Coverage of Dabiq by Mother Jones (who should know better), ISIS Magazine Promotes Slavery, Rape, and Murder of Civilians in God’s Name relies on The Clarion Project “reprint” of Dabiq.

Internet Archive (Secondary Source for Dabiq)

The Islamic State Al-Hayat Media Centre (HMC) presents Dabiq Issue #1 (July 5, 2014).

All the issues at the Internet Archive claim to be from “The Islamic State Al-Hayat Media Centre (HMC).” I say “claim to be from” because uploading to the Internet Archive only requires an account with a verified email address. Anyone could have uploaded the documents.

Robert Mackey writes for the New York Times: Islamic State Propagandists Boast of Sexual Enslavement of Women and Girls and references Dabiq. I asked Robert for his source for Dabiq and he responded that it was the Internet Archive version.

Wall Street Journal

In Why the Islamic State Represents a Dangerous Turn in the Terror Threat, Gerald F. Seib writes:

It isn’t necessary to guess at what ISIS is up to. It declares its aims, tactics and religious rationales boldly, in multiple languages, for all the world to see. If you want to know, simply call up the first two editions of the organization’s remarkably sophisticated magazine, Dabiq, published this summer and conveniently offered in English online.

Gerald implies, at least to me, that Dabiq has an “official” website where it appears in multiple languages. But if you read Gerald’s article, there is no link to such a website.

I wrote to Gerald today to ask what site he meant when referring to Dabiq. I have not heard back from Gerald as of posting but will insert his response when it arrives.

The Jamestown Foundation

The Jamestown Foundation website featured: Hot Issue: Dabiq: What Islamic State’s New Magazine Tells Us about Their Strategic Direction, Recruitment Patterns and Guerrilla Doctrine by Michael W. S. Ryan, saying:

On the first day of Ramadan (June 28), the Islamic State in Iraq and Syria (ISIS) declared itself the new Islamic State and the new Caliphate (Khilafah). For the occasion, Abu Bakr al-Baghdadi, calling himself Caliph Ibrahim, broke with his customary secrecy to give a surprise khutbah (sermon) in Mosul before being rushed back into hiding. Al-Baghdadi’s khutbah addressed what to expect from the Islamic State. The publication of the first issue of the Islamic State’s official magazine, Dabiq, went into further detail about the Islamic State’s strategic direction, recruitment methods, political-military strategy, tribal alliances and why Saudi Arabia’s concerns that the Kingdom may be the Islamic State’s next target are well-founded.

Which featured a thumbnail of the cover of the first issue of Dabiq, with the following legend:

Dabiq Magazine (Source: Twitter user @umOmar246)

Well, that’s a problem because the Twitter user “@umOmar246” doesn’t exist.

Don’t take my word for it, go to Twitter, search for “umOmar246,” limit search results to people and you will see:

twitter results

I took the screen shot today just in case the results change at some point in time.

Other Media

Other media carry the same stories but without even attempting to cite a source. For example:

Jerusalem Post: ISIS threatens to conquer the Vatican, ‘break the crosses of the infidels’. Source? None.

Global News: The twisted view of ISIS in latest issue of propaganda magazine Dabiq by Nick Logan.

I don’t think that Nick appreciates the irony of the title of his post. Yes, this is a twisted view of ISIS. The question is who is responsible for it?

General Comments

Pick any issue of Dabiq and skim through it. What impressed me was the “over the top” presentation of cruelty. The hate literature I am familiar with (I grew up in the Deep South in the 1960’s) usually portrays the atrocities of others, not the group in question. Hate literature places its emphasis on the “other” group, the one to be targeted, not itself.


First and foremost, the lack of any “official” site of origin for Dabiq makes me highly suspicious of the authenticity of the materials that claim to originate with ISIS.

Second, why would ISIS rely upon the Clarion Project as a distributor for its English language version of Dabiq, along with the Internet Archive?

Third, what are we to make of missing @umOmar246 from Twitter? Before you say that the account has closed,
doesn’t know that user either:

twitter counter results

A different aspect of consistency on distributed data. The aspect of getting “caught” because distributed data is difficult to make consistent.

Fourth, the media coverage examined relies upon sites with questionable authenticity but cites the material found there as though authoritative. Is this a new practice in journalism? Some of the media outlets examined are hardly new and upcoming Internet news sites.

Finally, the content of the magazines themselves doesn’t ring true for hate literature.


Debates about international and national policy should not be based on faked evidence (such as “yellow cake uranium“) or faked publications.

Based on what I have uncovered so far, attributing Dabiq to ISIS is highly questionable.

It appears to be an attempt to discredit ISIS and to provide a basis for whipping up support for military action by the United States and its allies.

The United States destroyed the legitimate government of Iraq on the basis of lies and fabrications. If only for nationalistic reasons, not spending American funds and lives based on a tissue of lies, let’s not make the same mistake again.

Disclaimer: I am not a supporter of ISIS nor would I choose to live in their state should they establish one. However, it will be their state and I lack the arrogance to demand that others follow social, religious or political norms that I prefer.

PS: If you have suggestions for other factors that either confirm a link between ISIS and Dabiq or cast further doubt on such a link, please post them in comments. Thanks!

Know Your Algorithms and Data!

Sunday, September 21st, 2014

average of legs

If you let me pick the algorithm or the data, I can produce any result you want.

Something to keep in mind when listening to reports of “facts.”

Or as Nietzsche would say:

There are no facts, only interpretations.

There are people who are so naive that they don’t realize interpretations other than their own are possible. Avoid them unless you have need of followers for some reason.
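A small illustration of picking the algorithm to get the result you want. The data is hypothetical (it is not taken from the image above): a population of 1,000 people, 998 with two legs and 2 with one. Which “average” you report is a choice:

```python
import statistics

# HYPOTHETICAL population: 998 people with two legs, 2 people with one leg.
legs = [2] * 998 + [1] * 2

mean_legs = statistics.mean(legs)      # arithmetic mean
median_legs = statistics.median(legs)  # middle value
mode_legs = statistics.mode(legs)      # most common value

# Share of the population with more legs than the mean.
above_mean = sum(1 for n in legs if n > mean_legs) / len(legs)

if __name__ == "__main__":
    print(f"mean={mean_legs}, median={median_legs}, mode={mode_legs}")
    print(f"{above_mean:.1%} of people have more legs than the 'average' (mean)")
```

Report the mean and nearly everyone has an above-average number of legs. Report the median or the mode and nobody does. Same data, different “facts.”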

I first saw this in a tweet by Chris Arnold.

Credulity Question for Interviewees

Tuesday, July 1st, 2014

Max Fisher authored: Map: The 193 foreign countries the NSA spies on and the 4 it doesn’t, which has the following map:

nsa authority map

Max covers the history of the authority of the NSA to spy on governments, organizations, etc., so see his post for the details.

A credulity question for interviewees:

What countries are being spied upon by the NSA without permission? Color in those countries with a #2 pencil.

If they make no changes to the map, you can close the interview early. (The correct answer is six, including the United States.)

Clearly a candidate for phishing attacks, violation of security protocols and pass phrase/password sharing. Frankly, I am surprised they made it to the interview.

The case for big cities, in 1 map

Thursday, February 20th, 2014

The case for big cities, in 1 map by Chris Cillizza.

From the post:

New Yorkers who don’t live in New York City hate the Big Apple. Missourians outside of St. Louis and Kansas City are skeptical about the people (and politicians) who come from the two biggest cities in the state. Politicians from the Chicago area (and inner suburbs) often meet skepticism when campaigning in downstate Illinois. You get the idea. People who don’t live in the big cities tend to resent those who do.

Fair enough. Growing up in semi-rural southeastern Connecticut, I always hated Hartford. (Not really.) But, this map built by Reddit user Alexandr Trubetskoy shows — in stark terms — how much of the country’s economic activity (as measured by the gross domestic product) is focused in a remarkably small number of major cities.

A great map, at least if you live in the greater metro area of any of these cities.

I counted 21 red spots, although on the East coast they are so close together some were fused together.

It is also an illustration that a map doesn’t always tell the full story.

Say 21 or more cities produce half of the GDP.

Care to guess how many states are responsible for 50% of the agricultural production in the United States?


Mapping Twitter Topic Networks:…

Thursday, February 20th, 2014

Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters by Marc A. Smith, Lee Rainie, Ben Shneiderman and Itai Himelboim.

From the post:

Conversations on Twitter create networks with identifiable contours as people reply to and mention one another in their tweets. These conversational structures differ, depending on the subject and the people driving the conversation. Six structures are regularly observed: divided, unified, fragmented, clustered, and inward and outward hub and spoke structures. These are created as individuals choose whom to reply to or mention in their Twitter messages and the structures tell a story about the nature of the conversation.

If a topic is political, it is common to see two separate, polarized crowds take shape. They form two distinct discussion groups that mostly do not interact with each other. Frequently these are recognizably liberal or conservative groups. The participants within each separate group commonly mention very different collections of website URLs and use distinct hashtags and words. The split is clearly evident in many highly controversial discussions: people in clusters that we identified as liberal used URLs for mainstream news websites, while groups we identified as conservative used links to conservative news websites and commentary sources. At the center of each group are discussion leaders, the prominent people who are widely replied to or mentioned in the discussion. In polarized discussions, each group links to a different set of influential people or organizations that can be found at the center of each conversation cluster.

While these polarized crowds are common in political conversations on Twitter, it is important to remember that the people who take the time to post and talk about political issues on Twitter are a special group. Unlike many other Twitter members, they pay attention to issues, politicians, and political news, so their conversations are not representative of the views of the full Twitterverse. Moreover, Twitter users are only 18% of internet users and 14% of the overall adult population. Their demographic profile is not reflective of the full population. Additionally, other work by the Pew Research Center has shown that tweeters’ reactions to events are often at odds with overall public opinion— sometimes being more liberal, but not always. Finally, forthcoming survey findings from Pew Research will explore the relatively modest size of the social networking population who exchange political content in their network.

Great study on political networks but all the more interesting for introducing an element of sanity into discussions about Twitter.

At a minimum, Twitter having 18% of all Internet users and 14% of the overall adult population casts serious doubt on metrics using Twitter to rate software popularity. (“It’s all we have” is a pretty lame excuse for using bad metrics.)

None of this means Twitter data isn’t worth mining for the content it holds, but remember that Twitter isn’t the world.

I first saw this at Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters by FullTextReports.

On Being a Data Skeptic

Saturday, February 15th, 2014

On Being a Data Skeptic by Cathy O’Neil. (pdf)

From Skeptic, Not Cynic:

I’d like to set something straight right out of the gate. I’m not a data cynic, nor am I urging other people to be. Data is here, it’s growing, and it’s powerful. I’m not hiding behind the word “skeptic” the way climate change “skeptics” do, when they should call themselves deniers.

Instead, I urge the reader to cultivate their inner skeptic, which I define by the following characteristic behavior. A skeptic is someone who maintains a consistently inquisitive attitude toward facts, opinions, or (especially) beliefs stated as facts. A skeptic asks questions when confronted with a claim that has been taken for granted. That’s not to say a skeptic brow-beats someone for their beliefs, but rather that they set up reasonable experiments to test those beliefs. A really excellent skeptic puts the “science” into the term “data science.”

In this paper, I’ll make the case that the community of data practitioners needs more skepticism, or at least would benefit greatly from it, for the following reason: there’s a two-fold problem in this community. On the one hand, many of the people in it are overly enamored with data or data science tools. On the other hand, other people are overly pessimistic about those same tools.

I’m charging myself with making a case for data practitioners to engage in active, intelligent, and strategic data skepticism. I’m proposing a middle-of-the-road approach: don’t be blindly optimistic, don’t be blindly pessimistic. Most of all, don’t be awed. Realize there are nuanced considerations and plenty of context and that you don’t necessarily have to be a mathematician to understand the issues.

It’s a scant 26 pages, cover and all, but “On Being a Data Skeptic” is well worth your time.

I particularly liked Cathy’s coverage of issues such as: People Get Addicted to Metrics, which ends with separate asides to “nerds,” and “business people.” Different cultures and different ways of “hearing” the same content. Rather than trying to straddle those communities, Cathy gave them separate messages.

You will find her predator/prey model particularly interesting.

On the whole, I would say her predator/prey analysis should not be limited to modeling. See what you think.

Finding Occam’s razor in an era of information overload

Thursday, November 21st, 2013

Finding Occam’s razor in an era of information overload

From the post:

How can the actions and reactions of proteins so small or stars so distant they are invisible to the human eye be accurately predicted? How can blurry images be brought into focus and reconstructed?

A new study led by physicist Steve Pressé, Ph.D., of the School of Science at Indiana University-Purdue University Indianapolis, shows that there may be a preferred strategy for selecting mathematical models with the greatest predictive power. Picking the best model is about sticking to the simplest line of reasoning, according to Pressé. His paper explaining his theory is published online this month in Physical Review Letters, a preeminent international physics journal.

“Building mathematical models from observation is challenging, especially when there is, as is quite common, a ton of noisy data available,” said Pressé, an assistant professor of physics who specializes in statistical physics. “There are many models out there that may fit the data we do have. How do you pick the most effective model to ensure accurate predictions? Our study guides us towards a specific mathematical statement of Occam’s razor.”

Occam’s razor is an oft cited 14th century adage that “plurality should not be posited without necessity” sometimes translated as “entities should not be multiplied unnecessarily.” Today it is interpreted as meaning that all things being equal, the simpler theory is more likely to be correct.

Comforting that the principles of good modeling have not changed since the 14th century. (Occam’s Razor)

Bear in mind Occam’s Razor is guidance and not a hard and fast rule.

On the other hand, particularly with “big data,” be wary of complex models.

Especially the ones that retroactively “predict” unique events as a demonstration of their model.
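The warning about complex models can be made concrete with a small experiment. What follows is only a sketch under made-up assumptions: the data come from an invented process (a straight line plus Gaussian noise), and the "complex" model is a deliberately over-flexible memorizer (1-nearest-neighbour), not anything taken from the Pressé paper. The point is that a perfect fit to the data in hand says little about prediction on new data.

```python
import random

def make_data(n, rng, slope=2.0, noise=1.0):
    """Draw n points from y = slope * x + Gaussian noise (an invented process)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a predictor function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

def fit_memorizer(xs, ys):
    """1-nearest-neighbour: predict the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a predictor on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(f"{name:10s} train MSE = {mse(model, train_x, train_y):.3f}  "
          f"test MSE = {mse(model, test_x, test_y):.3f}")
```

The memorizer reproduces its training data exactly, which is the same trick as "predicting" an event after the fact. The held-out error is the number worth asking about.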

If you are interested in the full “monty”:

Nonadditive Entropies Yield Probability Distributions with Biases not Warranted by the Data by Steve Pressé, Kingshuk Ghosh, Julian Lee, and Ken A. Dill. Phys. Rev. Lett. 111, 180604 (2013)


Different quantities that go by the name of entropy are used in variational principles to infer probability distributions from limited data. Shore and Johnson showed that maximizing the Boltzmann-Gibbs form of the entropy ensures that probability distributions inferred satisfy the multiplication rule of probability for independent events in the absence of data coupling such events. Other types of entropies that violate the Shore and Johnson axioms, including nonadditive entropies such as the Tsallis entropy, violate this basic consistency requirement. Here we use the axiomatic framework of Shore and Johnson to show how such nonadditive entropy functions generate biases in probability distributions that are not warranted by the underlying data.
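The additivity property at the heart of the abstract is easy to check numerically. The sketch below is an illustration only: the two distributions and the choice q = 2 are arbitrary, and it shows just the nonadditivity of the Tsallis entropy for independent events, not the paper's argument about biased inferred distributions. It uses the standard definitions, with the Boltzmann-Gibbs form taken with the constant set to 1.

```python
import math
from itertools import product

def boltzmann_gibbs(p):
    """Boltzmann-Gibbs (Shannon) entropy, -sum p ln p, with the constant set to 1."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tsallis(p, q):
    """Tsallis entropy, (1 - sum p^q) / (q - 1), defined for q != 1."""
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

def independent_joint(p, r):
    """Joint distribution of two independent events: products of the marginals."""
    return [a * b for a, b in product(p, r)]

# Arbitrary example distributions and q value (assumptions for illustration).
p = [0.5, 0.3, 0.2]
r = [0.7, 0.3]
q = 2.0
joint = independent_joint(p, r)

# Boltzmann-Gibbs: entropy of the joint equals the sum of the parts.
bg_gap = boltzmann_gibbs(joint) - (boltzmann_gibbs(p) + boltzmann_gibbs(r))

# Tsallis: the joint differs from the sum by a cross term (1 - q) * S(p) * S(r).
ts_gap = tsallis(joint, q) - (tsallis(p, q) + tsallis(r, q))
ts_cross = (1.0 - q) * tsallis(p, q) * tsallis(r, q)

print(f"Boltzmann-Gibbs gap: {bg_gap:.2e}")
print(f"Tsallis gap:         {ts_gap:.4f} (cross term {ts_cross:.4f})")
```

The Tsallis cross term appears even though the two events were constructed to be independent, which is the kind of coupling "not warranted by the data" that the abstract describes.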

Statistics Done Wrong

Saturday, November 2nd, 2013

Statistics Done Wrong by Alex Reinhart.

From the post:

If you’re a practicing scientist, you probably use statistics to analyze your data. From basic t tests and standard error calculations to Cox proportional hazards models and geospatial kriging systems, we rely on statistics to give answers to scientific problems.

This is unfortunate, because most of us don’t know how to do statistics.

Statistics Done Wrong is a guide to the most popular statistical errors and slip-ups committed by scientists every day, in the lab and in peer-reviewed journals. Many of the errors are prevalent in vast swathes of the published literature, casting doubt on the findings of thousands of papers. Statistics Done Wrong assumes no prior knowledge of statistics, so you can read it before your first statistics course or after thirty years of scientific practice.

Dive in: the whole guide is available online!

Something to add to your data skeptic bag.

As a matter of fact, a summary of warning signs for these problems would fit on 8½ by 11 (or A4) paper.

I am thinking that when you show up to examine a data set, you have Statistics Done Wrong, with the web address, on the back of your laminated cheat sheets.

Part of being a data skeptic is intuiting where to push so that the data “as presented” unravels.

I first saw this in Nat Torkington’s Four short links: 30 October 2013.

PubMed Commons

Sunday, October 27th, 2013

PubMed Commons

From the webpage:

PubMed Commons is a system that enables researchers to share their opinions about scientific publications. Researchers can comment on any publication indexed by PubMed, and read the comments of others. PubMed Commons is a forum for open and constructive criticism and discussion of scientific issues. It will thrive with high quality interchange from the scientific community. PubMed Commons is currently in a closed pilot testing phase, which means that only invited participants can add and view comments in PubMed.

Just in case you are looking for a place to practice your data skepticism skills.

In closed beta now, but when it opens up, pick an article in a field that interests you, or one at random.

Just my suggestion, but try to write very high-quality comments and check your analysis with others.

A record of to-the-point, non-shrill, substantive comments might be a nice addition to your data skeptic resume. (Under papers re-written/retracted.)

Lectures on scientific computing with Python

Sunday, October 27th, 2013

Lectures on scientific computing with Python by J.R. Johansson.

From the webpage:

A set of lectures on scientific computing with Python, using IPython notebooks.

Read only versions of the lectures:

To debunk pitches, proposals, articles, demos, etc., you will need to know, among other things, how scientific computing should be done.

Scientific computing is a very large field so take this as a starting point, not a destination.

Trouble at the lab [Data Skepticism]

Sunday, October 27th, 2013

Trouble at the lab, Oct. 19, 2013, The Economist.

From the web page:

“I SEE a train wreck looming,” warned Daniel Kahneman, an eminent psychologist, in an open letter last year. The premonition concerned research on a phenomenon known as “priming”. Priming studies suggest that decisions can be influenced by apparently irrelevant actions or events that took place just before the cusp of choice. They have been a boom area in psychology over the past decade, and some of their insights have already made it out of the lab and into the toolkits of policy wonks keen on “nudging” the populace.

Dr Kahneman and a growing number of his colleagues fear that a lot of this priming research is poorly founded. Over the past few years various researchers have made systematic attempts to replicate some of the more widely cited priming experiments. Many of these replications have failed. In April, for instance, a paper in PLoS ONE, a journal, reported that nine separate experiments had not managed to reproduce the results of a famous study from 1998 purporting to show that thinking about a professor before taking an intelligence test leads to a higher score than imagining a football hooligan.

The idea that the same experiments always get the same results, no matter who performs them, is one of the cornerstones of science’s claim to objective truth. If a systematic campaign of replication does not lead to the same results, then either the original research is flawed (as the replicators claim) or the replications are (as many of the original researchers on priming contend). Either way, something is awry.

The numbers will make you a militant data skeptic:

  • Original results could be duplicated for only 6 out of 53 landmark studies of cancer.
  • A drug company could reproduce only a quarter of 67 “seminal studies.”
  • An NIH official estimates that at least three-quarters of published biomedical findings would be hard to reproduce.
  • Three-quarters of published papers in machine learning are bunk due to overfitting, according to one estimate quoted in the article.
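A data skeptic should also ask how firm a figure like "6 out of 53" is. As a rough sketch, the code below computes a Wilson score interval for that proportion. Treating the 53 studies as a simple random sample of landmark studies is my assumption, not something the article claims, and the function name is just for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# 6 of 53 landmark cancer studies could be reproduced (figure from the list above).
low, high = wilson_interval(6, 53)
print(f"6 of 53 = {6 / 53:.1%}, approximate 95% interval {low:.1%} to {high:.1%}")
```

Even at the upper end of the interval, most of the studies fail to reproduce, so the headline conclusion survives the uncertainty in the count.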

Those and more examples await you in this article from The Economist.

As the sub-heading for the article reads:

Scientists like to think of science as self-correcting. To an alarming degree, it is not

You may not mind misrepresenting facts to others, but do you want other people misrepresenting facts to you?

Do you have a professional data critic/skeptic on call?