Archive for the ‘Science’ Category

John Carlisle Hunts Bad Science (you can too!)

Tuesday, June 6th, 2017

Carlisle’s statistics bombshell names and shames rigged clinical trials by Leonid Schneider.

From the post:

John Carlisle is a British anaesthesiologist, who works in a seaside Torbay Hospital near Exeter, at the English Channel. Despite not being a professor or in academia at all, he is a legend in medical research, because his amazing statistics skills and his fearlessness to use them exposed scientific fraud of several of his esteemed anaesthesiologist colleagues and professors: the retraction record holder Yoshitaka Fujii and his partner Yuhji Saitoh, as well as Scott Reuben and Joachim Boldt. This method needs no access to the original data: the number presented in the published paper suffice to check if they are actually real. Carlisle was fortunate also to have the support of his journal, Anaesthesia, when evidence of data manipulations in their clinical trials was found using his methodology. Now, the editor Carlisle dropped a major bomb by exposing many likely rigged clinical trial publications not only in his own Anaesthesia, but in five more anaesthesiology journals and two “general” ones, the stellar medical research outlets NEJM and JAMA. The clinical trials exposed in the latter for their unrealistic statistics are therefore from various fields of medicine, not just anaesthesiology. The medical publishing scandal caused by Carlisle now is perfect, and the elite journals had no choice but to announce investigations which they even intend to coordinate. Time will show how seriously their effort is meant.

Carlisle’s bombshell paper “Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals” was published today in Anaesthesia, Carlisle 2017, DOI: 10.1111/anae.13962. It is accompanied by an explanatory editorial, Loadsman & McCulloch 2017, doi: 10.1111/anae.13938. A Guardian article written by Stephen Buranyi provides the details. There is also another, earlier editorial in Anaesthesia, which explains Carlisle’s methodology rather well (Pandit, 2012).

… (emphasis in original)

Cutting to the chase: out of 5,087 clinical trials, Carlisle found 90 papers with statistical patterns unlikely to occur by chance.

There is a wealth of science papers to be investigated. Sarah Boon, in 21st Century Science Overload (2016), points out that roughly 2.5 million new scientific papers are published every year, across some 28,100 active scholarly peer-reviewed journals (as of 2014).

Since Carlisle has done eight (8) journals, that leaves ~28,092 for your review. 😉
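
If you are tempted to try this at home, here is a toy sketch of the intuition behind such checks (not Carlisle’s actual procedure, and the p-values below are simulated): under genuine randomisation, p-values from baseline comparisons between trial arms should be roughly uniform on [0, 1], so a paper whose baseline p-values are improbably bunched deserves a closer look.

    # Toy illustration only: simulated baseline p-values, not real trial data,
    # and a Kolmogorov-Smirnov test rather than Carlisle's actual method.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    honest_pvals = rng.uniform(0.0, 1.0, size=40)               # what randomisation should give
    suspect_pvals = rng.normal(0.5, 0.05, size=40).clip(0, 1)   # implausibly bunched near 0.5

    for label, pvals in (("honest", honest_pvals), ("suspect", suspect_pvals)):
        result = stats.kstest(pvals, "uniform")
        print(f"{label}: KS statistic = {result.statistic:.3f}, p = {result.pvalue:.4f}")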

Happy hunting!

PS: I can easily imagine an exercise along these lines being the final project for a data mining curriculum. You?

Launch of the PhilMath Archive

Monday, May 29th, 2017

Launch of the PhilMath Archive: preprint server specifically for philosophy of mathematics

From the post:

PhilSci-Archive is pleased to announce the launch of the PhilMath-Archive, http://philsci-archive.pitt.edu/philmath.html a preprint server specifically for the philosophy of mathematics. The PhilMath-Archive is offered as a free service to the philosophy of mathematics community. Like the PhilSci-Archive, its goal is to promote communication in the field by the rapid dissemination of new work. We aim to provide an accessible repository in which scholarly articles and monographs can find a permanent home. Works posted here can be linked to from across the web and freely viewed without the need for a user account.

PhilMath-Archive invites submissions in all areas of philosophy of mathematics, including general philosophy of mathematics, history of mathematics, history of philosophy of mathematics, history and philosophy of mathematics, philosophy of mathematical practice, philosophy and mathematics education, mathematical applicability, mathematical logic and foundations of mathematics.

For your reference, the PhilSci-Archive.

Enjoy!

Fraudulent Peer Review – Clue? Responded On Time!

Sunday, April 23rd, 2017

107 cancer papers retracted due to peer review fraud by Cathleen O’Grady.

As if peer review weren’t enough of a sham, some authors took it to another level:


It’s possible to fake peer review because authors are often asked to suggest potential reviewers for their own papers. This is done because research subjects are often blindingly niche; a researcher working in a sub-sub-field may be more aware than the journal editor of who is best-placed to assess the work.

But some journals go further and request, or allow, authors to submit the contact details of these potential reviewers. If the editor isn’t aware of the potential for a scam, they then merrily send the requests for review out to fake e-mail addresses, often using the names of actual researchers. And at the other end of the fake e-mail address is someone who’s in on the game and happy to send in a friendly review.

Fake peer reviewers often “know what a review looks like and know enough to make it look plausible,” said Elizabeth Wager, editor of the journal Research Integrity & Peer Review. But they aren’t always good at faking less obvious quirks of academia: “When a lot of the fake peer reviews first came up, one of the reasons the editors spotted them was that the reviewers responded on time,” Wager told Ars. Reviewers almost always have to be chased, so “this was the red flag. And in a few cases, both the reviews would pop up within a few minutes of each other.”

I’m sure the timely submission of reviews wasn’t the only basis for calling fraud, but it is an amusing one.

It’s past time to jettison the bloated machinery of peer review. Judge work by its use, not where it’s published.

Every NASA Image In One Archive – Crowd Sourced Index?

Monday, April 17th, 2017

NASA Uploaded Every Picture It Has to One Amazing Online Archive by Will Sabel Courtney.

From the post:

Over the last five decades and change, NASA has launched hundreds of men and women from the planet’s surface into the great beyond. But America’s space agency has had an emotional impact on millions, if not billions, of others who’ve never gone past the Karmann Line separating Earth from space, thanks to the images, audio, and video generated by its astronauts and probes. NASA has given us our best glimpses at distant galaxies and nearby planets—and in the process, helped us appreciate our own world even more.

And now, the agency has placed them all in one place for everyone to see: images.nasa.gov.

No, viewing this site will not be considered an excuse for a late tax return. 😉

On the other hand, it’s an impressive bit of work, although a search-only interface seems a bit thin to me.

The API docs don’t offer much comfort:

Name (all parameters optional) – Description

  • q – Free text search terms to compare to all indexed metadata.
  • center – NASA center which published the media.
  • description – Terms to search for in “Description” fields.
  • keywords – Terms to search for in “Keywords” fields. Separate multiple values with commas.
  • location – Terms to search for in “Location” fields.
  • media_type – Media types to restrict the search to. Available types: [“image”, “audio”]. Separate multiple values with commas.
  • nasa_id – The media asset’s NASA ID.
  • photographer – The primary photographer’s name.
  • secondary_creator – A secondary photographer/videographer’s name.
  • title – Terms to search for in “Title” fields.
  • year_start – The start year for results. Format: YYYY.
  • year_end – The end year for results. Format: YYYY.

With no index, your results depend on blindly guessing the metadata entered by a NASA staffer.

For “moon,” I would expect results for the Moon, but they are likely to include moons of other worlds as well.
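
If you want to poke at it programmatically, here is a hedged sketch of querying the search endpoint for “moon.” The base URL (images-api.nasa.gov) and the shape of the JSON response are assumptions drawn from NASA’s public API documentation, not from the post itself:

    # Hypothetical query against the NASA image library search API.
    # The base URL and the response layout (collection -> items -> data) are
    # assumptions; check them against NASA's API documentation.
    import json
    import urllib.parse
    import urllib.request

    params = urllib.parse.urlencode({"q": "moon", "media_type": "image"})
    url = "https://images-api.nasa.gov/search?" + params

    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    for item in data["collection"]["items"][:5]:
        meta = item["data"][0]
        print(meta.get("nasa_id"), "-", meta.get("title"))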

Indexing this collection has all the marks of a potential crowd sourcing project:

  1. Easy to access data
  2. Free data
  3. Interesting data
  4. Metadata

Interested?

5 Million Fungi

Friday, February 17th, 2017

5 Million Fungi – Every living thing is crawling with microorganisms — and you need them to survive by Dan Fost.

Fungus is growing in Brian Perry’s refrigerator — and not the kind blooming in someone’s forgotten lunch bag.

No, the Cal State East Bay assistant professor has intentionally packed his shelves with 1,500 Petri dishes, each containing a tiny sample of fungus from native and endemic Hawaiian plant leaves. The 45-year-old mycologist (a person who studies the genetic and biochemical properties of fungi, among many other things) figures hundreds of those containers hold heretofore-unknown species.

The professor’s work identifying and cataloguing fungal endophytes — microscopic fungi that live inside plants — carries several important implications. Scientists know little about the workings of these fungi, making them a particularly exciting frontier for examination: Learning about endophytes’ relationships to their host plants could save many endangered species; farmers have begun tapping into their power to help crops build resistance to pathogens; and researchers are interested in using them to unlock new compounds to make crucial medicines for people.

The only problem — finding, naming, and preserving them before it’s too late.
… (emphasis in original)

According to Naveed Davoodian in A Long Way to Go: Protecting and Conserving Endangered Fungi, you don’t need to travel to exotic locales to contribute to our knowledge of fungi in the United States.

Willow Nero, editor of McIlvainea: Journal of American Amateur Mycology writes in Commit to Mycology:


I hope you’ll do your part as a NAMA member by renewing your commitment to mycology—the science, that is. When we convene at the North American foray later this year, our leadership will present (and later publish in this journal) clear guidelines so mycologists everywhere can collect reliable data about fungi as part of the North American Mycoflora Project. We will let you know where to start and how to carry your momentum. All we ask is that you join us. Catalogue them all! Or at least set an ambitious goal for yourself or your local NAMA-affiliated club.

I did peek at the North American Mycoflora Project, which has this challenging slogan:

Without a sequenced specimen, it’s a rumor

Sounds like your kind of folks. 😉

Mycology as a hobby has three distinct positives: One, you are not in front of your computer monitor. Two, you are gaining knowledge. Three, (hopefully) you will decide to defend fellow residents who cannot defend themselves.

Repulsion On A Galactic Scale (Really Big Data/Visualization)

Tuesday, January 31st, 2017

Newly discovered intergalactic void repels Milky Way by Rol Gal.

From the post:

For decades, astronomers have known that our Milky Way galaxy—along with our companion galaxy, Andromeda—is moving through space at about 1.4 million miles per hour with respect to the expanding universe. Scientists generally assumed that dense regions of the universe, populated with an excess of galaxies, are pulling us in the same way that gravity made Newton’s apple fall toward earth.

In a groundbreaking study published in Nature Astronomy, a team of researchers, including Brent Tully from the University of Hawaiʻi Institute for Astronomy, reports the discovery of a previously unknown, nearly empty region in our extragalactic neighborhood. Largely devoid of galaxies, this void exerts a repelling force, pushing our Local Group of galaxies through space.

Astronomers initially attributed the Milky Way’s motion to the Great Attractor, a region of a half-dozen rich clusters of galaxies 150 million light-years away. Soon after, attention was drawn to a much larger structure called the Shapley Concentration, located 600 million light-years away, in the same direction as the Great Attractor. However, there has been ongoing debate about the relative importance of these two attractors and whether they suffice to explain our motion.

The work appears in the January 30 issue of Nature Astronomy and can be found online here.

Additional images, video, and links to previous related productions can be found at http://irfu.cea.fr/dipolerepeller.

If you are looking for processing/visualization of data on a galactic scale, this work by Yehuda Hoffman, Daniel Pomarède, R. Brent Tully & Hélène M. Courtois, hits the spot!

It is also a reminder that when you look up from your social media device, there is a universe waiting to be explored.

Twistance – “Rogue” Twitter Accounts – US Federal Science Agencies

Thursday, January 26th, 2017

Alice Stollmeyer has put together Twistance:

Twitter + resistance = #Twistance. “Rogue” Twitter accounts from US federal science agencies.

As of 26 January 2017, 44 members and 5,133 subscribers.

A long overdue step towards free speech for government employees and voters making decisions on what is known inside the federal government.

Caution:

A claim to be an “alternative” account may or may not be true. As with the official accounts, evaluate factual claims for yourself. Use good security practices when communicating with unknown accounts. (Some of the account names are very close in spelling but are separate accounts.)

  • Alt Hi Volcanoes NP The Unofficial “Resistance” team of Hawaii Volcanoes National Park. Not taxpayer funded.
  • Alt HHS Unofficial and unaffiliated resistance account by concerned scientists for humanity.
  • The Alt NPS and EPA Real news regarding the NPS, EPA, climate science and environmentalism
  • Alt Science Raising awareness of climate change and other threats posed by science denial. Not affiliated with the US gov. #Resist
  • Alternative CDC Unofficial unaffiliated resistance account by concerned scientists for humanity.
  • Alternative HeHo A parody account for the Herbert Hoover National Historic Site
  • Alternative NIH Unofficial group of science advocates. Stand up for science, rights, equality, social justice, & ultimately, for the health of humanity.
  • Alternative NOAA The Unofficial “Resistance” team of the NOAA. Account not tax payer subsidized. We study the oceans, and the atmosphere to understand our planet. #MASA
  • AltBadlandsNatPark You’ll never shut us down, Drumpf!
  • Alt-Badlands NPS Bigly fake #badlandsnationalpark. ‘Sad!’ – Donald J Trump. #badlands #climate #science #datarefuge #resist #resistance
  • AltEPA He can take our official Twitter but he’ll never take our FREEDOM. UNOFFICIALLY resisting.
  • altEPA The Unofficial “Resistance” team of U.S. Environmental Protection Agency. Not taxpayer subsidised! Environmental conditions may vary from alternative facts.
  • AltFDA Uncensored FDA
  • AltGlacierNPS The unofficial Twitter site for Glacier National Park of Science Fact.
  • AltHot Springs NP The Resistance Account of America’s First Resort and Preserve. Account Run By Friends of HSNP.
  • AltLassenVolcanicNP The Unofficial “Resistance” team. Within peaceful mountain forests you will find hissing fumaroles and boiling mud pots and people ready to fight for science.
  • AltMountRainierNPS Unofficial “Resistance” Team from the Mount Rainier National Park Service. Protecting what’s important..
  • AltNASA The unofficial #resist team of the National Aeronautics and Space Administration.
  • AltOlympicNPS Unofficial resistance team of the Olympic National Park. protecting what’s important and fighting fascism with science.
  • AltRockyNPS Unofficial account that is being held for people associated with RMNP. DM if you might be interested in it.
  • AltUSARC USARC’s main duties are to develop an integrated national Arctic research policy and to assist in establishing an Arctic research plan to implement it.
  • AltUSDA Resisting the censorship of facts and science. Truth wins in the end.
  • AltUSForestService The unofficial, and unsanctioned, “Resistance” team for the U.S. Forest Service. Not an official Forest Service account, not publicly funded, citizen run.
  • AltUSFWS The Alt U.S. Fish Wildlife Service (AltUSFWS) is dedicated to the conservation, protection and enhancement of fish, wildlife and plants and their habitats
  • AltUSFWSRefuge The Alt U.S. Fish Wildlife Service (AltUSFWSRefuge) is dedicated to the conservation, protection and enhancement of fish, wildlife and plants and their habitats
  • ALTUSNatParkSer The Unofficial team of U.S. National Park Service. Not taxpayer subsidised! Come for rugged scenery, fossil beds, 89 million acres of landscape
  • AltUSNatParkService The Unofficial #Resistance team of U.S. National Park Service. Not taxpayer subsidised! Come for rugged scenery, facts & 89 million acres of landscape #climate
  • AltNWS The Unofficial Resistance team of U.S. National Weather Service. Not taxpayer subsidized! Come for non-partisan science-based weather, water, and climate info.
  • AltYellowstoneNatPar We are a group of employees and scientists in Yellowstone national park. We are here to continue providing the public with important information
  • AltYosemiteNPS “Unofficial” Resistance Team. Reporting facts & protecting what’s important!
  • Angry National Park Preserving the ecological and historical integrity of National Parks while also making them available and accessible for public use and enjoyment dammit all.
  • BadHombreLands NPS Unofficial feed of Badlands NP. Protecting rugged scenery, fossil beds, 244,000 acres of mixed-grass prairie & wildlife from two-bit cheetoh-hued despots.
  • BadlandsNPSFans Shmofficial fake feed of South Dakota’s Badlands National Park (Great Again™ Edition) Account not run by park employees, current or former, so leave them alone.
  • GlacierNPS The alternative Twitter site for Glacier National Park.
  • March for Science Planning a March for Science. Date TBD. We’ll let you know when official merchandise is out to cover march costs.
  • NOAA (uncensored)
  • Resistance_NASA We are a #Resist sect of the National Aeronautics and Space Administration.
  • Rogue NASA The unofficial “Resistance” team of NASA. Not an official NASA account. Not managed by gov’t employees. Come for the facts, stay for the snark.
  • NatlParksUnderground We post the information Donald Trump censors #FindYourPark #NPS100
  • NWS Podunk We’re the third wheel of forecast offices. We still use WSR-57. Winner of Biggest Polygon at the county fair. Not an actual NWS office…but we should be.
  • Rogue NOAA Research on our climate, oceans, and marine resources should be subject to peer [not political] review. *Not an official NOAA account*
  • Stuff EPA Would Say We post info that Donald Trump censors. We report what the U.S. Environmental Protection Agency would say. Chime in w/ #StuffEPAWouldSay
  • U.S. EPA – Ungagged Ungagged news, links, tips, and conversation that the U.S. Environmental Protection Agency is unable to tell you. Not directly affiliated with @EPA.
  • U.S. Science Service Uncensored & unofficial tweets re: the science happening at the @EPA, @USDA, @NatParkService, @NASA, @NOAA etc. #ClimateChangeIsReal #DefendScience

The Course of Science

Wednesday, December 21st, 2016

No doubt you will recognize “other” scientists in this description:

[Image: scientific-process]

Select the image to get a larger and legible view.

I should point out that “facts” and “truth” have been debated recently in the news media without a Jesuit in sight. So, science isn’t the only area with “iffy” processes and results.

Posted by AlessondraSpringmann on Twitter.

A Reproducible Workflow

Friday, August 26th, 2016

The video is 104 seconds and highly entertaining!

From the description:

Reproducible science not only reduces errors, but speeds up the process of re-running your analysis and auto-generates updated documents with the results. More info at: www.bit.ly/reprodu

How are you making your data analysis reproducible?
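
For what that can look like in practice, here is a minimal sketch of the single-script approach, assuming a plain Python workflow; the paths and column names (data/raw.csv, “condition”, “measurement”) are hypothetical placeholders:

    # Minimal sketch of a re-runnable analysis: one script reads the raw data,
    # recomputes every derived number and regenerates the figure, so the whole
    # analysis reproduces with a single command. Paths and columns are placeholders.
    import os

    import matplotlib
    matplotlib.use("Agg")  # render without a display, e.g. on a server
    import pandas as pd

    RAW = "data/raw.csv"   # raw data, never edited by hand
    OUT_DIR = "results"

    def main():
        os.makedirs(OUT_DIR, exist_ok=True)
        df = pd.read_csv(RAW)

        summary = df.groupby("condition")["measurement"].agg(["mean", "std", "count"])
        summary.to_csv(os.path.join(OUT_DIR, "summary.csv"))

        ax = summary["mean"].plot.bar(yerr=summary["std"], rot=0)
        ax.set_ylabel("measurement")
        ax.figure.savefig(os.path.join(OUT_DIR, "figure1.png"), dpi=150)

    if __name__ == "__main__":
        main()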

Enjoy!

Neil deGrasse Tyson and the Religion of Science

Thursday, July 14th, 2016

The next time you see Neil deGrasse Tyson chanting “holy, holy, holy” at the altar of science, re-read The 7 biggest problems facing science, according to 270 scientists by Julia Belluz, Brad Plumer, and Brian Resnick.

From the post:


The scientific process, in its ideal form, is elegant: Ask a question, set up an objective test, and get an answer. Repeat. Science is rarely practiced to that ideal. But Copernicus believed in that ideal. So did the rocket scientists behind the moon landing.

But nowadays, our respondents told us, the process is riddled with conflict. Scientists say they’re forced to prioritize self-preservation over pursuing the best questions and uncovering meaningful truths.

Ah, a quick correction to: “So did the rocket scientists behind the moon landing.”

Not!

The post Did Politics Fuel the Space Race? points to a White House transcript that reveals politics drove the race to the moon:

The speakers: James Webb, NASA Administrator, and President Kennedy.


James Webb: All right, then let me say this: if I go out and say that this is the number-one priority and that everything else must give way to it, I’m going to lose an important element of support for your program and for your administration.

President Kennedy [interrupting]: By who? Who? What people? Who?

James Webb: By a large number of people.

President Kennedy: Who? Who?

James Webb: Well, particularly the brainy people in industry and in the universities who are looking at a solid base.

President Kennedy: But they’re not going to pay the kind of money to get that position that we are [who we are] spending it. I say the only reason you can justify spending this tremendous…why spend five or six billion dollars a year when all these other programs are starving to death?

James Webb: Because in Berlin you spent six billion a year adding to your military budget because the Russians acted the way they did. And I have some feeling that you might not have been as successful on Cuba if we hadn’t flown John Glenn and demonstrated we had a real overall technical capability here.

President Kennedy: We agree. That’s why we wanna put this program…. That’s the dramatic evidence that we’re preeminent in space.

The rocket to the moon wasn’t about science; it was about “…dramatic evidence that we’re preeminent in space.”

If you need a not so recent example, consider the competition between Edison and Westinghouse in what Wikipedia titles: War of Currents.

Science has always been a mixture of personal ambition, politics, funding, etc.

That’s not to take anything away from science but a caution to remember it is and always has been a human enterprise.

Tyson’s claims for science should be questioned and judged like all other claims.

The No-Value-Add Of Academic Publishers And Peer Review

Tuesday, June 21st, 2016

Comparing Published Scientific Journal Articles to Their Pre-print Versions by Martin Klein, Peter Broadwell, Sharon E. Farb, Todd Grappone.

Abstract:

Academic publishers claim that they add value to scholarly communications by coordinating reviews and contributing and enhancing text during publication. These contributions come at a considerable cost: U.S. academic libraries paid $1.7 billion for serial subscriptions in 2008 alone. Library budgets, in contrast, are flat and not able to keep pace with serial price inflation. We have investigated the publishers’ value proposition by conducting a comparative study of pre-print papers and their final published counterparts. This comparison had two working assumptions: 1) if the publishers’ argument is valid, the text of a pre-print paper should vary measurably from its corresponding final published version, and 2) by applying standard similarity measures, we should be able to detect and quantify such differences. Our analysis revealed that the text contents of the scientific papers generally changed very little from their pre-print to final published versions. These findings contribute empirical indicators to discussions of the added value of commercial publishers and therefore should influence libraries’ economic decisions regarding access to scholarly publications.

The authors performed a very detailed analysis of pre-prints, 90%–95% of which are published openly as pre-prints first, and conclude that there is no appreciable difference between the pre-prints and their final published versions.
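
The comparison the authors describe boils down to scoring how similar two versions of the same text are. A toy sketch, using Python’s standard-library difflib rather than the similarity measures from the paper, with two short strings standing in for full article texts:

    # Toy comparison of a "pre-print" and a "published" sentence; difflib's
    # ratio stands in for the paper's similarity measures.
    from difflib import SequenceMatcher

    preprint = "We measured the effect of X on Y in 120 subjects and found a small increase."
    published = "We measured the effect of X on Y in 120 subjects and observed a small increase."

    ratio = SequenceMatcher(None, preprint, published).ratio()
    print(f"similarity: {ratio:.3f}")  # values near 1.0 mean the text barely changed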

I take “…no appreciable difference…” to mean academic publishers and the peer review process, despite claims to the contrary, contribute little or no value to academic publications.

How’s that for a bargaining chip in negotiating subscription prices?

Where Has Sci-Hub Gone?

Saturday, June 18th, 2016

While I was writing about the latest EC idiocy (link tax), I was reminded of Sci-Hub.

Just checking to see if it was still alive, I tried http://sci-hub.io/.

404 by standard DNS service.

If you are having the same problem, Mike Masnick reports in Sci-Hub, The Repository Of ‘Infringing’ Academic Papers Now Available Via Telegram that you can still access Sci-Hub via Telegram.

I’m not on Telegram, yet, but that may be changing soon. 😉

BTW, while writing this update, I stumbled across: The New Napster: How Sci-Hub is Blowing Up the Academic Publishing Industry by Jason Shen.

From the post:


This is obviously piracy. And Elsevier, one of the largest academic journal publishers, is furious. In 2015, the company earned $1.1 billion in profits on $2.9 billion in revenue [2] and Sci-hub directly attacks their primary business model: subscription service it sells to academic organizations who pay to get access to its journal articles. Elsevier filed a lawsuit against Sci-Hub in 2015, claiming Sci-hub is causing irreparable injury to the organization and its publishing partners.

But while Elsevier sees Sci-Hub as a major threat, for many scientists and researchers, the site is a gift from the heavens, because they feel unfairly gouged by the pricing of academic publishing. Elsevier is able to boast a lucrative 37% profit margin because of the unusual (and many might call exploitative) business model of academic publishing:

  • Scientists and academics submit their research findings to the most prestigious journal they can hope to land in, without getting any pay.
  • The journal asks leading experts in that field to review papers for quality (this is called peer-review and these experts usually aren’t paid)
  • Finally, the journal turns around and sells access to these articles back to scientists/academics via the organization-wide subscriptions at the academic institution where they work or study

There’s piracy afoot, of that I have no doubt.

Elsevier:

  • Relies on research it does not sponsor
  • Research results are submitted to it for free
  • Research is reviewed for free
  • Research is published in journals of value only because of the free contributions to them
  • Elsevier makes a 37% profit off of that free content

There is piracy but Jason fails to point to Elsevier as the pirate.

Sci-Hub/Alexandra Elbakyan is re-distributing intellectual property that was stolen by Elsevier from the academic community, for its own gain.

It’s time to bring Elsevier’s reign of terror against the academic community to an end. Support Sci-Hub in any way possible.

Volumetric Data Analysis – yt

Friday, June 17th, 2016

One of those rotating homepages:

Volumetric Data Analysis – yt

yt is a python package for analyzing and visualizing volumetric, multi-resolution data from astrophysical simulations, radio telescopes, and a burgeoning interdisciplinary community.

Quantitative Analysis and Visualization

yt is more than a visualization package: it is a tool to seamlessly handle simulation output files to make analysis simple. yt can easily knit together volumetric data to investigate phase-space distributions, averages, line integrals, streamline queries, region selection, halo finding, contour identification, surface extraction and more.

Many formats, one language

yt aims to provide a simple uniform way of handling volumetric data, regardless of where it is generated. yt currently supports FLASH, Enzo, Boxlib, Athena, arbitrary volumes, Gadget, Tipsy, ART, RAMSES and MOAB. If your data isn’t already supported, why not add it?

From the non-rotating part of the homepage:

To get started using yt to explore data, we provide resources including documentation, workshop material, and even a fully-executable quick start guide demonstrating many of yt’s capabilities.

But if you just want to dive in and start using yt, we have a long list of recipes demonstrating how to do various tasks in yt. We even have sample datasets from all of our supported codes on which you can test these recipes. While yt should just work with your data, here are some instructions on loading in datasets from our supported codes and formats.

Professional astronomical data and tools like yt put exploration of the universe at your fingertips!
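
To see how little code the “many formats, one language” promise involves, here is a minimal sketch; the dataset path refers to one of yt’s downloadable sample datasets and is an assumption, so point yt.load() at your own simulation output instead:

    # Minimal yt sketch: load a simulation output and save a density slice.
    # "IsolatedGalaxy/galaxy0030/galaxy0030" is assumed to be a locally
    # downloaded yt sample dataset; any supported format can be passed to load().
    import yt

    ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
    slc = yt.SlicePlot(ds, "z", ("gas", "density"))  # slice through the z midplane
    slc.save("galaxy_density_slice.png")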

Enjoy!

Ten Simple Rules for Effective Statistical Practice

Sunday, June 12th, 2016

Ten Simple Rules for Effective Statistical Practice by Robert E. Kass, Brian S. Caffo, Marie Davidian, Xiao-Li Meng, Bin Yu, Nancy Reid (Citation: Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, Reid N (2016) Ten Simple Rules for Effective Statistical Practice. PLoS Comput Biol 12(6): e1004961. doi:10.1371/journal.pcbi.1004961)

From the post:

Several months ago, Phil Bourne, the initiator and frequent author of the wildly successful and incredibly useful “Ten Simple Rules” series, suggested that some statisticians put together a Ten Simple Rules article related to statistics. (One of the rules for writing a PLOS Ten Simple Rules article is to be Phil Bourne [1]. In lieu of that, we hope effusive praise for Phil will suffice.)

I started to copy out the “ten simple rules,” sans the commentary, but that would be a disservice to my readers.

Nodding past a ten bullet point listing isn’t going to make your statistics more effective.

Re-write the commentary on all ten rules to apply them to your own projects. Focusing the rules on your work will yield specific advice and examples for your field.

Who knows? Perhaps you will be writing a ten simple rule article in your specific field, sans Phil Bourne as a co-author. (Do be sure and cite Phil.)

PS: For the curious: Ten Simple Rules for Writing a PLOS Ten Simple Rules Article by Harriet Dashnow, Andrew Lonsdale, Philip E. Bourne.

Software Carpentry Bug BBQ (June 13th, 2016)

Sunday, June 5th, 2016

Software Carpentry Bug BBQ

From the post:

Software Carpentry is having a Bug BBQ on June 13th

Software Carpentry is aiming to ship a new version (5.4) of the Software Carpentry lessons by the end of June. To help get us over the finish line we are having a Bug BBQ on June 13th to squash as many bugs as we can before we publish the lessons. The June 13th Bug BBQ is also an opportunity for you to engage with our world-wide community. For more info about the event, read-on and visit our Bug BBQ website.

How can you participate? We’re asking you, members of the Software Carpentry community, to spend a few hours on June 13th to wrap up outstanding tasks to improve the lessons. Ahead of the event, the lesson maintainers will be creating milestones to identify all the issues and pull requests that need to be resolved before we wrap up version 5.4. In addition to specific fixes laid out in the milestones, we also need help to proofread and bugtest the lessons.

Where will this be? Join in from where you are: No need to go anywhere – if you’d like to participate remotely, start by having a look at the milestones on the website to see what tasks are still open, and send a pull request with your ideas to the corresponding repo. If you’d like to get together with other people working on these lessons live, we have created this map for live sites that are being organized. And if there’s no site listed near you, organize one yourself and let us know you are doing that here so that we can add your site to the map!

The Bug BBQ is going to be a great chance to get the community together, get our latest lessons over the finish line, and wrap up a product that gives you and all our contributors credit for your hard work with a citable object – we will be minting a DOI for this on publication.

A community BBQ that is open to everyone, dietary restrictions or not!

And the organizers have removed distance as a consideration for “attending.”

For those of us on non-BBQ diets, a unique opportunity to participate with others in the community for a worthy cause.

Mark your calendars today!

Reproducible Research Resources for Research(ing) Parasites

Friday, June 3rd, 2016

Reproducible Research Resources for Research(ing) Parasites by Scott Edmunds.

From the post:

Two new research papers on scabies and tapeworms published today showcase a new collaboration with protocols.io. This demonstrates a new way to share scientific methods that allows scientists to better repeat and build upon these complicated studies on difficult-to-study parasites. It also highlights a new means of writing all research papers with citable methods that can be updated over time.

While there has been recent controversy (and hashtags in response) from some of the more conservative sections of the medical community calling those who use or build on previous data “research parasites”, as data publishers we strongly disagree with this. And also feel it is unfair to drag parasites into this when they can teach us a thing or two about good research practice. Parasitology remains a complex field given the often extreme differences between parasites, which all fall under the umbrella definition of an organism that lives in or on another organism (host) and derives nutrients at the host’s expense. Published today in GigaScience are articles on two parasitic organisms, scabies and on the tapeworm Schistocephalus solidus. Not only are both papers in parasitology, but the way in which these studies are presented showcase a new collaboration with protocols.io that provides a unique means for reporting the Methods that serves to improve reproducibility. Here the authors take advantage of their open access repository of scientific methods and a collaborative protocol-centered platform, and we for the first time have integrated this into our submission, review and publication process. We now also have a groups page on the portal where our methods can be stored.

A great example of how sharing data advances research.

Of course, that assumes that one of your goals is to advance research and not solely yourself, your funding and/or your department.

Such self-centered as opposed to research-centered individuals do exist, but I would not malign true parasites by describing them as such, even colloquially.

The days of science data hoarders are numbered and one can only hope that the same is true for the “gatekeepers” of humanities data, manuscripts and artifacts.

The only known contribution of hoarders or “gatekeepers” has been to retard their respective disciplines.

Given the choice of advancing your field along with yourself, or only yourself, which one will you choose?

MATISSE – Solar System Exploration

Saturday, April 30th, 2016

MATISSE: A novel tool to access, visualize and analyse data from planetary exploration missions by Angelo Zinzi, Maria Teresa Capria, Ernesto Palomba, Paolo Giommi, Lucio Angelo Antonelli.

Abstract:

The increasing number and complexity of planetary exploration space missions require new tools to access, visualize and analyse data to improve their scientific return.

ASI Science Data Center (ASDC) addresses this request with the web-tool MATISSE (Multi-purpose Advanced Tool for the Instruments of the Solar System Exploration), allowing the visualization of single observation or real-time computed high-order products, directly projected on the three-dimensional model of the selected target body.

Using MATISSE it will be no longer needed to download huge quantity of data or to write down a specific code for every instrument analysed, greatly encouraging studies based on joint analysis of different datasets.

In addition the extremely high-resolution output, to be used offline with a Python-based free software, together with the files to be read with specific GIS software, makes it a valuable tool to further process the data at the best spatial accuracy available.

MATISSE modular structure permits addition of new missions or tasks and, thanks to dedicated future developments, it would be possible to make it compliant to the Planetary Virtual Observatory standards currently under definition. In this context the recent development of an interface to the NASA ODE REST API by which it is possible to access to public repositories is set.

Continuing a long tradition of making big data and tools for processing big data freely available online (hint, hint, Panama Papers hoarders), this paper describes MATISSE (Multi-purpose Advanced Tool for the Instruments of the Solar System Exploration), which you can find online at:

http://tools.asdc.asi.it/matisse.jsp

Data currently available:

MATISSE currently ingests both public and proprietary data from 4 missions (ESA Rosetta, NASA Dawn, Chinese Chang’e-1 and Chang’e-2), 4 targets (4 Vesta, 21 Lutetia, 67P ChuryumovGerasimenko, the Moon) and 6 instruments (GIADA, OSIRIS, VIRTIS-M, all onboard Rosetta, VIR onboard Dawn, elemental abundance maps from Gamma Ray Spectrometer, Digital Elevation Models by Laser Altimeter and Digital Ortophoto by CCD Camera from Chang’e-1 and Chang’e-2).

If those names don’t sound familiar (links to mission pages):

4 Vesta – asteroid (NASA)

21 Lutetia – asteroid (ESA)

67P Churyumov-Gerasimenko – comet (ESA)

the Moon – As in “our” moon.

You can do professional level research on extra-worldly data, but with worldly data (Panama Papers), not so much. Don’t be deceived by the forthcoming May 9th dribble of corporate data from the Panama Papers. Without the details contained in the documents, it’s little more than a suspect’s list.

Loading the Galaxy Network of the “Cosmic Web” into Neo4j

Saturday, April 23rd, 2016

Loading the Galaxy Network of the “Cosmic Web” into Neo4j by Michael Hunger.

Cypher script for loading “Cosmic Web” into Neo4j.

You remember “Cosmic Web:”

[Image: Cosmic Web full visualization by Kim Albrecht]

Enjoy!

300 Terabytes of Raw Collider Data

Saturday, April 23rd, 2016

CERN Just Dropped 300 Terabytes of Raw Collider Data to the Internet by Andrew Liptak.

From the post:

Yesterday, the European Organization for Nuclear Research (CERN) dropped a staggering amount of raw data from the Large Hadron Collider on the internet for anyone to use: 300 terabytes worth.

The data includes 100 TB “of data from proton collisions at 7 TeV, making up half the data collected at the LHC by the CMS detector in 2011.” The release follows another infodump from 2014, and you can take a look at all of this information through the CERN Open Data Portal. Some of the information released is simply the raw data that CERN’s own scientists have been using, while another segment is already processed, with the anticipated audience being high school science courses.

It’s not the same as having your own cyclotron in the backyard with a bubble chamber, but it’s the next best thing!

If you have been looking for “big data” to stretch your limits, this fits the bill nicely.

Peer Review Fails, Again.

Saturday, April 23rd, 2016

One in 25 papers contains inappropriately duplicated images, screen finds by Cat Ferguson.

From the post:

Elisabeth Bik, a microbiologist at Stanford, has for years been a behind-the-scenes force in scientific integrity, anonymously submitting reports on plagiarism and image duplication to journal editors. Now, she’s ready to come out of the shadows.

With the help of two editors at microbiology journals, she has conducted a massive study looking for image duplication and manipulation in 20,621 published papers. Bik and co-authors Arturo Casadevall and Ferric Fang (a board member of our parent organization) found 782 instances of inappropriate image duplication, including 196 published papers containing “duplicated figures with alteration.” The study is being released as a pre-print on bioArxiv.

I don’t know which is the sadder news about this paper: that three (3) journals have so far refused to publish it, or that the peer reviewers of the original papers missed the duplication.

Journals are in the business of publishing, not in the business of publishing correct results, so their refusal to publish an article that establishes the poor quality of those publications is perhaps understandable. Not acceptable, but understandable.

Unless the joke is on the reading public and other researchers: publications are just that, publications. They may or may not resemble any experiment or experience that can be duplicated by others. Rely on published results at your own peril.

Transparent access to all data and not peer review is the only path to solving this problem.

Laypersons vs. Scientists – “…laypersons may be prone to biases…”

Saturday, March 12th, 2016

The “distinction” between laypersons and scientists is more a matter of world view about particular topics than of “all scientists are rational” versus “all laypersons are irrational.” Scientists and laypersons can be equally rational and/or irrational, depending upon the topic at hand.

Having said that, The effects of social identity threat and social identity affirmation on laypersons’ perception of scientists by Peter Nauroth, et al., finds, unsurprisingly, that if a layperson’s social identity is threatened by research, they have a less favorable view of the scientists involved.

Abstract:

Public debates about socio-scientific issues (e.g. climate change or violent video games) are often accompanied by attacks on the reputation of the involved scientists. Drawing on the social identity approach, we report a minimal group experiment investigating the conditions under which scientists are perceived as non-prototypical, non-reputable, and incompetent. Results show that in-group affirming and threatening scientific findings (compared to a control condition) both alter laypersons’ evaluations of the study: in-group affirming findings lead to more positive and in-group threatening findings to more negative evaluations. However, only in-group threatening findings alter laypersons’ perceptions of the scientists who published the study: scientists were perceived as less prototypical, less reputable, and less competent when their research results imply a threat to participants’ social identity compared to a non-threat condition. Our findings add to the literature on science reception research and have implications for understanding the public engagement with science.

Perceived attacks on personal identity have negative consequences for the “reception” of science.

Implications for public engagement with science

Our findings have immediate implications for public engagement with science activities. When laypersons perceive scientists as less competent, less reputable, and not representative of the scientific community and the scientist’s opinion as deviating from the current scientific state-of-the-art, laypersons might be less willing to participate in constructive discussions (Schrodt et al., 2009). Furthermore, our mediation analysis suggests that these negative perceptions deepen the trench between scientists and laypersons concerning the current scientific state-of-the-art. We speculate that these biases might actually even lead to engagement activities to backfire: instead of developing a mutual understanding they might intensify laypersons’ misconceptions about the scientific state-of-the-art. Corroborating this hypothesis, Binder et al. (2011) demonstrated that discussions about controversial science topics may in fact polarize different groups around a priori positions. Additional preliminary support for this hypothesis can also be found in case studies about public engagement activities in controversial socio-scientific issues. Some of these reports (for two examples, see Lezaun and Soneryd, 2007) indicate problems to maintain a productive atmosphere between laypersons and experts in the discussion sessions.

Besides these practical implications, our results also add further evidence to the growing body of literature questioning the validity of the deficit model in science communication according to which people’s attitudes toward science are mainly determined by their knowledge about science (Sturgis and Allum, 2004). We demonstrated that social identity concerns profoundly influence laypersons’ perceptions and evaluations of scientific results regardless of laypersons’ knowledge. However, our results also question whether involving laypersons in policy decision processes based upon scientific evidence is reasonable in all socio-scientific issues. Particularly when the scientific evidence has potential negative consequences for social groups, our research suggests that laypersons may be prone to biases based upon their social affiliations. For example, if regular video game players were involved in decision-making processes concerning potential sales restrictions of violent video games, they would be likely to perceive scientific evidence demonstrating detrimental effects of violent video games as shoddy and the respective researchers as disreputable (Greitemeyer, 2014; Nauroth et al., 2014, 2015).(emphasis added)

The principal failure of this paper is that it does not study the scientific community itself: how scientists react to research that attacks their own identities.

I don’t think it is reading too much into the post: Academic, Not Industrial Secrecy, where one group said:

We want restrictions on who could do the analyses.

to say that attacks on personal identity lead to boorish behavior on the part of scientists.

Laypersons and scientists alike emit a never-ending stream of examples of prejudice, favoritism, sycophancy, and sloppy reasoning, to say nothing of careless and/or low-quality work.

Reception of science among laypersons might improve if the scientific community abandoned its facade of “it’s objective, it’s science.”

That facade was tiresome by WWII, and repeating it now is a disservice to the scientific community.

All of our efforts, in any field, are human endeavors and thus subject to the vagaries and uncertainties of human interaction.

Live with it.

How to read and understand a scientific paper….

Friday, February 26th, 2016

How to read and understand a scientific paper: a guide for non-scientists by Jennifer Raff.

From the post:

Last week’s post (The truth about vaccinations: Your physician knows more than the University of Google) sparked a very lively discussion, with comments from several people trying to persuade me (and the other readers) that their paper disproved everything that I’d been saying. While I encourage you to go read the comments and contribute your own, here I want to focus on the much larger issue that this debate raised: what constitutes scientific authority?

It’s not just a fun academic problem. Getting the science wrong has very real consequences. For example, when a community doesn’t vaccinate children because they’re afraid of “toxins” and think that prayer (or diet, exercise, and “clean living”) is enough to prevent infection, outbreaks happen.

“Be skeptical. But when you get proof, accept proof.” –Michael Specter

What constitutes enough proof? Obviously everyone has a different answer to that question. But to form a truly educated opinion on a scientific subject, you need to become familiar with current research in that field. And to do that, you have to read the “primary research literature” (often just called “the literature”). You might have tried to read scientific papers before and been frustrated by the dense, stilted writing and the unfamiliar jargon. I remember feeling this way! Reading and understanding research papers is a skill which every single doctor and scientist has had to learn during graduate school. You can learn it too, but like any skill it takes patience and practice.

I want to help people become more scientifically literate, so I wrote this guide for how a layperson can approach reading and understanding a scientific research paper. It’s appropriate for someone who has no background whatsoever in science or medicine, and based on the assumption that he or she is doing this for the purpose of getting a basic understanding of a paper and deciding whether or not it’s a reputable study.

Copy each of Jennifer’s steps into a notebook as you follow them, along with your results from applying them. That will not only help you remember the rules but also capture your understanding of the paper.

BTW, there is also a fully worked example of applying these rules to a vaccine safety study.

Compare this post to Keshav’s How to Read a Paper.

Their techniques vary but both lead to a greater understanding of any paper you read.

Is Failing to Attempt to Replicate, “Just Part of the Whole Science Deal”?

Tuesday, February 16th, 2016

Genomeweb posted this summary of Stuart Firestein’s op-ed on failure to replicate:

Failure to replicate experiments is just part of the scientific process, writes Stuart Firestein, author and former chair of the biology department at Columbia University, in the Los Angeles Times. The recent worries over a reproducibility crisis in science are overblown, he adds.

“Science would be in a crisis if it weren’t failing most of the time,” Firestein writes. “Science is full of wrong turns, unconsidered outcomes, omissions and, of course, occasional facts.”

Failures to repeat experiments and the struggle to figure out what went wrong has also fed a number of discoveries, he says. For instance, in 1921, biologist Otto Loewi studied beating hearts from frogs in saline baths, one with the vagus nerve removed and one with it still intact. When the solution from the heart with the nerve still there was added to the other bath, that heart also slowed, suggesting that the nerve secreted a chemical that slowed the contractions.

However, Firestein notes Loewi and other researchers had trouble replicating the results for nearly six years. But that led the researchers to find that seasons can affect physiology and that temperature can affect enzyme function: Loewi’s first experiment was conducted at night and in the winter, while the follow-up ones were done during the day in heated buildings or on warmer days. This, he adds, also contributed to the understanding of how synapses fire, a finding for which Loewi shared the 1936 Nobel Prize.

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

You will need to read Firestein’s comments in full: just part of the scientific process, to appreciate my concerns.

For example, Firestein says:


Absolutely not. Science is doing what it always has done — failing at a reasonable rate and being corrected. Replication should never be 100%. Science works beyond the edge of what is known, using new, complex and untested techniques. It should surprise no one that things occasionally come out wrong, even though everything looks correct at first.

I don’t know, would you say an 85% failure-to-replicate rate is significant? (Drug development: Raise standards for preclinical cancer research, C. Glenn Begley & Lee M. Ellis, Nature 483, 531–533 (29 March 2012), doi:10.1038/483531a.) Or over half of psychology studies? (Over half of psychology studies fail reproducibility test.) Just to name two studies on replication.

I think we can agree with Firestein that replication isn’t at 100% but at what level are the attempts to replicate?

From what Firestein says,

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

Systematic attempts at replication (and its failure) should be part and parcel of science.

Except…, that it’s obviously not.

If it were, there would have been no earth shaking announcements that fundamental cancer research experiments could not be replicated.

Failures to replicate would have been spread out over the literature and gradually resolved with better data, better methods, or both.

Failure to replicate is a legitimate part of the scientific method.

Not attempting to replicate, “I won’t look too closely at your results if you don’t look too closely at mine,” isn’t.

There’s an ugly word for avoiding looking too closely at your own results or those of others.

You Can Confirm A Gravity Wave!

Saturday, February 13th, 2016

Unless you have been unconscious since last Wednesday, you have heard about the confirmation of Einstein’s 1916 prediction of gravitational waves.

A very incomplete list of popular reports includes:

Einstein, A Hunch And Decades Of Work: How Scientists Found Gravitational Waves (NPR)

Einstein’s gravitational waves ‘seen’ from black holes (BBC)

Gravitational Waves Detected, Confirming Einstein’s Theory (NYT)

Gravitational waves: breakthrough discovery after a century of expectation (Guardian)

For the full monty, see the LIGO Scientific Collaboration itself.

Which brings us to the iPython notebook with the gravitational wave discovery data: Signal Processing with GW150914 Open Data

From the post:

Welcome! This ipython notebook (or associated python script GW150914_tutorial.py ) will go through some typical signal processing tasks on strain time-series data associated with the LIGO GW150914 data release from the LIGO Open Science Center (LOSC):

To begin, download the ipython notebook, readligo.py, and the data files listed below, into a directory / folder, then run it. Or you can run the python script GW150914_tutorial.py. You will need the python packages: numpy, scipy, matplotlib, h5py.

On Windows, or if you prefer, you can use a python development environment such as Anaconda (https://www.continuum.io/why-anaconda) or Enthought Canopy (https://www.enthought.com/products/canopy/).

Questions, comments, suggestions, corrections, etc: email losc@ligo.org

v20160208b

Unlike the toadies at the New England Journal of Medicine (see Parasitic Re-use of Data? Institutionalizing Toadyism, Addressing The Concerns Of The Selfish), the scientists who have labored for decades on the gravitational wave question are giving their data away for free!

Not only giving the data away, but striving to help others learn to use it!

Beyond simply “doing the right thing,” and setting an example for other scientists, this is a great opportunity to learn more about signal processing.

Signal processing, when you stop to think about it, is an important method of “subject identification” in a large number of domains.

Detecting a gravity wave is beyond your personal means, but with the data freely available, further analysis is a matter of interest and perseverance.
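
As a taste of what the tutorial covers, here is a minimal sketch of one early step: band-passing the strain time series to the range where the GW150914 chirp lives. The HDF5 file name and dataset path are assumptions about the LOSC data files; substitute the files you actually downloaded (or use the readligo.py helper from the tutorial):

    # Band-pass LIGO strain data to roughly 35-350 Hz with a Butterworth filter.
    # The file name and HDF5 dataset path are assumptions about the LOSC release;
    # adjust them to the files you downloaded.
    import h5py
    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 4096  # sample rate (Hz) of the 4 kHz strain files

    with h5py.File("H-H1_LOSC_4_V1-1126259446-32.hdf5", "r") as f:  # hypothetical name
        strain = f["strain/Strain"][...]  # assumed dataset path

    nyq = 0.5 * fs
    b, a = butter(4, [35.0 / nyq, 350.0 / nyq], btype="band")
    strain_bp = filtfilt(b, a, strain)

    duration = len(strain) / fs
    print(f"duration: {duration:.1f} s, peak band-passed strain: {np.max(np.abs(strain_bp)):.3e}")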

Are You A Scientific Twitter User or Polluter?

Saturday, February 6th, 2016

Realscientists posted this image to Twitter:

[Image: science]

Self-Scoring Test:

In the last week, how often have you retweeted without “read[ing] the actual paper” pointed to by a tweet?

How many times did you retweet in total?

Formula: retweets w/o reading / retweets in total = % of retweets w/o reading.

No scale with superlatives because I don’t have numbers to establish a baseline for the “average” Twitter user.
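
If you want to be literal about it, the arithmetic is a one-liner. A trivial sketch (the counts below are made up):

    # Hypothetical counts for one week of Twitter activity.
    retweets_without_reading = 12
    retweets_total = 20

    share_unread = retweets_without_reading / retweets_total
    print(f"{share_unread:.0%} of retweets made without reading the paper")  # prints 60%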

I do know that I see click-bait, out-dated and factually wrong material retweeted by people who know better. That’s Twitter pollution.

Ask yourself: Am I a scientific Twitter user or a polluter?

Your call.

Another Victory For Peer Review – NOT! Cowardly Science

Wednesday, January 27th, 2016

Pressure on controversial nanoparticle paper builds by Anthony King.

From the post:

The journal Science has posted an expression of concern over a controversial 2004 paper on the synthesis of palladium nanoparticles, highlighting serious problems with the work. This follows an investigation by the US funding body the National Science Foundation (NSF), which decided that the authors had falsified research data in the paper, which reported that crystalline palladium nanoparticle growth could be mediated by RNA.1 The NSF’s 2013 report on the issue, and a letter of reprimand from May last year, were recently brought into the open by a newspaper article.

The chief operating officer of the NSF identified ‘an absence of care, if not sloppiness, and most certainly a departure from accepted practices’. Recommended actions included sending letters of reprimand, requiring the subjects contact the journal to make a correction and barring the two chemists from serving as a peer reviewer, adviser or consultant for the NSF for three years.

Science notes that, though the ‘NSF did not find that the authors’ actions constituted misconduct, it nonetheless concluded that there “were significant departures from research practice”.’ The NSF report noted it would no longer fund the paper’s senior authors chemists Daniel Feldheim and Bruce Eaton at the University of Colorado, Boulder, who ‘recklessly falsified research data’, unless they ‘take specific actions to address issues’ in the 2004 paper. Science said it is working with the two authors ‘to understand their response to the NSF final ruling’.

Feldheim and Eaton have been under scrutiny since 2008, when an investigation by their former employer North Carolina State University, US, concluded the 2004 paper contained falsified data. According to Retraction Watch, Science said it would retract the paper as soon as possible.

I’m not a subscriber to Science, unfortunately, but if you are, can you write to Marcia McNutt, Editor-in-Chief, to ask why a finding of “recklessly falsified research data” merits only an expression of concern?

What’s with that? Concern?

In many parts of the United States, you can be murdered with impunity for DWB (Driving While Black), but you can falsify research data and merit only an expression of “concern” from Science?

Not to mention that the NSF doesn’t think that falsifying research evidence is “misconduct.”

The NSF needs to document what it thinks “misconduct” means. I don’t think it means what they think it means.

Every profession has bad apples, but what is amazing in this case is the public kid-glove handling of known falsifiers of evidence.

What is required for a swift and effective response against scientific misconduct?

Vivisection of human babies?

Or would that only count if they failed to have a petty cash account and to reconcile it on a monthly basis?

Apple Watches Lower Your IQ – Still Want One For Christmas?

Wednesday, November 25th, 2015

Philip Elmer-DeWitt reports Apple Watch Owners Glance at Their Wrists 60 to 80 Times a Day.

The vast majority of those uses are not to check the time.

The reports Philip summarizes say that interactions last only a few seconds, but how long does it take to break your train of thought?

That reminded me of Vanessa Loder’s post: Why Multi-Tasking Is Worse Than Marijuana For Your IQ.

From Vanessa’s post:

What makes you more stupid – smoking marijuana, emailing while talking on the phone or losing a night’s sleep?

Researchers at the Institute of Psychiatry at the University of London studied 1,100 workers at a British company and found that multitasking with electronic media caused a greater decrease in IQ than smoking pot or losing a night’s sleep.

For those of you in Colorado, this means you should put down your phone and pick up your pipe! In all seriousness, in today’s tech heavy world, the temptation to multi-task is higher than it’s ever been. And this has become a major issue. We don’t focus and we do too many things at once. We also aren’t efficient or effective when we stay seated too long.

If a colleague gives you an Apple Watch for Christmas, be very wary.

Apple is likely to complain that my meta-comparison isn’t the same as a controlled study and I have to admit, it’s not.

If Apple wants to get one hundred people together for about a month, with enough weed, beer, snack food, PS4s, plus Apple Watches, my meta-analysis can be put to the test.

The Consumer Product Safety Commission should sponsor that type of testing.

Imagine being a professional stoner. 😉

Building Software, Building Community: Lessons from the rOpenSci Project

Tuesday, November 17th, 2015

Building Software, Building Community: Lessons from the rOpenSci Project by Carl Boettiger, Scott Chamberlain, Edmund Hart, Karthik Ram.

Abstract:

rOpenSci is a developer collective originally formed in 2011 by graduate students and post-docs from ecology and evolutionary biology to collaborate on building software tools to facilitate a more open and synthetic approach in the face of the transformative rise of large and heterogeneous data. Born on the internet (the collective only began through chance discussions over social media), we have grown into a widely recognized effort that supports an ecosystem of some 45 software packages, engages scores of collaborators, has taught dozens of workshops around the world, and has secured over $480,000 in grant support. As young scientists working in an academic context largely without direct support for our efforts, we have first-hand experience with most of the technical and social challenges WSSSPE seeks to address. In this paper we provide an experience report which describes our approach and success in building an effective and diverse community.

Given the state of world affairs, I can’t think of a better time for the publication of this article.

The key lesson I urge you to draw from this paper is the project’s proactive stance: reaching out to involve people and build a community around its work.

Too many projects (and academic organizations, for that matter) assume that others know they exist, and so they sit waiting for volunteers and members to queue up.

Very often they are surprised and bitter that the queue of volunteers and members is so sparse. If anyone dares to venture that more outreach might be helpful, the response is nearly always: sure, you go do that and let us know when it is successful.

How proactive are you in promoting your favorite project?

PS: The rOpenSci website.

Statistical Reporting Errors in Psychology (1985–2013) [1 in 8]

Tuesday, October 27th, 2015

Do you remember your parents complaining about how far the latest psychology report departed from their reality?

Turns out there may be a scientific reason why those reports were as far off as your parents thought (or not).

The prevalence of statistical reporting errors in psychology (1985–2013) by Michèle B. Nuijten, Chris H. J. Hartgerink, Marcel A. L. M. van Assen, Sacha Epskamp, Jelte M. Wicherts, reports:

This study documents reporting errors in a sample of over 250,000 p-values reported in eight major psychology journals from 1985 until 2013, using the new R package “statcheck.” statcheck retrieved null-hypothesis significance testing (NHST) results from over half of the articles from this period. In line with earlier research, we found that half of all published psychology papers that use NHST contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion. In contrast to earlier findings, we found that the average prevalence of inconsistent p-values has been stable over the years or has declined. The prevalence of gross inconsistencies was higher in p-values reported as significant than in p-values reported as nonsignificant. This could indicate a systematic bias in favor of significant results. Possible solutions for the high prevalence of reporting inconsistencies could be to encourage sharing data, to let co-authors check results in a so-called “co-pilot model,” and to use statcheck to flag possible inconsistencies in one’s own manuscript or during the review process.

This is an open access article so dig in for all the details discovered by the authors.

The R package statcheck: Extract Statistics from Articles and Recompute P Values is quite amazing. The manual for statcheck should have you up and running in short order.
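statcheck is an R package, so if you live mostly in Python, here is a rough sketch of the core check it automates: recompute a two-tailed p-value from a reported test statistic and its degrees of freedom, then compare it with the p-value the paper printed. The function name, tolerance, and example numbers below are illustrative only; statcheck’s own parsing and rounding rules are more careful than this.

```python
# Not statcheck itself (that is an R package); a rough sketch of the check it
# automates: recompute a two-tailed p-value from a reported t statistic and
# degrees of freedom, then compare it with the p-value the paper printed.
from scipy import stats

def check_t_report(t_value, df, reported_p, alpha=0.05):
    """Flag inconsistent and grossly inconsistent t-test reports."""
    computed_p = 2 * stats.t.sf(abs(t_value), df)
    # Crude tolerance for a p-value reported to two decimals; statcheck's
    # rounding rules are more careful than this.
    inconsistent = abs(computed_p - reported_p) > 0.005
    gross = inconsistent and ((computed_p < alpha) != (reported_p < alpha))
    return computed_p, inconsistent, gross

# Example: a paper reports "t(28) = 2.20, p = .04"
p, bad, gross = check_t_report(2.20, 28, 0.04)
print(f"recomputed p = {p:.3f}; inconsistent: {bad}; gross: {gross}")
```

statcheck itself also handles F, r, χ² and z statistics and extracts them straight out of article text, which this sketch does not attempt.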

I did puzzle over the proposed solutions:

Possible solutions for the high prevalence of reporting inconsistencies could be to encourage sharing data, to let co-authors check results in a so-called “co-pilot model,” and to use statcheck to flag possible inconsistencies in one’s own manuscript or during the review process.

All of those are good suggestions, but we already have the much-valued process of “peer review” and the value-add of both non-profit and commercial publishers. Surely those weighty contributions to the process of review and publication should be enough to quell this “…systematic bias in favor of significant results.”

Unless, of course, dependence on “peer review” and the value-add of publishers for article quality is entirely misplaced. Yes?

What area with “p-values reported as significant” will fall to statcheck next?

Tomas Petricek on The Against Method

Tuesday, October 13th, 2015

Tomas Petricek on The Against Method by Tomas Petricek.

From the webpage:

How is computer science research done? What we take for granted and what we question? And how do theories in computer science tell us something about the real world? Those are some of the questions that may inspire computer scientist like me (and you!) to look into philosophy of science. I’ll present the work of one of the more extreme (and interesting!) philosophers of science, Paul Feyerabend. In “Against Method”, Feyerabend looks at the history of science and finds that there is no fixed scientific methodology and the only methodology that can encompass the rich history is ‘anything goes’. We see (not only computer) science as a perfect methodology for building correct knowledge, but is this really the case? To quote Feyerabend:

“Science is much more ‘sloppy’ and ‘irrational’ than its methodological image.”

I’ll be mostly talking about Paul Feyerabend’s “Against Method”, but as a computer scientist myself, I’ll insert a number of examples based on my experience with theoretical programming language research. I hope to convince you that looking at philosophy of science is very much worthwhile if we want to better understand what we do and how we do it as computer scientists!

The video runs an hour and about eighteen minutes but is worth every minute of it. As you can imagine, I was particularly taken with Tomas’ emphasis on the importance of language. Tomas goes so far as to suggest that disagreements about “type” in computer science stem from fundamentally different understandings of the word “type.”

I was reminded of Stanley Fish’s “Doing What Comes Naturally” (DWCN).

DWCN is a long and complex work, but in brief Fish argues that we are all members of various “interpretive communities,” and that each of those communities influences how we understand language as readers. That should come as assurance to those who fear intellectual anarchy and chaos, because our interpretations are always made within the context of an interpretive community.

Two caveats on Fish. As far as I know, Fish has never made the strong move of pointing out that his concept of “interpretive communities” is just as applicable to the natural sciences as it is to the social sciences. What passes as “objective” today is part and parcel of an interpretive community that has declared it so. Other interpretive communities can and do reach other conclusions.

The second caveat is more sad than useful. Post-9/11, Fish and a number of other critics who were accused of teaching the cultural relativity of values felt it necessary to distance themselves from that position. While they could not say that all cultures have the same values (factually false), they did say that Western values, as opposed to those of the “cowardly, murdering,” etc. others, were superior.

If you think there is any credibility to that post-9/11 position, you haven’t read enough Chomsky. 9/11 wasn’t 1/100,000 of the violence the United States has visited on civilians in other countries since the Korean War.