Parsing Academic Articles on Deadline

A group of researchers is trying to help science journalists parse academic articles on deadline by Joseph Lichterman.

From the post:

About 1.8 million new scientific papers are published each year, and most are of little consequence to the general public — or even read, really; one study estimates that up to half of all academic studies are only read by their authors, editors, and peer reviewers.

But the papers that are read can change our understanding of the universe — traces of water on Mars! — or impact our lives here on earth — sea levels rising! — and when journalists get called upon to cover these stories, they’re often thrown into complex topics without much background or understanding of the research that led to the breakthrough.

As a result, a group of researchers at Columbia and Stanford are in the process of developing Science Surveyor, a tool that algorithmically helps journalists get important context when reporting on scientific papers.

“The idea occurred to me that you could characterize the wealth of scientific literature around the topic of a new paper, and if you could do that in a way that showed the patterns in funding, or the temporal patterns of publishing in that field, or whether this new finding fit with the overall consensus with the field — or even if you could just generate images that show images very rapidly what the huge wealth, in millions of articles, in that field have shown — [journalists] could very quickly ask much better questions on deadline, and would be freed to see things in a larger context,” Columbia journalism professor Marguerite Holloway, who is leading the Science Surveyor effort, told me.

Science Surveyor is still being developed, but broadly the idea is that the tool takes the text of an academic paper and searches academic databases for other studies using similar terms. The algorithm will surface relevant articles and show how scientific thinking has changed through its use of language.

For example, look at the evolving research around neurogenesis, or the growth of new brain cells. Neurogenesis occurs primarily while babies are still in the womb, but it continues through adulthood in certain sections of the brain.

Up until a few decades ago, researchers generally thought that neurogenesis didn’t occur in humans — you had a set number of brain cells, and that’s it. But since then, research has shown that neurogenesis does in fact occur in humans.

“This tells you — aha! — this discovery is not an entirely new discovery,” Columbia professor Dennis Tenen, one of the researchers behind Science Surveyor, told me. “There was a period of activity in the ’70s, and now there is a second period of activity today. We hope to produce this interactive visualization, where given a paper on neurogenesis, you can kind of see other related papers on neurogenesis to give you the context for the story you’re telling.”

Given the number of papers published every year, an algorithmic approach like Science Surveyor is an absolute necessity.

But imagine how much richer the results would be if one of the three or four people who actually read the paper could easily link it to other research and context?

Or perhaps being a researcher who discovers the article and then blazes a trail to non-obvious literature that is also relevant?

Search engines now capture what choices users make in the links they follow but that’s a fairly crude approximation of relevancy of a particular resource. Such as not specifying why a particular resource is relevant.

Usage of literature should decide which articles merit greater attention from machine or human annotators. The last amount of humanities literature is never cited by anyone. Why spend resources annotating content that no one is likely to read?

Comments are closed.