Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 27, 2015

Data journalism: How to find stories in numbers

Filed under: Journalism,News,Reporting — Patrick Durusau @ 12:29 pm

Data journalism: How to find stories in numbers by Sandra Crucianelli.

From the post:

Colleagues often ask me what data journalism is. They’re confused by why it needs its own name — don’t all journalists use data?

The term is shorthand for ‘database journalism’ or ‘data-driven journalism’, where journalists find stories, or angles for stories, within large volumes of data.

It overlaps with investigative journalism in requiring lots of research, sometimes against people’s wishes. It can also overlap with data visualisation, as it requires close collaboration between journalists and digital specialists to find the best ways of presenting data.

So why get involved with spreadsheets and visualisation tools? At its most basic, adding data can give a story a new, factual dimension. But delving into datasets can also reveal new stories, or new aspects to them, that may not have otherwise surfaced.

Data journalism can also sometimes tell complicated stories more easily or clearly than relying on words alone — so it’s particularly useful for science journalists.

It can seem daunting if you’re trained in print or broadcast media. But I’ll introduce you to some new skills, and show you some excellent digital tools, so you too can soon find your feet as a data journalist.

Sandra gives as good an introduction to data journalism as you are likely to find. Her post covers everything from finding story ideas and researching relevant data to processing data and, of course, presenting your findings persuasively.

A must-read for beginning journalists, but also for anyone needing an introduction to examining the data that supports a story (or doesn't).

February 26, 2015

Gregor Aisch – Information Visualization, Data Journalism and Interactive Graphics

Filed under: Journalism,News,Reporting,Visualization — Patrick Durusau @ 8:04 pm

Gregor has two sites that I wanted to bring to your attention on information visualization, data journalism and interactive graphics.

The first one, driven-by-data.net, collects graphics from New York Times stories created by Gregor and others. Impressive graphics. If you are looking for visualization ideas, it's not a bad place to stop.

The second, Vis4.net, is a blog that features Gregor's work. But it is more than a blog; if you choose the navigation links at the top of the page:

Color – Posts on color.

Code – Posts focused on code.

Cartography – Posts on cartography.

Advice – Advice (not for the lovelorn).

Archive – Archive of his posts.

Rather than a long list of categories (ahem), Gregor has divided his material into divisions that are easy to recognize and use.

Always nice when you see a professional at work!

Enjoy!

February 21, 2015

This is Visual Journalism [100]

Filed under: Graphics,Infographics,Journalism,News — Patrick Durusau @ 3:31 pm

This is Visual Journalism [100] by Tiago Veloso.

From the post:

Edition number one hundred of our round up of infographics from the print industry, and the selection we pulled together today is a perfect celebration – after all, we have dozens of new works from newsrooms all over the world, making this one of the biggest selections published on Visualoop so far.

And in less than a month, we’ll be covering the 23rd edition of Malofiej Awards – the world’s main stage for journalistic infographics. We’ve actually begun our coverage, with two great posts: our friend Marco Vergotti, infographic editor of Época magazine, made this special infographic about last year’s Malofiej Awards; and this exclusive interview with the main responsible for the success of the event, the Spanish journalist Javier Errea. If you missed these posts, we definitively recommend you to read them.

I count fifty-four (54) stunning infographics from print publications.

Before you skip these as “just print infographics” remember that print infographics can’t rely on interaction with a user.

They either capture the attention of a reader or fail, usually miserably.

Which of these capture your attention? How would you duplicate that in a more forgiving digital environment?

PS: If you can’t capture and hold a user’s attention, the quality or capabilities of your software aren’t going to have an opportunity to shine.

February 10, 2015

Avoiding Civilian Casualties – Don’t Look, Don’t Tell

Filed under: Government,Journalism,News,Reporting — Patrick Durusau @ 4:35 pm

Statistics are a bloodless way to tell the story of a war but in the absence of territory to claim (or claim/reclaim as was the case in Vietnam) and lacking an independent press to document the war, there isn’t much else to report. But even the simple statistics are twisted.

In the “war” against ISIS (unauthorized, Obama soon to ask Congress for ISIS war authority), Nancy A. Youssef reports in U.S. Won’t Admit to Killing a Single Civilian in the ISIS War:

Five months and 1,800-plus strikes into the U.S. air campaign against ISIS, and not a single civilian has been killed, officially. But Pentagon officials concede that they really have no way of telling for sure who has died in their attacks‚—and admit that no one will ever know how many have been slain.

A free and independent press reported the My Lai Massacre, which was only one of an untold number of atrocities against civilians in Vietnam. The current generation of “journalists” drinks the military’s Kool-Aid with no effort to document the impact of its airstrikes. Instead of bemoaning the lack of independent reports, the media should be the origin of independent reports.

How do you square the sheepish admission from the Pentagon that it doesn’t know who has died in its attacks with statements by the U.S. Ambassador to Iraq, Stuart Jones, claiming that 6,000 ISIS fighters and half their leadership have been killed?

That sounds administration top-heavy if one out of every two fighters is a leader. Inside the Beltway in D.C. is the only known location with that ratio of “leaders” to “followers.” But realistically, if the Pentagon had those sorts of numbers, it would be repeating them in every daily briefing. Yes? Unless you think Ambassador Jones has a crystal ball, the most likely case is that those numbers are fictional.

I don’t doubt there have been civilian casualties. In war there are always civilian casualties. What troubles me is the don’t look, don’t tell position of the U.S. military command in order to make war palatable to a media savvy public.

War should be unpalatable. It should be presented in its full gory details, videos of troops bleeding out on the battlefield, burials, families torn apart, women, children and men killed in ways we don’t want to imagine, all of the aspects that make it unpalatable.

If nothing else, it will sharpen the debate on war powers in Congress because then the issue won’t be model towns and cars but people who are dying before our very eyes on social media. How many more lives will we take to save the Arab world from Arabs?

February 7, 2015

Geojournalism.org

Filed under: Geographic Data,Geography,Geospatial Data,Journalism,Mapping,Maps — Patrick Durusau @ 3:05 pm

Geojournalism.org

From the webpage:

Geojournalism.org provides online resources and training for journalists, designers and developers to dive into the world of data visualization using geographic data.

From the about page:

Geojournalism.org is made for:

Journalists

Reporters, editors and other professionals involved on the noble mission of producing relevant news for their audiences can use Geojournalism.org to produce multimedia stories or simple maps and data visualization to help creating context for complex environmental issues

Developers

Programmers and geeks using a wide variety of languages and tools can drink on the vast knowledge of our contributors. Some of our tutorials explore open source libraries to make maps, infographics or simply deal with large geographical datasets

Designers

Graphic designers and experts on data visualizations find in the Geojournalism.org platform a large amount of resources and tips. They can, for example, improve their knowledge on the right options for coloring maps or how to set up simple charts to depict issues such as deforestation and climate change

It is one thing to have an idea or even a story and quite another to communicate it effectively to a large audience. Geojournalism is designed as a community site that will help you communicate geophysical data to a non-technical audience.

I think it is clear that most governments are shy about accurate and timely communication with their citizens. Are you going to be one of those who fills in the gaps? Geojournalism.org is definitely a site you will be needing.

February 1, 2015

DJA Newsletter [If you can’t see the data, it’s not news, just rumor.]

Filed under: Journalism,News,Reporting — Patrick Durusau @ 11:04 am

DJA Newsletter: The best of Data Journalism every month

From the about page:

The Global Editors Network is a cross-platform community of editors-in-chief and media innovators committed to sustainable, high-quality journalism, empowering newsrooms through a variety of programmes designed to inspire, connect and share. Our online community allows media innovators from around the world to interact and collaborate on projects created through our programmes. The GEN Board Members support this mission and have signed the GEN Manifesto.

We are driven by a journalistic imperative and a common goal: Content and Engagement First. To that end, we support all kinds of organizations and media outlets, to define a vision for the future of journalism and enhance its quality through innovation and cooperation. Freedom of information and independence of the news media are, and will remain, the main credo of the Global Editors Network and we will back all efforts to enhance press freedom worldwide.

The links in this month’s newsletter:

  1. Every active satellite orbiting earth
  2. Islam in Europe – the gap between perceptions and reality
  3. What news sources does China block?
  4. What happens when you scrape AirBnB data?
  5. RiseUp revolutions

Looking forward to seeing more issues of the DJA newsletter!

July 10, 2014

Data Journalism: Overpromised/Underdelivered?

Filed under: Journalism,News,Reporting — Patrick Durusau @ 3:02 pm

Data journalism needs to up its own standards by Alberto Cairo.

From the post:

Did you know that wearing a helmet when riding a bike may be bad for you? Or that it’s possible to infer the rise of kidnappings in Nigeria from news reports? Or that we can predict the year when a majority of Americans will reject the death penalty (hint: 2044)? Or that it’s possible to see that healthcare prices in the U.S. are “insane” in 15 simple charts? Or that the 2015 El Niño event may increase the percentage of Americans who accept climate change as a reality?

But I have to confess my disappointment with the new wave of data journalism — at least for now. All the questions in the first paragraph are malarkey. Those stories may not be representative of everything that FiveThirtyEight, Vox, or The Upshot are publishing — I haven’t gathered a proper sample — but they suggest that, when you pay close attention at what they do, it’s possible to notice worrying cracks that may undermine their own core principles.

In my present interpretation of his examples, Alberto has good reason to complain.

But that doesn’t mean a re-cast version of any of those stories would be closer to some “truth.” Rather, it would be closer to my norms for such stories. Which isn’t the same thing.

Or as Nietzsche would say: There are no facts, only interpretations.

People from presidents on down lay claim to “facts.” Your opponents can be pilloried for ignoring “facts.” Current mis-adventures in domestic and foreign security of the United States are predicated on emotional insecurities packaged as “facts.”

Acknowledging Nietzsche puts all “facts” on an even footing.

Enough diverse “facts” and it is harder to agree to spend $Trillions pursuing a security that is pushed further away with every dollar spent.

Visual Journalism Training Resources

Filed under: Journalism,News,Reporting — Patrick Durusau @ 10:48 am

BBC Opens Up Internal Visual Journalism Training Resources to the Public by Gannon Burgett.

From the post:

Last week, the BBC College of Journalism opened up their training website to the public. Full of educational resources created by and for the internal BBC team, these professional videos and guides run through a number of circumstances and suggestions for approaching visual journalism.

Set to be open for a 12 month trial run, the videos and podcasts cover topics that range from safety when harmed in the field, to iPhone photojournalism, to basic three-point lighting techniques and even videos that show you how to properly use satellite phones when capturing stories in unconventional areas.

A rather extraordinary set of resources!

Should give you a window into how the BBC views news reporting as well as the tools for news reporting on your own.

To see all the resources, see the BBC Academy page.

I first saw this in a tweet by Michael Peter Edson.

May 23, 2014

Overview (new release)

Filed under: Journalism,News,Reporting — Patrick Durusau @ 6:57 pm

Overview (new release)

A new version of Overview was released last Monday. The GitHub page lists the following new features:

  • Overview will reserve less memory on Windows with 32-bit Java. That means it won’t support larger document sets; it also means it won’t crash on startup for some users.
  • Overview now starts in “single-user mode”. You won’t be prompted for a username or password.
  • Overview will automatically open up a browser window to http://localhost:9000 when it’s ready to go.
  • You can export huge document sets without running out of memory.

Installation and upgrade instructions: https://github.com/overview/overview-server/wiki/Installing-and-Running-Overview

For more details on how Overview supports “document-driven journalism,” see the Overview Project homepage.

May 9, 2014

The Data Journalism Handbook

Filed under: Journalism,News,Reporting — Patrick Durusau @ 6:41 pm

The Data Journalism Handbook edited by Jonathan Gray, Liliana Bounegru and Lucy Chambers.

From the webpage:

The Data Journalism Handbook is a free, open source reference book for anyone interested in the emerging field of data journalism.

It was born at a 48 hour workshop at MozFest 2011 in London. It subsequently spilled over into an international, collaborative effort involving dozens of data journalism’s leading advocates and best practitioners – including from the Australian Broadcasting Corporation, the BBC, the Chicago Tribune, Deutsche Welle, the Guardian, the Financial Times, Helsingin Sanomat, La Nacion, the New York Times, ProPublica, the Washington Post, the Texas Tribune, Verdens Gang, Wales Online, Zeit Online and many others.

A practical tome, it is available in English, Russian, French, German and Georgian.

A very useful and highly entertaining read.

Enjoy and recommend it to others!

March 12, 2014

“The Upshot”

Filed under: Journalism,News,Reporting — Patrick Durusau @ 8:03 pm

“The Upshot” is the New York Times’ replacement for Nate Silver’s FiveThirtyEight by John McDuling.

From the post:

“The Upshot.” That’s the name the New York Times is giving to its new data-driven venture, focused on politics, policy and economic analysis and designed to fill the void left by Nate Silver, the one-man traffic machine whose statistical approach to political reporting was a massive success.

David Leonhardt, the Times’ former Washington bureau chief, who is in charge of The Upshot, told Quartz that the new venture will have a dedicated staff of 15, including three full-time graphic journalists, and is on track for a launch this spring. “The idea behind the name is, we are trying to help readers get to the essence of issues and understand them in a contextual and conversational way,” Leonhardt says. “Obviously, we will be using data a lot to do that, not because data is some secret code, but because it’s a particularly effective way, when used in moderate doses, of explaining reality to people.”

The New York Times’ own public editor admitted that Silver, a onetime baseball stats geek, never really fit into the paper’s culture, and that “a number of traditional and well-respected Times journalists disliked his work.” But Leonhardt says being part of the Times is an “enormous advantage” for The Upshot. “The Times is in an extremely strong position digitally. We are going to be very much a Times product. Having said that, we are not going to do stuff the same way the Times does.” The tone, he said, will be more like having “a journalist sitting next to you, or sending you an email.”

I really like the New York Times for its long tradition of excellence in news gathering. Couple that with technologies that connect the dots across its staff’s collective insights and it would be a formidable enterprise.

November 5, 2013

Statistics + Journalism = Data Journalism ?

Filed under: Journalism,News — Patrick Durusau @ 1:37 pm

Statistics + Journalism = Data Journalism ? by Armin Grossenbacher.

From the post:

Statistics+journalism=data journalism is not the full truth. The equation may make sense because statistics are the most important source for data journalism. But data journalism needs more than statistics and classic journalism: finding the story behind the data coupled with know how in specific tools (analysis, visualising) lead to the storytelling data journalism needs.

To get an idea what data journalism means and what skills are needed a free MOOC with well-known experts will be offered starting next year.

The MOOC is: Doing Journalism with Data: First Steps, Skills and Tools

No dates but said to be coming in early 2014.

Merging data on deadline?

Interesting both for the content and the tools that journalists use to explore data.

February 25, 2013

Computational Journalism

Filed under: Journalism,News — Patrick Durusau @ 11:50 am

Computational Journalism by Jonathan Stray.

From the webpage:

Maybe it’s not obvious that computer science and journalism go together, but they do!

Computational journalism combines classic journalistic values of storytelling and public accountability with techniques from computer science, statistics, the social sciences, and the digital humanities.

This course, given at the University of Hong Kong during January-February 2013, is an advanced look at how techniques from visualization, natural language processing, social network analysis, statistics, and cryptography apply to four different areas of journalism: finding stories through data mining, communicating what you’ve learned, filtering an overwhelming volume of information, and tracking the spread of information and effects.

The course assumes knowledge of computer science, including standard algorithms and linear algebra. The assignments are in Python and require programming experience. But this introductory video, which explains the topics covered, is for everyone.

For more, see the syllabus, or jump directly to a lecture:

  1. Basics. Feature vectors, clustering, projections.
  2. Text analysis. Tokenization, TF-IDF, topic modeling.
  3. Algorithmic filters. Information overload. Newsblaster and Google News.
  4. Hybrid filters. Social networks as filters. Collaborative Filtering.
  5. Social network analysis. Using it in journalism. Centrality algorithms.
  6. Knowledge representation. Structured data. Linked open data. General Q&A.
  7. Drawing conclusions. Randomness. Competing hypotheses. Causation.
  8. Security, surveillance, and privacy. Cryptography. Threat modeling.
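
The first two lectures can be sketched in a few lines of plain Python. This is a minimal, illustrative TF-IDF implementation, not the course's own code, and the toy headlines are invented for the example:

```python
import math
import re
from collections import Counter

# Toy corpus standing in for news documents (hypothetical headlines).
docs = [
    "city council approves new budget for city schools",
    "council members debate school budget cuts",
    "local team wins championship game",
]

def tokenize(text):
    """Lowercase and split on runs of letters -- the simplest tokenizer."""
    return re.findall(r"[a-z]+", text.lower())

tokenized = [tokenize(d) for d in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents does each term appear?
df = Counter()
for toks in tokenized:
    for term in set(toks):
        df[term] += 1

def tf_idf(toks):
    """Term frequency times inverse document frequency for one document."""
    tf = Counter(toks)
    return {
        term: (count / len(toks)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }

vec = tf_idf(tokenized[0])
# "city" appears in only one document, so it outweighs "budget",
# which appears in two and is therefore less distinctive.
assert vec["city"] > vec["budget"]
```

The same feature vectors feed directly into the clustering and projection techniques from lecture 1: documents about the same story end up near each other in TF-IDF space.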

CS knowledge and programming experience still required.

Interfaces will lessen that need over time, but that knowledge and experience will help you question interfaces when they give odd results.

I would settle for journalists who question reports, like the Mandiant advertisement on cybersecurity last week. (Crowdsourcing Cybersecurity: A Proposal (Part 1))

Even the talking heads on the PBS Sunday morning news treated it as serious content. It was poorly written/researched ad copy, nothing more.

Of course, you would have to read the first couple of pages to discover that, not just skim the press release.

I first saw this at Christophe Lalanne’s A bag of tweets / February 2013.

February 17, 2013

Finding tools vs. making tools:…

Filed under: Journalism,News,Software — Patrick Durusau @ 8:17 pm

Finding tools vs. making tools: Discovering common ground between computer science and journalism by Nick Diakopoulos.

From the post:

The second Computation + Journalism Symposium convened recently at the Georgia Tech College of Computing to ask the broad question: What role does computation have in the practice of journalism today and in the near future? (I was one of its organizers.) The symposium attracted almost 150 participants, both technologists and journalists, to discuss and debate the issues and to forge a multi-disciplinary path forward around that question.

Topics for panels covered the gamut, from precision and data journalism, to verification of visual content, news dissemination on social media, sports and health beats, storytelling with data, longform interfaces, the new economic landscape of content, and the educational needs of aspiring journalists. But what made these sessions and topics really pop was that participants on both sides of the computation and journalism aisle met each other in a conversational format where intersections and differences in the ways they viewed these topics could be teased apart through dialogue. (Videos of the sessions are online.)

While the panelists were all too civilized for any brawls to break out, mixing two disciplines as different as computing and journalism nonetheless did lead to some interesting discussions, divergences, and opportunities that I’d like to explore further here. Keeping these issues top-of-mind should help as this field moves forward.

Tool foragers and tool forgers

The following metaphor is not meant to be incendiary, but rather to illuminate two different approaches to tool innovation that seemed apparent at the symposium.

Imagine you live about 10,000 years ago, on the cusp of the Neolithic Revolution. The invention of agriculture is just around the corner. It’s spring and you’re hungry after the long winter. You can start scrounging around for berries and other tasty roots to feed you and your family — or you can stop and try to invent some agricultural implements, tools adapted to your own local crops and soil that could lead to an era of prosperity. If you take the inventive approach, you might fail, and there’s a real chance you’ll starve trying — while foraging will likely guarantee you another year of subsistence life.

What role does computation have in your field of practice?

February 6, 2013

Simon Rogers

Filed under: Data Mining,Journalism,News — Patrick Durusau @ 2:56 pm

Simon Rogers

From the “about” page:

Simon Rogers is editor of guardian.co.uk/data, an online data resource which publishes hundreds of raw datasets and encourages its users to visualise and analyse them – and probably the world’s most popular data journalism website.

He is also a news editor on the Guardian, working with the graphics team to visualise and interpret huge datasets.

He was closely involved in the Guardian’s exercise to crowdsource 450,000 MP expenses records and the organisation’s coverage of the Afghanistan and Iraq Wikileaks war logs. He was also a key part of the Reading the Riots team which investigated the causes of the 2011 England disturbances.

Previously he was the launch editor of the Guardian’s online news service and has edited the paper’s science section. He has edited three Guardian books, including How Slow Can You Waterski and The Hutton Inquiry and its impact.

If you are interested in “data journalism,” data mining or visualization, Simon’s site is one of the first to bookmark.

December 13, 2012

Crowdsourcing campaign spending: …

Filed under: Crowd Sourcing,Government Data,Journalism — Patrick Durusau @ 3:43 pm

Crowdsourcing campaign spending: What ProPublica learned from Free the Files by Amanda Zamora.

From the post:

This fall, ProPublica set out to Free the Files, enlisting our readers to help us review political ad files logged with Federal Communications Commission. Our goal was to take thousands of hard-to-parse documents and make them useful, helping to reveal hidden spending in the election.

Nearly 1,000 people pored over the files, logging detailed ad spending data to create a public database that otherwise wouldn’t exist. We logged as much as $1 billion in political ad buys, and a month after the election, people are still reviewing documents. So what made Free the Files work?

A quick backstory: Free the Files actually began last spring as an effort to enlist volunteers to visit local TV stations and request access to the “public inspection file.” Stations had long been required to keep detailed records of political ad buys, but they were only available on paper and required actually traveling to the station.

In August, the FCC ordered stations in the top 50 markets to begin posting the documents online. Finally, we would be able to access a stream of political ad data based on the files. Right?

Wrong. It turns out the FCC didn’t require stations to submit the data in anything that approaches an organized, standardized format. The result was that stations sent in a jumble of difficult to search PDF files. So we decided if the FCC or stations wouldn’t organize the information, we would.

Enter Free the Files 2.0. Our intention was to build an app to help translate the mishmash of files into structured data about the ad buys, ultimately letting voters sort the files by market, contract amount and candidate or political group (which isn’t possible on the FCC’s web site), and to do it with the help of volunteers.

In the end, Free the Files succeeded in large part because it leveraged data and community tools toward a single goal. We’ve compiled a bit of what we’ve learned about crowdsourcing and a few ideas on how news organizations can adapt a Free the Files model for their own projects.

The team who worked on Free the Files included Amanda Zamora, engagement editor; Justin Elliott, reporter; Scott Klein, news applications editor; Al Shaw, news applications developer, and Jeremy Merrill, also a news applications developer. And thanks to Daniel Victor and Blair Hickman for helping create the building blocks of the Free the Files community.

The entire story is golden but a couple of parts shine brighter for me than the others.

Design consideration:

The success of Free the Files hinged in large part on the design of our app. The easier we made it for people to review and annotate documents, the higher the participation rate, the more data we could make available to everyone. Our maxim was to make the process of reviewing documents like eating a potato chip: “Once you start, you can’t stop.”

Let me re-say that: The easier it is for users to author topic maps, the more topic maps they will author.

Yes?

Semantic Diversity:

But despite all of this, we still can’t get an accurate count of the money spent. The FCC’s data is just too dirty. For example, TV stations can file multiple versions of a single contract with contradictory spending amounts — and multiple ad buys with the same contract number means radically different things to different stations. But the problem goes deeper. Different stations use wildly different contract page designs, structure deals in idiosyncratic ways, and even refer to candidates and groups differently.

All true but knowing the semantics vary ahead of time, station to station, why not map the semantics in the markets ahead of time?

Granted, I second their request to the FCC for standardized data, but having standardized blocks doesn’t mean the information has the same semantics.

The OMB can’t keep the same semantics for a handful of terms in one document.

What chance is there with dozens and dozens of players in multiple documents?
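
Mapping the semantics ahead of time can be as simple as a per-station translation table. A minimal sketch, with invented station call signs and field names purely for illustration:

```python
# Hypothetical per-station field mappings into one shared schema.
# Real FCC filings vary far more, but the principle is the same.
STATION_MAPS = {
    "WAAA": {"Contract #": "contract_id", "Gross Amt": "amount", "Advertiser": "sponsor"},
    "WBBB": {"Order No.": "contract_id", "Total $": "amount", "Client": "sponsor"},
}

def normalize(station, record):
    """Translate one station's ad-buy record into the shared schema."""
    mapping = STATION_MAPS[station]
    return {mapping[key]: value for key, value in record.items() if key in mapping}

row = normalize("WBBB", {"Order No.": "1422", "Total $": 5400, "Client": "Cmte for X"})
assert row == {"contract_id": "1422", "amount": 5400, "sponsor": "Cmte for X"}
```

The table has to be built once per station in each market, but after that every volunteer-logged record lands in the same schema, which is exactly the mapping topic maps are designed to make explicit.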

April 10, 2012

A new framework for innovation in journalism: How a computer scientist would do it

Filed under: Journalism,News,Subject Identity — Patrick Durusau @ 6:40 pm

A new framework for innovation in journalism: How a computer scientist would do it

Andrew Phelps writes:

What if journalism were invented today? How would a computer scientist go about building it, improving it, iterating it?

He might start by mapping out some fundamental questions: What are the project’s values and goals? What consumer needs would it satisfy? How much should be automated, how much human-powered? How could it be designed to be as efficient as possible?

Computer science Ph.D. Nick Diakopoulos has attempted to create a new framework for innovation in journalism. His new white paper, commissioned by CUNY’s Tow-Knight Center for Entrepreneurial Journalism, does not provide answers so much as a different way to come up with questions.

Diakopoulos identified 27 computing concepts that could apply to journalism — think natural language processing, machine learning, game engines, virtual reality, information visualization — and pored over thousands of research papers to determine which topics get the most (and least) attention. (There are untapped opportunities in robotics, augmented reality, and motion capture, it turns out.)

He thinks computer science and journalism have a lot in common, actually. They are both fundamentally concerned with information. Acquiring it, storing it, modifying it, presenting it.

Suggest you read his paper in full: Cultivating the Landscape of Innovation in Computational Journalism.

Intrigued by the idea of gauging the opportunities along a continuum of activities. Could be a stunning visual of how subject identity is handled across activities and/or technologies.

Interested?
