Archive for the ‘Census Data’ Category

Why the Obsession with Tables?

Thursday, May 2nd, 2013

Why the Obsession with Tables? by Robert Kosara.

From the post:

Lots of data are still presented and released as tables. But why, when we know that visual representations are so much easier to read and understand? Eric Newburger from the U.S. Census Bureau has an interesting theory.

In a short talk on visualization at the Census Bureau, he describes how in the 1880s, the Census published maps and charts. Many of those are actually amazingly well done, even by today’s standards. But starting with the 1890 census, they were replaced with tables.

This, according to Newburger, was due to an important innovation: the Hollerith Tabulating Machine. The new machines were much faster and could slice and dice the data in a lot of new ways, but their output ended up in tables. Throughout the 20th century, the Census created an enormous number of tables, with only a small fraction of the data shown as maps or charts.

Newburger argues that people don’t bother trying to read tables, whereas visualizations are much more likely to catch their attention and get them interested in the underlying data. We clearly have the means to create any visualization we want today, and there is plenty of data available, so why keep publishing tables? It’s a matter of the attitudes towards data, and these can be hard to change after more than 100 years.

Any suggestions for images of maps and charts from the Census in the 1880s?

If the Hollerith Tabulating Machine is responsible for the default to tables, is it also responsible for spreadsheets?

Tables are quicker for a machine to produce but less useful to an end user.

Mapping the census…

Sunday, February 10th, 2013

Mapping the census: how one man produced a library for all by Simon Rogers.

From the post:

The census is an amazing resource – so full of data it’s hard to know where to begin. And increasingly where to begin is by putting together web-based interactives – like this one on language and this on transport patterns that we produced this month.

But one academic is taking everything back to basics – using some pretty sophisticated techniques. Alex Singleton, a lecturer in geographic information science (GIS) at Liverpool University has used R to create the open atlas project.

Singleton has basically produced a detailed mapping report – as a PDF and vectored images – on every one of the local authorities of England & Wales. He automated the process and has provided the code for readers to correct and do something with. In each report there are 391 pages, each with a map. That means, for the 354 local authorities in England & Wales, he has produced 127,466 maps.
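The figures in the quoted passage do not quite multiply out. A quick check in Python, using only the numbers quoted above (the variable names are mine, for illustration):

```python
# All three figures are taken as quoted in the post.
pages_per_report = 391   # pages (one map each) per report
authorities = 354        # local authorities in England & Wales
quoted_total = 127_466   # total number of maps

# Total implied by the quoted page and authority counts.
product = pages_per_report * authorities

# Number of reports implied by the quoted total.
implied_authorities = quoted_total / pages_per_report

print(product)              # 138414
print(implied_authorities)  # 326.0
```

So 391 pages across 354 reports would give 138,414 maps, while 127,466 maps corresponds to 326 reports of 391 pages each. The post does not say which of the figures is off.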

Check out Simon’s post to see why Singleton has undertaken such a task.

Question: Was the 2011 census more “transparent,” or “useful” after Singleton’s work or before?

I would say more “transparent” after Singleton’s work.

You?

U.S. Census Bureau Offers Public API for Data Apps

Monday, July 30th, 2012

U.S. Census Bureau Offers Public API for Data Apps by Nick Kolakowski.

From the post:

For any software developers with an urge to play around with demographic or socio-economic data: the U.S. Census Bureau has launched an API for Web and mobile apps that can slice that statistical information in all sorts of nifty ways.

The API draws data from two sets: the 2010 Census (statistics include population, age, sex, and race) and the 2006-2010 American Community Survey (offers information on education, income, occupation, commuting, and more). In theory, developers could use those datasets to analyze housing prices for a particular neighborhood, or gain insights into a city’s employment cycles.

The APIs include no information that could identify an individual. (emphasis added)

Suppose it should say: “Some assembly required.”
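To give a feel for the "assembly," here is a minimal Python sketch of building a query and reshaping a response. It rests on assumptions that are not in the post: the endpoint path, the variable code P001001 (total population), and the helper names build_query and rows_to_dicts are mine, and the sample response is made up. It makes no network call.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the 2010 Census Summary File 1; not taken from the post.
BASE_URL = "https://api.census.gov/data/2010/dec/sf1"


def build_query(variables, geography, key=None):
    """Return a request URL for the given variable codes and geography clause."""
    params = {"get": ",".join(variables), "for": geography}
    if key:
        params["key"] = key  # API key, if the service asks for one
    # Keep commas, colons and asterisks readable in the query string.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


def rows_to_dicts(payload):
    """Convert a JSON list of rows (header row first) into a list of dicts."""
    header, *rows = json.loads(payload)
    return [dict(zip(header, row)) for row in rows]


# Illustrative response in the header-plus-rows shape; the values are made up.
SAMPLE = '[["NAME","P001001","state"],["Example State","12345","99"]]'

if __name__ == "__main__":
    print(build_query(["NAME", "P001001"], "state:*"))
    print(rows_to_dicts(SAMPLE))
```

The response arrives as rows of strings, so joining it to other data, converting types, and mapping it are all left to the developer.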

Similar resources at Data.gov and Google Public Data Explorer.

I first saw this at: Dashboard Insight.

1940 US Census Indexing Progress Report—May 18, 2012

Sunday, May 20th, 2012

1940 US Census Indexing Progress Report—May 18, 2012

From the post:

We’re finishing our 7th week of indexing and we are a breath away from having 40% of the entire collection indexed. I hear from so many people words of amazement at the things this indexing community has accomplished. In 7 weeks we’ve collectively indexed more than 55 million names. It is truly amazing. With 111,612 indexers now signed up to index and arbitrate, we have a formidable team making some great things happen. Let’s keep up the great work.

It is a popular data set, but that isn’t the whole story.

What do you think are the major factors that contribute to their success?

1940 Census (U.S.A.)

Tuesday, April 3rd, 2012

1940 Census (U.S.A.)

From the “about” page:

Census records are the only records that describe the entire population of the United States on a particular day. The 1940 census is no different. The answers given to the census takers tell us, in detail, what the United States looked like on April 1, 1940, and what issues were most relevant to Americans after a decade of economic depression.

The 1940 census reflects economic tumult of the Great Depression and President Franklin D. Roosevelt’s New Deal recovery program of the 1930s. Between 1930 and 1940, the population of the Continental United States increased 7.2% to 131,669,275. The territories of Alaska, Puerto Rico, American Samoa, Guam, Hawaii, the Panama Canal, and the American Virgin Islands comprised 2,477,023 people.

Besides name, age, relationship, and occupation, the 1940 census included questions about internal migration; employment status; participation in the New Deal Civilian Conservation Corps (CCC), Works Progress Administration (WPA), and National Youth Administration (NYA) programs; and years of education.

Great for ancestry and demographic studies. What other data would you use with this census information?

MPC – Minnesota Population Center

Tuesday, March 6th, 2012

MPC – Minnesota Population Center

Last year I mentioned the Integrated Public Use Microdata Series (IPUMS-USA) data set, which describes itself as:

IPUMS-USA is a project dedicated to collecting and distributing United States census data. Its goals are to:

  • Collect and preserve data and documentation
  • Harmonize data
  • Disseminate the data absolutely free!

Use it for GOOD — never for EVIL

There is international data and more U.S. data that may be of interest:

Statistics Finland is making further utilisation of statistical data easier

Sunday, January 29th, 2012

Statistics Finland is making further utilisation of statistical data easier

From the post:

Statistics Finland has confirmed new Terms of Use for the utilisation of already published statistical data. In them, Statistics Finland grants a universal, irrevocable right to the use of the data published in its www.stat.fi website service and in related free statistical databases. The right extends to use for both commercial and non-commercial purposes. The aim is to make further utilisation of the data easier and thereby increase the exploitation and effectiveness of statistics in society.

At the same time, an open interface has been built to the StatFin database. The StatFin database is a general database built with PC-Axis tools that is free-of-charge and contains a wide array of statistical data on a variety of areas in society. It contains data from some 200 sets of statistics, thousands of tables and hundreds of millions of individual data cells. The contents of the StatFin database have been systematically widened in the past few years and its expansion with various information contents and regional divisions will be continued even further.

I am curious whether free commercial re-use of government-collected data (paid for by taxpayers) favors established re-sellers of data or startups that will combine existing data in interesting ways. Thoughts?

First seen at Christophe Lalanne’s Bag of Tweets for January 2012.

Opening Up the Domesday Book

Thursday, December 22nd, 2011

Opening Up the Domesday Book by Sam Leon.

From the post:

Domesday Book might be one of the most famous government datasets ever created. Which makes it all the stranger that it’s not freely available online – at the National Archives, you have to pay £2 per page to download copies of the text.

Domesday is pretty much unique. It records the ownership of almost every acre of land in England in 1066 and 1086 – a feat not repeated in modern times. It records almost every household. It records the industrial resources of an entire nation, from castles to mills to oxen.

As an event, held in the traumatic aftermath of the Norman conquest, the Domesday inquest scarred itself deeply into the mindset of the nation – and one historian wrote that on his deathbed, William the Conqueror regretted the violence required to complete it. As a historical dataset, it is invaluable and fascinating.

In my spare time, I’ve been working on making Domesday Book available online at Open Domesday. In this, I’ve been greatly aided by the distinguished Domesday scholar Professor John Palmer, and his geocoded dataset of settlements and people in Domesday, created with AHRC funding in the 1990s.

I guess it really is all a matter of perspective. I have never thought of the Domesday Book as a “government dataset….” ;-)

It would certainly make an interesting basis for a chronological topic map tracing the ownership and fate of “…almost every acre of land in England….”