Archive for the ‘Data Replication’ Category

University Administrations and Data Checking

Wednesday, January 7th, 2015

Axel Brennicke and Björn Brembs posted the following about university administrations in Germany:

Noam Chomsky, writing about the Death of American Universities, recently reminded us that reforming universities using a corporate business model leads to several easy to understand consequences. The increase of the precariat of faculty without benefits or tenure, a growing layer of administration and bureaucracy, or the increase in student debt. In part, this well-known corporate strategy serves to increase labor servility. The student debt problem is particularly obvious in countries with tuition fees, especially in the US where a convincing argument has been made that the tuition system is nearing its breaking point. The decrease in tenured positions is also quite well documented (see e.g., an old post). So far, and perhaps as may have been expected, Chomsky was dead on with his assessment. But how about the administrations?

To my knowledge, nobody has so far checked if there really is any growth in university administration and bureaucracy, apart from everybody complaining about it. So Axel Brennicke and I decided to have a look at the numbers. In Germany, employment statistics can be obtained from the federal statistics registry, Destatis. We sampled data from 2005 (the year before the Excellence Initiative and the Higher Education Pact) and the latest year we were able to obtain, 2012.

I’m sympathetic to the authors and their position, but that doesn’t equal verification of their claims about the data.

They have offered the data to anyone who wants to check: Raw Data for Axel Brennicke and Björn Brembs.

Granted, the article doesn’t detail their analysis. So after downloading the data, what’s next? How would you go about verifying the statements made in the article?

If people get in the habit of offering data for verification and no one looks, what guarantee of correctness will that bring?


The data passes the first test: it is actually present at the download site. Don’t laugh, the NSA has trouble making that commitment.

Do note that the files have underscores in their names, which makes them appear to have spaces in their names. HINT: Don’t use underscores in file names. Ever.

The files are old-style .xls files, so just about anything recent should read them. Do be aware that the column headers are in German.

The only description reads:

Employment data from DESTATIS about German university employment in 2005 and 2012

My first question is why the data covers only two years, 2005 and 2012. Just note that for now. What steps would you take with the data sets as they are?
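One possible first step, sketched in Python: translate the German column headers and compute the percent change per staff category between the two years. Everything specific here is an assumption for illustration. The header names in `HEADER_MAP` are guesses, not the headers actually found in the .xls files, and the numbers in the demo are placeholders, not Destatis figures.

```python
# Hypothetical mapping from German column headers to English names.
# ASSUMPTION: the real headers in the .xls files may differ; inspect them first.
HEADER_MAP = {
    "Wissenschaftliches Personal": "academic_staff",
    "Verwaltungspersonal": "administrative_staff",
}


def translate_headers(row: dict) -> dict:
    """Rename known German headers; leave unknown headers untouched."""
    return {HEADER_MAP.get(key, key): value for key, value in row.items()}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("baseline is zero; percent change is undefined")
    return (new - old) / old * 100.0


def compare_years(base: dict, latest: dict) -> dict:
    """Percent change for every category present in both years."""
    shared = base.keys() & latest.keys()
    return {key: percent_change(base[key], latest[key]) for key in sorted(shared)}


if __name__ == "__main__":
    # PLACEHOLDER numbers, not taken from the Destatis files.
    counts_2005 = translate_headers(
        {"Wissenschaftliches Personal": 100, "Verwaltungspersonal": 200}
    )
    counts_2012 = translate_headers(
        {"Wissenschaftliches Personal": 120, "Verwaltungspersonal": 250}
    )
    for name, change in compare_years(counts_2005, counts_2012).items():
        print(f"{name}: {change:+.1f}%")
```

Restricting the comparison to categories present in both years keeps a renamed or newly introduced category from silently showing up as growth. With only two sample years, a calculation like this gives a single difference per category, not a trend.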

I first saw this in a tweet by David Colquhoun.

SymmetricDS

Thursday, October 13th, 2011

SymmetricDS

From the website:

SymmetricDS is an asynchronous data replication software package that supports multiple subscribers and bi-directional synchronization. It uses web and database technologies to replicate tables between relational databases, in near real time if desired. The software was designed to scale for a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.

By using database triggers, SymmetricDS guarantees that data changes are captured and atomicity is preserved. Support for database vendors is provided through a Database Dialect layer, with implementations for MySQL, Oracle, SQL Server, PostgreSQL, DB2, Firebird, HSQLDB, H2, and Apache Derby included.
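To make the trigger idea concrete, here is a minimal Python sketch of trigger-based change capture using the standard-library `sqlite3` module. It is not SymmetricDS code and does not reflect its actual schema: the `item` and `change_log` tables, the trigger names, and the helper functions are all hypothetical. It only illustrates why capturing changes in triggers preserves atomicity: the log entry is written in the same transaction as the data change.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one data table and a change log.

    ASSUMPTION: table and trigger names are invented for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE change_log (
            seq        INTEGER PRIMARY KEY AUTOINCREMENT,
            table_name TEXT NOT NULL,
            op         TEXT NOT NULL,      -- 'I', 'U' or 'D'
            row_id     INTEGER NOT NULL,
            payload    TEXT
        );
        CREATE TRIGGER item_ai AFTER INSERT ON item BEGIN
            INSERT INTO change_log (table_name, op, row_id, payload)
            VALUES ('item', 'I', NEW.id, NEW.name);
        END;
        CREATE TRIGGER item_au AFTER UPDATE ON item BEGIN
            INSERT INTO change_log (table_name, op, row_id, payload)
            VALUES ('item', 'U', NEW.id, NEW.name);
        END;
        CREATE TRIGGER item_ad AFTER DELETE ON item BEGIN
            INSERT INTO change_log (table_name, op, row_id, payload)
            VALUES ('item', 'D', OLD.id, NULL);
        END;
        """
    )
    return conn


def pending_changes(conn: sqlite3.Connection, after_seq: int = 0) -> list:
    """Return logged changes newer than after_seq, oldest first."""
    return conn.execute(
        "SELECT seq, op, row_id, payload FROM change_log "
        "WHERE seq > ? ORDER BY seq",
        (after_seq,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    conn.execute("INSERT INTO item (id, name) VALUES (1, 'first')")
    conn.commit()
    # This update is rolled back, so its log entry disappears with it.
    conn.execute("UPDATE item SET name = 'changed' WHERE id = 1")
    conn.rollback()
    for row in pending_changes(conn):
        print(row)
```

A replication process could poll `pending_changes` with the last sequence number it has shipped and forward the rows to subscribers. Because the log lives in the same database as the data, nothing is lost if the network is down; the rows simply wait.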

This is very cool!

(Spotted by Marko Rodriguez)