Archive for the ‘Data Replication’ Category

If You See Something, Save Something (Poke A Censor In The Eye)

Thursday, August 17th, 2017

If You See Something, Save Something – 6 Ways to Save Pages In the Wayback Machine by Alexis Rossi.

From the post:

In recent days many people have shown interest in making sure the Wayback Machine has copies of the web pages they care about most. These saved pages can be cited, shared, linked to – and they will continue to exist even after the original page changes or is removed from the web.

There are several ways to save pages and whole sites so that they appear in the Wayback Machine. Here are 6 of them.

In the comments, Ellen Spertus mentions a 7th way: Donate to the Internet Archive!

It’s the age of censorship, by governments, DMCA, the EU (right to be forgotten), Facebook, Google, Twitter and others.

Poke a censor in the eye: see something, save something to the Wayback Machine.

The Wayback Machine can’t stop all censorship, so save local and remote copies as well.

Keep poking until all censors go blind.
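Saving both a local copy and a Wayback Machine copy can be scripted. What follows is a minimal sketch, not an official client. It assumes that the Save Page Now address is formed by appending the page URL to https://web.archive.org/save/ and that a plain GET request is enough to trigger a capture, which may be subject to rate limits or change over time. The function names, the User-Agent string, and the hashed file naming are all hypothetical choices made for illustration.

```python
import hashlib
from pathlib import Path
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: Save Page Now accepts the target URL appended to this prefix.
SAVE_ENDPOINT = "https://web.archive.org/save/"
# Hypothetical User-Agent, chosen only for this sketch.
HEADERS = {"User-Agent": "save-something-sketch"}


def wayback_save_url(page_url: str) -> str:
    """Build the Save Page Now address for a page."""
    # Keep URL delimiters intact; escape only characters such as spaces.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=#%+,;@~")


def local_copy_path(page_url: str, directory: Path) -> Path:
    """Derive a stable local file name from the URL (no underscores or spaces)."""
    digest = hashlib.sha256(page_url.encode("utf-8")).hexdigest()[:16]
    return directory / f"{digest}.html"


def save_everywhere(page_url: str, directory: Path, timeout: int = 60) -> Path:
    """Fetch the page, write a local copy, then request a Wayback capture."""
    directory.mkdir(parents=True, exist_ok=True)
    with urlopen(Request(page_url, headers=HEADERS), timeout=timeout) as resp:
        body = resp.read()
    path = local_copy_path(page_url, directory)
    path.write_bytes(body)
    # Requesting the save address asks the Wayback Machine to capture the page.
    with urlopen(Request(wayback_save_url(page_url), headers=HEADERS), timeout=timeout):
        pass
    return path
```

Calling save_everywhere("https://example.com/", Path("saved-pages")) would perform both steps. Copying the resulting directory to a second machine covers the "remote copies" part.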

University Administrations and Data Checking

Wednesday, January 7th, 2015

Axel Brennicke and Björn Brembs posted the following about university administrations in Germany:

Noam Chomsky, writing about the Death of American Universities, recently reminded us that reforming universities using a corporate business model leads to several easy to understand consequences. The increase of the precariat of faculty without benefits or tenure, a growing layer of administration and bureaucracy, or the increase in student debt. In part, this well-known corporate strategy serves to increase labor servility. The student debt problem is particularly obvious in countries with tuition fees, especially in the US where a convincing argument has been made that the tuition system is nearing its breaking point. The decrease in tenured positions is also quite well documented (see e.g., an old post). So far, and perhaps as may have been expected, Chomsky was dead on with his assessment. But how about the administrations?

To my knowledge, nobody has so far checked if there really is any growth in university administration and bureaucracy, apart from everybody complaining about it. So Axel Brennicke and I decided to have a look at the numbers. In Germany, employment statistics can be obtained from the federal statistics registry, Destatis. We sampled data from 2005 (the year before the Excellence Initiative and the Higher Education Pact) and the latest year we were able to obtain, 2012.

I’m sympathetic to the authors and their position, but that doesn’t equal verification of their claims about the data.

They have offered the data to anyone who wants to check: Raw Data for Axel Brennicke and Björn Brembs.

Granted, the article doesn’t detail their analysis. After downloading the data, what’s next? How would you go about verifying the statements made in the article?

If people get in the habit of offering data for verification and no one looks, what guarantee of correctness will that bring?


The data passes the first test: it is actually present at the download site. Don’t laugh, the NSA has trouble making that commitment.

Do note that the files have underscores in their names, which makes them appear to have spaces in their names. HINT: Don’t use underscores in file names. Ever.

The files are old-style .xls files, so just about anything recent should read them. Be aware that the column headers are in German.

The only description reads:

Employment data from DESTATIS about German university employment in 2005 and 2012

My first curiosity is that the data covers only two years, 2005 and 2012. Just note that for now. What steps would you take with the data sets as they are?
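One possible first step, once the counts for each year have been pulled out of the spreadsheets, is to compute the percentage change per employment category and to list any categories that appear in only one of the two years. The sketch below assumes the counts have already been extracted into plain dictionaries. The function names are hypothetical, and the category labels and numbers in the usage note are placeholders, not Destatis figures.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (after - before) * 100.0 / before


def growth_by_category(counts_2005: dict, counts_2012: dict) -> dict:
    """Percentage change for every category present in both years."""
    shared = counts_2005.keys() & counts_2012.keys()
    return {
        name: percent_change(counts_2005[name], counts_2012[name])
        for name in sorted(shared)
    }


def unmatched_categories(counts_2005: dict, counts_2012: dict) -> list:
    """Categories found in only one year; these need a manual look."""
    return sorted(counts_2005.keys() ^ counts_2012.keys())
```

With placeholder input such as {"administration": 100, "faculty": 200} for 2005 and {"administration": 150, "faculty": 250} for 2012, growth_by_category reports 50.0 and 25.0 percent. Comparing such figures against the statements in the article is the actual verification. Categories that were renamed or reclassified between the two years will show up in unmatched_categories.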

I first saw this in a tweet by David Colquhoun.

SymmetricDS

Thursday, October 13th, 2011

SymmetricDS

From the website:

SymmetricDS is an asynchronous data replication software package that supports multiple subscribers and bi-directional synchronization. It uses web and database technologies to replicate tables between relational databases, in near real time if desired. The software was designed to scale for a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.

By using database triggers, SymmetricDS guarantees that data changes are captured and atomicity is preserved. Support for database vendors is provided through a Database Dialect layer, with implementations for MySQL, Oracle, SQL Server, PostgreSQL, DB2, Firebird, HSQLDB, H2, and Apache Derby included.
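The trigger idea can be shown in miniature. The following is an illustration of trigger-based change capture in general, using Python's built-in sqlite3 module. It is not SymmetricDS's schema, API, or transport. The table names (item, change_log), the trigger names, and the replicate function are all hypothetical. Because a trigger runs inside the same transaction as the change that fired it, the change and its log row are committed or rolled back together.

```python
import sqlite3


def make_source() -> sqlite3.Connection:
    """Source database: a data table plus triggers that log every change."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE change_log (
            seq    INTEGER PRIMARY KEY AUTOINCREMENT,
            op     TEXT NOT NULL,
            row_id INTEGER NOT NULL,
            name   TEXT
        );
        CREATE TRIGGER item_ins AFTER INSERT ON item BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('I', NEW.id, NEW.name);
        END;
        CREATE TRIGGER item_upd AFTER UPDATE ON item BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('U', NEW.id, NEW.name);
        END;
        CREATE TRIGGER item_del AFTER DELETE ON item BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('D', OLD.id, NULL);
        END;
    """)
    return conn


def make_target() -> sqlite3.Connection:
    """Target database: same data table, no triggers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def replicate(source: sqlite3.Connection, target: sqlite3.Connection,
              after_seq: int = 0) -> int:
    """Replay logged changes newer than `after_seq`; return the last seq applied."""
    rows = source.execute(
        "SELECT seq, op, row_id, name FROM change_log WHERE seq > ? ORDER BY seq",
        (after_seq,),
    ).fetchall()
    last = after_seq
    with target:  # apply the whole batch in one transaction
        for seq, op, row_id, name in rows:
            if op == "D":
                target.execute("DELETE FROM item WHERE id = ?", (row_id,))
            else:
                # Inserts and updates both become an upsert on the target.
                target.execute(
                    "INSERT OR REPLACE INTO item (id, name) VALUES (?, ?)",
                    (row_id, name),
                )
            last = seq
    return last
```

Keeping the returned sequence number and passing it to the next replicate call lets the target catch up after an outage, since the log keeps accumulating on the source in the meantime. This sketch does not handle updates that change a primary key, conflict resolution, or bi-directional synchronization.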

This is very cool!

(Spotted by Marko Rodriguez)