Norwegian Ethnological Research [The Early Years] by Lars Marius Garshol.
From the post:
The definitive book on Norwegian farmhouse ale is Odd Nordland’s “Brewing and beer traditions in Norway,” published in 1969. That book is now sadly totally unavailable, except from libraries. In the foreword Nordland writes that the book is based on a questionnaire issued by Norwegian Ethnological Research in 1952 and 1957. After digging a little I discovered that this material is actually still available at the institute. The questionnaire is number 35, running to 103 questions.
Because the questionnaire responses in general often contain descriptions of quite personal matters, access to the answers is restricted. However, by paying a quite stiff fee, describing the research I wanted to use the material for, and signing a legal agreement, I was sent a CD with all the answers to questionnaire 35. The contents are quite daunting: 1264 numbered JPEG files, with no metadata of any kind. The files are scans of individual pages of responses, plus one cover page for each Norwegian province. Most of the responses are handwritten, and legibility varies dramatically. Some, happily, are typewritten.
I appended “[The Early Years]” to the title because Lars has embarked on an adventure that can last as long as he remains interested.
Sixty-two-year-old survey results leave Lars wondering exactly what was meant in some cases. Keep that in mind the next time you search for word usage across centuries. Matching exact strings isn’t the same thing as matching the meanings attached to those strings.
You can imagine what gaps and ambiguities might exist when the time period stretches to centuries, if not millennia, and when our knowledge of the languages was acquired in a modern context.
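To make the gap between matching strings and matching meanings concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three records, their dates, and the sense labels are invented, and the labels stand in for judgments a human reader would have to supply. It has nothing to do with the questionnaire material itself. The example relies on the English word "corn", which in older British usage generally meant grain and in modern American usage means maize.

```python
from collections import defaultdict

# Hypothetical records: (year, text, sense). The texts and sense labels are
# invented for illustration; the sense column represents a human judgment.
RECORDS = [
    (1650, "the corn was threshed and stored", "grain in general"),
    (1850, "a good price for corn at market", "grain in general"),
    (1990, "corn on the cob with butter", "maize"),
]


def exact_matches(records, term):
    """Return every record whose text contains the term as a whole word."""
    return [r for r in records if term in r[1].split()]


def group_by_sense(records):
    """Group the years of the given records by their annotated sense."""
    groups = defaultdict(list)
    for year, _text, sense in records:
        groups[sense].append(year)
    return dict(groups)


hits = exact_matches(RECORDS, "corn")
senses = group_by_sense(hits)
print(len(hits), "string matches,", len(senses), "distinct senses")
```

The string search treats all three records as equivalent. Only the sense column, which no string match can produce, shows that they fall into two different meanings.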
The understanding we capture is our own, which hopefully has some connection to earlier witnesses. Recording that process is a uniquely human activity and one that I am glad Lars is sharing with a larger audience.
Looking forward to hearing about more results!
PS: Do you have a similar “data mining” story to share? Stories about using command-line tools are welcome, and so are stories about working with non-electronic resources.