Perseus Gives Big Humanities Data Wings by Ian Armas Foster.
From the post:
“How do we think about the human record when our brains are not capable of processing all the data in isolation?” asked Professor Gregory Crane of students in a lecture hall at the University of Kansas.
But when he posed this question, Crane wasn’t referencing modern big data to a bunch of computer science majors. Rather, he was discussing data from ancient texts with a group of those studying the humanities (and one computer science major).
Crane, a professor of classics, adjunct professor of computer science, and chair of Technology and Entrepreneurship at Tufts University, spoke about the efforts of the Perseus Project, a project whose goals include storing and analyzing ancient texts with an eye toward building a global humanities model.
(video omitted)
The next step in humanities is to create what Crane calls “a dialogue among civilizations.” With regard to the study of humanities, it is to connect those studying classical Greek with those studying classical Latin, Arabic, and even Chinese. Just as physicists want to model the universe, Crane wants to model the progression of intelligence and art on a global scale throughout human history.
… (a bit later)
Surprisingly, the biggest barrier is not actually the amount of space occupied by the data of the ancient texts, but rather the language barriers. Currently, the Perseus Project covers over a trillion words, but those words are split up into 400 languages. To give a specific example, Crane presented a 12th century Arabic document. It was pristine and easily readable—to anyone who can read ancient Arabic.
Substitute “semantic” for “language” in “language barriers” and I think the comment is right on the mark.
Assuming that you could read the “12th century Arabic document” and understand its semantics, where would you record your reading to pass it along to others?
Say you spot the name of a well-known 12th century figure. Must every reader duplicate your feat of reading and understanding the document to make that same discovery?
Or can we preserve your “discovery” for other readers?
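To make the question concrete, here is a minimal sketch of what recording such a discovery might look like. Everything in it is an assumption made for illustration: the class names, the identifier scheme ("person:example-figure"), and the document identifiers are hypothetical and are not taken from the Perseus Project or from any topic map standard. The idea is only that one reader's identification is stored once, under a shared subject identifier, so later readers can look it up rather than repeat the reading.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    """Where a subject was spotted: a document and a location within it."""
    document: str   # hypothetical document identifier
    locator: str    # hypothetical location, e.g. folio and line
    reader: str     # who made the identification


@dataclass
class Subject:
    """A subject (here, a person) with the names and occurrences recorded for it."""
    identifier: str
    names: set = field(default_factory=set)
    occurrences: list = field(default_factory=list)


class DiscoveryStore:
    """Keeps one Subject per identifier so separate readings accumulate."""

    def __init__(self):
        self.subjects = {}

    def record(self, identifier, name, occurrence):
        # Reuse the existing subject if some reader has already identified it.
        subject = self.subjects.setdefault(identifier, Subject(identifier))
        subject.names.add(name)
        subject.occurrences.append(occurrence)
        return subject

    def find_in(self, document):
        # Return identifiers of every subject already spotted in a document.
        return sorted(
            s.identifier
            for s in self.subjects.values()
            if any(o.document == document for o in s.occurrences)
        )


# Two readers, working in different languages, record the same figure.
store = DiscoveryStore()
store.record("person:example-figure", "Name as written in the manuscript",
             Occurrence("doc:arabic-ms-1", "folio 3, line 7", "reader-a"))
store.record("person:example-figure", "Name in English transliteration",
             Occurrence("doc:latin-ms-9", "page 12", "reader-b"))

print(store.find_in("doc:arabic-ms-1"))
```

A reader who cannot read the Arabic manuscript can still ask the store which subjects have been identified in it, and both names end up attached to the same subject rather than to two unrelated records.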
Topic maps anyone?