Humanities « Another Word For It

October 16, 2018

There’s a Spectre Haunting the Classics, It’s Called the TLG

Filed under: Classics,Greek,Humanities — Patrick Durusau @ 6:50 pm

Today being National Dictionary Day (a U.S. oddity), I was glad to see a tweet boasting of 28 Greek lexica for online searching.

While it is true that 28 Greek lexica are available for searching, results are available for only eight (8) of them; access to the other twenty (20) depends upon a subscription to the TLG project.

Funded entirely with public monies and donations, the TLG created IP agreements with publishers of Greek texts, which succeeded in walling off this collection from the public for decades. Some of the less foul guardians at the TLG have prevailed upon it to offer a limited subset of the corpus for free. How kind.

Advances in digitization and artificial-intelligence-aided transcription promise access to original Greek materials in the not-too-distant future.

I look forward to a future when classicists look puzzled at mention of the TLG and then brighten to say: “Oh, that was when classics resources were limited to the privileged few.”

October 10, 2018

Passwords: Philology, Security, Authentication

Filed under: Cryptography,Humanities,Security — Patrick Durusau @ 4:29 pm

Passwords: Philology, Security, Authentication by Brian Lennon.

Disclaimer: I haven’t seen Passwords yet, but its description and reviews prompted me to mention it here.

That and finding an essay with the same title, verbatim, by the same author, published in Diacritics, Volume 43.1 (2015) 82-104. Try: Passwords: Philology, Security, Authentication. (Hosted on the Academia site, so you will need an account (free) to download the essay (also free).)

From the publisher:

Cryptology, the mathematical and technical science of ciphers and codes, and philology, the humanistic study of natural or human languages, are typically understood as separate domains of activity. But Brian Lennon contends that these two domains, both concerned with authentication of text, should be viewed as contiguous. He argues that computing’s humanistic applications are as historically important as its mathematical and technical ones. What is more, these humanistic uses, no less than cryptological ones, are marked and constrained by the priorities of security and military institutions devoted to fighting wars and decoding intelligence.

Lennon’s history encompasses the first documented techniques for the statistical analysis of text, early experiments in mechanized literary analysis, electromechanical and electronic code-breaking and machine translation, early literary data processing, the computational philology of late twentieth-century humanities computing, and early twenty-first-century digital humanities. Throughout, Passwords makes clear the continuity between cryptology and philology, showing how the same practices flourish in literary study and in conditions of war.

Lennon emphasizes the convergence of cryptology and philology in the modern digital password. Like philologists, hackers use computational methods to break open the secrets coded in text. One of their preferred tools is the dictionary, that preeminent product of the philologist’s scholarly labor, which supplies the raw material for computational processing of natural language. Thus does the historic overlap of cryptology and philology persist in an artifact of computing—passwords—that many of us use every day.

Reviews (from the website):

“Passwords is a fascinating book. What is especially impressive is the author’s deft and knowing engagements with both the long histories of computational text processing and the many discourses that make up literary philology. This is just the sort of work that the present mania for the digital demands, and yet books that actually live up to those demands are few and far between. Lennon is one of the few scholars who is even capable of managing that feat, and he does so here with style and erudition.”—David Golumbia, Virginia Commonwealth University

“A stunning intervention, Passwords rivets our attention to the long history of our present fascination with the digital humanities. Through a series of close, contextual readings, from ninth-century Arabic philology and medieval European debates on language to twentieth-century stylometry and machine translation, this book recalls us to a series of engagements with language about which ‘all of us—we scholars, we philologists,’ as Lennon puts it, ought to know more. Passwords is eloquent and timely, and it offers a form of deep, institutional-lexical study, which schools us in a refusal to subordinate scholarship in the humanities to the identitarian and stabilizing imperatives of the national-security state.”—Jeffrey Sacks, University of California, Riverside

Not surprisingly, I think a great deal was lost when the humanities, especially those areas focused on language, stopped interacting with computer science. That happened sometime after the development of the first compilers, but I don’t know that history in detail. Suggested reading?

December 27, 2017

No Peer Review at FiveThirtyEight

Filed under: Humanities,Peer Review,Researchers,Science — Patrick Durusau @ 10:47 am

Politics Moves Fast. Peer Review Moves Slow. What’s A Political Scientist To Do? by Maggie Koerth-Baker

From the post:

Politics has a funny way of turning arcane academic debates into something much messier. We’re living in a time when so much in the news cycle feels absurdly urgent and partisan forces are likely to pounce on any piece of empirical data they can find, either to champion it or tear it apart, depending on whether they like the result. That has major implications for many of the ways knowledge enters the public sphere — including how academics publicize their research.

That process has long been dominated by peer review, which is when academic journals put their submissions in front of a panel of researchers to vet the work before publication. But the flaws and limitations of peer review have become more apparent over the past decade or so, and researchers are increasingly publishing their work before other scientists have had a chance to critique it. That’s a shift that matters a lot to scientists, and the public stakes of the debate go way up when the research subject is the 2016 election. There’s a risk, scientists told me, that preliminary research results could end up shaping the very things that research is trying to understand.
…

The legend of peer review catching and correcting flaws has a long history. A legend much tarnished by the Top 10 Retractions of 2017 and similar reports. Retractions are self-admissions of the failure of peer review. By the hundreds.

Withdrawal of papers isn’t the only debunking of peer review. The reports, papers, etc., on the failure of peer review include: “Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals,” Anaesthesia, Carlisle 2017, DOI: 10.1111/anae.13962; “The peer review drugs don’t work” by Richard Smith; “One in 25 papers contains inappropriately duplicated images, screen finds” by Cat Ferguson.

Koerth-Baker’s quoting of Justin Esarey to support peer review is an example of no or failed peer review at FiveThirtyEight.

…
But, on aggregate, 100 studies that have been peer-reviewed are going to produce higher-quality results than 100 that haven’t been, said Justin Esarey, a political science professor at Rice University who has studied the effects of peer review on social science research. That’s simply because of the standards that are supposed to go along with peer review – clearly reporting a study’s methodology, for instance – and because extra sets of eyes might spot errors the author of a paper overlooked.
…

Koerth-Baker acknowledges the failures of peer review but since the article is premised upon peer review insulating the public from “bad science,” she runs in Justin Esarey, “…who has studied the effects of peer review on social science research.” One assumes his “studies” are mentioned to imbue his statements with an aura of authority.

Debunking Esarey’s authority to comment on the “…effects of peer review on social science research” doesn’t require much effort. If you scan his list of publications you will find Does Peer Review Identify the Best Papers?, which bears the sub-title, A Simulation Study of Editors, Reviewers, and the Social Science Publication Process.

Esarey’s comments on the effectiveness of peer review are not based on fact but on simulations of peer review systems. Useful work no doubt but hardly the confessing witness needed to exonerate peer review in view of its long history of failure.

To save you chasing the Esarey link, the abstract reads:

How does the structure of the peer review process, which can vary from journal to journal, influence the quality of papers published in that journal? In this paper, I study multiple systems of peer review using computational simulation. I find that, under any system I study, a majority of accepted papers will be evaluated by the average reader as not meeting the standards of the journal. Moreover, all systems allow random chance to play a strong role in the acceptance decision. Heterogeneous reviewer and reader standards for scientific quality drive both results. A peer review system with an active editor (who uses desk rejection before review and does not rely strictly on reviewer votes to make decisions) can mitigate some of these effects.

If there were peer reviewers, editors, etc., at FiveThirtyEight, shouldn’t at least one of them have looked beyond the title Does Peer Review Identify the Best Papers? to ask Koerth-Baker what evidence Esarey has for his support of peer review? Or is agreement with Koerth-Baker sufficient?

Peer review persists for a number of unsavory reasons: prestige, professional advancement, enforcement of discipline ideology, pretension of higher quality of publications. Let’s not add a false claim of serving the public.

November 3, 2017

August 2, 2017

It’s more than just overlap: Text As Graph

Filed under: Graphs,Humanities,Hyperedges,Hypergraphs,Texts,XML — Patrick Durusau @ 12:57 pm

It’s more than just overlap: Text As Graph – Refining our notion of what text really is—this time for sure! by Ronald Haentjens Dekker and David J. Birnbaum.

Abstract:

The XML tree paradigm has several well-known limitations for document modeling and processing. Some of these have received a lot of attention (especially overlap), and some have received less (e.g., discontinuity, simultaneity, transposition, white space as crypto-overlap). Many of these have work-arounds, also well known, but—as is implicit in the term “work-around”—these work-arounds have disadvantages. Because they get the job done, however, and because XML has a large user community with diverse levels of technological expertise, it is difficult to overcome inertia and move to a technology that might offer a more comprehensive fit with the full range of document structures with which researchers need to interact both intellectually and programmatically. A high-level analysis of why XML has the limitations it has can enable us to explore how an alternative model of Text as Graph (TAG) might address these types of structures and tasks in a more natural and idiomatic way than is available within an XML paradigm.

Hyperedges, texts and XML, what more could you need?

This paper merits a deep read and testing by everyone interested in serious text modeling.
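For readers who want something to poke at before the deep read, here is a minimal Python sketch of the general idea: tokens as nodes, markup as hyperedges that may overlap or be discontinuous. It is not the TAG model from the paper. The class name, method names, labels, and sample tokens are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TextHypergraph:
    """Toy model: tokens are nodes, each markup range is a hyperedge."""
    tokens: list
    edges: list = field(default_factory=list)  # (label, frozenset of token indices)

    def mark(self, label, indices):
        # A hyperedge may join any set of tokens, contiguous or not.
        self.edges.append((label, frozenset(indices)))

    def text_of(self, label):
        # Text covered by every hyperedge carrying this label.
        return [
            " ".join(self.tokens[i] for i in sorted(members))
            for name, members in self.edges
            if name == label
        ]

    def overlaps(self):
        # Pairs of hyperedges that share tokens without nesting,
        # i.e. the case a single XML tree cannot express directly.
        found = []
        for a, (name_a, set_a) in enumerate(self.edges):
            for name_b, set_b in self.edges[a + 1:]:
                shared = set_a & set_b
                nested = set_a <= set_b or set_b <= set_a
                if shared and not nested:
                    found.append((name_a, name_b))
        return found


# Hypothetical sample: two verse lines, a phrase that straddles the
# line break, and a discontinuous span.
tokens = "Sing muse the wrath of Achilles that brought woe".split()
graph = TextHypergraph(tokens)
graph.mark("line1", range(0, 5))
graph.mark("line2", range(5, 9))
graph.mark("phrase", range(3, 6))
graph.mark("discontinuous", [0, 1, 8])

print(graph.text_of("phrase"))
print(graph.overlaps())
```

Because every range is just a set of token indices, overlap and discontinuity need no work-arounds here; the cost is that the tree-oriented XML toolchain no longer applies.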

You can’t read the text but here is a hypergraph visualization of an excerpt from Lewis Carroll’s “The Hunting of the Snark”:

The New Testament, the Hebrew Bible, to say nothing of the Rabbinic commentaries on the Hebrew Bible and centuries of commentary on other texts could profit from this approach.

Put your text to the test and share how to advance this technique!

July 27, 2017

Tired of Chasing Ephemera? Open Greek and Latin Design Sprint (bids in August, 2017)

Filed under: Classics,Greek,Humanities,Interface Research/Design,Language — Patrick Durusau @ 3:06 pm

Tired of reading/chasing the ephemera explosion in American politics?

I’ve got an opportunity for you to contribute to a project with texts preserved by hand for thousands of years!

Design Sprint for Perseus 5.0/Open Greek and Latin

From the webpage:

We announced in June that Center for Hellenic Studies had signed a contract with Intrepid.io to conduct a design sprint that would support Perseus 5.0 and the Open Greek and Latin collection that it will include. Our goal was to provide a sample model for a new interface that would support searching and reading of Greek, Latin, and other historical languages. The report from that sprint was handed over to CHS yesterday and we, in turn, have made these materials available, including both the summary presentation and associated materials. The goal is to solicit comment and to provide potential applicants to the planned RFP with access to this work as soon as possible.

The sprint took just over two weeks and was an intensive effort. An evolving Google Doc with commentary on the Intrepid Wrap-up slides for the Center for Hellenic studies should now be visible. Readers of the report will see that questions remain to be answered. How will we represent Perseus, Open Greek and Latin, Open Philology, and other efforts? One thing that we have added and that will not change will be the name of the system that this planned implementation phase will begin: whether it is Perseus, Open Philology or some other name, it will be powered by the Scaife Digital Library Viewer, a name that commemorates Ross Scaife, pioneer of Digital Classics and a friend whom many of us will always miss.

The Intrepid report also includes elements that we will wish to develop further — students of Greco-Roman culture may not find “relevance” a helpful way to sort search reports. The Intrepid Sprint greatly advanced our own thinking and provided us with a new starting point. Anyone may build upon the work presented here — but they can also suggest alternate approaches.

…

The core deliverables form an impressive list:

At the moment we would summarize core deliverables as:

A new reading environment that captures the basic functionality of the Perseus 4.0 reading environment but that is more customizable and that can be localized efficiently into multiple modern languages, with Arabic, Persian, German and English as the initial target languages. The overall Open Greek and Latin team is, of course, responsible for providing the non-English content. The Scaife DL Viewer should make it possible for us to localize into multiple languages as efficiently as possible.
The reading environment should be designed to support any CTS-compliant collection and should be easily configured with a look and feel for different collections.
The reading environment should contain a lightweight treebank viewer — we don’t need to support editing of treebanks in the reading environment. The functionality that the Alpheios Project provided for the first book of the Odyssey would be more than adequate. Treebanks are available under the label “diagram” when you double-click on a Greek word.
The reading environment should support dynamic word/phrase level alignments between source text and translation(s). Here again, the functionality that the Alpheios Project provided for the first book of the Odyssey would be adequate. More recent work implementing this functionality is visible at Tariq Yousef’s work at http://divan-hafez.com/ and http://ugarit.ialigner.com/.
The system must be able to search for both specific inflected forms and for all forms of a particular word (as in Perseus 4.0) in CTS-compliant epiDoc TEI XML. The search will build upon the linguistically analyzed texts available in https://github.com/gcelano/CTSAncientGreekXML. This will enable searching by dictionary entry, by part of speech, and by inflected form. For Greek, the base collection is visible at the First Thousand Years of Greek website (which now has begun to accumulate a substantial amount of later Greek). CTS-compliant epiDoc Latin texts can be found at https://github.com/OpenGreekAndLatin/csel-dev/tree/master/data and https://github.com/PerseusDL/canonical-latinLit/tree/master/data.
The system should ideally be able to search Greek and Latin that is available only as uncorrected OCR-generated text in hOCR format. Here the results may follow the image-front strategy familiar to academics from sources such as Jstor. If it is not feasible to integrate this search within the three months of core work, then we need a plan for subsequent integration that Leipzig and OGL members can implement later.
The new system must be scalable and updating from Lucene to Elasticsearch is desirable. While these collections may not be large by modern standards, they are substantial. Open Greek and Latin currently has c. 67 million words of Greek and Latin at various stages of post-processing and c. 90 million words of additional translations from Greek and Latin into English, French, German and Italian, while the Lace Greek OCR Project has OCR-generated text for 1100 volumes.
The system integrate translations and translation alignments into the searching system, so that users can search either in the original or in modern language translations where we provide this data. This goes back to work by David Bamman in the NEH-funded Dynamic Lexicon Project (when he was a researcher at Perseus at Tufts). For more recent examples of this, see http://divan-hafez.com/ and Ugarit. Note that one reason to adopt CTS URNs is to simplify the task of display translations of source texts — the system is only responsible for displaying translations insofar as they are available via the CTS API.
The system must provide initial support for a user profile. One benefit of the profile is that users will be able to define their own reading lists — and the Scaife DL Viewer will then be able to provide personalized reading support, e.g., word X already showed up in your reading at places A, B, and C, while word Y, which is new to you, will appear 12 times in the rest of your planned readings (i.e., you should think about learning that word). By adopting the CTS data model, we can make very precise reading lists, defining precise selections from particular editions of particular works. We also want to be able to support an initial set of user contributions that are (1) easy to implement technically and (2) easy for users to understand and perform. Thus we would support fixing residual data entry errors, creating alignments between source texts and translations, improving automated part of speech tagging and lemmatization but users would go to external resources to perform more complex tasks such as syntactic markup (treebanking).
We would welcome bids that bring to bear expertise in the EPUB format and that could help develop a model for representing CTS-compliant Greek and Latin sources in EPUB as a mechanism to make these materials available on smartphones. We can already convert our TEI XML into EPUB. The goal here is to exploit the easiest ways to optimize the experience. We can, for example, convert one or more of our Greek and Latin lexica into the EPUB Dictionary format and use our morphological analyses to generate links from particular forms in a text to the right dictionary entry or entries. Can we represent syntactically analyzed sentences with SVG? Can we include dynamic translation alignments?
Bids should consider including a design component. We were very pleased with the Design Sprint that took place in July 2017 and would like to include a follow-up Design Sprint in early 2018 that will consider (1) next steps for Greek and Latin and (2) generalizing our work to other historical languages. This Design Sprint might well go to a separate contractor (thus providing us also with a separate point of view on the work done so far).
Work must be built upon the Canonical Text Services Protocol. Bids should be prepared to build upon https://github.com/Capitains, but should also be able to build upon other CTS servers (e.g., https://github.com/ThomasK81/LightWeightCTSServer and cts.informatik.uni-leipzig.de).
All source code must be available on Github under an appropriate open license so that third parties can freely reuse and build upon it.
Source code must be designed and documented to facilitate actual (not just legally possible) reuse.
The contractor will have the flexibility to get the job done but will be expected to work as closely as possible with, and to draw wherever possible upon the on-going work done by, the collaborators who are contributing to Open Greek and Latin. The contractor must have the right to decide how much collaboration makes sense.
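Several deliverables lean on CTS URNs for "very precise reading lists, defining precise selections from particular editions of particular works." As a rough illustration of what such an identifier carries, here is a simplified Python sketch that splits a CTS URN into its parts. The function names and the sample URN are my assumptions for illustration; exemplar-level identifiers and subreferences are deliberately left out, so this is not a complete implementation of the CTS URN specification.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split urn:cts:NAMESPACE:TEXTGROUP[.WORK[.VERSION]][:PASSAGE]."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    work_parts = parts[3].split(".")
    if not work_parts[0] or len(work_parts) > 3:
        raise ValueError(f"unsupported work component: {parts[3]!r}")
    return CtsUrn(
        namespace=parts[2],
        textgroup=work_parts[0],
        work=work_parts[1] if len(work_parts) > 1 else None,
        version=work_parts[2] if len(work_parts) > 2 else None,
        passage=parts[4] if len(parts) == 5 and parts[4] else None,
    )


def passage_range(passage: str) -> tuple:
    """Return (start, end); a single citation is its own end point."""
    start, _, end = passage.partition("-")
    return (start, end or start)


# Illustrative URN only; the version label is an assumption.
urn = "urn:cts:greekLit:tlg0012.tlg002.perseus-grc2:1.1-1.10"
parsed = parse_cts_urn(urn)
print(parsed)
print(passage_range(parsed.passage))
```

A reading list in this style is then just a list of such URNs, each naming a passage in a particular edition of a particular work.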

You can use your data science skills to sell soap, cars, ED treatments, or even apocalyptically narcissistic politicians, or you can advance Perseus 5.0.

Your call.

July 13, 2017

Locate Your Representative/Senator In Hell

Filed under: Government,Humanities,Literature,Maps,Politics,Visualization — Patrick Durusau @ 3:38 pm

Mapping Dante’s Inferno, One Circle of Hell at a Time by Anika Burgess.

From the post:

“I found myself, in truth, on the brink of the valley of the sad abyss that gathers the thunder of an infinite howling. It was so dark, and deep, and clouded, that I could see nothing by staring into its depths.”

This is the vision that greets the author and narrator upon entry to the first circle of Hell—Limbo, home to honorable pagans—in Dante Alighieri’s Inferno, the first part of his 14th-century epic poem, Divine Comedy. Before Dante and his guide, the classical poet Virgil, encounter Purgatorio and Paradiso, they must first journey through a multilayered hellscape of sinners—from the lustful and gluttonous of the early circles to the heretics and traitors that dwell below. This first leg of their journey culminates, at Earth’s very core, with Satan, encased in ice up to his waist, eternally gnawing on Judas, Brutus, and Cassius (traitors to God) in his three mouths. In addition to being among the greatest Italian literary works, Divine Comedy also heralded a craze for “infernal cartography,” or mapping the Hell that Dante had created.
… (emphasis in original)

Burgess has collected seven (7) traditional maps of the Inferno. I take them to be early essays in the art of visualization. They are by no means, individually or collectively, the definitive visualizations of the Inferno.

The chief deficit of all seven, to me, is the narrowness of the circles/ledges. As I read the Inferno, Dante and Virgil are not pressed for space. Expanding and populating the circles more realistically is one starting point.

The Inferno has no shortage of characters in each circle. Dante even predicts the fate of Pope Boniface VIII, placing him in the eighth circle of Hell (the simoniacs, a subclass of fraud). (Use the online Britannica with caution. Its entry for Boniface VIII doesn’t even mention the Inferno, as of July 13, 2017.)

I would like to think being condemned to Hell by no less than Dante would rate at least a mention in my biography!

Sadly, Dante is no longer around to add to the populace of the Inferno but new visualizations could take the opportunity to update the resident list for Hell!

It’s an exercise in visualization, mapping, 14th-century literature, and an excuse to learn the names of your representative and senators.

Enjoy!

July 11, 2017

The Classical Language Toolkit

Filed under: Classics,History,Humanities,Natural Language Processing — Patrick Durusau @ 4:26 pm

The Classical Language Toolkit

From the webpage:

The Classical Language Toolkit (CLTK) offers natural language processing (NLP) support for the languages of Ancient, Classical, and Medieval Eurasia. Greek and Latin functionality are currently most complete.

Goals

compile analysis-friendly corpora;

collect and generate linguistic data;

act as a free and open platform for generating scientific research.

You are sure to find one or more languages of interest:

Collecting, analyzing and mapping Tweets can be profitable and entertaining, but tomorrow or perhaps by next week, almost no one will read them again.

The texts in this project survived by hand preservation for thousands of years. People are still reading them.

How about you?

June 8, 2017

Roman Roads (Drawn Like The London Subway)

Filed under: History,Humanities,Mapping,Maps,Visualization — Patrick Durusau @ 8:20 pm

Roman Roads by Sasha Trubetskoy.

See Trubetskoy’s website for a much better rendering of this map of Roman roads, drawn subway-style.

From the post:

It’s finally done. A subway-style diagram of the major Roman roads, based on the Empire of ca. 125 AD.

Creating this required far more research than I had expected—there is not a single consistent source that was particularly good for this. Huge shoutout to: Stanford’s ORBIS model, The Pelagios Project, and the Antonine Itinerary (found a full PDF online but lost the url).

The lines are a combination of actual, named roads (like the Via Appia or Via Militaris) as well as roads that do not have a known historic name (in which case I creatively invented some names). Skip to the “Creative liberties taken” section for specifics.

How long would it actually take to travel this network? That depends a lot on what method of transport you are using, which depends on how much money you have. Another big factor is the season – each time of year poses its own challenges. In the summer, it would take you about two months to walk on foot from Rome to Byzantium. If you had a horse, it would only take you a month.

However, no sane Roman would use only roads where sea travel is available. Sailing was much cheaper and faster – a combination of horse and sailboat would get you from Rome to Byzantium in about 25 days, Rome to Carthage in 4-5 days. Check out ORBIS if you want to play around with a “Google Maps” for Ancient Rome. I decided not to include maritime routes on the map for simplicity’s sake.
…

Subway-style drawings lose details but make relationships between routes clearer. Or at least that is one of the arguments in their favor.

Thoughts on a subway-style drawing that captures the development of the Roman road system? To illustrate how that corresponds in broad strokes to the expansion of Rome?

Be sure to visit Trubetskoy’s homepage. Lots of interesting maps and projects.

Medieval illuminated manuscripts

Filed under: Art,Humanities,Manuscripts — Patrick Durusau @ 4:55 pm

Medieval illuminated manuscripts by Robert Miller (reference and instruction librarian at the University of Maryland University College)

From the post:

With their rich representation of medieval life and thought, illuminated manuscripts serve as primary sources for scholars in any number of fields: history, literature, art history, women’s studies, religious studies, philosophy, the history of science, and more.

But you needn’t be conducting research to immerse yourself in the world of medieval manuscripts. The beauty, pathos, and earthy humor of illuminated manuscripts make them a delight for all. Thanks to digitization efforts by libraries and museums worldwide, the colorful creations of the medieval imagination—dreadful demons, armies of Amazons, gardens, gems, bugs, birds, celestial vistas, and simple scenes of everyday life—are easily accessible online.
…

I count:

10 Twitter accounts to follow/search
11 sites with manuscript collections
15 blogs and other manuscript sites

A great resource for students of all ages who are preparing research papers!

Enjoy and pass this one along!

June 7, 2017

Were the Greeks and Romans White Supremacists?

Filed under: Art,Bias,Diversity,History,Humanities — Patrick Durusau @ 3:02 pm

Why We Need to Start Seeing the Classical World in Color by Sarah E. Bond.

From the post:

Modern technology has revealed an irrefutable, if unpopular, truth: many of the statues, reliefs, and sarcophagi created in the ancient Western world were in fact painted. Marble was a precious material for Greco-Roman artisans, but it was considered a canvas, not the finished product for sculpture. It was carefully selected and then often painted in gold, red, green, black, white, and brown, among other colors.

A number of fantastic museum shows throughout Europe and the US in recent years have addressed the issue of ancient polychromy. The Gods in Color exhibit travelled the world between 2003–15, after its initial display at the Glyptothek in Munich. (Many of the photos in this essay come from that exhibit, including the famed Caligula bust and the Alexander Sarcophagus.) Digital humanists and archaeologists have played a large part in making those shows possible. In particular, the archaeologist Vinzenz Brinkmann, whose research informed Gods in Color, has done important work, applying various technologies and ultraviolet light to antique statues in order to analyze the minute vestiges of paint on them and then recreate polychrome versions.

Acceptance of polychromy by the public is another matter. A friend peering up at early-20th-century polychrome terra cottas of mythological figures at the Philadelphia Museum of Art once remarked to me: “There is no way the Greeks were that gauche.” How did color become gauche? Where does this aesthetic disgust come from? To many, the pristine whiteness of marble statues is the expectation and thus the classical ideal. But the equation of white marble with beauty is not an inherent truth of the universe. Where this standard came from and how it continues to influence white supremacist ideas today are often ignored.

Most museums and art history textbooks contain a predominantly neon white display of skin tone when it comes to classical statues and sarcophagi. This has an impact on the way we view the antique world. The assemblage of neon whiteness serves to create a false idea of homogeneity — everyone was very white! — across the Mediterranean region. The Romans, in fact, did not define people as “white”; where, then, did this notion of race come from?

…

A great post and reminder that learning history (or current events) through a particular lens isn’t the same as the only view of history (or current events).

I originally wrote “an accurate view of history….” but that’s not true. At best we have one or more views and when called upon to act, make decisions upon those views. “Accuracy” is something that lies beyond our human grasp.

The reminder I would add to this post is that recognition of a lens, in this case, the absence of color in our learning of history, isn’t overcome by our naming it and perhaps nodding in agreement that, yes, that was a shortfall in our learning.

“Knowing” about the coloration of familiar artwork doesn’t erase centuries of considering it without color. No amount of pretending will make it otherwise.

Humanists should learn about and promote the use of colorization so the youth of today learn different traditions than the ones we learned.

February 1, 2017

Digital Humanities / Studies: U.Pitt.Greenberg

Filed under: Digital Research,Humanities,Literature,Social Sciences,XML,XQuery — Patrick Durusau @ 9:13 pm

Digital Humanities / Studies: U.Pitt.Greenberg maintained by Elisa E. Beshero-Bondar.

I discovered this syllabus and course materials by accident when one of its modules on XQuery turned up in a search. Backing out of that module I discovered this gem of a digital humanities course.

The course description:

Our course in “digital humanities” and “digital studies” is designed to be interdisciplinary and practical, with an emphasis on learning through “hands-on” experience. It is a computer course, but not a course in which you learn programming for the sake of learning a programming language. It’s a course that will involve programming, and working with coding languages, and “putting things online,” but it’s not a course designed to make you, in fifteen weeks, a professional website designer. Instead, this is a course in which we prioritize what we can investigate in the Humanities and related Social Sciences fields about cultural, historical, and literary research questions through applications in computer coding and programming, which you will be learning and applying as you go in order to make new discoveries and transform cultural objects—what we call “texts” in their complex and multiple dimensions. We think of “texts” as the transmittable, sharable forms of human creativity (mainly through language), and we interface with a particular text in multiple ways through print and electronic “documents.” When we refer to a “document,” we mean a specific instance of a text, and much of our work will be in experimenting with the structures of texts in digital document formats, accessing them through scripts we write in computer code—scripts that in themselves are a kind of text, readable both by humans and machines.

Your professors are scholars and teachers of humanities, not computer programmers by trade, and we teach this course from our backgrounds (in literature and anthropology, respectively). We teach this course to share coding methods that are highly useful to us in our fields, with an emphasis on working with texts as artifacts of human culture shaped primarily with words and letters—the forms of “written” language transferable to many media (including image and sound) that we can study with computer modelling tools that we design for ourselves based on the questions we ask. We work with computers in this course as precision instruments that help us to read and process great quantities of information, and that lead us to make significant connections, ask new kinds of questions, and build models and interfaces to change our reading and thinking experience as people curious about human history, culture, and creativity.

Our focus in this course is primarily analytical: to apply computer technologies to represent and investigate cultural materials. As we design projects together, you will gain practical experience in editing and you will certainly fine-tune your precision in writing and thinking. We will be working primarily with eXtensible Markup Language (XML) because it is a powerful tool for modelling texts that we can adapt creatively to our interests and questions. XML represents a standard in adaptability and human-readability in digital code, and it works together with related technologies with which you will gain working experience: You’ll learn how to write XPath expressions: a formal language for searching and extracting information from XML code which serves as the basis for transforming XML into many publishable forms, using XSLT and XQuery. You’ll learn to write XSLT: a programming “stylesheet” transforming language designed to convert XML to publishable formats, as well as XQuery, a query (or search) language for extracting information from XML files bundled collectively. You will learn how to design your own systematic coding methods to work on projects, and how to write your own rules in schema languages (like Schematron and Relax-NG) to keep your projects organized and prevent errors. You’ll gain experience with an international XML language called TEI (after the Text Encoding Initiative) which serves as the international standard for coding digital archives of cultural materials. Since one of the best and most widely accessible ways to publish XML is on the worldwide web, you’ll gain working experience with HTML code (a markup language that is a kind of XML) and styling HTML with Cascading Stylesheets (CSS). 
We will do all of this with an eye to your understanding how coding works—and no longer relying without question on expensive commercial software as the “only” available solution, because such software is usually not designed with our research questions in mind.

We think you’ll gain enough experience at least to become a little dangerous, and at the very least more independent as investigators and makers who wield computers as fit instruments for your own tasks. Your success will require patience, dedication, and regular communication and interaction with us, working through assignments on a daily basis. Your success will NOT require perfection, but rather your regular efforts throughout the course, your documenting of problems when your coding doesn’t yield the results you want. Homework exercises are a back-and-forth, intensive dialogue between you and your instructors, and we plan to spend a great deal of time with you individually over these as we work together. Our guiding principle in developing assignments and working with you is that the best way for you to learn and succeed is through regular practice as you hone your skills. Our goal is not to make you expert programmers (as we are far from that ourselves)! Rather, we want you to learn how to manipulate coding technologies for your own purposes, how to track down answers to questions, how to think your way algorithmically through problems and find good solutions.
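For a taste of the XPath searching the course description mentions, here is a minimal sketch in Python. It is not taken from the course materials: the sample document, its element names, and the helper functions poem_lines and heads are all made up for illustration, and Python's standard library supports only a subset of XPath 1.0.

```python
import xml.etree.ElementTree as ET

# A made-up, TEI-flavored fragment (hypothetical, not from the course).
SAMPLE = """
<text>
  <div type="poem" n="1">
    <head>First Poem</head>
    <l n="1">A line of verse</l>
    <l n="2">Another line of verse</l>
  </div>
  <div type="letter" n="2">
    <head>A Letter</head>
    <p>Some prose.</p>
  </div>
</text>
"""


def poem_lines(xml_source):
    """Return the text of every <l> directly inside a div of type 'poem'."""
    root = ET.fromstring(xml_source)
    # The predicate [@type='poem'] filters divs by attribute value.
    return [line.text for line in root.findall(".//div[@type='poem']/l")]


def heads(xml_source):
    """Return the heading text of each top-level <div>."""
    root = ET.fromstring(xml_source)
    return [head.text for head in root.findall("./div/head")]


if __name__ == "__main__":
    print(poem_lines(SAMPLE))
    print(heads(SAMPLE))
```

A full XPath, XSLT, or XQuery processor goes well beyond this, but the idea of addressing parts of a document by path and predicate is the same.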

Skimming the syllabus rekindles an awareness of the distinction between the “hard” sciences and the “difficult” ones.

Enjoy!

Update:

After yesterday’s post, Elisa Beshero-Bondar tweeted this one course is now two:

At a new homepage: newtFire {dh|ds}!

Enjoy!

January 13, 2017

Humanities Digital Library [A Ray of Hope]

Filed under: Digital Library,Humanities,Library,Open Access — Patrick Durusau @ 10:16 am

Humanities Digital Library (Launch Event)

From the webpage:

Date
17 Jan 2017, 18:00 to 17 Jan 2017, 19:00

…

Venue

IHR Wolfson Conference Suite, NB01/NB02, Basement, IHR, Senate House, Malet Street, London WC1E 7HU

Description

6-7pm, Tuesday 17 January 2017

Wolfson Conference Suite, Institute of Historical Research

Senate House, Malet Street, London, WC1E 7HU

www.humanities-digital-library.org

About the Humanities Digital Library

The Humanities Digital Library is a new Open Access platform for peer reviewed scholarly books in the humanities.

The Library is a joint initiative of the School of Advanced Study, University of London, and two of the School’s institutes—the Institute of Historical Research and the Institute of Advanced Legal Studies.

From launch, the Humanities Digital Library offers scholarly titles in history, law and classics. Over time, the Library will grow to include books from other humanities disciplines studied and researched at the School of Advanced Study. Partner organisations include the Royal Historical Society whose ‘New Historical Perspectives’ series will appear in the Library, published by the Institute of Historical Research.

Each title is published as an open access PDF, with copies also available to purchase in print and EPUB formats. Scholarly titles come in several formats—including monographs, edited collections and longer and shorter form works.
(emphasis in the original)

Timely evidence that not everyone in the UK is barking mad! “Barking mad” being the only explanation I can offer for the Investigatory Powers Bill.

I won’t be attending, but if you can, do, and support the Humanities Digital Library after it opens.

December 2, 2016

War and Peace & R

Filed under: Humanities,Literature,R,Visualization — Patrick Durusau @ 5:13 pm

No, not a post about R versus Python but about R and Tolstoy’s War and Peace.

Using R to Gain Insights into the Emotional Journeys in War and Peace by Wee Hyong Tok.

From the post:

How do you read a novel in record time, and gain insights into the emotional journey of main characters, as they go through various trials and tribulations, as an exciting story unfolds from chapter to chapter?

I remembered my experiences when I start reading a novel, and I get intrigued by the story, and simply cannot wait to get to the last chapter. I also recall many conversations with friends on some of the interesting novels that I have read awhile back, and somehow have only vague recollection of what happened in a specific chapter. In this post, I’ll work through how we can use R to analyze the English translation of War and Peace.

War and Peace is a novel by Leo Tolstoy, and captures the salient points about Russian history from the period 1805 to 1812. The novel consists of the stories of five families, and captures the trials and tribulations of various characters (e.g. Natasha and Andre). The novel consists of about 1400 pages, and is one of the longest novels that have been written.

We hypothesize that if we can build a dashboard (shown below), this will allow us to gain insights into the emotional journey undertaken by the characters in War and Peace.
…

Impressive work, even though I would not use it as a short-cut to “read a novel in record time.”

Rather I take this as an alternative way of reading War and Peace, one that can capture insights a casual reader may miss.

Moreover, the techniques demonstrated here could be used with other works of literature, or even non-fictional works.

Imagine conducting this analysis over the reportedly more than 7,000 page full CIA Torture Report, for example.

A heatmap does not connect any dots, but points a user towards places where interesting dots may be found.

Certainly a tool for exploring large releases/leaks of text data.
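To make the idea concrete, here is a toy sketch in Python of scoring sections of a text and flagging the ones worth a closer look. It is not the method from the quoted post, which uses R. The word lists, the scoring rule, and the function names are assumptions for illustration only; real work would use a published sentiment lexicon.

```python
import re

# Toy lexicons, assumed for illustration; far too small for real use.
POSITIVE = {"joy", "love", "happy", "peace", "hope"}
NEGATIVE = {"war", "death", "fear", "grief", "pain"}


def chapter_score(text):
    """Score one chapter: (positive hits - negative hits) / word count."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def emotional_arc(chapters):
    """Return one score per chapter, in reading order."""
    return [chapter_score(c) for c in chapters]


def hot_spots(chapters, threshold):
    """Return indexes of chapters whose absolute score meets the threshold."""
    arc = emotional_arc(chapters)
    return [i for i, score in enumerate(arc) if abs(score) >= threshold]
```

The list returned by emotional_arc is what a heatmap would color. hot_spots only tells you where to start reading, which is the point made above.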

Enjoy!

PS: Know of any large, tiresome, obscure-on-purpose government reports to practice this method on?

November 22, 2016

Practical Palaeography: Recreating the Exeter Book in a Modern Day ‘Scriptorium’

Filed under: Humanities,Manuscripts,Palaeography — Patrick Durusau @ 11:31 am

Practical Palaeography: Recreating the Exeter Book in a Modern Day ‘Scriptorium’

From the post:

Dr Johanna Green is a lecturer in Book History and Digital Humanities at the University of Glasgow. Her PhD (English Language, University of Glasgow 2012) focused on a palaeographical study of the textual division and subordination of the Exeter Book manuscript. Here, she tells us about the first of two sessions she led for the Society of Northumbrian Scribes, a group of calligraphers based in North East England, bringing palaeographic research and modern-day calligraphy together for the public.
(emphasis in original)

Not phrased in subject identity language, but concerns familiar to the topic map community are not far away:

…
My own research centres on the scribal hand of the manuscript, specifically the ways in which the poems are divided and subdivided from one another and the decorative designs used for these litterae notabiliores throughout. For much of my research, I have spent considerable time (perhaps more than I am willing to admit) wondering where one ought to draw the line with palaeography. When do the details become so tiny to no longer be of any significance? When are they just important enough to mean something significant for our understanding of how the manuscript was created and arranged? How far am I willing to argue that these tiny features have significant impact? Is, for example, this littera notabilior Đ on f. 115v (Judgement Day I, left) different enough in a significant way to this H on f.97v, (The Partridge, bottom right), and in turn are both of these litterae notabiliores performing a different function than the H on f.98r (Soul and Body II, far right)?[5]
(emphasis in original, footnote omitted)
…

When Dr. Green says:

…When do the details become so tiny to no longer be of any significance?…

I would say: When do the subjects (details) become so tiny we want to pass over them in silence? That is, they could be represented in a topic map but are not.

Green ends her speculation, to a degree, by enlisting scribes to re-create the manuscript of interest under her observation.

I’ll leave her conclusions for her post but consider a secondary finding:

…
The experience also made me realise something else: I had learned much by watching them write and talking to them during the process, but I had also learned much by trying to produce the hand myself. Rather than return to Glasgow and teach my undergraduates the finer details of the script purely through verbal or written description, perhaps providing space for my students to engage in the materials of manuscript production, to try out copying a script/exemplar for themselves would help increase their understanding of the process of writing and, in turn, deepen their knowledge of the constituent parts of a letter and their significance in palaeographic endeavour. This last is something I plan to include in future palaeography teaching.
…

Dr. Green’s concern over palaeographic detail illustrates two important points about topic maps:

Potential subjects for a topic map are always unbounded.
Different people “see” different subjects.

Which also accounts for my yawn when Microsoft drops the Microsoft Concept Graph of more than 5.4 million concepts.

…[M]ore than 5.4 million concepts[?]

Hell, Copleston’s A History of Philosophy easily has more concepts.

But the Microsoft Concept Graph is more useful than a topic map of Copleston in your daily, shallow, social sea.

What subjects do you see and how would capturing them and their identities make a difference in your life (professional or otherwise)?

October 18, 2016

S20-211a Hebrew Bible Technology Buffet – November 20, 2016 (save that date!)

Filed under: Bible,Humanities — Patrick Durusau @ 7:10 pm

S20-211a Hebrew Bible Technology Buffet

From the webpage:

On Sunday, November 20th 2016, from 1:00 PM to 3:30 PM, GERT will host a session with the theme “Hebrew Bible Technology Buffet” at the SBL Annual Meeting in room 305 of the Convention Center. Barry Bandstra of Hope College will preside.

The session has four presentations:

Janet Dyk, Vrije Universiteit Amsterdam and Dirk Roorda, Royal Netherlands Academy of Arts and Sciences
Valence Patterns and Translation Proposals within SHEBANQ (30 min)

Mathias Coeckelbergs, Katholieke Universiteit Leuven
Using Topic Modeling for Multilingual Concept Comparison. Evidence from the Hebrew Bible and the Septuagint (30 min)

Randall Tan, Global Bible Initiative
Total Link: The Power of Alignment for Bible Translation (30 min)

Drayton C. Benner, Miklal Software Solutions and James R. Covington, University of Chicago
A New Transliteration of the Hebrew Bible (30 min)

Presentations will be followed by a discussion session.

You will need to register for the Annual Meeting to attend the session.

That assumes they are checking “badges” to make sure attendees have registered. Registration is very important to those who “foster” biblical scholarship by comping travel and rooms for their close friends.

PS: The website reports non-member registration is $490.00. I would like to think that is a misprint but I suspect it’s not.

That’s one way to isolate yourself from an interested public. By way of contrast, snail-mail Biblical Greek courses in the 1890s had tens of thousands of subscribers. When academics complain of being marginalized, use this as an example of self-marginalization.

August 24, 2016

DATNAV: …Navigate and Integrate Digital Data in Human Rights Research [Ethics]

Filed under: Ethics,Human Rights,Humanities — Patrick Durusau @ 2:54 pm

DATNAV: New Guide to Navigate and Integrate Digital Data in Human Rights Research by Zara Rahman.

From the introduction in the Guide:

From online videos of rights violations, to satellite images of environmental degradation, to eyewitness accounts disseminated on social media, we have access to more relevant data today than ever before. When used responsibly, this data can help human rights professionals in the courtroom, when working with governments and journalists, and in documenting historical record.

Acquiring, disseminating and storing digital data is also becoming increasingly affordable. As costs continue to decrease and new platforms are developed, opportunities for harnessing these data sources for human rights work increase.

But integrating data collection and management into the day to day work of human rights research and documentation can be challenging, even overwhelming, for individuals and organisations. This guide is designed to help you navigate and integrate new data forms into your human rights work.

It is the result of a collaboration between Amnesty International, Benetech, and The Engine Room that began in late 2015. We conducted a series of interviews, community consultations, and surveys to understand whether digital data was being integrated into human rights work. In the vast majority of cases, we found that it wasn’t. Why?

Mainly, human rights researchers appeared to be overwhelmed by the possibilities. In the face of limited resources, not knowing how to get started or whether it would be worthwhile, most people we spoke to refrained from even attempting to strengthen their work with digital data.

To support everyone in the human rights field in navigating this complex environment, we convened a group of 16 researchers and technical experts in a castle outside Berlin, Germany in May 2016 to draft this guide over four days of intense reflection and writing.

There are additional reading resources at: https://engn.it/datnav.

The issue of ethics comes up quickly in human rights research and here the authors write:

Seven things to consider before using digital data for human rights

Would digital data genuinely help answer your research questions? What are the pros and cons of the particular source or medium? What might you learn from past uses of similar technology?

What sources are likely to be collecting or capturing the kinds of information you need? What is the context in which it is being produced and used? Will the people or organisations on which your work is focused be receptive to these types of data?

How easily will new forms of data integrate into your existing workflow? Do you realistically have the time and money to collect, store, analyze and especially to verify this data? Can anyone on your team comfortably support the technology?

Who owns or controls the data you will be using? Companies, government, or adversaries? How difficult is it to get? Is it a fair or legal collection method? What is the internal stance on this? Do you have true informed consent from individuals?

How will digital divides and differences in local access to online platforms, computers or phones, affect representation of different populations? Would conclusions based on the data reinforce inequalities, stereotypes or blind spots?

Are organisational protocols for confidentiality and security in digital communication and data handling sufficiently robust to deal with risks to you, your partners and sources? Are security tools and processes updated frequently enough?

Do you have safeguards in place to prevent and deal with any secondary trauma from viewing digital content that you or your partners may experience at personal and organisational levels?

(Page 15)

Before I reveal my #0 consideration, consider the following story as setting the background.

At a death penalty seminar (the death penalty certainly being a violation of human rights), a practitioner reported a case where the prosecuting attorney said a particular murder case was a question of “good versus evil.” In the course of preparing for that case, it was discovered that while teaching a course for paralegals, the prosecuting attorney had a sexual affair with one of his students. Affidavits were obtained, etc., and a motion was filed in the pending criminal case entitled: Motion To Define Good and Evil.

There was a mix of opinions on whether blind-siding the prosecuting attorney with his personal failings, with the fallout for his family, was a legitimate approach.

My question was: Did they consider asking the prosecuting attorney to take the death penalty off the table, in exchange for not filing the Motion To Define Good and Evil? A question of effective use of the information and not about the legitimacy of using it.

For human rights violations, my #0 Question would be:

0. Can the information be used to stop and/or redress human rights violations without harming known human rights victims?

The other seven questions, like “…all deliberate speed…,” are a game played by non-victims.

July 31, 2016

Digital Humanities In the Library

Filed under: Digital Library,Humanities,Library — Patrick Durusau @ 3:17 pm

Digital Humanities In the Library / Of the Library: A dh+lib Special Issue

A special issue of dh+lib introduced by Sarah Potvin, Thomas Padilla and Caitlin Christian-Lamb in their essay: Digital Humanities In the Library / Of the Library, saying:

What are the points of contact between digital humanities and libraries? What is at stake, and what issues arise when the two meet? Where are we, and where might we be going? Who are “we”? By posing these questions in the CFP for a new dh+lib special issue, the editors hoped for sharp, provocative meditations on the state of the field. We are proud to present the result, ten wide-ranging contributions by twenty-two authors, collectively titled “Digital Humanities In the Library / Of the Library.”

We make the in/of distinction pointedly. Like the Digital Humanities (DH), definitions of library community are typically prefigured by “inter-” and “multi-” frames, rendered as work and values that are interprofessional, interdisciplinary, and multidisciplinary. Ideally, these characterizations attest to diversified yet unified purpose, predicated on the application of disciplinary expertise and metaknowledge to address questions that resist resolution from a single perspective. Yet we might question how a combinatorial impulse obscures the distinct nature of our contributions and, consequently, our ability to understand and respect individual agency. Working across the similarly encompassing and amorphous contours of the Digital Humanities compels the library community to reckon with its composite nature.
…

All of the contributions merit your attention but I was especially taken by When Metadata Becomes Outreach: Indexing, Describing, and Encoding For DH by Emma Annette Wilson and Mary Alexander, which has this gem that will resonate with topic map fans:

…
DH projects require high-quality metadata in order to thrive, and the bigger the project, the more important that metadata becomes to make data discoverable, navigable, and open to computational analysis. The functions of all metadata are to allow our users to identify and discover resources through records acting as surrogates of resources, and to discover similarities, distinctions, and other nuances within single texts or across a corpus. High quality metadata brings standardization to the project by recording elements’ definitions, obligations, repeatability, rules for hierarchical structure, and attributes. Input guidelines and the use of controlled vocabularies bring consistencies that promote findability for researchers and users alike.
…

Modulo my reservations about the data/metadata distinction depending upon a point of view and all of them being subjects in any event, it’s hard to think of a clearer statement of the value that a topic map could bring to a DH project.

Consistencies can peacefully co-exist with historical or present-day inconsistencies, at least so long as you are using a topic map.

I commend the entire issue to you for reading!

June 19, 2016

Electronic Literature Organization

Filed under: Humanities,Literature,Media — Patrick Durusau @ 4:04 pm

Electronic Literature Organization

From the “What is E-Lit” page:

Electronic literature, or e-lit, refers to works with important literary aspects that take advantage of the capabilities and contexts provided by the stand-alone or networked computer. Within the broad category of electronic literature are several forms and threads of practice, some of which are:

Hypertext fiction and poetry, on and off the Web

Kinetic poetry presented in Flash and using other platforms

Computer art installations which ask viewers to read them or otherwise have literary aspects

Conversational characters, also known as chatterbots

Interactive fiction

Literary apps

Novels that take the form of emails, SMS messages, or blogs

Poems and stories that are generated by computers, either interactively or based on parameters given at the beginning

Collaborative writing projects that allow readers to contribute to the text of a work

Literary performances online that develop new ways of writing

The ELO showcase, created in 2006 and with some entries from 2010, provides a selection of outstanding examples of electronic literature, as do the two volumes of our Electronic Literature Collection.

The field of electronic literature is an evolving one. Literature today not only migrates from print to electronic media; increasingly, “born digital” works are created explicitly for the networked computer. The ELO seeks to bring the literary workings of this network and the process-intensive aspects of literature into visibility.

The confrontation with technology at the level of creation is what distinguishes electronic literature from, for example, e-books, digitized versions of print works, and other products of print authors “going digital.”

Electronic literature often intersects with conceptual and sound arts, but reading and writing remain central to the literary arts. These activities, unbound by pages and the printed book, now move freely through galleries, performance spaces, and museums. Electronic literature does not reside in any single medium or institution.

…

I was looking for a recent presentation by Allison Parrish on bots when I encountered the Electronic Literature Organization (ELO).

I was attracted by the bot discussion at a recent conference but as you can see, the range of activities of the ELO is much broader.

Enjoy!

April 6, 2016

Exploratory Programming for the Arts and Humanities

Filed under: Art,Humanities,Programming — Patrick Durusau @ 8:28 pm

Exploratory Programming for the Arts and Humanities by Nick Montfort.

From the webpage:

This book introduces programming to readers with a background in the arts and humanities; there are no prerequisites, and no knowledge of computation is assumed. In it, Nick Montfort reveals programming to be not merely a technical exercise within given constraints but a tool for sketching, brainstorming, and inquiring about important topics. He emphasizes programming’s exploratory potential—its facility to create new kinds of artworks and to probe data for new ideas.

The book is designed to be read alongside the computer, allowing readers to program while making their way through the chapters. It offers practical exercises in writing and modifying code, beginning on a small scale and increasing in substance. In some cases, a specification is given for a program, but the core activities are a series of “free projects,” intentionally underspecified exercises that leave room for readers to determine their own direction and write different sorts of programs. Throughout the book, Montfort also considers how computation and programming are culturally situated—how programming relates to the methods and questions of the arts and humanities. The book uses Python and Processing, both of which are free software, as the primary programming languages.

Full Disclosure: I haven’t seen a copy of Exploratory Programming.

I am reluctant to part with $40.00 US for either print or an electronic version where the major heads in the table of contents read as follows:

1 Modifying a Program

2 Calculating

3 Double, Double

4 Programming Fundamentals

5 Standard Starting Points

6 Text I

7 Text II

8 Image I

9 Image II

10 Text III

11 Statistics and Visualization

12 Animation

13 Sound

14 Interaction

15 Onward

The table of contents shows more than one hundred pages out of two hundred and sixty-three are spent on introduction to computer programming topics.

Text, which has a healthy section on string operations, merits a mere seventy pages. The other one hundred pages are split between visualization, sound, animation, etc.

Compare that table of contents with this one*:

Chapter One – Modular Programming: An Approach

Chapter Two – Data Entry and Text Verification

Chapter Three – Index and Concordance

Chapter Four – Text Criticism

Chapter Five – Improved Searching Techniques

Chapter Six – Morphological Analysis

Which table of contents promises to be more useful for exploration?

Personal computers are vastly more powerful today than when the second table of contents was penned.

Yet, students start off as though they are going to write their own tools from scratch. Unlikely and certainly not the best use of their time.

In-depth coverage of NLTK (the Natural Language Toolkit), applied to historical or contemporary texts, would teach them a useful tool. A tool they could apply to other material.
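To show what a text tool of this kind does, here is a bare-bones keyword-in-context (concordance) sketch using only the Python standard library. It is an illustration, not NLTK's implementation. The function name kwic and its width parameter are my own, and in practice a student would reach for the toolkit's concordance support rather than maintain something like this.

```python
import re


def kwic(text, keyword, width=2):
    """Return (left, keyword, right) context tuples for each hit.

    `width` is the number of words of context kept on each side.
    Tokenization is a crude lowercase word split, assumed for illustration.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


if __name__ == "__main__":
    sample = "In the beginning was the Word, and the Word was with God"
    for left, word, right in kwic(sample, "word"):
        print(f"{left:>12} [{word}] {right}")
```

Twenty lines get you a concordance of sorts. The time is better spent learning what to ask of one.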

To cover machine learning, consider Weka. A tool students can learn in class and then apply in new and different situations.

There are tools for image and sound analysis but the important term is tool.

Just as we don’t teach students to make their own paper, we should focus on enabling them to reap the riches that modern software tools offer.

Or to put it another way, let’s stop repeating the past and move forward.

March 12, 2016

Laypersons vs. Scientists – “…laypersons may be prone to biases…”

Filed under: Humanities,Psychology,Science — Patrick Durusau @ 9:09 pm

The “distinction” between laypersons and scientists is more a world view about some things than “all scientists are rational” or “all laypersons are irrational.” Scientists and laypersons can be just as rational and/or irrational, depending upon the topic at hand.

Having said that, The effects of social identity threat and social identity affirmation on laypersons’ perception of scientists by Peter Nauroth, et al., finds, unsurprisingly, that if a layperson’s social identity is threatened by research, they have a less favorable view of the scientists involved.

Abstract:

Public debates about socio-scientific issues (e.g. climate change or violent video games) are often accompanied by attacks on the reputation of the involved scientists. Drawing on the social identity approach, we report a minimal group experiment investigating the conditions under which scientists are perceived as non-prototypical, non-reputable, and incompetent. Results show that in-group affirming and threatening scientific findings (compared to a control condition) both alter laypersons’ evaluations of the study: in-group affirming findings lead to more positive and in-group threatening findings to more negative evaluations. However, only in-group threatening findings alter laypersons’ perceptions of the scientists who published the study: scientists were perceived as less prototypical, less reputable, and less competent when their research results imply a threat to participants’ social identity compared to a non-threat condition. Our findings add to the literature on science reception research and have implications for understanding the public engagement with science.

Perceived attacks on personal identity have negative consequences for the “reception” of science.

Implications for public engagement with science

Our findings have immediate implications for public engagement with science activities. When laypersons perceive scientists as less competent, less reputable, and not representative of the scientific community and the scientist’s opinion as deviating from the current scientific state-of-the-art, laypersons might be less willing to participate in constructive discussions (Schrodt et al., 2009). Furthermore, our mediation analysis suggests that these negative perceptions deepen the trench between scientists and laypersons concerning the current scientific state-of-the-art. We speculate that these biases might actually even lead to engagement activities to backfire: instead of developing a mutual understanding they might intensify laypersons’ misconceptions about the scientific state-of-the-art. Corroborating this hypothesis, Binder et al. (2011) demonstrated that discussions about controversial science topics may in fact polarize different groups around a priori positions. Additional preliminary support for this hypothesis can also be found in case studies about public engagement activities in controversial socio-scientific issues. Some of these reports (for two examples, see Lezaun and Soneryd, 2007) indicate problems to maintain a productive atmosphere between laypersons and experts in the discussion sessions.

Besides these practical implications, our results also add further evidence to the growing body of literature questioning the validity of the deficit model in science communication according to which people’s attitudes toward science are mainly determined by their knowledge about science (Sturgis and Allum, 2004). We demonstrated that social identity concerns profoundly influence laypersons’ perceptions and evaluations of scientific results regardless of laypersons’ knowledge. However, our results also question whether involving laypersons in policy decision processes based upon scientific evidence is reasonable in all socio-scientific issues. Particularly when the scientific evidence has potential negative consequences for social groups, our research suggests that laypersons may be prone to biases based upon their social affiliations. For example, if regular video game players were involved in decision-making processes concerning potential sales restrictions of violent video games, they would be likely to perceive scientific evidence demonstrating detrimental effects of violent video games as shoddy and the respective researchers as disreputable (Greitemeyer, 2014; Nauroth et al., 2014, 2015).(emphasis added)

The principal failure of this paper is that it does not study the scientific community itself, and how scientists react to research that attacks their own personal identity.

I don’t think it is reading too much into the post: Academic, Not Industrial Secrecy, where one group said:

We want restrictions on who could do the analyses.

to say that attacks on personal identity lead to boorish behavior on the part of scientists.

Laypersons and scientists alike emit a never-ending stream of examples of prejudice, favoritism, sycophancy, and sloppy reasoning, to say nothing of careless and/or low quality work.

Reception of science among laypersons might improve if the scientific community abandoned its facade of “it’s objective, it’s science.”

That facade was tiresome by WWII and to keep repeating it now is a disservice to the scientific community.

All of our efforts, in any field, are human endeavors and thus subject to the vagaries and uncertainties of human interaction.

Live with it.

Comments Off

December 12, 2015

For Linguists on Your Holiday List

Filed under: Humanities,Humor,Linguistics,Semiotics — Patrick Durusau @ 3:55 pm

Hey Linguists!—Get Them to Get You a Copy of The Speculative Grammarian Essential Guide to Linguistics.

From the website:

Hey Linguists! Do you know why it is better to give than to receive? Because giving requires a lot more work! You have to know what someone likes, what someone wants, who someone is, to get them a proper, thoughtful gift. That sounds like a lot of work.

No, wait. That’s not right. It’s actually more work to be the recipient—if you are going to do it right. You can’t just trust people to know what you like, what you want, who you are.

You could try to help your loved ones understand a linguist’s needs and wants and desires—but you’d have to give them a mini course on historical, computational, and forensic linguistics first. Instead, you can assure them that SpecGram has the right gift for you—a gift you, their favorite linguist, will treasure for years to come: The Speculative Grammarian Essential Guide to Linguistics.

So drop some subtle or not-so-subtle hints and help your loved ones do the right thing this holiday season: gift you with this hilarious compendium of linguistic sense and nonsense.

If you need to convince your friends and family that they can’t find you a proper gift on their own, send them one of the images below, and try to explain to them why it amuses you. That’ll show ’em! (More will be added through the rest of 2015, just in case your friends and family are a little thick.)

• If guilt is more your style, check out 2013’s Sad Holiday Linguists.

• If semi-positive reinforcement is your thing, check out 2014’s Because You Can’t Do Everything You Want for Your Favorite Linguist.

Disclaimer: I haven’t proofed the diagrams against the sources cited. Rely on them at your own risk.

There are others but the Hey Semioticians! reminded me of John Sowa (sorry John):

The greatest mistake across all disciplines is taking ourselves (and our positions) far too seriously.

Enjoy!

Comments Off

November 18, 2015

On Teaching XQuery to Digital Humanists [Lesson in Immediate ROI]

Filed under: Humanities,XQuery — Patrick Durusau @ 3:55 pm

On Teaching XQuery to Digital Humanists by Clifford B. Anderson.

A paper presented at Balisage 2014 but still a great read for today. In particular where Clifford makes the case for teaching XQuery to humanists:

Making the Case for XQuery

I may as well state upfront that I regard XQuery as a fantastic language for digital humanists. If you are involved in marking up documents in XML, then learning XQuery will pay long-term dividends. I do have arguments for this bit of bravado. My reasons for lifting up XQuery as a programming language of particular interest to digital humanists are essentially three:

XQuery is domain-appropriate for digital humanists.

…

Let’s take each of these points in turn.

First, XQuery fits the domain of digital humanists. Admittedly, I am focusing here on a particular area of the digital humanities, namely the domain of digital text editing and analysis. In that domain, however, XQuery proves a nearly perfect match to the needs of digital humanists.

If you scour the online communities related to digital humanities, you will repeatedly find conversations about which programming languages to learn. Predictably, the advice is all over the map. PHP is easy to learn, readily accessible, and the language of many popular projects in the digital humanities such as Omeka and Neatline. Javascript is another obvious choice given its ubiquity. Others recommend Python or Ruby. At the margins, you’ll find the statistically-inclined recommending R. There are pluses and minuses to learning any of these languages. When you are working with XML, however, they all fall short. Inevitably, working with XML in these languages will require learning how to use packages to read XML and convert it to other formats.

Learning XQuery eliminates any impedance between data and code. There is no need to import any special packages to work with XML. Rather, you can proceed smoothly from teaching XML basics to showing how to navigate XML documents with XPath to querying XML with XQuery. You do not need to jump out of context to teach students about classes, objects, tables, or anything as awful-sounding as “shredding” XML documents or storing them as “blobs.” XQuery makes it possible for students to become productive without having to learn as many computer science or software engineering concepts. A simple four or five line FLWOR expression can easily demonstrate the power of XQuery and provide a basis for students’ tinkering and exploration. (emphasis added)
…

I commend the rest of the paper to you for reading but Clifford’s first point nails why humanists and others should learn XQuery.

The part I highlighted above sums it up:

XQuery makes it possible for students to become productive without having to learn as many computer science or software engineering concepts. A simple four or five line FLWOR expression can easily demonstrate the power of XQuery and provide a basis for students’ tinkering and exploration. (emphasis added)

Whether you are a student, scholar or even a type-A business type, what do you want?

To get sh*t done!

A few of us like tinkering with edge cases, proofs, theorems and automata, but having the needed output on time or sooner really makes the day for most folks.

A handful of XQuery expressions will open up XML encoded data for your custom exploration. You can experience an immediate ROI from the time you spend learning XQuery, which will prompt you to learn more XQuery.
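
To give a feel for how small such a query can be, here is a rough sketch. The FLWOR expression in the comment is illustrative only and is not taken from Clifford's paper. The Python beneath it mirrors the same for / where / order by / return steps with the standard library, simply so the example can be run without an XQuery processor. The sample document, its element names and the function name `titles_since` are all invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element and attribute names are invented.
SAMPLE = """
<letters>
  <letter year="1862"><title>Report from the Gulf</title></letter>
  <letter year="1804"><title>Dispatch from Tripoli</title></letter>
  <letter year="1863"><title>Orders at Mobile</title></letter>
</letters>
"""

# An illustrative FLWOR expression for the same task (assumes the sample
# above has been saved as letters.xml):
#
#   for $l in doc("letters.xml")//letter
#   where xs:integer($l/@year) ge 1860
#   order by xs:integer($l/@year)
#   return concat($l/@year, ": ", $l/title)


def titles_since(xml_text, first_year):
    """Return 'year: title' strings for letters dated first_year or later."""
    root = ET.fromstring(xml_text.strip())
    # "for ... where": select the letters that pass the year filter.
    hits = [l for l in root.iter("letter") if int(l.get("year")) >= first_year]
    # "order by": sort the selected letters by year.
    hits.sort(key=lambda l: int(l.get("year")))
    # "return": build one string per letter.
    return [f'{l.get("year")}: {l.findtext("title")}' for l in hits]


if __name__ == "__main__":
    for line in titles_since(SAMPLE, 1860):
        print(line)
```

The four commented lines are the whole query in XQuery. The Python version needs an import, a parse step and explicit loops to say the same thing, which is roughly the gap Clifford is pointing at.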

Think of learning XQuery as a step towards user independence. Independence from the choices made by unseen and unknown programmers.

Are you ready to take that step?

Comments Off

November 13, 2015

You do not want to be an edge case [The True Skynet: Your Homogenized Future]

Filed under: Design,Humanities,Identification,Programming — Patrick Durusau @ 1:15 pm

You do not want to be an edge case.

John D. Cook writes:

Hilary Mason made an important observation on Twitter a few days ago:

You do not want to be an edge case in this future we are building.

Systems run by algorithms can be more efficient on average, but make life harder on the edge cases, people who are exceptions to the system developers’ expectations.

Algorithms, whether encoded in software or in rigid bureaucratic processes, can unwittingly discriminate against minorities. The problem isn’t recognized minorities, such as racial minorities or the disabled, but unrecognized minorities, people who were overlooked.

For example, two twins were recently prevented from getting their drivers licenses because DMV software couldn’t tell their photos apart. Surely the people who wrote the software harbored no malice toward twins. They just didn’t anticipate that two drivers licence applicants could have indistinguishable photos.

I imagine most people reading this have had difficulty with software (or bureaucratic procedures) that didn’t anticipate something about them; everyone is an edge case in some context. Maybe you don’t have a middle name, but a form insists you cannot leave the middle name field blank. Maybe there are more letters in your name or more children in your family than a programmer anticipated. Maybe you choose not to use some technology that “everybody” uses. Maybe you happen to have a social security number that hashes to a value that causes a program to crash.

When software routinely fails, there obviously has to be a human override. But as software improves for most people, there’s less apparent need to make provision for the exceptional cases. So things could get harder for edge cases as they get better for more people.

Recent advances in machine learning have led reputable thinkers (Stephen Hawking for example) to envision a future where an artificial intelligence will arise to dispense with humanity.

If you think you have heard that theme before, you have, most recently as Skynet, an entirely fictional creation in the Terminator science fiction series.

Such alarmist claims make good press, but given that no one knows how the human brain works, much less how intelligence arises, the risk is less than that of a rogue black hole or a gamma-ray burst. I don’t lose sleep over either one of those, do you?

The greater “Skynet” threat to people and their cultures is the enforced homogenization of language and culture.

John mentions lacking a middle name but consider the complexities of Japanese names. Due to the creeping infection of Western culture and computer-based standardization, many Japanese list their names in Western order, given name, family name, instead of the Japanese order of family name, given name.
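
Neither case has to be an edge case. As a minimal sketch (the class `PersonName`, its field names and the placeholder names are assumptions made up for this illustration, not taken from any real system), a record can keep the parts of a name separate, treat the middle name as optional, and remember the order the person prefers:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonName:
    """Hypothetical name record that does not assume Western conventions."""
    family: str
    given: str
    middle: Optional[str] = None   # absent is valid, not an error
    family_first: bool = False     # display order chosen by the person

    def display(self) -> str:
        # Given name, followed by the middle name only if there is one.
        given_part = [self.given] + ([self.middle] if self.middle else [])
        if self.family_first:
            parts = [self.family] + given_part
        else:
            parts = given_part + [self.family]
        return " ".join(parts)


if __name__ == "__main__":
    # Placeholder names for illustration only.
    print(PersonName("Yamada", "Taro", family_first=True).display())
    print(PersonName("Doe", "Jane").display())
    print(PersonName("Doe", "Jane", middle="Q.").display())
```

The design choice is simply to store what the person told you and defer ordering to display time, rather than flattening everything into one Western-ordered string at the point of entry.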

Even languages can start the slide to being “edge cases,” as you will see from the erosion of Hangul (Korean alphabet) from public signs in Seoul.

Computers could be preserving languages and cultural traditions; they have the capacity and infinite patience.

But they are not being used for that purpose.

Cellphones, for example, are linking humanity into a seething mass of impoverished social interaction. Impoverished social interaction that is creating more homogenized languages, not preserving diverse ones.

Not only should you be an edge case but you should push back against the homogenizing impact of computers. The diversity we lose could well be your own.

Comments Off

August 17, 2015

Programming for Humanists at TAMU [and Business Types]

Filed under: Humanities,Programming — Patrick Durusau @ 4:44 pm

Programming for Humanists at TAMU

From the webpage:

[What is DH?] Digital Humanities studies the intersection and mutual influence of humanities ideas and digital methods, with the goal of understanding how the use of digital technologies and approaches alters the practice and theory of humanities scholarship. In this sense it is concerned with studying the emergence of scholarly disciplines and communicative practices at a time when those are in flux, under the influence of rapid technological, institutional and cultural change. As a way of identifying digital interests and efforts within traditional humanities fields, the term “digital humanities” also identifies, in a general way, any kind of critical engagement with digital tools and methods in a humanities context. This includes the creation of digital editions and digital text or image collections, and the creation and use of digital tools for the investigation and analysis of humanities research materials. – Julia Flanders, Northeastern University (http://goo.gl/BJeXk2)

Programming4Humanists is a two-semester course sequence designed to introduce participants to methodologies, coding, and programming languages associated with the Digital Humanities. We focus on creation, editing, and searchability of digital archives, but also introduce students to data mining and statistical analysis. Our forte at Texas A&M University (TAMU) is Optical Character Recognition of early modern texts, a skill we learned in completing the Early Modern OCR Project, or eMOP. Another strength that the Initiative for Digital Humanities, Media, and Culture (http://idhmc.tamu.edu) at TAMU brings to this set of courses is the Texas A&M University Press book series called “Programming for Humanists.” We use draft and final versions of these books, as well as many additional resources available on companion web pages, for participants in the workshop. The books in this series are of course upon publication available to anyone, along with the companion sites, whether the person has participated in the workshop or not. However, joining the Programming4Humanists course enables participants to communicate with the authors of these books for the sake of asking questions and indeed, through their questioning, helping us to improve the books and web materials. Our goal is to help people learn Digital Humanities methods and techniques.

Participants

Those who should attend include faculty, staff, librarians, undergraduates, and graduate students, interested in making archival and cultural materials available to a wide audience while encoding them digitally according to best practices, standards that will allow them to submit their digital editions for peer review by organizations such as the MLA Committee for Scholarly Edition and NINES / 18thConnect. Librarians will be especially interested in learning our OCR procedures as a means for digitizing large archives. Additionally, scholars, students, and librarians will receive an introduction to text mining and XQuery, the latter used for analyzing semantically rich data sets. This course gives a good overview of what textual and archival scholars are accomplishing in the field of Digital Humanities, even though the course is primarily concerned with teaching skills to participants. TAMU graduate and undergraduate students may take this course for 2 credit hours, see Schedule of Classes for Fall 2015: LBAR 489 or 689 Digital Scholarship and Publication.

Prerequisites

No prior knowledge is required but some familiarity with TEI/XML, HTML, and CSS will be helpful (See previous Programming 4 Humanists course syllabus). Certificate registrants will receive certificates confirming that they have a working knowledge of Drupal, XSLT, XQuery, and iPython Notebooks. Registration for those getting a certificate includes continued access to all class videos during the course period and an oXygen license. Non-certificate registrants will have access to the class videos for one week.

Everything that Julia says is true and this course will be very valuable for traditional humanists.

It will also be useful for business types who aren’t quants or CS majors/minors. The same “friendly” learning curve is suitable to both audiences.

You won’t be a “power user” at the end of this course but you will sense when CS folks are blowing smoke. It happens.

Comments Off

June 2, 2015

New Testament Virtual Manuscript Room

Filed under: Bible,Humanities,Manuscripts — Patrick Durusau @ 6:39 pm

New Testament Virtual Manuscript Room

From the webpage:

This site is devoted to the study of Greek New Testament manuscripts. The New Testament Virtual Manuscript Room is a place where scholars can come to find the most exhaustive list of New Testament manuscript resources, can contribute to marking attributes about these manuscripts, and can find state of the art tools for researching this rich dataset.

While our tools are reasonably functional for anonymous users, they provide additional features and save options once a user has created an account and is logged in on the site. For example, registered users can save transcribed pages to their personal account and create personalized annotations to images.

A close friend has been working on this project for several years. Quite remarkable although I would prefer it to feature Hebrew (and older) texts.

Comments Off

Spatial Humanities Workshop

Filed under: Humanities,Mapping,Maps,Spatial Data — Patrick Durusau @ 9:51 am

Spatial Humanities Workshop by Lincoln Mullen.

From the webpage:

Scholars in the humanities have long paid attention to maps and space, but in recent years new technologies have created a resurgence of interest in the spatial humanities. This workshop will introduce participants to the following subjects:

how mapping and spatial analysis are being used in humanities disciplines

how to find, create, and manipulate spatial data

how to create historical layers on interactive maps

how to create data-driven maps

how to tell stories and craft arguments with maps

how to create deep maps of places

how to create web maps in a programming language

how to use a variety of mapping tools

how to create lightweight and semester-long mapping assignments

The seminar will emphasize the hands-on learning of these skills. Each day we will pay special attention to preparing lesson plans for teaching the spatial humanities to students. The aim is to prepare scholars to be able to teach the spatial humanities in their courses and to be able to use maps and spatial analysis in their own research.

Ahem, the one thing Lincoln forgets to mention is that he is a major player in spatial humanities. His homepage is an amazing place.

The seminar materials don’t disappoint. It would be better to be at the workshop but in lieu of attending, working through these materials will leave you well grounded in spatial humanities.

Comments Off

May 19, 2015

Civil War Navies Bookworm

Filed under: History,Humanities,Indexing,Ngram Viewer,Searching,Text Analytics — Patrick Durusau @ 6:39 pm

Civil War Navies Bookworm by Abby Mullen.

From the post:

If you read my last post, you know that this semester I engaged in building a Bookworm using a government document collection. My professor challenged me to try my system for parsing the documents on a different, larger collection of government documents. The collection I chose to work with is the Official Records of the Union and Confederate Navies. My Barbary Bookworm took me all semester to build; this Civil War navies Bookworm took me less than a day. I learned things from making the first one!

This collection is significantly larger than the Barbary Wars collection—26 volumes, as opposed to 6. It encompasses roughly the same time span, but 13 times as many words. Though it is still technically feasible to read through all 26 volumes, this collection is perhaps a better candidate for distant reading than my first corpus.

The document collection is broken into geographical sections, the Atlantic Squadron, the West Gulf Blockading Squadron, and so on. Using the Bookworm allows us to look at the words in these documents sequentially by date instead of having to go back and forth between different volumes to get a sense of what was going on in the whole navy at any given time.
…

Before you ask:

The earlier post: Text Analysis on the Documents of the Barbary Wars

More details on Bookworm.

As with all ngram viewers, exercise caution in assuming a text string has uniform semantics across historical, ethnic, or cultural fault lines.
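
To see why, consider what the counting behind an ngram viewer boils down to. The sketch below is not Bookworm's code. It is a minimal stand-in, with an invented function name (`term_counts_by_year`) and invented sample documents, that tallies how often a string occurs per year:

```python
import re
from collections import Counter

# Hypothetical (year, text) pairs, invented for illustration.
SAMPLE_DOCS = [
    (1861, "The blockade held. Blockade runners fled."),
    (1862, "No blockade today."),
    (1861, "Engine trouble."),
]


def term_counts_by_year(docs, term):
    """Count occurrences of a single word per year, ignoring case."""
    wanted = term.lower()
    counts = Counter()
    for year, text in docs:
        # Crude tokenizer: runs of ASCII letters only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts[year] += tokens.count(wanted)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(term_counts_by_year(SAMPLE_DOCS, "blockade"))
```

Nothing in that tally knows what the word meant to the person who wrote it. Two matching strings are counted together whether or not their authors meant the same thing, and that is exactly the assumption to be wary of.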

Comments Off

May 8, 2015

Digital Approaches to Hebrew Manuscripts

Filed under: Digital Research,Humanities,Library,Manuscripts — Patrick Durusau @ 7:48 pm

Digital Approaches to Hebrew Manuscripts

Monday 18th – Tuesday 19th of May 2015

From the webpage:

We are delighted to announce the programme for On the Same Page: Digital Approaches to Hebrew Manuscripts at King’s College London. This two-day conference will explore the potential for the computer-assisted study of Hebrew manuscripts; discuss the intersection of Jewish Studies and Digital Humanities; and share methodologies. Amongst the topics covered will be Hebrew palaeography and codicology, the encoding and transcription of Hebrew texts, the practical and theoretical consequences of the use of digital surrogates and the visualisation of manuscript evidence and data. For the full programme and our Call for Posters, please see below.
…

Organised by the Departments of Digital Humanities and Theology & Religious Studies (Jewish Studies)
Co-sponsor: Centre for Late Antique & Medieval Studies (CLAMS), King’s College London

I saw this at the blog for DigiPal: Digital Resource and Database of Palaeography, Manuscript Studies and Diplomatic. Confession: I have never understood how the English derive acronyms and this one confounds me as much as you.

Be sure to look around at the DigiPal site. There are numerous manuscript images, annotation techniques, and other resources for those who foster scholarship by contributing to it.

Comments Off

April 9, 2015

Web Gallery of Art

Filed under: Art,Humanities — Patrick Durusau @ 11:11 am

Web Gallery of Art

From the homepage:

The Web Gallery of Art is a virtual museum and searchable database of European fine arts from 11th to 19th centuries. It was started in 1996 as a topical site of the Renaissance art, originated in the Italian city-states of the 14th century and spread to other countries in the 15th and 16th centuries. Intending to present Renaissance art as comprehensively as possible, the scope of the collection was later extended to show its Medieval roots as well as its evolution to Baroque and Rococo via Mannerism. Encouraged by the feedback from the visitors, recently 19th-century art was also included. However, we do not intend to present 20th-century and contemporary art.

The collection has some of the characteristics of a virtual museum. The experience of the visitors is enhanced by guided tours helping to understand the artistic and historical relationship between different works and artists, by period music of choice in the background and a free postcard service. At the same time the collection serves the visitors’ need for a site where various information on art, artists and history can be found together with corresponding pictorial illustrations. Although not a conventional one, the collection is a searchable database supplemented by a glossary containing articles on art terms, relevant historical events, personages, cities, museums and churches.

The Web Gallery of Art is intended to be a free resource of art history primarily for students and teachers. It is a private initiative not related to any museums or art institutions, and not supported financially by any state or corporate sponsors. However, we do our utmost, using authentic literature and advice from professionals, to ensure the quality and authenticity of the content.

We are convinced that such a collection of digital reproductions, containing a balanced mixture of interlinked visual and textual information, can serve multiple purposes. On one hand it can simply be a source of artistic enjoyment; a convenient alternative to visiting a distant museum, or an incentive to do just that. On the other hand, it can serve as a tool for public education both in schools and at home.

The Gallery doesn’t own the works in question and so resolves the copyright issue thus:

The Web Gallery of Art is copyrighted as a database. Images and documents downloaded from this database can only be used for educational and personal purposes. Distribution of the images in any form is prohibited without the authorization of their legal owner.

The Gallery suggests contacting the Scala Group (or Art Resource, Scala’s U.S. representative) if you need rights beyond educational and personal purposes.

To see how images are presented, view 10 random images from the database. (Warning: The 10 random images link will work only once. If you try it again, images briefly display and then an invalid CGI environment message pops up. I suspect that clearing the browser cache would make it work a second time.)

BTW, you can listen to classical music in the background while you browse/search. That is a very nice touch.

The site offers other features and options so take time to explore.

Having seen some of Michelangelo’s works in person, I can attest no computer screen can duplicate that experience. However, if given the choice between viewing a pale imitation on a computer screen and not seeing his work at all, the computer version is a no-brainer.

Comments Off
