Archive for the ‘History’ Category

A people’s history of the United States [A Working Class Winter Is Coming]

Sunday, December 25th, 2016

A people’s history of the United States by Howard Zinn.

From the webpage:

The full text of Howard Zinn’s superb people’s history of the United States, spanning over 500 years from Columbus’s “discovery” of America in 1492 to the Clinton presidency in 1996.

I think this is the first edition text (1980), which has been updated and can be purchased here.

Be sure to visit/use (either personally or for teaching): Teaching A People’s History:

Its goal is to introduce students to a more accurate, complex, and engaging understanding of United States history than is found in traditional textbooks and curricula. The empowering potential of studying U.S. history is often lost in a textbook-driven trivial pursuit of names and dates. People’s history materials and pedagogy emphasize the role of working people, women, people of color, and organized social movements in shaping history. Students learn that history is made not by a few heroic individuals, but instead by people’s choices and actions, thereby also learning that their own choices and actions matter.

Buy the book, share it and the website as widely as possible.

A working class winter is coming.

Egyptological Museum Search

Tuesday, November 22nd, 2016

Egyptological Museum Search

From the post:

The Egyptological museum search is a PHP tool aimed to facilitate locating the descriptions and images of ancient Egyptian objects in online catalogues of major museums. Online catalogues (ranging from selections of highlights to complete digital inventories) are now offered by almost all major museums holding ancient Egyptian items and have become indispensable in research work. Yet the variety of web interfaces and of search rules may overstrain any person performing many searches in different online catalogues.

Egyptological museum search was made to provide a single search point for finding objects by their inventory numbers in major collections of Egyptian antiquities that have online catalogues. It tries to convert user input into search queries recognised by museums’ websites. (Thus, for example, stela Geneva D 50 is searched as “D 0050,” statue Vienna ÄS 5046 is searched as “AE_INV_5046,” and coffin Turin Suppl. 5217 is searched as “S. 05217.”) The following online catalogues are supported:

The search interface uses a short list of aliases for museums.
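
The conversion described in the quoted post can be sketched in a few lines. This is a minimal Python illustration, not the tool's own PHP code: the three rules are inferred only from the examples quoted above, and the names `RULES`, `split_inventory`, and `to_query` are hypothetical.

```python
import re


def split_inventory(inv):
    """Split an inventory number into its prefix and its digits."""
    match = re.fullmatch(r"\s*(.*?)\s*(\d+)\s*", inv)
    if match is None:
        raise ValueError(f"no trailing digits found in {inv!r}")
    return match.group(1), match.group(2)


# Assumed rules, reverse-engineered from the three examples in the post.
# The real tool may handle many more cases and museums.
RULES = {
    "geneva": lambda prefix, digits: f"{prefix} {digits.zfill(4)}",
    "vienna": lambda prefix, digits: f"AE_INV_{digits}",
    "turin": lambda prefix, digits: f"S. {digits.zfill(5)}",
}


def to_query(museum, inv):
    """Convert user input into the query string a museum site expects."""
    prefix, digits = split_inventory(inv)
    return RULES[museum.lower()](prefix, digits)


print(to_query("Geneva", "D 50"))
print(to_query("Vienna", "ÄS 5046"))
print(to_query("Turin", "Suppl. 5217"))
```

Keeping one small rule per museum means a volunteer could add a collection without touching the others, which seems to be the spirit of the tool.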

Once you see/use the interface proper, here, I hope you are interested in volunteering to improve it.

The Postal Museum (UK)

Saturday, November 19th, 2016

The Postal Museum

Set to open in mid-2017, the Postal Museum covers five hundred years of “Royal Mail.”

Its online catalogue has more than 120,000 records describing its collection.

Which includes this gem:


Registering for the catalogue will enable you to access downloadable content, save searches, create wish-lists, etc. Registration is free and worth the effort.

The site is in beta and my confirmation email displayed as blank in Thunderbird but viewing source gave the confirmation URL.

A terminology issue: where the tabs for an item say “Ordering and Viewing,” they mean requesting an item to be retrieved for you to view on a specified day.

I was confused because I thought “ordering” meant obtaining a copy, print or digital, of the item in question.

The turnpike road map above is available in a somewhat larger size but not nearly large enough for actual use.

Very high resolution images of maps and similar materials would be a welcome addition to the resources already available.


PS: I didn’t look but the Postal Museum has resources on stamps as well. 😉

The History of Cartography

Wednesday, July 20th, 2016

The History of Cartography

From the webpage:

The first volume of the History of Cartography was published in 1987 and the three books that constitute Volume Two appeared over the following eleven years. In 1987 the worldwide web did not exist, and since 1998 book publishing has gone through a revolution in the production and dissemination of work. Although the large format and high quality image reproduction of the printed books (see right column) are still well-suited to the requirements for the publishing of maps, the online availability of material is a boon to scholars and map enthusiasts.

On this site the University of Chicago Press is pleased to present the first three volumes of the History of Cartography in PDF format. Navigate to the PDFs from the left column. Each chapter of each book is a single PDF. The search box on the left allows searching across the content of all the PDFs that make up the first six books.

Links to the parts, which are then divided into separate PDF files of each chapter:

Volume One: Cartography in Prehistoric, Ancient, and Medieval Europe and the Mediterranean

Volume Two: Book 1: Cartography in the Traditional Islamic and South Asian Societies

Volume Two: Book 2: Cartography in the Traditional East and Southeast Asian Societies

Volume Two: Book 3: Cartography in the Traditional African, American, Arctic, Australian, and Pacific Societies

Volume Three: Cartography in the European Renaissance, Part 1

Volume Three: Cartography in the European Renaissance, Part 2

Unless you want to index the parts for yourself, remember the search box at this site that searches across all six books.

This can be a real time sink: deeply educational, but a time sink nonetheless.

Judicial Decision Making, Pulling Back the Curtain (Miranda v. Arizona)

Wednesday, June 15th, 2016

Miranda v. Arizona: Exploring Primary Sources Behind the Supreme Court Case by Stephen Wesson.

From the post:

“You have the right to remain silent….” These words, and the rest of the legal warning that follows, are so well-known that they’ve almost become a synonym for “You’re under arrest.” They occupy such a familiar place in popular culture that it might seem as though they’d been part of U.S. law for centuries. However, the now-ubiquitous Miranda warning only came into being fifty years ago, when the Supreme Court ruled that the rights of a criminal suspect, Ernesto Miranda, had been violated because he had not been informed of his Constitutional protections against self-incrimination.

The Library of Congress is marking this landmark anniversary with the launch of Miranda v. Arizona: The Rights to Justice, an online presentation of historical documents that shed light on the arguments around, and the reaction to, the Miranda ruling of 1966. These documents, which include papers written by and for several Supreme Court justices, allow students to explore the issues discussed by the justices as they considered the ramifications of the case. In addition, letters from law enforcement officers and members of the public illuminate the contentious public debate that erupted after the ruling.

One particularly powerful document for students to analyze is a page from a memorandum that associate justice William Brennan sent to chief justice Earl Warren about the case. Acknowledging that his 21-page response is lengthy, Brennan explains, “this will be one of the most important opinions of our time…”

He then focuses on two words from Warren’s opinion that he says go “to the basic thrust of the approach to be taken.” He expounds,

An important collection of documents, not only as background to Miranda v. Arizona but also as insight into decision making in the Supreme Court.

Decisions are announced by the media in sound-bite sized chunks, which fail to portray the complexity of Court opinions, much less the process by which they are created.

I can think of any number of cases that merit this sort of treatment or even deeper, inter-linked collections of documents.


Office of the Historian, U.S. Department of State (+ XQuery)

Thursday, April 28th, 2016

Office of the Historian (website): Office of the Historian, U.S. Department of State (GitHub).

All of the XQuery code and data from the website are available at GitHub.

You will find such goodies as:

Office of the Historian Subject Taxonomy of the History of U.S. Foreign Relations (XML)

Foreign Relations of the United States

The Foreign Relations of the United States (FRUS) series presents the official documentary historical record of major U.S. foreign policy decisions and significant diplomatic activity. The series is published in print and online editions at the U.S. Department of State Office of the Historian website.

Encoded using TEI with additional tools for quality checking.

Impressive but perhaps not as immediately useful as:

A Guide to the United States’ History of Recognition, Diplomatic, and Consular Relations, by Country, since 1776

I checked and there is an entry for Texas that will need to be updated depending on who you listen to in Texas.

There are XML, Schematron, XQuery files galore so there is plenty of production and/or practice material, depending upon your interests.

History Unfolded: US Newspapers and the Holocaust [Editors/Asst. Editors?]

Tuesday, April 12th, 2016

History Unfolded: US Newspapers and the Holocaust

From the webpage:

What did American newspapers report about the Holocaust during World War II? Citizen historians participating in History Unfolded: US Newspapers and the Holocaust will help the US Holocaust Memorial Museum answer this question.

Your Role

Participants will explore their local newspapers for articles about the Holocaust, and submit their research into a centralized database. The collected data will show trends in American reporting.

Citizen historians like you will explore Holocaust history as both an American story and a local story, learn how to use primary sources in historical research, and challenge assumptions about American knowledge of and responses to the Holocaust.

Project Outcomes

Data from History Unfolded: U.S. Newspapers and the Holocaust will be used for two main purposes: to inform the Museum’s upcoming exhibition on Americans and the Holocaust, and to enhance scholarly research about the American press and the Holocaust.

Our Questions

  • What did people in your community know about the event?
  • Was the information accurate?
  • What do the newspapers tell us about how local and national leaders and community members reacted to news about the event?

Historical Background

During the 1930s, a deeply rooted isolationism pervaded American public opinion. Americans were scornful of Europe’s inability to organize its affairs following the destruction of WWI and feared being drawn into European matters. As a result, news about the Holocaust arrived in an America fraught with isolation, cynicism, and fear of being deceived by government propaganda. Even so, the way the press told the story of the Holocaust—the space allocated, the location of the news in the paper, and the editorial opinions—shaped American reactions.

U.S. Press Coverage of the Holocaust

The press has influence on public opinion. Media attention enhances the importance of an issue in the eyes of the public. The U.S. press had reported on Nazi violence against Jews in Germany as early as 1933. It covered extensively the Nuremberg Laws of 1935 and the expanded German antisemitic legislation of 1938 and 1939. The nationwide state-sponsored violence of November 9-10, 1938, known as Kristallnacht, made front page news in dailies across the U.S.

As the magnitude of anti-Jewish violence increased in 1939-1941, many American newspapers ran descriptions of German shooting operations, first in Poland and later after the invasion of the Soviet Union. As early as July 2, 1942, the New York Times reported on the operations of the killing center in Chelmno, based on sources from the Polish underground. The article, however, appeared on page six of the newspaper.

During the Holocaust, the American press did not always publicize reports of Nazi atrocities in full or with prominent placement. For example, the New York Times, the nation’s leading newspaper, generally deemphasized the murder of the Jews in its news coverage. Although the Times covered the December 1942 statement of the Allies condemning the mass murder of European Jews on its front page, it placed coverage of the more specific information released on page ten, significantly minimizing its importance. Similarly, on July 3, 1944, the Times provided on page 3 a list by country of the number of Jews “eradicated”; the Los Angeles Times places the report on page 5.

How did your hometown cover these events?

I first saw this in What did Americans know as the Holocaust unfolded? Quite a lot, it turns out. by Tara Bahrampour, follow @TaraBahrampour.

I have registered for the project and noticed that although author bylines are captured, there doesn’t seem to be a routine to capture editors, assistant editors, etc. Newspapers don’t assemble themselves.

The site focuses on twenty (20) major events, starting with “Dachau Opens,” March 22, 1933 and ending with “FDR Delivers His Fourth Inaugural Address,” January 20, 1945.

The interfaces seem very intuitive and I am looking forward to searching my local newspaper for one or more of these events.

PS: Anti-Semites didn’t and don’t exist in isolation. Graphing relationships over history in your community may help explain some of the news coverage you do or don’t find.

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

Thursday, April 7th, 2016

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil.


From the description at Amazon:

We live in the age of the algorithm. Increasingly, the decisions that affect our lives—where we go to school, whether we get a car loan, how much we pay for health insurance—are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated. But as Cathy O’Neil reveals in this shocking book, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his race or neighborhood), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data.

Tracing the arc of a person’s life, from college to retirement, O’Neil exposes the black box models that shape our future, both as individuals and as a society. Models that score teachers and students, sort resumes, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health—all have pernicious feedback loops. They don’t simply describe reality, as proponents claim, they change reality, by expanding or limiting the opportunities people have. O’Neil calls on modelers to take more responsibility for how their algorithms are being used. But in the end, it’s up to us to become more savvy about the models that govern our lives. This important book empowers us to ask the tough questions, uncover the truth, and demand change.

Even if you have qualms about Cathy’s position, you have to admit that is a great book cover!

When I was in law school, I had F. Hodge O’Neal for corporation law. He is the O’Neal in O’Neal and Thompson’s Oppression of Minority Shareholders and LLC Members, Rev. 2d.

The publisher’s blurb is rather generous in saying:

Cited extensively, O’Neal and Thompson’s Oppression of Minority Shareholders and LLC Members shows how to take appropriate steps to protect minority shareholder interests using remedies, tactics, and maneuvers sanctioned by federal law. It clarifies the underlying cause of squeeze-outs and suggests proven arrangements for avoiding them.

You could read Oppression of Minority Shareholders and LLC Members that way but when corporate law is taught with war stories from the antics of the robber barons forward, you get the impression that isn’t why people read it.

Not that I doubt Cathy’s sincerity; on the contrary, I think she is very sincere about her warnings.

Where I disagree with Cathy is in thinking democracy is under greater attack now or that inequality is any greater problem than before.

If you read The Half Has Never Been Told: Slavery and the Making of American Capitalism by Edward E. Baptist carefully, you will leave it with deep uncertainty about the relationship of American government (federal, state, and local) to any recognizable concept of democracy. Or, for that matter, to the “equality” of its citizens.

Unlike Cathy as well, I don’t expect that shaming people is going to result in “better” or more “honest” data analysis.

What you can do is arm yourself to do battle on behalf of your “side,” both in terms of exposing data manipulation by others and concealing your own.

Perhaps there is room in the marketplace for a book titled: Suppression of Unfavorable Data. More than hiding data, what data to not collect? How to explain non-collection/loss? How to collect data in the least useful ways?

You would have to write it as a guide to avoiding these very bad practices, but everyone would know what you meant. Could be the next business management best seller.

1,000 Hours of Early Jazz

Thursday, March 10th, 2016

1,000 Hours of Early Jazz Recordings Now Online: Archive Features Louis Armstrong, Duke Ellington & Much More

From the post:

David W. Niven spent his life amassing a vast record collection, all dedicated to the sounds of Early Jazz. As a kid during the 1920s, he started buying jazz records with money earned from his paper route. By World War II, Niven, now a college student, had thousands of LPs. “All the big names of jazz, along with lesser legends, were included,” Niven later said, and “I found myself with a first class treasure of early jazz music.” Louis Armstrong, Bix Beiderbecke, Duke Ellington, and much, much more.

For the sake of his children, Niven started transferring his record collection to cassette tapes during the 1980s and prefacing them with audio commentaries that offer background information on each recording. Now, years after his death (1991), his collection of “Early Jazz Legends” has made its way to the web, thanks to archivist Kevin J. Powers. If you head over to, you can stream/download digitized versions of 650 cassette tapes, featuring over 1,000 hours of early jazz music. There’s also scans of liner cards for each recording.

Every recitation of history is incomplete but some are more incomplete than others.

Imagine trying to create a recitation about the mid to late 1960s without examples of the music, posters, incense, body counts, napalm, etc.

Here’s one slice of the early 20th century for your listening enjoyment.

‘You Were There!’ Historical Evidence Of Participation

Saturday, February 13th, 2016

Free: British Pathé Puts Over 85,000 Historical Films on YouTube by Jonathan Crow.

From the post:

British Pathé was one of the leading producers of newsreels and documentaries during the 20th Century. This week, the company, now an archive, is turning over its entire collection — over 85,000 historical films – to YouTube.

The archive — which spans from 1896 to 1976 – is a goldmine of footage, containing movies of some of the most important moments of the last 100 years. It’s a treasure trove for film buffs, culture nerds and history mavens everywhere. In Pathé’s playlist “A Day That Shook the World,” which traces an Anglo-centric history of the 20th Century, you will find clips of the Wright Brothers’ first flight, the bombing of Hiroshima and Neil Armstrong’s walk on the moon, alongside footage of Queen Victoria’s funeral and Roger Bannister’s 4-minute mile. There’s, of course, footage of the dramatic Hindenburg crash and Lindbergh’s daring cross-Atlantic flight. And then you can see King Edward VIII abdicating the throne in 1936, Hitler’s first speech upon becoming the German Chancellor in 1933 and the eventual Pearl Harbor attack in December 1941 (above).

But the really intriguing part of the archive is seeing all the ephemera from the 20th Century, the stuff that really makes the past feel like a foreign country – the weird hairstyles, the way a city street looked, the breathtakingly casual sexism and racism. There’s a rush in seeing history come alive. Case in point, this documentary from 1967 about the wonders to be found in a surprisingly monochrome Virginia.

A treasure trove of over 85,000 historical films!

With modern face recognition technology, imagine mining these films and matching faces up against other photographic archives.

Rather than seeing George Wallace, for example, as a single nasty piece of work during the 1960s, we may identify the followers of such “leaders.”

Those who would discriminate on the basis of race, gender, religion, sexual orientation, ethnic origin, language, etc. are empowered by those of similar views.

One use of this historical archive would be to “out” the followers of such bigots.

To protect “former” fascist supporters on the International Olympic Committee, the EU will protest any search engine that reports such results.

You should judge the IOC by their supporters as well. (Not the athletes, but the IOC.)

Finding Roman Roads

Saturday, February 6th, 2016

You (yes, you) can find Roman roads using data collected by lasers by Barbara Speed.

Barbara reports that, using Lidar data available from the UK Survey portal, David Ratledge was able to discover a Roman road between Ribchester and Lancaster.

She closes with:

The Environment Agency is planning to release 11 Terabytes (for Luddites: that’s an awful lot of data) worth of LIDAR information as part of the Department for Environment, Food and Rural Affairs’ open data initiative, available through this portal. Which means that any of us could download it and dig about for more lost roads.

That seems a bit thin on the advice side, if you are truly interested in using the data to find Roman roads and other sites.

An article posted under ‘Lost’ Roman road is discovered doesn’t say more about the technique but does point to Roman Roads in Lancashire. Interesting site, but no help on using the data.

I can’t comment on the ease of use or documentation but LiDAR tools are available at: Free LiDAR tools.

See also my post on the OpenTopography Project.

Unpublished Black History Photos (NYT)

Wednesday, February 3rd, 2016

The New York Times is unearthing unpublished photos from its archives for Black History Month by Shan Wang.

From the post:

In this black and white photo taken by a New York Times staff photographer, two unidentified second graders at Princeton’s Nassau Street Elementary School stand in front of a classroom blackboard. Some background text accompanies the image, pointing to a 1964 Times article about school integration and adding that the story “offered a caveat that still resonates, noting that in the search for a thriving and equal community, ‘good schooling is not enough.’”

Times readers wrote in to ask specifically about the second graders in the photo, so the Times updated the post with a comment form asking readers to share anything they might know about the girl and boy depicted.

Great background on the Unpublished Black History project at the Times.

Public interfaces enable contribution of information on selected images along with comments.

Unlike the US Intelligence community, the Times is willing to admit that its prior conduct may not reflect the values of the time or of today.

If a private, for-profit organization can be that honest, what’s the deal with government agencies?

Must be that accountability thing that Republicans are always trying to foist off onto public school teachers and public school teachers alone.

No accountability for elected officials and/or their appointees and cronies.

How to Build a TimesMachine [New York Times from 1851-2001]

Tuesday, February 2nd, 2016

How to Build a TimesMachine by Jane Cotler and Evan Sandhaus.

From the post:

At the beginning of this year, we quietly expanded TimesMachine, our virtual microfilm reader, to include every issue of The New York Times published between 1981 and 2002. Prior to this expansion, TimesMachine contained every issue published between 1851 and 1980, which consisted of over 11 million articles spread out over approximately 2.5 million pages. The new expansion adds an additional 8,035 complete issues containing 1.4 million articles over 1.6 million pages.


Creating and expanding TimesMachine presented us with several interesting technical challenges, and in this post we’ll describe how we tackled two. First, we’ll discuss the fundamental challenge with TimesMachine: efficiently providing a user with a scan of an entire day’s newspaper without requiring the download of hundreds of megabytes of data. Then, we’ll discuss a fascinating string matching problem we had to solve in order to include articles published after 1980 in TimesMachine.

It’s not all the extant Hebrew Bible witnesses, both images and transcription, or all extant cuneiform tablets with existing secondary literature, but if you are interested in more recent events, what a magnificent resource!

Tesseract-ocr gets a shout-out and link for its use on the New York Times archives.

The string matching solution for search shows the advantages of finding a “nearly perfect” solution.
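
The excerpt above does not spell out the matching algorithm, so what follows is only a rough illustration of the general idea of accepting a “nearly perfect” match: pairing clean text (say, a headline from article metadata) with noisy OCR output by similarity score rather than exact equality. It uses Python's standard `difflib`; the function name `best_match` and the sample strings are invented for illustration, and the Times's actual approach may differ entirely.

```python
from difflib import SequenceMatcher


def best_match(headline, ocr_lines):
    """Return (index, score) of the OCR line most similar to headline."""
    def score(line):
        # ratio() is 1.0 for identical strings and falls toward 0.0
        # as the two strings share fewer characters in order.
        return SequenceMatcher(None, headline.lower(), line.lower()).ratio()

    scores = [score(line) for line in ocr_lines]
    index = max(range(len(ocr_lines)), key=scores.__getitem__)
    return index, scores[index]


# Invented sample data: OCR often confuses the letter O with the digit 0.
ocr_lines = [
    "WEATHER REPORT FOR TODAY",
    "MAY0R ANN0UNCES NEW BUDGET PLAN",
    "SPORTS RESULTS",
]
index, score = best_match("Mayor Announces New Budget Plan", ocr_lines)
print(index, round(score, 2))
```

An exact comparison would find no match at all in this sample, while a similarity threshold tolerates a couple of OCR errors.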

Math whizzes of ancient Babylon figured out forerunner of calculus

Thursday, January 28th, 2016

The video is very cool and goes along with:

Math whizzes of ancient Babylon figured out forerunner of calculus by Ron Cowen.


What could have happened if a forerunner to calculus wasn’t forgotten for 1400 years?

A sharper question would be:

What if you didn’t lose corporate memory with every promotion, retirement or person leaving the company?

We have all seen it happen and all of us have suffered from it.

What if the investment in expertise and knowledge wasn’t flushed away with promotion, retirement, departure?

That would have to be one helluva ontology to capture everyone’s expertise and knowledge.

What if it wasn’t a single, unified or even “logical” ontology? What if it only represented the knowledge that was important to capture for you and yours? Not every potential user for all time.

Just as we don’t all wear the same uniforms to work every day, we should not waste time looking for a universal business language for corporate memory.

Unless you are in the business of filling seats for such quixotic quests.

I prefer to deliver a measurable ROI, if it’s all the same to you.

Are you ready to stop hemorrhaging corporate knowledge?

New York Public Library – 180K Hi-Res Images/Metadata

Thursday, January 7th, 2016

NYPL Releases Hi-Res Images, Metadata for 180,000 Public Domain Items in its Digital Collections

From the post:

JANUARY 6, 2016 — The New York Public Library has expanded access to more than 180,000 items with no known U.S. copyright restrictions in its Digital Collections database, releasing hi-res images, metadata, and tools facilitating digital creation and reuse. The release represents both a simplification and an enhancement of digital access to a trove of unique and rare materials: a removal of administration fees and processes from public domain content, and also improvements to interfaces — popular and technical — to the digital assets themselves. Online users of the NYPL Digital Collections website will find more prominent download links and filters highlighting restriction-free content; while more technically inclined users will also benefit from updates to the Library’s collections API enabling bulk use and analysis, as well as data exports and utilities posted to NYPL’s GitHub account. These changes are intended to facilitate sharing, research and reuse by scholars, artists, educators, technologists, publishers, and Internet users of all kinds. All subsequently digitized public domain collections will be made available in the same way, joining a growing repository of open materials.

“The New York Public Library is committed to giving our users access to information and resources however possible,” said Tony Marx, president of the Library. “Today, we are going beyond providing our users with digital facsimiles that give only an impression of something we have in our physical collection. By making our highest-quality assets freely available, we are truly giving our users the greatest access possible to our collections in the digital environment.”

To encourage novel uses of its digital resources, NYPL is also now accepting applications for a new Remix Residency program. Administered by the Library’s digitization and innovation team, NYPL Labs, the residency is intended for artists, information designers, software developers, data scientists, journalists, digital researchers, and others to make transformative and creative uses of digital collections and data, and the public domain assets in particular. Two projects will be selected, receiving financial and consultative support from Library curators and technologists.

To provide further inspiration for reuse, the NYPL Labs team has also released several demonstration projects delving into specific collections, as well as a visual browsing tool allowing users to explore the public domain collections at scale. These projects — which include a then-and-now comparison of New York’s Fifth Avenue, juxtaposing 1911 wide angle photographs with Google Street View, and a “trip planner” using locations extracted from mid-20th century motor guides that listed hotels, restaurants, bars, and other destinations where black travelers would be welcome — suggest just a few of the myriad investigations made possible by fully opening these collections.

The public domain release spans the breadth and depth of NYPL’s holdings, from the Library’s rich New York City collection, historic maps, botanical illustrations, unique manuscripts, photographs, ancient religious texts, and more. Materials include:

Visit for information about the materials related to the public domain update and links to all of the projects demonstrating creative reuse of public domain materials.

The New York Public Library’s Rights and Information Policy team has carefully reviewed items and collections to determine their copyright status under U.S. law. As a U.S.-based library, NYPL limits its determinations to U.S. law and does not analyze the copyright status of an item in every country. However, when speaking more generally, the Library uses terms such as “public domain” and “unrestricted materials,” which are used to describe the aggregate collection of items it can offer to the public without any restrictions on subsequent use.

If you are looking for content for a topic map or inspiration to pass onto other institutions about opening up their collections, take a look at the New York Public Library’s Digital Collections.

Content designed for re-use. Imagine that, re-use of content.

The exact time/place of the appearance of seamless re-use of content will be debated by future historians but for now, this is a very welcome step in that direction.

Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction [Gatekeeping]

Tuesday, January 5th, 2016

Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction by Cameron Blevins and Lincoln Mullen.


This article describes a new method for inferring the gender of personal names using large historical datasets. In contrast to existing methods of gender prediction that treat names as if they are timelessly associated with one gender, this method uses a historical approach that takes into account how naming practices change over time. It uses historical data to measure the likelihood that a name was associated with a particular gender based on the time or place under study. This approach generates more accurate results for sources that encompass changing periods of time, providing digital humanities scholars with a tool to estimate the gender of names across large textual collections. The article first describes the methodology as implemented in the gender package for the R programming language. It goes on to apply the method to a case study in which we examine gender and gatekeeping in the American historical profession over the past half-century. The gender package illustrates the importance of incorporating historical approaches into computer science and related fields.

An excellent introduction to the gender package for R, historical grounding of the detection of gender by name, with the highlight of the article being the application of this technique to professional literature in American history.
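To make the historical approach concrete, here is a minimal Python sketch of the idea described in the abstract: estimate the likelihood that a name was associated with a gender from counts restricted to the period under study. It is not the gender package's API (that package is written for R). The counts, the `COUNTS` table and the function names are all hypothetical, invented purely for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical counts per (name, year): (female, male).
# These numbers are invented for illustration; they are not real records.
COUNTS: Dict[Tuple[str, int], Tuple[int, int]] = {
    ("leslie", 1900): (10, 90),
    ("leslie", 1950): (50, 50),
    ("leslie", 2000): (90, 10),
    ("jane", 1900): (100, 0),
    ("jane", 2000): (100, 0),
}


def proportion_female(name: str, start: int, end: int) -> Optional[float]:
    """Share of female uses of `name` between `start` and `end` (inclusive).

    Returns None when the table has no data for that name and period.
    """
    key = name.lower()
    female = male = 0
    for (n, year), (f, m) in COUNTS.items():
        if n == key and start <= year <= end:
            female += f
            male += m
    total = female + male
    return female / total if total else None


def predict(name: str, start: int, end: int) -> str:
    """Label a name for a given period instead of treating it as timeless."""
    p = proportion_female(name, start, end)
    if p is None:
        return "unknown"
    if p > 0.5:
        return "female"
    if p < 0.5:
        return "male"
    return "either"


if __name__ == "__main__":
    # The same name gets a different answer depending on the period.
    for period in [(1900, 1900), (2000, 2000), (1900, 2000)]:
        print("Leslie", period, predict("Leslie", *period))
```

With these made-up counts, "Leslie" comes out male for 1900, female for 2000 and undecided across the whole century, which is exactly the problem with treating names as timelessly gendered.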

It isn’t uncommon to find statistical techniques applied to texts whose authors and editors are beyond the reach of any critic or criticism.

It is less than common to find statistical techniques applied to extant members of a profession.

Kudos to both Blevins and Mullen for refining the detection of gender and for applying that refinement to publishing in American history.

Calendar of Inquisitions Post Mortem, Volume 15, Richard II

Wednesday, December 23rd, 2015

Calendar of Inquisitions Post Mortem, Volume 15, Richard II by M. C. B. Dawes, A. C. Wood and D. H. Gifford. (Covers the years 1 to 7 in the reign of Richard II.)

From the homepage for the series:

An inquisition post mortem is a local enquiry into the lands held by a deceased individual, in order to discover any income and rights due to the crown. Such inquisitions were only held when people were thought or known to have held lands of the crown. The records in this series relate to the City of London for the periods 1485-1561 and 1577-1603.

I admit that some of my posts have broader audiences than others but only British History Online could send this tweet:

BHO at the IHR @bho_history 2 hours ago
One final new publication to keep you busy over the holiday: Calendar of Inquisitions Post Mortem vol 15. Enjoy! …

Be sure to explore British History Online (BHO). With a goal of creating access to printed primary and secondary sources from 1300 to 1800, the BHO site promises to be a rich source of historical data.

Amateur Discovery Confirmed by NASA

Friday, October 30th, 2015

NASA Adds to Evidence of Mysterious Ancient Earthworks by Ralph Blumenthal.

From the post:

High in the skies over Kazakhstan, space-age technology has revealed an ancient mystery on the ground.

Satellite pictures of a remote and treeless northern steppe reveal colossal earthworks — geometric figures of squares, crosses, lines and rings the size of several football fields, recognizable only from the air and the oldest estimated at 8,000 years old.

The largest, near a Neolithic settlement, is a giant square of 101 raised mounds, its opposite corners connected by a diagonal cross, covering more terrain than the Great Pyramid of Cheops. Another is a kind of three-limbed swastika, its arms ending in zigzags bent counterclockwise.

Described last year at an archaeology conference in Istanbul as unique and previously unstudied, the earthworks, in the Turgai region of northern Kazakhstan, number at least 260 — mounds, trenches and ramparts — arrayed in five basic shapes.

Spotted on Google Earth in 2007 by a Kazakh economist and archaeology enthusiast, Dmitriy Dey, the so-called Steppe Geoglyphs remain deeply puzzling and largely unknown to the outside world.

Two weeks ago, in the biggest sign so far of official interest in investigating the sites, NASA released clear satellite photographs of some of the figures from about 430 miles up.

More evidence you don’t need to be a globe trotter to make major discoveries!

A few of the satellite resources I have blogged about for your use: Free Access to EU Satellite Data, Planet Platform Beta & Open California:…, Skybox: A Tool to Help Investigate Environmental Crime.

Good luck!

Mapping the Medieval Countryside

Thursday, July 16th, 2015

Mapping the Medieval Countryside – Places, People, and Properties in the Inquisitions Post Mortem.

From the webpage:

Mapping the Medieval Countryside is a major research project dedicated to creating a digital edition of the medieval English inquisitions post mortem (IPMs) from c. 1236 to 1509.

IPMs recorded the lands held at their deaths by tenants of the crown. They comprise the most extensive and important body of source material for landholding in medieval England. Describing the lands held by thousands of families, from nobles to peasants, they are a key source for the history of almost every settlement in England and many in Wales.

This digital edition is the most authoritative available. It is based on printed calendars of the IPMs but incorporates numerous corrections and additions: in particular, the names of some 48,000 jurors are newly included.

The site is currently in beta phase: it includes IPMs from 1418-1447 only, and aspects of the markup and indexing are still incomplete. An update later this year will make further material available.

The project is funded by the Arts and Humanities Research Council and is a collaboration between the University of Winchester and the Department of Digital Humanities at King’s College London. The project uses five volumes of the Calendars of Inquisitions Post Mortem, gen. ed. Christine Carpenter, xxii-xxvi (The Boydell Press, Woodbridge, 2003-11) with kind permission from The Boydell Press. These volumes are all in print and available for purchase from Boydell, price £195.

One of the more fascinating aspects of the project is the list of eighty-nine (89) place types, which can be used for filtering. Just scanning the list I happened across “rape” as a place type, with four (4) instances recorded thus far.

The term “rape” in this context refers to a subdivision of the county of Sussex in England. The origin of this division is unknown but it pre-dates the Norman Conquest.

The “rapes of Sussex” and the eighty-eight (88) other place types are a great opportunity to explore place distinctions that may or may not be noticed today.


Royal Albert Hall – Performance History & Archive

Thursday, July 9th, 2015

Royal Albert Hall – Performance History & Archive

From the webpage:


Search our Performance Database to find out about your favourite artist or explore 30,000+ events from 1871 to last night.

Search the Archive to discover items in the Hall’s unique archive collections which chart the history of the building, organisation and events.

Another extraordinary resource from the UK. It is almost enough to make you forget that David Cameron is also a product of the UK.

Digital Bodleian

Thursday, July 9th, 2015

I know very little of what there is to be known about the Bodleian Library but as soon as I saw Digital Bodleian, I had to follow the link.

As of today, there are 115,179 images and more are on their way. Check the collections frequently and for new collections as well.

One example that is near and dear to me:

Exploring Egypt in the 19th Century

The popup reads:

A complete facsimile of publications from the early-nineteenth-century expeditions to Egypt by Champollion and Rosellini.

The growth of “big data” isn’t just from the production of new data but from the digitization of existing collections as well.

Now the issue is how to collate copies of inscriptions by Champollion in these works with much later materials, so that a scholar finding one such resource will automatically be made aware of the others.

That may not sound like a difficult task but given the amount of material published every year, it remains a daunting one.

Ancient [?] Craft of Information Visualization

Tuesday, July 7th, 2015

Vintage Infodesign [125]: More examples of the ancient craft of information visualization by Tiago Veloso.

From the post:

To open this week’s edition of Vintage InfoDesign, we picked some of the maps published in the 1800s/early 1900s about the Battle of Waterloo. As we showed you before, on June 18th several newspapers marked with stunning pieces of infographic design the 200th anniversary of Napoleon’s final attempt to rule Europe, and since we haven’t featured any “oldies” related to this topic, we thought it would be interesting to do some Internet “digging”.

Hope you enjoy our findings, and feel free to leave the links to other charts and maps about Waterloo, in the comments section.

I’m not entirely comfortable with using the term “ancient” to describe maps depicting the Battle of Waterloo. I think of the fall of the New Kingdom of Egypt, in about 343 BCE, as the beginning of “ancient” history.

Our World in Data

Saturday, July 4th, 2015

Our World in Data by Max Roser.

Visualizations of War & Violence, Global Health, Africa, World Poverty and World Hunger & Food Provision.

An author chooses their time period but I find limiting the discussion of world poverty to the last 2,000 years problematic. Obtaining even projected data would be problematic but we know there were civilizations, particularly in the Ancient Near East and in Pre-Columbian America that had rather high standards of living. For that matter, for the time period given, the poverty map skips over the Roman Empire at its height, saying “we know that every country was extremely poor compared to modern living standards.”

The Romans had public bath houses, running water, roads that we still use today, public entertainment, libraries, etc. I am not sure how they were “extremely poor compared to modern living standards.”

It is also problematic (slide 12) when Max says that:

Before modern economic growth the huge majority lived in extreme poverty and only a tiny elite enjoyed a better standard of living.

There are elites in every society that live better than most but that doesn’t automatically imply that 84% to 94% of the world population was living in poverty. You don’t sustain a society such as the Aztecs or the Incas with only 6% to 16% of the population living outside poverty.

I am deeply doubtful of Max’s conclusion that in terms of poverty the world is becoming more “equal.”

Part of that skepticism is from being aware of statistics like:

“With less than 5 percent of world population, the U.S. uses one-third of the world’s paper, a quarter of the world’s oil, 23 percent of the coal, 27 percent of the aluminum, and 19 percent of the copper,” he reports. “Our per capita use of energy, metals, minerals, forest products, fish, grains, meat, and even fresh water dwarfs that of people living in the developing world.”
Use It and Lose It: The Outsize Effect of U.S. Consumption on the Environment

Considering that many of those resources are not renewable, there is a natural limit to how much improvement can or will take place outside of the United States. When renewable resources become more practical than they are today, they will only supplement the growing consumption of energy in the United States, not replace it.

Max provides access to his data sets if you are interested in exploring the data further. I would be extremely careful with his World Bank data because the World Bank does have an agenda to show the benefits of development across the world.

Considering the impact of consumption on the environment, the World Bank’s pursuit of a global consumption economy may be one of the more ill-fated schemes of all time.

If you are interested in this type of issue, the National Geographic’s Greendex may be of interest.

Crime, Prisons and Punishment

Wednesday, July 1st, 2015

Crime, Prisons and Punishment

From the webpage:

Just how murky is your past? Are there law breakers or law makers in your family tree? Whether your family history contains vice or virtue, with our Crime and Punishment month we’ll be giving you the opportunity to find out, with blogs, articles and videos to help you research your criminal ancestry.

Launched to coincide with our release of almost 2 million crime and punishment records – made available online for the first time only on Findmypast – our Crime and Punishment month explores the seedy underbelly of our family histories.

In addition to our helpful blogs and videos, we’ll have stories of the criminals amongst our record collections, fun games and quizzes and case studies of the amazing criminal ancestry discoveries made by our users. Find out more over on our blog!

I don’t usually post about strictly commercial sites but this one has “family reunion” written all over it. Appears to be focused on the UK, Australia, etc.

If you have any ancestors in the records covered, it could be a real conversation starter at your next family event. 😉

1.5 Million Slavery Era Documents Will Be Digitized…

Thursday, June 25th, 2015

1.5 Million Slavery Era Documents Will Be Digitized, Helping African Americans to Learn About Their Lost Ancestors

From the post:

The Freedmen’s Bureau Project — a new initiative spearheaded by the Smithsonian, the National Archives, the Afro-American Historical and Genealogical Society, and the Church of Jesus Christ of Latter-Day Saints — will make available online 1.5 million historical documents, finally allowing ancestors [sic. descendants] of former African-American slaves to learn more about their family roots. Near the end of the US Civil War, The Freedmen’s Bureau was created to help newly-freed slaves find their footing in postbellum America. The Bureau “opened schools to educate the illiterate, managed hospitals, rationed food and clothing for the destitute, and even solemnized marriages.” And, along the way, the Bureau gathered handwritten records on roughly 4 million African Americans. Now, those documents are being digitized with the help of volunteers, and, by the end of 2016, they will be made available in a searchable database. According to Hollis Gentry, a Smithsonian genealogist, this archive “will give African Americans the ability to explore some of the earliest records detailing people who were formerly enslaved,” finally giving us a sense “of their voice, their dreams.”

You can learn more about the project by watching the video below, and you can volunteer your own services here.

A crowdsourced project that has a great deal of promise with regard to records on 4 million African Americans who were previously held as slaves.

Making the documents “searchable” will be of immense value. However, imagine capturing the myriad relationships documented in these records so that subsequent searchers can more quickly find relationships you have already documented.

Finding former slaves with a common owner or other commonalities could provide the clues others need to untangle a past we only see dimly.

Topic maps are a nice fit for this work.

Black Freedom Struggle Collection [That Is Struggling To Be Free]

Wednesday, June 17th, 2015

Law Library Introduces Black Freedom Struggle Collection.

From the webpage:

The Law Library, Davis Library and the Sonja Haynes Stone Center have just purchased rich digital collections of NAACP, federal government and other organization documents. The collections illuminate the African American struggle to attain equal rights after Reconstruction. Collections span the 1870s to the 1980s. The collections are:

  • Black Freedom Struggle in the 20th Century: Federal Government Records
  • Black Freedom Struggle in the 20th Century: Organizational Records and Personal Papers

They supplement current UNC collections of NAACP documents and complement another new collection documenting earlier struggles, Slavery & the Law, and the existing Southern Life and African American History, 1715-1915, Plantation Records. Slavery and the Law features petitions on race, slavery, and free blacks that were submitted to state legislatures and county courthouses between 1775 and 1867.

The collections are in ProQuest’s History Vault Collection. For more information, contact a law librarian at 919-962-1194.

ProQuest sales brochure for Black Freedom Struggle in the 20th Century: Federal Government Records and Black Freedom Struggle in the 20th Century: Organizational Records and Personal Papers.

I rather doubt that the UNC Law Library has purchased these collections; rather, it has secured access to these materials for members of its faculty and student body. Hence the access via the ProQuest History Vault Collection.

Like any good massa, ProQuest is going to make a return on its investment, even if that excludes black Americans, indeed, all Americans, from learning the history of race in America from primary sources. Or at least those members of the population who don’t have institutional access to the ProQuest History Vault Collection.

What makes this particularly galling in this case is that the materials represent a history of struggling for freedom, a story that should be widely told. A story that is being suppressed as it were in the name of our current IP model in the United States.

If we are confined to the artifices of commercial exploitation currently in place, why doesn’t Congress, which has wasted $billions on aircraft that exhibit spontaneous combustion (long rumored about people but confirmed in the F-35), site license this resource for everyone in the United States?

That would eliminate the paperwork for every institution that wants to access this material, eliminate the paperwork for all those contracts for ProQuest, make the original sources of our racial history available to every person located in the United States, so where is the downside?

While we work on changing the pernicious and exploitative IP regime of the present day, let’s change the rules on site licensing and let the greed of ProQuest lead it into doing the right thing. I care nothing for their motives, so long as universal access is the result.

Map of the Tracks of Yu, 1136

Monday, June 15th, 2015


I first saw this on Instagram at: with the following comment:

Map of the Tracks of Yu, 1136, is the first known map to use a cartographic grid.

The David Rumsey Map Collection, Cartography Associates, offers this more complete image from the Harvard Fine Arts Library:


And the following blurb:

Yujitu (Map of the Tracks of Yu), 1136. This map’s title derives from the Yugong, a treatise describing the sage-king Yu’s mythical channeling of China’s rivers. It is a rare surviving example of cartography used in the 12th century for public education, mixing classical references with later administrative history. Carved on a large stone tablet so that students or visitors could make rubbings, the map strikingly depicts a riverine network on a regular grid of squares intended to represent 100 li to a side. Read a more detailed description of this map by Alexander Akin, Ph.D. View the map in Google Earth. The image is courtesy Harvard Fine Arts Library.

To tempt you into further reading, Alexander Akin’s description opens with these lines:

The Yujitu (Map of the Tracks of Yu) is the earliest extant map based on the Yugong (introduced below). Engraved in stone in 1136, the map measures about one meter to a side. It was carved into the face of an upright monument on the grounds of a school in Xi’an so that visitors could make detailed rubbings using paper and ink. These rubbings could be taken away for later reference. The stone plaque thus functioned as something like an immovable printing block, remaining in Xi’an while copies of its map found their way further afield. Harvard University holds one such rubbing made from the original stone, and has generously granted permission for the use of this unusually clear image, which shows more detail than any previously published version….

Alexander struggles, as only a modern would, over the “accuracy” of the map. A map that at times accords with the findings of modern map makers and at times accords with its Confucian heritage.

With maps in general and topic maps in particular, a question of “accuracy” cannot be answered without being supplied with the measurement to be applied in answering that question.

Cultural Heritage Markup (Pre-Balisage)

Thursday, June 4th, 2015

Cultural Heritage Markup Balisage, Monday, August 10, 2015.

Do you remember visiting your great-aunt’s house? Where everything looked like museum pieces and the smell was worse than your room ever got? And all the adults had strained smiles and said how happy they were to be there?

Well, cultural heritage markup isn’t like that. For all the real cultural heritage stuff, we have maiden aunts and Norwegian bachelor uncles to take care of it. This pre-Balisage workshop is working with markup and is a lot more fun!

Hugh Cayless, Duke University introduces the workshop:

Cultural heritage materials are remarkable for their complexity and heterogeneity. This often means that when you’ve solved one problem, you’ve solved one problem. Arrayed against this difficulty, we have a nice big pile of tools and technologies with an alphabet soup of names like XML, TEI, RDF, OAIS, SIP, DIP, XIP, AIP, and BIBFRAME, coupled with a variety of programming languages or storage and publishing systems. All of our papers today address in some way the question of how you deal with messy, complex, human data using the available toolsets and how those toolsets have to be adapted to cope with our data. How do you avoid having your solution dictated by the tools available? How do you know when you’re doing it right? Our speakers are all trying, in various ways, to reconfigure their tools or push past those tools’ limitations, and they are going to tell us how they’re doing it.

A large number of your emails, tweets, webpages, etc. are destined to be “cultural heritage” (phone calls too if the NSA has anything to say about it) so you better get on the cultural heritage markup train today!

Speaking Truth To Power (sort of)

Wednesday, May 27th, 2015

16 maps that Americans don’t like to talk about by Max Fisher.

Max lists the following maps:

  1. The US was built on the theft of Native Americans’ lands
  2. The Trail of Tears, one of the darkest moments in US history — and we rarely talk about it
  3. America’s indigenous population today is sparse and largely lives in areas we forced them into
  4. America didn’t just tolerate slavery for a century — we expanded it
  5. This 1939 map of redlining in Chicago is just a hint at the systematic discrimination against African Americans
  6. School segregation is still a terrible problem
  7. Kids born poor have almost no chance at achieving the American Dream
  8. America has the second-highest child poverty rate in the developed world
  9. The US ranks alongside Nigeria on income inequality
  10. The US tried to replace Spain as an imperialist power
  11. The US outright stole Hawaii as part of its Pacific colonialism
  12. The firebombing that devastated Japan — including lots of non-military targets
  13. Agent Orange: the chemical we used to destroy a generation in Vietnam and harm our own troops
  14. The US backed awful dictators and insurgencies of the Cold War
  15. The thousands of Iraqi civilian deaths in the Iraq War
  16. Syria’s refugee crisis; the humanitarian catastrophe we could still help address but won’t

As far as Max’s maps:

Truthful? Yes.

Informative? Yes.

Not widely known? In some cases.

Will result in different outcomes? Not so far.

The repetition of these narratives is part and parcel of Chomsky’s Propaganda System that we were discussing yesterday.

People make entire careers out of keeping old injustices alive. Taking up historical causes is safe because the past is beyond our ability to change. You don’t want to be the March of Dimes when they discover a cure for polio.

Is bringing up old injustices speaking truth to power? After some amount of discussion, those in power will stop pretending to pay attention, a majority of citizens will lose interest (until next time) and present injustices will continue without effort or change.

Ask yourself, whose interest does distraction from current injustices serve?

Power can tolerate a lot of truth, so long as it is beyond being changed by anyone. The crowd can vent its righteous anger, speeches can be made, marches held, and other than cleaning up after crowds, the system grinds on.

PS: On Syrian refugees, Saudi Arabia is a lot closer than the United States and the oil states of the Middle East have the resources to more than adequately care for Syrian refugees. US involvement will only continue its tradition of weak/corrupt governments in the Middle East.

Civil War Navies Bookworm

Tuesday, May 19th, 2015

Civil War Navies Bookworm by Abby Mullen.

From the post:

If you read my last post, you know that this semester I engaged in building a Bookworm using a government document collection. My professor challenged me to try my system for parsing the documents on a different, larger collection of government documents. The collection I chose to work with is the Official Records of the Union and Confederate Navies. My Barbary Bookworm took me all semester to build; this Civil War navies Bookworm took me less than a day. I learned things from making the first one!

This collection is significantly larger than the Barbary Wars collection—26 volumes, as opposed to 6. It encompasses roughly the same time span, but 13 times as many words. Though it is still technically feasible to read through all 26 volumes, this collection is perhaps a better candidate for distant reading than my first corpus.

The document collection is broken into geographical sections, the Atlantic Squadron, the West Gulf Blockading Squadron, and so on. Using the Bookworm allows us to look at the words in these documents sequentially by date instead of having to go back and forth between different volumes to get a sense of what was going on in the whole navy at any given time.

Before you ask:

The earlier post: Text Analysis on the Documents of the Barbary Wars

More details on Bookworm.

As with all ngram viewers, exercise caution in assuming a text string has uniform semantics across historical, ethnic, or cultural fault lines.