Early English Books Online
The very good news is that 25,000 volumes from the Early English Books Online collection have been made available to the public!
From the webpage:
The EEBO corpus consists of the works represented in the English Short Title Catalogue I and II (based on the Pollard & Redgrave and Wing short title catalogs), as well as the Thomason Tracts and the Early English Books Tract Supplement. Together these trace the history of English thought from the first book printed in English in 1475 through to 1700. The content covers literature, philosophy, politics, religion, geography, science and all other areas of human endeavor. The assembled collection of more than 125,000 volumes is a mainstay for understanding the development of Western culture in general and the Anglo-American world in particular. The STC collections have perhaps been most widely used by scholars of English, linguistics, and history, but these resources also include core texts in religious studies, art, women’s studies, history of science, law, and music.
Even better news from Sebastian Rahtz Sebastian Rahtz (Chief Data Architect, IT Services, University of Oxford):
The University of Oxford is now making this collection, together with Gale Cengage’s Eighteenth Century Collections Online (ECCO), and Readex’s Evans Early American Imprints, available in various formats (TEI P5 XML, HTML and ePub) initially via the University of Oxford Text Archive at http://www.ota.ox.ac.uk/tcp/, and offering the source XML for community collaborative editing via Github. For the convenience of UK universities who subscribe to JISC Historic Books, a link to page images is also provided. We hope that the XML will serve as the base for enhancements and corrections.
This catalogue also lists EEBO Phase 2 texts, but the HTML and ePub versions of these can only be accessed by members of the University of Oxford.
[Technical note]
Those interested in working on the TEI P5 XML versions of the texts can check them out of Github, via https://github.com/textcreationpartnership/, where each of the texts is in its own repository (eg https://github.com/textcreationpartnership/A00021). There is a CSV file listing all the texts at https://raw.githubusercontent.com/textcreationpartnership/Texts/master/TCP.csv, and a simple Linux/OSX shell script to clone all 32853 unrestricted repositories at https://raw.githubusercontent.com/textcreationpartnership/Texts/master/cloneall.sh
Now for the BAD NEWS:
An additional 45,000 books:
Currently, EEBO-TCP Phase II texts are available to authorized users at partner libraries. Once the project is done, the corpus will be available for sale exclusively through ProQuest for five years. Then, the texts will be released freely to the public.
Can you guess why the public is barred from what are obviously public domain texts?
Because our funding is limited, we aim to key as many different works as possible, in the language in which our staff has the most expertise.
Academic projects are supposed to fund themselves and be self-sustaining. When anyone asks about sustainability of an academic project, ask them when the last time your countries military was “self sustaining?” The U.S. has spent $2.6 trillion on a “war on terrorism” and has nothing to show for it other than dead and injured military personnel, perversion of budgetary policies, and loss of privacy on a world wide scale.
It is hard to imagine what sort of life-time access for everyone on Earth could be secured for less than $1 trillion. No more special pricing and contracts if you are in countries A to Zed. Eliminate all that paperwork for publishers and to access all you need is a connection to the Internet. The publishers would have a guaranteed income stream, less overhead from sales personnel, administrative staff, etc. And people would have access (whether used or not) to educate themselves, to make new discoveries, etc.
My proposal does not involve payments to large military contractors or subversion of legitimate governments or imposition of American values on other cultures. Leaving those drawbacks to one side, what do you think about it otherwise?