The Classical Language Toolkit
From the webpage:
The Classical Language Toolkit (CLTK) offers natural language processing (NLP) support for the languages of Ancient, Classical, and Medieval Eurasia. Greek and Latin functionality are currently most complete.
Goals
- compile analysis-friendly corpora;
- collect and generate linguistic data;
- act as a free and open platform for generating scientific research.
You are likely to find one or more languages of interest:
- Akkadian
- Arabic
- Bengali
- Chinese
- Coptic
- Ancient Egyptian
- Old English
- Middle English
- Greek
- Hebrew
- Hindi
- Javanese
- Latin
- Malayalam
- Multilingual
- Old Norse
- Pali
- Persian
- Prakrit
- Punjabi
- Sanskrit
- Telugu
- Tibetan
- Urdu
Collecting, analyzing, and mapping tweets can be profitable and entertaining, but by tomorrow, or perhaps next week, almost no one will read them again.
The texts in this project survived for thousands of years because people preserved them by hand. People are still reading them.
How about you?