Analyzing PubMed Entries with Python and NLTK by Themos Kalafatis.
From the post:
I decided to take my first steps of learning Python with the following task : Retrieve all entries from PubMed and then analyze those entries using Python and the Text Mining library NLTK.
We assume that we are interested in learning more about a condition called Sudden Hearing Loss. Sudden Hearing Loss is considered a medical emergency and has several causes although usually it is idiopathic (a disease or condition the cause of which is not known or that arises spontaneously according to Wikipedia).
At the moment of writing, the PubMed Query for sudden hearing loss returns 2919 entries :
A great illustration not only of using NLTK but also of the iterative nature of successful querying.
Some queries, quite simple ones, can and do succeed on the first attempt.
Themos demonstrates how to use NLTK to explore a data set where the first response isn’t all that helpful.
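To make that iterative loop concrete, here is a minimal sketch of a first pass and a refined pass over some text. It is not the code from Themos's post. The titles are made-up placeholders rather than real PubMed records, the stopword list is a tiny hand-written stand-in, and the helper names (`tokens`, `first_pass`, `refined_pass`) are hypothetical. It uses NLTK's `FreqDist` and `wordpunct_tokenize`, which need no corpus downloads, and falls back to the standard library if NLTK is not installed.

```python
import re
from collections import Counter

try:
    # FreqDist and the regex-based tokenizer work without extra NLTK data.
    from nltk import FreqDist
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback (an assumption for portability): stdlib stand-ins with the
    # same behavior for the purposes of this sketch.
    FreqDist = Counter

    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)

# Made-up placeholder titles, NOT real PubMed records.
TITLES = [
    "Sudden hearing loss and the role of steroids",
    "Steroids in the treatment of sudden hearing loss",
    "Hyperbaric oxygen for sudden hearing loss",
]

# Tiny hand-written stopword list (illustrative only).
STOPWORDS = {"and", "the", "of", "in", "for", "a", "an"}
# Words from the query itself; they appear in every hit, so they say little.
QUERY_TERMS = {"sudden", "hearing", "loss"}


def tokens(texts):
    """Lowercased alphabetic tokens from a list of strings."""
    return [
        tok.lower()
        for text in texts
        for tok in wordpunct_tokenize(text)
        if tok.isalpha()
    ]


def first_pass(texts):
    """Naive word counts: the 'first response' that is not very helpful."""
    return FreqDist(tokens(texts))


def refined_pass(texts, drop):
    """Second attempt: count again after dropping uninformative words."""
    return FreqDist(tok for tok in tokens(texts) if tok not in drop)


raw = first_pass(TITLES)
refined = refined_pass(TITLES, STOPWORDS | QUERY_TERMS)

print("first pass:  ", raw.most_common(3))
print("refined pass:", refined.most_common(3))
```

On this toy input the first pass is dominated by the query words themselves, and only after filtering does a more interesting term such as "steroids" rise to the top. Real abstracts would likely need several more rounds of this kind of adjustment.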
This is a starting idea for weekly exercises with NLTK, each emphasizing a different aspect of the library.