Medical literature searches: a comparison of PubMed and Google Scholar by Eva Nourbakhsh, Rebecca Nugent, Helen Wang, Cihan Cevik and Kenneth Nugent. (Health Information & Libraries Journal, Article first published online: 19 JUN 2012)
From the abstract:
Background
Medical literature searches provide critical information for clinicians. However, the best strategy for identifying relevant high-quality literature is unknown.
Objectives
We compared search results using PubMed and Google Scholar on four clinical questions and analysed these results with respect to article relevance and quality.
Methods
Abstracts from the first 20 citations for each search were classified into three relevance categories. We used the weighted kappa statistic to analyse reviewer agreement and nonparametric rank tests to compare the number of citations for each article and the corresponding journals’ impact factors.
Results
Reviewers ranked 67.6% of PubMed articles and 80% of Google Scholar articles as at least possibly relevant (P = 0.116) with high agreement (all kappa P-values < 0.01). Google Scholar articles had a higher median number of citations (34 vs. 1.5, P < 0.0001) and came from higher impact factor journals (5.17 vs. 3.55, P = 0.036).
Conclusions
PubMed searches and Google Scholar searches often identify different articles. In this study, Google Scholar articles were more likely to be classified as relevant, had higher numbers of citations and were published in higher impact factor journals. The identification of frequently cited articles using Google Scholar for searches probably has value for initial literature searches.
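For readers unfamiliar with the weighted kappa statistic mentioned under Methods, here is a minimal sketch of how it measures agreement between two reviewers on ordered categories. The abstract does not say which weighting scheme the authors used, so the linear weights below are an assumption, and the function name `weighted_kappa` and the sample ratings are made up purely for illustration (they are not data from the study).

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for two raters.

    Ratings are integers 0 .. n_categories-1 on an ordered scale
    (e.g. 0 = not relevant, 1 = possibly relevant, 2 = relevant).
    Linear weights are an assumption; the paper may have used another scheme.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) ratings
    marginal_a = Counter(rater_a)              # how often rater A used each category
    marginal_b = Counter(rater_b)              # how often rater B used each category

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with the distance between categories.
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * marginal_a[i] * marginal_b[j] / (n * n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings for six abstracts by two reviewers.
reviewer_1 = [2, 2, 1, 0, 1, 2]
reviewer_2 = [2, 1, 1, 0, 0, 2]
print(weighted_kappa(reviewer_1, reviewer_2, n_categories=3))
```

A value of 1 means the reviewers agreed on every abstract, 0 means agreement no better than chance, and near-misses (say, "relevant" vs. "possibly relevant") are penalised less than outright disagreement.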
I have several concerns that may or may not be allayed by further investigation:
- Four queries seem like an inadequate basis for evaluation. Not that I expect to see one “winner” and one “loser,” but I am more concerned with what led to the differences in results.
- It is unclear why a citation from a journal with a higher impact factor is superior to one from a journal with a lower impact factor. I assume the point of the query is to obtain a useful result (in the sense of medical treatment, not tenure).
- Neither system enabled users to build upon the query experience of prior users with a similar query.
- Neither system enabled users to avoid re-reading the same texts that others had read before them.
Thoughts?