Open Sentiment Analysis by Pete Warden.
From the post:
Sentiment analysis is fiendishly hard to solve well, but easy to solve to a first approximation. I’ve been frustrated that there have been no easy free libraries that make the technology available to non-specialists like me. The problem isn’t with the code, there are some amazing libraries like NLTK out there, but everyone guards their training sets of word weights jealously. I was pleased to discover that SentiWordNet is now CC-BY-SA, but even better I found that Finn Årup has made a drop-dead simple list of words available under an Open Database License!
With that in hand, I added some basic tokenizing code and was able to implement a new text2sentiment API endpoint for the Data Science Toolkit:
BTW, while you are there, take a look at the Data Science Toolkit more generally.
Glad to hear about the open set of word weights.
Sentiment analysis with undisclosed word weights sounds iffy to me.
It’s like getting a list of rounded numbers but you don’t know the rounding factor.
It gets even worse with sentiment analysis, because every rounding factor may be different.
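To make the idea concrete, here is a minimal sketch of the approach the quoted post describes: an open list of word weights plus some basic tokenizing. It is not the Data Science Toolkit's actual code. The function names, the averaging rule, and the four weights below are all made up for illustration; a real run would load a published word list instead.

```python
import re

# Hypothetical word weights, for illustration only.
# These values are NOT taken from any published word list.
WORD_WEIGHTS = {
    "good": 2,
    "great": 3,
    "bad": -2,
    "terrible": -3,
}


def tokenize(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def score_text(text, weights=WORD_WEIGHTS):
    """Average the weights of the tokens found in the word list.

    Returns 0.0 when no token appears in the list. Averaging is an
    assumption of this sketch; summing is an equally plausible choice.
    """
    scores = [weights[token] for token in tokenize(text) if token in weights]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    for sample in ("A great day", "Good food but terrible service"):
        print(sample, "->", score_text(sample))
```

Because the weights sit in a plain dictionary, anyone can see exactly which words moved a score and by how much, which is the point of having an open word list.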