Parsing Natural Scenes and Natural Language with Recursive Neural Networks by Richard Socher; Cliff Chiung-Yu Lin; Andrew Ng; and Chris Manning.
Description:
Recursive structure is commonly found in the inputs of different modalities such as natural scene images or natural language sentences. Discovering this recursive structure helps us to not only identify the units that an image or sentence contains but also how they interact to form a whole. We introduce a max-margin structure prediction architecture based on recursive neural networks that can successfully recover such structure both in complex scene images as well as sentences. The same algorithm can be used both to provide a competitive syntactic parser for natural language sentences from the Penn Treebank and to outperform alternative approaches for semantic scene segmentation, annotation and classification. For segmentation and annotation our algorithm obtains a new level of state-of-the-art performance on the Stanford background dataset (78.1%). The features from the image parse tree outperform Gist descriptors for scene classification by 4%.
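To make the idea of recovering recursive structure concrete, here is a minimal sketch, not the authors' implementation. It assumes a common recursive composition in which two child vectors are merged as tanh(W[c1; c2] + b), and each candidate merge is scored by a linear layer. The names compose, greedy_parse, and w_score are hypothetical, and the weights are random rather than trained with the max-margin objective. It merges only sequentially adjacent elements, as in a sentence; an image would need a neighborhood graph over regions instead.

```python
import numpy as np

def compose(W, b, left, right):
    """Merge two child vectors into one parent vector of the same size."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def greedy_parse(leaves, W, b, w_score):
    """Repeatedly merge the highest-scoring adjacent pair until one node remains.

    Returns (root_vector, tree), where tree is a nested tuple of leaf indices.
    """
    nodes = [(vec, idx) for idx, vec in enumerate(leaves)]
    while len(nodes) > 1:
        # Candidate parent vector for every adjacent pair of current nodes.
        candidates = [compose(W, b, nodes[i][0], nodes[i + 1][0])
                      for i in range(len(nodes) - 1)]
        scores = [float(w_score @ p) for p in candidates]
        best = int(np.argmax(scores))
        merged = (candidates[best], (nodes[best][1], nodes[best + 1][1]))
        nodes[best:best + 2] = [merged]  # replace the pair with its parent
    return nodes[0]

# Illustrative setup: random (untrained) parameters and random leaf vectors.
rng = np.random.default_rng(0)
dim = 4
W = rng.normal(scale=0.1, size=(dim, 2 * dim))
b = np.zeros(dim)
w_score = rng.normal(size=dim)
leaves = [rng.normal(size=dim) for _ in range(4)]

root_vec, tree = greedy_parse(leaves, W, b, w_score)
print(tree)  # nested tuple describing the order in which leaves were merged
```

Because every parent has the same dimension as its children, the same function can be applied at every level of the tree, which is what makes the network recursive.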
Video of Richard Socher’s presentation at ICML 2011.
PDF of the paper: http://nlp.stanford.edu/pubs/SocherLinNgManning_ICML2011.pdf
According to one popular search engine, the paper has 51 citations (as of this writing).
What caught my attention was the mapping of phrases into a vector space, which makes it possible to compute nearest neighbors of phrases, for both syntactic and semantic similarity.
If you need more than a Boolean test for similarity (Yes/No), then you are likely to be interested in this work.
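The difference between a Boolean test and a graded similarity over phrase vectors can be sketched as follows. This is an illustration under stated assumptions: the phrase vectors are invented by hand rather than produced by the paper's network, the function names are hypothetical, and cosine similarity is used as the measure, which may not be the one used in the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Graded similarity in [-1, 1] instead of a yes/no answer."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_neighbors(query, phrase_vectors, k=2):
    """Return the k phrases most similar to `query`, best first."""
    scored = [(phrase, cosine_similarity(phrase_vectors[query], vec))
              for phrase, vec in phrase_vectors.items() if phrase != query]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Hand-made vectors for illustration only (assumption, not model output).
phrase_vectors = {
    "a sharp drop": np.array([0.9, 0.1, 0.0]),
    "a steep decline": np.array([0.8, 0.2, 0.1]),
    "a slight rise": np.array([-0.7, 0.3, 0.1]),
    "the board of directors": np.array([0.0, 0.1, 0.9]),
}

for phrase, score in nearest_neighbors("a sharp drop", phrase_vectors):
    print(f"{phrase}: {score:.3f}")
```

A ranked list with scores lets you choose a threshold, or simply take the top few matches, rather than being limited to an exact-match test.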
Later work by Socher at his homepage.