Dandelion’s New Bloom: A Family Of Semantic Text Analysis APIs by Jennifer Zaino.
From the post:
Dandelion, the service from SpazioDati whose goal is to deliver linked and enriched data for apps, has recently introduced a new suite of products related to semantic text analysis.
Its dataTXT family of semantic text analysis APIs includes dataTXT-NEX, a named entity recognition API that links entities in the input sentence with Wikipedia and DBpedia and, in turn, with the Linked Open Data cloud, and dataTXT-SIM, an experimental semantic similarity API that computes the semantic distance between two short sentences. dataTXT-CL (now in beta) is a categorization service that classifies short sentences into user-defined categories, says SpazioDati CEO Michele Barbera.
“The advantage of the dataTXT family compared to existing text analysis tools is that dataTXT relies neither on machine learning nor NLP techniques,” says Barbera. “Rather it relies entirely on the topology of our underlying knowledge graph to analyze the text.” Dandelion’s knowledge graph merges several Open Community Data sources (such as DBpedia) with private data collected and curated by SpazioDati. It’s still in private beta and not yet publicly accessible, though plans are to gradually open up portions of the graph in the future via the service’s upcoming Datagem APIs, “so that developers will be able to access the same underlying structured data by linking their own content with dataTXT APIs or by directly querying the graph with the Datagem APIs; both of them will return the same resource identifiers,” Barbera says. (See the Semantic Web Blog’s initial coverage of Dandelion here, including additional discussion of its knowledge graph.)
The line, “…dataTXT relies neither on machine learning nor NLP techniques,…[r]ather it relies entirely on the topology of our underlying knowledge graph to analyze the text,” caught my eye.
The knowledge graph is in private beta for now, but I am interested in how well the approach works against data in the wild.
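To make "relies entirely on the topology of the graph" a little more concrete, here is a toy sketch of one way graph structure alone can yield a relatedness score: the Jaccard overlap of two entities' neighbor sets. This is not SpazioDati's method, which the post does not describe. The graph, the entity names, and the functions `neighbor_overlap` and `most_related` are all hypothetical and exist only for illustration.

```python
# Toy knowledge graph: each entity maps to the set of entities it links to.
# Every name and link here is made up for illustration; this is NOT
# Dandelion's graph or algorithm.
TOY_GRAPH = {
    "Rome": {"Italy", "Colosseum", "Tiber"},
    "Milan": {"Italy", "Lombardy", "Duomo"},
    "Python": {"Programming language", "Guido van Rossum"},
}


def neighbor_overlap(graph, a, b):
    """Jaccard overlap of the neighbor sets of entities a and b (0.0 to 1.0)."""
    na = graph.get(a, set())
    nb = graph.get(b, set())
    union = na | nb
    if not union:
        # Neither entity has any links, so there is nothing to compare.
        return 0.0
    return len(na & nb) / len(union)


def most_related(graph, entity):
    """Return the other entity whose neighborhood overlaps most with `entity`."""
    others = [e for e in graph if e != entity]
    return max(others, key=lambda e: neighbor_overlap(graph, entity, e))


if __name__ == "__main__":
    print(neighbor_overlap(TOY_GRAPH, "Rome", "Milan"))
    print(neighbor_overlap(TOY_GRAPH, "Rome", "Python"))
    print(most_related(TOY_GRAPH, "Rome"))
```

No training data or parsing is involved: the score depends only on which nodes share neighbors. Whether something along these lines holds up on messy real-world text is exactly the open question.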