Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 29, 2013

OpenSearchServer

Filed under: Search Engines,Searching — Patrick Durusau @ 8:11 pm

OpenSearchServer by Emmanuel Keller.

From the webpage:

OpenSearchServer is a powerful, enterprise-class, search engine program. Using the web user interface, the crawlers (web, file, database, …) and the REST/RESTFul API you will be able to integrate quickly and easily advanced full-text search capabilities in your application. OpenSearchServer runs on Linux/Unix/BSD/Windows.

Search functions

  • Advanced full-text search features
  • Phonetic search
  • Advanced boolean search with query language
  • Clustered results with faceting and collapsing
  • Filter search using sub-requests (including negative filters)
  • Geolocation
  • Spell-checking
  • Relevance customization
  • Search suggestion facility (auto-completion)

Indexation

  • Supports 17 languages
  • Fields schema with analyzers in each language
  • Several filters: n-gram, lemmatization, shingle, stripping diacritic from words,…
  • Automatic language recognition
  • Named entity recognition
  • Word synonyms and expression synonyms
  • Export indexed terms with frequencies
  • Automatic classification

Document supported

  • HTML / XHTML
  • MS Office documents (Word, Excel, Powerpoint, Visio, Publisher)
  • OpenOffice documents
  • Adobe PDF (with OCR)
  • RTF, Plaintext
  • Audio files metadata (wav, mp3, AIFF, Ogg)
  • Torrent files
  • OCR over images

Crawlers

  • The web crawler for internet, extranet and intranet
  • The file systems crawler for local and remote files (NFS, SMB/CIFS, FTP, FTPS, SWIFT)
  • The database crawler for all JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server, …)
  • Filter inclusion or exclusion with wildcards
  • Session parameters removal
  • SQL join and linked files support
  • Screenshot capture
  • Sitemap import

General

  • REST API (XML and JSON)
  • SOAP Web Service
  • Monitoring module
  • Index replication
  • Scheduler for management of periodic tasks
  • WordPress plugin and Drupal module

OpenSearchServer is something to consider if your project is GPL v3 compatible.

Even in an enterprise context, you don’t have to be better than Google at searching the entire WWW.

You just have to be better at searching content of interest to a user, project, department, etc.

The difference between your search results and Google’s should be the difference of a breakfast on near-food at McDonald’s and the best home-cooked breakfast you can imagine.

One is a mass-produced product that is the same over the world, the other is customized to your taste.

Which one would you prefer?

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress