Apache Lucene/Solr 4.8.0 Available!

The Lucene PMC is pleased to announce the availability of Apache Lucene 4.8.0 and Apache Solr 4.8.0.

Lucene can be downloaded from http://lucene.apache.org/core/mirrors-core-latest-redir.html and Solr can be downloaded from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Both releases now require Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene and Solr). In addition, both are fully compatible with Java 8.

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Highlights of the Lucene release include:

  • All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection be sure to enable all checksums during merging (it’s disabled by default).
  • Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection.
  • AnalyzingInfixSuggester now supports near-real-time autosuggest.
  • Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene’s Sort class to express the sort order.
  • Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively.
  • Switched to MurmurHash3 to hash terms during indexing.
  • IndexWriter now supports updating of binary doc value fields.
  • HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error.
  • Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux, MacOSX are known to work).
  • Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open.
  • A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held.

Highlights of the Solr release include:

  • <fields> and <types> tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file, they may be safely removed. This allows intermixing of <fieldType>, <field> and <copyField> definitions if desired.
  • The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries.
  • New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties.
  • Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API.
  • JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries.
  • Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents.
  • Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status.
  • Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries.
  • In Solr single-node mode, cores can now be created using named configsets.
  • New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the “TTL” expression, as well as automatically deleting expired documents on a periodic basis.

All exciting additions, except that today I finished configuring Tomcat7/Solr/Nutch, where Solr = 4.7.2.

Sigh, well, I suppose that was just a trial run. 😉

Comments are closed.