Tsearch2 – full text extension for PostgreSQL

Wednesday, February 16th, 2011

Tsearch2 – full text extension for PostgreSQL

Following up on the TSearch Primer post from yesterday.

This is the current documentation for Tsearch2.

TSearch Primer

Tuesday, February 15th, 2011

TSearch Primer

From the website:

TSearch is a full-text search engine that is packaged with PostgreSQL. The key developers of TSearch are Oleg Bartunov and Teodor Sigaev, who have also done extensive work on the GiST and GIN indexes used by PostGIS, PgSphere and other projects. For more about how TSearch and OpenFTS got started, check out A Brief History of FTS in PostgreSQL. Check out the TSearch Official Site if you are interested in related TSearch tips or in donating to this very worthy project.

TSearch differs from regular string searching in PostgreSQL in several key ways.

  1. It is well suited to searching large bodies of text, since each word is indexed using a Generalized Inverted Index (GIN) or a Generalized Search Tree (GiST) and searched using text search vectors. GIN is the index type generally used. Search vectors are built at word and phrase boundaries.
  2. TSearch has a concept of linguistic significance, drawing on language dictionaries, Ispell, thesauri, stop words, etc. It can therefore ignore common words and equate terms and phrases with similar meanings.
  3. TSearch is, for the most part, case-insensitive.
  4. While various dictionaries and configurations are available out of the box with TSearch, one can create new ones and customize existing ones further to cater to specific niches within industries, e.g. medicine, pharmaceuticals, physics, chemistry, biology, legal matters.
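
To make the points above concrete, here is a toy sketch in Python of an inverted index with case folding and stop words. It is not how TSearch or GIN is implemented, and it does no stemming or dictionary lookup; the stop-word list, the sample documents, and the function names (to_lexemes, build_index, search) are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; real TSearch configurations ship their own.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to"}


def to_lexemes(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for lexeme in to_lexemes(text):
            index[lexeme].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    lexemes = to_lexemes(query)
    if not lexemes:
        return set()
    result = set(index.get(lexemes[0], set()))
    for lexeme in lexemes[1:]:
        result &= index.get(lexeme, set())
    return result


# Made-up sample documents.
docs = {
    1: "The history of full-text search in PostgreSQL",
    2: "Indexing the text of legal documents",
    3: "PostgreSQL and GIN indexes",
}
index = build_index(docs)

print(sorted(search(index, "POSTGRESQL")))  # [1, 3]
print(sorted(search(index, "the text")))    # [1, 2]
```

The query is looked up word by word in the index rather than scanned against every document, which is the property that makes an inverted index attractive for large bodies of text. Because this sketch skips stemming, a query for "index" would not match "indexes"; equating such word forms is the job TSearch's dictionaries do.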

Short introduction to TSearch, which is part of PostgreSQL.

Should be of interest to topic mappers using PostgreSQL.