Searching Legal Information in Multiple Asian Languages

Searching Legal Information in Multiple Asian Languages by Philip Chung, Andrew Mowbray, and Graham Greenleaf.


In this article the Co-Directors of the Australasian Legal Information Institute (AustLII) explain the need for an open source search engine which can search simultaneously over legal materials in European languages and also in Asian languages, particularly those that require a ‘double byte’ representation, and the difficulties this task presents. A solution is proposed, the ‘u16a’ modifications to AustLII’s open source search engine (Sino) which is used by many legal information institutes. Two implementations of the Sino u16A approach, on the Hong Kong Legal Information Institute (HKLII), for English and Chinese, and on the Asian Legal Information Institute (AsianLII), for multiple Asian languages, are described. The implementations have been successful, though many challenges (discussed briefly) remain before this approach will provide a full multi-lingual search facility.
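To make the 'double byte' point concrete, here is a minimal Python sketch of why a search engine that assumes one byte per character cannot index Chinese text as it stands. It does not show how Sino or the u16a modifications work; the function names and sample strings are hypothetical and chosen only for illustration.

```python
# Illustrative only: compares storage sizes of English and Chinese text.
# Function names and sample strings are assumptions, not taken from Sino.

def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units needed to store text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def needs_more_than_one_byte(text: str) -> bool:
    """Return True if any character lies outside the single-byte range 0x00-0xFF."""
    return any(ord(ch) > 0xFF for ch in text)


samples = {"English": "contract law", "Chinese": "合同法"}

for label, text in samples.items():
    print(
        label,
        "chars:", len(text),
        "utf16_units:", utf16_code_units(text),
        "utf8_bytes:", len(text.encode("utf-8")),
        "multi-byte:", needs_more_than_one_byte(text),
    )
```

The English sample fits in one byte per character, while every character in the Chinese sample needs a full 16-bit unit, so an index built around single-byte characters would split or mangle those terms.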

If the normal run of legal information retrieval, across jurisdictions, vocabularies, etc., is not challenging enough, you can try your hand at cross-language retrieval with European and Asian languages, plus synonyms, etc.


I would like to think the synonymy issue, which this paper notes as open, could be addressed in part through the use of topic maps. It would be an evolutionary solution, to be updated as our use and understanding of language evolve.

Any thoughts on Sino versus Lucene/Solr 4.0 (alpha, I know, but it won’t stay that way forever)?

I first saw this at Legal Informatics.
