Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.
Tags: Information Management, Internet, Web, Indexing/Search, Software Development, Libraries, Java Libraries, Text Processing, Indexing
The developers of the popular Terrier Information Retrieval Platform (http://terrier.org) are running a tutorial on how to use Terrier for state-of-the-art research. The tutorial will cover:
* Large-scale indexing and retrieval using MapReduce and Hadoop.
* Effective retrieval using a wide variety of models, in addition to advanced term-proximity, field retrieval and rich query language functionality.
* Fast integrated Cranfield-style evaluation tools.
* Extending Terrier into new research areas, e.g. evaluating new tasks using crowdsourcing.
Register for CIKM online at http://www.cikm2011.org/registration, or contact firstname.lastname@example.org to add a tutorial to an existing registration. Early-bird registration closes after August 31st (less than three days to go!)
-- Rodrygo Santos, Richard McCreadie, Vassilis Plachouras
Release Notes: Terrier 3.6 represents a significant update over the previous 3.5 release, providing over 75 improvements and bugfixes to the core Terrier search platform. It includes efficiency improvements when searching and better support for proximity search within documents.
Release Notes: Terrier 3.5 represents a significant update over the previous 3.0 release, including: document-at-a-time (DAAT) retrieval for large indices; refactored tokenisation for enhanced multi-language support; Hadoop support upgraded to version 0.20; synonym support in the query language and retrieval; out-of-the-box indexing support for query-biased summaries and an improved example Web-based interface; new, second-generation DFR models as well as other recent, effective information-theoretic models; and many more JUnit tests (now 300+).
Release Notes: Terrier 3.0 is a major update, including field-based models (such as BM25F), and term dependence proximity models (Markov Random Fields). Index changes have made the platform more scalable. Hadoop MapReduce indexing is improved. A JSP Web-based interface is now included.
Release Notes: This is a substantial update that adds support for Hadoop, primarily a Hadoop MapReduce indexing system, which allows large collections of documents to be indexed in a highly distributed fashion. It also includes various minor improvements, such as better support for the IIT CDIP1 (TREC Legal track) collection, and various bug fixes. This is intended to be the final release in the 2.x series.
Release Notes: This is a minor update that contains bugfixes and minor improvements. Support for indexing various test collections (CLEF and TREC Legal track) has been improved, and the settings of some applications, such as Desktop Search and Interactive Terrier, have been made more flexible. This release includes a filesystem abstraction layer, which allows various types of files to be accessed through a uniform API. For example, indexing an HTTP Web page is as easy as indexing a local document. A notable indexing bug affecting only the Windows platform has also been resolved.