Semantic document indexing and search engine framework

Target users and customers

Domain-specific communities, especially technical and scientific ones, that want to build search engines and information systems for managing documents with fine-grained semantic annotations.

Application sectors

Development of search engines and information systems.


AlvisIR is a complete suite for indexing documents with fine-grained semantic annotations. The search engine performs a semantic analysis of the user query and also searches for synonyms and sub-concepts of the query terms.

AlvisIR has two main components:

1. the indexing tool and search daemon, based on Index Data's Zebra, which supports standard CQL queries,

2. the web user interface, featuring result snippets, query-term highlighting, facet filtering and concept hierarchy browsing.

Setting up a search engine requires the semantic resources for query analysis (synonyms and a concept hierarchy) and a set of annotated documents. AlvisIR is closely integrated with AlvisNLP for document annotation and with TyDI for semantic resource acquisition.
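As an illustration only, the query analysis described above (mapping synonyms to concepts, then adding sub-concepts) can be sketched as follows. This is not AlvisIR's actual code: the `SYNONYMS` and `HIERARCHY` tables, the function names and the `concept` index name are all hypothetical, and real resources would come from a tool such as TyDI rather than from in-memory dictionaries. The output is a plain CQL boolean expression of the kind a CQL-capable server could accept.

```python
from collections import deque

# Hypothetical semantic resources (illustrative data only).
# Synonym table: lowercase surface form -> canonical concept label.
SYNONYMS = {
    "wheat": "Triticum",
    "bread wheat": "Triticum aestivum",
}
# Concept hierarchy: concept -> list of direct sub-concepts.
HIERARCHY = {
    "Triticum": ["Triticum aestivum", "Triticum durum"],
}


def normalize(term):
    """Map a query term to its canonical concept, or keep it unchanged."""
    return SYNONYMS.get(term.lower(), term)


def expand(concept):
    """Return the concept followed by all its sub-concepts (breadth-first)."""
    found = [concept]
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in HIERARCHY.get(current, []):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


def quote(value):
    """Quote a value for CQL, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_cql(term, index="concept"):
    """Build a CQL disjunction over the concept and its sub-concepts.

    The index name "concept" is an assumption made for this sketch.
    """
    concepts = expand(normalize(term))
    return " or ".join(f"{index} = {quote(c)}" for c in concepts)


if __name__ == "__main__":
    print(to_cql("wheat"))
```

With the sample data, the query term "wheat" is first mapped to the concept "Triticum" and then widened to its two sub-concepts, so documents annotated with any of the three are matched. A term absent from both tables passes through unchanged as a single-clause query.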

Indicative indexing time: 24 min for a corpus containing 5 million annotations.

Indicative response time: 18s for a response containing 20,000 annotations.

Technical requirements:

  • Linux platform
  • Perl
  • libxml2
  • Zebra indexing engine
  • PHP5

Conditions for access and use:

Source code available upon request. Free to use for academic institutions.



  • Inra

Contact details:

Robert Bossy

Domaine de Vilvert
78352 Jouy-en-Josas cedex