A question-answering system designed to find precise answers in text passages extracted from Web documents or text collections.
Question answering serves both the general public, who need to retrieve precise information from raw text, and companies and organizations with specific text-mining needs. Given a question asked in natural language, a question-answering system returns short answers together with the passages that justify them.
Search engine extension, technology monitoring
The large number of documents now available on the Web and on intranets makes it necessary to give users intelligent assistance tools that help them find the specific information they are looking for. Relevant information at the right time can help solve a particular task. The goal is therefore to give access to the content of texts, not only to the documents themselves. Question-answering systems address this need.
Question-answering systems aim to find answers to a question asked in natural language, using a collection of documents. When the collection is extracted from the Web, the structure and style of the texts are quite different from those of newspaper articles. We developed QAVAL, a question-answering system based on an answer validation process that can handle both kinds of documents. A large number of candidate answers are extracted from short passages and then validated according to question and excerpt characteristics. The validation module is based on a machine learning approach. It takes into account criteria characterizing both excerpt and answer relevance at the surface, lexical, syntactic and semantic levels, in order to deal with different types of texts.
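As a rough illustration of feature-based answer validation, the Python sketch below scores candidate answers from two toy surface-level features and ranks them. It is not QAVAL's implementation: the `Candidate` class, the two features, and the hand-set `WEIGHTS` (which stand in for a trained model) are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    answer: str   # candidate answer string
    excerpt: str  # short passage the candidate was extracted from


def features(question_terms: List[str], cand: Candidate) -> List[float]:
    """Compute two illustrative surface features (hypothetical, not QAVAL's)."""
    tokens = set(re.findall(r"\w+", cand.excerpt.lower()))
    # Fraction of question terms that appear in the excerpt.
    overlap = sum(t in tokens for t in question_terms) / max(len(question_terms), 1)
    # 1.0 if the candidate merely repeats a question term, else 0.0.
    repeats_question = 1.0 if cand.answer.lower() in question_terms else 0.0
    return [overlap, repeats_question]


# Hand-set weights standing in for a learned model (assumption).
WEIGHTS = [1.0, -1.0]


def score(question_terms: List[str], cand: Candidate) -> float:
    """Weighted sum of the features; higher means more plausible."""
    return sum(w * f for w, f in zip(WEIGHTS, features(question_terms, cand)))


def rank(question_terms: List[str], candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates from most to least plausible."""
    return sorted(candidates, key=lambda c: score(question_terms, c), reverse=True)
```

In a learned setting, the weighted sum would be replaced by a classifier trained on labeled question, excerpt and answer triples, and the feature vector would be extended with lexical, syntactic and semantic criteria.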
QAVAL consists of sequential modules corresponding to five main steps. Question analysis provides the main characteristics used to retrieve excerpts and guide the validation process. Short excerpts are obtained directly from the search engine, then parsed and enriched with the question characteristics, which allows QAVAL to compute the features used to validate or discard candidate answers.
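The sequential organization described above can be pictured as a chain of functions that each enrich a shared state. The Python sketch below is only a toy under stated assumptions: the stage names, their number, the stop-word list, and the in-memory `corpus` that stands in for a search engine are all hypothetical and do not reproduce QAVAL's actual five modules.

```python
import re
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]

# Tiny stop-word list, for illustration only.
STOP_WORDS = {"what", "who", "when", "where", "is", "the", "of", "a", "an", "in"}


def analyze_question(state: State) -> State:
    """Extract the content terms of the question."""
    words = re.findall(r"\w+", state["question"].lower())
    state["terms"] = [w for w in words if w not in STOP_WORDS]
    return state


def retrieve_excerpts(state: State) -> State:
    """Stand-in for the search engine: keep excerpts sharing a question term."""
    state["excerpts"] = [
        e for e in state["corpus"] if any(t in e.lower() for t in state["terms"])
    ]
    return state


def extract_candidates(state: State) -> State:
    """Toy extraction: capitalized words that are not question terms."""
    state["candidates"] = [
        (word, e)
        for e in state["excerpts"]
        for word in re.findall(r"[A-Z][a-z]+", e)
        if word.lower() not in state["terms"]
    ]
    return state


def validate(state: State) -> State:
    """Keep the candidate whose excerpt covers the most question terms."""
    def coverage(candidate):
        _, excerpt = candidate
        return sum(t in excerpt.lower() for t in state["terms"]) / len(state["terms"])

    state["answer"] = max(state["candidates"], key=coverage)
    return state


STAGES: List[Stage] = [analyze_question, retrieve_excerpts, extract_candidates, validate]


def run_pipeline(state: State, stages: List[Stage] = STAGES) -> State:
    """Run the stages in order, passing the state from one to the next."""
    for stage in stages:
        state = stage(state)
    return state
```

The final `answer` is a pair of the short answer and the excerpt it came from, mirroring the idea of returning an answer together with its justification passage.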
Linux platform
Available for licensing on a case-by-case basis
Brigitte Grau
Brigitte.Grau@limsi.fr
LIMSI-CNRS
ILES Group
B.P. 133
91403 Orsay Cedex
France