A pipeline framework for Natural Language Processing
The target audience is projects that need standard Natural Language Processing tools for production or research purposes.
Alvis NLP is a pipeline framework to annotate text documents using Natural Language Processing (NLP) tools for sentence and word segmentation, named-entity recognition, term analysis, semantic typing and relation extraction (see the paper by Nedellec et al. in Handbook on Ontologies 2009 for a comprehensive overview).
The available functions are exposed as modules that can be composed in a sequence, forming the pipeline. This sequence, together with the parameters of each module, is specified in an XML-based configuration file.
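The composition described above can be illustrated with a minimal Java sketch. All names here (Document, AnnotationModule, SentenceSplitter, WordSplitter, Pipeline) are hypothetical and are not taken from the Alvis NLP code base; the parameters, which Alvis NLP reads from its XML configuration file, are passed here as plain maps to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical names throughout: this is an illustration, not the Alvis NLP API.

/** A text document plus the annotations that modules attach to it. */
class Document {
    final String text;
    final List<String> annotations = new ArrayList<String>();
    Document(String text) { this.text = text; }
}

/** One processing step; params stand in for values read from the configuration. */
interface AnnotationModule {
    void process(Document doc, Map<String, String> params);
}

/** Toy sentence segmenter: splits the text on a configurable delimiter. */
class SentenceSplitter implements AnnotationModule {
    public void process(Document doc, Map<String, String> params) {
        String delimiter = params.containsKey("delimiter") ? params.get("delimiter") : ".";
        for (String part : doc.text.split(Pattern.quote(delimiter))) {
            String sentence = part.trim();
            if (!sentence.isEmpty()) {
                doc.annotations.add("sentence:" + sentence);
            }
        }
    }
}

/** Toy word segmenter: splits every sentence annotation on whitespace. */
class WordSplitter implements AnnotationModule {
    public void process(Document doc, Map<String, String> params) {
        List<String> words = new ArrayList<String>();
        for (String annotation : doc.annotations) {
            if (annotation.startsWith("sentence:")) {
                String sentence = annotation.substring("sentence:".length());
                for (String word : sentence.split("\\s+")) {
                    words.add("word:" + word);
                }
            }
        }
        doc.annotations.addAll(words);
    }
}

/** Runs modules in the order in which they were added. */
class Pipeline {
    private final List<AnnotationModule> modules = new ArrayList<AnnotationModule>();
    private final List<Map<String, String>> params = new ArrayList<Map<String, String>>();

    Pipeline add(AnnotationModule module, Map<String, String> moduleParams) {
        modules.add(module);
        params.add(moduleParams);
        return this;
    }

    Document run(String text) {
        Document doc = new Document(text);
        for (int i = 0; i < modules.size(); i++) {
            modules.get(i).process(doc, params.get(i));
        }
        return doc;
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        Document doc = new Pipeline()
            .add(new SentenceSplitter(), Collections.singletonMap("delimiter", "."))
            .add(new WordSplitter(), Collections.<String, String>emptyMap())
            .run("Alvis annotates text. Modules run in order.");
        for (String annotation : doc.annotations) {
            System.out.println(annotation);
        }
    }
}
```

Order matters in this sketch: the word segmenter reads the annotations left by the sentence segmenter, so swapping the two steps would yield no word annotations.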
New components can easily be integrated into the pipeline. To implement a new module, one has to build a Java class manipulating text annotations following the data model defined in Alvis NLP.
Alvis NLP loads the class at run time, which makes integration much easier.
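Run-time loading of a module class can be sketched with standard Java reflection. This is only an assumption about the general mechanism, not a description of how Alvis NLP actually does it; AnnotatedText, PluggableModule, CapitalizedTokenTagger and ModuleLoader are hypothetical names, and the real data model and module contract are those defined by Alvis NLP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: this is an illustration, not the Alvis NLP API.

/** Stand-in for the framework's data model: a text and its annotations. */
class AnnotatedText {
    final String text;
    final List<String> annotations = new ArrayList<String>();
    AnnotatedText(String text) { this.text = text; }
}

/** Contract that a user-written module is assumed to implement. */
interface PluggableModule {
    void process(AnnotatedText doc);
}

/** Example user-written module: marks capitalized tokens as candidate entities. */
class CapitalizedTokenTagger implements PluggableModule {
    public CapitalizedTokenTagger() {}

    public void process(AnnotatedText doc) {
        for (String token : doc.text.split("\\s+")) {
            if (!token.isEmpty() && Character.isUpperCase(token.charAt(0))) {
                doc.annotations.add("candidate-entity:" + token);
            }
        }
    }
}

class ModuleLoader {
    /** Instantiates a module from its class name, e.g. a name read from a configuration file. */
    static PluggableModule load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            // asSubclass rejects classes that do not implement the module contract.
            return cls.asSubclass(PluggableModule.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load module " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name is taken from the class literal here only to keep the sketch runnable.
        PluggableModule module = load(CapitalizedTokenTagger.class.getName());
        AnnotatedText doc = new AnnotatedText("Alvis NLP runs in Java");
        module.process(doc);
        for (String annotation : doc.annotations) {
            System.out.println(annotation);
        }
    }
}
```

Because the loader only needs a class name and a no-argument constructor, a new module can be added in this sketch without changing the loader itself.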
Java 7, Weka
Sources are available upon request. Free to use for academic institutions.