OpenMinTeD: Sharing IXA pipes in the OpenMinTeD platform.

IXA pipes is a set of Natural Language Processing tools that performs basic linguistic processing in several languages. In this project, the IXA pipes tools will be integrated into the OpenMinTeD platform. OpenMinTeD is an open platform created to offer a unified gateway for processing many languages. To that end, tools are integrated into a common platform, which lets users create and run different workflows. In this project we will use Docker technology to integrate IXA pipes into the OpenMinTeD platform.
The aim of this project is to integrate IXA pipes, a modular set of ready-to-use Natural Language Processing (NLP) tools, into the OpenMinTeD platform. Apart from being easy to train and deploy, the IXA pipes modules are a good fit for our goal of providing NLP tools for many languages, because every module except the tokenizer is based on machine learning. In fact, IXA pipes tries to use the same approach across NLP tasks in order to create processors that are robust across both domains and languages. This strategy has proven very successful for several tasks and languages, such as Named Entity Recognition (NER) and Opinion Target Extraction (OTE), in both out-of-domain and in-domain evaluations. In the project, we will integrate IXA pipes into OpenMinTeD, an open platform that will be a gateway to many types of language data, including tagsets, ontologies, publications and corpora. The platform will also offer services and functionalities that are useful for text and data mining, and will allow miners to share their tools and build their own workflows. To this end, the IXA pipes modules will be shared as Docker images, following previous work on similar integrations (e.g., the case of the Alvis NLP modules).
IXA pipes is a set of Natural Language Processing (NLP) tools that includes modules for linguistic analysis in eight languages, ranging from tokenization and part-of-speech tagging to named entity recognition, among others. In this project, the IXA pipes tools will be integrated into OpenMinTeD, an open platform whose goal is to provide a common access point for analyzing multilingual texts at several levels, including a common set of tags, shared ontologies, etc.
Rodrigo Agerri