WSD-IXA

Short description: 
Word-Sense Disambiguation
Contact: 
e.agirre[abildua/at]ehu.es
Description: 
The WSD system is based on the well-known Support Vector Machine (SVM) algorithm. It has been trained on the EuSemCor corpus, the only semantically tagged Basque corpus. Because of the corpus's small size, the system has been trained for 402 polysemous nouns.
Functionality: 
A Perl CGI script passes the raw input text through Eustagger, the Basque lemmatizer, in order to extract features. The resulting feature vector is then classified by the SVM-based WSD system. Finally, the CGI script combines the classifier and lemmatizer output and displays it in a readable format.
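The feature-extraction step can be illustrated with a minimal sketch. It is written in Python for brevity (the actual system uses Perl, C and C++), and it assumes the text has already been lemmatized. The function names, the window size, the feature-to-index mapping and the example lemmas are all hypothetical and are not taken from the system itself. The only grounded detail is the SVM-light input line format: a target value followed by feature:value pairs in increasing feature order.

```python
from collections import Counter

def extract_features(lemmas, target_index, window=2):
    """Count the lemmas in a context window around the target word.

    Assumption: a simple bag-of-lemmas feature set; the real system's
    features are not described in this entry.
    """
    lo = max(0, target_index - window)
    hi = min(len(lemmas), target_index + window + 1)
    # Context = lemmas before and after the target, excluding the target.
    context = lemmas[lo:target_index] + lemmas[target_index + 1:hi]
    return Counter(context)

def to_svmlight_line(features, feature_ids, label=0):
    """Format a sparse feature vector as one SVM-light input line.

    feature_ids is a hypothetical mapping from lemma to a positive
    integer index; lemmas missing from it are dropped. The label is a
    placeholder target value for an example to be classified.
    """
    pairs = sorted(
        (feature_ids[f], count)
        for f, count in features.items()
        if f in feature_ids
    )
    return " ".join([str(label)] + [f"{i}:{v}" for i, v in pairs])

# Illustrative input: lemmas as a lemmatizer might return them.
lemmas = ["gizon", "banku", "eseri", "izan"]
feature_ids = {"gizon": 1, "eseri": 2, "izan": 3}
line = to_svmlight_line(extract_features(lemmas, 1), feature_ids)
print(line)
```

A line in this format could then be written to a file and passed to the classifier for the target noun.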
Technology: 
C, C++, Perl.
Modules: 
Perl CGI script, EuSemCor database (MySQL), Eustagger, SVM-light.
Innovation: 
First online WSD system for Basque.