Gero Corpus Historikoa

Short description: 
Datasets for modernising historical Basque words
Contact: 
ixa@ehu.eus
Description: 
Datasets for modernising historical Basque words.
The lexicons have been automatically extracted from this corpus:
http://klasikoak.armiarma.eus/idazlanak/A/AxularGero.htm
Some paragraphs from this corpus have been selected and annotated using BRAT.
Functionality: 
Training/dev. corpus and test corpus:
- train-gero: training word-form lexicon for the Gero book (only non-standard words)
- train-gero-std: training word-form lexicon for the Gero book (with standard words, half of them)
- test-gero: test word-form lexicon for the Gero book (only non-standard words)
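The lexicons above could be used roughly as follows to score a modernisation method on the test split. This is only a sketch: the file layout (one entry per line, historical form and standard form separated by a tab) is an assumption and is not documented here, the function names are hypothetical, and the word pairs in the demo are placeholders rather than real Gero data.

```python
import io


def read_lexicon(handle):
    """Read (historical, standard) pairs from a text stream.

    Assumed layout: one entry per line, two tab-separated columns.
    Lines without a tab are skipped.
    """
    pairs = []
    for line in handle:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue
        historical, standard = line.split("\t", 1)
        pairs.append((historical, standard))
    return pairs


def build_lookup(train_pairs):
    """Baseline moderniser: memorise training pairs, leave unseen words unchanged."""
    table = dict(train_pairs)
    return lambda word: table.get(word, word)


def word_accuracy(test_pairs, modernise):
    """Fraction of test entries whose predicted standard form is exactly right."""
    if not test_pairs:
        return 0.0
    correct = sum(1 for hist, std in test_pairs if modernise(hist) == std)
    return correct / len(test_pairs)


# Placeholder entries standing in for train-gero and test-gero.
train = read_lexicon(io.StringIO("hist_a\tstd_a\nhist_b\tstd_b\n"))
test = read_lexicon(io.StringIO("hist_a\tstd_a\nhist_c\tstd_c\n"))
accuracy = word_accuracy(test, build_lookup(train))
print(accuracy)
```

With real files, the io.StringIO objects would be replaced by open(path, encoding="utf-8"). The lookup baseline only covers word forms seen in training, so it mainly serves as a floor to compare other methods against.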
Innovation: 
Selection and manual annotation
Ownership: 
Ixa taldea
License: 
CC BY
Notes: 
These resources are available under CC BY; if you use them, please cite the reference above.