Paper

Measuring diachronic language distance using perplexity. Application to English, Portuguese and Spanish.

The objective of this work is to establish a corpus-driven methodology for automatically quantifying the diachronic language distance between chronological periods of several languages. We apply a perplexity-based measure to written text representing different historical periods of three languages: European English, European Portuguese and European Spanish. For this purpose, we have built historical corpora for each period, compiled from different open corpus sources containing texts as close as possible to their original spelling. The results of our experiments show that a diachronic

Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool

Lately, discourse structure has received considerable attention due to the benefits of applying it to several NLP tasks, such as opinion mining, summarization, question answering and text simplification, among others.

Leveraging SNOMED CT terms and relations for machine translation of clinical texts from Basque to Spanish

We present a method for machine translation of clinical texts that does not rely on bilingual clinical texts, leveraging instead the rich terminology and structure of the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT), which is considered the most comprehensive multilingual clinical health care terminology collection in the world. We evaluate our method on Basque-to-Spanish translation, comparing performance with and without clinical domain resources.

Neural Machine Translation of clinical texts between long distance languages

Objective: To analyze techniques for machine translation of electronic health records (EHRs) between long distance languages, using Basque and Spanish as a reference. We studied distinct configurations of neural machine translation systems and used different methods to overcome the lack of a bilingual corpus of clinical texts or health records in Basque and Spanish.
