Measuring Language Distance of Isolated European Languages

Phylogenetics is a sub-field of historical linguistics whose aim is to classify a group of
languages by considering their distances within a rooted tree that stands for their historical evolution.
A few European languages do not belong to the Indo-European family or are otherwise isolated
in the European rooted tree. Although it is not possible to establish phylogenetic links using basic
strategies, it is possible to calculate the distances between these isolated languages and the rest using

Identification and translation of verb+noun multiword expressions: a Spanish-Basque study

This is a summary of the PhD thesis written by Uxoa Iñurrieta under the supervision of Dr. Gorka Labaka and Dr. Itziar Aduriz. Full title of the PhD thesis in Basque: "Izena+aditza Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala". The defense was held in San Sebastian on November 29, 2019. The doctoral committee consisted of Ricardo Etxepare (Centre National de la Recherche Scientifique), Margarita Alonso (Universidad de Coruña) and Miren Azkarate (University of the Basque Country).

Technology, digital context and digital competences in education

The development of technology never stops. It seems that the collection of data of many kinds (and, to some extent, of knowledge) has become a business and is clearly ending up in the hands of large private companies. This kind of data collection and development can put our digital (and other) identity at risk, and in general it has widened the digital divide, because it reduces the opportunities and resources available to the public sphere or to society.

A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity

The aim of this paper is to apply a corpus-based methodology, based on the measure of perplexity, to automatically calculate the cross-lingual language distance between historical periods of three languages. The three historical corpora have been compiled with spelling as close as possible to the original and are balanced between fiction and non-fiction.
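As a rough illustration of how perplexity can serve as a distance, the sketch below trains a character trigram model on one text and measures how well it predicts another. The abstract does not specify the model, so the n-gram order, the add-one smoothing, the symmetric averaging and all function names here are assumptions made for illustration, not the paper's actual setup.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    # Pad on the left so every character has a full (n-1)-character context.
    padded = " " * (n - 1) + text
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(text, n=3):
    # Hypothetical helper: count n-grams and their (n-1)-character contexts.
    grams = char_ngrams(text, n)
    ngram_counts = Counter(grams)
    context_counts = Counter(g[:-1] for g in grams)
    return ngram_counts, context_counts, set(text), n


def perplexity(model, text):
    # Perplexity of `text` under `model`, with add-one smoothing (an assumption).
    ngram_counts, context_counts, vocab, n = model
    v = len(vocab) + 1  # one extra slot for characters unseen in training
    grams = char_ngrams(text, n)
    log_prob = 0.0
    for g in grams:
        p = (ngram_counts[g] + 1) / (context_counts[g[:-1]] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(grams))


def distance(text_a, text_b, n=3):
    # Symmetric distance: average the perplexity measured in both directions.
    return 0.5 * (perplexity(train(text_a, n), text_b)
                  + perplexity(train(text_b, n), text_a))
```

With real data, `text_a` and `text_b` would be large corpora from two languages or two historical periods; a lower value means the texts are easier to predict from one another.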

Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation

To detect how and when readers are experiencing engagement with a literary work, we bring together empirical literary studies and
language technology by focusing on the affective state of absorption. The goal of our resource development is to enable the detection
of different levels of reading absorption in millions of user-generated reviews hosted on social reading platforms. We present a corpus
of social book reviews in English that we annotated with reading absorption categories. Based on these data, we performed supervised,

Linguistic Appropriateness and Pedagogic Usefulness of Reading Comprehension Questions

Automatic generation of reading comprehension questions is a topic receiving growing interest in the NLP community, but there is currently no consensus on evaluation metrics, and many approaches focus on linguistic quality only while ignoring the pedagogic value and appropriateness of questions. This paper addresses these weaknesses with a new evaluation scheme in which the questions of the questionnaire are structured hierarchically, so that human annotators are not confronted with evaluation measures that do not make sense for a given question.

Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction

In this paper we present a relation extraction system that, given a text, extracts pedagogically motivated relation types, as a previous step to obtaining a semantic representation of the text which will make it possible to automatically generate questions for reading comprehension. The system maps pedagogically motivated relations to relations from ConceptNet and deploys Distant Supervision for relation extraction. We run a study on a subset of those relations in order to analyse the viability of our approach.
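The core idea of distant supervision can be sketched in a few lines: any sentence that mentions both arguments of a known knowledge-base triple is taken as a (noisy) training example for that relation. The function name, the whitespace tokenisation and the hand-written triples below are assumptions for illustration only; the abstract does not describe how the system matches entities or queries ConceptNet.

```python
def distant_label(sentences, triples):
    # Hypothetical helper: label a sentence with a relation whenever both
    # the head and the tail of a known triple appear among its tokens.
    labelled = []
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for head, relation, tail in triples:
            if head in tokens and tail in tokens:
                labelled.append((sentence, head, tail, relation))
    return labelled


# Hand-written triples in the style of a knowledge base (illustrative only).
triples = [("wheel", "PartOf", "car")]
sentences = [
    "The wheel of the car is flat",
    "The car is red",
]
examples = distant_label(sentences, triples)
```

Only the first sentence is labelled here, because the second one mentions "car" but not "wheel". Labels produced this way are noisy, which is why such data usually needs filtering or a noise-tolerant classifier.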

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

The combination of visual and textual representations has produced excellent results in tasks such as image captioning and visual question answering, but the inference capabilities of multimodal representations are largely untested.
In the case of textual representations, inference tasks such as Textual Entailment and Semantic Textual Similarity have often been used to benchmark the quality of textual representations.

