Master's Thesis

Title:

Robust Document Representations for Hyperpartisan and Fake News Detection

Author:

Talita Anthonio, MA

Abstract:

Hyperpartisan news is characterized by extremely one-sided content from a left-wing or right-wing political perspective. This thesis is concerned with automatically detecting such news through supervised text classification. We work with data from the recent shared task on hyperpartisan news detection (SemEval-2019 Task 4). We use two classification techniques: Support Vector Machines (SVMs) and Recurrent Neural Networks. We experiment with document representations using bag-of-words, bag-of-clusters, word embeddings and contextual character-based embeddings. We also try to improve our classifiers by adding local features, such as POS n-grams, stylistic features and the sentiment of a text. Our aim is to build robust classifiers across tasks related to fake news, for different domains and text genres. Although local features help to model the task in-domain, this thesis shows that dense document representations work better across domains and tasks. We obtain very competitive results in the hyperpartisan news detection task and state-of-the-art results in an out-of-domain evaluation on fake news.

File:

MAL-Talita_Rani_Anthonio.pdf

Supervisors:

Rodrigo Agerri and Malvina Nissim

Year:

2019

Keywords:

hyperpartisan news detection, fake news, supervised text classification
