Machine learning methods for understanding social media communication: Modeling irony and emojis

Barbieri, Francesco

Supervised by:

Horacio Saggion Supervisor

University of defense: Universitat Pompeu Fabra

Defense date: 25 January 2018

Committee:

Pablo Gervás Gómez-Navarro President
Dirk Hovy Secretary
Ricardo Baeza Yates Rapporteur

Type: Thesis

Teseo: 529558

Abstract

In this dissertation we propose algorithms for the analysis of social media texts, focusing on two particular aspects: irony and emojis. We propose novel automatic systems, based on machine learning methods, that recognize and interpret these two phenomena. We also explore the problem of topic bias in sentiment analysis and irony detection, showing that traditional word-based systems are not robust when they have to recognize irony in a new domain. We argue that our proposal is better suited to topic changes. We then use our approach to recognize another phenomenon related to irony: satirical news on Twitter. By relying on distributional semantic models, we introduce a novel method for studying the meaning and use of emojis in social media texts. We also propose an emoji prediction task that consists of predicting the emoji present in a text message using only the text. We show that deep-learning systems can perform this emoji prediction task with good accuracy, and that this accuracy can be improved by using the images included in the post.