
CITIC-UDC researcher Miguel Anxo Pérez Vila recognised for his AI-based work on detecting depression on social media
- The work has received the Best Doctoral Thesis of 2024 award from the Spanish Society for Natural Language Processing (SEPLN).
- The thesis is based on evidence that individuals experiencing mental health issues often show changes in the way they express themselves.
- At the same congress, researcher Roi Santos presented his progress on the CIDMEFEO project, developed in collaboration with the National Statistics Institute (INE).
A Coruña, 6 October 2025.– The Spanish Society for Natural Language Processing (SEPLN) has awarded Miguel Anxo Pérez Vila its Best Doctoral Thesis of 2024 prize, one of the most prestigious awards in the field of Natural Language Processing (NLP) in Spain, in recognition of the scientific excellence and social relevance of his work.
Pérez’s research, conducted at the Centre for ICT Research (CITIC) at the University of A Coruña — a centre integrated into the CIGUS Network of the Regional Government of Galicia — is pioneering in the application of Artificial Intelligence, machine learning, and computational linguistics to detect depression through language use on social media.
The thesis, supervised by researchers Javier Parapar and Álvaro Barreiro, is based on evidence that people suffering from mental health conditions often exhibit changes in how they communicate. Building on this premise, and drawing on the vast repository of written language available on social platforms, Pérez’s work explores how such posts can be used, through AI models trained on clinical and social data, to identify risk signals among users.
One of the most innovative aspects of the research is its focus on increasing transparency in detection systems. Unlike previous, more opaque approaches, the thesis proposes explainable models based on clinically validated symptoms, enabling results to be more understandable and useful to healthcare professionals.
The project combines the design of new algorithms to estimate depression severity with the creation of specialised datasets and the exploration of large language models (LLMs). Furthermore, its contributions have been integrated into a demonstration platform designed for clinical use, paving the way for practical applications in the healthcare sector.
Throughout the development of the thesis, Pérez has presented his findings at leading international conferences such as ECIR, SIGIR and EMNLP, and published them in journals such as Artificial Intelligence in Medicine (AIM). The main conclusions are that depression symptoms manifest differently in language, which calls for models sensitive to the nature of each symptom; that social media messages contain subtle signals that can be revealed through semantic retrieval techniques; and that the lack of suitable data made it necessary to create two new benchmark datasets, BDI-Sen and DepreSym. The research also highlights that while large language models can support annotation tasks, human supervision remains essential. Finally, collaboration with clinical professionals was critical to guide classification, interpret results and ensure the medical validity of the conclusions.
With this award, SEPLN recognises research that stands out not only for its originality and methodological rigour but also for its potential impact on improving mental health and social wellbeing.
Strong CITIC presence at the SEPLN Congress
At the same SEPLN Congress where Miguel Anxo Pérez Vila received his award, CITIC researcher Roi Santos Ríos also participated, presenting part of his doctoral work, “Automatic Classification of the Economic Activity of a Company Using ML and DL Techniques”. This research is part of the Data Science and Engineering for the Improvement of Official Statistical Function (CIDMEFEO) project, funded by the National Statistics Institute (INE). It focuses on developing a prototype for the automatic classification of text to identify and label the economic activity of Spanish companies based on their self-reported descriptions.
This line of work aims to develop an automatic coding system based on machine learning techniques to streamline and improve the processing of open-ended responses in official surveys, in collaboration with INE. The objective is to reduce time and costs, improve the consistency of results, and address complex challenges such as linguistic variability, uneven response quality, and Spain’s multilingual reality.
Thus, CITIC’s participation at the congress was marked not only by a prestigious award but also by the presentation of cutting-edge projects that reinforce its leadership in applying AI to enhance statistical and social processes.
About CITIC
CITIC is a research centre driving progress and excellence in applied R&D&I in Information and Communication Technologies (ICT). Established in 2008 by the University of A Coruña, the centre’s scientific activity is structured around four main research areas: Artificial Intelligence, Data Science and Engineering, High-Performance Computing, and Intelligent Services and Networks, as well as a cross-cutting area: Cybersecurity.
CITIC is accredited as a Centre of Excellence and a member of the CIGUS Network for the period 2024–2027, underscoring the quality and impact of its research. Its accreditation, structure, and development are co-funded by the Regional Government of Galicia, with the European Union contributing 60% under the ERDF Galicia Operational Programme 2021–2027, whose thematic objective is to promote “a smarter Europe: innovative and intelligent economic transformation” (ED431G 2023/01).