CITIC

CITIC researcher Carlos Gómez, from UDC, presents in Kyoto his projects on computational linguistics, the science that translates natural language for technological devices

27/03/2025 - CITIC
  • He participated in a workshop on data-oriented approaches to the social sciences and humanities organized by Kyoto University.

Carlos Gómez, a CITIC researcher from the University of A Coruña and a specialist in computational linguistics, spoke at the workshop “Unit End-Year Workshop 2025: Data-Oriented Approaches to the Social Sciences and Humanities” about the potential of Large Language Models (LLMs), which are aimed at adapting natural language to virtual environments and technological devices.

The National Research Award winner presented the initial results of a comparative study under way at CITIC on the differences between LLM-generated and human-written texts. He discussed their potential and drawbacks, and recommended keeping multiple pathways open for further research in this field. Covering more languages, using customized models, and considering the influence of gender are some of the directions the researcher highlighted.

CITIC’s mission in participating in this event, as well as in other international initiatives, is to demonstrate that cutting-edge science can be conducted in Galicia: science that can be transferred to society and that is also developed from a functional perspective.

Sciences and humanities, hand in hand to create the language of ICT

The objective of the project presented in Japan is to create algorithms that ‘translate’ natural language, which is often too complex for ICT, into the digital environments typical of current artificial intelligence, as well as into the codes, tools, and software specific to computational linguistics. Research on language technologies is central to this scientific discipline, which combines the work of technologists such as mathematicians, computer scientists, and engineers with that of linguists and other humanities specialists. It is an interdisciplinary field focused on developing formalisms that describe how natural language works, so that they can be turned into executable programs and processed by a technological device.

The results presented build on the findings of a project led by the same CITIC researcher, in which new techniques were developed to improve the speed of natural language parsers, making them suitable for web-scale processing. These techniques have also proven useful for analyzing and comparing large volumes of texts generated by humans and by LLMs.