CITIC

DATA SCIENCE AND ENGINEERING

Data Science and Engineering is a growing area of research that promises to create new knowledge and applications that will improve people’s lives. Our researchers in this area have a proven track record of research excellence, as evidenced by their publications and by numerous national and international projects.

 

RESEARCH AREAS AND PRIORITIES:

Big Data: Big Data is the study of how to process and manage volumes of data that conventional methods cannot handle, owing to the size, variety and complexity of the data sets, the speed at which they need to be managed, and the presence of both structured and unstructured data. Additional challenges in collecting and analysing big data include the veracity (reliability), value (performance) and visualization of the data sets.

Research areas:

    • New systems for efficient storage of massive data using compact, self-indexed data structures. Design and implementation of algorithms for building and exploiting compressed data structures for a range of data types, including text, graphics, raster maps, moving object trajectories and genomes.

 

Information retrieval: Information retrieval is the activity of collecting, storing, indexing and searching for information. It draws on a body of theories, models, algorithms and ranking methods for unstructured online text and for texts in fields such as medicine, law and science.

Research areas:

    • Information retrieval models
    • Efficiency of information retrieval systems
    • Recommender systems
    • Information system assessment methods
    • Text mining and information retrieval for early detection of psychological disorders online

 

Information systems: Information systems deal with the collection, storage, processing and use of data of all kinds. Our research in this area focuses on developments in software engineering in relation to the automation of information systems, the quality of the resulting systems, and their ability to manage massive, complex, heterogeneous data sets.

Research areas:

    • Development of tools for the automated creation of information systems online, with particular focus on geographic information systems
    • Development of complex information systems technology, especially in relation to digital libraries, multimedia systems and geographic information systems
    • Development of mobility management systems to track moving objects on a map, and to query and analyse spatio-temporal data more efficiently

 

Data analysis: Data analysis refers to the different techniques and approaches used to analyse data according to the type, nature and purpose of the data. These include statistical analysis and artificial intelligence, and have applications across an array of different sectors, including industry, finance, astrophysics, biology and health.

Research areas:

    • Dependent data analysis, including cluster analysis of time series, classification of time series, soft bootstrapping for serial dependence, and soft estimation of copula functions
    • Spatial data analysis
    • Functional data analysis
    • Complex biological systems analysis
    • Survival analysis
    • Gaia satellite data analysis and processing
    • Statistical inference in small areas
    • Image optimization using deep learning techniques
    • Deterministic symbolic regression
    • Machine learning with feature engineering, using perturbation theory, for computational biology and chemistry
    • Medical signal analysis
    • Astronomical data mining
    • Compressed data structures
    • Mobility data analysis

 

Simulation and optimization: Modelling is the study of a process or product in order to represent it in mathematical or statistical terms. Numerical simulation involves studying and selecting numerical algorithms to solve the model, and implementing the chosen algorithms on a computer to calculate the solution and represent it graphically. Numerical optimization algorithms are used to fit model parameters to real-time data in order to predict the evolution of a process or the operation of a product.

Research areas:

    • Industrial production process modelling and simulation
    • Computational finance
    • Numerical modelling and simulation of genetic networks and computational biology processes
    • Numerical modelling and simulation of fluid flow through porous media, fluid mechanics and internal waves
    • Analysis and simulation of problems related to convection and diffusion, fluid-structure interaction, and floating bodies
    • Hybrid numerical optimization algorithms
    • Parallel algorithms for multi-core CPUs and GPUs
    • Mathematical model analysis of ordinary, partial and stochastic differential equations
    • Mathematical programming
    • Metaheuristics 
    • Cooperative and non-cooperative games
    • Applications: routing, inventory management, scheduling, etc.
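
The modelling, simulation and optimization steps described in this section can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from any specific project: the model is a simple decay equation dy/dt = -k·y, the simulation uses the explicit Euler method, and the parameter k is fitted to synthetic observations by golden-section search. The function names simulate, sse and fit_rate are hypothetical.

```python
import math


def simulate(k, y0, dt, steps):
    """Numerical simulation: explicit Euler scheme for the model dy/dt = -k * y."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] + dt * (-k * ys[-1]))
    return ys


def sse(k, observed, y0, dt):
    """Sum of squared errors between the simulated and the observed values."""
    predicted = simulate(k, y0, dt, len(observed) - 1)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_rate(observed, y0, dt, lo=0.0, hi=5.0, tol=1e-8):
    """Numerical optimization: golden-section search for the k that minimizes
    the error. Assumes the error has a single minimum on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c, observed, y0, dt) < sse(d, observed, y0, dt):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


# Synthetic "observations" generated with a known rate (illustrative data only).
true_k = 0.7
observed = simulate(true_k, y0=1.0, dt=0.1, steps=50)

# Fit the parameter, then use the fitted model to predict beyond the observed window.
k_hat = fit_rate(observed, y0=1.0, dt=0.1)
forecast = simulate(k_hat, y0=1.0, dt=0.1, steps=80)
print(f"fitted k = {k_hat:.6f}")
```

In practice the model would typically be a system of ordinary, partial or stochastic differential equations, and the scalar search would be replaced by a multi-parameter optimizer, but the three stages stay the same: define the model, simulate it numerically, and fit its parameters to data.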