Network and Information Technologies

Data Science

Thesis proposal Researchers Research group

Data Mining and Community Detection in Graphs (Graph Mining)

In many applications, it is natural to represent data with graphs. Usually, data is represented in one large, connected network. Examples of such networks include the Internet, social networks, citation networks, concept networks, computer networks, chemical interaction networks, regulatory networks, socio-economic networks and encyclopedias. Sample datasets are publicly available at amongst others

Graph mining is the study of how to perform data mining and machine learning on data represented as graphs. It includes several types of analysis, from pattern recognition [2] to community detection [1] and information flow. Algorithms designed for conventional structured data mining do not work well on graphs, because structural and topological information is crucial for graph analysis. New algorithms, or extensions of existing ones, are therefore needed to deal with graph-formatted data.

For instance, uncovering the community structure exhibited by real networks is a crucial step towards an understanding of complex systems that goes beyond the local organization of their constituents. Many algorithms have been proposed so far [3], but the problem is still open and new methods keep appearing. Additionally, the recent and tremendous increase in graph-formatted data, especially in the context of social networks and IoT, calls for new methods that can deal with very large graphs, with thousands or millions of vertices and edges. Parallelism and other techniques imported from Big Data can therefore be applied to overcome the computational complexity of dealing with such data.
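
As a small illustration of what many community detection methods try to optimise, the sketch below computes Newman-Girvan modularity, one widely used quality score for a partition of a graph. It is not taken from the cited works: the toy graph, the function name `modularity` and the community labels are assumptions made for this example only.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the vertices in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of the vertices in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (illustrative): two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in range(6)}

q_split = modularity(edges, two_groups)   # one community per triangle
q_single = modularity(edges, one_group)   # everything in one community
print(f"two communities: Q = {q_split:.4f}")
print(f"one community:   Q = {q_single:.4f}")
```

Splitting the toy graph into its two triangles scores higher than keeping all vertices together, which matches the intuition of densely connected groups joined by few edges. On graphs with millions of edges, an exhaustive search over partitions is infeasible, which is why heuristic and parallel methods are needed.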

The challenges in this area remain numerous and complex, so research on these topics is expected to continue for years to come.

[1] Ferrara, E. (2012). Community structure discovery in Facebook. International Journal of Social Network Mining, 1(1), 67–90.

[2] Gibson, D., Kumar, R., & Tomkins, A. (2005). Discovering large dense subgraphs in massive graphs. In International Conference on Very Large Data Bases (VLDB), pp. 721–732.

[3] Lancichinetti, A., & Fortunato, S. (2009). Community detection algorithms: a comparative analysis. Physical Review E, 80(5), 056117.


Dr Jordi Casas-Roma

Dr Jordi Conesa Caralt


eHealth Center



Medical image processing

Medical image processing is a key step in the diagnosis of a large number of diseases. Nowadays, we can acquire images of the inside and outside of our bodies using a large variety of devices (ultrasound, magnetic resonance, optical tomography, computed tomography, etc.). The acquired images then usually need to be denoised, corrected for inhomogeneities, segmented, registered, etc. before relevant information can be extracted and used, in the form of image-based biomarkers, to support clinical decisions.
In this research line, we would like to explore the latest image processing challenges and develop new image-based biomarkers that aid clinicians in their daily work. This work will be done in collaboration with internationally recognised clinical institutions in Barcelona.
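
To make two of the processing steps mentioned above concrete, the following minimal sketch denoises a tiny synthetic image with a 3x3 median filter and then segments it with a fixed intensity threshold. It is only an illustration under simplifying assumptions: the image values, the threshold of 50 and the function names `median_filter` and `threshold` are invented for the example, and real pipelines rely on dedicated imaging libraries and far more robust methods.

```python
from statistics import median


def median_filter(image):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            row.append(median(window))
        out.append(row)
    return out


def threshold(image, level):
    """Binary segmentation: 1 where the intensity is >= level, else 0."""
    return [[1 if px >= level else 0 for px in row] for row in image]


# Synthetic image: dark tissue (10) on the left, bright tissue (100) on the
# right, plus one isolated noise spike (255) inside the dark region.
noisy = [[10, 10, 10, 100, 100, 100] for _ in range(5)]
noisy[2][1] = 255

mask_without_denoising = threshold(noisy, 50)
mask = threshold(median_filter(noisy), 50)
for row in mask:
    print(row)
```

Without the denoising step, the noise spike would be wrongly labelled as bright tissue; after median filtering, the mask contains only the right-hand region.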

Dr Ferran Prados



Dr Jordi Casas




Supply Chain Management (SCM) Optimization and Resilience to Disasters and Disruptions

Natural disasters have a significant and increasing impact all over the world. Concern about them is growing, so Disaster Risk Reduction (DRR) is increasingly present on the international agenda, with a special focus on cities because of the growing concentration of people and assets in urban zones. This thesis proposal aims to lay the scientific and technical basis for significantly improved resilience to natural hazards (such as climate-related hazards, earthquakes, etc.) and their human and socioeconomic impacts in urban zones.
The proposal is based on three principles, inspired by the UN Sendai Framework and related to the UN 2030 Agenda for Sustainable Development: 1) A focus on prevention and resilience building. 2) An inclusive “whole-of-society” approach that involves non-traditional stakeholders not usually included in DRR planning and decision making (such as households, SMEs, NGOs, etc.). 3) A data-driven approach that integrates into DRR planning and decision making diverse types of data (including small data, thick data, and big data) from a wide range of sources, including reused data.
This thesis proposal will conduct research on data-based instruments for DRR planning and decision making (such as indexes, models and scorecards) applied to urban environments.
Adrot, A.; Grace, R.; Moore, K.; Zobel, C.W. (eds.) (2021). ISCRAM 2021 Conference Proceedings. 18th International Conference on Information Systems for Crisis Response and Management. Blacksburg, Virginia (USA), May 2021.
Cobarsí, J.; Calvet, L. (2020). Community resilience instruments: Chances of improvement through customization and integration? In ISCRAM 2020 Conference Proceedings. 17th International Conference on Information Systems for Crisis Response and Management, pp. 381-388.
Kuipers, S.; Welsh, N.H. (2017). Taxonomy of the Crisis and Disaster Literature. Risk, Hazards and Crisis in Public Policy, v. 8, n. 4, pp. 272-283.

Dr Josep Cobarsí


Misinformation and disinformation through the lens of data analytics
This thesis proposal focuses on case studies about misinformation or disinformation, through the application of quantitative data analytics methods to large amounts of digital content such as social media, mainstream media news and reports, Wikipedia entries, literature about historical events, open data and/or other open or public domain sources. This digital content may be created, updated, influenced and/or used by a wide range of actors: citizens, anonymous agents or activists, governments and public agencies, companies, international organizations, political parties, social organizations, etc.
Research methodologies for these case studies will usually include an advanced conceptualization of misinformation and disinformation events, so as to support the intensive application of quantitative methods to trace and analyse them across large amounts of digital content and logs. Where suitable, these quantitative methods may be combined with qualitative methods.
Cardoso, G.; Sepúlveda, R.; Narciso, I. (2022). WhatsApp and audio misinformation during the Covid-19 pandemic. El Profesional de la Información.
Cobarsí-Morales, J. (2022). Controversial ‘Black Legend’ concept as misinformation or disinformation related to history: where do we go from here in the 21st century information field? In: Smits, M. Proceedings iConference 2022 – Information for a Better World: Shaping the Global Future.
Salaverría, R., & León, B. (2022). Misinformation beyond the media: ‘fake news’ in the big data ecosystem. In: Vázquez-Herrero J., Silva-Rodríguez A., Negreira-Rey M.C., Toural-Bran C., López-García X. (2022). Total Journalism. Models, Techniques and Challenges (pp. 109-121). Studies in Big Data, 97. Springer Nature. Cham: Springer. DOI: 10.1007/978-3-030-88028-6_9
Meel, P.; Vishwakarma, D.K. (2019). Fake news, rumor, information pollution in social media and web: A contemporary survey of state of the arts, challenges and opportunities. Expert Systems With Applications

Dr Josep Cobarsí


Multilayer networks to better understand Multiple Sclerosis
Multiple sclerosis (MS) affects neuroaxonal anatomy and function, which in turn impacts the structure, organization, and function of the brain. In general, networks representing particular aspects of the brain (morphology, structure, or dynamics) are studied independently to understand and predict the effects of brain damage in individuals [1]. A single unified model that studies these multiple aspects jointly is needed to better understand neurological diseases.
In this PhD we would like to introduce an interconnected multi-layer framework for the joint analysis of morphological, structural, and functional networks. We therefore aim to define a multi-layer scheme that combines the information from these three kinds of networks in order to better assess the effects of brain damage and its evolution in MS patients. It will then be important to define or adapt graph-mining metrics that evaluate and quantify the deterioration of brain connectivity using the new multi-layer scheme.
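
As a rough idea of what such a multi-layer scheme could look like in code, the sketch below stores one weighted network per layer over a shared set of brain regions and computes a simple cross-layer metric, the overlapping strength (the sum of a node's connection weights over all layers). Everything here is hypothetical: the region names, weights and function names are invented for illustration and do not describe the framework to be developed in the thesis.

```python
from collections import defaultdict

# Hypothetical toy data: each layer maps an undirected edge between two
# brain regions (made-up names) to a made-up connection weight.
layers = {
    "morphological": {("R1", "R2"): 0.8, ("R2", "R3"): 0.5},
    "structural":    {("R1", "R2"): 0.6, ("R1", "R3"): 0.4},
    "functional":    {("R2", "R3"): 0.9},
}


def node_strength(layer_edges):
    """Sum of the weights of the edges incident to each node in one layer."""
    strength = defaultdict(float)
    for (u, v), weight in layer_edges.items():
        strength[u] += weight
        strength[v] += weight
    return dict(strength)


def overlapping_strength(all_layers):
    """Per-node strength summed across every layer of the multi-layer network."""
    total = defaultdict(float)
    for layer_edges in all_layers.values():
        for node, s in node_strength(layer_edges).items():
            total[node] += s
    return dict(total)


per_layer = {name: node_strength(e) for name, e in layers.items()}
overlap = overlapping_strength(layers)
for node in sorted(overlap):
    print(node, round(overlap[node], 2))
```

A region that looks unremarkable in any single layer can still stand out once the layers are combined, which is the kind of effect a joint analysis is meant to capture.
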
This work will be done in close collaboration with the Multiple Sclerosis group led by Dr. Sara Llufriu at the IDIBAPS-Hospital Clinic, an internationally recognized clinical institution.
[1] E. Solana, E. Martinez-Heras, J. Casas-Roma, L. Calvet, E. Lopez-Soley, M. Sepulveda, N. Sola-Valls, C. Montejo, Y. Blanco, I. Pulido-Valdeolivas, M. Andorra, A. Saiz, F. Prados, S. Llufriu. (2019). Modified connectivity of vulnerable brain nodes in multiple sclerosis, their impact on cognition and their discriminative value. Scientific Reports 9, 20172. DOI: 10.1038/s41598-019-56806-z