Distributed, Parallel and Collaborative Systems

Thesis proposal

Researchers

Research group

Parallel and distributed scientific applications: performance and efficiency

Several bottlenecks currently limit the growth of parallel and distributed programming paradigms and environments, and they affect our ability to provide efficient applications for concurrent computation.

We need to know the platforms, their performance, and the underlying hardware and networking technologies, and we must be able to produce optimized software that can take advantage, statically or dynamically, of the available computational resources.

In this line of research we study different approaches to producing better scientific applications and to building tools (via automatic performance analysis) that can understand the application model and the underlying programming paradigm. We try to tune the performance of these applications to a dynamically changing computational environment, in which the resources (and their characteristics) can be homogeneous or heterogeneous depending on the hardware platform. In particular, we focus our research on shared-memory and message-passing paradigms, and on many-core/multi-core environments, including multi-core CPUs, GPUs (graphics card computing), and cluster/grid/cloud/supercomputing platforms.

Dr Josep Jorba

WINE

Community-owned systems at the edge

Edge computing is a form of cloud computing in which part of the computation (data and/or services) is hosted on resources spread across the Internet (“at the edges”). By community-owned systems at the edge we mean systems that host their data and services on personal computers (mostly desktop computers or single-board computers such as the Raspberry Pi) voluntarily contributed by the participants in the system. Community-owned systems at the edge are self-owned (community members own the computers where data and services are hosted), self-managed (with a decentralized and decoupled structure), and self-growing. They also share the following characteristics:

(a) No central authority is responsible for providing the required computational resources.

(b) The computing resources are heterogeneous (in software and hardware), low-capacity, and spread across the Internet, in contrast with the high-capacity computer clusters of traditional clouds.

(c) The computers belong to the users, who share them to build the system's computational infrastructure.

Regarding reliability and QoS, community-owned systems at the edge have to guarantee to the user:

* Availability: the user can access data anytime from anywhere;

* Freshness: the user gets up-to-date data; and

* Immediateness: the user obtains the data in a time that feels immediate.

Therefore, this kind of system has to (a) make efficient use of the (likely scarce) contributed resources (storage, bandwidth, and CPU) so that none are wasted; and (b) provide privacy and security guarantees.
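As a rough illustration of the tension between availability and scarce contributed resources, the sketch below estimates how many replicas of a data item are needed to reach a target availability. It is only a toy model under loud assumptions that do not come from the proposal itself: every contributed node is online independently with the same probability, and each replica is a full copy. The function names `data_availability` and `min_replicas` are hypothetical.

```python
def data_availability(node_availability, replicas):
    """Probability that at least one replica is online.

    Assumes (for illustration only) that nodes are online independently,
    each with probability `node_availability`.
    """
    return 1.0 - (1.0 - node_availability) ** replicas


def min_replicas(node_availability, target, max_replicas=1000):
    """Smallest replica count whose estimated availability reaches `target`."""
    if not 0.0 < node_availability <= 1.0:
        raise ValueError("node_availability must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    for k in range(1, max_replicas + 1):
        if data_availability(node_availability, k) >= target:
            return k
    raise ValueError("target not reachable within max_replicas")


if __name__ == "__main__":
    # Hypothetical scenario: home computers that are online 30% of the time.
    for target in (0.9, 0.99, 0.999):
        k = min_replicas(0.3, target)
        print(f"target {target}: {k} replicas")
```

Under these assumptions, each additional "nine" of availability costs several more full copies on low-availability nodes, which is why allocation and availability prediction matter when storage and bandwidth are scarce.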

We are looking for PhD candidates interested in large-scale distributed systems applied to community-owned systems at the edge, in fields such as (a) optimal allocation of data and services to resources, (b) availability prediction, (c) efficient usage of resources, or (d) privacy and security.

 
Dr Joan Manuel Marquès

ICSO

Migration of Parallel Applications to Cloud Computing architectures

Scientific parallel applications usually require substantial computing resources to solve complex problems. Traditionally, applications of this kind have been run in cluster or supercomputing environments.

With the advent of cloud computing, a new and attractive platform has emerged for executing scientific parallel applications that require High Performance Computing (HPC). It offers a scalable, elastic, practical, and low-cost way to satisfy the computational and storage demands of many such applications.

Migrating HPC parallel applications to cloud environments brings several advantages, but because of the complex interaction between the parallel applications and the virtual layer, many applications may suffer performance inefficiencies when they scale. This problem is particularly serious when the application is executed many times over a long period.

To use these virtual systems efficiently with a large number of cores, it is important to know an application's performance behavior on the system before executing it. This information matters because the ideal number of processes and resources required to run the application may vary from one system to another, owing to differences in the hardware virtualization architecture. Moreover, it is known that using more resources does not always yield higher performance. Without this information, the cores may be used inefficiently, which causes problems such as failing to achieve the expected speedup and higher energy and economic costs.

In this research line we study different approaches to developing novel methodologies and automatic performance analysis tools that analyze and predict application behavior on a specific cloud platform. In this way, users obtain valuable information for executing the parallel application efficiently on a target virtual cloud architecture (e.g. selecting the right number of cloud resources, tuning the virtual parameters, etc.). Moreover, the developed tools provide information for detecting possible inefficiencies that may become bottlenecks in the specific system.
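The point that more resources do not always mean higher performance can be made concrete with a small sketch. The model below is an assumption chosen for illustration, not the methodology of this research line: the run time is a fixed serial part, plus a parallel part divided evenly among processes, plus an overhead that grows linearly with the number of processes (standing in for communication and virtualization costs). All names and default values are hypothetical.

```python
def predicted_time(n_procs, serial_s=10.0, parallel_s=990.0, overhead_per_proc_s=0.5):
    """Toy run-time model in seconds (illustrative assumption, not measured data).

    serial part + evenly divided parallel part + overhead per extra process.
    """
    return serial_s + parallel_s / n_procs + overhead_per_proc_s * (n_procs - 1)


def speedup(n_procs, **model):
    """Speedup relative to a single-process run under the same model."""
    return predicted_time(1, **model) / predicted_time(n_procs, **model)


def efficiency(n_procs, **model):
    """Fraction of the ideal linear speedup actually obtained."""
    return speedup(n_procs, **model) / n_procs


def best_process_count(candidates, **model):
    """Candidate process count with the lowest predicted run time."""
    return min(candidates, key=lambda n: predicted_time(n, **model))


if __name__ == "__main__":
    counts = [1, 2, 4, 8, 16, 32, 64, 128, 256]
    for n in counts:
        print(f"{n:4d} procs  time={predicted_time(n):8.2f}s  "
              f"speedup={speedup(n):6.2f}  efficiency={efficiency(n):5.2f}")
    print("best candidate:", best_process_count(counts))
```

With these made-up parameters the predicted time stops improving well before the largest process count, and efficiency falls steadily, so the extra cores add cost without adding performance. On a real cloud platform the model and its parameters would have to come from measurement or from a prediction tool.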

 

Dr Josep Jorba

Dr Javier Panadero

WINE

ICSO