Publications
A full list of my publications. Also available on Google Scholar.
Journal Article

A survey on the Distributed Computing stack

Cristian Ramon-Cortes, Pol Alvarez, Francesc Lordan, Javier Alvarez, Jorge Ejarque, Rosa M. Badia

COSREV — Computer Science Review  •  2021

Abstract

In this paper, we review the background and the state of the art of the Distributed Computing software stack. We aim to provide the readers with a comprehensive overview of this area by supplying a detailed big-picture of the latest technologies. First, we introduce the general background of Distributed Computing and propose a layered top-bottom classification of the latest available software. Next, we focus on each abstraction layer, i.e. Application Development (including Task-based Workflows, Dataflows, and Graph Processing), Platform (including Data Sharing and Resource Management), Communication (including Remote Invocation, Message Passing, and Message Queuing), and Infrastructure (including Batch and Interactive systems). For each layer, we give a general background, discuss its technical challenges, review the latest programming languages, programming models, frameworks, libraries, and tools, and provide a summary table comparing the features of each alternative. Finally, we conclude this survey with a discussion of open problems and future directions.

Journal Article Featured

The impact of non-additive genetic associations on age-related complex diseases

Marta Guindo-Martinez, Ramon Amela, Silvia Bonas-Guarch, Montserrat Puiggros, Cecilia Salvoro, Irene Miguel-Escalada, Caitlin E. Carey, Joanne B. Cole, Sina Rueger, Elizabeth Atkinson, Aaron Leong, Friman Sanchez, Cristian Ramon-Cortes, Jorge Ejarque, Duncan S. Palmer, Mitja Kurki, Krishna Aragam, Jose C. Florez, Rosa M. Badia, Josep M. Mercader, David Torrents

NatureComm — Nature Communications  •  2021

Abstract

Genome-wide association studies (GWAS) are not fully comprehensive, as current strategies typically test only the additive model, exclude the X chromosome, and use only one reference panel for genotype imputation. We implement an extensive GWAS strategy, GUIDANCE, which improves genotype imputation by using multiple reference panels and includes the analysis of the X chromosome and non-additive models to test for association. We apply this methodology to 62,281 subjects across 22 age-related diseases and identify 94 genome-wide associated loci, including 26 previously unreported. Moreover, we observe that 27.7% of the 94 loci are missed if we use standard imputation strategies with a single reference panel, such as HRC, and only test the additive model. Among the new findings, we identify three novel low-frequency recessive variants with odds ratios larger than 4, which need at least a three-fold larger sample size to be detected under the additive model. This study highlights the benefits of applying innovative strategies to better uncover the genetic architecture of complex diseases.

Preprint

Enabling the execution of large scale workflows for molecular dynamics simulations

Pau Andrio, Adam Hospital, Cristian Ramon-Cortes, Javier Conejero, Daniele Lezzi, Jorge Ejarque, Josep LL. Gelpi, Rosa M. Badia

BioRxiv  •  2021

Abstract

The usage of workflows has led to progress in many fields of science, where the need to process large amounts of data is coupled with difficulty in accessing and efficiently using High Performance Computing platforms. On the one hand, scientists are focused on their problem and concerned with how to process their data. On top of that, the applications typically have different parts and use different tools for each part, thus complicating the distribution and the reproducibility of the simulations. On the other hand, computer scientists concentrate on how to develop frameworks for the deployment of workflows on HPC or HTC resources; often providing separate solutions for the computational aspects and the data analytic ones. In this paper we present an approach to support biomolecular researchers in the development of complex workflows that i) allow them to compose pipelines of individual simulations built from different tools and interconnected by data dependencies, ii) run them seamlessly on different computational platforms, and iii) scale them up to the large number of cores provided by modern supercomputing infrastructures. Our approach is based on the orchestration of computational building blocks for Molecular Dynamics simulations through an efficient workflow management system that has already been adopted in many scientific fields to run applications on multitudes of computing backends. Results demonstrate the validity of the proposed solution through the execution of massively parallel runs in a supercomputer facility.

Thesis Featured

PhD Thesis: Programming Models to support Data Science workflows

Cristian Ramon-Cortes

UPC Commons  •  2020

Abstract

In the recent joint venture between High-Performance Computing (HPC) and Big-Data (BD) Ecosystems towards Exascale Computing, the scientific community has realized that powerful programming models and high-level abstraction tools are a must. Within this context, the Barcelona Supercomputing Center (BSC) is developing the COMP Superscalar (COMPSs) programming model, whose main objective is to let users develop applications in a sequential way, while the Runtime System handles the inherent parallelism of the application and abstracts the programmer from the different underlying infrastructures. The parallelism is achieved by defining an application Interface that allows COMPSs to detect methods that operate on a set of parameters (called tasks) and execute them in a distributed and transparent way. This PhD Thesis aims to enhance COMPSs, adapting it to the needs of Big-Data Ecosystems, by supporting Analytic and HPC workflows, introducing automatic parallelisation, and enabling hybrid task-based and dataflow workflows.

Journal Article Featured

A Programming Model for Hybrid Workflows: combining Task-based Workflows and Dataflows all-in-one

Cristian Ramon-Cortes, Francesc Lordan, Jorge Ejarque, Rosa M. Badia

FGCS — Future Generation Computer Systems  •  2020

Abstract

In the past years, e-Science applications have evolved from large-scale simulations executed in a single cluster to more complex workflows where these simulations are combined with Artificial Intelligence (AI) and High-Performance Data Analytics (HPDA). To implement these workflows, developers currently use different patterns, mainly task-based and dataflow. However, since these patterns are usually managed by isolated frameworks, implementing such applications requires combining them, considerably increasing the effort of learning, deploying, and integrating applications in the different frameworks. This paper aims to reduce this effort by proposing a way to extend task-based management systems to support continuous input and output data, enabling the combination of task-based workflows and dataflows (Hybrid Workflows from now on) under a single programming model. Hence, developers can build complex Data Science workflows with different approaches depending on the requirements of each step. To illustrate the capabilities of Hybrid Workflows, we have built a Distributed Stream Library and a fully functional prototype extending COMPSs, a mature, general-purpose, task-based, parallel programming model. The library can be easily integrated with existing task-based frameworks to provide support for dataflows. Also, it provides a homogeneous, generic, and simple representation of object and file streams in both Python and Java, enabling complex workflows to handle any data without dealing directly with the streaming back-end. During the evaluation, we introduce four use cases to illustrate the new capabilities of Hybrid Workflows and measure the performance benefits when processing data continuously as it is generated, when removing synchronisation points, and when scaling the number of writers and readers. Furthermore, we conduct an in-depth analysis of the task analysis, task scheduling, and task execution times when using objects or streams.

Preprint

The impact of non-additive genetic associations on age-related complex diseases (preprint)

Marta Guindo-Martinez, Ramon Amela, Silvia Bonas-Guarch, Montserrat Puiggros, Cecilia Salvoro, Irene Miguel-Escalada, Caitlin E. Carey, Joanne B. Cole, Sina Rueger, Elizabeth Atkinson, Aaron Leong, Friman Sanchez, Cristian Ramon-Cortes, Jorge Ejarque, Duncan S. Palmer, Mitja Kurki, Krishna Aragam, Jose C. Florez, Rosa M. Badia, Josep M. Mercader, David Torrents

BioRxiv  •  2020

Abstract

Genome-wide association studies (GWAS) are not fully comprehensive as current strategies typically test only the additive model, exclude the X chromosome, and use only one reference panel for genotype imputation. We implemented an extensive GWAS strategy, GUIDANCE, which improves genotype imputation by using multiple reference panels, includes the analysis of the X chromosome and non-additive models to test for association. We applied this methodology to 62,281 subjects across 22 age-related diseases and identified 94 genome-wide associated loci, including 26 previously unreported. We observed that 27.6% of the 94 loci would be missed if we only used standard imputation strategies and only tested the additive model. Among the new findings, we identified three novel low-frequency recessive variants with odds ratios larger than 4, which would need at least a three-fold larger sample size to be detected under the additive model. This study highlights the benefits of applying innovative strategies to better uncover the genetic architecture of complex diseases.

Journal Article

AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python

Cristian Ramon-Cortes, Ramon Amela, Jorge Ejarque, Philippe Clauss, Rosa M. Badia

IJHPCA — International Journal of High-Performance Computing Applications  •  2020

Abstract

The latest improvements in programming languages and models have focused on simplicity and abstraction, leading Python to the top of the list of programming languages. However, there is still room for improvement in preventing users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module to automatically find an appropriate task-based parallelisation of affine loop nests and execute them in parallel in a distributed computing infrastructure. This parallelisation can also include the building of data blocks to increase task granularity in order to achieve good execution performance. Moreover, AutoParallel is based on sequential programming and only requires a small annotation in the form of a Python decorator, so that anyone with little programming skill can scale up an application to hundreds of cores.

Conference Paper

AutoParallel: A Python module for automatic parallelization and distributed execution of affine loop nests

Cristian Ramon-Cortes, Ramon Amela, Jorge Ejarque, Philippe Clauss, Rosa M. Badia

PyHPC 2018 — SC18 Workshop: Python for High-Performance and Scientific Computing  •  2018

Abstract

The latest improvements in programming languages, programming models, and frameworks have focused on abstracting the users from many programming issues. Among others, recent programming frameworks include simpler syntax, automatic memory management and garbage collection, simplified code re-use through library packages, and easily configurable tools for deployment. For instance, Python has risen to the top of the list of programming languages due to the simplicity of its syntax, while still achieving good performance even being an interpreted language. Moreover, the community has helped to develop a large number of libraries and modules, tuning them to obtain great performance. However, there is still room for improvement in preventing users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module to automatically find an appropriate task-based parallelization of affine loop nests and execute them in parallel in a distributed computing infrastructure. This parallelization can also include the building of data blocks to increase task granularity in order to achieve good execution performance. Moreover, AutoParallel is based on sequential programming and only requires a small annotation in the form of a Python decorator, so that anyone with little programming skill can scale up an application to hundreds of cores.

Conference Paper

Boosting Atmospheric Dust Forecast with PyCOMPSs

Javier Conejero, Cristian Ramon-Cortes, Kim Serradell, Rosa M. Badia

eScience 2018 — IEEE eScience 2018  •  2018

Abstract

This paper describes the success story of the adaptation of the NMMB-MONARCH online multi-scale atmospheric dust model to PyCOMPSs in order to exploit its inherent parallelism with the minimal developer effort. The paper also includes an evaluation of this implementation in the Nord3 supercomputer, a scalability analysis and an in-depth behaviour study. The main results presented in this paper are: (1) PyCOMPSs is able to extract the parallelism from the NMMB-MONARCH application; (2) it is able to improve the dust forecasting in terms of performance when compared with previous versions, and (3) PyCOMPSs is able to interact and share the resources with MPI applications when included in the workflow as tasks. Finally, we present the keys for exporting the knowledge of this experience to other applications in order to benefit from using PyCOMPSs.

Journal Article

Executing linear algebra kernels in heterogeneous distributed infrastructures with PyCOMPSs

Ramon Amela, Cristian Ramon-Cortes, Jorge Ejarque, Javier Conejero, Rosa M. Badia

OGST 2018 — Oil & Gas Science and Technology - Revue d'IFP Energies nouvelles  •  2018

Abstract

Python is a popular programming language due to the simplicity of its syntax, while still achieving good performance even being an interpreted language. Its adoption by multiple scientific communities has resulted in the emergence of a large number of libraries and modules, which has helped put Python at the top of the list of programming languages. Task-based programming has been proposed in recent years as an alternative parallel programming model. PyCOMPSs follows such an approach for Python, and this paper presents its extensions to combine task-based parallelism and thread-level parallelism. We also present how PyCOMPSs has been adapted to support heterogeneous architectures, including Xeon Phi and GPUs. Results obtained with linear algebra benchmarks demonstrate that significant performance can be obtained with a few lines of Python.

Journal Article

Transparent Orchestration of Task-based Parallel Applications in Containers Platforms

Cristian Ramon-Cortes, Albert Serven, Jorge Ejarque, Daniele Lezzi, Rosa M. Badia

JoGC — Journal of Grid Computing  •  2017

Abstract

This paper presents a framework to easily build and execute parallel applications in container-based distributed computing platforms in a user-transparent way. The proposed framework is a combination of the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential codes, and containers management platforms that ease the deployment of applications in computing environments (as Docker, Mesos or Singularity). This framework provides scientists and developers with an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype which integrates COMPSs with different containers engines in different scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead in the building phase, deployment and execution of two benchmark applications compared to a Cloud testbed based on KVM and OpenStack and to the usage of bare metal nodes. We have observed an important gain in comparison to cloud environments during the building and deployment phases. This enables better adaptation of resources with respect to the computational load. In contrast, we detected an extra overhead during the execution, which is mainly due to the multi-host Docker networking.

Conference Paper

Enabling Python to execute efficiently in heterogeneous distributed infrastructures with PyCOMPSs

Ramon Amela, Cristian Ramon-Cortes, Jorge Ejarque, Javier Conejero, Rosa M. Badia

SC17 ICHPC — SC17 The International Conference for High Performance Computing, Networking, Storage and Analysis  •  2017

Abstract

Python has been adopted as a programming language by a large number of scientific communities. In addition to its easy programming interface, the large number of libraries and modules made available by many contributors have taken this language to the top of the list of the most popular programming languages in scientific applications. However, one main drawback of Python is its lack of support for concurrency and parallelism. PyCOMPSs is a proven approach to supporting task-based parallelism in Python that enables applications to be executed in parallel on distributed computing platforms. This paper presents PyCOMPSs and how it has been tailored to execute tasks in heterogeneous and multi-threaded environments. We present an approach to combine the task-level parallelism provided by PyCOMPSs with the thread-level parallelism provided by MKL. Performance and behavioural results on distributed heterogeneous computing clusters show the benefits and capabilities of PyCOMPSs in both HPC and Big Data infrastructures.

Thesis Featured

Master Thesis: Enabling Analytic and HPC Workflows with COMPSs

Cristian Ramon-Cortes

UPC Commons  •  2017

Abstract

In the recent joint venture between High-Performance Computing (HPC) and Big-Data (BD) Ecosystems towards Exascale Computing, the scientific community has realized that powerful programming models and high-level abstraction tools are a must. Within this context, the Barcelona Supercomputing Center (BSC) is developing the COMP Superscalar (COMPSs) programming model, whose main objective is to let users develop applications in a sequential way, while the Runtime System handles the inherent parallelism of the application and abstracts the programmer from the different underlying infrastructures. The parallelism is achieved by defining an application Interface that allows COMPSs to detect methods that operate on a set of parameters (called tasks) and execute them in a distributed and transparent way. This Master Thesis aims to enhance COMPSs, adapting it to the needs of Big-Data Ecosystems, by supporting Analytic and HPC workflows. To this end, we propose a straightforward integration with the execution of binaries and of MPI and OmpSs applications. Although the COMPSs programming model is kept untouched, we extend the COMPSs Annotations and some of the COMPSs internals, such as the task schedulers and the worker executors. To support our contribution, we have ported two real use cases to COMPSs: on the one hand, NMMB BSC-Dust, a workflow to predict the atmospheric life cycle of desert dust and, on the other hand, Guidance, an integrated solution for Genome and Phenome association analysis.

Conference Paper

Transparent Execution of Task-Based Parallel Applications in Docker with COMP Superscalar

Victor Anton, Cristian Ramon-Cortes, Jorge Ejarque, Rosa M. Badia

PDP 2017 — 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)  •  2017

Abstract

This paper presents a framework to easily build and execute parallel applications in container-based distributed computing platforms in a user-transparent way. The proposed framework is a combination of COMP Superscalar and Docker. We have built a prototype in order to evaluate how it performs, measuring the overhead in the building, deployment, and execution phases. We have observed an important gain compared with cloud environments during the building and deployment phases. In contrast, we have detected an extra overhead during the execution, which is mainly due to the multi-host Docker networking.

Journal Article Featured

COMP Superscalar, an interoperable programming framework

Rosa M. Badia, Jorge Ejarque, Daniele Lezzi, Raul Sirvent, Francesc Lordan, Cristian Ramon-Cortes, Javier Conejero, Carlos Diaz

SoftwareX  •  2015

Abstract

COMPSs is a programming framework that aims to facilitate the parallelization of existing applications written in Java, C/C++ and Python scripts. For that purpose, it offers a simple programming model based on sequential development in which the user is mainly responsible for (i) identifying the functions to be executed as asynchronous parallel tasks and (ii) annotating them (e.g., with Java annotations or standard Python decorators). A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and spawning these tasks to the available resources, which can be nodes in a cluster, clouds or grids. In cloud environments, COMPSs provides scalability and elasticity features allowing the dynamic provision of resources.

Thesis Featured

Design, implementation, and integration of a hand for a Darwin-OP robot

Cristian Ramon-Cortes

UPC Commons  •  2014

Abstract

This project enhances the Darwin-OP robot with the capacity to handle small objects through the design, implementation, and integration of an anthropomorphic hand (forearm, wrist, hand, and fingers). The final design must be fast-prototyped and low cost. First, we introduce the robotic context, focusing on the development of robotic arms and small humanoids. This section also includes a comparison between the robot used and its closest competitor: Nao. Next, we detail the design process of the hand using 3D CAD software (SolidWorks). This section includes the mechanical design of the final prototype so that any user can build it. We also describe the implementation process using an ABS 3D printer, detailing its advantages and disadvantages and highlighting the construction simplicity. Furthermore, we describe the hardware and software integration of the prototype with the Darwin-OP robot. This section introduces the Robotis Dynamixel XL-320 servo-motors and details their physical installation and the device code so that the Darwin-OP can operate them. The evaluation includes the tests performed to verify the correct behaviour of the prototype. Finally, we highlight that the prototype meets all the requirements: it enhances the Darwin-OP robot with the capacity to handle small objects, uses fast-prototyping, and is low cost.