Compiler-assisted checkpointing of message-passing applications in heterogeneous environments

  1. Rodríguez, Gabriel
Supervised by:
  1. María J. Martín Supervisor
  2. Patricia González Supervisor

Defense university: Universidade da Coruña

Defense date: 16 December 2008

Committee:
  1. Emilio Luque Fadón Chair
  2. Francisco Fernández Rivera Secretary
  3. Ignacio Martín Llorente Member
  4. Dolores Isabel Rexachs del Rosario Member
  5. Andrés Gómez Tato Member

Type: Doctoral thesis

Teseo: 178223

Abstract

With the evolution of high performance computing towards heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart needs. Whether due to a failure in the execution or to a migration of the processes to different machines, checkpointing tools must be able to operate in heterogeneous environments. However, some of the data manipulated by a parallel application are not truly portable. Examples include opaque state (e.g. data structures for communications support) and diverse interfaces for a single feature (e.g. communications, I/O). Directly manipulating the underlying ad hoc representations renders checkpointing tools incapable of working across different environments. Portable checkpointers usually work around portability issues at the cost of transparency: the user must provide information such as what data needs to be stored, where to store it, or where to checkpoint. CPPC (ComPiler for Portable Checkpointing) is a checkpointing tool designed to feature both portability and transparency, while preserving the scalability of the executed applications. It is made up of a library and a compiler. The CPPC library contains routines for variable-level checkpointing, using portable code and protocols. The CPPC compiler achieves transparency by relieving the user of time-consuming tasks, such as performing code analyses and adding instrumentation code.
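
To make the idea of variable-level, portable checkpointing more concrete, the following is a minimal sketch in Python. It is not CPPC's API or file format: the function names `save_checkpoint` and `load_checkpoint`, the type tags, and the byte layout are all assumptions made for illustration. The point it illustrates is that named variables are stored with an explicit, machine-independent encoding (here, fixed little-endian widths) rather than as a raw memory image, so the file can be read back on a different architecture.

```python
import io
import struct

# Hypothetical type tags mapped to fixed-width, little-endian encodings.
# This layout is an illustrative assumption, not CPPC's actual format.
_FORMATS = {"i64": "<q", "f64": "<d"}


def _write_u32(stream, value):
    """Write an unsigned 32-bit length field in little-endian order."""
    stream.write(struct.pack("<I", value))


def _read_u32(stream):
    """Read an unsigned 32-bit length field in little-endian order."""
    return struct.unpack("<I", stream.read(4))[0]


def save_checkpoint(stream, variables):
    """Store {name: (type_tag, values)} using explicit encodings only."""
    _write_u32(stream, len(variables))
    for name, (tag, values) in variables.items():
        name_bytes = name.encode("utf-8")
        tag_bytes = tag.encode("ascii")
        _write_u32(stream, len(name_bytes))
        stream.write(name_bytes)
        _write_u32(stream, len(tag_bytes))
        stream.write(tag_bytes)
        _write_u32(stream, len(values))
        for value in values:
            stream.write(struct.pack(_FORMATS[tag], value))


def load_checkpoint(stream):
    """Rebuild the variable table written by save_checkpoint."""
    variables = {}
    for _ in range(_read_u32(stream)):
        name = stream.read(_read_u32(stream)).decode("utf-8")
        tag = stream.read(_read_u32(stream)).decode("ascii")
        count = _read_u32(stream)
        fmt = _FORMATS[tag]
        size = struct.calcsize(fmt)
        values = [struct.unpack(fmt, stream.read(size))[0] for _ in range(count)]
        variables[name] = (tag, values)
    return variables


# Usage: checkpoint two application variables and restore them.
state = {"iteration": ("i64", [42]), "residuals": ("f64", [0.5, 0.25])}
buffer = io.BytesIO()
save_checkpoint(buffer, state)
buffer.seek(0)
restored = load_checkpoint(buffer)
```

In this sketch the caller lists the variables by hand, which is exactly the burden the abstract says portable checkpointers usually place on the user; according to the abstract, the CPPC compiler is what removes that step by performing the code analyses and inserting the instrumentation automatically.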