Increasing parallelism through speculation in a task-based programming model
- GAYATRI, RAHULKUMAR
- Eduard Ayguadé Parra, Advisor
- Rosa M Badia, Co-advisor
Defending university: Universitat Politècnica de Catalunya (UPC)
Date of defense: 26 March 2015
- Manuel Prieto Matías, Committee chair
- Julita Corbalán González, Committee secretary
- Osman Unsal, Committee member
Type: Doctoral thesis
Abstract
Synchronization is an integral part of parallel programming, whether to protect updates to critical memory or to delay an instruction whose operands are still being produced by earlier instructions. Locks and barriers have traditionally been used to achieve this atomicity and instruction ordering in parallel programming models, but they can lead to problems such as deadlock and starvation, and they block threads from progressing. Speculation is an alternative mechanism for achieving synchronization: its optimistic approach avoids the pitfalls of locks and allows better exploitation of the parallelism available in an application. In this thesis we introduce speculation to synchronize the multiple threads of StarSs, a task-based programming model. We use TinySTM, a Software Transactional Memory (STM) library, to speculatively update memory locations shared between StarSs tasks. We later extend the concept of speculative memory updates to speculative task generation in StarSs.

The StarSs programming model employs an implicit form of parallelism in which an application is decomposed into multiple tasks of computation. The framework's runtime comprises a main thread, which generates tasks, and worker threads, which execute them. The main thread generates tasks and adds them to a data-dependency graph; the worker threads execute independent tasks from this graph. Where memory locations are shared between multiple tasks, the framework provides lock-based synchronization. The separation of task generation from task execution makes synchronization directives a necessity in order to maintain serializability, and to coordinate the threads the framework uses barrier and wait directives. In this thesis we replace these pessimistic approaches to synchronization in StarSs with optimistic, speculation-based implementations.
Constructs to update critical memory locations inside transactions have been introduced into the StarSs framework. If two or more transactions try to update the same memory locations, only one of them is allowed to commit successfully, which guarantees the progress of the application. Our analysis concludes that STM-based concurrency control performs better when contention on critical memory is low: we observe a direct correlation between an application's execution time and the number of conflicting transactions that are rolled back.

We later extend the concept of speculative memory updates to speculative task execution in StarSs. The barrier and wait directives in StarSs block the generation of work and lead to under-utilization of the available resources. To overcome this problem, we schedule tasks speculatively, using TinySTM to restore correctness when speculation fails. A construct to speculatively generate tasks (speculate) has been introduced into the StarSs framework. This directive can be used to speculatively generate tasks from loops; whether those tasks would appear in the sequential execution is evaluated later along the execution path. However, using an STM library for speculative task execution introduces significant overhead from its conflict-detection mechanisms, which StarSs does not need given its own dependency analysis. Hence we modify the implementation to recover from mis-speculation with data versions instead of transaction rollback. This lets us avoid TinySTM in this case, and with it the overhead of an external STM library. The approach yields a performance benefit of at least 1.2x over the regular StarSs applications, an important gain since the improvement is obtained on already parallelized code.