Arquitectura de Computadores y Automática
Department
Universitat Politècnica de Catalunya
Barcelona, Spain
Publications in collaboration with researchers from Universitat Politècnica de Catalunya (20)
2023
-
Fine-grain task-parallel algorithms for matrix factorizations and inversion on many-threaded CPUs
Concurrency and Computation: Practice and Experience
-
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures
Journal of Parallel and Distributed Computing, Vol. 175, pp. 51-65
2022
-
NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors
Proceedings - Symposium on Computer Architecture and High Performance Computing
2021
-
A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs
Proceedings of the 12th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2021
2020
-
Towards an auto-tuned and task-based SpMV (LASs Library)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
-
sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library)
Journal of Parallel and Distributed Computing, Vol. 138, pp. 153-171
2019
-
A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization with Partial Pivoting
IEEE Access, Vol. 7, pp. 17617-17633
-
Accelerating conjugate gradient using OmpSs
Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019
-
BLAS-3 Optimized by OmpSs Regions (LASs Library)
Proceedings - 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019
-
Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD
Numerical Algorithms, Vol. 80, No. 2, pp. 635-660
2018
-
Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures
Parallel Computing, Vol. 73, pp. 28-39
-
Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors
Journal of Computational Science, Vol. 25, pp. 140-151
-
Reuse Detector: Improving the Management of STT-RAM SLLCs
Computer Journal, Vol. 61, No. 6, pp. 856-880
-
Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors
Parallel Computing, Vol. 76, pp. 18-27
-
Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors
Parallel Computing, Vol. 78, pp. 85-100
2017
-
DSPONE48: A methodology for automatically synthesize HDL focus on the reuse of DSP slices
Journal of Parallel and Distributed Computing, Vol. 106, pp. 132-142
-
Reduction to tridiagonal form for symmetric eigenproblems on asymmetric multicore processors
Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017
-
Static versus dynamic task scheduling of the LU factorization on ARM big.LITTLE architectures
Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
2010
-
Extending OpenMP to survive the heterogeneous multi-core era
International Journal of Parallel Programming
2009
-
A proposal to extend the OpenMP tasking model for heterogeneous architectures
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)