Publications (45) by SANDRA CATALÁN PALLARÉS

2024

  1. Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors

    International Journal of High Performance Computing Applications, Vol. 38, No. 2, pp. 55-68

  2. Inference with Transformer Encoders on ARM and RISC-V Multicore Processors

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  3. Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures

    Journal of Systems Architecture, Vol. 153

2022

  1. NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors

    Proceedings - Symposium on Computer Architecture and High Performance Computing

  2. QR Factorization Using Malleable BLAS on Multicore Processors

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

2021

  1. A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs

    Proceedings of the 12th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2021

  2. Leveraging teaching on demand: Approaching HPC to undergrads

    Journal of Parallel and Distributed Computing, Vol. 156, pp. 148-162

  3. Scalable Hybrid Loop- And Task-Parallel Matrix Inversion for Multicore Processors

    2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021

2020

  1. Programming parallel dense matrix factorizations with look-ahead and OpenMP

    Cluster Computing, Vol. 23, No. 1, pp. 359-375

  2. Towards an auto-tuned and task-based SpMV (LASs Library)

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  3. sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library)

    Journal of Parallel and Distributed Computing, Vol. 138, pp. 153-171

2019

  1. A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization with Partial Pivoting

    IEEE Access, Vol. 7, pp. 17617-17633

  2. Accelerating conjugate gradient using OmpSs

    Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019

  3. BLAS-3 Optimized by OmpSs Regions (LASs Library)

    Proceedings - 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2019

  4. Dynamic look-ahead in the reduction to band form for the singular value decomposition

    Parallel Computing, Vol. 81, pp. 22-31

  5. Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD

    Numerical Algorithms, Vol. 80, No. 2, pp. 635-660

  6. Tasking in accelerators: Performance evaluation

    Proceedings - 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2019