FRANCISCO DANIEL
IGUAL PEÑA
Associate Professor (Profesor Titular de Universidad)
Publications (101)
2024
-
Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
ACM Transactions on Mathematical Software, Vol. 50, No. 1
-
Automatic generation of ARM NEON micro-kernels for matrix multiplication
Journal of Supercomputing
-
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors
International Journal of High Performance Computing Applications, Vol. 38, No. 2, pp. 55-68
2023
-
Algorithm 1033: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures
ACM Transactions on Mathematical Software, Vol. 49, No. 1
-
Automatic Generation of Micro-kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors
ACM International Conference Proceeding Series
-
Dynamic power budget redistribution under a power cap on multi-application environments
Sustainable Computing: Informatics and Systems, Vol. 38
-
Fine-grain task-parallel algorithms for matrix factorizations and inversion on many-threaded CPUs
Concurrency and Computation: Practice and Experience
-
Improving inference time in multi-TPU systems with profiled model segmentation
Proceedings - 2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2023
-
Micro-kernels for portable and efficient matrix multiplication in deep learning
Journal of Supercomputing, Vol. 79, No. 7, pp. 8124-8147
-
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures
Journal of Parallel and Distributed Computing, Vol. 175, pp. 51-65
2022
-
Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures
ACM Transactions on Mathematical Software, Vol. 48, No. 2
-
Anatomy of the BLIS Family of Algorithms for Matrix Multiplication
Proceedings - 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
-
Applying Game-Learning Environments to Power Capping Scenarios via Reinforcement Learning
Communications in Computer and Information Science
-
NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors
Proceedings - Symposium on Computer Architecture and High Performance Computing
-
QR Factorization Using Malleable BLAS on Multicore Processors
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2021
-
A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs
Proceedings of the 12th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2021
-
Low precision matrix multiplication for efficient deep learning in NVIDIA Carmel processors
Journal of Supercomputing, Vol. 77, No. 10, pp. 11257-11269
-
Scalable Hybrid Loop- and Task-Parallel Matrix Inversion for Multicore Processors
2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021
2020
-
Integration and exploitation of intra-routine malleability in BLIS
Journal of Supercomputing, Vol. 76, No. 4, pp. 2860-2875
-
Leveraging knowledge-as-a-service (KaaS) for QoS-aware resource management in multi-user video transcoding
Journal of Supercomputing, Vol. 76, No. 12, pp. 9388-9403