FRANCISCO DANIEL
IGUAL PEÑA
Associate Professor (Profesor Titular de Universidad)
Universitat Jaume I
Castelló de la Plana, Spain
Publications in collaboration with researchers from Universitat Jaume I (59)
2024
-
Automatic generation of ARM NEON micro-kernels for matrix multiplication
Journal of Supercomputing
2023
-
Algorithm 1033: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures
ACM Transactions on Mathematical Software, Vol. 49, No. 1
-
Automatic Generation of Micro-kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors
ACM International Conference Proceeding Series
2022
-
Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures
ACM Transactions on Mathematical Software, Vol. 48, No. 2
2020
-
Programming parallel dense matrix factorizations with look-ahead and OpenMP
Cluster Computing, Vol. 23, No. 1, pp. 359-375
2019
-
Accelerating the SRP-PHAT algorithm on multi- and many-core platforms using OpenCL
Journal of Supercomputing, Vol. 75, No. 3, pp. 1284-1297
-
Algorithm 994: Fast implementations of the Brouwer-Zimmermann algorithm for the computation of the minimum distance of a random linear code
ACM Transactions on Mathematical Software, Vol. 45, No. 2
-
Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance
IEEE Internet of Things Journal, Vol. 6, No. 3, pp. 5068-5079
2018
-
Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors
Journal of Computational Science, Vol. 25, pp. 140-151
-
Optimized Fundamental Signal Processing Operations for Energy Minimization on Heterogeneous Mobile Devices
IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 65, No. 5, pp. 1614-1627
2017
-
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization
Procedia Computer Science
-
Revisiting conventional task schedulers to exploit asymmetry in multi-core architectures for dense linear algebra operations
Parallel Computing, Vol. 68, pp. 59-76
-
Solving Weighted Least Squares (WLS) problems on ARM-based architectures
Journal of Supercomputing, Vol. 73, No. 1, pp. 530-542
-
Time and energy modeling of a high-performance multi-threaded Cholesky factorization
Journal of Supercomputing, Vol. 73, No. 1, pp. 139-151
2016
-
Analytical modeling is enough for high-performance BLIS
ACM Transactions on Mathematical Software, Vol. 43, No. 2
-
Architecture-aware configuration and scheduling of matrix multiplication on asymmetric multicore processors
Cluster Computing, Vol. 19, No. 3, pp. 1037-1051
-
Refactoring conventional task schedulers to exploit asymmetric ARM big.LITTLE architectures in dense linear algebra
Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
2015
-
Balancing task- and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi
Computers and Electrical Engineering, Vol. 46, pp. 95-111
-
Speeding up the log-polar transform with inexpensive parallel hardware: graphics units and multi-core architectures
Journal of Real-Time Image Processing, Vol. 10, No. 3, pp. 533-550
-
Time and energy modeling of high-performance Level-3 BLAS on x86 architectures
Simulation Modelling Practice and Theory, Vol. 55, pp. 77-94