Shengle Lin

High-performance OpenCL-based GEMM Optimization

OpenCL has become the favored framework for emerging heterogeneous devices and FPGAs, owing to its versatility and portability.

However, OpenCL-based math libraries still face challenges in fully leveraging device performance.

When deploying high-performance arithmetic applications on these devices, the most performance-critical routine is General Matrix-Matrix Multiplication (GEMM).

This study presents a meticulously optimized OpenCL GEMM kernel.

Categories:

Other
Image Processing
