High-performance OpenCL-based GEMM Optimization

Citation Author(s):
Shengle Lin, Hunan University
Submitted by:
Shengle Lin
Last updated:
Tue, 04/16/2024 - 04:44
DOI:
10.21227/0cxd-6706

Abstract 

    OpenCL has become the favored framework for emerging heterogeneous devices and FPGAs, owing to its versatility and portability.

    However, OpenCL-based math libraries still face challenges in fully leveraging device performance.

    When deploying high-performance numerical applications on these devices, the most performance-critical hotspot is General Matrix-Matrix Multiplication (GEMM).
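
    For reference, GEMM here refers to the standard BLAS-style update C ← α·A·B + β·C, for an M×K matrix A, a K×N matrix B, and an M×N matrix C.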

    This study presents a meticulously optimized OpenCL GEMM kernel.

    Our enhanced GEMM kernel emphasizes two key improvements: 1) a three-level double-buffer pipeline that efficiently overlaps data fetching with floating-point computation;

    2) a fine-grained private-memory prefetching strategy that increases device occupancy by improving register utilization. A simplified kernel sketch of both ideas is shown below.
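
    Since the kernel source itself is not included on this page, the listing below is only a minimal OpenCL C sketch of the general idea: ping-pong double buffering in local memory combined with a private-memory (register) accumulator. It assumes TS×TS tiles, row-major storage, matrix dimensions divisible by TS, and one output element per work-item; the kernel name and tile size are illustrative, and the authors' actual three-level pipeline and prefetching scheme are more elaborate.

/* Hypothetical illustration only -- not the dataset's actual kernel.  */
/* Launch with global size (N, M) and local size (TS, TS); M, N, K are */
/* assumed to be multiples of TS, and all matrices are row-major.      */
#define TS 16

__kernel void gemm_double_buffered(const int M, const int N, const int K,
                                   const float alpha, const float beta,
                                   __global const float* A,  /* M x K */
                                   __global const float* B,  /* K x N */
                                   __global float* C)        /* M x N */
{
    const int col = get_local_id(0);                 /* fastest-varying id */
    const int row = get_local_id(1);
    const int globalCol = TS * get_group_id(0) + col;
    const int globalRow = TS * get_group_id(1) + row;

    /* Two local-memory tile pairs: while one is consumed, the next is loaded. */
    __local float Asub[2][TS][TS];
    __local float Bsub[2][TS][TS];

    float acc = 0.0f;                                /* private (register) accumulator */

    const int numTiles = K / TS;
    int buf = 0;

    /* Preload the first tile pair into buffer 0. */
    Asub[0][row][col] = A[globalRow * K + col];
    Bsub[0][row][col] = B[row * N + globalCol];

    for (int t = 0; t < numTiles; ++t) {
        /* One barrier per iteration: the current tile is now visible to all
           work-items, and everyone has finished reading the buffer that the
           next prefetch will overwrite. */
        barrier(CLK_LOCAL_MEM_FENCE);

        /* Prefetch the next tile pair into the other buffer while computing. */
        const int next = 1 - buf;
        if (t + 1 < numTiles) {
            const int k0 = TS * (t + 1);
            Asub[next][row][col] = A[globalRow * K + k0 + col];
            Bsub[next][row][col] = B[(k0 + row) * N + globalCol];
        }

        /* Multiply-accumulate over the current tile. */
        for (int k = 0; k < TS; ++k)
            acc += Asub[buf][row][k] * Bsub[buf][k][col];

        buf = next;
    }

    C[globalRow * N + globalCol] = alpha * acc + beta * C[globalRow * N + globalCol];
}

    In this sketch a single barrier per iteration both publishes the freshly prefetched tile and guarantees that every work-item has finished reading the tile about to be overwritten, so the global loads for tile t+1 can be issued while tile t is being consumed.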

    Furthermore, this work presents a Bayesian Optimization (BO) tuner for kernel auto-tuning.

    Experimental results demonstrate considerable performance gains across diverse OpenCL devices.

    Additionally, the BO tuner demonstrates superior efficiency and robustness, outperforming contemporary tuning methods.

Instructions: 

The results and figures are presented in the accompanying manuscript.