A word-level Wi-Fi CSI based Deep Bangladeshi Sign Language Dataset (WiBaSL)

- Citation Author(s):
- Mahmud Wasif Nafee, Mahian Kabir Joarder, Mahmudul Mohtasim, Tahsina Farah Sanam
- Submitted by:
- Mahmud Wasif Nafee
- DOI:
- 10.21227/1v4b-7290
Abstract
WiFi-based human sensing has shown remarkable potential for detecting sign language gestures in a non-intrusive manner. However, most previous work focuses on American Sign Language, ignoring other widely used languages such as Bangla Sign Language. Collections of sign language gestures for Activities of Daily Living (ADL), which are needed for instructing children with Autism Spectrum Disorder (ASD), are also lacking. To bridge this gap, we introduce WiBaSL, a WiFi-based word-level deep Bangladeshi Sign Language recognition dataset containing WiFi Channel State Information (CSI) measurements of hand gestures for 24 Bangladeshi Sign Language words necessary for ADL. This article presents the WiBaSL data acquisition process, covering the collection protocol, experimental setup, and validation strategy. CSI signals for the 24 gestures were recorded under controlled conditions and preprocessed using standard filtering techniques. Dynamic Time Warping (DTW) was applied to assess consistency across samples. The dataset is intended to support future research in device-free, WiFi-based sign language recognition.
Instructions:
# WiBaSL Dataset Overview
The data records cited in the **WiBaSL** article are stored [here](https://github.com/cpi-lab-buet/WiBaSL/tree/main/WiBaSL/data). The dataset is organized in a main folder, which is subdivided into individual folders corresponding to each volunteer.
Table 1 in the paper provides relevant details about the participating volunteers.
---
## Folder Structure
Each volunteer folder contains **480 CSI recordings**, comprising:
- **20 samples** for each of the **24 Bangladeshi Sign Language (BaSL)** signs.
Each recording is stored as a `.dat` file.
The filename encodes:
- The **volunteer serial**
- The **corresponding activity type**
This allows for easy identification of each sample.
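
As a quick integrity check of the layout described above, the sketch below counts the `.dat` files under each volunteer folder and reports any folder that does not hold the expected 480 recordings (24 signs × 20 samples). The function names `count_recordings` and `incomplete_folders` and the example root path are illustrative assumptions. The sketch does not rely on the exact filename pattern, which is not spelled out here.

```python
from pathlib import Path

SIGNS = 24              # BaSL signs recorded per volunteer
SAMPLES_PER_SIGN = 20   # recordings per sign
EXPECTED = SIGNS * SAMPLES_PER_SIGN  # 480 recordings per volunteer folder


def count_recordings(root):
    """Return {volunteer_folder_name: number of .dat files found under it}."""
    root = Path(root)
    counts = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob also finds recordings placed in nested subfolders, if any.
        counts[folder.name] = sum(1 for f in folder.rglob("*.dat") if f.is_file())
    return counts


def incomplete_folders(counts, expected=EXPECTED):
    """Return the folders whose recording count differs from the expected total."""
    return {name: n for name, n in counts.items() if n != expected}
```

For example, `incomplete_folders(count_recordings("WiBaSL/data"))` should return an empty dictionary if every volunteer folder in a local copy is complete (the path is a placeholder for wherever the data was downloaded).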
---
## CSI File Information
- Collected using: **Intel 5300 CSI Tool**
- Parsing: Compatible with **CSI-Kit utilities** [[CSI-Kit Reference](https://github.com/gforbes/csi-kit)].
### Extractable Metadata:
- Chipset used
- Backend type
- Channel bandwidth
- Antenna configuration
- Frame count
- Subcarrier count
- Recording duration
- Average sampling rate
- Average RSSI
- CSI tensor shape
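
A minimal sketch of reading one recording, assuming the `CSIKit` Python package is installed. It uses CSIKit's `get_reader` and `csitools.get_CSI` calls. The helper names `load_amplitude` and `average_sampling_rate` are illustrative and are not part of the dataset or of CSIKit. The second helper assumes per-frame timestamps are available in seconds.

```python
def load_amplitude(path):
    """Read one .dat recording.

    Returns a tuple (amplitude tensor, frame count, subcarrier count).
    """
    # Imported lazily so the helper below stays usable without CSIKit installed.
    from CSIKit.reader import get_reader
    from CSIKit.util import csitools

    reader = get_reader(path)                       # select a reader suited to the file
    csi_data = reader.read_file(path, scaled=True)  # parse the file with scaled CSI values
    return csitools.get_CSI(csi_data, metric="amplitude")


def average_sampling_rate(timestamps):
    """Average packets per second from per-frame timestamps given in seconds."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    duration = timestamps[-1] - timestamps[0]
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")
    # n timestamps delimit n - 1 inter-packet intervals.
    return (len(timestamps) - 1) / duration
```

Unpacking the result as `csi_matrix, n_frames, n_subcarriers = load_amplitude(path)` gives the tensor along with two of the metadata items listed above. For a recording captured at the nominal rate, `average_sampling_rate` should come out close to 20.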
---
## Data Format
Each `.dat` file contains:
- Approx. **10 seconds** of CSI data for a specific **hand gesture**
- Sampled at **20 packets per second**
- Structured in **4D format**: `(frames, subcarriers, transmit antennas, receive antennas)`
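
The figures above imply a nominal tensor size, which can serve as a sanity check after parsing. The sketch below works it out. The subcarrier count of 30 is a property of the Intel 5300 CSI Tool rather than something stated in this description, and the antenna counts are left as parameters because they are not given here. The function names are illustrative.

```python
INTEL_5300_SUBCARRIERS = 30  # the Intel 5300 CSI Tool reports 30 subcarrier groups


def expected_frames(duration_s=10.0, rate_hz=20.0):
    """Nominal number of CSI frames in one recording."""
    return round(duration_s * rate_hz)


def expected_shape(n_tx, n_rx, duration_s=10.0, rate_hz=20.0,
                   subcarriers=INTEL_5300_SUBCARRIERS):
    """Nominal tensor shape in the (frames, subcarriers, tx, rx) order given above."""
    return (expected_frames(duration_s, rate_hz), subcarriers, n_tx, n_rx)
```

With the stated 10 seconds at 20 packets per second, `expected_frames()` gives 200. Real recordings are only approximately 10 seconds long, so the actual frame count may differ slightly from this figure.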
---