Morse Code Symbol Classification


Citation Author(s):
Sourya Dey
University of Southern California
Submitted by:
Sourya Dey
Last updated:
Mon, 09/23/2019 - 17:18
DOI:
10.21227/wbhw-py68

Abstract 

We present two synthetic datasets for classifying Morse code symbols, intended for supervised machine learning and neural networks in particular. The linked GitHub page provides algorithms for generating a family of such datasets of varying difficulty. The datasets are spatially one-dimensional and have a small number of input features, so each feature carries a large share of the input information. This makes them particularly challenging when applying network complexity reduction methods. The linked research paper explores how network performance changes when various forms of noise are deliberately added and when the feature set and dataset size are expanded.

Instructions: 

First unzip the given file 'morse_datasets.zip' to get two datasets: 'baseline.npz' and 'difficult.npz'. These are two of a family of synthetic datasets that can be generated using the given script 'generate_morse_dataset.py'. For instructions on using the script, see its docstring and/or the linked GitHub page.

To load data from a dataset, first download 'load_data.txt' and change its extension to '.py'.

Then call 'load_data' with the argument 'filename' set to the path of the dataset, for example './baseline.npz'.

This returns 6 variables: xtr, ytr, xva, yva, xte, yte. These are the data (x) and labels (y) for the training (tr), validation (va) and test (te) splits. The labels (y) are one-hot encoded.
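
The following is a minimal sketch of inspecting an '.npz' archive with NumPy and converting one-hot labels to integer class indices, which some classifiers expect. It does not reproduce 'load_data'. The helper names 'describe_splits' and 'one_hot_to_labels' are hypothetical, and the array names and sizes in the stand-in file are assumptions, since the internal layout of 'baseline.npz' is not described here. Run 'describe_splits' on the real file to see what it actually contains.

```python
import os
import tempfile

import numpy as np


def one_hot_to_labels(y):
    """Convert an (n_samples, n_classes) one-hot array to integer class indices."""
    return np.argmax(y, axis=1)


def describe_splits(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Tiny stand-in archive so the sketch runs without the real download.
# The key names and sizes below are assumptions, not the real file layout.
rng = np.random.default_rng(0)
n_features, n_classes = 8, 4
demo = {}
for split, n_samples in (("tr", 6), ("va", 2), ("te", 2)):
    labels = rng.integers(0, n_classes, size=n_samples)
    demo["x" + split] = rng.random((n_samples, n_features))
    demo["y" + split] = np.eye(n_classes)[labels]  # one-hot rows

demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(demo_path, **demo)

shapes = describe_splits(demo_path)
with np.load(demo_path) as archive:
    ytr_int = one_hot_to_labels(archive["ytr"])

print(shapes)
print(ytr_int)
```

To try it on the real data, replace 'demo_path' with the path to 'baseline.npz' or 'difficult.npz'.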

Then you can run your favorite machine learning / classification algorithm on the data.
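
As one possible starting point, the sketch below trains a softmax (multinomial logistic) regression with plain NumPy on data shaped like the splits above, with one-hot labels. This is a simple linear baseline chosen for illustration, not the neural network studied in the linked paper. The function names, the hyperparameters, and the synthetic stand-in data are all assumptions; substitute the arrays returned by 'load_data' to use the real datasets.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_softmax(x, y_onehot, lr=0.05, epochs=300):
    """Fit softmax regression by full-batch gradient descent (illustrative settings)."""
    n_samples, n_features = x.shape
    n_classes = y_onehot.shape[1]
    w = np.zeros((n_features, n_classes))
    b = np.zeros(n_classes)
    for _ in range(epochs):
        p = softmax(x @ w + b)
        # Gradient of the mean cross-entropy loss with respect to the logits.
        grad = (p - y_onehot) / n_samples
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b


def accuracy(x, y_onehot, w, b):
    """Fraction of samples whose predicted class matches the one-hot label."""
    pred = np.argmax(x @ w + b, axis=1)
    return float(np.mean(pred == np.argmax(y_onehot, axis=1)))


# Synthetic stand-in data (an assumption): three noisy clusters in five dimensions.
rng = np.random.default_rng(0)
n_classes, n_features = 3, 5
centers = rng.normal(scale=3.0, size=(n_classes, n_features))
labels = rng.integers(0, n_classes, size=300)
x_demo = centers[labels] + rng.normal(size=(300, n_features))
y_demo = np.eye(n_classes)[labels]

# First 200 samples play the role of (xtr, ytr), the rest of (xte, yte).
w, b = train_softmax(x_demo[:200], y_demo[:200])
test_acc = accuracy(x_demo[200:], y_demo[200:], w, b)
print("test accuracy:", test_acc)
```

With the real data, the validation split (xva, yva) would be the natural place to tune settings such as the learning rate before reporting accuracy on the test split.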