The training dataset for accelerated machine learning algorithms

- Submitted by: Qi Yang
- DOI: 10.21227/fbpn-s032
Abstract
Each document is stored in text format as a 128-dimensional vector, where each dimension is represented as a single-precision floating-point number.
Instructions:
The training dataset was randomly generated for accelerated machine learning algorithms in which the computing-intensive tasks are offloaded to FPGA accelerators. Each document is stored in text format as a 128-dimensional vector, where each dimension is represented as a single-precision floating-point number, so the dataset can easily be scaled up to hundreds of GB or more. Cosine distance is used to measure vector similarity.
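The following Python sketch illustrates how such records might be generated, serialized, and compared. It is not the authors' generator. The one-vector-per-line, space-separated layout, the value range of [-1, 1], and all function names (`random_vector`, `format_vector`, `parse_vector`, `cosine_distance`) are assumptions made for illustration, since the page does not specify them. Cosine distance is taken here as 1 minus the cosine similarity.

```python
import math
import random
import struct

DIM = 128  # vector length stated in the dataset description


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def random_vector(rng, dim=DIM):
    """Draw one random vector; the [-1, 1] range is an assumption."""
    return [to_float32(rng.uniform(-1.0, 1.0)) for _ in range(dim)]


def format_vector(vec):
    """Serialize a vector as one text line (assumed space-separated layout).

    Nine significant digits are enough to round-trip a float32 value.
    """
    return " ".join(f"{x:.9g}" for x in vec)


def parse_vector(line):
    """Parse one text line back into a list of single-precision values."""
    return [to_float32(float(tok)) for tok in line.split()]


def cosine_distance(a, b):
    """Return 1 - cos(a, b); raises ValueError for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    doc_a = random_vector(rng)
    doc_b = random_vector(rng)
    line = format_vector(doc_a)
    print(len(line.split()), "values in one record")
    print("cosine distance:", cosine_distance(parse_vector(line), doc_b))
```

Because every record is generated independently, the dataset can be grown to any size by appending more lines, which matches the scaling property described above.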