Standard Dataset
The training dataset for accelerated machine learning algorithms
- Citation Author(s):
- Ruixuan Li
- Submitted by:
- Qi Yang
- Last updated:
- Thu, 11/08/2018 - 10:34
- DOI:
- 10.21227/fbpn-s032
- License:
- Categories:
Abstract
Each document is stored as a 128-dimensional vector in text format, where each dimension is a single-precision floating-point number.
Instructions:
The training dataset was randomly generated for accelerated machine learning algorithms in which the computing-intensive tasks are offloaded to FPGA accelerators. Each document is stored as a 128-dimensional vector in text format, where each dimension is a single-precision floating-point number. This simple layout makes it easy to grow the dataset to hundreds of GB or more. Cosine distance is used to measure vector similarity.
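As a rough illustration of how such vectors might be read and compared, the sketch below parses one text line into a 128-dimensional vector and computes a cosine distance. Several details are assumptions rather than facts about the files: the page does not state the delimiter, so whitespace-separated values with one vector per line are assumed; cosine distance is taken as 1 minus cosine similarity, which is a common convention but not confirmed here; and the helper names `to_float32`, `parse_vector`, and `cosine_distance` are hypothetical.

```python
import math
import struct
from typing import List

DIM = 128  # vector dimensionality stated in the dataset description


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def parse_vector(line: str, dim: int = DIM) -> List[float]:
    """Parse one text line into a vector.

    Assumption: values are separated by whitespace, one vector per line.
    The actual delimiter used in the files is not documented on this page.
    """
    values = [to_float32(float(tok)) for tok in line.split()]
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    return values


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity (assumed definition of cosine distance)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

With this convention, two vectors pointing in the same direction have a distance near 0 and two orthogonal vectors have a distance of 1. A real pipeline would likely stream the files and batch the arithmetic rather than process one Python list at a time.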
Dataset Files
- TrainData_4M.7z (2.70 MB)
- TrainData-10M.7z (6.60 MB)
- TrainData_50M.7z (32.89 MB)
- 800M_ByteArrayWritable.7z (526.10 MB)
- 800M_FloatArrayWritable.7z (525.90 MB)