NSL-KDD in AVRO Format

Citation Author(s):
Jalindar Karande, Research Scholar, Pune Institute of Computer Technology, Savitribai Phule Pune University, Pune
Sarang Joshi, Professor, Pune Institute of Computer Technology, Savitribai Phule Pune University, Pune
Submitted by:
Jalindar Karande
Last updated:
Mon, 06/15/2020 - 05:44
DOI:
10.21227/134f-7171
License:
Creative Commons Attribution

Abstract 

NSL-KDD is a benchmark dataset for network intrusion detection research, distributed in a comma-separated text format. Many big data analytics tools, such as Google BigQuery, Hadoop, and Spark, work well with datasets in AVRO format because its lightweight encoding makes data serialization and deserialization fast, which in turn improves data ingestion performance. As part of our research work, we converted the comma-separated values into AVRO format. The AVRO version of the dataset should be useful for researchers working with cloud-based big data analytics tools.
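
The following is a minimal sketch of how comma-separated rows could be mapped to an AVRO record schema and typed records. It is not the conversion procedure used to produce this dataset. The column list is an illustrative subset of the NSL-KDD feature names, the sample rows are made-up values, and the names `NslKddRecord`, `infer_avro_type`, `build_schema`, `to_records`, and `write_avro` are hypothetical. Writing the actual file assumes the third-party `fastavro` package is installed; the rest uses only the standard library.

```python
import csv
import io
import json

# Assumption: an illustrative subset of NSL-KDD columns, not the full feature list.
COLUMNS = ["duration", "protocol_type", "service", "flag",
           "src_bytes", "dst_bytes", "serror_rate", "label"]


def infer_avro_type(values):
    """Pick the narrowest Avro primitive type that fits every value."""
    for avro_type, cast in (("long", int), ("double", float)):
        try:
            for value in values:
                cast(value)
            return avro_type
        except ValueError:
            continue
    return "string"


def build_schema(rows, columns, name="NslKddRecord"):
    """Build an Avro record schema (as a dict) from CSV rows and column names."""
    fields = [
        {"name": col, "type": infer_avro_type([row[i] for row in rows])}
        for i, col in enumerate(columns)
    ]
    return {"type": "record", "name": name, "fields": fields}


def to_records(rows, schema):
    """Convert string rows into dicts whose values match the schema types."""
    casts = {"long": int, "double": float, "string": str}
    fields = schema["fields"]
    return [
        {f["name"]: casts[f["type"]](value) for f, value in zip(fields, row)}
        for row in rows
    ]


def write_avro(path, schema, records):
    """Write records to an Avro container file (requires fastavro)."""
    from fastavro import parse_schema, writer  # imported lazily: optional dependency
    with open(path, "wb") as out:
        writer(out, parse_schema(schema), records)


# Made-up sample rows, shaped like the illustrative columns above.
SAMPLE_CSV = (
    "0,tcp,ftp_data,SF,491,0,0.00,normal\n"
    "0,udp,other,SF,146,0,0.00,normal\n"
    "0,tcp,private,S0,0,0,1.00,neptune\n"
)

rows = list(csv.reader(io.StringIO(SAMPLE_CSV)))
schema = build_schema(rows, COLUMNS)
records = to_records(rows, schema)
print(json.dumps(schema, indent=2))
```

Calling `write_avro("nsl_kdd_sample.avro", schema, records)` would then produce a container file that embeds the schema alongside the data. Inferring types from the data is a convenience for a sketch; for a full conversion, declaring the schema explicitly is usually safer, since a column that happens to contain only whole numbers in one file would otherwise be typed differently from the same column in another.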