Machine Learning

An accurate and reliable image-based quantification system for blueberries may be useful for the automation of harvest management. It may also serve as the basis for controlling robotic harvesting systems. Quantification of blueberries from images is a challenging task due to occlusions, differences in size, illumination conditions and the irregular amount of blueberries that can be present in an image. This paper proposes the quantification per image and per batch of blueberries in the wild, using high definition images captured using a mobile device.

Categories:
2365 Views

This dataset includes the measurements of a simulated vehicle inside a Gazebo simulation using different sensors: a simulated UWB tag, a IMU and a PX4Flow. 

Categories:
979 Views

This dataset is part of our research on malware detection and classification using Deep Learning. It contains 42,797 malware API call sequences and 1,079 goodware API call sequences. Each API call sequence is composed of the first 100 non-repeated consecutive API calls associated with the parent process, extracted from the 'calls' elements of Cuckoo Sandbox reports.

Categories:
6834 Views

7200 .csv files, each containing a 10 kHz recording of a 1 ms lasting 100 hz sound, recorded centimeterwise in a 20 cm x 60 cm locating range on a table. 3600 files (3 at each of the 1200 different positions) are without an obstacle between the loudspeaker and the microphone, 3600 RIR recordings are affected by the changes of the object (a book). The OOLA is initially trained offline in batch mode by the first instance of the RIR recordings without the book. Then it learns online in an incremental mode how the RIR changes by the book.

Categories:
631 Views

As one of the research directions at OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under diverse challenging conditions where trained models can possibly be depolyed. To achieve this goal, we introduced a large-sacle (~1.72M frames) traffic sign detection video dataset (CURE-TSD) which is among the most comprehensive datasets with controlled synthetic challenging conditions. The video sequences in the 

Categories:
4654 Views

As one of the research directions at OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under diverse challenging conditions where trained models can possibly be depolyed.

Categories:
3543 Views

Network traffic analysis, i.e. the umbrella of procedures for distilling information from network traffic, represents the enabler for highly-valuable profiling information, other than being the workhorse for several key network management tasks. While it is currently being revolutionized in its nature by the rising share of traffic generated by mobile and hand-held devices, existing design solutions are mainly evaluated on private traffic traces, and only a few public datasets are available, thus clearly limiting repeatability and further advances on the topic.

Categories:
1734 Views

A paradigm dataset is constantly required for any characterization framework. As far as we could possibly know, no paradigmdataset exists for manually written characters of Telugu Aksharaalu content in open space until now. Telugu content (Telugu: తెలుగు లిపి, romanized: Telugu lipi), an abugida from the Brahmic group of contents, is utilized to compose the Telugu language, a Dravidian language spoken in the India of Andhra Pradesh and Telangana just a few other neighboring states. The Telugu content is generally utilized for composing Sanskrit writings.

Categories:
15692 Views

WiFi measurements dataset for WiFi fingerprint indoor localization compiled on the first and ground floors of the Escuela Técnica Superior de Ingeniería Informática, in Seville, Spain. The facility has 24.000 m² approximately, although only accessible areas were compiled.

Categories:
1520 Views

This FFT-75 dataset contains randomly sampled, potentially overlapping file fragments from 75 popular file types (see details below). It is the most diverse and balanced dataset available to the best of our knowledge. The dataset is labeled with class IDs and is ready for training supervised machine learning models. We distinguish 6 different scenarios with different granularity and provide variants with 512 and 4096-byte blocks. In each case, we sampled a balanced dataset and split the data as follows: 80% for training, 10% for testing and 10% for validation.

Categories:
2885 Views

Pages