This dataset was developed at the School of Electrical and Computer Engineering (ECE) at the Georgia Institute of Technology as part of the ongoing activities at the Center for Energy and Geo-Processing (CeGP) at Georgia Tech and KFUPM. LANDMASS stands for “LArge North-Sea Dataset of Migrated Aggregated Seismic Structures”. This dataset was extracted from the North Sea F3 block under the Creative Commons license (CC BY-SA 3.0).
The LANDMASS database includes two different datasets. The first, denoted LANDMASS-1, contains 17,667 small "patches" of size 99x99 pixels: 9,385 horizon patches, 5,140 chaotic patches, 1,251 fault patches, and 1,891 salt-dome patches. The images in this dataset have values in the range [-1, 1]. The second dataset, denoted LANDMASS-2, contains 4,000 images. Each image is of size 150x300 pixels and normalized to values in the range [0, 1]. Each of the four classes has 1,000 images. Sample images from each dataset for each class can be found in the /samples folder.
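If the two datasets are used together, the LANDMASS-1 patches can be mapped onto the [0, 1] range used by LANDMASS-2. A minimal NumPy sketch; the loading step is omitted, since no storage format is assumed here, and the random patch merely stands in for a real one:

```python
import numpy as np

def rescale_patch(patch):
    """Linearly map values from [-1, 1] to [0, 1]."""
    return (patch + 1.0) / 2.0

# Stand-in for a real 99x99 LANDMASS-1 patch with values in [-1, 1].
patch = np.random.uniform(-1.0, 1.0, size=(99, 99))
rescaled = rescale_patch(patch)  # now in [0, 1], like LANDMASS-2 images
```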
7200 .csv files, each containing a 10 kHz recording of a 100 Hz sound lasting 1 ms, recorded at 1 cm intervals over a 20 cm x 60 cm area on a table. 3600 files (3 at each of the 1200 different positions) were recorded without an obstacle between the loudspeaker and the microphone; the other 3600 RIR recordings are affected by an object (a book) placed in between. The OOLA is initially trained offline in batch mode on the first instance of the RIR recordings without the book. It then learns online, in an incremental mode, how the book changes the RIR.
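Assuming each .csv file holds one sample value per row, a single recording can be loaded like this; the file name shown is hypothetical and should be adapted to the actual naming scheme in the raw data folders:

```python
import numpy as np

def load_rir(path):
    """Read one recording (one sample value per row) into a 1-D NumPy array."""
    return np.loadtxt(path, delimiter=",", ndmin=1)

# signal = load_rir("pos_x01_y01_rep1.csv")  # hypothetical file name
# signal.shape -> (number of samples,)
```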
folder 'load and preprocess offline data': MATLAB source code and raw/working offline (no additional obstacle) data files
folder 'lvq and kmeans test': MATLAB source code to test and compare in-sample failure with and without LVQ
folder 'online data load and preprocess': MATLAB source code and raw/working online (additional obstacle) data files
folder 'OOL': MATLAB source code configurable for cases 1-4
folder 'OOL2': MATLAB source code for case 5
folder 'plots': plots and simulations
The dataset contains high-resolution microscopy images and confocal spectra of semiconducting single-wall carbon nanotubes. Carbon nanotubes allow down-scaling of electronic components to the nano-scale. There is initial evidence from Monte Carlo simulations that microscopy images with high digital resolution show energy information in the Bessel wave pattern that is visible in these images. In this dataset, images from Silicon and InGaAs cameras, as well as spectra, give valuable insights into the spectroscopic properties of these single-photon emitters.
The dataset is generated from the measurement data using Docker containers. The measured data is in the Igor Binary Wave format, which can be read with a custom reader and processed with various tools.
Processing is applied automatically to produce various output formats using Docker containers.
The current development status and updated dataset description will be posted on
Collecting and analysing heterogeneous data sources from the Internet of Things (IoT) and Industrial IoT (IIoT) is essential for training and validating the fidelity of machine-learning-based cybersecurity applications. However, analysing those data sources remains a major challenge: the high-dimensional space must be reduced, and important features and observations must be selected from the different data sources.
One of the major research challenges in this field is the unavailability of a comprehensive network-based data set that reflects modern network traffic scenarios, a vast variety of low-footprint intrusions, and depth-structured information about the network traffic. The KDD98, KDDCUP99, and NSLKDD benchmark data sets used to evaluate network intrusion detection research were generated a decade ago. However, numerous current studies have shown that these data sets do not inclusively reflect network traffic and modern low-footprint attacks in the current threat environment.
The year 2018 was declared "Turkey Tourism Year" in China. The purpose of this dataset is to determine which tourists prefer Turkey. The targeted audience was identified through TripAdvisor. Later, the travel histories of individuals were gathered into four different groups: the individuals' travel histories to Europe (E), World (W) countries, China (C) city/province, and all of these combined (EWC). Then, a "One Zero Matrix (OZ)" and a "Frequency Matrix (F)" were created for each group. Thus, the number of matrices belonging to the four groups totals eight.
The operational steps of the study are given in Fig.
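The relation between the two matrix types can be sketched as follows. Rows are individuals and columns are destinations; the destinations and visit counts below are illustrative stand-ins, not values from the dataset:

```python
# Hypothetical columns and visit histories, for illustration only.
destinations = ["France", "Italy", "Beijing"]
histories = [
    {"France": 2, "Beijing": 1},   # individual 1's visit counts
    {"Italy": 3},                  # individual 2's visit counts
]

# Frequency Matrix (F): how often each individual visited each destination.
F = [[h.get(d, 0) for d in destinations] for h in histories]

# One Zero Matrix (OZ): whether each destination was visited at all.
OZ = [[1 if count > 0 else 0 for count in row] for row in F]
```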
For more information, please read the article.
İbrahim Topal and Muhammed Kürşad Uçar, "Hybrid Artificial Intelligence Based Automatic Determination of Travel Preferences of Chinese Tourists", IEEE Access.
As one of the research directions at the OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under the diverse challenging conditions in which trained models may be deployed. To achieve this goal, we introduced a large-scale (~1M images) object recognition dataset (CURE-OR), which is among the most comprehensive datasets with controlled synthetic challenging conditions.
Image name format:
1: White 2: Texture 1 - living room 3: Texture 2 - kitchen 4: 3D 1 - living room 5: 3D 2 - office
1: Front (0°) 2: Left side (90°) 3: Back (180°) 4: Right side (270°) 5: Top
01: No challenge 02: Resize 03: Underexposure 04: Overexposure 05: Gaussian blur 06: Contrast 07: Dirty lens 1 08: Dirty lens 2 09: Salt & pepper noise 10: Grayscale 11: Grayscale resize 12: Grayscale underexposure 13: Grayscale overexposure 14: Grayscale Gaussian blur 15: Grayscale contrast 16: Grayscale dirty lens 1 17: Grayscale dirty lens 2 18: Grayscale salt & pepper noise
A number in [0, 5], where 0 indicates no challenge, 1 the least severe, and 5 the most severe challenge. Challenge types 1 (no challenge) and 10 (grayscale) have level 0 only. Challenge types 2 (resize) and 11 (grayscale resize) have 4 levels (1 through 4). All other challenges have levels 1 to 5.
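These rules can be summarized in a small helper, with challenge types and levels numbered as in the lists above:

```python
def valid_levels(challenge_type):
    """Return the valid challenge levels for a CURE-OR challenge type."""
    if challenge_type in (1, 10):    # no challenge, grayscale
        return [0]
    if challenge_type in (2, 11):    # resize, grayscale resize
        return [1, 2, 3, 4]
    return [1, 2, 3, 4, 5]           # all other challenge types

# e.g. valid_levels(10) -> [0]; valid_levels(2) -> [1, 2, 3, 4]
```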
As one of the research directions at the OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under the diverse challenging conditions in which trained models may be deployed. To achieve this goal, we introduced a large-scale (~1.72M frames) traffic sign detection video dataset (CURE-TSD), which is among the most comprehensive datasets with controlled synthetic challenging conditions.
The name format of the video files is as follows: “sequenceType_sequenceNumber_challengeSourceType_challengeType_challengeLevel.mp4”
· sequenceType: 01 – Real data 02 – Unreal data
· sequenceNumber: A number between [01 – 49]
· challengeSourceType: 00 – No challenge source (which means no challenge) 01 – After effect
· challengeType: 00 – No challenge 01 – Decolorization 02 – Lens blur 03 – Codec error 04 – Darkening 05 – Dirty lens 06 – Exposure 07 – Gaussian blur 08 – Noise 09 – Rain 10 – Shadow 11 – Snow 12 – Haze
· challengeLevel: A number between [01 – 05], where 01 is the least severe and 05 is the most severe challenge.
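Putting the five fields together, a video file name can be parsed like this; the example name is constructed from the codes above, not taken from the dataset:

```python
def parse_video_name(name):
    """Split a CURE-TSD video file name into its five fields."""
    stem = name.rsplit(".", 1)[0]
    seq_type, seq_num, src_type, ch_type, ch_level = stem.split("_")
    return {
        "sequenceType": seq_type,
        "sequenceNumber": seq_num,
        "challengeSourceType": src_type,
        "challengeType": ch_type,
        "challengeLevel": ch_level,
    }

# Real data, sequence 05, after-effect source, rain, level 3:
info = parse_video_name("01_05_01_09_03.mp4")
```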
We split the video sequences into a 70% training set and a 30% test set. The sequence numbers corresponding to the test set are given below:
[01_04_x_x_x, 01_05_x_x_x, 01_06_x_x_x, 01_07_x_x_x, 01_08_x_x_x, 01_18_x_x_x, 01_19_x_x_x, 01_21_x_x_x, 01_24_x_x_x, 01_26_x_x_x, 01_31_x_x_x, 01_38_x_x_x, 01_39_x_x_x, 01_41_x_x_x, 01_47_x_x_x, 02_02_x_x_x, 02_04_x_x_x, 02_06_x_x_x, 02_09_x_x_x, 02_12_x_x_x, 02_13_x_x_x, 02_16_x_x_x, 02_17_x_x_x, 02_18_x_x_x, 02_20_x_x_x, 02_22_x_x_x, 02_28_x_x_x, 02_31_x_x_x, 02_32_x_x_x, 02_36_x_x_x]
The videos with all other sequence numbers are in the training set. Note that “x” above refers to the variations listed earlier.
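A minimal sketch of the split rule: only the sequence type and sequence number decide membership, so the challenge fields in a file name can be ignored:

```python
# Test-set (sequenceType, sequenceNumber) pairs from the list above.
TEST_SEQUENCES = {
    ("01", n) for n in
    ["04", "05", "06", "07", "08", "18", "19", "21",
     "24", "26", "31", "38", "39", "41", "47"]
} | {
    ("02", n) for n in
    ["02", "04", "06", "09", "12", "13", "16", "17",
     "18", "20", "22", "28", "31", "32", "36"]
}

def is_test(video_name):
    """True if a video file name belongs to the test set."""
    seq_type, seq_num = video_name.split("_")[:2]
    return (seq_type, seq_num) in TEST_SEQUENCES
```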
The name format of the annotation files is as follows: “sequenceType_sequenceNumber.txt“
Challenge source type, challenge type, and challenge level do not affect the annotations. Therefore, the video sequences that share the same sequence type and sequence number have the same annotations.
· sequenceType: 01 – Real data 02 – Unreal data
· sequenceNumber: A number between [01 – 49]
The format of each line in the annotation file (txt) should be: “frameNumber_signType_llx_lly_lrx_lry_ulx_uly_urx_ury”. You can see a visual coordinate system example on our GitHub page.
· frameNumber: A number between [001 – 300]
· signType: 01 – speed_limit 02 – goods_vehicles 03 – no_overtaking 04 – no_stopping 05 – no_parking 06 – stop 07 – bicycle 08 – hump 09 – no_left 10 – no_right 11 – priority_to 12 – no_entry 13 – yield 14 – parking
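A line in this format can be parsed as follows. The sign-type lookup is a small illustrative subset of the list above, and the example line is constructed, not taken from an actual annotation file:

```python
# Illustrative subset of the sign-type codes listed above.
SIGN_TYPES = {1: "speed_limit", 6: "stop", 13: "yield"}

def parse_annotation(line):
    """Parse one annotation line into frame number, sign type, and corners."""
    fields = [int(v) for v in line.strip().split("_")]
    frame, sign = fields[0], fields[1]
    # Corner coordinates as (x, y) pairs in the order ll, lr, ul, ur.
    corners = list(zip(fields[2::2], fields[3::2]))
    return frame, sign, corners

frame, sign, corners = parse_annotation("017_06_10_20_30_20_10_5_30_5")
```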
As one of the research directions at the OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under the diverse challenging conditions in which trained models may be deployed.
The name format of the provided images is as follows: "sequenceType_signType_challengeType_challengeLevel_Index.bmp"
sequenceType: 01 - Real data 02 - Unreal data
signType: 01 - speed_limit 02 - goods_vehicles 03 - no_overtaking 04 - no_stopping 05 - no_parking 06 - stop 07 - bicycle 08 - hump 09 - no_left 10 - no_right 11 - priority_to 12 - no_entry 13 - yield 14 - parking
challengeType: 00 - No challenge 01 - Decolorization 02 - Lens blur 03 - Codec error 04 - Darkening 05 - Dirty lens 06 - Exposure 07 - Gaussian blur 08 - Noise 09 - Rain 10 - Shadow 11 - Snow 12 - Haze
challengeLevel: A number between [01 – 05], where 01 is the least severe and 05 is the most severe challenge.
Index: A number that distinguishes different instances of traffic signs under the same conditions.
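An image file name in this format can be split into its fields like this; the example name is constructed from the codes above, not taken from the dataset:

```python
def parse_image_name(name):
    """Split a provided image file name into its five fields."""
    stem = name.rsplit(".", 1)[0]
    seq_type, sign, ch_type, ch_level, index = stem.split("_")
    return {
        "sequenceType": seq_type,
        "signType": sign,
        "challengeType": ch_type,
        "challengeLevel": ch_level,
        "Index": index,
    }

# Real data, yield sign, rain, level 3, instance 0021:
info = parse_image_name("01_13_09_03_0021.bmp")
```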