The data files contain all the thermal images and error data of the spindle in the experiment.




Dataset described in: 

Daudt, R.C., Le Saux, B., Boulch, A. and Gousseau, Y., 2019. Multitask learning for large-scale semantic change detection. Computer Vision and Image Understanding, 187, p.102783.


This dataset contains 291 coregistered image pairs of RGB aerial images from IGN's BD ORTHO database. Pixel-level change and land cover annotations are provided, generated by rasterizing Urban Atlas 2006, Urban Atlas 2012, and Urban Atlas Change 2006-2012 maps.


The dataset is split into five parts:

    - 2006 images 



Please contact us if you have any questions.


Fifteen datasets of biomedical classification problems are provided, together with the experimental results of applying automated evolution of kernel functions for support vector machines.


Master data has played a significant role in improving operational efficiencies and has attracted the attention of many large businesses over the past decade. Recent professional searches have also shown significant growth in the practice and research of managing these master data assets.


The original SECOM dataset is obtained from the UC Irvine Machine Learning Repository. Each sample is then transformed into an image, with each pixel representing a feature. Image processing mechanisms such as convolutional neural networks can therefore be used for classification.
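The feature-to-pixel transformation can be sketched as follows. This is a minimal illustration, not the procedure used to build the dataset: the description does not say how features are arranged, so the square shape, the row-major ordering, and the zero padding are all assumptions, and `features_to_image` is a hypothetical helper name.

```python
import math

def features_to_image(features, fill=0.0):
    """Arrange a flat feature vector into a square 2-D grid (list of rows)."""
    n = len(features)
    side = math.isqrt(n)
    if side * side < n:
        side += 1  # smallest square that can hold every feature
    # Assumption: unused pixels are padded with a constant fill value.
    padded = list(features) + [fill] * (side * side - n)
    # Assumption: row-major layout, so pixel (r, c) holds feature r * side + c.
    return [padded[r * side:(r + 1) * side] for r in range(side)]

image = features_to_image([0.1 * i for i in range(10)])
print(len(image), len(image[0]))  # 4 4
```

A grid produced this way can be passed to a convolutional network as a single-channel image once it is converted to the array type of the chosen framework.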


This dataset includes CSV files that contain the IDs and sentiment scores of tweets related to the COVID-19 pandemic. The tweets are collected by an ongoing project whose model monitors the real-time Twitter feed for coronavirus-related tweets using 90+ different keywords and hashtags that are commonly used when referencing the pandemic. The dataset was wholly redesigned on March 20, 2020, to comply with the content redistribution policy set by Twitter.


Each CSV file contains a list of tweet IDs. You can use these tweet IDs to download fresh data from Twitter (read this article: hydrating tweet IDs). To give NLP researchers easy access to the sentiment analysis of each collected tweet, the sentiment score computed by TextBlob has been appended as the second column. To hydrate the tweet IDs, you can use applications such as Hydrator (available for OS X, Windows, and Linux) or twarc (a Python library).

Getting the CSV files of this dataset ready for hydrating the tweet IDs:

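import pandas as pd

# The file has no header row: column 0 is the tweet ID, column 1 the sentiment score
dataframe = pd.read_csv("corona_tweets_10.csv", header=None)

# Keep only the tweet ID column
dataframe = dataframe[0]

# Write the tweet IDs without an index column or a header row
dataframe.to_csv("ready_corona_tweets_10.csv", index=False, header=False)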

The above example code takes in the original CSV file (i.e., corona_tweets_10.csv) from this dataset and exports just the tweet ID column to a new CSV file (i.e., ready_corona_tweets_10.csv). The newly created CSV file can now be consumed by the Hydrator application for hydrating the tweet IDs. To export the tweet ID column into a TXT file, just replace ".csv" with ".txt" in the to_csv function (last line) of the above example code.

If you are not comfortable with Python and pandas, you can upload these CSV files to your Google Drive and use Google Sheets to delete the second column. Once finished with the deletion, download the edited CSV files: File > Download > Comma-separated values (.csv, current sheet). These downloaded CSV files are now ready to be used with the Hydrator app for hydrating the tweet IDs.
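If only tweets within a given sentiment range are of interest, the IDs can be filtered before hydration. The sketch below uses Python's standard csv module and assumes the two-column layout described above (tweet ID, then sentiment score) and that the score is a TextBlob polarity between -1 and 1. The helper name `select_tweet_ids` and the sample rows are hypothetical.

```python
import csv
import io

def select_tweet_ids(rows, min_score=-1.0, max_score=1.0):
    """Return tweet IDs whose sentiment score lies in [min_score, max_score]."""
    selected = []
    for row in csv.reader(rows):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        tweet_id, score = row[0].strip(), float(row[1])
        if min_score <= score <= max_score:
            selected.append(tweet_id)  # IDs stay strings, so no digits are lost
    return selected

# Hypothetical rows in the dataset's layout: tweet ID, sentiment score.
sample = io.StringIO("1001,0.5\n1002,-0.25\n1003,0.0\n")
positive_ids = select_tweet_ids(sample, min_score=0.01)
print(positive_ids)  # ['1001']
```

With a real file, pass `open("corona_tweets_10.csv", newline="")` instead of the in-memory sample and write the returned IDs one per line to obtain a file that can be hydrated.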


Subpixel classification (SPC) extracts meaningful information on land-cover classes from mixed pixels. However, the major challenges for SPC are to obtain reliable soft reference data (RD), use apt input data, and achieve maximum accuracy. This article addresses these issues and applies the support vector machine (SVM) to retrieve the subpixel estimates of glacier facies (GF) using high radiometric-resolution Advanced Wide Field Sensor (AWiFS) data. Precise quantification of GF is of fundamental importance in glaciological research.


The submitted file is a supplement to the IEEE JSTARS article with DOI: 10.1109/JSTARS.2019.2955955

The dataset consists of three sections. The first section briefly reviews the subpixel classification (SPC) techniques and justifies the use of support vector machines in this study. It also highlights the key contribution of this study in the field of glaciology.

The second section details the steps involved in correcting the geometric, atmospheric, and topographic effects in the satellite images. It also describes the conversion of thermal band data to surface temperature.

The third section indicates how the ancillary layers used in this study are helpful in the segregation of various glacier facies.

In addition, three tables (A.1, A.2, and A.3) are given. Table A.1 lists the ancillary layers used in this study, their sources, and their applicability. Table A.2 provides a brief review of the SPC of different land covers; the reported accuracies are compared with those obtained in this study. Table A.3 quantitatively illustrates how the ancillary layers are able to distinguish among various glacier facies.

The dataset also contains seven figures (Figs. A.1, A.2, A.3, A.4, A.5, A.6, and A.7) depicting the research approach, correlation between SPC-derived and reference glacier facies area, SPC outputs from eight-class case using spectral data, SPC outputs from three-class case using spectral data, SPC-derived and reference glacier facies area obtained for different cases, SPC accuracy statistics, and texture-based differentiation of glacier facies respectively.

Each of these sections, tables, and figures is referred to in the main article at the appropriate places.


BCI-Double-ErrP-Dataset is an EEG dataset recorded while participants used a P300-based BCI speller. This speller uses a P300 post-detection based on Error-related potentials (ErrPs) to detect and correct errors (i.e. when the detected symbol does not match the user’s intention). After the P300 detection, an automatic correction is made when an ErrP is detected (this is called a “Primary ErrP”). The correction proposed by the system is also evaluated, eventually eliciting a “Secondary ErrP” if the correction is wrong.


A detailed description of the data is given in “BCI-Double-ErrP-Dataset-instructions.pdf” and a Matlab code example is provided to extract P300 and ErrPs (primary and secondary).


There are four folders: one with the datasets of the P300 calibration (session 1), one with the datasets of the ErrP calibration (session 1), one with the datasets of the testing session (session 2), and one with the Matlab code to run the example.


Pressing workload demands, along with social media interaction, lead to diminished alertness during work hours. Researchers have attempted to measure alertness level from various cues such as EEG, EOG, and video-based eye movement analysis. Among these, video-based eyelid and iris motion tracking has gained much attention in recent years. However, most of these implementations are tested on video data of subjects without spectacles. These videos do not pose a challenge for eye detection and tracking.


Four fully annotated marine image datasets. The annotations are given as train and test splits that can be used to evaluate machine learning methods.


The following classes of fauna were used for annotation:

  • anemone
  • coral
  • crustacean
  • ipnops fish
  • litter
  • ophiuroid
  • other fauna
  • sea cucumber
  • sponge
  • stalked crinoid

For a definition of the classes see [1].

Each dataset file contains the following files:

  • annotations/test.csv: The BIIGLE CSV annotation report of the annotations of the test split of this dataset. These annotations are used to test the performance of the trained Mask R-CNN model.
  • annotations/train.csv: The BIIGLE CSV annotation report of the annotations of the train split of this dataset. These annotations are used to generate the annotation patches which are transformed with scale and style transfer to be used to train the Mask R-CNN model.
  • images/: Directory that contains all the original image files.
  • dataset.json: JSON file that contains information about the dataset.
    • name: The name of the dataset.
    • images_dir: Name of the directory that contains the original image files.
    • metadata_file: Path to the CSV file that contains image metadata.
    • test_annotations_file: Path to the CSV file that contains the test annotations.
    • train_annotations_file: Path to the CSV file that contains the train annotations.
    • annotation_patches_dir: Name of the directory that should contain the scale- and style-transferred annotation patches.
    • crop_dimension: Edge length of an annotation or style patch in pixels.
  • metadata.csv: A CSV file that contains metadata for each original image file. In this case the distance of the camera to the sea floor is given for each image.
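The layout above can be consumed programmatically by reading dataset.json and resolving its path entries. The following is a minimal sketch under stated assumptions: the key names are taken from the list above, paths are assumed to be relative to the directory that holds dataset.json, and `load_dataset_config` together with all values in `example` are hypothetical and shown only for illustration.

```python
import json
import tempfile
from pathlib import Path

def load_dataset_config(dataset_dir):
    """Read dataset.json and resolve its path entries against dataset_dir."""
    root = Path(dataset_dir)
    with open(root / "dataset.json", encoding="utf-8") as handle:
        config = json.load(handle)
    path_keys = ("images_dir", "metadata_file", "test_annotations_file",
                 "train_annotations_file", "annotation_patches_dir")
    # Assumption: path entries are relative to the directory of dataset.json.
    paths = {key: root / config[key] for key in path_keys}
    return config["name"], int(config["crop_dimension"]), paths

# Hypothetical values for illustration only; the real files define their own.
example = {
    "name": "example",
    "images_dir": "images",
    "metadata_file": "metadata.csv",
    "test_annotations_file": "annotations/test.csv",
    "train_annotations_file": "annotations/train.csv",
    "annotation_patches_dir": "annotation_patches",
    "crop_dimension": 512,
}
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "dataset.json").write_text(json.dumps(example), encoding="utf-8")
    name, crop, paths = load_dataset_config(tmp)
    print(name, crop, paths["train_annotations_file"].name)  # example 512 train.csv
```

Returning resolved `Path` objects keeps later steps, such as reading the annotation reports or the image metadata, independent of the current working directory.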