The coronavirus pandemic has made facial masks and respirators common: the former reduce the likelihood of spreading saliva droplets, and the latter serve as Personal Protective Equipment (PPE). Their widespread use has caused problems for existing face detection algorithms. To address this, and to support more sophisticated systems that can recognize the type of facial mask or respirator and react to that information, we created the Facial Masks and Respirators Database (FMR-DB).

Instructions: 

For reasons related to the copyright of the images, we cannot publish the entire database here. If you are a student, a professor, or a researcher and you want to use it for research purposes, send an email to antonio.marceddu@polito.it attaching the license, duly completed, which you can find here on IEEE DataPort.

 


The dataset comprises image files of size 20 x 20 pixels for various types of metals and one non-metal. The collected data has been augmented, scaled, and modified to form a training dataset. It can be used to detect and identify the object type based on the material type in the image. Both training and test datasets can be generated from these image files.

Instructions: 

## Instruction

The dataset is contained in a zip file named object_type_material_type.zip. Download it and unzip it.

On Linux, run the command: unzip object_type_material_type.zip

On Windows, simply extract the archive.

The folder contains five classes, as follows:

1. copper
2. iron
3. nickel
4. plastic
5. silver

 

These are stored as sub-directories under the main directory (object_type_material_type). Each sub-directory contains 100 image files in JPG format, each 20 x 20 pixels.

 

Four of these classes are metals (copper, iron, nickel, and silver) and one is a non-metal (plastic). These image files can be used as both training and test datasets.
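A minimal sketch of how training and test sets might be generated from the unzipped folder, assuming the five sub-directory names match the class names listed above and the files end in `.jpg`. The function name `split_dataset`, the 20% test fraction, and the fixed seed are illustrative choices, not part of the dataset.

```python
import random
from pathlib import Path

# Class names as listed in the dataset description; the sub-directory
# names are assumed to match them exactly.
CLASSES = ["copper", "iron", "nickel", "plastic", "silver"]


def split_dataset(root, test_fraction=0.2, seed=0):
    """Return (train, test) lists of (image_path, label) pairs.

    The split is done per class so that each class keeps the same
    train/test proportion.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in CLASSES:
        files = sorted(Path(root, label).glob("*.jpg"))
        rng.shuffle(files)
        n_test = int(len(files) * test_fraction)
        test += [(path, label) for path in files[:n_test]]
        train += [(path, label) for path in files[n_test:]]
    return train, test
```

With 100 images per class and the default fraction, a call such as `split_dataset("object_type_material_type")` would yield 400 training pairs and 100 test pairs.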

 


This dataset is a collection of images and their respective labels containing examples of multiple Brazilian coins. Its primary purpose is to support the development of Computer Vision techniques for automatic detection of such objects, i.e., localization and classification tasks.

Instructions: 

The dataset is divided into a classification set and a regression set. The classification set contains 3056 images, each with a single coin and its respective annotation file. The regression set contains all the images from the classification set plus other images with several coins and labels, amounting to 6021 images.

 

This dataset provides two annotation formats, labelme and COCO.

The labelme format consists of one JSON file per image, where each label is one of two types: circle or polygon. A circle label has a center point and an edge point, and a polygon is a set of points (polygons were used for partial coins).

The COCO format consists of a single JSON file for the whole dataset.

Scripts for visualization are provided for both formats.
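As a rough illustration of the circle labels described above, the sketch below turns a labelme circle (center point plus edge point) into a COCO-style `[x, y, width, height]` bounding box. It relies on the standard labelme keys `shapes`, `label`, `points`, and `shape_type`; the function names and the sample label `"coin"` are hypothetical and not taken from the dataset or its scripts.

```python
import json
import math


def circle_to_bbox(shape):
    """Convert a labelme circle shape into an (x, y, width, height) box.

    The radius is the distance between the center and the edge point.
    """
    (cx, cy), (ex, ey) = shape["points"]
    radius = math.hypot(ex - cx, ey - cy)
    return (cx - radius, cy - radius, 2 * radius, 2 * radius)


def load_circle_boxes(labelme_json_text):
    """Return (label, bbox) pairs for every circle in one labelme file."""
    document = json.loads(labelme_json_text)
    return [
        (shape["label"], circle_to_bbox(shape))
        for shape in document["shapes"]
        if shape.get("shape_type") == "circle"
    ]


# Hypothetical annotation with a single circle of radius 5.
sample = json.dumps({
    "shapes": [
        {"label": "coin", "shape_type": "circle",
         "points": [[50, 50], [53, 54]]},
    ]
})
print(load_circle_boxes(sample))
```

Polygon labels are skipped here; their bounding box would be the minimum and maximum of the point coordinates.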

 

Further information about the formats can be found on the following links:

http://cocodataset.org/#home

https://github.com/wkentaro/labelme

 

Acknowledgments:

Moneda thanks Luciana Harada and Rafael de Souza, his group in the college course that generated these datasets. Yonekura and Guedes acknowledge the grant PPP 04/2017 provided by FAPEAM/CNPq and the label review carried out by Natan Siqueira.

Endoscopy is a widely used clinical procedure for the early detection of cancers in hollow organs such as the oesophagus, stomach, and colon. Computer-assisted methods for accurate and temporally consistent localisation and segmentation of diseased regions of interest enable precise quantification and mapping of lesions from clinical endoscopy videos, which is critical for monitoring and surgical planning. Such innovations have the potential to improve current medical practices and refine healthcare systems worldwide.

Last Updated On: 
Wed, 08/12/2020 - 20:53

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: Top-1000 imported functions extracted from the 'pe_imports' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-byte string

Column name: GetProcAddress
Description: Most imported function (1st)
Type: 0 (Not imported) or 1 (Imported)

...

Column name: LookupAccountSidW
Description: Least imported function (1000th)
Type: 0 (Not imported) or 1 (Imported)

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
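A minimal sketch of reading this feature layout, assuming the data is distributed as a CSV file with a header row (the file format is not stated above). The column names `hash`, `GetProcAddress`, `LookupAccountSidW`, and `malware` come from the feature list; the function name and the sample rows are made up for illustration.

```python
import csv
import io


def read_examples(handle):
    """Yield (md5, imported_function_names, label) for each CSV row.

    Every column other than 'hash' and 'malware' is treated as a
    0/1 flag for one imported function.
    """
    for row in csv.DictReader(handle):
        md5 = row.pop("hash")
        label = int(row.pop("malware"))
        imported = [name for name, flag in row.items() if flag == "1"]
        yield md5, imported, label


# Two made-up rows with only two of the 1000 function columns.
sample_csv = io.StringIO(
    "hash,GetProcAddress,LookupAccountSidW,malware\n"
    "0123456789abcdef0123456789abcdef,1,0,1\n"
    "fedcba9876543210fedcba9876543210,1,1,0\n"
)
examples = list(read_examples(sample_csv))
```

For training, the 0/1 flags would normally be kept as a numeric matrix; the name list here is only meant to make the encoding easy to inspect.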

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.


This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data: the raw PE byte stream rescaled to a 32 x 32 greyscale image using the Nearest Neighbor Interpolation algorithm and then flattened to a 1024-byte vector. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-byte string

Column name: pix_0
Description: The first greyscale pixel value
Type: Integer (0-255)

Column name: pix_1023
Description: The last greyscale pixel value
Type: Integer (0-255)

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
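The sketch below shows one plausible way to obtain such a 1024-value vector from a raw byte stream. It is an assumption-laden illustration, not the procedure actually used to build the dataset: it assumes the bytes are first laid out row by row in a zero-padded square image, and that the result is flattened in row-major order so that `pix_0` is the top-left pixel. The function name is hypothetical.

```python
import math


def bytes_to_pixel_vector(data, side=32):
    """Nearest-neighbour downscale of a byte stream to side*side values.

    Assumption: the stream is arranged row by row into the smallest
    square image that can hold it, padded with zero bytes.
    """
    if not data:
        return [0] * (side * side)
    # Smallest square side whose area is at least len(data).
    src_side = math.isqrt(len(data) - 1) + 1
    padded = bytes(data) + bytes(src_side * src_side - len(data))
    pixels = []
    for row in range(side):
        src_row = (row * src_side) // side  # nearest source row
        for col in range(side):
            src_col = (col * src_side) // side  # nearest source column
            pixels.append(padded[src_row * src_side + src_col])
    return pixels


vector = bytes_to_pixel_vector(bytes(range(256)) * 16)  # 4096 input bytes
```

Each output value is an integer between 0 and 255, matching the `pix_0` to `pix_1023` column type.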

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.


This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data (PE Section Headers of the .text, .code and CODE sections) extracted from the 'pe_sections' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-byte string

Column name: size_of_data
Description: The size of the section on disk
Type: Integer

Column name: virtual_address
Description: Memory address of the first byte of the section relative to the image base
Type: Integer

Column name: entropy
Description: Calculated entropy of the section
Type: Float

Column name: virtual_size
Description: The size of the section when loaded into memory
Type: Integer

Column name: malware
Description: Class
Type: 0 (Goodware) or 1 (Malware)
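The description does not say how the `entropy` column is computed. Assuming it is the Shannon entropy of the section bytes in bits per byte (a value between 0 and 8, which is the usual convention for PE sections), it could be reproduced roughly as follows; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


uniform = shannon_entropy(bytes(range(256)))  # every byte value once
constant = shannon_entropy(b"\x00" * 64)      # a single repeated value
```

Values close to 8 typically indicate compressed or encrypted content, while low values indicate padding or highly repetitive data.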

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

Please refer to the dataset DOI.
Please feel free to contact me for any further information.


 

The dataset was created as part of a joint effort of two research groups from the University of Novi Sad, aimed at developing vision-based systems for automatic identification of insect species (in particular hoverflies) based on characteristic venation patterns in the images of the insects' wings. The set of wing images consists of high-resolution microscopic wing images of several hoverfly species. There is a total of 868 wing images of eleven selected hoverfly species from two different genera, Chrysotoxum and Melanostoma.

Instructions: 

 

## University of Novi Sad (UNS), Hoverflies classification dataset - ReadMe file

__________________________________________________________

Version 1.0

Published: December, 2014

by:

## Dataset authors:

* Zorica Nedeljković    (zoricaned14 a_t gmail.com), A1

* Jelena Ačanski    (jelena.acanski a_t dbe.uns.ac.rs), A1

* Marko Panić    (mpanic a_t uns.ac.rs), A2

* Ante Vujić    (ante.vujic a_t dbe.uns.ac.rs), A1

* Branko Brkljač    (brkljacb a_t uns.ac.rs), A2, *corr. auth.

 

The dataset was created as part of a joint effort of two research groups from the University of Novi Sad, aimed at developing vision-based systems for automatic identification of insect species (in particular hoverflies) based on characteristic venation patterns in the images of the insects' wings. At the time of the dataset's development, the authors' affiliations were:

 * A1: Department of Biology and Ecology, Faculty of Sciences, University of Novi Sad, Trg Dositeja Obradovića 2, 21000 Novi Sad, Republic of Serbia

and

* A2: Department of Power, Electronic and Telecommunication Engineering, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6, 21000 Novi Sad, Republic of Serbia

University of Novi Sad:   http://www.uns.ac.rs/index.php/en/

 

# Dataset description:

The set of wing images consists of high-resolution microscopic wing images of several hoverfly species. There is a total of 868 wing images of eleven selected hoverfly species from two different genera, Chrysotoxum and Melanostoma. 

The wings have been collected from many different geographic locations in the Republic of Serbia during a relatively long period of time of more than two decades. Wing images were obtained from the wing specimens mounted in the glass microscopic slides by a microscopic device equipped with a digital camera with image resolution of 2880 × 1550 pixels and were originally stored in the TIFF image format.

Each wing specimen was uniquely numbered and associated with the taxonomy group it belongs to. Association of each wing with a particular species was based on the classification of the insect at the time when it was collected and before the wings were detached. This classification was done after examination by a skilled expert.

In the next step, digital images were acquired by biologists under relatively uncontrolled conditions of nonuniform background illumination and variable scene configuration, and without camera calibration. In that sense, the originally obtained digital images were not particularly suitable for exact measurements. Other shortcomings of the samples in the initial image dataset resulted from variable quality of the wing specimens, damaged or badly mounted wings, artifacts, variable wing positions during image acquisition, and dust.

In order to overcome these limitations and make the images amenable to automatic discrimination of hoverfly species, they were first preprocessed. The preprocessing of each image consisted of image rotation to a unified horizontal position, wing cropping, and subsequent scaling of the cropped wing image. Cropping eliminated unnecessary background containing artifacts, while the aspect ratio-preserving image scaling overcame the problem of variable size among the wings of the same species. The described scaling was performed after computing the average width and average height of all cropped images, which were then interpolated to the same width of 1680 pixels using bicubic interpolation. This width value was selected based on the prevailing image width among the wing images of different species.
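As a small worked example of the aspect ratio-preserving scaling step, the sketch below computes the output size of a cropped wing image when its width is brought to 1680 pixels. The 2400 x 1000 input size is a made-up value, and the function name is hypothetical; the actual resampling (bicubic interpolation) would be done by an image library and is not shown.

```python
TARGET_WIDTH = 1680  # common width stated in the dataset description


def scaled_size(width, height, target_width=TARGET_WIDTH):
    """Return (new_width, new_height) keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    new_height = round(height * target_width / width)
    return target_width, new_height


# Hypothetical cropped wing image of 2400 x 1000 pixels.
new_size = scaled_size(2400, 1000)
```

Because only the width is fixed, images with different aspect ratios end up with different heights.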

Wing images obtained in this way formed the final wing images dataset used for the sliding-window detector training, its performance evaluation, and subsequent hoverfly species discrimination using the trained landmark points detector, described in [1, 2].

* Besides images of the whole wings (in the folder "Wing images"), the provided "UNS_Hoverflies" dataset also contains small image patches (64x64 pixels) corresponding to 18 predetermined landmark points in each wing, which were systematically collected and organized inside the second root folder named "Training - test set". Each patch among the "Patch_positives" was manually cropped from the preprocessed wing image (i.e. rotated, cropped and scaled to the same predefined image width). However, the images of the whole wings stored in the folder "Wing images" are provided without the additional scaling step of the preprocessing procedure, and correspond to wing images that were only rotated and cropped.

"Wing images" are organized in two subfolders named "disk_1" and "disk_2", which correspond to the two DVD drives where they were initially stored. Each folder also comes with an additional .xml file containing some metadata. In "Wing images", the .xml files contain the average spatial size of the images in the given folder, while in the "Training - test set", individual .xml files contain additional data about the created image patches. In the case of patches corresponding to landmark points ("Patch_positives"), each .xml contains the image intrinsic spatial coordinates of each landmark point, as well as additional data about the corresponding specimen (who created it, when and where it was gathered, taxonomy, etc.). Landmark points are uniquely numbered from 1 to 18, as also shown in the figures in [1, 2]. In the case of "Patch_negatives", each subfolder named after a wing identifier, e.g. "W0034_neg", contains 40 randomly selected image patches that correspond to any part of the preprocessed image excluding the 18 landmark points and their closest surroundings. Although image patches were generated for all species, only a subset of images corresponding to the species with the highest number of specimens was used in the original classification studies described in [1, 2]. However, in its present form the "UNS_Hoverflies" dataset contains all initially processed wing images and image patches.
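The way negative patches relate to landmark points can be sketched as follows. This is only an illustration of the idea, not the authors' procedure: the exclusion radius around each landmark (`min_distance`), the uniform sampling, and all names are assumptions, since the description only says that the landmarks and their closest surroundings are excluded.

```python
import random

PATCH_SIZE = 64  # patch side in pixels, as stated in the description


def sample_negative_patches(image_width, image_height, landmarks,
                            count=40, min_distance=64, seed=None,
                            max_tries=100000):
    """Return up to `count` (left, top, right, bottom) patch boxes.

    A candidate patch is kept only if its center lies at least
    `min_distance` pixels (assumed value) from every landmark point.
    """
    rng = random.Random(seed)
    half = PATCH_SIZE // 2
    boxes = []
    for _ in range(max_tries):
        if len(boxes) == count:
            break
        cx = rng.randint(half, image_width - half)
        cy = rng.randint(half, image_height - half)
        far_enough = all(
            (cx - lx) ** 2 + (cy - ly) ** 2 >= min_distance ** 2
            for lx, ly in landmarks
        )
        if far_enough:
            boxes.append((cx - half, cy - half, cx + half, cy + half))
    return boxes


# Two made-up landmark coordinates in a hypothetical 1680 x 700 image.
demo_landmarks = [(100, 100), (800, 400)]
demo_boxes = sample_negative_patches(1680, 700, demo_landmarks, seed=0)
```

Positive patches would instead be centered on the landmark coordinates read from the corresponding .xml file.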

Besides the previously described data, which form the main part of the dataset, the repository also contains the original microscopic images of insects' wings, stored without any additional processing after acquisition. These files are available in the second .zip archive, denoted by the suffix "unprocessed".

 

Directory structure:

UNS_Hoverflies_Dataset
├── Training - test set
│   ├── Patch_negatives
│   └── Patch_positives
└── Wing images
    ├── disk_1
    └── disk_2

UNS_Hoverflies_Dataset_unprocessed
│
└── Unprocessed wing images
    ├── disk_1
    └── disk_2

 

# How to cite:

We would be glad if you decide to use this dataset. In that case, please consider citing our work as:

BibTeX:

@article{UNShoverfliesDataset2019,
  author = {Zorica Nedeljković and Jelena Ačanski and Marko Panić and Ante Vujić and Branko Brkljač},
  title = {University of Novi Sad (UNS), Hoverflies classification dataset},
  journal = {{IEEE} DataPort},
  year = {2019}
}

and/or any of the corresponding original publications:

## References:

[1] Branko Brkljač, Marko Panić, Dubravko Ćulibrk, Vladimir Crnojević, Jelena Ačanski, and Ante Vujić, “Automatic hoverfly species discrimination,” in Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods, vol. 2, pp. 108–115, SciTePress, Vilamoura, 2012. https://dblp.org/db/conf/icpram/icpram2012-2.

[2] Vladimir Crnojević, Marko Panić, Branko Brkljač, Dubravko Ćulibrk, Jelena Ačanski, and Ante Vujić, “Image Processing Method for Automatic Discrimination of Hoverfly Species,” Mathematical Problems in Engineering, vol. 2014, Article ID 986271, 12 pages, 2014. https://doi.org/10.1155/2014/986271.

 

** This dataset is published on IEEE DataPort repository under CC BY-NC-SA 4.0 license by the authors (for more information please visit: https://creativecommons.org/licenses/by-nc-sa/4.0/).


This dataset is part of our research on malware detection and classification using Deep Learning. It contains 42,797 malware API call sequences and 1,079 goodware API call sequences. Each API call sequence is composed of the first 100 non-repeated consecutive API calls associated with the parent process, extracted from the 'calls' elements of Cuckoo Sandbox reports.

Instructions: 

* FEATURES *

Column name: hash
Description: MD5 hash of the example
Type: 32-byte string

Column name: t_0 ... t_99
Description: API call
Type: Integer (0-306)

Column name: malware
Description: Class
Type: Integer: 0 (Goodware) or 1 (Malware)

API Calls: ['NtOpenThread', 'ExitWindowsEx', 'FindResourceW', 'CryptExportKey', 'CreateRemoteThreadEx', 'MessageBoxTimeoutW', 'InternetCrackUrlW', 'StartServiceW', 'GetFileSize', 'GetVolumeNameForVolumeMountPointW', 'GetFileInformationByHandle', 'CryptAcquireContextW', 'RtlDecompressBuffer', 'SetWindowsHookExA', 'RegSetValueExW', 'LookupAccountSidW', 'SetUnhandledExceptionFilter', 'InternetConnectA', 'GetComputerNameW', 'RegEnumValueA', 'NtOpenFile', 'NtSaveKeyEx', 'HttpOpenRequestA', 'recv', 'GetFileSizeEx', 'LoadStringW', 'SetInformationJobObject', 'WSAConnect', 'CryptDecrypt', 'GetTimeZoneInformation', 'InternetOpenW', 'CoInitializeEx', 'CryptGenKey', 'GetAsyncKeyState', 'NtQueryInformationFile', 'GetSystemMetrics', 'NtDeleteValueKey', 'NtOpenKeyEx', 'sendto', 'IsDebuggerPresent', 'RegQueryInfoKeyW', 'NetShareEnum', 'InternetOpenUrlW', 'WSASocketA', 'CopyFileExW', 'connect', 'ShellExecuteExW', 'SearchPathW', 'GetUserNameA', 'InternetOpenUrlA', 'LdrUnloadDll', 'EnumServicesStatusW', 'EnumServicesStatusA', 'WSASend', 'CopyFileW', 'NtDeleteFile', 'CreateActCtxW', 'timeGetTime', 'MessageBoxTimeoutA', 'CreateServiceA', 'FindResourceExW', 'WSAAccept', 'InternetConnectW', 'HttpSendRequestA', 'GetVolumePathNameW', 'RegCloseKey', 'InternetGetConnectedStateExW', 'GetAdaptersInfo', 'shutdown', 'NtQueryMultipleValueKey', 'NtQueryKey', 'GetSystemWindowsDirectoryW', 'GlobalMemoryStatusEx', 'GetFileAttributesExW', 'OpenServiceW', 'getsockname', 'LoadStringA', 'UnhookWindowsHookEx', 'NtCreateUserProcess', 'Process32NextW', 'CreateThread', 'LoadResource', 'GetSystemTimeAsFileTime', 'SetStdHandle', 'CoCreateInstanceEx', 'GetSystemDirectoryA', 'NtCreateMutant', 'RegCreateKeyExW', 'IWbemServices_ExecQuery', 'NtDuplicateObject', 'Thread32First', 'OpenSCManagerW', 'CreateServiceW', 'GetFileType', 'MoveFileWithProgressW', 'NtDeviceIoControlFile', 'GetFileInformationByHandleEx', 'CopyFileA', 'NtLoadKey', 'GetNativeSystemInfo', 'NtOpenProcess', 'CryptUnprotectMemory', 
'InternetWriteFile', 'ReadProcessMemory', 'gethostbyname', 'WSASendTo', 'NtOpenSection', 'listen', 'WSAStartup', 'socket', 'OleInitialize', 'FindResourceA', 'RegOpenKeyExA', 'RegEnumKeyExA', 'NtQueryDirectoryFile', 'CertOpenSystemStoreW', 'ControlService', 'LdrGetProcedureAddress', 'GlobalMemoryStatus', 'NtSetInformationFile', 'OutputDebugStringA', 'GetAdaptersAddresses', 'CoInitializeSecurity', 'RegQueryValueExA', 'NtQueryFullAttributesFile', 'DeviceIoControl', '__anomaly__', 'DeleteFileW', 'GetShortPathNameW', 'NtGetContextThread', 'GetKeyboardState', 'RemoveDirectoryA', 'InternetSetStatusCallback', 'NtResumeThread', 'SetFileInformationByHandle', 'NtCreateSection', 'NtQueueApcThread', 'accept', 'DecryptMessage', 'GetUserNameExW', 'SizeofResource', 'RegQueryValueExW', 'SetWindowsHookExW', 'HttpOpenRequestW', 'CreateDirectoryW', 'InternetOpenA', 'GetFileVersionInfoExW', 'FindWindowA', 'closesocket', 'RtlAddVectoredExceptionHandler', 'IWbemServices_ExecMethod', 'GetDiskFreeSpaceExW', 'TaskDialog', 'WriteConsoleW', 'CryptEncrypt', 'WSARecvFrom', 'NtOpenMutant', 'CoGetClassObject', 'NtQueryValueKey', 'NtDelayExecution', 'select', 'HttpQueryInfoA', 'GetVolumePathNamesForVolumeNameW', 'RegDeleteValueW', 'InternetCrackUrlA', 'OpenServiceA', 'InternetSetOptionA', 'CreateDirectoryExW', 'bind', 'NtShutdownSystem', 'DeleteUrlCacheEntryA', 'NtMapViewOfSection', 'LdrGetDllHandle', 'NtCreateKey', 'GetKeyState', 'CreateRemoteThread', 'NtEnumerateValueKey', 'SetFileAttributesW', 'NtUnmapViewOfSection', 'RegDeleteValueA', 'CreateJobObjectW', 'send', 'NtDeleteKey', 'SetEndOfFile', 'GetUserNameExA', 'GetComputerNameA', 'URLDownloadToFileW', 'NtFreeVirtualMemory', 'recvfrom', 'NtUnloadDriver', 'NtTerminateThread', 'CryptUnprotectData', 'NtCreateThreadEx', 'DeleteService', 'GetFileAttributesW', 'GetFileVersionInfoSizeExW', 'OpenSCManagerA', 'WriteProcessMemory', 'GetSystemInfo', 'SetFilePointer', 'Module32FirstW', 'ioctlsocket', 'RegEnumKeyW', 'RtlCompressBuffer', 
'SendNotifyMessageW', 'GetAddrInfoW', 'CryptProtectData', 'Thread32Next', 'NtAllocateVirtualMemory', 'RegEnumKeyExW', 'RegSetValueExA', 'DrawTextExA', 'CreateToolhelp32Snapshot', 'FindWindowW', 'CoUninitialize', 'NtClose', 'WSARecv', 'CertOpenStore', 'InternetGetConnectedState', 'RtlAddVectoredContinueHandler', 'RegDeleteKeyW', 'SHGetSpecialFolderLocation', 'CreateProcessInternalW', 'NtCreateDirectoryObject', 'EnumWindows', 'DrawTextExW', 'RegEnumValueW', 'SendNotifyMessageA', 'NtProtectVirtualMemory', 'NetUserGetLocalGroups', 'GetUserNameW', 'WSASocketW', 'getaddrinfo', 'AssignProcessToJobObject', 'SetFileTime', 'WriteConsoleA', 'CryptDecodeObjectEx', 'EncryptMessage', 'system', 'NtSetContextThread', 'LdrLoadDll', 'InternetGetConnectedStateExA', 'RtlCreateUserThread', 'GetCursorPos', 'Module32NextW', 'RegCreateKeyExA', 'NtLoadDriver', 'NetUserGetInfo', 'SHGetFolderPathW', 'GetBestInterfaceEx', 'CertControlStore', 'StartServiceA', 'NtWriteFile', 'Process32FirstW', 'NtReadVirtualMemory', 'GetDiskFreeSpaceW', 'GetFileVersionInfoW', 'FindFirstFileExW', 'FindWindowExW', 'GetSystemWindowsDirectoryA', 'RegOpenKeyExW', 'CoCreateInstance', 'NtQuerySystemInformation', 'LookupPrivilegeValueW', 'NtReadFile', 'ReadCabinetState', 'GetForegroundWindow', 'InternetCloseHandle', 'FindWindowExA', 'ObtainUserAgentString', 'CryptCreateHash', 'GetTempPathW', 'CryptProtectMemory', 'NetGetJoinInformation', 'NtOpenKey', 'GetSystemDirectoryW', 'DnsQuery_A', 'RegQueryInfoKeyA', 'NtEnumerateKey', 'RegisterHotKey', 'RemoveDirectoryW', 'FindFirstFileExA', 'CertOpenSystemStoreA', 'NtTerminateProcess', 'NtSetValueKey', 'CryptAcquireContextA', 'SetErrorMode', 'UuidCreate', 'RtlRemoveVectoredExceptionHandler', 'RegDeleteKeyA', 'setsockopt', 'FindResourceExA', 'NtSuspendThread', 'GetFileVersionInfoSizeW', 'NtOpenDirectoryObject', 'InternetQueryOptionA', 'InternetReadFile', 'NtCreateFile', 'NtQueryAttributesFile', 'HttpSendRequestW', 'CryptHashMessage', 'CryptHashData', 'NtWriteVirtualMemory', 
'SetFilePointerEx', 'CertCreateCertificateContext', 'DeleteUrlCacheEntryW', '__exception__']
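A minimal sketch of how a raw call trace could be turned into one row of `t_0 ... t_99` values. Two assumptions are made here and are not confirmed by the description: "non-repeated consecutive" is read as collapsing runs of the same call into a single entry, and each integer is taken to be the index of the call name in the list above. The function names and the sample trace are hypothetical, and the vocabulary is truncated to the first three names of the list.

```python
from itertools import groupby


def collapse_and_truncate(calls, limit=100):
    """Collapse runs of identical consecutive calls, keep the first `limit`."""
    collapsed = [name for name, _run in groupby(calls)]
    return collapsed[:limit]


def encode(calls, vocabulary):
    """Map call names to integers using their position in `vocabulary`."""
    index = {name: position for position, name in enumerate(vocabulary)}
    return [index[name] for name in calls]


# First three names of the API call list, used as a toy vocabulary.
vocabulary = ["NtOpenThread", "ExitWindowsEx", "FindResourceW"]

# Made-up trace containing repeated consecutive calls.
trace = ["FindResourceW", "FindResourceW", "NtOpenThread",
         "NtOpenThread", "NtOpenThread", "FindResourceW"]

sequence = encode(collapse_and_truncate(trace), vocabulary)
```

A call name that is missing from the vocabulary raises `KeyError` in this sketch, which makes mismatches between the trace and the list easy to spot.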

* ACKNOWLEDGMENTS *

We would like to thank: Cuckoo Sandbox for developing such an amazing dynamic analysis environment!
VirusShare! Because sharing is caring!
Universidade Nove de Julho for supporting this research.
Coordination for the Improvement of Higher Education Personnel (CAPES) for supporting this research.

* CITATIONS *

"Oliveira, Angelo; Sassi, Renato José (2019): Behavioral Malware Detection Using Deep Graph Convolutional Neural Networks. TechRxiv. Preprint." at https://doi.org/10.36227/techrxiv.10043099.v1
Please feel free to contact me for any further information.


Two files are provided. The first contains the power signals obtained from current and voltage measurements made with our own acquisition system (sampling frequency of 5 kHz). They correspond to 12 home electrical appliances that were randomly switched on and off over 1 hour using relay modules, resulting in 1200 events.

The second file reports the time instants of all these events.
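A minimal sketch of how the two files might be combined, assuming the signals are available as plain sequences of samples and the event instants are expressed in seconds from the start of the recording (neither the file format nor the time unit is stated above). The function names and the window length are illustrative.

```python
SAMPLING_RATE_HZ = 5000  # 5 kHz, as stated in the description


def instantaneous_power(voltage, current):
    """Sample-by-sample product of voltage and current."""
    if len(voltage) != len(current):
        raise ValueError("voltage and current must have the same length")
    return [v * i for v, i in zip(voltage, current)]


def event_window(signal, event_time_s, half_width_s=0.1,
                 sampling_rate_hz=SAMPLING_RATE_HZ):
    """Return the samples within +/- half_width_s of an event instant."""
    center = round(event_time_s * sampling_rate_hz)
    half = round(half_width_s * sampling_rate_hz)
    start = max(0, center - half)
    stop = min(len(signal), center + half)
    return signal[start:stop]


# Made-up two-second signal; an event at t = 1.0 s maps to sample 5000.
demo_signal = list(range(2 * SAMPLING_RATE_HZ))
demo_window = event_window(demo_signal, 1.0)
```

At 5 kHz, one hour of recording corresponds to 18,000,000 samples per signal, so windows around the 1200 events are a convenient unit for analysis.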
