Dataset for Machine Learning-Based Classification of White Blood Cells of the Juvenile Visayan Warty Pig
- Submitted by: Myles Joshua Tan
- Last updated: Sat, 02/26/2022 - 08:58
- DOI: 10.21227/3qsb-d447
Abstract
This dataset was prepared to aid in the creation of a machine learning algorithm that would classify the white blood cells in thin blood smears of juvenile Visayan warty pigs. The creation of this dataset was deemed imperative because of the limited availability of blood smear images collected from the critically endangered species on the internet. The dataset contains 3,457 images of various types of white blood cells (JPEG) with accompanying cell type labels (XLSX).
------------------------------
GENERAL INFORMATION
-----------------------------
Title of Dataset: Dataset for Machine Learning-Based Classification of White Blood Cells of the Juvenile Visayan Warty Pig
Available in: https://drive.google.com/drive/folders/1CsDoL448kvAtFVd5jowVJGKjFLv3qjz4...
Creators:
Jacqueline Rose Alipo-on, University of St. La Salle, s1821459@usls.edu.ph, https://orcid.org/0000-0001-7948-9512
Francesca Isabelle Escobar, University of St. La Salle, s1822133@usls.edu.ph, https://orcid.org/0000-0001-6174-890X
Jemima Loise Novia, University of St. La Salle, s1820906@usls.edu.ph, https://orcid.org/0000-0001-7046-3973
Contributor(s):
Monica Marie Atienza, DVM
Correspondence and Project Advising:
Myles Joshua Tan, MS, MIEEE, MInstP, MIMA; mj.tan@usls.edu.ph; mylestan7996@gmail.com
Nouar AlDahoul, PhD; nouar.aldahoul@live.iium.edu.my; nouar.aldahoul@gmail.com
Evan Yu, PhD; emy24@cornell.edu
Date of data collection: 2021-06 to 2021-11
Geographic location of data collection: Bacolod City, Negros Occidental, Philippines
Keywords: peripheral blood smear, microscope, white blood cell, leukocyte, basophil, eosinophil, lymphocyte, neutrophil, image processing, image augmentation, machine learning, feature extraction, classification, juvenile Visayan warty pig, Philippines
------------------------------
DATA & FILE OVERVIEW
------------------------------
File List:
The dataset contains 3,539 images in total: 667 raw images, 1,464 augmented images, and 1,408 cropped, classified images.
The “Not Cropped” folder contains all 667 raw, unclassified images.
The “Cropped Classified” folder contains five subfolders, one for each WBC type: “01 Neutrophil” (319 images), “02 Lymphocyte” (905 images), “03 Monocyte” (82 images), “04 Eosinophil” (82 images), and “05 Basophil” (20 images).
The “Augmented images” folder contains four subfolders of augmented WBC images: “Basophil” (447 images), “Eosinophil” (405 images), “Monocyte” (418 images), and “Neutrophil” (194 images).
“Image Processing Features Augmented” (1,328 rows x 53 columns)
“Image Processing Features for Cropped” (1,408 rows x 53 columns)
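The counts in the file list can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the dataset: the constant and function names are hypothetical, the numbers are copied from the file list, and `count_images` assumes a local copy of the folders in which images carry a `.jpg` or `.jpeg` extension.

```python
from pathlib import Path

# Image counts copied from the file list.
RAW_IMAGES = 667
CROPPED = {
    "Neutrophil": 319,
    "Lymphocyte": 905,
    "Monocyte": 82,
    "Eosinophil": 82,
    "Basophil": 20,
}
AUGMENTED = {
    "Basophil": 447,
    "Eosinophil": 405,
    "Monocyte": 418,
    "Neutrophil": 194,
}


def documented_totals():
    """Return (cropped, augmented, overall) totals from the documented counts."""
    cropped = sum(CROPPED.values())
    augmented = sum(AUGMENTED.values())
    return cropped, augmented, RAW_IMAGES + cropped + augmented


def combined_per_class():
    """Cropped plus augmented images per class (classes without augmentation add 0)."""
    return {name: n + AUGMENTED.get(name, 0) for name, n in CROPPED.items()}


def count_images(folder, extensions=(".jpg", ".jpeg")):
    """Count image files under `folder`, keyed by the name of the folder holding each file.

    Assumption: images use a .jpg/.jpeg extension; adjust if a local copy differs.
    """
    counts = {}
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            key = path.parent.name
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(documented_totals())
    print(combined_per_class())
```

With the documented counts, the totals come to 1,408 cropped, 1,464 augmented, and 3,539 images overall. Running `count_images` on a downloaded copy gives per-folder counts that can be compared against these figures.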
------------------------------
SHARING/ACCESS INFORMATION
------------------------------
Licenses/restrictions placed on the data:
You are free to share (copy, distribute, and use the dataset), create (produce works from the dataset), and adapt (modify, transform, and build upon the dataset), as long as you attribute any use of the dataset and any works produced from it. For any use or redistribution of the dataset, or of works produced from it, you must make the license of the dataset clear to others and keep intact any notices on the original dataset.
------------------------------
METHODOLOGICAL INFORMATION
------------------------------
Description of methods used for collection/generation of data:
Sample Collection
A smartphone was used to capture images of peripheral blood smears of juvenile Visayan warty pigs viewed under a 100x oil immersion lens.
Methods for processing the data:
For the manual classification of images by WBC type, a licensed medical technologist was consulted for verification. The images were augmented with Keras preprocessing layers to generate a larger dataset. Image processing was used to extract the features that serve as inputs to the chosen machine learning algorithm for automated classification.
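Keras preprocessing layers such as `RandomFlip` and `RandomRotation` apply randomized geometric transforms to produce new training images from existing ones. The source does not state which layers or parameters were used, so the sketch below only illustrates the idea: it swaps in deterministic flips and 90-degree rotations written in plain Python on a nested-list image, and every function name in it is hypothetical.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the distinct flip/rotation variants of `image`, original first."""
    variants = []
    current = [list(row) for row in image]
    for _ in range(4):
        # Each rotation contributes itself and its mirrored copy.
        for candidate in (current, flip_horizontal(current)):
            if candidate not in variants:
                variants.append(candidate)
        current = rotate90(current)
    return variants


if __name__ == "__main__":
    tiny = [[1, 2], [3, 4]]  # stand-in for pixel data, not from the dataset
    for variant in augment(tiny):
        print(variant)
```

Fixed flips and quarter-turn rotations give at most eight variants per image, whereas randomized layers can draw arbitrary angles and produce more variety.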
People involved with sample collection, processing, analysis and/or submission:
Monica Marie Atienza, DVM
Sonny Mana-ay, RMT
------------------------------
DATA SPECIFIC INFORMATION FOR: Image Processing Features for Cropped.xlsx
------------------------------
Number of variables: 53
Number of cases/rows: 1408
Column Headings:
Column 1 - Image filename
Column 2 - WBC class (1 - Neutrophil, 2 - Lymphocyte, 3 - Monocyte, 4 - Eosinophil, 5 - Basophil)
Columns 3-53 - Features extracted with Image Processing
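A row of the spreadsheet can be split into its parts as sketched here. This is a minimal illustration under stated assumptions: the first two columns are taken to hold the filename and class code (the headings place the features in columns 3-53), the feature values are assumed to be numeric, the row is assumed to be already loaded as a Python list, and `parse_row`, `WBC_CLASSES`, and `N_COLUMNS` are hypothetical names.

```python
# Class codes as given in the column headings.
WBC_CLASSES = {
    1: "Neutrophil",
    2: "Lymphocyte",
    3: "Monocyte",
    4: "Eosinophil",
    5: "Basophil",
}

N_COLUMNS = 53  # filename + class code + 51 features


def parse_row(row):
    """Split one spreadsheet row into (filename, class name, feature list)."""
    if len(row) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} values, got {len(row)}")
    filename = str(row[0])
    class_code = int(row[1])
    if class_code not in WBC_CLASSES:
        raise ValueError(f"unknown WBC class code: {class_code}")
    # Assumption: every feature column holds a numeric value.
    features = [float(value) for value in row[2:]]
    return filename, WBC_CLASSES[class_code], features


if __name__ == "__main__":
    # Synthetic row for illustration only; not taken from the dataset.
    example = ["example.jpg", 2] + [0.0] * 51
    name, label, features = parse_row(example)
    print(name, label, len(features))
```

Rejecting rows with the wrong length or an unknown class code surfaces layout mismatches early instead of passing malformed rows to a classifier.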
------------------------------
DATA SPECIFIC INFORMATION FOR: Image Processing Features Augmented.xlsx
------------------------------
Number of variables: 53
Number of cases/rows: 1328
Column Headings:
Column 1 - Image filename
Column 2 - WBC class (1 - Neutrophil, 2 - Lymphocyte, 3 - Monocyte, 4 - Eosinophil, 5 - Basophil)
Columns 3-53 - Features extracted with Image Processing