Risso’s dolphin dataset
- Citation Author(s):
- Submitted by: Rosalia Maglietta
- Last updated: Mon, 03/20/2023 - 07:57
- Data Format:
- Link to Paper:
- Environmental variables and machine learning models to predict cetacean abundance in the Central-eastern Mediterranean Sea
- Machine Learning and Image Processing Methods for Cetacean Photo Identification: A Systematic Review
Photo identification (photoID) is a non-invasive technique for identifying individual animals from photos, based on the hypothesis that each specimen has unique features useful for its recognition. This technique is particularly suitable for studying highly mobile and hard-to-detect marine species, such as cetaceans. These animals play a key role in marine biodiversity conservation because, as top predators in food webs, they maintain the stability and health of marine ecosystems. The information obtained from photo identification studies is useful for acquiring new knowledge on the abundance, social dynamics, migration patterns and site fidelity of the target species. Since manual photo identification is time-consuming and impractical for large datasets, advanced automated techniques can support users and accelerate the process of individual photo identification.
The availability of open datasets, such as the dataset presented here and entitled “Risso’s dolphin dataset”, can contribute to the development and implementation of fully automated cetacean photo identification techniques, either by acting as training datasets, or by providing already identified individuals for possible matching. In general, it can be useful for the development of a new methodology based on image processing, computer vision and machine learning, devoted to image cropping, segmentation, and recognition.
The species covered by this dataset, Risso’s dolphin Grampus griseus (Cuvier, 1812), is one of the least-known cetacean species on a global scale, and its Mediterranean subpopulation is ranked as Endangered on the IUCN Red List. Risso’s dolphins exhibit long-lasting, identifiable natural marks on their dorsal fins, and these patterns make the species particularly suitable for photoID algorithms. Automated photoID studies are therefore a key component in closing the knowledge gap on this species. The state of the art for the automated photo identification of Risso’s dolphins is the algorithm SPIR (Smart Photo Identification of Risso’s dolphin, see Maglietta et al. Scientific Reports 2018), based on the analysis of dorsal fin scars.
The images in the “Risso’s dolphin dataset” are part of the Risso’s dolphin sightings data collected from 2013 to 2019 during standardized vessel-based surveys carried out in the Gulf of Taranto, in the Northern Ionian Sea (North-eastern Central Mediterranean Sea), on board the catamarans of the Jonian Dolphin Conservation NGO. The study area covers a topographically complex zone of approximately 14,000 km², spanning from Santa Maria di Leuca to Punta Alice. A narrow continental shelf, shaped by a steep slope and several channels, characterizes the western sector, while the eastern sector shows descending terraces towards the “Taranto Valley”, a NW-SE submarine canyon with no clear bathymetric connection to a major river system.
All images were taken using a Nikon D3300 camera with a Nikon AF-P Nikkor 70–300 mm f/4.5–6.3G ED lens. To avoid potential interference with dolphin behavior due to the presence of the vessel, sampling was interrupted by changing direction when specimens were observed closer than about 50 m. Moreover, all observers maintained a safe distance of no less than 5 m, lowering speed or interrupting navigation to prevent collisions or possible injuries.
The “Risso’s dolphin dataset” contains 8022 images of Risso’s dolphins’ dorsal fins. Each image is accompanied by essential information, listed in an xlsx file: the dolphin’s ID, the number of images per specimen contained in the dataset, and the number of sighting days per specimen over the entire study period (for a total of 906 sighting days).
We provide it free of charge, but we kindly ask those who intend to use our dataset to cite the following papers, which were produced using the present catalog and its updates (thanks in advance):
- Maglietta, R. et al. “ARIANNA: a novel deep learning-based system for fin contours analysis in individual recognition of dolphins” Intelligent Systems with Applications 18, 200207 (2023)
- Maglietta, R. et al. “Environmental variables and machine learning models to predict cetacean abundance in the Central-eastern Mediterranean Sea” Scientific Reports 13, 2600 (2023)
- Maglietta, R. et al. “Machine Learning and Image Processing Methods for Cetacean Photo Identification: A Systematic Review” IEEE Access 10, 80195-80207 (2022)
- Maglietta, R. et al. “DolFin: an innovative digital platform for studying Risso’s dolphins in the Northern Ionian Sea (North-eastern Central Mediterranean)” Scientific Reports 8, 17185 (2018)
The “Risso’s dolphin dataset” is organized in 115 folders, one for each individual, each named with the ID assigned to that dolphin. Each folder contains the cropped fin images available for that individual. The Risso_dolphin_dataset.xlsx file, containing basic information, is also provided: the number of cropped fin images is listed in column B, and the number of sighting days in column C.
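As a quick consistency check on the folder layout described above, the following is a minimal Python sketch that counts the .png files in each individual’s folder, so that the totals can be compared by hand with column B of the xlsx file. The function name and the local path are hypothetical; the sketch only assumes that the archive has been extracted so that the per-individual folders sit directly under one root directory.

```python
from pathlib import Path


def count_images_per_individual(root):
    """Return {folder name: number of .png files} for each sub-folder of root.

    Assumption: `root` is the directory that directly contains the
    per-individual folders (one folder per dolphin ID).
    """
    root = Path(root)
    counts = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count only regular files with a .png extension (case-insensitive).
        counts[folder.name] = sum(
            1
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() == ".png"
        )
    return counts


if __name__ == "__main__":
    # Hypothetical location of the extracted dataset; adjust as needed.
    dataset_root = Path("Risso_dolphin_dataset")
    if dataset_root.is_dir():
        counts = count_images_per_individual(dataset_root)
        print(len(counts), "folders,", sum(counts.values()), "images")
```

If the layout matches the description, the number of folders should be 115 and the total number of images 8022.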
Within each folder there is one type of file:
- dolphin’s ID_date of capture (yy-mm-dd, written without separators)_progressive number of the image for the same date.png (e.g. ARIETE_180713_1.png), representing the cropped fin.
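The file naming convention above can be parsed programmatically. The sketch below is one possible way to do it in Python, not an official loader: the names FinImage and parse_fin_filename are hypothetical, and it assumes that the six-digit date field is year, month, day (as stated in the naming rule) and that all years fall in the 2000s.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class FinImage(NamedTuple):
    dolphin_id: str   # individual ID, same as the folder name
    captured: date    # date of capture
    series: int       # progressive number of the image for that date


# Assumed pattern: <ID>_<yymmdd>_<number>.png
_PATTERN = re.compile(
    r"^(?P<id>.+)_(?P<date>\d{6})_(?P<series>\d+)\.png$", re.IGNORECASE
)


def parse_fin_filename(name):
    """Split a cropped-fin file name into ID, capture date and series number."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    # Assumption: the date field is yymmdd; strptime raises ValueError
    # if the digits do not form a valid date in that order.
    captured = datetime.strptime(match["date"], "%y%m%d").date()
    return FinImage(match["id"], captured, int(match["series"]))


if __name__ == "__main__":
    print(parse_fin_filename("ARIETE_180713_1.png"))
```

Under these assumptions, ARIETE_180713_1.png is read as the first image of individual ARIETE taken on 13 July 2018. Grouping images by capture date in this way makes it possible to split training and test sets by sighting day rather than by single image.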