MPSC_MV2

Citation Author(s): Masud Ahmed (University of Maryland, Baltimore County)
Submitted by: Masud Ahmed
Last updated: Mon, 12/12/2022 - 21:11
DOI: 10.21227/4pgc-7884

Categories:

Artificial Intelligence

Abstract

Deep video representation learning has recently attained state-of-the-art performance in video action recognition (VAR). However, the performance of these models degrades significantly when they are applied to video clips captured from varied perspectives. Existing VAR models frequently encode view information and action attributes together, making it difficult to learn a view-invariant representation. Therefore, to study multiview representation, we collected a large-scale, time-synchronized multiview video dataset from 10 subjects performing 10 different actions in both indoor and outdoor settings, recorded from three horizontal and vertical viewpoints using a smartphone, an action camera, and a drone camera. We provide the multiview video dataset with various metadata to facilitate further research on robust VAR systems.