This dataset is designed to advance research in Visual Question Answering (VQA), specifically addressing challenges related to language priors and compositional reasoning. Each question carries a label categorizing it by its susceptibility to either issue, enabling targeted evaluation of VQA models. The dataset consists of 33,051 training images and 14,165 validation images, along with 571,244 training questions and 245,087 validation questions. Of the training questions, 313,664 target compositional reasoning, while the remaining 257,580 pertain to language priors.
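As a quick sanity check on the figures above, the two question categories partition the training set exactly; a minimal sketch (the variable names are illustrative, not part of the dataset's API):

```python
# Statistics quoted in the dataset description.
train_questions = 571_244
compositional = 313_664    # compositional-reasoning questions
language_prior = 257_580   # language-prior questions

# The two categories together account for every training question.
assert compositional + language_prior == train_questions
print(compositional + language_prior)  # 571244
```

This confirms that every training question belongs to exactly one of the two labeled categories, so per-category accuracy on this split covers the full training distribution.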