Jiaqi Fan

First Name

Jiaqi

Last Name

Fan

Dataset Entries from this Author

CODA_desc and nuScenes_desc

Large vision-language models (LVLMs) have demonstrated remarkable capabilities in multimodal understanding and generation tasks. However, these models occasionally generate hallucinated text: descriptions that seem plausible but do not correspond to the image. In an autonomous driving system, such hallucinations can lead to wrong driving decisions.

Categories:

Artificial Intelligence