Abstract: Knowledge-based visual question answering (VQA) requires external knowledge beyond the image to answer the question. Early studies retrieve the required knowledge from explicit knowledge bases ...
Why do baseball umpires wear black underwear? How long is the longest burp ever recorded? Which two states make it illegal to get married on a dare? If you know the answers to those trivia questions, ...