NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
Updated: Aug 2, 2025 (Python)
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
A VideoQA dataset based on the videos from ActivityNet
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Video Graph Transformer for Video Question Answering (ECCV'22)
[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)
This repo contains code for Invariant Grounding for Video Question Answering
[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
Agentic Keyframe Search for Video Question Answering
Data and PyTorch code for the LifeQA LREC 2020 paper.
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
An adaptive chunking methodology for lecture videos using CLIP embeddings and SSIM to construct multimodal chunks for enhanced RAG performance.
LifeQA website code