doc-doc / NExT-QA

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

video-understanding videoqa vision-language video-question-answering multi-object-interaction causal-temporal-action-reasoning

Updated Aug 2, 2025
Python

jayleicn / TVQA

[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering

pytorch dataset videoqa tvqa

Updated Oct 25, 2022
Python

antoyang / FrozenBiLM

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models

vqa video-understanding weakly-supervised-learning multimodal-learning visual-question-answering vision-and-language videoqa pre-training video-question-answering large-language-models

Updated Dec 9, 2024
Python

thaolmk54 / hcrn-videoqa

Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)

vqa question-answering tgif-qa videoqa

Updated Jul 25, 2024
Python

antoyang / just-ask

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

vqa video-understanding weakly-supervised-learning multimodal-learning visual-question-answering question-generation vision-and-language videoqa pre-training video-question-answering

Updated Sep 29, 2023
Jupyter Notebook

MILVLG / activitynet-qa

A VideoQA dataset based on videos from ActivityNet

dataset vqa activitynet videoqa

Updated Nov 22, 2020
Python

doc-doc / NExT-GQA

Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)

videoqa video-grounding video-question-answering video-language-understanding trustworthy-vqa visual-evidence-grounding

Updated Jul 1, 2024
Python

sail-sg / VGT

Video Graph Transformer for Video Question Answering (ECCV'22)

videoqa video-question-answering temporal-dynamics graph-transformer video-language-understanding

Updated Jun 8, 2023
Python

zhousheng97 / EgoTextVQA

[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

videoqa mllm-evaluation scene-text-vqa scene-text-videoqa egocentric-qa-assistance

Updated Jun 19, 2025
Python

doc-doc / HQGA

Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)

videoqa vision-language video-question-answering conditional-graph-hierarchy

Updated Sep 17, 2022
Python

doc-doc / NExT-OE

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

videoqa vision-language video-comprehension multi-object-interaction causal-temporal-action-reasoning

Updated Jul 18, 2023
Python

yl3800 / IGV

Code for "Invariant Grounding for Video Question Answering"

video generalization interpretable videoqa video-question-answering invariant-learning cvpr-2022 cvpr-oral-2022

Updated Mar 2, 2023
Python

YangLiu9208 / CMCIR

[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

traffic vqa causality causal-inference causal videoqa causal-discovery

Updated Jul 6, 2023
Python

doc-doc / CoVGT

Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)

videoqa video-question-answering contrastive-learning dynamic-visual-graph video-language-understanding

Updated Mar 9, 2024
Python

engindeniz / vitis

[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

video-understanding zero-shot-learning multimodal-learning visual-question-answering few-shot-learning videoqa vision-language prompt-learning large-language-models

Updated Jan 13, 2025
Python

fansunqi / AKeyS

Agentic Keyframe Search for Video Question Answering

agent computer-vision deep-learning search-algorithm videoqa llm reasoning-agent reasoning-algoritm

Updated Apr 7, 2025
Python

mmazab / LifeQA

Data and PyTorch code for the LifeQA LREC 2020 paper.

nlp machine-learning natural-language-processing youtube research computer-vision deep-learning pytorch dataset videos question-answering real-life videoqa video-question-answering lrec2020 lrec lifeqa

Updated Mar 16, 2025
Python

fansunqi / VideoTool

Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"

agent computer-vision deep-learning video-understanding video-analysis multimodal videoqa llm mllm tool-learning multimodal-agents video-agents

Updated Dec 29, 2025
Python

PrismaticLab / Video-RAC

An adaptive chunking methodology for lecture videos using CLIP embeddings and SSIM to construct multimodal chunks for enhanced RAG performance.

multilingual dataset multimodal rag videoqa

Updated Nov 11, 2025
Python

MichiganNLP / lifeqa

LifeQA website code

nlp machine-learning natural-language-processing youtube research computer-vision deep-learning pytorch dataset videos question-answering real-life videoqa video-question-answering lrec2020 lrec lifeqa

Updated Feb 3, 2023
HTML

videoqa

Here are 22 public repositories matching this topic...