Awesome Open Source
Search
Programming Languages
Languages
All Categories
Categories
About
Search results for vision and language visual question answering
vision-and-language
x
visual-question-answering
x
6 search results found
Xmodaler
⭐
929
X-modaler is a versatile and high-performance codebase for cross-modal analytics(e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Frozenbilm
⭐
120
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Just Ask
⭐
101
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Villa
⭐
46
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part
Trar Vqa
⭐
23
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task
Lxmert Advtrain
⭐
17
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT adversarial training part
Related Searches
Python Vision And Language (69)
Vqa Visual Question Answering (46)
Python Visual Question Answering (42)
Pytorch Vision And Language (34)
1-6 of 6 search results
Privacy
|
About
|
Terms
|
Follow Us On Twitter
Copyright 2018-2024 Awesome Open Source. All rights reserved.