Awesome Open Source
Search results for filters "vqa" and "vision-and-language": 11 results found.
Prismer (⭐ 1,245)
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

Oscar (⭐ 995)
Oscar and VinVL.

ClipBERT (⭐ 649)
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

LRV-Instruction (⭐ 160)
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.

FrozenBiLM (⭐ 120)
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models.

Just Ask (⭐ 101)
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos.

ROSITA (⭐ 53)
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration.

VILLA (⭐ 46)
Research code for the NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning" (the UNITER adversarial training part).

Awesome-VQA-Latest (⭐ 42)
Visual Question Answering paper list.

STL-VQA (⭐ 16)
Good practices in VQA systems, such as POS-tag attention, structured triplet learning, and triplet attention, which are very general and can be inserted into almost any vision-and-language task.

CPL (⭐ 14)
Official implementation of the EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models".
Related Searches
Python Vqa (219)
Pytorch Vqa (61)
Attention Vqa (61)
Dataset Vqa (57)
Jupyter Notebook Vqa (54)
Python Vision And Language (50)
Vqa Visual Question Answering (46)
Deep Learning Vqa (44)
Paper Vqa (35)
Pytorch Vision And Language (34)
Copyright 2018-2024 Awesome Open Source. All rights reserved.