Awesome Open Source

Search results for python visual question answering

44 search results found

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Xmodaler ⭐ 929

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Flamingo Pytorch ⭐ 549

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Ban Vqa ⭐ 527

Bilinear attention networks for visual question answering

Openvqa ⭐ 225

A lightweight, scalable, and general framework for visual question answering research

Pytorch Vqa ⭐ 213

Strong baseline for visual question answering

Mcan Vqa ⭐ 181

Deep Modular Co-Attention Networks for Visual Question Answering

Prophet ⭐ 179

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

Vqa Winner Cvprw 2017 ⭐ 160

PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Frozenbilm ⭐ 120

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

Relationnetworks Clevr ⭐ 60

A pytorch implementation for "A simple neural network module for relational reasoning", working on the CLEVR dataset

[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.

Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part

Bottom Up Features ⭐ 44

Bottom-up features extractor implemented in PyTorch.

Convolutional Vqa ⭐ 38

Flipped Vqa ⭐ 35

Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)

Visual Question Answering ⭐ 32

CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering

PyTorch implementation of the paper "Visual Concept-Metaconcept Learner", NeurIPS 2019

Miccai19 Medvqa ⭐ 27

AIOZ AI - Overcoming Data Limitation in Medical Visual Question Answering

Fvta_memexqa ⭐ 24

Real-world photo sequence question answering system

Trar Vqa ⭐ 23

This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

Figureqa Baseline ⭐ 23

TensorFlow implementation of the CNN-LSTM, Relation Network and text-only baselines for the paper "FigureQA: An Annotated Figure Dataset for Visual Reasoning"

CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations

Visual Question Answering ⭐ 20

📷 ❓ Visual Question Answering Demo and Algorithmia API

Multimodal Meta Learn ⭐ 19

Official code repository for "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning" (published at ICLR 2023).

[Paper][IJCKG 2022] LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

Aoa Pytorch ⭐ 13

A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering

Cross Attention Vizwiz Vqa ⭐ 12

A self-evident application of the VQA task is to design systems that aid blind people with sight-reliant queries. The VizWiz VQA dataset originates from images and questions compiled by members of the visually impaired community and, as such, highlights some of the challenges presented by this particular use case.

Knowit Rock ⭐ 12

ROCK model for Knowledge-Based VQA in Videos

Easy Vqa ⭐ 12

The Easy Visual Question Answering dataset.

Detect Shortcuts ⭐ 12

Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering -- project website here: https://cdancette.fr/projects/vqa-ce/

Film Pytorch ⭐ 11

PyTorch implementation of FiLM: Visual Reasoning with a General Conditioning Layer

Official repository for the A-OKVQA dataset

Vqa Med 2020 ⭐ 10

Sglkt Visdial ⭐ 10

🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

Towards Explainable Metrics for Conditional Image Synthesis Evaluation

Visual Question Answering ⭐ 8

Implementation of the visual question answering model from the paper "Exploring Models and Data for Image Question Answering".

Bilinear Attention Networks for Korean Visual Question Answering

Stacked Attention Networks For Visual Question Answering ⭐ 7

Implementation of the paper "Stacked Attention Networks for Image Question Answering" in Tensorflow

Vilbert Multi Task ⭐ 7

👀 🗣️ 📝 12-in-1: Multi-Task Vision and Language Representation Learning Web Demo

Visual Question Answering System

Graph Matching Attention ⭐ 5

Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering
