Awesome Open Source
Search results for vision and language
vision-and-language
143 search results found
Aerial Vision And Dialog Navigation
⭐
20
Codebase of the ACL 2023 (Findings) Paper "Aerial Vision-and-Dialog Navigation"
Pytorch_ldast
⭐
19
A PyTorch implementation of LDAST
Xmodal Ctx
⭐
18
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Cyclical Visual Captioning
⭐
18
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
Lxmert Advtrain
⭐
17
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT adversarial training part
Stl Vqa
⭐
16
Good practices from VQA systems, such as POS-tag attention, structured triplet learning, and triplet attention, are very general and can be inserted into almost any vision-and-language task
Explore And Match
⭐
16
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos
Gst Visdial
⭐
15
💬 Official PyTorch Implementation for CVPR'23 "The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training"
Hero_video_feature_extractor
⭐
15
Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
Pytorch_tvc
⭐
14
A PyTorch implementation of TVC
Cpl
⭐
14
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
C3vqg Official
⭐
14
Code for the paper "C3VQG: Category Consistent Cyclic Visual Question Generation".
Phrasecutdataset
⭐
14
Dataset API for "PhraseCut: Language-based Image Segmentation in the Wild"
Clip Openness
⭐
13
Code for "Delving into the Openness of CLIP"
Partglot
⭐
12
Official Implementation of PartGlot (CVPR 2022 Oral)
Gpt Vision Assistant
⭐
12
A simple implementation of Be My Eyes GPT-4, a vision-LLM model that acts as a personal assistant
Val
⭐
11
Code for the CVPR 2020 paper "Image Search with Text Feedback by Visiolinguistic Attention Learning"
Map2seq_vln
⭐
11
Code for ORAR Agent for Vision and Language Navigation on Touchdown and map2seq
Prompt Adapter
⭐
10
Prompt Tuning based Adapter for Vision-Language Model Adaption
Spacap3d
⭐
9
[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official PyTorch implementation)
Vlpd
⭐
8
Official Code of CVPR'23 Paper "VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision"
Visiondetect
⭐
8
VisionDetect lets you track user face gestures such as blinks and smiles.
Foolyourvllms
⭐
8
Code for paper: Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
Medvlsm
⭐
8
Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models
Open Fashion Clip
⭐
8
This is the official repository for the paper "OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data". ICIAP 2023
Ris Learning List
⭐
8
Related papers about Referring Image Segmentation (RIS)
Mozuma
⭐
7
Model Zoo for Multimedia Applications
Groundvlp
⭐
7
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Read Up
⭐
7
Visual dialog agents with pre-trained vision-and-language encoders.
Vlslice
⭐
7
Code for the 2023 ICCV paper "VLSlice: Interactive Vision-and-Language Slice Discovery."
Vision Language Modelling Series
⭐
7
Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations
Zeroshot Storytelling
⭐
7
GitHub repository for Zero Shot Visual Storytelling
Gvcci
⭐
7
[IROS 2023] GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation
Grounded Vision Parser
⭐
6
semantic parser trained by using videos only instead of labeled logical forms
Spatial Reasoning
⭐
6
Grounding Language Models for Compositional and Spatial Reasoning
Tgn
⭐
6
Tensorflow Reproduction of the EMNLP-2018 paper "Temporally Grounding Natural Sentence in Video"
Refcontrast
⭐
5
Understanding Synonymous Referring Expressions via Contrastive Features
Cizslv2
⭐
5
CIZSL++: Creativity Inspired Generative Zero-Shot Learning
Inside
⭐
5
INSIDE: Steering Spatial Attention with Non-Imaging Information in CNNs
Vscmr Visual Storytelling With Corss Modal Rules
⭐
5
Visual Storytelling with Cross-Modal Rules
Naq
⭐
5
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory. CVPR 2023.
Pma Net
⭐
5
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023
Code_ssi
⭐
5
An implementation of SSI
Related Searches
Python Vision And Language (69)
Pytorch Vision And Language (34)
101-143 of 143 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.