Awesome Open Source
Search results for python vision language pretraining
19 search results found
Video Llama
⭐
1,826
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Video Chatgpt
⭐
590
"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Paddlemix
⭐
172
Paddle Multimodal Integration and eXploration: supports mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox, with high performance and flexibility.
Valor
⭐
110
Code and models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Ptp
⭐
100
[CVPR 2023] Code for "Position-guided Text Prompt for Vision-Language Pre-training"
Recognize Any Regions
⭐
92
Recognize Any Regions
Flair
⭐
42
FLAIR: A Foundation LAnguage-Image model of the Retina for fundus image understanding.
Protoclip
⭐
38
📍 Official PyTorch implementation of ProtoCLIP from the paper "Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)
Segclip
⭐
35
PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"
Continual Clip
⭐
30
Official repository for "CLIP model is an Efficient Continual Learner".
Svl_adapter
⭐
17
SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models
Flm
⭐
16
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Sga
⭐
14
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models. [ICCV 2023]
Cosa
⭐
10
Code and models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
Managertower
⭐
9
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
Vtc
⭐
8
VTC: Improving Video-Text Retrieval with User Comments
Blitext
⭐
8
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
B2t
⭐
8
Bias-to-Text: Debiasing Unknown Visual Biases through Language Interpretation
Itra
⭐
7
A codebase for flexible and efficient Image Text Representation Alignment
Copyright 2018-2024 Awesome Open Source. All rights reserved.