Awesome Open Source
Search
Programming Languages
Languages
All Categories
Categories
About
Search results for image captioning multimodal learning
image-captioning
x
multimodal-learning
x
6 search results found
Omml
⭐
528
Multi-Modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval and image caption.
Omninet
⭐
426
Official Pytorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Upop
⭐
54
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Crossget
⭐
13
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Cxrmate
⭐
8
CXRMate: Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation
Diverse_and_specific_image_captioning
⭐
7
Unsupervised specificity-guided optimization of Image Captioning models to encourage meaningful diversity in the generated captions.
Vqa
⭐
6
Visual Question Answering System
1-6 of 6 search results
Privacy
|
About
|
Terms
|
Follow Us On Twitter
Copyright 2018-2024 Awesome Open Source. All rights reserved.