Awesome Open Source
Search results for multimodal image captioning
9 search results found
Interngpt
⭐
2,976
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
Ofa
⭐
2,142
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Omml
⭐
528
Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.
Awesome Foundation And Multimodal Models
⭐
223
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code]
Magic
⭐
124
Language Models Can See: Plugging Visual Controls in Text Generation
Rstnet
⭐
95
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
Image Text Papers
⭐
71
Image captioning and text-to-image papers.
Cvt2distilgpt2
⭐
46
Improving Chest X-Ray Report Generation by Leveraging Warm-Starting
Inverse Dall E For Optical Character Recognition
⭐
24
Inverse DALL-E for Optical Character Recognition
Mplug
⭐
15
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
Spatial Reasoning
⭐
6
Grounding Language Models for Compositional and Spatial Reasoning
Related Searches
Python Multimodal (186)
Python Image Captioning (175)
Copyright 2018-2024 Awesome Open Source. All rights reserved.