Awesome Open Source
Search results for multimodal llava
10 search results found
Llava ⭐ 12,514
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Uform ⭐ 729
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Awesome Foundation And Multimodal Models ⭐ 223
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code]

Lrv Instruction ⭐ 160
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Vlmevalkit ⭐ 137
Open-source evaluation toolkit of large vision-language models (LVLMs), support GPT-4v, Gemini, QwenVLPlus, 30+ HF models, 15+ benchmarks

Llavar ⭐ 133
Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"

Llava Cpp Server ⭐ 116
LLaVA server (llama.cpp).

Vip Llava ⭐ 81
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Multi_token ⭐ 54
Embed arbitrary modalities (images, audio, documents, etc) into large language models.

Llava Docker ⭐ 32
Docker image for LLaVA: Large Language and Vision Assistant

Mmc ⭐ 13

Metatron2 ⭐ 5
A Multimodal Discord bot with machine learning functions, including LLM chat, Image generation, and Speech Generation capabilities