Awesome Open Source
Search results for multimodal instruction tuning
instruction-tuning
multimodal
13 search results found
Llava
⭐
12,514
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Next Gpt
⭐
2,602
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Video Llava
⭐
1,750
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Mplug Owl
⭐
1,657
[Official Implementation] mPLUG-Owl & mPLUG-Owl2: Alibaba MLLM Family.
Data Juicer
⭐
994
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Internlm Xcomposer
⭐
820
Internvideo
⭐
736
InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)
Awesome Multimodal Llm
⭐
243
Research Trends in LLM-guided Multimodal Learning.
Bliva
⭐
181
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Llavar
⭐
133
Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
Kopa
⭐
67
[Paper][Preprint 2023] Making Large Language Models Perform Better in Knowledge Graph Completion
Ll3da
⭐
65
"LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
Llava Docker
⭐
32
Docker image for LLaVA: Large Language and Vision Assistant
Awesome Multimodal Chatbot
⭐
30
Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction, such as text, speech, images, and videos, to provide a seamless and versatile user experience.
M3dbench
⭐
23
M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D vision-centric tasks.
Reform Eval
⭐
19
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
Lm Research Hub
⭐
14
Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)
Awesome Llm For Robotics Reasoning
⭐
13
LLM for robotics reasoning toward AGI / Awesome repos&surveys / Chain of Thought / LLM / Prompt engineering / Reasoning / Robot / Agent / Planning / Reinforcement Learning / Created by @shure-dev / Check Wiki
Mmc
⭐
13
Related Searches
Python Multimodal (186)
Artificial Intelligence Multimodal (53)
Llm Multimodal (42)
Pytorch Multimodal (42)
Python Instruction Tuning (36)
Dataset Multimodal (34)
Chatgpt Multimodal (28)
Large Language Models Multimodal (22)
Multimodal Gpt4 (20)
Llm Instruction Tuning (18)
Copyright 2018-2024 Awesome Open Source. All rights reserved.