Awesome Open Source

Search results for multimodal instruction tuning

13 search results found

Llava ⭐ 12,514

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Next Gpt ⭐ 2,602

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Video Llava ⭐ 1,750

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Mplug Owl ⭐ 1,657

[Official Implementation] mPLUG-Owl & mPLUG-Owl2: Alibaba MLLM Family.

Data Juicer ⭐ 994

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Internlm Xcomposer ⭐ 820

Internvideo ⭐ 736

InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)

Awesome Multimodal Llm ⭐ 243

Research Trends in LLM-guided Multimodal Learning.

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions

Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"

[Paper][Preprint 2023] Making Large Language Models Perform Better in Knowledge Graph Completion

"LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

Llava Docker ⭐ 32

Docker image for LLaVA: Large Language and Vision Assistant

Awesome Multimodal Chatbot ⭐ 30

Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction, such as text, speech, images, and videos, to provide a seamless and versatile user experience.

M3dbench ⭐ 23

M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D vision-centric tasks.

Reform Eval ⭐ 19

A benchmark for evaluating the capabilities of large vision-language models (LVLMs)

Lm Research Hub ⭐ 14

Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)

Awesome Llm For Robotics Reasoning ⭐ 13

LLM for robotics reasoning toward AGI / Awesome repos&surveys / Chain of Thought / LLM / Prompt engineering / Reasoning / Robot / Agent / Planning / Reinforcement Learning / Created by @shure-dev / Check Wiki

Related Searches

Python Multimodal (186)

Artificial Intelligence Multimodal (53)

Llm Multimodal (42)

Pytorch Multimodal (42)

Python Instruction Tuning (36)

Dataset Multimodal (34)

Chatgpt Multimodal (28)

Large Language Models Multimodal (22)

Multimodal Gpt4 (20)

Llm Instruction Tuning (18)
