Awesome Open Source
Search results for llm multimodal
29 search results found
Unilm
⭐
16,971
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Interngpt
⭐
2,976
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, and more. Try it at igpt.opengvlab.com.
Awesome Aigc Tutorials
⭐
2,879
Curated tutorials and resources for Large Language Models, AI Painting, and more.
Next Gpt
⭐
2,602
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Lisa
⭐
1,206
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Data Juicer
⭐
994
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Modelfusion
⭐
712
The TypeScript library for building AI applications.
Fastrag
⭐
591
Efficient Retrieval Augmentation and Generation Framework
Awesome Japanese Llm
⭐
585
Overview of Japanese LLMs
Swift
⭐
578
LLM training/inference/deployment framework of the ModelScope community, supporting various models such as LLaMA, Qwen, ChatGLM, and Baichuan, and training methods such as LoRA, ResTuning, and NEFTune.
Llavavision
⭐
409
A simple "Be My Eyes" web app with a llama.cpp/llava backend
Agentchain
⭐
355
Chain together LLMs for reasoning & orchestrate multiple large models for accomplishing complex tasks
Awesome Multimodal Llm
⭐
243
Research Trends in LLM-guided Multimodal Learning.
Agi Papers
⭐
218
Papers and books to look at when starting AGI 📚
Bliva
⭐
181
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Recommendation Systems Without Explicit Id Features A Literature Review
⭐
171
Large pre-trained Foundation recommender models
Mmmu
⭐
167
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Vlmevalkit
⭐
137
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks
Visual Chinese Llama Alpaca
⭐
129
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)
Llava Cpp Server
⭐
116
LLaVA server (llama.cpp).
Awesome Colorful Llm
⭐
83
Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics, and Fundamental Sciences such as Mathematics.
Ll3da
⭐
65
"LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
Gemini Pro Bot
⭐
59
A Python Telegram bot powered by Google's gemini-pro LLM API
Yuren Baichuan 7b
⭐
58
An open-source multimodal large language model based on baichuan-7b
Multi_token
⭐
54
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
Keras Llm Robot
⭐
46
A web UI project for learning about large language models. It includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.
Videodb Python
⭐
37
VideoDB Python SDK
Vle
⭐
33
VLE: Vision-Language Encoder (a vision-language multimodal pre-trained model)
Llava Docker
⭐
32
Docker image for LLaVA: Large Language and Vision Assistant
Figstep
⭐
32
Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Ame
⭐
24
State-of-the-art, multi-modal virtual assistant framework powered by LLaMA. Ame is not complete and is under active development.
M3dbench
⭐
23
M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. Furthermore, M3DBench provides a new benchmark to assess large models across 3D vision-centric tasks.
Reform Eval
⭐
19
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
Mmgl
⭐
19
Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs
Visualwebarena
⭐
19
VisualWebArena is a benchmark for multimodal agents.
Lm Research Hub
⭐
14
Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)
Real Gemini
⭐
14
A framework for real-time video understanding and interaction, built on a large multimodal model, that answers questions and communicates with the world through text, audio, image, and video.
Awesome Llm For Robotics Reasoning
⭐
13
LLM for robotics reasoning toward AGI / Awesome repos&surveys / Chain of Thought / LLM / Prompt engineering / Reasoning / Robot / Agent / Planning / Reinforcement Learning / Created by @shure-dev / Check Wiki
Pdx
⭐
7
Prompt Engineering and Dev-Ops toolkit for applications powered by Language Models
Metatron2
⭐
5
A Multimodal Discord bot with machine learning functions, including LLM chat, Image generation, and Speech Generation capabilities
Idforrec
⭐
5
Is ID embedding necessary for multimodal recommender system?
Related Searches
Python Llm (1,377)
Artificial Intelligence Llm (599)
Openai Llm (569)
Chatgpt Llm (373)
Natural Language Processing Llm (285)
Llm Gpt (260)
Typescript Llm (258)
Llm Llama (228)
Machine Learning Llm (214)
Llm Large Language Models (213)
Copyright 2018-2024 Awesome Open Source. All rights reserved.