Awesome Open Source
Search results for multimodal vision language model
multimodal
vision-language-model
8 search results found
Llava
⭐
12,514
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Internlm Xcomposer
⭐
820
Awesome Japanese Llm
⭐
585
Overview of Japanese LLMs
Advancedliteratemachinery
⭐
464
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Alibaba DAMO Academy.
Multi_token
⭐
54
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Awesome Multimodal Llm Autonomous Driving
⭐
35
Multimodal Large Language Models for Autonomous Driving [WACV 2024 Survey Paper]
Llava Docker
⭐
32
Docker image for LLaVA: Large Language and Vision Assistant
Spec
⭐
6
The official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding"
Cbvs Uniclip
⭐
5
A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios