Awesome Open Source
Search results for python multimodality
37 search results found
Llava
⭐
12,514
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Deep Daze
⭐
4,104
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
Otter
⭐
3,322
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Big Sleep
⭐
1,726
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
Multimodal Maestro
⭐
871
Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Internlm Xcomposer
⭐
820
Cornac
⭐
782
A Comparative Framework for Multimodal Recommender Systems
Clip4clip
⭐
663
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Fedot
⭐
582
Automated modeling and machine learning framework FEDOT
Woodpecker
⭐
473
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.
Swarms
⭐
376
Build, Deploy, and Scale Reliable Swarms of Autonomous Agents for Workflow Automation. Join our Community: https://discord.gg/DbjBMJTSWD
Gpt4roi
⭐
330
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Sophia
⭐
324
Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs.
Collaborative Diffusion
⭐
320
Collaborative Diffusion (CVPR 2023)
Multi Modality Arena
⭐
308
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
Cm3leon
⭐
288
An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal AI that uses just a decoder to generate both text and images
Mm Diffusion
⭐
287
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Dance
⭐
284
DANCE: A Deep Learning Library and Benchmark Platform for Single-Cell Analysis
X Vlm
⭐
272
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
Gemini
⭐
270
The open source implementation of Gemini, the model that will "eclipse ChatGPT" by Google
Clip Guided Diffusion
⭐
267
A CLI tool/Python module for generating images from text using guided diffusion and CLIP from OpenAI.
Univl
⭐
264
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Multimodal Sentiment Analysis
⭐
205
Attention-based multimodal fusion for sentiment analysis
Mmmu
⭐
167
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Mmmot
⭐
147
[ICCV2019] Robust Multi-Modality Multi-Object Tracking
Uvtr
⭐
137
Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)
How2 Dataset
⭐
125
This repository contains code and metadata of How2 dataset
Fuse Med Ml
⭐
121
A Python framework accelerating ML-based discovery in the medical field by encouraging code reuse. Batteries included :)
Fusilli
⭐
120
A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evaluation - fusilli's got you covered 🌸
The Compiler
⭐
119
Seed, Code, Harvest: Grow Your Own App with Tree of Thoughts!
Pali3
⭐
97
Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"
Emmental
⭐
93
A deep learning framework for building multimodal multi-task learning systems.
Andromeda
⭐
92
An all-new language model that processes ultra-long sequences of 100,000+ tokens, ultra-fast
Cvpr21chal Slr
⭐
89
This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.
Mediar
⭐
83
(NeurIPS 2022 CellSeg Challenge - 1st Winner) Open source code for "MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy"
Mirasol Pytorch
⭐
74
Implementation of 🌻 Mirasol, SOTA multimodal autoregressive model out of Google DeepMind, in PyTorch
Prompt Highlighter
⭐
69
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Swarms Pytorch
⭐
67
Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊
Multi_token
⭐
54
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
Pali
⭐
42
Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"
Kosmos2.5
⭐
34
My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"
Loris
⭐
31
Long-Term Rhythmic Video Soundtracker, ICML2023
Tfce_mediation
⭐
27
Fast regression and mediation analysis of vertex or voxel MRI data with TFCE
Trar Vqa
⭐
23
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task
Composeae
⭐
23
Official code for WACV 2021 paper - Compositional Learning of Image-Text Query for Image Retrieval
Cris.pytorch
⭐
20
An official PyTorch implementation of the CRIS paper
Visdial
⭐
20
Visual Dialog: Light-weight Transformer for Many Inputs (ECCV 2020)
Matcha Agent
⭐
16
Official implementation of Matcha-agent
Behaviopy
⭐
15
Behavioral data analysis and plotting in Python.
Documentclip
⭐
14
Corri2p
⭐
14
The code of CorrI2P
Semantic_segmentation
⭐
11
KERAS: Multimodal Deep Learning for Semantic Segmentation (RGB, NIR Streams) - multiple architectures
Moca
⭐
10
The implementation of MoCA
Mn Torch
⭐
9
Mode normalization (in PyTorch).
Pywikimm
⭐
9
Collects a multimodal dataset of Wikipedia articles and their images
Prml
⭐
9
Multimodal Fully Convolutional Neural networks for Semantic Segmentation.
Mmca
⭐
8
The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention"
Ai2d Rst
⭐
8
A repository for the AI2D-RST corpus.
Gato
⭐
8
Plug-and-play implementation of "A Generalist Agent" by DeepMind.
Tinygptv
⭐
7
Simple Implementation of TinyGPTV in super simple Zeta lego blocks
Multimodal Autoencoder For Breast Cancer
⭐
7
Prognostically Relevant Subtypes and Survival Prediction for Breast Cancer Based on Multimodal Genomics Data
Gats
⭐
6
Implementation of GATS from the paper "GATS: Gather-Attend-Scatter" in PyTorch and Zeta
Acl2018 Multimodalmultitasksentimentanalysis
⭐
6
Code for the ACL 2018 Multimodal Language Workshop paper
Multimodal Tot
⭐
6
Multi-Modal Tree of thoughts for DALLE-3 like auto self improvement
Memsem
⭐
5
A Multi-modal Framework for Sentiment Analysis of Memes
Diverse_sampling
⭐
5
Official project of DiverseSampling (ACMMM2022 Paper)
Mlxtransformer
⭐
5
Simple Implementation of a Transformer in the new framework MLX by Apple
Copyright 2018-2024 Awesome Open Source. All rights reserved.