Awesome Open Source
Search results for pytorch multimodal
39 search results found
Mmf
⭐
5,414
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Fengshenbang Lm
⭐
3,670
Fengshenbang-LM (封神榜大模型) is a large-model project led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute.
Mmpretrain
⭐
3,114
OpenMMLab Pre-training Toolbox and Benchmark
Chinese Clip
⭐
2,816
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Docarray
⭐
2,721
Represent, send, store and search multimodal data
Deepke
⭐
2,679
An Open Toolkit for Knowledge Graph Extraction and Construction published at EMNLP2022 System Demonstrations.
Mplug Owl
⭐
1,657
[Official Implementation] mPLUG-Owl & mPLUG-Owl2: Alibaba MLLM Family.
Autodistill
⭐
1,286
Images to inference with no labeling (use foundation models to train supervised models)
Data Juicer
⭐
994
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Medmnist
⭐
903
[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification
Uform
⭐
729
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Omml
⭐
528
Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.
Blended Latent Diffusion
⭐
458
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
Pykale
⭐
415
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
Specvqgan
⭐
262
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
Oasis
⭐
260
Official implementation of the paper "You Only Need Adversarial Supervision for Semantic Image Synthesis" (ICLR 2021)
Deepviewagg
⭐
195
[CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"
Emogen
⭐
189
PyTorch Implementation for Paper "Emotionally Enhanced Talking Face Generation"
Vlmevalkit
⭐
137
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks
Fusilli
⭐
120
A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evaluation - fusilli's got you covered 🌸
Vldet
⭐
117
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
Solc
⭐
109
Remote sensing SAR-optical land-use classification in PyTorch: high-resolution remote sensing semantic segmentation, land-cover segmentation, and land-cover classification
Mdvc
⭐
106
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
Zeta
⭐
106
Build high-performance AI models with modular building blocks
Mkgformer
⭐
97
Code for the SIGIR2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion."
Rstnet
⭐
95
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
Vision Language Models Are Bows
⭐
95
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
Clip4cir
⭐
92
[ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features
Poda
⭐
86
[ICCV 2023] Official implementation of "PØDA: Prompt-driven Zero-shot Domain Adaptation"
Audio Lyrics Emotion Recognition
⭐
68
(Unofficial) PyTorch Implementation of Music Mood Detection Based On Audio And Lyrics With Deep Neural Net
Hvpnet
⭐
66
Code for the NAACL2022 paper "Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction"
Kosmos X
⭐
53
The Next Generation Multi-Modality Superintelligence
Embracenet
⭐
50
Robust multimodal integration method implemented in PyTorch and TensorFlow
Cvt2distilgpt2
⭐
46
Improving Chest X-Ray Report Generation by Leveraging Warm-Starting
Keymorph
⭐
38
Robust multimodal brain registration via keypoints
Sparsesync
⭐
38
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
Iperceive
⭐
36
Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | Python3 | PyTorch | CNNs | Causality | Reasoning | LSTMs | Transformers | Multi-Head Self Attention | Published in IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
Mambatransformer
⭐
31
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
Ner Multimodal Pytorch
⭐
30
PyTorch Implementation of "Adaptive Co-attention Network for Named Entity Recognition in Tweets" (AAAI 2018)
Inverse Dall E For Optical Character Recognition
⭐
24
Inverse DALL-E for Optical Character Recognition
Mac
⭐
24
An end-to-end masked contrastive video-and-language pre-training framework
Modality Transferable Mer
⭐
23
Modality-Transferable-MER, multimodal emotion recognition model with zero-shot and few-shot abilities.
Trar Vqa
⭐
23
Official PyTorch implementation of the ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" for the VQA task
Protein Localization Transformer
⭐
22
Code for CELL-E: Biological Zero-Shot Text-to-Image Synthesis for Protein Localization Prediction
Slp
⭐
20
Utilities and modules for speech, language, and multimodal processing using PyTorch and PyTorch Lightning
Mmnet For Alzheimer Classification Using Smri
⭐
20
5th Place Solution to HUAWEI PRCV Challenge 2021 Alzheimer's Disease Classification Task
Neko
⭐
19
In Progress Implementation of GATO style Generalist Multimodal model capable of image, text, RL and Robotics tasks
Hashtag Prediction Pytorch
⭐
17
Multimodal Hashtag Prediction with Instagram data & PyTorch (2nd Place on OpenResource Hackathon 2019)
Concatbert
⭐
17
Baseline model for multimodal classification based on images and text. Text representation obtained from pretrained BERT base model and image representation obtained from VGG16 pretrained model.
Mt Net
⭐
16
Multi-scale Transformer Network for Cross-Modality MR Image Synthesis (IEEE TMI)
Mplug
⭐
15
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
Nemar
⭐
15
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Ocrautoscore
⭐
14
Automated OCR-based exam grading project
Vipformer
⭐
13
[ICRA 2023] ViPFormer: Efficient Vision-and-Pointcloud Transformer for Unsupervised Pointcloud Understanding. https://arxiv.org/abs/2303.14376
Mmtod
⭐
13
Multi-modal Thermal Object Detector
Gantree
⭐
9
Code release for "GAN-Tree: An Incrementally Learned Hierarchical Generative Framework for Multi-Modal Data Distributions", ICCV 2019
Avlit
⭐
8
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Pacs
⭐
7
Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)
Flair 2
⭐
7
Engage in a semantic segmentation challenge for land cover description using multimodal remote sensing earth observation data, delving into real-world scenarios with a dataset comprising 70,000+ aerial imagery patches and 50,000 Sentinel-2 satellite acquisitions.
Deepgcca Pytorch
⭐
5
An implementation of Deep Generalized Canonical Correlation Analysis (DGCCA or Deep GCCA) with PyTorch.
Cbvs Uniclip
⭐
5
A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios
Image Caption
⭐
5
PyTorch implementation of image captioning based on attention mechanism