Awesome Open Source
Search results for python multi modal learning
37 search results found
Chinese Clip
⭐
2,816
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Hcaptcha Challenger
⭐
1,247
🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
Prismer
⭐
1,245
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
Macaw Llm
⭐
1,090
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
X Clip
⭐
599
A concise but complete implementation of CLIP with various experimental improvements from recent papers
Cvpr 2023 Papers
⭐
185
CVPR 2023 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. A star in the development of visual intelligence!
Nerco
⭐
163
[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement
Embodiedscan
⭐
130
[arXiv 2023] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Achelous
⭐
116
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Zeta
⭐
106
Build high-performance AI models with modular building blocks
Recon
⭐
76
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
D Cube
⭐
56
A detection/segmentation dataset with class names characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
Japanese Clip
⭐
54
Japanese CLIP by rinna Co., Ltd.
Aurora
⭐
49
[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
Cgdetr
⭐
43
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
Sugar Crepe
⭐
40
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
Uavm
⭐
32
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
Unav
⭐
31
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
Hyperdensenet_pytorch
⭐
29
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation
Trar Vqa
⭐
23
Official PyTorch implementation of the ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" for the VQA task
Mmea
⭐
22
MMEA: Entity Alignment for Multi-Modal Knowledge Graphs
Acmvh
⭐
21
Adaptive Confidence Multi-View Hashing
Hgclip
⭐
21
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding
Zsd Sc Resolver
⭐
19
Resolving semantic confusions for improved zero-shot detection (BMVC 2022)
Mrm Pytorch
⭐
18
An official implementation of Advancing Radiograph Representation Learning with Masked Record Modeling (ICLR'23)
Nemar
⭐
15
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Wss Cmer
⭐
14
Code for the paper "Weakly supervised segmentation with cross-modality equivariant constraints", available at https://arxiv.org/pdf/2104.02488.pdf
Neuralmerger
⭐
13
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelligence, IJCAI-ECAI-2018
Multimodal Remote Sensing Toolkit
⭐
13
A Python tool to perform deep learning experiments on multimodal remote sensing data.
Xmodal Vit
⭐
12
Official implementation of "Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval", BMVC 2022.
Multimodal Math Pretraining
⭐
11
[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"
Managertower
⭐
9
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
Gimme_signals_action_recognition
⭐
9
Multi-Modal action recognition for skeleton sequences, inertial measurements, motion capturing data and Wi-Fi CSI fingerprints.
Dramaqa
⭐
8
DramaQA Starter Code (2021)
Ekr
⭐
6
Elysium Knowledge Repository is an open source initiative to embed all of Humanity's multi-modal knowledge and wisdom.
M2hse
⭐
6
PyTorch code for the paper "Complementarity is the king: A multi-modal and multi-grained hierarchical semantic enhancement network for cross-modal retrieval"
Multiviewcropclassification
⭐
5
Public repository of our IGARSS 2023 submission
Related Searches
Python Dataset (14,792)
Python Machine Learning (14,099)
Python Tensorflow (13,736)
Python Deep Learning (13,092)
Python Natural Language Processing (9,064)
Python Artificial Intelligence (8,580)
Python Pytorch (7,877)
Python Convolutional Neural Networks (6,861)
Python Paper (6,578)
Python Segmentation (4,571)
Copyright 2018-2024 Awesome Open Source. All rights reserved.