Awesome Open Source

Search results for vision transformer

248 search results found

Ceit Pytorch ⭐ 84

Implementation of Convolutional enhanced image Transformer

(NeurIPS 2022 CellSeg Challenge - 1st Winner) Open source code for "MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy"

Pytorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer"

🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle

Combining Efficientnet And Vision Transformers For Video Deepfake Detection ⭐ 81

Code for Video Deepfake Detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection" presented at ICIAP 2021.

Tensorflow implementation for "Improved Transformer for High-Resolution GANs" (NeurIPS 2021).

[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

Image Classification Pytorch ⭐ 80

Learning and Building Convolutional Neural Networks using PyTorch

Recurrent Video Restoration Transformer with Guided Deformable Attention (NeurIPS 2022, official repository)

Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"

The official repo for [Arxiv'23] "Vision Transformer with Quadrangle Attention"

This repo contains the official implementation of ICCV 2023 paper "Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?"

Official Implementation of ResViT: Residual Vision Transformers for Multi-modal Medical Image Synthesis

Patchmix ⭐ 68

The official implementation of paper: "Inter-Instance Similarity Modeling for Contrastive Learning"

Pytorch Cifar Model Hub ⭐ 65

Implementation of Conv-based and Vit-based networks designed for CIFAR.

Using pretrained encoder and language models to generate captions from multimedia inputs.

Sota Backbones ⭐ 64

A collection of SOTA Image Classification Models in PyTorch

DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022

Vit Anti Oversmoothing ⭐ 62

[ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang

Rel_pose ⭐ 59

Official Repository for the 3DV 2022 paper "The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs"

[CVPR'23] The official PyTorch implementation of our CVPR 2023 paper: "Generalized Relation Modeling for Transformer Tracking".

CounTR: Transformer-based Generalised Visual Counting

Scigraphqa ⭐ 58

Clip_surgery ⭐ 55

CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks

Official PyTorch implementation of the paper: Flow Matching in Latent Space

Vmformer ⭐ 54

[Preprint] VMFormer: End-to-End Video Matting with Transformer

Transformers4vision ⭐ 54

A summarization of Transformer-based architectures for CV tasks, including image classification, object detection, segmentation, and Few-shot Learning. Keep updated frequently.

[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.

[ICCV 2023] Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

Vitae Transformer Scene Text Detection ⭐ 51

A comprehensive list of our research works related to scene text detection and spotting, including papers, codes, and citations. Note: The official repo for [IJCV'22] "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection" has been moved to: https://github.com/ViTAE-Transformer/I3CL

Self Supervised Vit Path ⭐ 50

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Icolorit ⭐ 48

Official PyTorch implementation of "iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision Transformer." (WACV 2023)

Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer

Cvt2distilgpt2 ⭐ 46

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer"

Vision Transformer From Scratch ⭐ 43

A Simplified PyTorch Implementation of Vision Transformer (ViT)

The official implementation for NeurIPS 2022 Spotlight Neural Shape Deformation Priors

Ecoformer ⭐ 42

[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity"

The official implementation of "Asymmetric Patch Sampling for Contrastive Learning"

Video Deblurring with Transformer

Point2vec ⭐ 40

Self-Supervised Representation Learning on Point Clouds (GCPR 2023 | T4V Workshop @ CVPR 2023)

[ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang

The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"

[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv.org/abs/2212.04500)

Oreilly Hands On Transformers ⭐ 37

Hands on NLP and Computer Vision with Transformers

Efficient Attention ⭐ 37

[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling

Sparseformer ⭐ 36

The official implementation of SparseFormer

Awesome Multimodal Llm Autonomous Driving ⭐ 35

Multimodal Large Language Models for Autonomous Driving [WACV 2024 Survey Paper]

This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"

G Universal Clip ⭐ 33

4th place solution for the Google Universal Image Embedding Kaggle Challenge. Instance-Level Recognition workshop at ECCV 2022

Vit Finetune ⭐ 32

Fine-tuning Vision Transformers on various classification datasets

Wacv 2024 Papers ⭐ 32

WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

Code of TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation, WACV 2023

Tokenmixup ⭐ 30

Official pytorch implementation of NeurIPS 2022 paper, TokenMixup

Soft Mixture Of Experts ⭐ 30

PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)

Fastdifferentiablematsqrt ⭐ 28

Official Pytorch implementation of ICLR 22 paper "Fast Differentiable Matrix Square Root"

Flexivit ⭐ 26

PyTorch reimplementation of FlexiViT: One Model for All Patch Sizes

Detecting Images Generated By Diffusers ⭐ 26

Vision Diffmask ⭐ 26

Official PyTorch implementation of Vision DiffMask, a post-hoc interpretation method for vision models.

Semireward ⭐ 26

[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning

Retnet_vit Rmt ⭐ 25

Fpvt_bmvc22 ⭐ 25

Code of Pyramid Vision Transformer at BMVC 2022

Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral Presentation)

Mintime Multi Identity Size Invariant Timesformer For Video Deepfake Detection ⭐ 25

Code for Video Deepfake Detector from "MINTIME: Multi-Identity Size-Invariant Video Deepfake Detection", pre-print available on Arxiv

Rethinkvsralignment ⭐ 25

(NeurIPS 2022) Rethinking Alignment in Video Super-Resolution Transformers

An end-to-end masked contrastive video-and-language pre-training framework

An official Pytorch implementation of "Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers", CVPR 2023.

Regionproxy ⭐ 23

[CVPR22] Official codebase of Semantic Segmentation by Early Region Proxy.

Patch Fool ⭐ 22

[ICLR 2022] "Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?" by Yonggan Fu, Shunyao Zhang, Shang Wu, Cheng Wan, Yingyan Lin

Self-Supervised Vision Transformers for multiplexed imaging datasets

[MICCAI 2023] MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets (an official implementation)

Kaggle_leaf_disease_classification ⭐ 22

Cassava leaf disease classification with CNNs and Transformers (top-1% Kaggle solution)

Keras Vision Transformer ⭐ 22

The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET

Official repository for "Self-Distilled Vision Transformer for Domain Generalization" (ACCV-2022 ORAL)

Transoar ⭐ 22

A 3D medical Detection Transformer library. Papers accepted @ MIDL & MELBA.

[ICCV'21] [Tensorflow] Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers

Polsarformer ⭐ 22

Code for the paper "Local Window Attention Transformer for Polarimetric SAR Image Classification", published in IEEE Geoscience and Remote Sensing Letters.

Vision_transformers ⭐ 21

Vision Transformers for image classification, image segmentation, and object detection.

[ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou

Adversarial Automixup ⭐ 21

Official PyTorch(MMCV) implementation of “Adversarial AutoMixup” (ICLR 2024 spotlight)

Code for ViTAS: Vision Transformer Architecture Search

Deep Hash Distillation ⭐ 20

Deep Hash Distillation for Image Retrieval - ECCV 2022

Celebfaces_attributes_classification ⭐ 19

This repository accompanies a project for the Introduction to Numerical Imaging course (Introduction à l'Imagerie Numérique in French), given by the MVA Masters program at ENS-Paris Saclay. It was built entirely from scratch and contains PyTorch Lightning code to train and then use a neural network for image classification. We used it to create a classifier for semantic attributes of faces, trained on the CelebA-HQ dataset.

This repo contains the code for the CVPR 2023 paper: "CrOC : Cross-View Online Clustering for Dense Visual Representation Learning".

Structtoken ⭐ 18

StructToken : Rethinking Semantic Segmentation with Structural Prior

Pytorch Vision Transformer Vit Mnist ⭐ 18

Simplified Pytorch implementation of Vision Transformer (ViT) for MNIST dataset.

Deepvision ⭐ 18

PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/implementations - Vision Transformer (ViT), ResNetV2, EfficientNetV2, NeRF, SegFormer, MixTransformer, (planned...) DeepLabV3+, ConvNeXtV2, YOLO, etc.

Safe Self-Refinement for Transformer-based Domain Adaptation (CVPR 2022)

Doprompt ⭐ 17

Official implementation of PCS in essay "Prompt Vision Transformer for Domain Generalization"

Vit Pytorch ⭐ 17

PyTorch implementation of the vision transformer

Protopformer ⭐ 17

ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition

Lithography Hotspot Detection ⭐ 17

Detects hotspots in the lithography process using Vision Transformers, Convolutional Neural Networks, and Artificial Neural Networks, and compares the results obtained using ANNs and CNNs

Vision_transformer_tf ⭐ 16

This repository contains the TensorFlow implementation of the paper "AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE" known as vision transformers.

The official implementation of the ICML 2023 paper OFQ-ViT

[ICASSP 2023] Official Implementation of ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

Buildformer ⭐ 15

Building Extraction from remote sensing image using Vision Transformer, IEEE Transactions on Geoscience and Remote Sensing, 2022

Self Supervised Distillation ⭐ 15

Easy-to-read implementation of self-supervised learning using vision transformer and knowledge distillation with no labels - DINO 😃

Vit Cifar10 Pruning ⭐ 14

Vision Transformer Pruning

Documentclip ⭐ 14

Official implementation of "Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation"

101-200 of 248 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.