Awesome Open Source
Search results for video understanding
113 search results found
Mmaction2
⭐
3,647
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Awesome Action Recognition
⭐
3,494
A curated list of action recognition and related area resources
Ask Anything
⭐
2,404
[VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Kubric
⭐
2,089
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Temporal Shift Module
⭐
2,005
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Awesome Video Diffusion
⭐
1,793
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Mmaction
⭐
1,415
An open-source toolbox for action understanding based on PyTorch
Paddlevideo
⭐
1,355
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.
Temporal Segment Networks
⭐
1,235
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Videomae
⭐
864
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Tsn Pytorch
⭐
751
Temporal Segment Networks (TSN) in PyTorch
Internvideo
⭐
736
InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)
Awesome Grounding
⭐
689
awesome grounding: A curated list of research papers in visual grounding
Action Detection
⭐
551
temporal action detection with SSN
Activity Recognition With Cnn And Rnn
⭐
402
Temporal Segments LSTM and Temporal-Inception for Activity Recognition
Mevis
⭐
388
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Chat Univi
⭐
382
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Video Understanding Dataset
⭐
352
A collection of recent video understanding datasets, under construction!
Tdn
⭐
307
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Tvnet
⭐
295
End-to-End Learning of Motion Representation for Video Understanding
Specvqgan
⭐
262
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
Multiverse
⭐
222
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction". And for the ECCV'20 SimAug paper.
Tevit
⭐
209
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
Actionvlad
⭐
201
ActionVLAD for video action classification (CVPR 2017)
Tadaconv
⭐
177
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
Movienet Tools
⭐
174
Tools for movie and video research
Youtube 8m
⭐
166
The 2nd place Solution to the Youtube-8M Video Understanding Challenge by Team Monkeytyping (based on tensorflow)
Phar
⭐
150
deep learning sex position classifier
Cap4video
⭐
149
【CVPR'2023 Highlight】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Text4vis
⭐
149
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Object_level_visual_reasoning
⭐
148
PyTorch implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori, ECCV 2018
Video2tfrecord
⭐
142
Easily convert RGB video data (e.g. .avi) to the TensorFlow tfrecords file format for training, e.g., a NN in TensorFlow. This implementation allows limiting the number of frames per video stored in the tfrecords.
Step
⭐
141
CVPR2019 STEP: Spatio-Temporal Progressive Learning for Video Action Detection
Video Contrastive Learning
⭐
139
Video Contrastive Learning with Global Context, ICCVW 2021
Videomaev2
⭐
134
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Tubedetr
⭐
127
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Bike
⭐
127
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Awesome Activity Prediction
⭐
127
Paper list of activity prediction and related area
Sstda
⭐
122
[CVPR 2020] Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation (PyTorch)
Frozenbilm
⭐
120
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
I3d_finetune
⭐
104
TensorFlow code for finetuning I3D model on UCF101.
Just Ask
⭐
101
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Dear
⭐
100
[ICCV 2021 Oral] Deep Evidential Action Recognition
Mvfnet
⭐
97
【AAAI 2021】MVFNet: Multi-View Fusion Network for Efficient Video Recognition
Vidchapters
⭐
93
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Motionsqueeze
⭐
92
Official PyTorch Implementation of MotionSqueeze, ECCV 2020
Pyanomaly
⭐
92
Useful Toolbox for Anomaly Detection
S3d.pytorch
⭐
89
Spatiotemporal-separable 3D convolution network.
Mmpd_rppg_dataset
⭐
75
MMPD: Multi-Domain Mobile Video Physiology Dataset (EMBC 2023 Oral)
Next Qa
⭐
74
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
Vov3d
⭐
73
Efficient 3D Backbone Network for Temporal Modeling
Mmn
⭐
64
[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Cater
⭐
63
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Stale
⭐
63
[ECCV 2022] Official PyTorch implementation of the paper "Zero-Shot Temporal Action Detection via Vision-Language Prompting"
Awesome Temporally Language Grounding
⭐
59
A curated list of “Temporally Language Grounding” and related area
Din Group Activity Recognition Benchmark
⭐
53
[ICCV 2021] A new codebase containing various methods for Group Activity Recognition. Paper title: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition.
Mini Kinetics 200
⭐
51
Mini-Kinetics-200 data splits used in paper "Rethinking Spatiotemporal Feature Learning For Video Understanding"
Graph_distillation
⭐
51
Graph Distillation for Action Detection
Temporally Language Grounding
⭐
49
A PyTorch implementation of some state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"
Sdn
⭐
48
[NeurIPS 2019] Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition
Icme2019 Ctr
⭐
47
The Code for ICME2019 Grand Challenge: Short Video Understanding (Single Model Ranks 6th)
Avion
⭐
46
Code release for "Training a Large Video Model on a Single Machine in a Day"
Temporal Shift Module
⭐
46
Unofficial implementation for paper `Temporal Shift Module for Efficient Video Understanding`
I3d Tensorflow
⭐
45
Inflated 3D ConvNets for video understanding
Cgdetr
⭐
43
Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
Mtl Aqa
⭐
38
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment
Mvd
⭐
37
[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv.org/abs/2212.04500)
Openpvsg
⭐
33
Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23
Pointtad
⭐
30
[NeurIPS 2022] PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points
Testa
⭐
29
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Glimpse_clouds
⭐
28
PyTorch implementation of the paper "Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points", F. Baradel, C. Wolf, J. Mille, G.W. Taylor, CVPR 2018
Cola
⭐
27
[Codes of CVPR'21 paper] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning
Pi Consistency Activity Detection
⭐
26
End-to-End Semi-Supervised Learning for Video Action Detection [CVPR 2022]
Stcnet
⭐
24
STCNet: Spatio-Temporal Cross Network for Industrial Smoke Detection
Rdn4depth
⭐
22
Region Deformer Networks for Unsupervised Depth Estimation from Unconstrained Monocular Videos, IJCAI 2019
Mica Movieclip
⭐
22
This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies
Gravi T
⭐
21
Graph learning framework for long-term video understanding
Tcgl
⭐
20
[IEEE T-IP 2022] TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
Sakdn
⭐
19
[IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition
Progressive Action Prediction
⭐
18
[CVPR 2023] Code for action prediction from videos
Cv Datasets
⭐
18
Collection of open datasets in computer vision.
Soccerdb
⭐
17
SoccerDB: A Large-Scale Database for Comprehensive Video Understanding
Temporal Augmentation
⭐
17
Temporal augmentation with two-stream ConvNet features on human action recognition
Fitness Aqa
⭐
16
Fitness Action Quality Assessment or your AI-Fitness Coach [ECCV 2022]
Spot
⭐
16
[ECCV 2022] Official PyTorch implementation of the paper "Semi-Supervised Temporal Action Detection with Proposal-Free Masking"
Ltcontext
⭐
16
[ICCV 2023] How Much Temporal Long-Term Context is Needed for Action Segmentation?
Tem Adapter
⭐
15
[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
Image Pretraining For Video
⭐
15
[ECCV 2022] This repository includes the official implementation of our paper "In Defense of Image Pre-Training for Spatiotemporal Recognition".
Cp 360 Weakly Supervised Saliency
⭐
13
CP-360-Weakly-Supervised-Saliency
Videomae Action Detection
⭐
13
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
Orbit 2022 Winner Method
⭐
13
Few-Shot Video Object Recognition with Embedding Adaptation and Uniform Clip Sampling: Winner of ORBIT Few-Shot Object Recognition Challenge 2022
Opra
⭐
11
Online Product Reviews for Affordances
Region Based Non Local Network
⭐
11
[Codes of paper]: Region-based Non-local operation for Video Classification
Sst
⭐
11
My implementation (PyTorch) for the paper SST: Single-Stream Temporal Action Proposals (http://vision.stanford.edu/pdf/buch2017cvpr.pdf).
Cvpr2023 Cmpae
⭐
10
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Bcnet
⭐
10
【AAAI 2022】Temporal Action Proposal Generation with Background Constraint
Piano Skills Assessment
⭐
10
Piano Skills Assessment [IEEE MMSP 2021]
Deepepisodicmemory
⭐
10
Deep neural network architecture for representing robot experiences in an episodic-like memory which facilitates encoding, recalling, and predicting action experience - Research Project at KIT's High Performance Humanoids Technologies Lab (H2T)
C3d Lstm Pytorch
⭐
9
C3D-LSTM implementation in PyTorch
Vtc
⭐
8
VTC: Improving Video-Text Retrieval with User Comments
1-100 of 113 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.