Awesome Open Source

Programming Languages

Search results for video captioning

video-captioning x

49 search results found

Xmodaler ⭐ 929

X-modaler is a versatile and high-performance codebase for cross-modal analytics(e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Video Caption.pytorch ⭐ 397

pytorch implementation of video captioning

Video2description ⭐ 153

Video to Text: Natural language description generator for some given video. [Video Captioning]

Mlds2018spring ⭐ 106

Machine Learning and having it Deep and Structured (MLDS) in 2018 spring

Video Captioning ⭐ 102

This repository contains the code for a video captioning system inspired by Sequence to Sequence -- Video to Text. This system takes as input a video and generates a caption in English describing the video.

Vidchapters ⭐ 93

[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale

Whisper Auto Transcribe ⭐ 91

Auto transcribe tool based on whisper

Recurrent Transformer ⭐ 87

[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised manner.

Tvcaption ⭐ 72

[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset

Video_captioning_datasets ⭐ 63

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

Video Captioning Models In Pytorch ⭐ 61

A PyTorch implementation of state of the art video captioning models from 2015-2019 on MSVD and MSRVTT datasets.

Awesome Captioning ⭐ 56

A curated list of Multimodal Captioning related research(including image captioning, video captioning, and text captioning)

Crossmodal Contrastive Learning ⭐ 56

CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021

Abivirnet ⭐ 56

Attention Bidirectional Video Recurrent Net

[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

Video Cap ⭐ 44

🎬 Video Captioning: ICCV '15 paper implementation

Srt Webvtt ⭐ 43

Convert SRT formatted subtitle to WebVTT on the fly over HTML5/browser environment

Video2commonsense ⭐ 40

Video captioning baseline models on Video2Commonsense Dataset.

What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment

Densecap ⭐ 33

Dense video captioning in PyTorch

Delving Deeper Into The Decoder For Video Captioning ⭐ 30

Source code for Delving Deeper into the Decoder for Video Captioning

Pytorch_empirical Mvm ⭐ 30

A PyTorch implementation of EmpiricalMVM

[ICIP 2022] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning

Video Caption Opennmt.pytorch ⭐ 26

implement video caption based on openNMT

Llmva Gebc ⭐ 26

Winner solution to Generic Event Boundary Captioning task in LOVEU Challenge (CVPR 2023 workshop)

Adlxmlds2017 ⭐ 25

Deep learning works for ADLxMLDS (CSIE 5431) in NTU

Video_features_extractor ⭐ 22

Python implementation of extraction of several visual features representations from videos

Video2language ⭐ 19

Generating video descriptions using deep learning in Keras

Visual_syntactic_embedding_video_captioning ⭐ 14

Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*

Video Captioning Transformer ⭐ 13

这是一个基于Pytorch平台、Transformer框架实现的视频描述生成 (Video Captioning) 深度学习模型。视频描述生成任务指的是：输入一个视频，输出一句描述整个视频内容的文字（前提是视频较短且可以用一句话来

Awesome Video Text Datasets ⭐ 13

A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.

Mvad Names Dataset ⭐ 12

M-VAD Names Dataset. Multimedia Tools and Applications (2019)

Interactive Keras Captioning ⭐ 12

Interactive multimedia captioning with Keras

Spatial Temporal Adaptive Attention For Video Captioning ⭐ 11

Extension of hLSTMat

Shot2story ⭐ 11

A new multi-shot video understanding benchmark Shot2Story20K with detailed shot-level captions and comprehensive video summaries.

Msr Vtt It ⭐ 11

A large scale dataset for Video Captioning in Italian

Multicaptions ⭐ 10

Multicaptions processes and displays subtitles on a graphic LCD or VFD display while simultaneously playing fullscreen HD video on a separate monitor

A PyTorch implementation of the paper Thinking Hallucination for Video Captioning.

Adlxmlds2017 ⭐ 10

Deep learning projects in ADL (2017 FALL) @ NTU. Implement in Tensorflow and Python. Work on Sequence Labeling, Video Caption, Game Playing, and Comics Generation.

Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

Attentive_specialized_network_video_captioning ⭐ 9

Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*

Official implementation of state-aware video procedural captioning (ACM MM 2021)

Captionomaly Deep Learning Toolbox For Anomaly Captioning ⭐ 6

Source Code for Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos

Video Content Description ⭐ 6

Video content description model for generating descriptions for unconstrained videos

implementation of TDConvED for video captioning

(TIP) Concept-Aware Video Captioning: Describing Videos with Effective Prior Information

Temporal Segment Captioning Network(TSCN).

Related Searches

Python Video Captioning (24)

1-49 of 49 search results

Privacy | About | Terms | Follow Us On Twitter

Copyright 2018-2024 Awesome Open Source. All rights reserved.