3D-DenseNet

3D Dense Connected Convolutional Network (3D-DenseNet for action recognition)
3D-DenseNet with TensorFlow

Expand the `Densely Connected Convolutional Networks` (DenseNets) to a 3D-DenseNet for action recognition (video classification):

  • 3D-DenseNet - without bottleneck layers
  • 3D-DenseNet-BC - with bottleneck layers
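As background, in a DenseNet each layer receives the concatenation of all earlier feature maps, so with growth rate k the channel count grows linearly with depth. A small pure-Python sketch of that arithmetic (illustrative only; the helper name is not from the repository):

```python
def densenet_channels(input_channels, num_layers, growth_rate):
    """Channel count seen at each layer of one dense block: layer l receives
    input_channels + l * growth_rate channels, because every earlier layer
    contributes growth_rate new feature maps that are concatenated along
    the channel axis."""
    return [input_channels + l * growth_rate for l in range(num_layers + 1)]

# e.g. a block of 4 layers with growth rate k = 12 on a 16-channel input
print(densenet_channels(16, 4, 12))  # [16, 28, 40, 52, 64]
```

The bottleneck ("-BC") variant inserts a 1x1x1 convolution before each 3x3x3 convolution precisely to cap this growing input width.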

Each model can be tested on action-recognition datasets such as UCF101 and the MERL Shopping dataset.

The number of layers and blocks, the growth rate, video normalization, and other training parameters can be changed through the shell or inside the source code.

There are also many other DenseNet implementations that may be useful.

Prerequisite libraries

  • python2
  • tensorflow 1.0
  • opencv2 for python2

Step 1: Data preparation (UCF dataset example)

  1. Download the UCF101 (Action Recognition Data Set).
  2. Extract the UCF101.rar file and you will get ../UCF101/<action_name>/<video_name.avi> folder structure.
  3. Use the ./data_prepare/convert_video_to_images.sh script to decode the UCF101 video files into image files.
    • run ./data_prepare/convert_video_to_images.sh ../UCF101 25 (the number 25 is the frame rate in fps)
  4. Use the ./data_prepare/convert_images_to_list.sh script to create/update the {train,test}.list files according to the new UCF101 image folder structure generated in the last step.
    • run ./data_prepare/convert_images_to_list.sh ../UCF101 4; this updates the test.list and train.list files (the number 4 means the ratio of test to train data is 1/4)
    • train.list example:
      ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01 0
      ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c02 0
      ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c03 0
      ApplyLipstick/v_ApplyLipstick_g01_c01 1
      ApplyLipstick/v_ApplyLipstick_g01_c02 1
      ApplyLipstick/v_ApplyLipstick_g01_c03 1
      Archery/v_Archery_g01_c01 2
      Archery/v_Archery_g01_c02 2
      Archery/v_Archery_g01_c03 2
      Archery/v_Archery_g01_c04 2
      BabyCrawling/v_BabyCrawling_g01_c01 3
      BabyCrawling/v_BabyCrawling_g01_c02 3
      BabyCrawling/v_BabyCrawling_g01_c03 3
      BabyCrawling/v_BabyCrawling_g01_c04 3
      BalanceBeam/v_BalanceBeam_g01_c01 4
      BalanceBeam/v_BalanceBeam_g01_c02 4
      BalanceBeam/v_BalanceBeam_g01_c03 4
      BalanceBeam/v_BalanceBeam_g01_c04 4
  5. Copy or move the test.list and train.list files to the root of the video folder (../UCF101).
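The list files pair each clip's relative path with an integer class label, as shown in the example above. A stdlib-only Python sketch of the idea behind convert_images_to_list.sh (the actual script is a shell script; the function name make_lists and the exact split rule used here are illustrative assumptions, and the repository's split may differ):

```python
import os

def make_lists(data_dir, test_every=4):
    """Sketch of the list-building step: each action folder gets an integer
    label (in sorted order) and each clip becomes a line of the form
    '<action>/<video_dir> <label>'. Every `test_every`-th clip goes to the
    test list; the rest go to the train list."""
    actions = sorted(
        d for d in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, d))
    )
    train, test = [], []
    for label, action in enumerate(actions):
        videos = sorted(os.listdir(os.path.join(data_dir, action)))
        for i, video in enumerate(videos):
            line = "%s/%s %d" % (action, video, label)
            (test if i % test_every == 0 else train).append(line)
    return train, test
```

Because labels follow the sorted order of the action folders, regenerating the lists after adding a new action class renumbers the labels, so train.list and test.list should always be rebuilt together.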

Step 2: Train or Test the model

  • Check the training help message

    python run_dense_net_3d.py -h

  • Train and test the program

    python run_dense_net_3d.py --train --test -ds path/to/video_folder
    # Note: all log messages are written to the log.txt file in the root folder


  • run_dense_net_3d.py -> train_params settings
    'num_classes': 5,               # The number of classes in the dataset
    'batch_size': 10,               # Batch size used when training the model
    'n_epochs': 100,                # The total number of epochs to run
    'crop_size': (64,64),           # The (width, height) of the images used to train the model
    'sequence_length': 16,          # The length of each video clip, in frames
    'overlap_length': 8,            # The overlap, in frames, between consecutive clips extracted
                                      from a video; this should be less than sequence_length
    'initial_learning_rate': 0.1,
    'reduce_lr_epoch_1': 50,        # epochs * 0.5
    'reduce_lr_epoch_2': 75,        # epochs * 0.75
    'validation_set': True,         # Whether to use a validation set
    'validation_split': None,       # None or float
    'queue_size': 300,              # The size of the data queue used when extracting clips from
                                      the dataset; set it according to your memory size
    'normalization': 'std',         # None, divide_256, divide_255, std
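The normalization and learning-rate settings above can be read as follows. This is a numpy sketch under assumed semantics ('std' taken as per-clip standardization, and the learning rate dropping by 10x at reduce_lr_epoch_1 and again at reduce_lr_epoch_2); the repository's exact behavior may differ:

```python
import numpy as np

def normalize(clip, mode):
    """Apply one of the 'normalization' options to a video clip array."""
    if mode == "divide_255":
        return clip / 255.0            # scale pixel values to [0, 1]
    if mode == "divide_256":
        return clip / 256.0
    if mode == "std":
        # zero mean, unit variance (assumed per-clip here)
        return (clip - clip.mean()) / clip.std()
    return clip                        # mode None: leave values as-is

def learning_rate(epoch, initial=0.1, drop1=50, drop2=75):
    """Step schedule: divide the initial rate by 10 at each drop epoch."""
    if epoch >= drop2:
        return initial / 100.0
    if epoch >= drop1:
        return initial / 10.0
    return initial
```

With the defaults above, training runs at 0.1 for epochs 0-49, 0.01 for epochs 50-74, and 0.001 thereafter.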


Test results on the MERL Shopping dataset, using per-channel video normalization: [results image]

Approximate training time for models on GeForce GTX TITAN X (12 GB memory):

  • 3D-DenseNet(k = 12, d = 20) - 25 hrs