This is an unofficial PyTorch implementation of SiamRPN++ (CVPR2019), implemented by Peng Xu and Jin Feng. Training can run on multiple GPUs and uses the LMDB data format to speed up data loading.
This project is designed with these goals:
As stated in the original paper, the SiamRPN++ network has three parts: Backbone Networks, SiamRPN Blocks, and Weighted Fusion Layers.
1. Backbone Network (modified ResNet-50)
As stated in the original paper, SiamRPN++ uses ResNet-50 as its backbone, modifying the strides and adding dilated convolutions in the conv4 and conv5 blocks. The following table compares the original ResNet-50 with the SiamRPN++ ResNet-50 backbone in detail.
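network | parameter | conv4 bottleneck: conv1x1 | conv4 bottleneck: conv3x3 | conv4 bottleneck: conv1x1 | conv5 bottleneck: conv1x1 | conv5 bottleneck: conv3x3 | conv5 bottleneck: conv1x1 |
---|---|---|---|---|---|---|---|
original ResNet-50 | stride | 1 | 2 | 1 | 1 | 2 | 1 |
original ResNet-50 | padding | 0 | 1 | 0 | 0 | 1 | 0 |
original ResNet-50 | dilation | 1 | 1 | 1 | 1 | 1 | 1 |
ResNet-50 in SiamRPN++ | stride | 1 | 1 | 1 | 1 | 1 | 1 |
ResNet-50 in SiamRPN++ | padding | 0 | 2 | 0 | 0 | 4 | 0 |
ResNet-50 in SiamRPN++ | dilation | 1 | 2 | 1 | 1 | 4 | 1 |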
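To see what these settings do to the feature-map size, the standard convolution output-size formula can be applied to the conv3x3 column of the table above. The sketch below is only an arithmetic illustration, not code from this repository: the helper names `conv_out_size` and `trace` and the input width of 32 are hypothetical, and each stage is simplified to a single conv3x3 with the table's settings.

```python
def conv_out_size(size, kernel, stride, padding, dilation):
    """Output size of a convolution along one spatial axis."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# (stride, padding, dilation) of the conv3x3 layer, copied from the table above.
CONV3X3 = {
    "original conv4": (2, 1, 1),
    "original conv5": (2, 1, 1),
    "SiamRPN++ conv4": (1, 2, 2),
    "SiamRPN++ conv5": (1, 4, 4),
}


def trace(size, variant):
    """Pass a map of width `size` through the conv4 and conv5 conv3x3 layers."""
    for block in ("conv4", "conv5"):
        stride, padding, dilation = CONV3X3["{} {}".format(variant, block)]
        size = conv_out_size(size, 3, stride, padding, dilation)
    return size


if __name__ == "__main__":
    width = 32  # hypothetical input width entering conv4
    print("original ResNet-50: {}".format(trace(width, "original")))
    print("SiamRPN++ ResNet-50: {}".format(trace(width, "SiamRPN++")))
```

With stride 2 the original settings halve the width at each stage, while the SiamRPN++ settings (stride 1, padding equal to dilation) keep it unchanged, so the conv4 and conv5 outputs keep the spatial resolution of their input.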
2. SiamRPN Block
Based on our understanding of the original paper, we drew an architecture illustration of the Siamese RPN block.
The following table lists the detailed configuration of each layer in the RPN block. Please see ./network/RPN.py for more details.
component | configuration |
---|---|
adj_1 / adj_2 / adj_3 / adj_4 | conv2d(256, 256, ksize=3, pad=1, stride=1), BN2d(256) |
fusion_module_1 / fusion_module_2 | conv2d(256, 256, ksize=1, pad=0, stride=1), BN2d(256), ReLU |
box head | conv2d(256, 4*5, ksize=1, pad=0, stride=1) |
cls head | conv2d(256, 2*5, ksize=1, pad=0, stride=1) |
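The output channel counts of the two heads in the table above can be read as "values per anchor times anchors per location". The sketch below spells out that arithmetic under this reading. The constant and function names are hypothetical, and the 25x25 response map size is an assumption chosen for illustration, not a value taken from this repository.

```python
NUM_ANCHORS = 5  # the "5" in 4*5 and 2*5 in the table above
BOX_COORDS = 4   # assumed: box regression values per anchor
CLS_SCORES = 2   # assumed: foreground and background scores per anchor


def head_channels(num_anchors=NUM_ANCHORS):
    """Output channels of the box and cls heads for a given anchor count."""
    return {"box": BOX_COORDS * num_anchors, "cls": CLS_SCORES * num_anchors}


def num_proposals(map_size, num_anchors=NUM_ANCHORS):
    """Number of anchors scored on a square map_size x map_size response map."""
    return map_size * map_size * num_anchors


if __name__ == "__main__":
    print(head_channels())
    print(num_proposals(25))  # 25 is a hypothetical response map size
```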
3. Weighted Fusion Layer
We implement the weighted fusion layer via group convolution operations. Please see details in ./network/SiamRPN.py.
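As a rough illustration of what such a layer computes, assume it combines the outputs of several RPN blocks as a weighted sum with one scalar weight per block. A 1x1 group convolution can express this kind of per-channel weighted sum. The pure-Python sketch below shows only the arithmetic for one channel. It is not the code in ./network/SiamRPN.py, and the function name `weighted_fusion` and the example weights are hypothetical.

```python
def weighted_fusion(maps, weights):
    """Element-wise weighted sum of equally sized 2-D maps (lists of rows)."""
    if len(maps) != len(weights):
        raise ValueError("need exactly one weight per map")
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for single_map, weight in zip(maps, weights):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * single_map[r][c]
    return fused


if __name__ == "__main__":
    # Three hypothetical 1x2 response maps, one per RPN block.
    maps = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]
    weights = [0.5, 0.25, 0.25]  # hypothetical fusion weights
    print(weighted_fusion(maps, weights))
```

In a trained network the weights would be learned parameters rather than fixed constants.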
Ubuntu 14.04
Python 2.7
PyTorch 0.4.0
Other main requirements can be installed by:
# 1. Install cv2 package.
conda install opencv
# 2. Install LMDB package.
conda install lmdb
# 3. Install fire package.
pip install fire
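After installing, a quick check that the packages can be imported may save a failed run later. The snippet below is a minimal sketch, not part of this repository. It assumes the import names `cv2`, `lmdb`, `fire`, and `torch` for the packages listed above, and the helper name `missing_modules` is hypothetical.

```python
import importlib

# Import names assumed for the requirements above:
# opencv -> cv2, LMDB -> lmdb, fire -> fire, PyTorch -> torch.
REQUIRED = ("cv2", "lmdb", "fire", "torch")


def missing_modules(names=REQUIRED):
    """Return the import names that fail to import in this environment."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    print("missing: " + (", ".join(missing) if missing else "none"))
```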
# 1. Clone this repository to your disk.
git clone https://github.com/PengBoXiangShang/SiamRPN_plus_plus_PyTorch.git
# 2. Change working directory.
cd SiamRPN_plus_plus_PyTorch
# 3. Download training data. In this project, we provide the downloading and preprocessing scripts for the ILSVRC2015_VID dataset. Please download the ILSVRC2015_VID dataset (86GB). The scripts for other tracking datasets are coming soon.
cd data
wget -c http://bvisionweb1.cs.unc.edu/ilsvrc2015/ILSVRC2015_VID.tar.gz
tar -xvf ILSVRC2015_VID.tar.gz
rm ILSVRC2015_VID.tar.gz
cd ..
# 4. Preprocess data.
chmod u+x ./preprocessing/create_dataset.sh
./preprocessing/create_dataset.sh
# 5. Pack the preprocessed data into LMDB format to accelerate data loading.
chmod u+x ./preprocessing/create_lmdb.sh
./preprocessing/create_lmdb.sh
# 6. Start the training.
chmod u+x ./train.sh
./train.sh
Many thanks to Sisi, who helped us download the huge ILSVRC2015_VID dataset.