This is our implementation of the Multi-level Scene Description Network from Scene Graph Generation from Objects, Phrases and Region Captions. The project is based on the PyTorch version of Faster R-CNN. (Update: the model links have been updated. Sorry for the inconvenience.)
We have released our newly proposed scene graph generation model in our ECCV-2018 paper:
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation.
Check the GitHub repo Factorizable Net if you are interested.
We are still working on the project. If you are interested, please follow it.
Install the requirements (you can use pip or Anaconda):
conda install pip pyyaml sympy h5py cython numpy scipy
conda install -c menpo opencv3
conda install -c soumith pytorch torchvision cuda80
pip install easydict
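After installing, a quick import check can confirm that the environment is complete. The sketch below is not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (for example `yaml` for pyyaml and `cv2` for opencv3) are the usual ones for these packages and may need adjusting.

```python
import importlib.util

# Import names for the packages installed above (pyyaml imports as "yaml",
# opencv3 as "cv2"). These names are assumptions; adjust them if needed.
REQUIRED = ["yaml", "sympy", "h5py", "Cython", "numpy", "scipy",
            "cv2", "torch", "torchvision", "easydict"]


def missing_modules(names=REQUIRED):
    """Return the entries of `names` that cannot be located by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All requirements found.")
```

An empty result only means the modules can be located; it does not verify versions or CUDA support.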
Clone the MSDN repository
git clone git@github.com:yikang-li/MSDN.git
Build the Cython modules for nms and the roi_pooling layer
cd MSDN/faster_rcnn
./make.sh
cd ..
Download the trained full model and the trained RPN, and place them in output/trained_model
Download our cleansed Visual Genome dataset and extract it:
tar xzvf top_150_50.tgz
Download Visual Genome images
Place the images and cleansed annotations in the corresponding folders:
mkdir -p data/visual_genome
cd data/visual_genome
ln -s /path/to/VG_100K_images_folder VG_100K_images
ln -s /path/to/downloaded_folder top_150_50
To use a different data location, change __C.IMG_DATA_DIR in faster_rcnn/fast_rcnn/config.py.
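Before starting a long training run, it can help to confirm that both symlinks resolve. The following is a minimal sketch, assuming the default layout created by the commands above; the helper name `missing_data_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Sub-directories that the symlink step above creates under data/visual_genome.
EXPECTED = ("VG_100K_images", "top_150_50")


def missing_data_dirs(root="data/visual_genome"):
    """Return the expected sub-directories of `root` that do not resolve to a directory."""
    root = Path(root)
    # Path.is_dir() follows symlinks, so a dangling link is reported as missing.
    return [name for name in EXPECTED if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    print("Data layout looks complete." if not missing else f"Missing: {missing}")
```

Run it from the repository root so that the relative path matches the one used in the commands above.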
Training is done in multiple stages. (Single-GPU training may take about one week.)
By default, training uses a small part of the full dataset:
CUDA_VISIBLE_DEVICES=0 python train_rpn.py
To train on the full dataset:
CUDA_VISIBLE_DEVICES=0 python train_rpn.py --max_epoch=10 --step_size=2 --dataset_option=normal --model_name=RPN_full_region
--step_size sets the number of epochs between learning-rate decays, and --dataset_option selects the [ small | fat | normal ] subset.
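To illustrate what a step size means for the schedule, here is a minimal sketch of step decay. The decay factor `gamma=0.1` and the base rate used in the example are assumptions chosen for illustration; the values the training scripts actually use are defined in the repository, and `stepped_lr` is a hypothetical helper.

```python
def stepped_lr(base_lr, epoch, step_size, gamma=0.1):
    """Learning rate at a 0-based `epoch` when the rate is multiplied by
    `gamma` once every `step_size` epochs (gamma is an assumed value)."""
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    # With --step_size=2 the rate drops after epochs 2, 4, 6, and so on.
    for epoch in range(6):
        print(epoch, stepped_lr(0.01, epoch, step_size=2))
```

With `--max_epoch=10 --step_size=2`, a schedule of this shape would decay the rate four times during training.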
Here, we use SGD (controlled by --optimizer) by default:
CUDA_VISIBLE_DEVICES=0 python train_hdn.py --load_RPN --saved_model_path=./output/RPN/RPN_region_full_best.h5 --dataset_option=normal --enable_clip_gradient --step_size=2 --MPS_iter=1 --caption_use_bias --caption_use_dropout --rnn_type LSTM_normal
End-to-end training from scratch is also possible, but it is not recommended because the results are poor.
CUDA_VISIBLE_DEVICES=0 python train_hdn.py --dataset_option=normal --enable_clip_gradient --step_size=3 --MPS_iter=1 --caption_use_bias --caption_use_dropout --max_epoch=11 --optimizer=1 --lr=0.001
Our pretrained full model is provided for evaluation and further development. (Please download the related files in advance.)
./eval.sh
Currently, the accuracy of our released version is slightly different from the reported results in the paper: Recall@50: 11.705%; Recall@100: 14.085%.
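For readers unfamiliar with the metric, Recall@K is the fraction of ground-truth relationships that appear among the K highest-scoring predictions. The sketch below is a deliberately simplified illustration and not the code run by eval.sh: it matches (subject, predicate, object) labels only and ignores bounding-box overlap, which scene graph evaluation normally also requires. The function name and the example triplets are hypothetical.

```python
def recall_at_k(predictions, ground_truth, k):
    """Fraction of ground-truth triplets found among the k highest-scoring predictions.

    predictions: list of (score, (subject, predicate, object)) tuples.
    ground_truth: iterable of (subject, predicate, object) tuples.
    Simplification: labels are compared exactly; box overlap is not checked.
    """
    gt = set(ground_truth)
    if not gt:
        return 0.0
    # Keep the k predictions with the highest scores.
    top_k = sorted(predictions, key=lambda p: p[0], reverse=True)[:k]
    hits = gt & {triplet for _, triplet in top_k}
    return len(hits) / len(gt)


if __name__ == "__main__":
    preds = [
        (0.9, ("man", "riding", "horse")),
        (0.8, ("man", "on", "horse")),
        (0.1, ("horse", "on", "grass")),
    ]
    gt = [("man", "riding", "horse"), ("horse", "on", "grass")]
    print(recall_at_k(preds, gt, k=2))
```

In practice the per-image hits are accumulated over the whole test set before dividing by the total number of ground-truth relationships.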
We thank longcw for generously releasing the PyTorch implementation of Faster R-CNN.
@inproceedings{li2017msdn,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Wang, Kun and Wang, Xiaogang},
title={Scene graph generation from objects, phrases and region captions},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
year = {2017}
}
The pre-trained models and the MSDN technique are released for non-commercial use.
Contact Yikang LI if you have questions.