forked from daijifeng001/R-FCN
A matlab version of R-FCN which supports both Windows and Linux.
0 parents
commit 750e534
Showing 82 changed files with 46,843 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Auto detect text files and perform LF normalization | ||
* text=auto | ||
|
||
# Custom for Visual Studio | ||
*.cs diff=csharp | ||
|
||
# Standard to msysgit | ||
*.doc diff=astextplain | ||
*.DOC diff=astextplain | ||
*.docx diff=astextplain | ||
*.DOCX diff=astextplain | ||
*.dot diff=astextplain | ||
*.DOT diff=astextplain | ||
*.pdf diff=astextplain | ||
*.PDF diff=astextplain | ||
*.rtf diff=astextplain | ||
*.RTF diff=astextplain |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Windows image file caches | ||
Thumbs.db | ||
ehthumbs.db | ||
|
||
# Folder config file | ||
Desktop.ini | ||
|
||
# Recycle Bin used on file shares | ||
$RECYCLE.BIN/ | ||
|
||
# User Ignore | ||
models/fast_rcnn_prototxts/ | ||
models/pre_trained_model/ | ||
models/rpn_prototxts/ | ||
data/ | ||
datasets/ | ||
output/ | ||
cachedir/ | ||
imdb/cache | ||
bin/ | ||
external/caffe/matlab | ||
fetch_data/*.zip | ||
*.caffemodel | ||
*.mat | ||
|
||
# Windows Installer files | ||
*.cab | ||
*.msi | ||
*.msm | ||
*.msp | ||
|
||
# Windows shortcuts | ||
*.lnk | ||
|
||
# ========================= | ||
# Operating System Files | ||
# ========================= | ||
|
||
# OSX | ||
# ========================= | ||
|
||
.DS_Store | ||
.AppleDouble | ||
.LSOverride | ||
|
||
# Thumbnails | ||
._* | ||
|
||
# Files that might appear on external disk | ||
.Spotlight-V100 | ||
.Trashes | ||
|
||
# Directories potentially created on remote AFP share | ||
.AppleDB | ||
.AppleDesktop | ||
Network Trash Folder | ||
Temporary Items | ||
.apdisk |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
[submodule "external/caffe"] | ||
path = external/caffe | ||
url = https://github.com/ShaoqingRen/caffe.git | ||
branch = faster-R-CNN |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
Faster R-CNN | ||
|
||
The MIT License (MIT) | ||
|
||
Copyright (c) 2015 Microsoft Corporation | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
|
||
************************************************************************ | ||
|
||
THIRD-PARTY SOFTWARE NOTICES AND INFORMATION | ||
|
||
This project, Faster R-CNN, incorporates material from the project(s) listed below (collectively, "Third Party Code"). Microsoft is not the original author of the Third Party Code. The original copyright notice and license under which Microsoft received such Third Party Code are set out below. This Third Party Code is licensed to you under their original license terms set forth below. Microsoft reserves all other rights not expressly granted, whether by implication, estoppel or otherwise. | ||
|
||
1. Caffe, version 0.9, (https://github.com/BVLC/caffe/) | ||
|
||
COPYRIGHT | ||
|
||
All contributions by the University of California: | ||
Copyright (c) 2014, 2015, The Regents of the University of California (Regents) | ||
All rights reserved. | ||
|
||
All other contributions: | ||
Copyright (c) 2014, 2015, the respective contributors | ||
All rights reserved. | ||
|
||
Caffe uses a shared copyright model: each contributor holds copyright over their contributions to Caffe. The project versioning records all such contribution and copyright details. If a contributor wants to further mark their specific copyright on a particular contribution, they should indicate their copyright solely in the commit message of the change when it is committed. | ||
|
||
The BSD 2-Clause License | ||
|
||
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: | ||
|
||
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. | ||
|
||
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
************END OF THIRD-PARTY SOFTWARE NOTICES AND INFORMATION********** | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# *R-FCN*: Object Detection via Region-based Fully Convolutional Networks | ||
|
||
By Jifeng Dai, Yi Li, Kaiming He, Jian Sun | ||
|
||
### Introduction | ||
|
||
**R-FCN** is a region-based object detection framework leveraging deep fully-convolutional networks, which is accurate and efficient. In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region sub-network hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. R-FCN can naturally adopt powerful fully convolutional image classifier backbones, such as [ResNets](https://github.com/KaimingHe/deep-residual-networks), for object detection. | ||
|
||
R-FCN was initially described in an [arxiv tech report](https://arxiv.org/abs/1605.06409). | ||
|
||
This code has been tested on Windows 7/8 64 bit, Windows Server 2012 R2, and Ubuntu 14.04, with Matlab 2014a. | ||
|
||
### License | ||
|
||
R-FCN is released under the MIT License (refer to the LICENSE file for details). | ||
|
||
### Citing R-FCN | ||
|
||
If you find R-FCN useful in your research, please consider citing: | ||
|
||
@article{dai16rfcn, | ||
Author = {Jifeng Dai and Yi Li and Kaiming He and Jian Sun}, | ||
Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks}, | ||
Journal = {arXiv preprint arXiv:1605.06409}, | ||
Year = {2016} | ||
} | ||
|
||
### Main Results | ||
| training data | test data | mAP | time/img (K40) | time/img (Titan X) | ||
-------------------|:-------------------:|:---------------------:|:-----:|:--------------:|:------------------:| | ||
R-FCN, ResNet-50L | VOC 07+12 trainval | VOC 07 test | 77.0% | 0.12sec | 0.09sec | | ||
R-FCN, ResNet-101L | VOC 07+12 trainval | VOC 07 test | 79.5% | 0.17sec | 0.12sec | | ||
|
||
|
||
### Requirements: software | ||
|
||
0. `Caffe` build for R-FCN (included in this repository, see `external/caffe`) | ||
- If you are using Windows, you may download a compiled mex file by running `fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m` | ||
- If you are using Linux or you want to compile for Windows, please recompile [our Caffe branch](https://github.com/daijifeng001/caffe-rfcn). | ||
0. MATLAB 2014a or later | ||
|
||
|
||
### Requirements: hardware | ||
|
||
GPU: Titan, Titan X, K40, K80. | ||
|
||
|
||
### Preparation: | ||
0. Run `fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m` to download a compiled Caffe mex (for Windows only). | ||
0. Run `fetch_data/fetch_model_ResNet50.m` to download an ImageNet-pre-trained ResNet-50L net. | ||
0. Run `fetch_data/fetch_model_ResNet101.m` to download an ImageNet-pre-trained ResNet-101L net. | ||
0. Run `fetch_data/fetch_region_proposals.m` to download the pre-computed region proposals. | ||
0. Download VOC 2007 and 2012 data to ./datasets. | ||
0. Run `rfcn_build.m`. | ||
0. Run `startup.m`. | ||
|
||
|
||
### Training & Testing: | ||
0. Run `experiments/script_rfcn_VOC0712_ResNet50_OHEM_ss.m` to train a model using ResNet-50L net with online hard example mining (OHEM), leveraging selective search proposals. The accuracy should be ~75.4% in mAP. | ||
- **Note**: the training time is ~13 hours on Titan X. | ||
0. Run `experiments/script_rfcn_VOC0712_ResNet50_OHEM_rpn.m` to train a model using ResNet-50L net with OHEM, leveraging RPN proposals (using ResNet-50L net). The accuracy should be ~77.0% in mAP. | ||
- **Note**: the training time is ~13 hours on Titan X. | ||
0. Run `experiments/script_rfcn_VOC0712_ResNet101_OHEM_rpn.m` to train a model using ResNet-101L net with OHEM, leveraging RPN proposals (using ResNet-101L net). The accuracy should be ~79.5% in mAP. | ||
- **Note**: the training time is ~19 hours on Titan X. | ||
0. Check other scripts in `./experiments` for more settings. | ||
|
||
**Note:** In all the experiments, training is performed on VOC 07+12 trainval, and testing is performed on VOC 07 test. | ||
|
||
### Resources | ||
|
||
0. Experiment logs: [DropBox](https://www.dropbox.com/s/is2gatfdxs1tcls/experiment_log.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1mhFYejI) | ||
|
||
If the automatic "fetch_data" fails, you may manually download resources from: | ||
|
||
0. Pre-compiled caffe mex (Windows): | ||
- [DropBox](https://www.dropbox.com/s/n1x2bybd6d03s7c/caffe_mex.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1i4OlG7z) | ||
0. ImageNet-pretrained networks: | ||
- ResNet-50L net [DropBox](https://www.dropbox.com/s/0uzh90f6jx9l0yf/models_ResNet-50L.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1kVm4ly3) | ||
- ResNet-101L net [DropBox](https://www.dropbox.com/s/ev91ss0pyd5h9ix/models_ResNet-101L.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1nvgu1pJ) | ||
0. Pre-computed region proposals: | ||
- [DropBox](https://www.dropbox.com/s/gagkulgcif6k1dd/proposals.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1nv1tkH7) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
function path = voc2007_devkit() | ||
path = './datasets/VOCdevkit2007'; | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
function path = voc2012_devkit() | ||
path = './datasets/VOCdevkit2012'; | ||
end |
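The two devkit helpers above return fixed relative paths, so the scripts presumably need to run from the repository root with the VOC data unpacked under `./datasets`. A minimal pre-flight check, not part of this commit, might look like the following; the variable names are illustrative.

```matlab
% Illustrative pre-flight check; not part of the commit.
% The two paths are the ones returned by voc2007_devkit() and voc2012_devkit().
devkits = {'./datasets/VOCdevkit2007', './datasets/VOCdevkit2012'};

% exist(p, 'dir') returns 7 when p is an existing folder.
is_present = cellfun(@(p) exist(p, 'dir') == 7, devkits);

missing = devkits(~is_present);
if isempty(missing)
    fprintf('Both VOC devkit folders were found.\n');
else
    % fprintf reuses the format string once per missing path.
    fprintf('Missing devkit folder: %s\n', missing{:});
end
```

Running this before any training script gives an early, readable failure instead of an error deep inside the dataset loading code.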
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
function dataset = voc0712_trainval_sp(dataset, usage, use_flip, extension) | ||
% Pascal voc 0712 trainval set with *pre-computed* RPN proposals (trained with ResNet50 or ResNet101) | ||
% extension = "resnet50" or "resnet101" for specifying pre-computed RPN proposals | ||
% set opts.imdb_train opts.roidb_train | ||
|
||
% change to point to your devkit install | ||
devkit2007 = voc2007_devkit(); | ||
devkit2012 = voc2012_devkit(); | ||
|
||
switch usage | ||
case {'train'} | ||
dataset.imdb_train = { imdb_from_voc(devkit2007, 'trainval', '2007', use_flip), ... | ||
imdb_from_voc(devkit2012, 'trainval', '2012', use_flip)}; | ||
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_self_proposal', true, 'extension', extension), dataset.imdb_train, 'UniformOutput', false); | ||
case {'test'} | ||
error('only supports one source test currently'); | ||
otherwise | ||
error('usage = ''train'' or ''test'''); | ||
end | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
function dataset = voc0712_trainval_ss(dataset, usage, use_flip) | ||
% Pascal voc 0712 trainval set with selective search | ||
% set opts.imdb_train opts.roidb_train | ||
% or set opts.imdb_test opts.roidb_test | ||
|
||
% change to point to your devkit install | ||
devkit2007 = voc2007_devkit(); | ||
devkit2012 = voc2012_devkit(); | ||
|
||
switch usage | ||
case {'train'} | ||
dataset.imdb_train = { imdb_from_voc(devkit2007, 'trainval', '2007', use_flip), ... | ||
imdb_from_voc(devkit2012, 'trainval', '2012', use_flip)}; | ||
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_selective_search', true), dataset.imdb_train, 'UniformOutput', false); | ||
case {'test'} | ||
error('only supports one source test currently'); | ||
otherwise | ||
error('usage = ''train'' or ''test'''); | ||
end | ||
|
||
end |
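In the `'train'` branch of the dataset functions above, `cellfun` with `'UniformOutput', false` calls each imdb's own `roidb_func` handle and collects the results in a cell array, one roidb per imdb. The sketch below reproduces that pattern with stub structs. `make_imdb` and the stub fields are hypothetical stand-ins for whatever `imdb_from_voc` actually returns, which this diff does not show.

```matlab
% Hypothetical stub: a struct with a name and a roidb_func handle,
% standing in for the real output of imdb_from_voc.
make_imdb = @(name) struct( ...
    'name', name, ...
    'roidb_func', @(imdb, varargin) struct('name', [imdb.name '_roidb']));

% Two training sources, as in the 07+12 trainval functions.
imdbs = {make_imdb('voc_2007_trainval'), make_imdb('voc_2012_trainval')};

% Same call shape as in the diff: each imdb builds its own roidb.
% 'UniformOutput', false is needed because the results are structs,
% which cannot be concatenated into a plain numeric array.
roidbs = cellfun(@(x) x.roidb_func(x, 'with_selective_search', true), ...
    imdbs, 'UniformOutput', false);

fprintf('%s\n', roidbs{1}.name);
fprintf('%s\n', roidbs{2}.name);
```

Keeping `imdb_train` and `roidb_train` as parallel cell arrays lets several sources be combined for training. The `'test'` branches instead use a single imdb and roidb, which is why the trainval functions raise an error for `usage = 'test'`.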
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
function dataset = voc2007_test_sp(dataset, usage, use_flip, extension) | ||
% Pascal voc 2007 test set with *pre-computed* RPN proposals (trained with ResNet50 or ResNet101) | ||
% extension = "resnet50" or "resnet101" for specifying pre-computed RPN proposals | ||
% set opts.imdb_train opts.roidb_train | ||
|
||
|
||
% change to point to your devkit install | ||
devkit = voc2007_devkit(); | ||
|
||
switch usage | ||
case {'train'} | ||
dataset.imdb_train = { imdb_from_voc(devkit, 'test', '2007', use_flip) }; | ||
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_self_proposal', true, 'extension', extension), dataset.imdb_train, 'UniformOutput', false); | ||
case {'test'} | ||
dataset.imdb_test = imdb_from_voc(devkit, 'test', '2007', use_flip); | ||
dataset.roidb_test = dataset.imdb_test.roidb_func(dataset.imdb_test, 'with_self_proposal', true, 'extension', extension); | ||
otherwise | ||
error('usage = ''train'' or ''test'''); | ||
end | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
function dataset = voc2007_test_ss(dataset, usage, use_flip) | ||
% Pascal voc 2007 test set with selective search | ||
% set opts.imdb_train opts.roidb_train | ||
% or set opts.imdb_test opts.roidb_test | ||
|
||
% change to point to your devkit install | ||
devkit = voc2007_devkit(); | ||
|
||
switch usage | ||
case {'train'} | ||
dataset.imdb_train = { imdb_from_voc(devkit, 'test', '2007', use_flip) }; | ||
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_selective_search', true), dataset.imdb_train, 'UniformOutput', false); | ||
case {'test'} | ||
dataset.imdb_test = imdb_from_voc(devkit, 'test', '2007', use_flip); | ||
dataset.roidb_test = dataset.imdb_test.roidb_func(dataset.imdb_test, 'with_selective_search', true); | ||
otherwise | ||
error('usage = ''train'' or ''test'''); | ||
end | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
function model = ResNet101_for_RFCN_VOC0712(model) | ||
% ResNet 101layers (finetuned from res3a) | ||
|
||
model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_res3a', 'solver_80k110k_lr1_3.prototxt'); | ||
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_res3a', 'test.prototxt'); | ||
|
||
model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'ResNet-101-model.caffemodel'); | ||
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'mean_image'); | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
function model = ResNet101_for_RFCN_VOC0712_OHEM(model) | ||
% ResNet 101layers with OHEM training (finetuned from res3a) | ||
|
||
model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_OHEM_res3a', 'solver_80k110k_lr1_3.prototxt'); | ||
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_OHEM_res3a', 'test.prototxt'); | ||
|
||
model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'ResNet-101-model.caffemodel'); | ||
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'mean_image'); | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
function model = ResNet50_for_RFCN_VOC0712(model) | ||
% ResNet 50layers (finetuned from res3a) | ||
|
||
model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_res3a', 'solver_80k110k_lr1_3.prototxt'); | ||
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_res3a', 'test.prototxt'); | ||
|
||
model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'ResNet-50-model.caffemodel'); | ||
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'mean_image'); | ||
|
||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
function model = ResNet50_for_RFCN_VOC0712_OHEM(model) | ||
% ResNet 50layers with OHEM training (finetuned from res3a) | ||
|
||
model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_OHEM_res3a', 'solver_80k110k_lr1_3.prototxt'); | ||
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_OHEM_res3a', 'test.prototxt'); | ||
|
||
model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'ResNet-50-model.caffemodel'); | ||
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'mean_image'); | ||
|
||
end |
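The four model functions above differ only in the network depth and in whether the OHEM prototxt folder is used. A sketch of the same path construction driven by two variables, assuming the folder layout shown in the diff (`depth`, `use_ohem`, `net`, and `variant` are illustrative names, not part of the commit):

```matlab
% Illustrative settings; the commit hard-codes these per function.
depth    = 50;     % 50 or 101
use_ohem = true;   % true selects the *_OHEM_res3a prototxt folder

net = sprintf('ResNet-%dL', depth);            % e.g. 'ResNet-50L'
if use_ohem
    variant = [net '_OHEM_res3a'];
else
    variant = [net '_res3a'];
end

% Same four fields that each model function in the diff fills in.
model = struct();
model.solver_def_file   = fullfile(pwd, 'models', 'rfcn_prototxts', variant, 'solver_80k110k_lr1_3.prototxt');
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', variant, 'test.prototxt');
model.net_file          = fullfile(pwd, 'models', 'pre_trained_models', net, sprintf('ResNet-%d-model.caffemodel', depth));
model.mean_image        = fullfile(pwd, 'models', 'pre_trained_models', net, 'mean_image');

disp(model.solver_def_file);
```

Because every path is built from `pwd`, the result depends on the current folder, so this sketch (like the committed functions) assumes it is run from the repository root.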