Commit f141733 (0 parents): showing 75 changed files with 37,033 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# compat | ||
outputs | ||
# system | ||
.DS_Store | ||
.gitattributes | ||
.vscode | ||
|
||
# Python | ||
build/ | ||
venv*/ | ||
*.egg-info | ||
.eggs/ | ||
__pycache__ | ||
dist/ | ||
.pytest_cache/ | ||
.ipynb_checkpoints/ | ||
butterflydetector/functional.cpython-* | ||
butterflydetector/functional.html | ||
|
||
# data | ||
data*/ | ||
*debug.png | ||
*.pl | ||
*.pl-* | ||
*.pl.* | ||
*.pkl | ||
*.pkl-* | ||
*.pkl.* | ||
*.json | ||
*.log | ||
*.zip | ||
outputs/ | ||
shared/ | ||
decoder.prof | ||
decoder_flame.svg | ||
*.onnx | ||
debug.*.png | ||
pretrained | ||
test-clis-*/ | ||
pretrained |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
# Butterfly Detector | ||
|
||
## Butterfly Detector for Aerial Images | ||
 | ||
|
||
> Current state-of-the-art object detectors have achieved high performance | ||
> when applied to images captured by standard front facing cameras. When applied | ||
> to high-resolution aerial images captured from a drone or UAV stand-point, | ||
> they fail to generalize to the wide range of objects' scales. In order to | ||
> address this limitation, we propose an object detection method called | ||
> Butterfly Detector that is tailored to detect objects in aerial images. We | ||
> extend the concept of fields and introduce butterfly fields, a type of | ||
> composite field that describes the spatial information of output features as | ||
> well as the scale of the detected object. To overcome occlusion and viewing | ||
> angle variations that can hinder the localization process, we employ a voting | ||
> mechanism between related butterfly vectors pointing to the object center. We | ||
> evaluate our Butterfly Detector on two publicly available UAV datasets | ||
> (UAVDT and VisDrone2019) and show that it outperforms previous state-of-the-art | ||
> methods while remaining real-time. | ||
# Demo | ||
|
||
Solarized dark | Solarized Ocean | ||
:-------------------------------------------------------------------------------------------:|:-------------------------: | ||
 |  | ||
|
||
<!----> | ||
|
||
### Setup | ||
|
||
Python 3 is required. Python 2 is not supported. | ||
Do not clone this repository | ||
and make sure there is no folder named `butterflydetector` in your current directory. | ||
|
||
```sh | ||
pip3 install butterflydetector | ||
``` | ||
|
||
For development of the butterflydetector source code itself, you need to clone this repository and then: | ||
|
||
```sh | ||
pip3 install numpy cython | ||
pip3 install --editable '.[train,test]' | ||
``` | ||
|
||
The last command installs the Python package in the current directory | ||
(signified by the dot) with the optional dependencies needed for training and | ||
testing. | ||
|
||
### Data structure | ||
|
||
data | ||
├── UAV-benchmark-M | ||
    ├── test | ||
    ├── train | ||
├── VisDrone2019 | ||
    ├── VisDrone2019-DET-train | ||
        ├── annotations | ||
        ├── images | ||
    ├── VisDrone2019-DET-val | ||
    ├── VisDrone2019-DET-test-dev | ||
|
||
# Interfaces | ||
|
||
* `python3 -m butterflydetector.predict --help` | ||
* `python3 -m butterflydetector.train --help` | ||
* `python3 -m butterflydetector.eval --help` | ||
* `python3 -m butterflydetector.logs --help` | ||
|
||
Tools to work with models: | ||
|
||
* `python3 -m butterflydetector.migrate --help` | ||
|
||
|
||
# Benchmark | ||
Comparison of AP (%), False Positives (FP), and Recall (%) with state-of-the-art methods on the UAVDT dataset. | ||
 | ||
|
||
Comparison of AP (average precision), AR (average recall), True Positives (TP), and False Positives (FP) with state-of-the-art methods on the VisDrone dataset. | ||
 | ||
|
||
|
||
# Visualization | ||
|
||
To visualize logs: | ||
|
||
```sh | ||
python3 -m butterflydetector.logs \ | ||
  outputs/resnet50block5-pif-paf-edge401-190424-122009.pkl.log \ | ||
  outputs/resnet101block5-pif-paf-edge401-190412-151013.pkl.log \ | ||
  outputs/resnet152block5-pif-paf-edge401-190412-121848.pkl.log | ||
``` | ||
|
||
|
||
# Train | ||
|
||
See [datasets](docs/datasets.md) for setup instructions. | ||
|
||
The exact training command that was used for a model is in the first | ||
line of the training log file. | ||
|
||
Train a ResNet model: | ||
|
||
```sh | ||
time CUDA_VISIBLE_DEVICES=0,1 python3 -m butterflydetector.train \ | ||
  --lr=1e-3 \ | ||
  --momentum=0.95 \ | ||
  --epochs=150 \ | ||
  --lr-decay 120 140 \ | ||
  --batch-size=16 \ | ||
  --basenet=resnet101 \ | ||
  --head-quad=1 \ | ||
  --headnets butterfly \ | ||
  --square-edge=401 \ | ||
  --lambdas 10 1 1 15 1 1 15 1 1 | ||
``` | ||
|
||
You can refine an existing model with the `--checkpoint` option. | ||
|
||
# Evaluation | ||
|
||
# Video | ||
|
||
Process a video frame by frame from `video.avi` to `video.avi.pose.mp4` using ffmpeg: | ||
|
||
```sh | ||
export VIDEO=video.avi # change to your video file | ||
|
||
mkdir ${VIDEO}.images | ||
ffmpeg -i ${VIDEO} -qscale:v 2 -vf scale=641:-1 -f image2 ${VIDEO}.images/%05d.jpg | ||
python3 -m butterflydetector.predict --checkpoint resnet152 --glob "${VIDEO}.images/*.jpg" | ||
ffmpeg -framerate 24 -pattern_type glob -i ${VIDEO}.images/'*.jpg.skeleton.png' -vf scale=640:-2 -c:v libx264 -pix_fmt yuv420p ${VIDEO}.pose.mp4 | ||
``` | ||
|
||
In this process, ffmpeg scales the video to a width of `641px`, which can be adjusted. | ||
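
As a rough illustration of the `scale` filters used above: a height of `-1` asks ffmpeg to pick the height that preserves the aspect ratio for the given width, and `-2` (used in the final encoding step) does the same while keeping the height divisible by 2, which `libx264` with `yuv420p` requires. The helper name `scaled_height` is hypothetical, and its rounding only approximates ffmpeg's behaviour.

```python
def scaled_height(in_width, in_height, out_width, multiple=1):
    """Approximate the height ffmpeg picks for scale=out_width:-multiple."""
    if min(in_width, in_height, out_width, multiple) <= 0:
        raise ValueError('all arguments must be positive')
    # Height that keeps the input aspect ratio at the new width.
    exact = in_height * out_width / in_width
    # Round to the nearest multiple, but never below one multiple.
    return max(multiple, int(round(exact / multiple)) * multiple)


if __name__ == '__main__':
    # A 1920x1080 source scaled to a width of 641 (scale=641:-1).
    print(scaled_height(1920, 1080, 641))
    # The rendered frames re-encoded at a width of 640 with an even height (scale=640:-2).
    print(scaled_height(1920, 1080, 640, multiple=2))
```

Changing `641` in the ffmpeg command changes the network input size in the same way, so a smaller width trades detail on small objects for speed.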
|
||
|
||
<!--# Documentation Pages | ||
* [datasets](docs/datasets.md) | ||
* [Google Colab demo](https://colab.research.google.com/drive/1H8T4ZE6wc0A9xJE4oGnhgHpUpAH5HL7W) | ||
* [studies.ipynb](docs/studies.ipynb) | ||
* [evaluation logs](docs/eval_logs.md) | ||
* [performance analysis](docs/performance.md)--> | ||
|
||
# Citation | ||
|
||
``` | ||
``` |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
"""An open implementation of PifPaf.""" | ||
|
||
__version__ = '0.10.1' | ||
|
||
from . import datasets | ||
from . import decoder | ||
from . import network | ||
from . import optimize |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
"""Benchmark.""" | ||
|
||
import argparse | ||
import datetime | ||
import json | ||
import logging | ||
import os | ||
import subprocess | ||
|
||
import pysparkling | ||
|
||
LOG = logging.getLogger(__name__) | ||
|
||
|
||
DEFAULT_BACKBONES = [ | ||
    # 'shufflenetv2x1', | ||
    'shufflenetv2x2', | ||
    'resnet50', | ||
    # 'resnext50', | ||
    'resnet101', | ||
    'resnet152', | ||
] | ||
|
||
|
||
def cli(): | ||
    parser = argparse.ArgumentParser( | ||
        description=__doc__, | ||
        formatter_class=argparse.ArgumentDefaultsHelpFormatter, | ||
    ) | ||
    parser.add_argument('--output', default=None, | ||
                        help='output file name') | ||
    parser.add_argument('--backbones', default=DEFAULT_BACKBONES, nargs='+', | ||
                        help='backbones to evaluate') | ||
    parser.add_argument('--iccv2019-ablation', default=False, action='store_true') | ||
    group = parser.add_argument_group('logging') | ||
    group.add_argument('--debug', default=False, action='store_true', | ||
                       help='print debug messages') | ||
    args, eval_args = parser.parse_known_args() | ||
|
||
    logging.basicConfig(level=logging.INFO if not args.debug else logging.DEBUG) | ||
|
||
    # default eval_args | ||
    if not eval_args: | ||
        eval_args = ['--all-images', '--loader-workers=8'] | ||
|
||
    if '--all-images' not in eval_args: | ||
        LOG.info('adding "--all-images" to the argument list') | ||
        eval_args.append('--all-images') | ||
|
||
    if not any(arg.startswith('--loader-workers') for arg in eval_args): | ||
        LOG.info('adding "--loader-workers=8" to the argument list') | ||
        eval_args.append('--loader-workers=8') | ||
|
||
    # generate a default output filename | ||
    if args.output is None: | ||
        now = datetime.datetime.now().strftime('%y%m%d-%H%M%S') | ||
        args.output = 'outputs/benchmark-{}/'.format(now) | ||
        os.makedirs(args.output) | ||
|
||
    return args, eval_args | ||
|
||
|
||
def run_eval_coco(output_folder, backbone, eval_args, output_name=None): | ||
    if output_name is None: | ||
        output_name = backbone | ||
    output_name = output_name.replace('/', '-') | ||
|
||
    out_file = os.path.join(output_folder, output_name) | ||
    if os.path.exists(out_file + '.stats.json'): | ||
        LOG.warning('Output file %s exists already. Skipping.', | ||
                    out_file + '.stats.json') | ||
        return | ||
|
||
    LOG.debug('Launching eval for %s.', output_name) | ||
    subprocess.run([ | ||
        'python', '-m', 'openpifpaf.eval_coco', | ||
        '--output', out_file, | ||
        '--checkpoint', backbone, | ||
    ] + eval_args, check=True) | ||
|
||
|
||
def main(): | ||
    args, eval_args = cli() | ||
|
||
    if args.iccv2019_ablation: | ||
        assert len(args.backbones) == 1 | ||
        multi_eval_args = [ | ||
            eval_args, | ||
            eval_args + ['--connection-method=blend'], | ||
            eval_args + ['--connection-method=blend', '--long-edge=961', '--multi-scale', | ||
                         '--no-multi-scale-hflip'], | ||
            eval_args + ['--connection-method=blend', '--long-edge=961', '--multi-scale'], | ||
        ] | ||
        names = [ | ||
            'singlescale-max', | ||
            'singlescale', | ||
            'multiscale-nohflip', | ||
            'multiscale', | ||
        ] | ||
        for eval_args_i, name_i in zip(multi_eval_args, names): | ||
            run_eval_coco(args.output, args.backbones[0], eval_args_i, output_name=name_i) | ||
    else: | ||
        for backbone in args.backbones: | ||
            run_eval_coco(args.output, backbone, eval_args) | ||
|
||
    sc = pysparkling.Context() | ||
    stats = ( | ||
        sc | ||
        .wholeTextFiles(args.output + '*.stats.json') | ||
        .mapValues(json.loads) | ||
        .map(lambda d: (d[0].replace('.stats.json', '').replace(args.output, ''), d[1])) | ||
        .collectAsMap() | ||
    ) | ||
    LOG.debug('all data: %s', stats) | ||
|
||
    # pretty printing | ||
    for backbone, data in sorted(stats.items(), key=lambda b_d: b_d[1]['stats'][0]): | ||
        print( | ||
            '| {backbone: <25} ' | ||
            '| __{AP:.1f}__ ' | ||
            '| {APM: <8.1f} ' | ||
            '| {APL: <8.1f} ' | ||
            '| {t: <15.0f} ' | ||
            '| {tdec: <12.0f} |' | ||
            ''.format( | ||
                backbone=backbone, | ||
                AP=100.0 * data['stats'][0], | ||
                APM=100.0 * data['stats'][3], | ||
                APL=100.0 * data['stats'][4], | ||
                t=1000.0 * data['total_time'] / data['n_images'], | ||
                tdec=1000.0 * data['decoder_time'] / data['n_images'], | ||
            ) | ||
        ) | ||
|
||
|
||
if __name__ == '__main__': | ||
    main() |
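
The aggregation step in `main()` relies on `pysparkling`. A minimal sketch of the same idea using only the standard library might look like the following. It assumes each `*.stats.json` file holds the `stats`, `total_time`, `decoder_time` and `n_images` keys that the printing loop reads; `collect_stats`, `format_row` and `benchmark_table` are hypothetical helper names, not part of the package.

```python
import glob
import json
import os

SUFFIX = '.stats.json'


def collect_stats(output_folder):
    """Map each run name to the parsed content of its .stats.json file."""
    stats = {}
    for path in glob.glob(os.path.join(output_folder, '*' + SUFFIX)):
        name = os.path.basename(path)[:-len(SUFFIX)]
        with open(path) as f:
            stats[name] = json.load(f)
    return stats


def format_row(name, data):
    """Format one row: AP, APM, APL in percent, times in ms per image."""
    template = (
        '| {: <25} | __{:.1f}__ | {: <8.1f} | {: <8.1f} '
        '| {: <15.0f} | {: <12.0f} |'
    )
    return template.format(
        name,
        100.0 * data['stats'][0],
        100.0 * data['stats'][3],
        100.0 * data['stats'][4],
        1000.0 * data['total_time'] / data['n_images'],
        1000.0 * data['decoder_time'] / data['n_images'],
    )


def benchmark_table(output_folder):
    """Return rows sorted by AP (stats[0]), lowest first, as in main()."""
    stats = collect_stats(output_folder)
    ordered = sorted(stats.items(), key=lambda item: item[1]['stats'][0])
    return [format_row(name, data) for name, data in ordered]
```

Calling `benchmark_table('outputs/benchmark-<timestamp>/')` on a finished benchmark folder would return the same rows that `main()` prints, without the extra dependency.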
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
"""Compute height distribution for all and medium sized bounding boxes.""" | ||
|
||
import argparse | ||
|
||
import numpy as np | ||
import torch | ||
|
||
from . import datasets, transforms | ||
|
||
ANNOTATIONS = 'data-mscoco/annotations/person_keypoints_val2017.json' | ||
IMAGE_DIR = 'data-mscoco/images/val2017/' | ||
|
||
|
||
def main(): | ||
    parser = argparse.ArgumentParser( | ||
        description=__doc__, | ||
        formatter_class=argparse.ArgumentDefaultsHelpFormatter, | ||
    ) | ||
    parser.add_argument('--long-edge', default=321, type=int, | ||
                        help='long edge of input images') | ||
    args = parser.parse_args() | ||
|
||
    preprocess = transforms.Compose([ | ||
        transforms.RescaleAbsolute(args.long_edge), | ||
        transforms.CenterPad(args.long_edge), | ||
    ]) | ||
    data = datasets.CocoKeypoints( | ||
        root=IMAGE_DIR, | ||
        annFile=ANNOTATIONS, | ||
        preprocess=preprocess, | ||
    ) | ||
    data_loader = torch.utils.data.DataLoader( | ||
        data, batch_size=1, num_workers=2) | ||
|
||
    bbox_heights = [] | ||
    bbox_heights_medium = [] | ||
    for i, (_, anns) in enumerate(data_loader): | ||
        print('batch {}/{}'.format(i, len(data_loader))) | ||
|
||
        for ann in anns: | ||
            mask = ann['iscrowd'] == 0 | ||
            bbox_heights.append(ann['bbox'][mask, 3]) | ||
|
||
            areas = ann['bbox_original'][:, 2] * ann['bbox_original'][:, 3] | ||
            mask_medium = mask & (32**2 < areas) & (areas < 96**2) | ||
            bbox_heights_medium.append(ann['bbox'][mask_medium, 3]) | ||
|
||
    bbox_heights = np.array([h for batch in bbox_heights for h in batch]) | ||
    bbox_heights_medium = np.array([h for batch in bbox_heights_medium for h in batch]) | ||
    print( | ||
        'bbox height: all = {} +/- {}, medium = {} +/- {}'.format( | ||
            np.mean(bbox_heights), np.std(bbox_heights), | ||
            np.mean(bbox_heights_medium), np.std(bbox_heights_medium), | ||
        ) | ||
    ) | ||
|
||
|
||
if __name__ == '__main__': | ||
    main() |
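
The filtering logic in the loop can be tried in isolation. The following is a minimal sketch in plain Python, with lists standing in for the tensors from the data loader. It assumes boxes are stored as `[x, y, width, height]` (consistent with the script using column 3 as the height and columns 2 and 3 for the area); `split_heights` is a hypothetical helper and the sample boxes are invented for illustration. For simplicity it uses one box list for both heights and areas, whereas the script takes heights from the rescaled `bbox` and areas from `bbox_original`.

```python
import statistics


def split_heights(boxes, iscrowd):
    """Return (all_heights, medium_heights) for non-crowd boxes."""
    all_heights, medium_heights = [], []
    for (_, _, width, height), crowd in zip(boxes, iscrowd):
        if crowd != 0:
            continue  # crowd annotations are excluded from both lists
        all_heights.append(height)
        # "Medium" follows the thresholds in the script: 32^2 < area < 96^2.
        if 32**2 < width * height < 96**2:
            medium_heights.append(height)
    return all_heights, medium_heights


if __name__ == '__main__':
    # Invented sample: small, medium, large, and a medium-sized crowd box.
    boxes = [[0, 0, 10, 20], [0, 0, 40, 50], [0, 0, 100, 200], [0, 0, 50, 60]]
    iscrowd = [0, 0, 0, 1]
    all_h, medium_h = split_heights(boxes, iscrowd)
    # pstdev is the population standard deviation, matching np.std's default.
    print('bbox height: all = {} +/- {}, medium = {} +/- {}'.format(
        statistics.mean(all_h), statistics.pstdev(all_h),
        statistics.mean(medium_h), statistics.pstdev(medium_h)))
```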