This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

RuntimeError: s__th_or is not implemented for type CUDABoolType #1181

Open
RadiantJeral opened this issue Dec 9, 2019 · 1 comment

RadiantJeral commented Dec 9, 2019

🐛 Bug

2019-12-09 17:14:25,047 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "/home/lucifer/LibSrcs/pycharm-community-2018.3.2/helpers/pydev/pydevd.py", line 1741, in <module>
    main()
  File "/home/lucifer/LibSrcs/pycharm-community-2018.3.2/helpers/pydev/pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/lucifer/LibSrcs/pycharm-community-2018.3.2/helpers/pydev/pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/lucifer/LibSrcs/pycharm-community-2018.3.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/my_run/run_local/e2e_faster_rcnn_R_50_C4_1x/train_net_e2e_faster_rcnn_R_50_C4_1x.py", line 201, in <module>
    main()
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/my_run/run_local/e2e_faster_rcnn_R_50_C4_1x/train_net_e2e_faster_rcnn_R_50_C4_1x.py", line 194, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/my_run/run_local/e2e_faster_rcnn_R_50_C4_1x/train_net_e2e_faster_rcnn_R_50_C4_1x.py", line 94, in train
    arguments,
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 84, in do_train
    loss_dict = model(images, targets)
  File "/home/lucifer/.virtualenvs/maskrcnn-benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lucifer/.virtualenvs/maskrcnn-benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd
    **applier(kwargs, input_caster))
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 52, in forward
    x, result, detector_losses = self.roi_heads(features, proposals, targets)
  File "/home/lucifer/.virtualenvs/maskrcnn-benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 26, in forward
    x, detections, loss_box = self.box(features, proposals, targets)
  File "/home/lucifer/.virtualenvs/maskrcnn-benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/box_head.py", line 43, in forward
    proposals = self.loss_evaluator.subsample(proposals, targets)
  File "/home/lucifer/Projects/05_maskrcnn-benchmark/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py", line 114, in subsample
    img_sampled_inds = torch.nonzero(pos_inds_img | neg_inds_img).squeeze(1)
RuntimeError: s__th_or is not implemented for type CUDABoolType

To Reproduce

Steps to reproduce the behavior:

Run training on a single GPU with the e2e_faster_rcnn_R_50_C4_1x.yaml config.

Expected behavior

Environment

Collecting environment information...
PyTorch version: 1.0.0.dev20190409
Is debug build: No
CUDA used to build PyTorch: 9.0.176
OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1070
Nvidia driver version: 384.130
cuDNN version: Could not collect
Versions of relevant libraries:
[pip3] numpy==1.17.4
[pip3] torch-nightly==1.0.0.dev20190409
[pip3] torchvision-nightly==0.2.3
[conda] Could not collect:

Additional context

I ran into this bug while debugging issue #1080.
To work around it, I modified the file
....../maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py
at line 111, as shown below:
pos_inds_img_uint8 = pos_inds_img.type(torch.uint8)
neg_inds_img_uint8 = neg_inds_img.type(torch.uint8)
img_sampled_inds = torch.nonzero(pos_inds_img_uint8 | neg_inds_img_uint8).squeeze(1)
# img_sampled_inds = torch.nonzero(pos_inds_img | neg_inds_img).squeeze(1)
proposals_per_image = proposals[img_idx][img_sampled_inds]

This seems to fix it.
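
For anyone who wants the same workaround without editing the line in place, here is a minimal sketch. The names or_masks and sampled_indices are hypothetical and not part of maskrcnn-benchmark. It assumes both masks are torch tensors (so Tensor.byte() is available) and only falls back to the uint8 cast when the native OR raises RuntimeError, as in the traceback above.

```python
def or_masks(a, b):
    """Element-wise OR of two mask tensors.

    Tries the native `|` first. If the backend raises RuntimeError
    (as reported here for CUDA bool tensors), retries on uint8 copies.
    """
    try:
        return a | b
    except RuntimeError:
        # Assumption: both masks expose Tensor.byte(), which casts to uint8.
        return a.byte() | b.byte()


def sampled_indices(pos_mask, neg_mask):
    """Indices where either mask is set, mirroring the line in loss.py."""
    import torch  # imported lazily so or_masks itself has no dependencies

    return torch.nonzero(or_masks(pos_mask, neg_mask)).squeeze(1)
```

In subsample, the failing line would then become img_sampled_inds = sampled_indices(pos_inds_img, neg_inds_img). On builds where OR is implemented for bool tensors, the fallback branch is never taken.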

RadiantJeral (Author) commented

see #1182
