When I use the csresnext50-panet-spp.cfg to train #3

Open
Timmmmmms opened this issue Dec 16, 2019 · 5 comments

@Timmmmmms

RuntimeError: shape '[512, 2048, 1, 1]' is invalid for input of size 680093

Can you tell me what is wrong?

@WongKinYiu
Owner

WongKinYiu commented Dec 16, 2019

@Timmmmmms

I guess you are running the code from https://github.com/ultralytics/yolov3 with a pretrained model.

If so, please add

    elif file == 'csresnext50c.conv.80': # change to your pretrain model name
        cutoff = 80

in https://github.com/ultralytics/yolov3/blob/master/models.py#L324
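
As a minimal sketch of what such a cutoff does, assuming the loader picks the number of leading layers to load from the weights file name: the helper `cutoff_for` and the table `CUTOFFS` below are hypothetical names for illustration and are not part of either repository. Only the `csresnext50c.conv.80` entry comes from this thread.

```python
from pathlib import Path

# Hypothetical lookup table: pretrained file name -> number of leading
# layers whose weights should be loaded. The single entry is the one
# suggested in this thread; add your own pretrained model name here.
CUTOFFS = {
    "csresnext50c.conv.80": 80,
}


def cutoff_for(weights_path, default=-1):
    """Return how many layers to load from a weights file.

    A return value of -1 is used here to mean "no cutoff, load all layers".
    """
    return CUTOFFS.get(Path(weights_path).name, default)


print(cutoff_for("weights/csresnext50c.conv.80"))
print(cutoff_for("weights/some-full-model.weights"))
```

Matching on the file name only, rather than the full path, keeps the lookup independent of where the weights are stored.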

Also, because https://github.com/ultralytics/yolov3 does not support shortcut layers that add feature maps with different numbers of channels, you should modify the filter numbers as described in ultralytics/yolov3#698 (comment).

However, I think the order of inputs in route layers with multiple inputs may differ between https://github.com/AlexeyAB/darknet and https://github.com/ultralytics/yolov3.
So when you use the pretrained model, you may not get the expected results. #3 (comment)

@AlexeyAB
Collaborator

@WongKinYiu

However, I think the order of multiple-inputs-route-layer in https://github.com/AlexeyAB/darknet and https://github.com/ultralytics/yolov3 may be different.

This conversion code works well with the yolov3-spp.weights/cfg files, even though yolov3-spp uses route layers with multiple inputs: https://github.com/ultralytics/yolov3#darknet-conversion

@WongKinYiu
Owner

WongKinYiu commented Dec 17, 2019

@AlexeyAB Thanks

After checking the code, the two implementations seem to be the same.
https://github.com/ultralytics/yolov3/blob/master/models.py#L56
https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L846

I will check why csresnext50-panet-spp does not perform normally after being converted to .pt or trained in PyTorch.

※ Update: the shortcut layer handles the number of filters differently in the two implementations, but I am not sure whether this actually affects the result. https://github.com/ultralytics/yolov3/blob/master/models.py#L63
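
To make the channel-count question concrete, here is a minimal sketch of one possible convention: a shortcut that sums two feature maps only over the channels they share and passes the remaining channels through unchanged. This is an assumption for illustration and is not taken from either repository's code; `shortcut_add` is a hypothetical name, and feature maps are represented as plain lists of channels to keep the example self-contained.

```python
def shortcut_add(a, b):
    """Add feature map b into a over the channels both maps share.

    a and b are lists of channels; each channel is a flat list of values
    with the same spatial size. Channels of a beyond len(b) are copied
    through unchanged, and extra channels of b are ignored.
    """
    shared = min(len(a), len(b))
    out = [list(channel) for channel in a]
    for c in range(shared):
        out[c] = [x + y for x, y in zip(a[c], b[c])]
    return out


a = [[1, 1], [2, 2], [3, 3]]  # 3 channels, 2 values each
b = [[10, 10]]                # 1 channel, 2 values
print(shortcut_add(a, b))
```

An implementation that instead requires both inputs to have the same number of channels would reject this input, which is why the filter numbers in the cfg may need to be changed.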

@Timmmmmms
Author

@WongKinYiu

I tried that, but I still get an error when I train on my data:
python train.py --data data/coco.data --cfg cfg/csresnext50-panet-spp.cfg

RuntimeError: shape '[512, 512, 3, 3]' is invalid for input of size 1620480
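
One way to read these errors, assuming the weight loader reshapes a slice of a flat weights array into each convolution's weight tensor: the requested shape needs more values than the slice contains, which can happen when the weights file does not match the layers in the cfg. The sketch below only checks that arithmetic with the numbers from the two error messages in this thread; `check_view` is a hypothetical helper, not a function from the repository.

```python
from math import prod


def check_view(shape, available):
    """Compare the element count a shape needs with the values available."""
    needed = prod(shape)
    return needed, available, needed == available


# Shapes and input sizes copied from the two error messages in this thread.
for shape, available in [((512, 2048, 1, 1), 680093),
                         ((512, 512, 3, 3), 1620480)]:
    needed, have, ok = check_view(shape, available)
    print(f"shape {shape}: needs {needed}, has {have}, match={ok}")
```

In both cases the shape needs more values than are available (1048576 versus 680093, and 2359296 versus 1620480), so the reshape cannot succeed.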

@WongKinYiu
Owner

WongKinYiu commented Dec 17, 2019

@Timmmmmms Hello,

Use --weights '' if you do not want to use pretrained weights.

python train.py --data data/coco.data --weights '' --cfg cfg/csresnext50-panet-spp.cfg
