YOLO v4 for VPU #5467

Closed
ankandrew opened this issue May 3, 2020 · 9 comments

@ankandrew

In the paper I noticed that for VPU some good candidates are EfficientNet-lite, MixNet, GhostNet and MobileNetV3. Did these give good results as backbones for YOLO v4, and did they decrease the inference time considerably?

@AlexeyAB
Owner

AlexeyAB commented May 3, 2020

EfficientNetB0-Yolo shows good AP (MS COCO) and 11 FPS on VPU even at 416x416, batch=1, async=1 (sync mode), so AP50/inference_time is much higher for EfficientNetB0-Yolo than for Yolov3-tiny: #5079

So we should try EfficientNet-b3-lite without SE, and use a neck/head without PRN (because it slows down inference on VPU). EfficientNet-lite has been shown to be more efficient on this kind of hardware (Google Coral Edge TPU): https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite


MixNet shows better accuracy/BFLOPS than EfficientNetB0, but another question is how well it can be parallelized: #4203

Maybe GhostNet is a better fit than MobileNetV3.

@ankandrew
Author

ankandrew commented May 4, 2020

It would also be interesting to consider a pruned version of EfficientNet (https://arxiv.org/abs/2002.08258). Apparently, "A pruned EfficientNet-B1 may be more efficient than EfficientNet-B0." Maybe this can push AP a little further.

Also, I guess the EfficientNet-b3-lite-YOLO option for VPUs might be interesting (with ReLU6 maybe?). There is also EfficientNet-EdgeTPU, optimized specifically for the Coral Edge TPU, which might be very good in this specific case (I don't know how it would perform on other VPUs).

Anyways, thank you for the response :)

@AlexeyAB
Owner

AlexeyAB commented May 4, 2020

Pruning usually gives the same results as training a network that has fewer weights from the start. And in EfficientNet, the optimal number of weights has already been chosen using AutoML MNAS.

ReLU6 should be used only if you want to quantize the model to INT8.
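A minimal sketch of one common reason ReLU6 is paired with 8-bit quantization: its output is always in [0, 6], so the quantization step size is known in advance, whereas an unbounded activation lets a single outlier stretch the range. The input values and the helper names (`relu6`, `quantize_uint8`) are made up for illustration, and the sketch uses unsigned 8-bit codes for simplicity; it is not how Darknet or any VPU toolchain actually quantizes.

```python
def relu6(x: float) -> float:
    """ReLU6 clamps activations to the fixed range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def quantize_uint8(values, max_value):
    """Map values in [0, max_value] to 8-bit codes 0..255 with a uniform step."""
    scale = max_value / 255.0
    codes = [min(255, max(0, round(v / scale))) for v in values]
    return codes, scale


# Hypothetical pre-activation values, including one large outlier.
pre_activations = [-1.0, 0.5, 1.5, 3.0, 40.0]

# Unbounded ReLU: the outlier sets the range, so small values get a coarse step.
relu_out = [max(v, 0.0) for v in pre_activations]
relu_codes, relu_scale = quantize_uint8(relu_out, max(relu_out))

# ReLU6: the range is always [0, 6], so the step size is fixed and finer here.
relu6_out = [relu6(v) for v in pre_activations]
relu6_codes, relu6_scale = quantize_uint8(relu6_out, 6.0)

print(f"ReLU  step size: {relu_scale:.4f}")
print(f"ReLU6 step size: {relu6_scale:.4f}")
```

With these made-up inputs the ReLU6 step is several times smaller than the unbounded one, which is the effect that matters only when the model is actually quantized.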

@ankandrew
Author

ankandrew commented May 4, 2020

Looking at the comparison of EfficientNet B1 Knapsack Pruned and B0, it looks like the pruned B1 could be used instead of the normal B0.

[Image: Comparison (screenshot "Annotation 2020-05-04 131456")]

As my original question was answered, I am closing this issue.
Thanks

@AlexeyAB
Owner

AlexeyAB commented May 7, 2020

@ankandrew It seems something went wrong during the first test of EfficientNetB0-Yolo on VPU. I re-tested many models, and it seems that EfficientNet isn't well suited even to VPU.

  • For EfficientNet-Yolo (Leaky instead of Swish, and without SE) at 256x256, async=3, I get only 12.5 FPS.
  • I get approximately the same speed with YOLOv4 (Leaky) at 256x256, async=3: 11 FPS, but its accuracy is much higher than EfficientNet-Yolo's.

So even YOLOv4 (Leaky) is more suitable for VPU than EfficientNet.
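For readers who want to try the "Leaky instead of Swish" variant, here is a small sketch that swaps the activation in Darknet-style `.cfg` text. The fragment below is a made-up example, not a real model file, and `swap_activation` is a hypothetical helper. It does not remove SE blocks, and a model changed this way would still need retraining.

```python
import re

# Hypothetical fragment in Darknet .cfg style; not taken from a real model file.
CFG_FRAGMENT = """\
[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=swish

[convolutional]
filters=255
size=1
stride=1
pad=1
activation=linear
"""


def swap_activation(cfg_text: str, old: str = "swish", new: str = "leaky") -> str:
    """Rewrite `activation=<old>` lines to `activation=<new>`; other lines are untouched."""
    pattern = re.compile(
        rf"^([ \t]*activation[ \t]*=[ \t]*){re.escape(old)}[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(rf"\g<1>{new}", cfg_text)


leaky_cfg = swap_activation(CFG_FRAGMENT)
print(leaky_cfg)
```

Only lines whose activation is exactly `swish` change, so the `linear` activation of the detection layer is preserved.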

@ankandrew
Author

ankandrew commented May 9, 2020

@AlexeyAB I wouldn't have expected such bad performance from EfficientNet-Yolo. Any thoughts on why YOLO v4 performs so much better in terms of accuracy?

@AlexeyAB
Owner

@ankandrew Are you asking about accuracy or FPS?

About accuracy:

  • YOLOv4 (with Leaky, 256x256) gives 53.0% AP50 (checked on test-dev) at 11 FPS (async=3).

  • EfficientNetB0-Yolo (with SE, Swish, 416x416) gives only 45.5% AP50. Without SE and Swish, at 256x256, the accuracy would be even lower: something like ~40% AP50 at 12 FPS (async=3).

  • EfficientNetB0+BiFPN-official (with SE, Swish, 416x416) gives only 52.2% AP50. Without SE and Swish, at 256x256, it would be something like ~45% AP50.

That's why Google doesn't use Swish, SE or grouped convolutions for the Edge TPU: https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
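To put the two 256x256 results side by side, here is a rough sketch that turns the numbers quoted above into time per frame and AP50 per millisecond, the same kind of AP50/inference_time ratio used earlier in the thread. The ~40% AP50 figure is the estimate given above, not a measurement, and with async=3 the value 1000/FPS is throughput time per frame rather than single-request latency. The `Result` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    ap50: float       # AP50 in percent
    fps: float        # frames per second on the VPU with async=3
    estimated: bool   # True when AP50 is a rough guess from the thread

    @property
    def ms_per_frame(self) -> float:
        """Average time per frame implied by the throughput."""
        return 1000.0 / self.fps

    @property
    def ap50_per_ms(self) -> float:
        """AP50 divided by time per frame; higher is better."""
        return self.ap50 / self.ms_per_frame


results = [
    Result("YOLOv4 (Leaky, 256x256)", 53.0, 11.0, estimated=False),
    Result("EfficientNetB0-Yolo (no SE, no Swish, 256x256)", 40.0, 12.0, estimated=True),
]

for r in results:
    tag = " (AP50 estimated)" if r.estimated else ""
    print(f"{r.name}: {r.ms_per_frame:.1f} ms/frame, "
          f"{r.ap50_per_ms:.3f} AP50 per ms{tag}")
```

Under these assumptions the small FPS advantage of EfficientNetB0-Yolo does not make up for its lower AP50.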

@ankandrew
Author

ankandrew commented May 11, 2020

Just out of curiosity, are there any plans to implement YOLO v4 specifically for the Coral Edge TPU?

Thanks for the update 👍

@vinorth-v

> Just out of curiosity, are there any plans to implement YOLO v4 specifically for the Coral Edge TPU?
>
> Thanks for the update

I am also interested!
