
How to transform model into ONNX format? #21

Open
JacksonVation opened this issue May 21, 2022 · 9 comments


@JacksonVation

Hi, I see that ckpt and json files are provided, and I'm trying to transform the model into ONNX format, but I can't find the corresponding neural network definition file here. So I'm wondering how I could do this.
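For reference, a minimal sketch of exporting a loaded PyTorch checkpoint to ONNX with `torch.onnx.export`. The network-builder helper, file names, and input resolution below are placeholders rather than this repo's actual API; substitute however the provided json config builds the `torch.nn.Module`:

```python
# Hypothetical sketch: load a checkpoint into a torch.nn.Module and export to ONNX.
# build_model_from_config, the file names, and the input size are assumptions.
import json
import torch

from my_project import build_model_from_config  # hypothetical helper, not from this repo

with open("mcunet-320kb-1mb_imagenet.json") as f:
    config = json.load(f)

model = build_model_from_config(config)  # returns a torch.nn.Module
state_dict = torch.load("mcunet-320kb-1mb_imagenet.ckpt", map_location="cpu")
model.load_state_dict(state_dict.get("state_dict", state_dict))
model.eval()

dummy = torch.randn(1, 3, 160, 160)  # adjust to the model's actual input resolution
torch.onnx.export(
    model, dummy, "mcunet.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=11,
)
```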

@JacksonVation
Author

So I'm trying to deploy the given model on my device. The tflite files certainly fit the device, but I think the ckpt and json files should be transformed into ONNX format, or they won't suit the board.

@JacksonVation
Author

I successfully transformed the model into ONNX format, but its Flash and SRAM usage both exceed my board's limits. Is this because I'm not using TinyEngine, or did I not compress the model properly?

@JacksonVation
Author

It spilled over like this: [screenshot: Flash and SRAM overflow report]

@tonylins
Collaborator

Hi, thanks for reaching out. Is the converted ONNX file quantized to int8? Quantization will significantly reduce memory usage. The tflite file should already be quantized, so maybe you can try it and see if it works.
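For reference, a quick, repo-independent way to confirm whether a given .tflite file is already int8-quantized is to inspect its tensor dtypes with the TFLite interpreter; the file name below is just an example:

```python
# Inspect a tflite model's input/output tensors: quantized models report
# int8 (or uint8) dtypes, float models report float32.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mcunet-320kb-1mb_imagenet.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details() + interpreter.get_output_details():
    # "quantization" holds the (scale, zero_point) pair for quantized tensors
    print(detail["name"], detail["dtype"], detail["quantization"])
```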

@JacksonVation
Author

Hi, thanks for your nice response. Actually, that's exactly what I missed: I didn't quantize the model. Today I tried several ways to convert the ONNX file to int8, but unfortunately STM32CubeMX still couldn't accept it (screenshot below). After searching the documentation, I was surprised to find that the platform seems to accept only quantized Keras and TFLite models. But I believe quantization is the right thing to do, so I'm going to try converting the TFLite files tomorrow, and I believe that will work.
[screenshot: STM32CubeMX result]
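As a side note, one common (but not repo-specific) way to produce an int8 ONNX file is ONNX Runtime's post-training dynamic quantization, sketched below; it mainly quantizes weights, so it shrinks Flash but may not reduce activation SRAM, and the resulting ONNX file is still not a format Cube AI accepts:

```python
# Post-training dynamic quantization of an ONNX model with ONNX Runtime.
# File names are examples; only weights are converted to int8 here.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="mcunet.onnx",
    model_output="mcunet_int8.onnx",
    weight_type=QuantType.QInt8,
)
```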

@JacksonVation
Author

Hi, "generate_tflite.py" works great for converting the model, and I found that the model can be quantized during that procedure as well. What we got from it looks similar to the provided tflite file. So we also tested the tflite files on the platform, and it's great to see that Flash was reduced to a proper scale, but SRAM still overflows somewhat. For example, the "mcunet-320kb-1mb" model overflows, while "mcunet-256kb-1mb" fits. However, our device is an STM32F746, which has 320KB SRAM and 1MB Flash, so I believe the former should fit. Since the tflite file is already quantized, what else can we do to reduce the SRAM overflow? Or did we not quantize it enough?
[screenshots: STM32CubeMX memory analysis of the tflite models]
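For context, a generic TensorFlow post-training full-integer quantization recipe looks like the sketch below; the actual generate_tflite.py in this repo may differ in how it builds the model and feeds calibration data, and the saved-model path and input shape here are assumptions:

```python
# Generic full-integer post-training quantization with the TFLite converter.
# Both weights and activations end up in int8, which is what reduces SRAM.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Replace with a few hundred real preprocessed images for good calibration.
    for _ in range(100):
        yield [np.random.rand(1, 160, 160, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("mcunet_saved_model")  # path is an assumption
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("mcunet_int8.tflite", "wb") as f:
    f.write(converter.convert())
```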

@JacksonVation
Author

It's so weird that "mcunet-320kb-1mb" needs even more SRAM than "mcunet-512kb-2mb". We just tested the bigger one, "mcunet-512kb-2mb", and it surprised us that it occupies only 416.75KB, which is smaller than the 467.03KB of "mcunet-320kb-1mb".
[screenshot: STM32CubeMX memory analysis]
Maybe there's something unusual about the "mcunet-320kb-1mb_imagenet.tflite" file.

@tonylins
Collaborator

Hi, the memory usage depends on the system stack. We used TinyEngine in our experiments, which has a different memory usage than Cube AI, so it is normal that the peak memory does not align. The 320KB model should fit the device with TinyEngine, but may not with Cube AI.

@JacksonVation
Author

Hi, that definitely makes sense. Thanks for your response. Next, we're going to try deploying the adaptive model on our device and implementing some functionality on it, just like what you've shown in your demo video, which is really cool!
