
Enabling OpenCL GPU Model Caching in OpenVINO #205

Closed
natelowry opened this issue Aug 12, 2022 · 1 comment


Describe the bug
OpenVINO suggests caching the GPU model for a faster load time. Is there a way to do this in the C# API? If so, I can't find it :)

Urgency
We're seeing ~8000-12000ms load times for our models, which is super inconvenient for users. It seems like the majority of that time could be saved if we could load a cached model as OpenVINO recommends here.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 21H2
  • ONNX Runtime installed from (source or binary): OpenVINO version from this repo
  • ONNX Runtime version: ONNX Runtime 1.11.0
  • Python version: 3.9.13
  • Visual Studio version (if applicable): VS2022 (not installed on edge devices)
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: 2022.1.0.3787 (?)
  • GPU model and memory: Intel Iris Xe (i5-1145G7) 8GB 27.20.100.8935

To Reproduce
It would be nice to enable caching by specifying the ov::cache_dir property either while creating the session OR when appending the execution provider (probably on the EP).

The OptimizedModelFilePath in the constructor is for the ONNX graph optimization which doesn't work (as it uses compiled nodes) and is not recommended anyway.

Here's the documentation for enabling it in the C++ code:
https://docs.openvino.ai/latest/openvino_docs_OV_UG_Model_caching_overview.html
core.set_property(ov::cache_dir("/path/to/cache/dir"));
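For context, here is roughly what that looks like as a complete native-side program (a minimal sketch assuming the OpenVINO 2022.x C++ API; it needs the OpenVINO runtime installed to build, and the model/cache paths are placeholders):

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;

    // Point OpenVINO at a writable cache directory. Plugins that
    // support caching (e.g. the GPU plugin) store compiled blobs here.
    core.set_property(ov::cache_dir("/path/to/cache/dir"));

    // The first run compiles the model and populates the cache;
    // subsequent runs load the cached blob, skipping compilation.
    auto model = core.read_model("model-path.onnx");
    auto compiled = core.compile_model(model, "GPU");

    return 0;
}
```

The ask here is essentially to expose the equivalent of that `ov::cache_dir` property through the ONNX Runtime C# API.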

Ideally it would cache the model by default and allow us to specify the cache directory if desired.

var sessionOptions = new SessionOptions
{
    LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_VERBOSE,
    GraphOptimizationLevel = GraphOptimizationLevel.ORT_DISABLE_ALL,
    //OptimizedModelFilePath = "na.onnx", //this is for the ONNX graph optimization, which doesn't apply here
    //CacheDirectory = "/this/could/be/an/option", //although since this is shared across all ONNX EPs it might be the wrong place
};

sessionOptions.AppendExecutionProvider_OpenVINO("GPU_FP16");
//sessionOptions.AppendExecutionProvider_OpenVINO(deviceId: "GPU_FP16", cacheDir: "/this/could/be/another/option"); 
//^ this seems like the more appropriate place as it's an OpenVINO specific option

_inferenceSession = new InferenceSession(modelPath: "model-path.onnx", options: sessionOptions);

Expected behavior
Model is cached when first built and read from said cache for subsequent usage.

Additional context
Thanks!


natelowry commented Aug 12, 2022

Solved it!!! 🎉

Add this to the plugins.xml and it just works™️

<plugin name="GPU" location="openvino_intel_gpu_plugin.dll">
    <properties>
        <property key="CACHE_DIR" value="cache/" />
    </properties>
</plugin>

@natelowry natelowry changed the title Enabling OpenCL GPU Model Caching Enabling OpenCL GPU Model Caching in OpenVINO Aug 12, 2022