Summary
With the perspective sensor, if the film's crop_size or crop_offset changes in every iteration, a new kernel is generated in every iteration, which results in poor performance, especially if PyTorch interop is used inside the rendering code.
System configuration
System information:
OS: Red Hat Enterprise Linux release 8.8 (Ootpa)
CPU: AMD EPYC 7313 16-Core Processor
GPU: NVIDIA RTX A6000
Python: 3.10.10 (main, Feb 28 2023, 09:55:02) [GCC 8.5.0 20210514 (Red Hat 8.5.0-16)]
NVidia driver: 525.116.04
CUDA: 11.7.64
LLVM: -1.-1.-1
Dr.Jit: 0.4.2
Mitsuba: 3.3.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants:
scalar_rgb
scalar_spectral
cuda_ad_rgb
llvm_ad_rgb
Description
After some investigation, I realized that the film's size, crop_size and crop_offset parameters are mi.Scalar* types and therefore cannot be made opaque by dr.make_opaque. While the perspective sensor makes almost all of the parameters it derives from the film opaque, scaled_principal_point_offset also depends on film parameters and is recomputed on the fly in its sample_* methods.
I am considering turning scaled_principal_point_offset into an instance variable m_scaled_principal_point_offset, making it opaque, and updating all relevant code. Would this be a reasonable fix? EDIT: this does not seem to eliminate the new kernel generation.
Since such a fix involves changes to the official plugins, I decided to open an issue for it.
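For illustration, here is a minimal sketch of the scalar-vs-opaque distinction as I understand it (the no-op behaviour of dr.make_opaque on scalar types is my reading of the API, not something confirmed in this issue):

```python
import drjit as dr
import mitsuba as mi

mi.set_variant('cuda_ad_rgb')

# JIT-typed values can be made opaque: they are passed to the kernel as
# inputs rather than baked in as literal constants, so changing them
# later does not force a recompile.
x = mi.Float(0.5)
dr.make_opaque(x)

# The film's size/crop parameters are ScalarVector2u, plain host-side
# values with no JIT representation; dr.make_opaque silently does
# nothing here, so quantities derived from them inside the plugin (e.g.
# scaled_principal_point_offset) are hard-coded into the kernel, and any
# change triggers a recompilation.
crop_size = mi.ScalarVector2u(64, 64)
dr.make_opaque(crop_size)  # no effect on scalar types
```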
Steps to reproduce
- With a rendering procedure that depends on PyTorch code, call mi.render in a loop, changing the film's crop_size and crop_offset parameters in every iteration. With the Dr.Jit log level set to Debug, observe that a new kernel is generated whenever the PyTorch code is called (see the sketch after this list).
- Keep the crop_size and crop_offset film parameters the same in every iteration. Observe that each iteration runs much faster and that no new kernel is generated.
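A minimal sketch of such a loop, assuming the sensor's film exposes crop_size/crop_offset through mi.traverse (the scene, parameter keys, and spp below are illustrative; the real application computes the crop values with PyTorch):

```python
import drjit as dr
import mitsuba as mi

mi.set_variant('cuda_ad_rgb')

# Any scene with a perspective sensor exhibits the issue.
scene = mi.load_dict(mi.cornell_box())
params = mi.traverse(scene)

dr.set_log_level(dr.LogLevel.Debug)  # print kernel compilation messages

for i in range(8):
    # Move the crop window each iteration (stand-in for values that the
    # real application derives from PyTorch code).
    params['sensor.film.crop_size'] = mi.ScalarVector2u(64, 64)
    params['sensor.film.crop_offset'] = mi.ScalarVector2u(4 * i, 0)
    params.update()

    img = mi.render(scene, params, spp=4, seed=i)

# With a varying crop_offset, the debug log shows a new kernel being
# compiled in every iteration; with a fixed crop window, the kernel from
# the first iteration is reused.
```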