Description
Intro
The current pybind11 functional type_caster supports invoking callbacks passed from Python asynchronously, i.e. from multiple C++ threads. It does so by holding the GIL while the functor executes, as shown in the following essential code from type_caster::load(), which initializes the value member:
...
value = [func](Args... args) -> Return {
    gil_scoped_acquire acq;
    object retval(func(std::forward<Args>(args)...));
    /* Visual studio 2015 parser issue: need parentheses around this expression */
    return (retval.template cast<Return>());
};
Notice the statement gil_scoped_acquire acq;, which acquires and releases the GIL in RAII fashion.
Problem
The problem with the code above is that the destruction of value (and of the captured Python functor func) happens after the GIL has been released. If func is, for example, a Python lambda that captures some variables, the reference counts of those variables are decremented (and the variables possibly freed) while the GIL is no longer held.
Note that the whole sequence of invoking and destroying the functor can run in a worker C++ thread, which leads to UB (immediate termination in my experience).
The problem does not arise if func is a plain function or a stateless lambda.
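The ordering issue can be reproduced without Python at all. The following is a minimal sketch, not pybind11 code: g_gil_held, ScopedAcquire, FakePyFunc and make_callback are hypothetical stand-ins that only model "is the GIL held right now", and the callback signature int(int) is an assumption made for illustration. The wrapper has the same shape as the one quoted above, and the probe records whether the modelled GIL was held when the captured functor was finally destroyed.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical model of "the current thread holds the GIL" (not pybind11 API).
static bool g_gil_held = false;

// Hypothetical stand-in for gil_scoped_acquire: RAII, re-entrant.
struct ScopedAcquire {
    bool was_held;
    ScopedAcquire() : was_held(g_gil_held) { g_gil_held = true; }
    ~ScopedAcquire() { g_gil_held = was_held; }
};

// Hypothetical stand-in for a Python callable with captured state.
// When the last copy dies, the deleter records whether the "GIL" was held.
struct FakePyFunc {
    std::shared_ptr<bool> probe;
    explicit FakePyFunc(bool* held_at_destruction)
        : probe(held_at_destruction, [](bool* p) { *p = g_gil_held; }) {}
    int operator()(int x) const {
        assert(g_gil_held);  // calling into "Python" requires the "GIL"
        return x + 1;
    }
};

// Same shape as the wrapper above: the "GIL" is held only during the call.
std::function<int(int)> make_callback(FakePyFunc func) {
    return [func](int x) -> int {
        ScopedAcquire acq;
        return func(x);
    };
}

// Returns true if the captured functor was destroyed while the "GIL" was held.
bool naive_callback_destroyed_under_gil() {
    bool held = true;  // overwritten by the probe's deleter
    {
        std::function<int(int)> value = make_callback(FakePyFunc(&held));
        value(1);
    }  // value and the captured functor die here, with no ScopedAcquire alive
    return held;
}
```

In this model naive_callback_destroyed_under_gil() returns false: the call itself is protected, but the final release of the captured state is not.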
Solution
I've made a workaround to this issue by replacing the code above with the following:
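...
// dynamically allocated lambda that actually invokes the passed functor
auto f = new auto([func](Args... args) -> Return {
    object retval(func(std::forward<Args>(args)...));
    /* Visual studio 2015 parser issue: need parentheses around this expression */
    return (retval.template cast<Return>());
});
if(!f) return false;
// ensure the GIL is released AFTER the functor destructor is called
value = [f](Args... args) -> Return {
    gil_scoped_acquire acq;
    // owner is declared after acq, so *f is deleted before the GIL is released
    // (std::unique_ptr requires <memory>)
    std::unique_ptr<typename std::remove_pointer<decltype(f)>::type> owner(f);
    // note: f is deleted after the first call, so value may be invoked only once
    return (*owner)(std::forward<Args>(args)...);
};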
Basically, it keeps the GIL held until the functor has finished and has been completely destroyed. This approach completely cures the problem.
However, the downside is that it dynamically allocates the inner lambda (the one that actually invokes func). With C++17 the lambda will be constructed in place, with no copy or move involved. Still, this proposal may be sub-optimal.
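One possible way to avoid the heap allocation, offered only as a sketch under the same assumptions: wrap the captured functor in a small handle whose destructor takes the GIL before letting go of the functor. Everything here (g_gil_held, ScopedAcquire, FakePyFunc, FuncHandle, make_safe_callback) is a hypothetical stand-in rather than pybind11 API. In a real version the copy of the handle would presumably also need the GIL, since copying a Python object touches its reference count.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical model of "the current thread holds the GIL" (not pybind11 API).
static bool g_gil_held = false;

// Hypothetical stand-in for gil_scoped_acquire: RAII, re-entrant.
struct ScopedAcquire {
    bool was_held;
    ScopedAcquire() : was_held(g_gil_held) { g_gil_held = true; }
    ~ScopedAcquire() { g_gil_held = was_held; }
};

// Hypothetical stand-in for a Python callable with captured state.
// When the last copy dies, the deleter records whether the "GIL" was held.
struct FakePyFunc {
    std::shared_ptr<bool> probe;
    explicit FakePyFunc(bool* held_at_destruction)
        : probe(held_at_destruction, [](bool* p) { *p = g_gil_held; }) {}
    int operator()(int x) const {
        assert(g_gil_held);  // calling into "Python" requires the "GIL"
        return x + 1;
    }
};

// Owns the functor and makes sure the "GIL" is held whenever it is released.
struct FuncHandle {
    FakePyFunc f;
    explicit FuncHandle(FakePyFunc fn) : f(std::move(fn)) {}
    FuncHandle(const FuncHandle&) = default;  // a real version would lock here too
    ~FuncHandle() {
        ScopedAcquire acq;
        FakePyFunc doomed(std::move(f));  // doomed is destroyed before acq
    }
};

// No heap-allocated inner lambda: the handle is captured by value.
std::function<int(int)> make_safe_callback(FakePyFunc func) {
    FuncHandle handle(std::move(func));
    return [handle](int x) -> int {
        ScopedAcquire acq;
        return handle.f(x);
    };
}

// Returns true if the captured functor was destroyed while the "GIL" was held.
bool safe_callback_destroyed_under_gil() {
    bool held = false;  // overwritten by the probe's deleter
    {
        std::function<int(int)> value = make_safe_callback(FakePyFunc(&held));
        value(1);
        value(2);  // unlike the delete-after-call workaround, reusable
    }  // FuncHandle's destructor takes the "GIL" before releasing the functor
    return held;
}
```

In this model safe_callback_destroyed_under_gil() returns true, and the callback stays valid across repeated invocations and copies.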
So I'm asking the core devs to look into this issue, because it places rather severe limitations on how Python callbacks can be used.