Renderer: Reduce scope of mutex locks to prevent common deadlocks #105138


Merged: 1 commit into godotengine:master on Apr 15, 2025

Conversation


@stuartcarnie stuartcarnie commented Apr 8, 2025

Fixes #102877

Introduces fine-grained locking to ShaderRD, so that each version has its own lock, making ShaderRD version APIs thread safe.

Note

One of the initialize APIs must be called prior to querying a version; this is a requirement. The initialize APIs mutate some of the state, which is read-only from that point on.

BaseMaterial3D and MaterialStorage were also changed to extract the queued updates into a separate list (without allocating) and then release the lock. This ensures that shader updates are never performed while the shared locks of BaseMaterial3D and MaterialStorage are held.

Now that accessing shader versions of ShaderRD is thread-safe, a number of acquisitions of the shared mutex in SceneShaderForwardClustered when querying shader versions have been removed; these were a source of many deadlocks.

ShaderRD no longer uses a separate named WorkerThreadPool.

Note

We may want to introduce a platform-specific Thread implementation for Apple, so that we can increase the size of the stack for worker threads.
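The per-version locking described above can be sketched in a few lines. This is a simplified illustration with hypothetical names (`Version`, `ShaderSketch`, `get_variant`), not the actual ShaderRD code: each version carries its own mutex, so querying one version never blocks work on another, and version creation is assumed to happen before concurrent queries, matching the note about the initialize APIs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Sketch: each Version owns its own mutex, so locking is per-version
// rather than shared across the whole shader.
struct Version {
	std::mutex mutex; // guards this version only
	bool dirty = true;
	int variant = -1;
};

class ShaderSketch {
	std::map<int, std::unique_ptr<Version>> versions;

public:
	// Assumed to be called before any concurrent queries (the
	// "initialize before querying" requirement from the PR note).
	int version_create() {
		int id = (int)versions.size();
		versions.emplace(id, std::make_unique<Version>());
		return id;
	}

	// Thread-safe query: lock only the one version being read.
	int get_variant(int p_id) {
		Version *v = versions.at(p_id).get();
		std::lock_guard<std::mutex> lock(v->mutex);
		if (v->dirty) {
			v->variant = 42; // stand-in for compiling the variant
			v->dirty = false;
		}
		return v->variant;
	}
};
```

Two threads querying different versions contend on different mutexes, which is the property that lets the shared mutex be dropped from the query paths.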

@stuartcarnie stuartcarnie requested review from a team as code owners April 8, 2025 05:15
@bruvzg bruvzg self-requested a review April 8, 2025 05:16

@stuartcarnie stuartcarnie left a comment


Left a few notes

@@ -182,6 +182,7 @@ void WorkerThreadPool::_process_task(Task *p_task) {

 void WorkerThreadPool::_thread_function(void *p_user) {
 	ThreadData *thread_data = (ThreadData *)p_user;
+	Thread::set_name(vformat("WorkerThread %d", thread_data->index));
Contributor Author


Make it easy to find the worker threads

Comment on lines 688 to 695
-	if (shader_map.has(current_key)) {
-		shader_map[current_key].users--;
-		if (shader_map[current_key].users == 0) {
-			// Deallocate shader which is no longer in use.
-			RS::get_singleton()->free(shader_map[current_key].shader);
-			shader_map.erase(current_key);
+	{
+		MutexLock lock(shader_map_mutex);
+		if (ShaderData *v = shader_map.getptr(current_key); v) {
+			v->users--;
+			if (v->users == 0) {
+				// Deallocate shader which is no longer in use.
+				RS::get_singleton()->free(v->shader);
Contributor Author


An improvement, as we also no longer perform multiple lookups into shader_map.

Member


This is a great pattern that is worth using more widely. Calling has() before accessing an element is almost always a waste of cycles.
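The same single-lookup idea can be shown with the standard library: `find()` hashes the key once and hands back an iterator, where `count()` followed by `operator[]` hashes it again for every access. This is an illustrative analogue using `std::unordered_map` (Godot's `HashMap::getptr` plays the role of `find()` here); `release_user` is a hypothetical helper mirroring the user-count decrement in the diff above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Decrement a reference count with a single hash lookup.
// Returns the remaining count, 0 if the entry was erased,
// or -1 if the key was not present.
int release_user(std::unordered_map<std::string, int> &users, const std::string &key) {
	auto it = users.find(key); // one lookup, reused below
	if (it == users.end()) {
		return -1; // not present
	}
	if (--it->second == 0) {
		users.erase(it); // erase via iterator: no second lookup
		return 0;
	}
	return it->second;
}
```

Compare with `if (users.count(key)) { users[key]--; ... }`, which performs two or three hash lookups for the same effect.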

Comment on lines 699 to 705
-	if (shader_map.has(mk)) {
-		shader_rid = shader_map[mk].shader;
-		shader_map[mk].users++;
+	if (ShaderData *v = shader_map.getptr(mk); v) {
+		shader_rid = v->shader;
+		v->users++;
Contributor Author


Same here – let's not perform multiple lookups into shader_map.

if (ShaderData *v = shader_map.getptr(mk); v) {
// We raced and managed to sneak the same key in concurrently, so we'll destroy the one we just created,
// given we know it isn't used, and use the winner.
RS::get_singleton()->free(shader_data.shader);
Contributor Author


@clayjohn I had to remove this, as it caused a crash in the List destructor, because the list wasn't empty.

Member


Makes sense

Comment on lines +1978 to +1999
+	SelfList<BaseMaterial3D>::List copy;
+	{
+		MutexLock lock(material_mutex);
+		while (SelfList<BaseMaterial3D> *E = dirty_materials.first()) {
+			dirty_materials.remove(E);
+			copy.add(E);
+		}
+	}
+
-	while (dirty_materials.first()) {
-		dirty_materials.first()->self()->_update_shader();
-		dirty_materials.first()->remove_from_list();
+	while (SelfList<BaseMaterial3D> *E = copy.first()) {
+		E->self()->_update_shader();
+		copy.remove(E);
Contributor Author


This doesn't allocate anything, which is convenient! We just move entries from the dirty_materials list into the local copy.
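The standard-library analogue of this drain-under-lock pattern is `std::list::splice`, which transfers nodes in O(1) without allocating. This sketch shows the same locking shape (hold the mutex only for the move, not while processing each element); Godot's `SelfList` is an intrusive list, but the idea is identical, and `process_dirty` is a hypothetical stand-in for the update loop.

```cpp
#include <list>
#include <mutex>

// Global dirty list and its lock, mirroring dirty_materials/material_mutex.
std::list<int> dirty;
std::mutex dirty_mutex;

// Drain the dirty list into a local list under the lock, then process
// the elements with the lock released, so processing (which may itself
// take other locks, e.g. for shader compilation) cannot deadlock.
int process_dirty() {
	std::list<int> local;
	{
		std::lock_guard<std::mutex> lock(dirty_mutex);
		local.splice(local.end(), dirty); // O(1) node transfer, no allocation
	}
	int processed = 0;
	for (int v : local) {
		(void)v; // stand-in for _update_shader()
		++processed;
	}
	return processed;
}
```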

Comment on lines -82 to -83
-	Mutex variant_set_mutex;

Contributor Author


No longer needed, as we have a mutex per version.

@@ -95,7 +94,9 @@ class ShaderRD {
 	void _compile_ensure_finished(Version *p_version);
 	void _allocate_placeholders(Version *p_version, int p_group);

-	RID_Owner<Version> version_owner;
+	RID_Owner<Version, true> version_owner;
Contributor Author


Make this thread-safe now
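The general pattern behind a boolean thread-safety template parameter like `RID_Owner<Version, true>` is to select a real mutex or a no-op one at compile time, so single-threaded users pay nothing. This is a generic sketch of that technique with made-up names (`OwnerSketch`, `NoMutex`), not Godot's actual RID_Owner implementation:

```cpp
#include <mutex>
#include <type_traits>
#include <vector>

// A "mutex" that satisfies BasicLockable but does nothing; used when the
// owner is instantiated without thread safety.
struct NoMutex {
	void lock() {}
	void unlock() {}
};

template <class T, bool THREAD_SAFE = false>
class OwnerSketch {
	// Compile-time selection: a real mutex only when requested.
	using MutexT = std::conditional_t<THREAD_SAFE, std::mutex, NoMutex>;
	MutexT mutex;
	std::vector<T> items;

public:
	size_t add(const T &p_item) {
		std::lock_guard<MutexT> lock(mutex);
		items.push_back(p_item);
		return items.size() - 1;
	}
	T get(size_t p_id) {
		std::lock_guard<MutexT> lock(mutex);
		return items[p_id];
	}
};
```

With `THREAD_SAFE = false` the lock guards compile down to nothing, which is why flipping the flag on `version_owner` is a cheap, targeted way to make version lookup safe from multiple threads.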

Comment on lines +2139 to +2151
+	SelfList<Material>::List copy;
+	{
+		MutexLock lock(material_update_list_mutex);
+		while (SelfList<Material> *E = material_update_list.first()) {
+			DEV_ASSERT(E == &E->self()->update_element);
+			material_update_list.remove(E);
+			copy.add(E);
+		}
+	}
+
+	while (SelfList<Material> *E = copy.first()) {
+		Material *material = E->self();
+		copy.remove(E);
Contributor Author


This change was necessary to limit the scope of the lock, as update_parameters could cause shader compilation.

@stuartcarnie
Contributor Author

@clayjohn I've switched to the preferred style

Member

@clayjohn clayjohn left a comment


Thanks!

@Repiteo Repiteo merged commit f56a4d4 into godotengine:master Apr 15, 2025
20 checks passed
@Repiteo
Contributor

Repiteo commented Apr 15, 2025

Thanks!


Successfully merging this pull request may close these issues.

Scenes can't be fully loaded and Godot freezes
4 participants