Threads acquiring an exclusive lock on an Atomic_RW_Mutex wait forever. #5254

Closed
kb9113 opened this issue Jun 2, 2025 · 1 comment · Fixed by #5267
Labels
bug, replicated (We were able to replicate the bug.)

Comments

kb9113 commented Jun 2, 2025

Context

Odin:    dev-2025-05:843648c81
OS:      Fedora Linux 42 (Workstation Edition), Linux 6.14.6-300.fc42.x86_64
CPU:     AMD Ryzen 9 7900 12-Core Processor
RAM:     31229 MiB
Backend: LLVM 20.1.3

Expected Behavior

After an Atomic_RW_Mutex is unlocked using sync.shared_unlock, any thread waiting to acquire the RW mutex with sync.lock should proceed past the lock.

Current Behavior

Threads that attempt to acquire an exclusive lock while another thread holds a shared lock appear to wait forever, even after the other thread releases the shared lock.

Failure Information (for bugs)

Steps to Reproduce

In the code below, I would expect task1 to run and acquire the RW mutex for reading. task2 would then try to acquire the RW mutex for writing, wait for task1 to release it, and then continue through the code.

However, when I run the code, I get:

before lock
task1 done

The program then appears to wait forever to acquire the lock for writing.

package main

import "core:fmt"
import "core:sync"
import "core:thread"
import "core:time"

main :: proc()
{
    wg : sync.Wait_Group

    t1 := thread.create(task1)
    t1.init_context = context
    t1.user_index = 1
    t1.data = &wg

    t2 := thread.create(task2)
    t2.init_context = context
    t2.user_index = 1
    t2.data = &wg

    sync.wait_group_add(&wg, 2)

    thread.start(t1)
    thread.start(t2)

    sync.wait_group_wait(&wg)
}

rw_mutex : sync.Atomic_RW_Mutex

task1 :: proc(t: ^thread.Thread)
{
    // Hold the shared (read) lock for 2 seconds, then release it.
    sync.shared_lock(&rw_mutex)
    time.sleep(time.Second * time.Duration(2))
    sync.shared_unlock(&rw_mutex)

    sync.wait_group_done((cast(^sync.Wait_Group)t.data))

    fmt.println("task1 done")
}

task2 :: proc(t: ^thread.Thread)
{
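    // Start 1 second late so that task1 already holds the shared lock.
    time.sleep(time.Second * time.Duration(1))

    fmt.println("before lock")
    // Exclusive (write) lock: expected to succeed once task1 unlocks,
    // but "after lock" is never printed.
    sync.lock(&rw_mutex)
    fmt.println("after lock")
    time.sleep(time.Second * time.Duration(1))
    sync.unlock(&rw_mutex)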

    sync.wait_group_done((cast(^sync.Wait_Group)t.data))

    fmt.println("task2 done")
}
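
For anyone who wants to check this without leaving a process hanging, here is a minimal sketch of the same scenario with a watchdog. It is not part of the original report: the names `reader` and `writer`, the 5-second timeout, and the use of globals instead of `t.data` are my own choices, and it assumes `sync.wait_group_wait_with_timeout` and `thread.create_and_start` are available in your Odin build.

```odin
package main

import "core:fmt"
import "core:os"
import "core:sync"
import "core:thread"
import "core:time"

rw_mutex: sync.Atomic_RW_Mutex
wg:       sync.Wait_Group

// Same role as task1: hold the shared lock for 2 seconds.
reader :: proc() {
	sync.shared_lock(&rw_mutex)
	time.sleep(2 * time.Second)
	sync.shared_unlock(&rw_mutex)
	sync.wait_group_done(&wg)
}

// Same role as task2: request the exclusive lock while the reader holds it.
writer :: proc() {
	time.sleep(1 * time.Second)
	sync.lock(&rw_mutex)
	sync.unlock(&rw_mutex)
	sync.wait_group_done(&wg)
}

main :: proc() {
	sync.wait_group_add(&wg, 2)
	t1 := thread.create_and_start(reader)
	t2 := thread.create_and_start(writer)

	// The 5-second limit is an arbitrary choice; the scenario itself
	// needs about 2 seconds when the writer is woken correctly.
	if !sync.wait_group_wait_with_timeout(&wg, 5 * time.Second) {
		fmt.println("hang: writer still blocked after 5s")
		os.exit(1) // exit without joining the blocked thread
	}

	fmt.println("ok: both threads finished")
	thread.destroy(t1)
	thread.destroy(t2)
}
```

Which branch runs depends on whether the build still has the bug, so a non-zero exit code can serve as a simple regression check.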

Failure Logs

@Feoramund
Contributor

I can confirm replication of this bug. I'll see if I can fix this today.

Kelimion added the bug and replicated (We were able to replicate the bug.) labels Jun 3, 2025
Feoramund added a commit to Feoramund/Odin that referenced this issue Jun 3, 2025
This patch simplifies the implementation and fixes odin-lang#5254.

Previously, the mutex was set up as if there could be multiple writers,
and there seemed to be some confusion as to which `Writer` bits to
check, as not all were checked or set at the same time.

This could also result in the mutex being left in a non-zero state even
after unlocking all locks.

All unneeded state has been removed and extra checks have been put in
place.
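
To illustrate the property the commit message describes (a single writer, and a state word that returns to zero once every lock is released), here is a small single-threaded model. It is only a sketch under my own assumptions: the bit layout, the `Model_RW` type, and the `model_*` procedures are hypothetical, it uses no atomics or blocking, and it is not the code from `core:sync` or from the patch.

```odin
package main

import "core:fmt"

// Hypothetical layout, for illustration only; core:sync may differ.
WRITER_BIT  :: uint(1) // bit 0: set while the single writer holds the lock
READER_UNIT :: uint(2) // each reader adds 2, so the count sits above bit 0

Model_RW :: struct {
	state: uint,
}

// A reader may enter only while the writer bit is clear.
model_try_shared_lock :: proc(m: ^Model_RW) -> bool {
	if (m.state & WRITER_BIT) != 0 {
		return false
	}
	m.state += READER_UNIT
	return true
}

model_shared_unlock :: proc(m: ^Model_RW) {
	assert(m.state >= READER_UNIT, "shared_unlock without a reader")
	m.state -= READER_UNIT
}

// The writer may enter only when there are no readers and no writer.
model_try_lock :: proc(m: ^Model_RW) -> bool {
	if m.state != 0 {
		return false
	}
	m.state = WRITER_BIT
	return true
}

model_unlock :: proc(m: ^Model_RW) {
	assert(m.state == WRITER_BIT, "unlock without the writer bit set")
	m.state = 0
}

main :: proc() {
	m: Model_RW

	got_read := model_try_shared_lock(&m) // reader enters first
	assert(got_read)

	got_write := model_try_lock(&m) // writer must wait for the reader
	assert(!got_write)

	model_shared_unlock(&m)
	assert(m.state == 0) // nothing left behind for the writer to trip on

	got_write = model_try_lock(&m) // now the writer can proceed
	assert(got_write)

	model_unlock(&m)
	fmt.println("state after all unlocks:", m.state)
}
```

In this model the writer's second attempt succeeds precisely because the word is back to zero after the reader unlocks. A leftover non-zero state after unlocking, as described in the commit message, would presumably keep a waiting writer blocked in the way this issue reports.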
3 participants