Thread safety #574
Since other threads can create automaton states with an index larger than the size of the position array, we check at each step, while running the automaton, that the current index fits.
This is slow, but we can check that TSan reports no data race.
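A minimal sketch of the per-step index check described above. The `scratch` type, `ensure_fits`, and `record` are hypothetical names, and growing the array when the index does not fit is an assumption made for illustration; the PR may handle an out-of-range index differently.

```ocaml
(* Hypothetical per-match buffer, indexed by automaton state index. *)
type scratch = { mutable positions : int array }

(* Make sure [idx] is a valid index, growing the buffer if another
   thread has created states beyond its current size (assumed policy). *)
let ensure_fits scratch idx =
  let len = Array.length scratch.positions in
  if idx >= len then begin
    let bigger = Array.make (max (idx + 1) (2 * len)) (-1) in
    Array.blit scratch.positions 0 bigger 0 len;
    scratch.positions <- bigger
  end

(* Called at each step of the match: check first, then write. *)
let record scratch ~state_idx ~pos =
  ensure_fits scratch state_idx;
  scratch.positions.(state_idx) <- pos
```

The check costs one comparison per step, which is the overhead the commit message refers to.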
Basically, we are building an automaton lazily. We use double-checked locking to avoid acquiring a mutex when traversing a part of the automaton which has already been computed. The memory model ensures that we see either an uninitialized state or the initialized state. If we see the initialized state, we can just proceed. Otherwise, we acquire a mutex and update the state after checking this has not been done by another thread.
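As an illustration of the pattern, here is a small OCaml 5 sketch of double-checked locking on a lazily filled table. It is not the code from this PR: `table`, `get`, and `computations` are hypothetical, the "state" is just an integer, and the sketch publishes each slot through `Atomic` for simplicity, which the actual implementation may not do.

```ocaml
let size = 64

(* Each slot is either not yet computed (None) or initialized (Some v). *)
let table : int option Atomic.t array =
  Array.init size (fun _ -> Atomic.make None)

let lock = Mutex.create ()

(* Number of slow-path initializations; only touched while holding [lock]. *)
let computations = ref 0

let get i =
  match Atomic.get table.(i) with
  | Some v -> v (* fast path: already initialized, no mutex needed *)
  | None ->
    Mutex.lock lock;
    Fun.protect
      ~finally:(fun () -> Mutex.unlock lock)
      (fun () ->
        (* Second check: another thread may have filled the slot
           while we were waiting for the mutex. *)
        match Atomic.get table.(i) with
        | Some v -> v
        | None ->
          incr computations;
          let v = i * i in
          Atomic.set table.(i) (Some v);
          v)

let () =
  let worker () =
    for i = 0 to size - 1 do
      assert (get i = i * i)
    done
  in
  let domains = List.init 4 (fun _ -> Domain.spawn worker) in
  List.iter Domain.join domains
```

Because the second check happens under the mutex, each slot is initialized exactly once even when several domains reach it at the same time.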
Use a fake implementation of mutexes and domains in this case to avoid a dependency on the threads library
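One way such a stand-in could look, sketched for mutexes only. The `MUTEX` signature, `Fake_mutex`, and `With_lock` are hypothetical names, not the modules used in the PR; the point is only that a no-op implementation can satisfy the same interface when no real threads are involved.

```ocaml
(* The small part of the mutex interface the locking code needs. *)
module type MUTEX = sig
  type t
  val create : unit -> t
  val lock : t -> unit
  val unlock : t -> unit
end

(* No-op implementation: safe only when a single thread runs the code. *)
module Fake_mutex : MUTEX = struct
  type t = unit
  let create () = ()
  let lock () = ()
  let unlock () = ()
end

(* Locking code written once against the signature. *)
module With_lock (M : MUTEX) = struct
  let m = M.create ()

  let protect f =
    M.lock m;
    Fun.protect ~finally:(fun () -> M.unlock m) f
end

module Single = With_lock (Fake_mutex)

let answer = Single.protect (fun () -> 21 * 2)
```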
@vouillon is this waiting on anything? Seems ready to me.
IIUC we are lacking a reviewer.
Just as a data point: I ported this PR to OxCaml and we've been using it internally at Jane Street in production for a little over a month, and things seem to be working pretty well. My one qualm about this PR in its current state is the switching of |
The real str uses DLS AFAIK. I think we should stay as compatible as possible with str. In any case, thanks for testing. I'll merge this PR and @vouillon is always free to adjust anything later.
This PR makes RE thread-safe with OCaml 5.
We want the overhead to be minimal while matching a string, and we want to allow concurrent string matching. RE works by traversing an automaton which is built lazily, so no locking is needed while traversing the automaton, only when it is updated.
For that, we take advantage of the fact that double-checked locking is sound under the OCaml memory model. When we reach a part of the automaton that has not been initialized yet, we acquire a mutex to update it. All the data structures used to build this automaton are protected by this mutex.