gh-144356: Make set iterator __length_hint__ and iternext race-safe under no-gil#144357
gh-144356: Make set iterator __length_hint__ and iternext race-safe under no-gil#144357hyongtao-code wants to merge 21 commits intopython:mainfrom
__length_hint__ and iternext race-safe under no-gil#144357Conversation
setiter_len() was reading so->used without atomic access while concurrent mutations update it atomically under Py_GIL_DISABLED. Use an atomic load for so->used to avoid a data race. This preserves the existing semantics of __length_hint__ while making the access thread-safe. Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
setiter_len() under no-gilset_iterator.__length_hint__ under no-gil
|
Thanks for the review. I’ve decided to address both issues in this PR. I also added a corresponding test case for the issue you pointed out. |
eendebakpt
left a comment
There was a problem hiding this comment.
I left some more review comments. I think we can get this right, but a simpler approach here would be to put a critical section on the set iterator itself.
Thanks for the suggestion. I simplified the implementation by protecting both the iterator and the set with It looks like there is an unrelated flaky test case. |
eendebakpt
left a comment
There was a problem hiding this comment.
@hyongtao-code Thanks for your work on this! The core idea of the PR has shifted a bit, which is why I have some more review comments, but we are getting closer I believe.
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
This reverts commit 78241a8.
set_iterator.__length_hint__ under no-gil__length_hint__ and iternext race-safe under no-gil
| if (key == NULL) { | ||
| si->si_set = NULL; | ||
| /* exhausted */ | ||
| Py_DECREF(so); |
There was a problem hiding this comment.
Can you move the decref to the section
/* exhausted */
si->si_pos = -1;
si->len = 0;
?
There was a problem hiding this comment.
I’d prefer not to move the Py_DECREF into the Py_BEGIN_CRITICAL_SECTION block.
eendebakpt
left a comment
There was a problem hiding this comment.
We went through several iterations, but currently the PR:
- Adds free-threading tests for the behavior of set iterators
- Makes the set iterator safe in the FT build by using locks and not freeing the underlying set
Remaining suggestions are about the style, not about the actual implementation so I am approving.
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Long log:
In free-threaded builds, setiter_len() and setiter_iternext() could race with concurrent set mutation and iterator exhaustion.
We now serialize operations on the iterator itself (and keep the set stable) using a critical section, and ensure the iterator’s exhaustion state is handled consistently.
The non-free-threaded build keeps the existing semantics. Add free-threading-style concurrency tests (similar to the itertools FT tests) to exercise
__length_hint__anditernextexhaustion under concurrent execution.setiter_len()was readingso->usedwithout atomic access while concurrentmutations update it atomically under Py_GIL_DISABLED.
In free-threaded builds,setiter_len()could race with concurrent setmutation and iterator exhaustion.
Use an atomic load forso->usedto avoid a data race. This preserves theexisting semantics of
__length_hint__while making the access thread-safe.Signed-off-by: Yongtao Huang yongtaoh2022@gmail.com