Add HashTable methods related to the raw bucket index
#657
I'm still willing to change if other names are preferred.
I think that `get_bucket` and `get_bucket_mut` are good, but `buckets` should be renamed to `num_buckets` to make it clear that it returns a count and not an iterator.
In terms of `find_bucket_index` and `get_bucket_entry`, I left some comments. But if you wanted to merge the smaller set first, I'm fine with that.
Also, not required at all, but since a lot of code using the `HashTable` API is going to be unsafe anyway, it might be worth adding `get_bucket_unchecked` and `get_bucket_unchecked_mut` as well, which don't return `Option`.
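To make the checked/unchecked distinction concrete, here is a minimal sketch on a toy type. `ToyTable`, `with_buckets`, and `set_bucket` are hypothetical names invented for illustration, and `Vec<Option<T>>` only stands in for the real control-byte layout; this is not how hashbrown implements these methods.

```rust
/// Hypothetical stand-in for a table's bucket array. This is NOT hashbrown's
/// real layout; `Option<T>` plays the role of the "is this bucket full?" check.
pub struct ToyTable<T> {
    slots: Vec<Option<T>>,
}

impl<T> ToyTable<T> {
    /// Creates a table with `n` empty buckets (illustrative helper).
    pub fn with_buckets(n: usize) -> Self {
        Self { slots: (0..n).map(|_| None).collect() }
    }

    /// Number of buckets, full or empty: a count, not an iterator.
    pub fn num_buckets(&self) -> usize {
        self.slots.len()
    }

    /// Fills the bucket at `index`; panics if `index` is out of bounds.
    pub fn set_bucket(&mut self, index: usize, value: T) {
        self.slots[index] = Some(value);
    }

    /// Checked access: `None` if `index` is out of bounds or the bucket is empty.
    pub fn get_bucket(&self, index: usize) -> Option<&T> {
        self.slots.get(index)?.as_ref()
    }

    /// Unchecked access that skips both checks.
    ///
    /// # Safety
    /// `index` must be less than `num_buckets()` and the bucket must be full.
    pub unsafe fn get_bucket_unchecked(&self, index: usize) -> &T {
        // SAFETY: the caller guarantees both conditions above.
        unsafe { self.slots.get_unchecked(index).as_ref().unwrap_unchecked() }
    }
}
```

The checked method folds "out of bounds" and "empty bucket" into a single `None`, which is the cost the unchecked variant lets a caller skip when it already knows the index is valid.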
Does
I personally think
Ooh, I like that! We couldn't support the full
Calling out this additional benefit of the
re: unsafe methods, I guess the main concern is whether the bounds checks can easily be optimised out. While there isn't the same robust codegen testing as the compiler here, maybe it would be worth poking around on godbolt and seeing how
I've added those two unchecked methods.

Using

```diff
+       inc     rdi
+       cmp     r10, rdi
+       jae     .LBB6_11
+       cmp     byte ptr [rcx + r10], 0
+       js      .LBB6_11
        add     rax, -8
        pop     rcx
        .cfi_def_cfa_offset 8
        ret
```

... where
I guess that another benefit to bumping MSRV would be the ability to add Even without an MSRV bump, I guess you could use the nightly feature as a temporary measure.
I assume you mean adding that in

```rust
pub fn find_bucket_index(&self, hash: u64, eq: impl FnMut(&T) -> bool) -> Option<usize> {
    match self.raw.find(hash, eq) {
        Some(bucket) => Some(unsafe {
            let index = self.raw.bucket_index(&bucket);
            core::hint::assert_unchecked(index < self.raw.buckets());
            core::hint::assert_unchecked(self.raw.is_bucket_full(index));
            index
        }),
        None => None,
    }
}
```

It does remove the unwrap in my simple back-to-back test, but it also pessimizes the unchecked case a little, recomputing the final address. Different variations of that assertion had the same effect.

```diff
+       neg     r9
+       lea     rax, [rax + 8*r9]
        add     rax, -8
        pop     rcx
        .cfi_def_cfa_offset 8
        ret
```

In a real scenario I expect you'd also have other interim code, and who knows if the assertion could still be propagated. I don't think we should bother until there's a real measured benefit, and anyway even the unhinted checked case is still much less than a full hash probe that you would have needed before!
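For readers who want to try the hinting idea outside the crate, a self-contained sketch on a plain slice follows. `find_index` and `lookup` are hypothetical helpers, and the sketch assumes a toolchain where `core::hint::assert_unchecked` is stable (Rust 1.81 or later). Whether the bounds check in `lookup` actually disappears depends on inlining and the optimizer, so treat this as something to inspect on godbolt rather than a guarantee.

```rust
use core::hint::assert_unchecked;

/// Hypothetical helper: returns the position of `needle`, with a hint to the
/// optimizer that the returned index is in bounds.
pub fn find_index(items: &[u32], needle: u32) -> Option<usize> {
    let index = items.iter().position(|&x| x == needle)?;
    // SAFETY: `position` only returns indices of existing elements,
    // so `index < items.len()` always holds here.
    unsafe { assert_unchecked(index < items.len()) };
    Some(index)
}

/// Hypothetical caller: a checked index right after the search. If the hint
/// propagates, the compiler may drop the bounds check on `items[index]`.
pub fn lookup(items: &[u32], needle: u32) -> Option<u32> {
    let index = find_index(items, needle)?;
    Some(items[index])
}
```

The hint changes only what the optimizer may assume, never the result, which is why its value has to be judged from generated code or measurements.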
RE MSRV, I recently bumped
`VacantEntry` now stores a `Tag` instead of a full `hash: u64`. This means that `OccupiedEntry` doesn't need to store anything, because it can get that tag from the control byte for `remove -> VacantEntry`. The `get_bucket_entry` method doesn't need a hash argument either. Also, since `OccupiedEntry` is now smaller, `enum Entry` will be the same size as `VacantEntry` by using a niche for the discriminant. (Although this is not _guaranteed_ by the compiler.)
On `HashTable<T, A>`:

On `OccupiedEntry<'_, T, A>`:

Closes #613, although I chose different names.
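The size remark in the description can be explored with a small sketch. The types below (`Tag`, `Occupied`, `Vacant`, `Entry`, `entry_sizes`) are hypothetical simplifications and do not mirror hashbrown's real fields. As the description says, the niche layout is not guaranteed by the compiler, so the sketch reports sizes without asserting that `Entry` and `Vacant` match.

```rust
use std::mem::size_of;

/// Hypothetical 1-byte tag, standing in for the value kept in a control byte.
#[derive(Clone, Copy)]
pub struct Tag(pub u8);

/// Hypothetical occupied entry: only a reference to the slot, no hash or tag.
pub struct Occupied<'a> {
    pub slot: &'a mut u64,
}

/// Hypothetical vacant entry: the slot reference plus the small tag.
pub struct Vacant<'a> {
    pub tag: Tag,
    pub slot: &'a mut u64,
}

/// With a niche available (for example the never-null reference), the
/// compiler may encode the discriminant without growing the enum.
pub enum Entry<'a> {
    Occupied(Occupied<'a>),
    Vacant(Vacant<'a>),
}

/// Returns the sizes of (`Occupied`, `Vacant`, `Entry`) in bytes.
pub fn entry_sizes() -> (usize, usize, usize) {
    (
        size_of::<Occupied<'static>>(),
        size_of::<Vacant<'static>>(),
        size_of::<Entry<'static>>(),
    )
}
```

Printing `entry_sizes()` on a given toolchain shows whether the enum ended up the same size as its largest variant. The one related layout fact the language does guarantee is that `Option<&T>` is the same size as `&T`.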