Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load factor setter for map #4

Open
bmcdonald3 opened this issue Oct 7, 2021 · 0 comments
Open

Load factor setter for map #4

bmcdonald3 opened this issue Oct 7, 2021 · 0 comments

Comments

@bmcdonald3
Copy link
Owner

Summary

Our hash table probing method was switched from using regular quadratic probing to triangular numbers probing. A property of quadratic probing is the table must not exceed 50% capacity to guarantee finding a slot, but triangular numbers probing guarantees finding a slot if there is an open one, regardless of capacity (it checks each slot exactly once). This means that we can give more flexibility to users of map by exposing a maxLoadFactor (to be consistent with other languages) to allow them to decide when is best for their map to grow.

Proposed solutions

Proposal 1

var m = new map(int, int);
m.setMaxLoadCapacity(0.75);

maxLoadCapacity is set through a setter method and takes a fraction that corresponds to the % full that the table can reach before resizing

Proposal 2

var m = new map(int, int);
m.setMaxLoadCapacity(2);

Identical to proposal 1, but the maxLoadCapacity is set by an integer whose reciprocal (1/val) is the max capacity (i.e., passing 2 means your max capacity is 1/2 or 50%)

Proposal 3

var m = new map(int, int, maxLoadCapacity=.75);

maxLoadCapacity is a keyword argument for the map constructor

Comparisons to other languages

C++ unordered_map

  • C++ exposes a max_load_factor() which allows setting how full the map will be before resizing
  • takes a integer number instead of a fraction (e.g., max_load_factor = 2 = 1/2, so map can be 50% full before resize; max_load_factor = 3 = 1/3 so map can be 33% full before resize)

Usage:

std::unordered_map<std::int, std::int> m;
m.max_load_factor(2); // map can now be up to 50% full (numslots/loadfactor)

(thanks for pointing this one out Andrew)

Java Hashtable

  • the Java Hashtable constructor can take an initial capacity and a loadFactor
  • loadFactor in Java is a fraction, rather than an integer (e.g., loadFactor = .75 means table can be 75% full)

Usage:

Hashtable<int, int> m = new Hashtable<int, int>(10, 0.50); // initial capacity 10, table can be up to 50% full

Other languages

  • Rust doesn't have a concept of a loadFactor, but instead allows setting of an initial capacity, which may have the same effect in many cases
  • Python, Julia, etc., do not have a concept of this, just like our map now

Other stuff that you may not be particularly interested in

Motivated by: chapel-lang#18427 and https://github.com/Cray/chapel-private/issues/2536
PR with proposal 1: chapel-lang#18523
C++ load factor: https://en.cppreference.com/w/cpp/container/unordered_map/max_load_factor
Java hashtable: https://docs.oracle.com/javase/8/docs/api/java/util/Hashtable.html
Rust hashmap: https://doc.rust-lang.org/std/collections/struct.HashMap.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant