Skip to content

Feedback: Cuckoo filter #2484

@umangkumarr

Description

@umangkumarr

Inconsistency Between Redis Documentation and Referenced Paper Regarding Cuckoo Filter Load / Fill Rates

Page: https://redis.io/docs/latest/develop/data-types/probabilistic/cuckoo-filter/

I noticed an inconsistency in the documentation for cuckoo filters when comparing the Redis documentation with the research paper linked on the same page.

The linked paper:
Cuckoo Filter: Practically Better Than Bloom
(https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf) states the following:

"With k = 2 hash functions, the load factor α is 50% when the bucket size b = 1 (i.e., the hash table is directly mapped), but increases to 84%, 95% or 98% respectively using bucket size b = 2, 4 or 8."

However, the Redis documentation under “Choosing the Bucket Size”
(https://redis.io/docs/latest/develop/data-types/probabilistic/cuckoo-filter/#choosing-the-bucket-size-bucketsize) states:

"When bucket size of 1 is used the fill rate is 55% and false positive error rate is 2/256 ≈ 0.78% which is the minimal false positive rate you can achieve. Larger buckets increase the error rate linearly but improve the fill rate of the filter. For example, a bucket size of 3 yields a 2.34% error rate and an 80% fill rate. Bucket size of 4 yields a 3.12% error rate and a 95% fill rate."

Question

Could you clarify which values are correct, or whether the Redis implementation deviates intentionally from the results in the paper (e.g., due to differences in assumptions, parameterization, fingerprint size, or implementation-specific adjustments)?

Metadata

Metadata

Assignees

Labels

questionFurther information is requested

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions