You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Changes and enhancements to compression stats, thresholds (facebook#11388)
Summary:
## Option API updates
* Add new CompressionOptions::max_compressed_bytes_per_kb, which corresponds to 1024.0 / min allowable compression ratio. This avoids the hard-coded minimum ratio of 8/7.
* Remove unnecessary constructor for CompressionOptions.
* Document undocumented CompressionOptions. Use idiom for default values shown clearly in one place (not precariously repeated).
## Stat API updates
* Deprecate the BYTES_COMPRESSED, BYTES_DECOMPRESSED histograms. Histograms incur substantial extra space & time costs compared to tickers, and the distribution of uncompressed data block sizes tends to be uninteresting. If we're interested in that distribution, I don't see why it should be limited to blocks stored as compressed.
* Deprecate the NUMBER_BLOCK_NOT_COMPRESSED ticker, because the name is very confusing.
* New or existing tickers relevant to compression:
* BYTES_COMPRESSED_FROM
* BYTES_COMPRESSED_TO
* BYTES_COMPRESSION_BYPASSED
* BYTES_COMPRESSION_REJECTED
* COMPACT_WRITE_BYTES + FLUSH_WRITE_BYTES (both existing)
* NUMBER_BLOCK_COMPRESSED (existing)
* NUMBER_BLOCK_COMPRESSION_BYPASSED
* NUMBER_BLOCK_COMPRESSION_REJECTED
* BYTES_DECOMPRESSED_FROM
* BYTES_DECOMPRESSED_TO
We can compute a number of things with these stats:
* "Successful" compression ratio: BYTES_COMPRESSED_FROM / BYTES_COMPRESSED_TO
* Compression ratio of data on which compression was attempted: (BYTES_COMPRESSED_FROM + BYTES_COMPRESSION_REJECTED) / (BYTES_COMPRESSED_TO + BYTES_COMPRESSION_REJECTED)
* Compression ratio of data that could be eligible for compression: (BYTES_COMPRESSED_FROM + X) / (BYTES_COMPRESSED_TO + X) where X = BYTES_COMPRESSION_REJECTED + NUMBER_BLOCK_COMPRESSION_REJECTED
* Overall SST compression ratio (compression disabled vs. actual): (Y - BYTES_COMPRESSED_TO + BYTES_COMPRESSED_FROM) / Y where Y = COMPACT_WRITE_BYTES + FLUSH_WRITE_BYTES
Keeping _REJECTED separate from _BYPASSED helps us to understand "wasted" CPU time in compression.
## BlockBasedTableBuilder
Various small refactorings, optimizations, and name clean-ups.
Pull Request resolved: facebook#11388
Test Plan:
unit tests added
* `options_settable_test.cc`: use non-deprecated idiom for configuring CompressionOptions from string. The old idiom is tested elsewhere and does not need to be updated to support the new field.
Reviewed By: ajkr
Differential Revision: D45128202
Pulled By: pdillinger
fbshipit-source-id: 5a652bf5c022b7ec340cf79018cccf0686962803
Copy file name to clipboardexpand all lines: HISTORY.md
+2
Original file line number
Diff line number
Diff line change
@@ -3,6 +3,7 @@
3
3
### Public API Changes
4
4
*`SstFileWriter::DeleteRange()` now returns `Status::InvalidArgument` if the range's end key comes before its start key according to the user comparator. Previously the behavior was undefined.
5
5
* Add `multi_get_for_update` to C API.
6
+
* Remove unnecessary constructor for CompressionOptions.
6
7
7
8
### Behavior changes
8
9
* Changed default block cache size from an 8MB to 32MB LRUCache, which increases the default number of cache shards from 16 to 64. This change is intended to minimize cache mutex contention under stress conditions. See https://github.com/facebook/rocksdb/wiki/Block-Cache for more information.
@@ -17,6 +18,7 @@
17
18
### New Features
18
19
* Add experimental `PerfContext` counters `iter_{next|prev|seek}_count` for db iterator, each counting the times of corresponding API being called.
19
20
* Allow runtime changes to whether `WriteBufferManager` allows stall or not by calling `SetAllowStall()`
21
+
* Added statistics tickers BYTES_COMPRESSED_FROM, BYTES_COMPRESSED_TO, BYTES_COMPRESSION_BYPASSED, BYTES_COMPRESSION_REJECTED, NUMBER_BLOCK_COMPRESSION_BYPASSED, and NUMBER_BLOCK_COMPRESSION_REJECTED. Disabled/deprecated histograms BYTES_COMPRESSED and BYTES_DECOMPRESSED, and ticker NUMBER_BLOCK_NOT_COMPRESSED. The new tickers offer more inight into compression ratios, rejected vs. disabled compression, etc. (#11388)
20
22
* New statistics `rocksdb.file.read.{flush|compaction}.micros` that measure read time of block-based SST tables or blob files during flush or compaction.
0 commit comments