SSTable max local deletion time allowing for missed deletion? #5

@rdzimmer-zz

Hi,

I've been testing TWCS with KairosDB. My KairosDB TTL for data is 15 days. Here is the schema (note the 'timestamp_resolution': 'MILLISECONDS'):

CREATE TABLE metricdb.data_points (
    key blob,
    column1 blob,
    value blob,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1440', 'compaction_window_unit': 'MINUTES', 'max_threshold': '32', 'min_threshold': '4', 'timestamp_resolution': 'MILLISECONDS'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 1.0
    AND speculative_retry = 'NONE';

TWCS is working and creating daily SSTables, but the 2 compactors are usually very busy throughout the day. I believe I need to allocate more than the default of 2 concurrent_compactors given my system size and load. I have spare disk I/O and CPU capacity, so adding more should be okay. Unfortunately, I now have old SSTables that have expired but have not been deleted. Instead of 15 days of daily SSTables I have 25 and growing. I didn't have any issues when testing with smaller loads, which is why I figured I needed more concurrent_compactors.

My issue is that when I stopped my incoming data, I expected the compactors to free up and clean up the old expired SSTables. However, the compactors have finished and the expired SSTables are still there. Looking at one of the SSTables I see this:

date +%s
1490040578
/cassandra/tools/bin/sstablemetadata mc-50096-big-Data.db 
Minimum timestamp: 1487721574298
Maximum timestamp: 1487807978299
SSTable min local deletion time: 1489017575
SSTable max local deletion time: 1489103978
TTL min: 1296000
TTL max: 1296000
EncodingStats minTTL: 1296000
EncodingStats minLocalDeletionTime: 1489017575
EncodingStats minTimestamp: 1487721574298
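
The numbers above can be cross-checked with a short Python sketch. It only does arithmetic on the values printed by `date +%s` and `sstablemetadata`, plus `gc_grace_seconds` from the schema. The `gc_before` rule in it (an SSTable becomes droppable once its max local deletion time is older than now minus `gc_grace_seconds`) is an assumption about how Cassandra decides this, not something the output confirms, and it ignores any check for overlapping SSTables. The helper name `expiry_seconds` is made up for illustration.

```python
# Values copied from the `date +%s` and sstablemetadata output above.
NOW = 1490040578                   # seconds since epoch
MIN_TIMESTAMP_MS = 1487721574298   # write timestamps, in milliseconds
MAX_TIMESTAMP_MS = 1487807978299
MIN_LOCAL_DELETION = 1489017575    # seconds since epoch
MAX_LOCAL_DELETION = 1489103978
TTL = 1296000                      # seconds (15 days)
GC_GRACE = 864000                  # gc_grace_seconds from the schema (10 days)


def expiry_seconds(write_timestamp_ms, ttl_seconds):
    """Hypothetical helper: expiry = write time (in seconds) + TTL."""
    return write_timestamp_ms // 1000 + ttl_seconds


oldest_expiry = expiry_seconds(MIN_TIMESTAMP_MS, TTL)
newest_expiry = expiry_seconds(MAX_TIMESTAMP_MS, TTL)

# ASSUMPTION: data is purgeable only once its local deletion time is older
# than (now - gc_grace_seconds). Overlap with other SSTables is not modelled.
gc_before = NOW - GC_GRACE
past_gc_grace = MAX_LOCAL_DELETION < gc_before

print("oldest expiry vs min local deletion time:", oldest_expiry, MIN_LOCAL_DELETION)
print("newest expiry vs max local deletion time:", newest_expiry, MAX_LOCAL_DELETION)
print("seconds since newest cell expired:", NOW - MAX_LOCAL_DELETION)
print("past gc_grace_seconds as well:", past_gc_grace)
```

The reported min and max local deletion times line up (to within a second) with the min and max write timestamps plus the 15 day TTL, which suggests they describe when the oldest and newest cells expire rather than a window during which the file must be deleted. At the time of the `date` call the newest cell had been expired for 936600 seconds (about 10.8 days), which is also past the 10 day `gc_grace_seconds` under the assumption stated above.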

I'm wondering if there is a reason for recording a "max local deletion time". If it means what I think it does, my old SSTables have expired but will not be deleted, since they missed the min/max local deletion time period. A quick Google search for "SSTable min local deletion time" showed it is frequently set to 2147483647. Please let me know if there is any other information I can provide. Sorry if I'm misunderstanding those values or have misconfigured TWCS/KairosDB. Thanks in advance!