Commit c2d650a

Improve clickhouse table creation
1 parent 828edbd commit c2d650a

1 file changed: +37 -19 lines changed

README.md

Lines changed: 37 additions & 19 deletions
@@ -32,41 +32,59 @@ make
 
 ```sql
 CREATE TABLE graphite (
-  Path String,
-  Value Float64,
-  Time UInt32,
-  Date Date,
-  Timestamp UInt32
+  Path String CODEC(ZSTD(3)), -- better compression
+  Value Float64 CODEC(Gorilla, LZ4), -- better codec for Floats
+  Time UInt32 CODEC(DoubleDelta, LZ4), -- will be almost always 0
+  Date Date CODEC(DoubleDelta, LZ4), -- will be almost always 0
+  Timestamp UInt32 CODEC(DoubleDelta, LZ4) TTL Date + INTERVAL 1 MONTH -- will be almost always 0, good to go in 1 month
 ) ENGINE = GraphiteMergeTree('graphite_rollup')
-PARTITION BY toYYYYMM(Date)
+PARTITION BY toYearWeek(Date)
 ORDER BY (Path, Time);
 
--- optional table for faster metric search
 CREATE TABLE graphite_index (
-  Date Date,
-  Level UInt32,
-  Path String,
-  Version UInt32
+  Date Date CODEC(DoubleDelta, LZ4), -- will be almost always 0
+  Level UInt32 CODEC(DoubleDelta, LZ4), -- will be almost always 0
+  Path String CODEC(ZSTD(3)), -- better compression
+  Version UInt32 TTL toDateTime(Version) + INTERVAL 2 DAY -- is necessary only for the current day
 ) ENGINE = ReplacingMergeTree(Version)
-PARTITION BY toYYYYMM(Date)
+PARTITION BY toYYYYMMDD(Date)
 ORDER BY (Level, Path, Date);
 
--- optional table for storing Graphite tags
 CREATE TABLE graphite_tagged (
-  Date Date,
-  Tag1 String,
-  Path String,
-  Tags Array(String),
-  Version UInt32
+  Date Date CODEC(DoubleDelta, LZ4), -- will be almost always 0
+  Tag1 String CODEC(ZSTD(3)), -- better compression
+  Path String CODEC(ZSTD(3)), -- better compression
+  Tags Array(String) CODEC(ZSTD(3)), -- better compression
+  Version UInt32 TTL toDateTime(Version) + INTERVAL 2 DAY -- is necessary only for the current day
 ) ENGINE = ReplacingMergeTree(Version)
-PARTITION BY toYYYYMM(Date)
+PARTITION BY toYYYYMMDD(Date)
 ORDER BY (Tag1, Path, Date);
 ```
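
Review note: to check whether the new codecs pay off in practice, one could compare compressed and uncompressed column sizes once some data has been written. A minimal sketch, assuming the table lives in the current database and that the ClickHouse version in use exposes `compression_codec` in `system.columns`:

```sql
-- Per-column codec and compression ratio for the graphite table.
-- Assumption: the table is in the current database; adjust the filter otherwise.
SELECT
    name,
    compression_codec,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 2) AS ratio
FROM system.columns
WHERE database = currentDatabase() AND table = 'graphite'
ORDER BY data_compressed_bytes DESC;
```

Running the same query against `graphite_index` and `graphite_tagged` (by changing the table name) would show whether `ZSTD(3)` on `Path` is worth its CPU cost there.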
 
 [GraphiteMergeTree documentation](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/graphitemergetree/)
 
 You can create Replicated tables. See [ClickHouse documentation](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replication/)
 
+3. Always use [graphite-ch-optimizer](https://github.com/innogames/graphite-ch-optimizer) together with carbon-clickhouse and [graphite-clickhouse](https://github.com/go-graphite/graphite-clickhouse). Without it, the rules from the `graphite_rollup` configuration are not applied automatically.
+
+### Fine tuning the `PARTITION BY` for graphite data table
+
+Partitioning by `toYearWeek(Date)` is a rule of thumb. While `graphite-ch-optimizer` is running, it launches `OPTIMIZE TABLE graphite PARTITION ID 'YYYYWW' FINAL` once per configured interval. If the partition is too big, it processes it several times.
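
Review note: the statement the optimizer issues can also be run by hand to see how long a single partition takes. A sketch, assuming the `graphite` table defined above in the current database; the partition ID `'202026'` is a made-up example and should be replaced with one returned by the first query:

```sql
-- List active partitions of the graphite table with part counts and sizes.
SELECT
    partition_id,
    count() AS part_count,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS total_size
FROM system.parts
WHERE database = currentDatabase() AND table = 'graphite' AND active
GROUP BY partition_id
ORDER BY partition_id;

-- Force a full merge of one partition (hypothetical ID, see above).
OPTIMIZE TABLE graphite PARTITION ID '202026' FINAL;
```

A partition whose `part_count` stays high, or whose `OPTIMIZE` takes much longer than the optimizer's interval, is a candidate for a smaller partition key.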
+
+If a partition contains too much data and optimization takes too long, consider reducing the partition size, e.g. by using `toYYYYMMDD(toStartOfInterval(Date, toIntervalDay(3)))`.
+
+Here is a ClickHouse query for experimenting with `toStartOfInterval`:
+
+```sql
+SELECT
+    toDate(number) AS Date,
+    toYYYYMMDD(Date) AS `YMD`,
+    toYearWeek(Date) AS YW,
+    toYYYYMMDD(toStartOfInterval(Date, toIntervalDay(3))) AS `3YMD`
+FROM system.numbers
+LIMIT 19900, 50
+```
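
Review note: before switching the partition key, it may help to see how existing rows would spread across the candidate partitions. A sketch, assuming the `graphite` table above already holds data; the 30-day window is an arbitrary choice to keep the scan small:

```sql
-- Row counts per candidate 3-day partition over the last 30 days.
SELECT
    toYYYYMMDD(toStartOfInterval(Date, toIntervalDay(3))) AS candidate_partition,
    count() AS row_count
FROM graphite
WHERE Date >= today() - 30
GROUP BY candidate_partition
ORDER BY candidate_partition;
```

ClickHouse does not allow changing the partition key of an existing MergeTree table, so applying a different expression means creating a new table and moving the data into it.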
+
 ## Configuration
 ```
 $ carbon-clickhouse -help