Commit 37563dd

Merge pull request #160 from axoflow/4.17-prep
4.17 prep
2 parents 473a6bc + 9a78bf5 commit 37563dd

10 files changed

+150
-13
lines changed


config/_default/config.toml

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -72,7 +72,7 @@ description = "Documentation for AxoSyslog, the scalable security data processor
7272
# The version number for the version of the docs represented in this doc set.
7373
# Used in the "version-banner" partial to display a version number for the
7474
# current doc set.
75-
version = "4.16.0"
75+
version = "4.17.0"
7676
version_menu_canonicallinks = true
7777

7878
# A link to latest version of the docs. Used in the "version-banner" partial to
@@ -172,9 +172,9 @@ description = "Documentation for AxoSyslog, the scalable security data processor
172172
[params.product]
173173
name = "AxoSyslog"
174174
abbrev = "AxoSyslog"
175-
version = "4.16"
176-
techversion = "4.16.0"
177-
configversion = "4.16"
175+
version = "4.17"
176+
techversion = "4.17.0"
177+
configversion = "4.17"
178178
syslog-ng = "syslog-ng"
179179
selinux = "SELinux"
180180
apparmor = "AppArmor"

content/chapter-destinations/clickhouse/_index.md

Lines changed: 24 additions & 2 deletions
@@ -85,6 +85,28 @@ This destination has the following options:
8585

8686
{{< include-headless "chunk/option-destination-hook.md" >}}
8787

88+
## json-var()
89+
90+
| | |
91+
| -------- | ------------ |
92+
| Type: | string |
93+
| Default: | empty string |
94+
95+
Available in {{< product >}} 4.17 and later.
96+
97+
*Description:* The `json-var()` option accepts either a JSON template or a variable containing a JSON string, and sends it to the ClickHouse server in Protobuf/JSON mixed mode ([`JSONEachRow` format](https://clickhouse.com/docs/interfaces/formats/JSONEachRow)). In this mode, type validation is performed by the ClickHouse server itself, so no Protobuf schema is required for communication. For example:
98+
99+
```shell
100+
destination {
101+
clickhouse (
102+
...
103+
json-var(json("{\"ingest_time\":1755248921000000000, \"body\": \"test template\"}"))
104+
);
105+
};
106+
```
107+
108+
Using `json-var()` is mutually exclusive with the [`proto-var()`](#proto-var), [`server-side-schema()`](#server-side-schema), [`schema()`](#schema), and [`protobuf-schema()`](#protobuf-schema) options.
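To illustrate the `JSONEachRow` payload shape that `json-var()` targets, here is a minimal Python sketch: each row is one JSON object on its own line, keyed by column name. The column names `ingest_time` and `body` come from the example above; the helper name `to_json_each_row` and the second row are hypothetical. This shows only the wire format, not how AxoSyslog serializes messages.

```python
import json

def to_json_each_row(rows):
    """Serialize an iterable of dicts as JSONEachRow: one JSON object per line."""
    return "\n".join(json.dumps(row, separators=(",", ":")) for row in rows)

rows = [
    {"ingest_time": 1755248921000000000, "body": "test template"},
    # Hypothetical second row, added only to show the line-per-row layout.
    {"ingest_time": 1755248922000000000, "body": "second message"},
]

payload = to_json_each_row(rows)
print(payload)
```

Because the server validates column types against the table definition, a row with a missing or mistyped field is a server-side concern in this mode rather than a schema mismatch on the sender.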
109+
88110
{{< include-headless "chunk/option-destination-grpc-keep-alive.md" >}}
89111
90112
{{% include-headless "chunk/option-destination-local-timezone.md" %}}
@@ -130,7 +152,7 @@ message CustomRecord {
130152
}
131153
```
132154
133-
Alternatively, you can set the schema with the [`schema()`](#schema) option, or use [proto-var()](#proto-var) to assign an already formatted object to the message.
155+
Alternatively, you can set the schema with the [`schema()`](#schema) option, use [`proto-var()`](#proto-var) to assign an already formatted object to the message, or use a JSON template with the [`json-var()`](#json-var) option.
134156
135157
{{< include-headless "chunk/option-destination-proto-var.md" >}}
136158
@@ -156,7 +178,7 @@ schema(
156178
)
157179
```
158180
159-
Alternatively, you can set the schema with the [`protobuf-schema()`](#protobuf-schema) option, or use [proto-var()](#proto-var) to assign an already formatted object to the message.
181+
Alternatively, you can set the schema with the [`protobuf-schema()`](#protobuf-schema) option, use [`proto-var()`](#proto-var) to assign an already formatted object to the message, or use a JSON template with the [`json-var()`](#json-var) option.
160182
161183
You can find the available column types in the [official ClickHouse documentation](https://clickhouse.com/docs/en/sql-reference/data-types).
162184

content/chapter-nonsequential-processing/_index.md

Lines changed: 6 additions & 4 deletions
@@ -10,7 +10,7 @@ By default, {{% param "product.abbrev" %}} processes log messages arriving from
1010

1111
Sequential processing performs well if you have relatively many parallel connections, in which case it uses all the available CPU cores. However, if a small number of connections deliver a large number of messages, this behavior becomes a bottleneck.
1212

13-
Starting with {{% param "product.abbrev" %}} version 4.3, {{% param "product.abbrev" %}} can split a stream of incoming messages into a set of partitions, which can be processed by multiple threads in parallel. Depending on how you partition the stream, you might lose the message ordering, but can scale the incoming load to all CPUs in the system, even if the entire load is coming from a single, chatty sender.
13+
Starting with {{% param "product.abbrev" %}} version 4.3, {{% param "product.abbrev" %}} can distribute a stream of incoming messages among a set of workers, so that multiple threads process the stream in parallel. Depending on how you distribute the stream, you might lose the message ordering, but you can scale the incoming load to all CPUs in the system, even if the entire load is coming from a single, chatty sender.
1414

1515
To enable this mode of execution, use the `parallelize()` element in your log path.
1616

@@ -24,7 +24,7 @@ log {
2424
log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
2525
);
2626
};
27-
parallelize(partitions(4));
27+
parallelize(workers(4));
2828

2929
# from this part on, messages are processed in parallel even if
3030
# messages are originally coming from a single connection
@@ -34,7 +34,7 @@ log {
3434
};
3535
```
3636
37-
`parallelize()` uses round-robin to allocate messages to partitions by default, but you can retain ordering for a subset of messages with the `partition-key()` option. The `partition-key()` option specifies a template: messages that expand the template to the same value are mapped to the same partition. For example, you can partition messages based on their sender host:
37+
`parallelize()` uses round-robin to allocate messages to workers (called partitions in versions 4.3 through 4.16) by default, but you can retain ordering for a subset of messages with the `worker-partition-key()` option. The `worker-partition-key()` option specifies a template: messages for which the template expands to the same value are mapped to the same worker. For example, you can partition messages based on their sender host:
3838
3939
```shell
4040
log {
@@ -44,7 +44,7 @@ log {
4444
log-iw-size(10M) max-connections(10) log-fetch-limit(100000)
4545
);
4646
};
47-
parallelize(partitions(4) partition-key("$HOST"));
47+
parallelize(workers(4) worker-partition-key("$HOST"));
4848

4949
# from this part on, messages are processed in parallel if their
5050
# $HOST value differs. Messages with the same $HOST will be mapped
@@ -55,3 +55,5 @@ log {
5555
destination { ... };
5656
};
5757
```
58+
59+
Starting with {{< product >}} version 4.17, you can use the `batch-size()` option to specify how many consecutive messages should be processed by a single `parallelize()` worker. Messages within a batch preserve their order on the destination side, and batching also improves `parallelize()` performance. A value around 100 is recommended for `batch-size()`. Default value: `0` (batching is disabled).
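The three distribution modes described in this file (plain round-robin, round-robin in batches, and key-based mapping) can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, and the CRC32 hash is an assumption chosen for determinism, not the hash AxoSyslog uses.

```python
import zlib

def assign_round_robin(message_count, workers, batch_size=0):
    """Return a worker index per message.

    batch_size=0 models plain round-robin (one message per turn);
    batch_size=N hands N consecutive messages to the same worker.
    """
    step = batch_size if batch_size > 0 else 1
    return [(i // step) % workers for i in range(message_count)]

def assign_by_key(keys, workers):
    """Map equal key values (for example, expanded $HOST) to the same worker."""
    # CRC32 is an assumption for this sketch; any stable hash would do.
    return [zlib.crc32(key.encode("utf-8")) % workers for key in keys]

print(assign_round_robin(8, workers=4))                # [0, 1, 2, 3, 0, 1, 2, 3]
print(assign_round_robin(8, workers=2, batch_size=2))  # [0, 0, 1, 1, 0, 0, 1, 1]
print(assign_by_key(["host-a", "host-b", "host-a"], workers=4))
```

In the batched case, consecutive messages stay on one worker, which is why their relative order survives; in the key-based case, all messages sharing a key are ordered relative to each other regardless of how many workers exist.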

content/filterx/_index.md

Lines changed: 36 additions & 2 deletions
@@ -326,7 +326,40 @@ js = json({
326326
});
327327
```
328328

329-
To create a field only if the assigned value is non-null, see [Create dict element if non-null (`:??`)]({{< relref "/filterx/operator-reference.md#create-non-null" >}}).
329+
When working with dicts, note the following points:
330+
331+
- To create a field only if the assigned value is non-null, see [Create dict element if non-null (`:??`)]({{< relref "/filterx/operator-reference.md#create-non-null" >}}).
332+
- To assign a value to a key that doesn't exist yet, when every other element of the path already exists, you can use a simple value assignment, for example:
333+
334+
```shell
335+
js = json({
336+
"key1": "one",
337+
"key2": "two"
338+
});
339+
340+
js.key3 = "three"
341+
```
342+
343+
However, if you want to assign a value where multiple elements of the path don't exist, use the [`dpath`]({{< relref "/filterx/function-reference.md#dpath" >}}) FilterX function, for example:
344+
345+
```shell
346+
dpath(js.key4.key41.key412) = "nested value"
347+
```
348+
349+
The value of the dictionary will be:
350+
351+
```shell
352+
js = json({
353+
"key1": "one",
354+
"key2": "two",
355+
"key3": "three",
356+
"key4": {
357+
"key41": {
358+
"key412": "nested value"
359+
}
360+
}
361+
});
362+
```
330363
331364
Within a FilterX block, you can access the fields of complex data types by using indexes and the dot notation, for example:
332365
@@ -378,8 +411,9 @@ For details, see {{% xref "/filterx/operator-reference.md" %}}.
378411
FilterX has the following built-in functions.
379412
380413
- [`cache_json_file`]({{< relref "/filterx/function-reference.md#cache-json-file" >}}): Loads an external JSON file to lookup contextual information.
381-
- [`endswith`]({{< relref "/filterx/filterx-string-search/_index.md" >}}): Checks if a string ends with the specified value.
382414
- [`dedup_metrics_labels`]({{< relref "/filterx/filterx-metrics/_index.md#metrics-labels" >}}): Deduplicate `metrics_labels` objects.
415+
- [`dpath`]({{< relref "/filterx/function-reference.md#dpath" >}}): Creates a nested path in a dictionary.
416+
- [`endswith`]({{< relref "/filterx/filterx-string-search/_index.md" >}}): Checks if a string ends with the specified value.
383417
- [`flatten`]({{< relref "/filterx/function-reference.md#flatten" >}}): Flattens the nested elements of an object.
384418
- [`format_cef`]({{< relref "/filterx/filterx-format-data/format-cef" >}}): Formats a dictionary into Common Event Format (CEF).
385419
- [`format_csv`]({{< relref "/filterx/filterx-format-data/format-csv.md" >}}): Formats a dictionary or a list into a comma-separated string.

content/filterx/filterx-parsing/key-value-parser/kv-parser-options/_index.md

Lines changed: 22 additions & 0 deletions
@@ -16,6 +16,24 @@ For example, to parse `key1=value1;key2=value2` pairs, use:
1616
${MESSAGE} = parse_kv("key1=value1;key2=value2", pair_separator=";");
1717
```
1818

19+
## stray_words_append_to_value {#stray-words-append}
20+
21+
Available in {{% param "product.abbrev" %}} 4.17 and later.
22+
23+
If the `stray_words_append_to_value` flag is set, any stray words between the value pairs are appended to the preceding value. For example:
24+
25+
```shell
26+
# input: a=b b=c d e f=g
27+
filterx {
28+
${MESSAGE} = parse_kv(${MESSAGE}, value_separator="=", pair_separator=" ", stray_words_append_to_value=true);
29+
};
30+
# The value of ${MESSAGE} will be: {"a":"b","b":"c d e","f":"g"}
31+
```
32+
33+
If you want to collect the stray words into a separate key, see [`stray_words_key`](#stray-words-key).
34+
35+
{{< include-headless "wnt/note-parse-kv-stray-values.md" >}}
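The behavior shown in the `stray_words_append_to_value` example above can be modeled with a short Python sketch. The function name `parse_kv_append_stray` is hypothetical, and the handling of stray words that appear before the first pair (dropped here) is an assumption, since the text only describes words between pairs.

```python
def parse_kv_append_stray(text, value_separator="=", pair_separator=" "):
    """Parse key-value pairs, appending stray words to the preceding value."""
    result = {}
    last_key = None
    for token in text.split(pair_separator):
        if not token:
            continue  # skip empty tokens produced by repeated separators
        if value_separator in token:
            key, value = token.split(value_separator, 1)
            result[key] = value
            last_key = key
        elif last_key is not None:
            # A stray word: glue it onto the value of the preceding pair.
            result[last_key] += pair_separator + token
        # Assumption: stray words before the first pair are dropped.
    return result

print(parse_kv_append_stray("a=b b=c d e f=g"))
# {'a': 'b', 'b': 'c d e', 'f': 'g'}
```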
36+
1937
## stray_words_key {#stray-words-key}
2038

2139
Specifies the key where {{% param "product.abbrev" %}} stores any stray words that appear before or between the parsed key-value pairs. If multiple stray words appear in a message, then {{% param "product.abbrev" %}} stores them as a comma-separated list. Default value: `N/A`
@@ -35,6 +53,10 @@ ${PARSED_MESSAGE} = parse_kv(${MESSAGE}, stray_words_key="stray_words");
3553

3654
The value of `${PARSED_MESSAGE}.stray_words` for this message will be: `["interzone-emtn_s1_vpn-enodeb_om", "inbound"]`
3755

56+
If you want to append the stray words to the respective values instead of collecting them under a separate key, see [`stray_words_append_to_value`](#stray-words-append).
57+
58+
{{< include-headless "wnt/note-parse-kv-stray-values.md" >}}
59+
3860
## value_separator
3961

4062
Specifies the character that separates the keys from the values. Default value: `=`.

content/filterx/function-reference.md

Lines changed: 31 additions & 0 deletions
@@ -71,6 +71,37 @@ Usually, you use the [strptime](#strptime) FilterX function to create datetime v
7171

7272
Deduplicate `metrics_labels` objects. For details, see {{% xref "/filterx/filterx-metrics/_index.md#metrics-labels" %}}.
7373

74+
## dpath
75+
76+
Available in {{< product >}} 4.17 and later.
77+
78+
Assigns a value to a dictionary and creates any elements of the path that don't exist. For example:
79+
80+
```shell
81+
js = json({
82+
"key1": "one",
83+
"key2": "two",
84+
"key3": "three"
85+
});
86+
87+
dpath(js.key4.key41.key412) = "nested value"
88+
```
89+
90+
The value of the dictionary will be:
91+
92+
```shell
93+
js = json({
94+
"key1": "one",
95+
"key2": "two",
96+
"key3": "three",
97+
"key4": {
98+
"key41": {
99+
"key412": "nested value"
100+
}
101+
}
102+
});
103+
```
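As a rough analogy for what `dpath` does, the following Python sketch walks a path through nested dictionaries and creates every missing intermediate level before assigning the value. The helper name `dpath_assign` is hypothetical, and the sketch assumes that every existing intermediate element is itself a dictionary; how FilterX handles other cases is not covered here.

```python
def dpath_assign(root, path, value):
    """Assign value at a nested path, creating missing intermediate dicts."""
    node = root
    for key in path[:-1]:
        # setdefault returns the existing child or inserts an empty dict.
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return root

js = {"key1": "one", "key2": "two", "key3": "three"}
dpath_assign(js, ["key4", "key41", "key412"], "nested value")
print(js)
```

A plain assignment would fail at the first missing level, which is the case `dpath` is meant to cover.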
104+
74105
## endswith
75106

76107
Available in {{< product >}} 4.9 and later.

content/filterx/operator-reference.md

Lines changed: 12 additions & 1 deletion
@@ -196,7 +196,7 @@ Is there a workaround for wildcards/globbing? /chapter-routing-filters/filters/r
196196
197197
Available in {{< product >}} 4.15 and later.
198198
199-
You can slice strings at the specified index using the `..` operator to get a section of the string. Indexing starts at 0, and must be non-negative. You can omit the index to refer to the beginning or the end of the string. For example:
199+
You can slice strings at the specified index using the `..` operator to get a section of the string. Indexing starts at 0. You can omit the index to refer to the beginning or the end of the string. For example:
200200
201201
```shell
202202
filterx {
@@ -213,6 +213,17 @@ filterx {
213213
};
214214
```
215215
216+
Starting with {{< product >}} version 4.17, you can use negative indexes to refer to characters from the end of the string, for example:
217+
218+
```shell
219+
filterx {
220+
str = "example";
221+
str[..-2] == "examp";
222+
str[-3..] == "ple";
223+
str[2..-2] == "amp";
224+
};
225+
```
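The three negative-index examples above produce the same results as Python slice notation, which may be a helpful mental model. This is only an analogy based on those three examples; whether FilterX matches Python in edge cases (such as out-of-range indexes) is not stated in the source.

```python
text = "example"

# FilterX str[..-2]  corresponds to Python text[:-2]
drop_last_two = text[:-2]
# FilterX str[-3..]  corresponds to Python text[-3:]
last_three = text[-3:]
# FilterX str[2..-2] corresponds to Python text[2:-2]
middle = text[2:-2]

print(drop_last_two, last_three, middle)  # examp ple amp
```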
226+
216227
## Ternary conditional operator
217228
218229
The [ternary conditional operator](https://en.wikipedia.org/wiki/Ternary_conditional_operator) evaluates an expression and returns the first argument if the expression is true, and the second argument if it's false.

content/headless/axosyslog-intro.md

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
11
---
22
---
3+
<!-- This file is under the copyright of Axoflow, and licensed under Apache License 2.0, except for using the Axoflow and AxoSyslog trademarks. -->
34
{{< include-headless "tagline.md" >}}
45
{{< product >}} is a drop-in replacement for `syslog-ng`, created by the original creators of `syslog-ng`. (It started as a fork, branched after syslog-ng&trade; v4.7.1).
56

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
---
3+
<!-- This file is under the copyright of Axoflow, and licensed under Apache License 2.0, except for using the Axoflow and AxoSyslog trademarks. -->
4+
{{% alert title="Note" color="info" %}}
5+
You cannot use `stray_words_append_to_value` and `stray_words_key` in the same parser.
6+
{{% /alert %}}

content/whats-new/_index.md

Lines changed: 8 additions & 0 deletions
@@ -6,6 +6,14 @@ weight: 10
66

77
This page is a changelog that collects the major changes and additions to this documentation. (If you want to know the details about why we have separate documentation for AxoSyslog and how it relates to the `syslog-ng` documentation, read our [syslog-ng documentation and similarities with AxoSyslog Core](https://axoflow.com/blog/axosyslog-core-documentation-syslog-ng) blog post.)
88

9+
## Version 4.17 (2025-09-04)
10+
11+
- The `parse_kv` FilterX function has an option ({{% xref "/filterx/filterx-parsing/key-value-parser/kv-parser-options/_index.md#stray-words-append" %}}) to append stray words to the preceding value.
12+
- You can now use negative indexes when [slicing FilterX strings]({{< relref "/filterx/operator-reference.md#slicing" >}}).
13+
- The [`dpath`]({{< relref "/filterx/function-reference.md#dpath" >}}) FilterX function assigns a value to a dictionary and creates any elements of the path that don't exist.
14+
- When using `parallelize()` during {{% xref "/chapter-nonsequential-processing/_index.md" %}}, you can now set the `batch-size()` option to specify how many consecutive messages should be processed by a single `parallelize()` worker.
15+
- For the `clickhouse()` destination, you can now use the [`json-var()` option]({{< relref "/chapter-destinations/clickhouse/_index.md#json-var" >}}) to send the message to the ClickHouse server in Protobuf/JSON mixed mode ([`JSONEachRow` format](https://clickhouse.com/docs/interfaces/formats/JSONEachRow)). In this mode, type validation is performed by the ClickHouse server itself, so no Protobuf schema is required for communication.
16+
917
## Version 4.16 (2025-08-15)
1018

1119
- New [`${PROTO_NAME}` macro]({{< relref "/chapter-manipulating-messages/customizing-message-format/reference-macros/_index.md#proto-name" >}}).
