Commit 0c0684f

adding file source page (#11355) (#11486)
1 parent e805050 commit 0c0684f

File tree

1 file changed: +92 -0 lines changed
  • _data-prepper/pipelines/configuration/sources

---
layout: default
title: File
parent: Sources
grand_parent: Pipelines
nav_order: 24
---

# File source

The `file` plugin reads events from a local file once when the pipeline starts. It's useful for loading seed data, testing processors and sinks, or replaying a fixed dataset. This source *does not monitor* the file for new lines after startup.

Option | Required | Type | Description
:--- | :--- | :--- | :---
`path` | Yes | String | An absolute path to the input file inside the Data Prepper container, for example, `/usr/share/data-prepper/data/input.jsonl`.
`format` | No | String | Specifies how to interpret the file content. Valid values are `json` and `plain`. Use `json` when your file has one JSON object per line or a JSON array. Use `plain` for raw text lines. Default is `plain`.
`record_type` | No | String | The type of output record produced by the source. Valid values are `event` and `string`. Use `event` to produce structured events expected by downstream processors and the OpenSearch sink. Default is `string`.
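
Conceptually, `format` controls how each input line becomes an event. The following Python sketch is a mental model of the two modes, not Data Prepper's actual implementation; it assumes plain lines land in a `message` field, which is what the grok example later on this page matches against.

```python
import json

# Illustrative sketch of how `format` shapes the event built from one line.
# This is a mental model only, not Data Prepper's actual code.
def to_event(line, fmt="plain"):
    if fmt == "json":
        # `json`: the line itself is parsed as a structured event.
        return json.loads(line)
    # `plain`: the raw line is wrapped in a "message" field.
    return {"message": line}

print(to_event('{"level": "INFO"}', fmt="json"))  # {'level': 'INFO'}
print(to_event("raw log line"))                   # {'message': 'raw log line'}
```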

## Examples

The following examples demonstrate how different file types can be processed.

### JSON file

The following example processes a JSON file:

```yaml
file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/input.ndjson
      format: json
      record_type: event
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: file-demo
        username: admin
        password: admin_pass
        insecure: true
```
{% include copy.html %}
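
Before starting the pipeline, it can help to confirm that the input file really contains one JSON object per line, as `format: json` expects. The following is a small validation sketch; the sample lines are illustrative, not part of the pipeline above.

```python
import json

def validate_ndjson(lines):
    """Parse each non-empty line as JSON, failing fast on malformed input."""
    events = []
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines rather than failing
        try:
            events.append(json.loads(raw))
        except json.JSONDecodeError as err:
            raise ValueError(f"line {lineno} is not valid JSON: {err}") from err
    return events

# Hypothetical sample input, one JSON object per line (NDJSON).
sample = [
    '{"time": "2024-05-01T12:00:00Z", "level": "INFO", "message": "service started"}',
    '{"time": "2024-05-01T12:00:01Z", "level": "ERROR", "message": "connection refused"}',
]
print(len(validate_ndjson(sample)))  # 2
```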

### Plain text file

A raw text file can be processed using the following pipeline:

```yaml
plain-file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/app.log
      format: plain
      record_type: event
  processor:
    - grok:
        match:
          message:
            - '%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{GREEDYDATA:msg}'
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: plain-file-demo
        username: admin
        password: admin_pass
        insecure: true
```
{% include copy.html %}
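
To sanity-check that your log lines will match before running the pipeline, here is a rough Python equivalent of the grok pattern above. The regexes only approximate grok's built-in `TIMESTAMP_ISO8601` and `LOGLEVEL` patterns, and the sample line is hypothetical.

```python
import re

# Rough approximation of:
# %{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{GREEDYDATA:msg}
LOG_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?)"
    r" \[(?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\] "
    r"(?P<msg>.*)"
)

def parse_line(line):
    """Return the named capture groups as a dict, or None if no match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

parsed = parse_line("2024-05-01T12:00:00 [ERROR] connection refused")
print(parsed["level"], parsed["msg"])  # ERROR connection refused
```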

### CSV file

You can process a CSV file using the `csv` processor:

```yaml
csv-file-to-opensearch:
  source:
    file:
      path: /usr/share/data-prepper/data/ingest.csv
      format: plain
      record_type: event
  processor:
    - csv:
        column_names: ["time", "level", "message"]
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: csv-demo
        username: admin
        password: admin_pass
        insecure: true
```
{% include copy.html %}
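
The mapping the `csv` processor performs can be previewed in plain Python: each raw line is split on commas (with standard CSV quoting) and zipped against the configured column names. The sample row below is illustrative, not from an actual `ingest.csv`.

```python
import csv
import io

# Column names matching the pipeline configuration above.
column_names = ["time", "level", "message"]

def parse_csv_line(line):
    """Split one CSV line and map its fields onto the configured column names."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(column_names, row))

event = parse_csv_line('2024-05-01T12:00:00,INFO,"service started"')
print(event)  # {'time': '2024-05-01T12:00:00', 'level': 'INFO', 'message': 'service started'}
```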
