|
4 | 4 | [](https://goreportcard.com/report/github.com/jf-tech/omniparser)
|
5 | 5 | [](https://pkg.go.dev/github.com/jf-tech/omniparser)
|
6 | 6 |
|
7 |
| -Omniparser is written in naive Golang that ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and custom formats) in streaming fashion |
8 |
| -and transforms data into desired JSON output based on a schema written in JSON. |
| 7 | +Omniparser is written in native Golang. It ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and |
| 8 | +custom formats) in a streaming fashion and transforms the data into the desired JSON output based on a schema written in JSON. |
9 | 9 |
|
10 |
| -Golang Version: 1.14.2 |
| 10 | +Golang Version: 1.14 |
11 | 11 |
|
12 |
| -## Demo in Playground |
| 12 | +## Getting Started |
| 13 | + |
| 14 | +Follow the tutorial [Getting Started](./doc/gettingstarted.md) to write your first omniparser schema. |
| 15 | + |
| 16 | +## Online Playground |
13 | 17 |
|
14 | 18 | Use https://omniparser.herokuapp.com/ (may need to wait for a few seconds for heroku instance to wake up)
|
15 |
| -for trying out schemas and inputs, yours and from sample library, to see how transform works. |
| 19 | +for trying out schemas and inputs, yours or existing samples, to see how ingestion and transform work. |
16 | 20 |
|
17 | 21 | 
|
18 | 22 |
|
19 |
| -Take a detailed look at samples here: |
| 23 | +## More Examples |
20 | 24 | - [csv examples](extensions/omniv21/samples/csv)
|
21 | 25 | - [fixed-length examples](extensions/omniv21/samples/fixedlength)
|
22 | 26 | - [json examples](extensions/omniv21/samples/json)
|
23 | 27 | - [xml examples](extensions/omniv21/samples/xml).
|
24 | 28 | - [edi examples](extensions/omniv21/samples/edi).
|
25 | 29 |
|
26 |
| -## Simple Example (JSON -> JSON Transform) |
27 |
| -- Input: |
28 |
| - ``` |
29 |
| - { |
30 |
| - "order_id": "1234567", |
31 |
| - "tracking_number": "1z9999999999999999", |
32 |
| - "items": [ |
33 |
| - { |
34 |
| - "item_sku": "ab123", |
35 |
| - "item_price": 12.34, |
36 |
| - "number_purchased": 5 |
37 |
| - }, |
38 |
| - { |
39 |
| - "item_sku": "ck763-23", |
40 |
| - "item_price": 3.12, |
41 |
| - "number_purchased": 2 |
42 |
| - } |
43 |
| - ] |
44 |
| - } |
45 |
| - ``` |
46 |
| -- Schema: |
47 |
| - ``` |
48 |
| - { |
49 |
| - "parser_settings": { |
50 |
| - "version": "omni.2.1", |
51 |
| - "file_format_type": "json" |
52 |
| - }, |
53 |
| - "transform_declarations": { |
54 |
| - "FINAL_OUTPUT": { "object": { |
55 |
| - "order_id": { "xpath": "order_id" }, |
56 |
| - "tracking_number": { "custom_func": { |
57 |
| - "name": "upper", |
58 |
| - "args": [ { "xpath": "tracking_number" } ] |
59 |
| - }}, |
60 |
| - "items": { "array": [{ "xpath": "items/*", "object": { |
61 |
| - "sku": { "custom_func": { |
62 |
| - "name": "javascript", |
63 |
| - "args": [ |
64 |
| - { "const": "sku.toUpperCase().substring(0, 5)" }, |
65 |
| - { "const": "sku" }, { "xpath": "item_sku" } |
66 |
| - ] |
67 |
| - }}, |
68 |
| - "total_price": { "custom_func": { |
69 |
| - "name": "javascript", |
70 |
| - "args": [ |
71 |
| - { "const": "num * price" }, |
72 |
| - { "const": "num" }, { "xpath": "number_purchased", "type": "int" }, |
73 |
| - { "const": "price" }, { "xpath": "item_price", "type": "float" } |
74 |
| - ] |
75 |
| - }} |
76 |
| - }}]} |
77 |
| - }} |
78 |
| - } |
79 |
| - } |
80 |
| - ``` |
81 |
| -- Code: |
82 |
| - ``` |
83 |
| - schema, err := omniparser.NewSchema("schema-name", strings.NewReader("...")) |
84 |
| - if err != nil { ... } |
85 |
| - transform, err := schema.NewTransform("input-name", strings.NewReader("..."), &transformctx.Ctx{}) |
86 |
| - if err != nil { ... } |
87 |
| - for { |
88 |
| - b, err := transform.Read() |
89 |
| - if err == io.EOF { break } |
90 |
| - if err != nil { ... } |
91 |
| - fmt.Println(string(b)) |
92 |
| - } |
93 |
| - ``` |
94 |
| -- Output: |
95 |
| - ``` |
96 |
| - { |
97 |
| - "order_id": "1234567", |
98 |
| - "tracking_number": "1Z9999999999999999", |
99 |
| - "items": [ |
100 |
| - { |
101 |
| - "sku": "AB123", |
102 |
| - "total_price": 61.7 |
103 |
| - }, |
104 |
| - { |
105 |
| - "sku": "CK763", |
106 |
| - "total_price": 6.24 |
107 |
| - } |
108 |
| - ] |
109 |
| - } |
110 |
| - ``` |
111 |
| -
|
112 | 30 | ## Why
|
113 | 31 | - No good ETL transform/parser library exists in Golang.
|
114 | 32 | - Even looking into Java and other languages, choices aren't many and all have limitations:
|
115 | 33 | - [Smooks](https://www.smooks.org/) is dead, plus its EDI parsing/transform is too heavyweight, needing code-gen.
|
116 | 34 | - [BeanIO](http://beanio.org/) can't deal with EDI input.
|
117 | 35 | - [Jolt](https://github.com/bazaarvoice/jolt) can't deal with anything other than JSON input.
|
118 | 36 | - [JSONata](https://jsonata.org/) still only JSON -> JSON transform.
|
119 |
| -- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some situations. |
| 37 | +- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some |
| 38 | +situations. |
120 | 39 |
|
121 | 40 | ## Requirements
|
122 | 41 | - Golang 1.14
|
@@ -150,4 +69,5 @@ Take a detailed look at samples here:
|
150 | 69 | - Ability to provide a new file format support to built-in omniv2 schema handler.
|
151 | 70 |
|
152 | 71 | ## Footnotes
|
153 |
| -- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/), [Simon](https://github.com/liangxibing) and [Steven](http://github.com/wangjia007bond). |
| 72 | +- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/), [Simon](https://github.com/liangxibing) |
| 73 | +and [Steven](http://github.com/wangjia007bond). |