Commit f057be3

getting started (#121)
1 parent 1d5645f commit f057be3

4 files changed, +826 -96 lines changed


README.md

Lines changed: 14 additions & 94 deletions
@@ -4,119 +4,38 @@
 [![Go Report Card](https://goreportcard.com/badge/github.com/jf-tech/omniparser)](https://goreportcard.com/report/github.com/jf-tech/omniparser)
 [![PkgGoDev](https://pkg.go.dev/badge/github.com/jf-tech/omniparser)](https://pkg.go.dev/github.com/jf-tech/omniparser)

-Omniparser is written in naive Golang that ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and custom formats) in streaming fashion
-and transforms data into desired JSON output based on a schema written in JSON.
+Omniparser is written in naive Golang that ingests input data of various formats (**CSV, txt, XML, EDI, JSON**, and
+custom formats) in streaming fashion and transforms data into desired JSON output based on a schema written in JSON.

-Golang Version: 1.14.2
+Golang Version: 1.14

-## Demo in Playground
+## Getting Started
+
+Follow the tutorial [Getting Started](./doc/gettingstarted.md) to write your first omniparser schema.
+
+## Online Playground

 Use https://omniparser.herokuapp.com/ (may need to wait for a few seconds for heroku instance to wake up)
-for trying out schemas and inputs, yours and from sample library, to see how transform works.
+for trying out schemas and inputs, yours or existing samples, to see how ingestion and transform work.

 ![](./cli/cmd/web/playground-demo.gif)

-Take a detailed look at samples here:
+## More Examples
 - [csv examples](extensions/omniv21/samples/csv)
 - [fixed-length examples](extensions/omniv21/samples/fixedlength)
 - [json examples](extensions/omniv21/samples/json)
 - [xml examples](extensions/omniv21/samples/xml).
 - [edi examples](extensions/omniv21/samples/edi).

-## Simple Example (JSON -> JSON Transform)
-- Input:
-```
-{
-    "order_id": "1234567",
-    "tracking_number": "1z9999999999999999",
-    "items": [
-        {
-            "item_sku": "ab123",
-            "item_price": 12.34,
-            "number_purchased": 5
-        },
-        {
-            "item_sku": "ck763-23",
-            "item_price": 3.12,
-            "number_purchased": 2
-        }
-    ]
-}
-```
-- Schema:
-```
-{
-    "parser_settings": {
-        "version": "omni.2.1",
-        "file_format_type": "json"
-    },
-    "transform_declarations": {
-        "FINAL_OUTPUT": { "object": {
-            "order_id": { "xpath": "order_id" },
-            "tracking_number": { "custom_func": {
-                "name": "upper",
-                "args": [ { "xpath": "tracking_number" } ]
-            }},
-            "items": { "array": [{ "xpath": "items/*", "object": {
-                "sku": { "custom_func": {
-                    "name": "javascript",
-                    "args": [
-                        { "const": "sku.toUpperCase().substring(0, 5)" },
-                        { "const": "sku" }, { "xpath": "item_sku" }
-                    ]
-                }},
-                "total_price": { "custom_func": {
-                    "name": "javascript",
-                    "args": [
-                        { "const": "num * price" },
-                        { "const": "num" }, { "xpath": "number_purchased", "type": "int" },
-                        { "const": "price" }, { "xpath": "item_price", "type": "float" }
-                    ]
-                }}
-            }}]}
-        }}
-    }
-}
-```
-- Code:
-```
-schema, err := omniparser.NewSchema("schema-name", strings.NewReader("..."))
-if err != nil { ... }
-transform, err := schema.NewTransform("input-name", strings.NewReader("..."), &transformctx.Ctx{})
-if err != nil { ... }
-for {
-    b, err := transform.Read()
-    if err == io.EOF { break }
-    if err != nil { ... }
-    fmt.Println(string(b))
-}
-```
-- Output:
-```
-{
-    "order_id": "1234567",
-    "tracking_number": "1Z9999999999999999",
-    "items": [
-        {
-            "sku": "AB123",
-            "total_price": 61.7
-        },
-        {
-            "sku": "CK763",
-            "total_price": 6.24
-        }
-    ]
-}
-```
-
 ## Why
 - No good ETL transform/parser library exists in Golang.
 - Even looking into Java and other languages, choices aren't many and all have limitations:
     - [Smooks](https://www.smooks.org/) is dead, plus its EDI parsing/transform is too heavyweight, needing code-gen.
     - [BeanIO](http://beanio.org/) can't deal with EDI input.
     - [Jolt](https://github.com/bazaarvoice/jolt) can't deal with anything other than JSON input.
     - [JSONata](https://jsonata.org/) still only JSON -> JSON transform.
-- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some situations.
+- Many of the parsers/transforms don't support streaming read, loading entire input into memory - not acceptable in some
+  situations.

 ## Requirements
 - Golang 1.14
@@ -150,4 +69,5 @@ Take a detailed look at samples here:
 - Ability to provide a new file format support to built-in omniv2 schema handler.

 ## Footnotes
-- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/), [Simon](https://github.com/liangxibing) and [Steven](http://github.com/wangjia007bond).
+- omniparser is a collaboration effort of [jf-tech](https://github.com/jf-tech/),[Simon](https://github.com/liangxibing)
+and [Steven](http://github.com/wangjia007bond).

cli.sh

Lines changed: 8 additions & 2 deletions
@@ -1,3 +1,9 @@
 #!/bin/bash
-SCRIPT_DIR=$(pwd `dirname "$0"`)
-go run -ldflags "-X main.gitCommit=$(git rev-parse HEAD) -X main.buildEpochSec=$(date +%s)" $SCRIPT_DIR/cli/op.go "$@"
+CUR_DIR=$(pwd)
+SCRIPT_DIR=$(dirname "$0")
+cd $SCRIPT_DIR && \
+  go build \
+  -o $CUR_DIR/op \
+  -ldflags "-X main.gitCommit=$(git rev-parse HEAD) -X main.buildEpochSec=$(date +%s)" \
+  $SCRIPT_DIR/cli/op.go
+cd $CUR_DIR && ./op "$@"
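
One thing to watch in the rewritten `cli.sh`: `SCRIPT_DIR=$(dirname "$0")` can be a relative path, and after `cd $SCRIPT_DIR` the later `$SCRIPT_DIR/cli/op.go` is resolved against the new directory. Invoking the script through a relative path from another directory (for example `omniparser/cli.sh`) would therefore likely fail to find the source file. A minimal sketch of one possible way around that, assuming bash; `abs_dir_of` is a hypothetical helper name and is not part of this commit:

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): print the absolute directory
# that contains the given path.
abs_dir_of() {
  # The cd runs in a subshell, so the caller's working directory is unchanged.
  (cd "$(dirname "$1")" && pwd)
}

# Possible use inside cli.sh (shown as comments, not executed here):
#   CUR_DIR=$(pwd)
#   SCRIPT_DIR=$(abs_dir_of "$0")
#   (cd "$SCRIPT_DIR" && go build -o "$CUR_DIR/op" -ldflags "..." ./cli/op.go)
#   "$CUR_DIR/op" "$@"
```

With an absolute `SCRIPT_DIR`, the build path no longer depends on where the script was invoked from, and quoting the expansions keeps paths that contain spaces intact.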

doc/customfuncs.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Custom Function Reference
