
Commit 27c7785

initial
0 parents  commit 27c7785

13 files changed: +843 -0 lines changed

.env.example

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
ELASTICSEARCH_URL=http://url:port
INDEX_NAME=index_name
CSV_FILE_PATH=csvpath.csv

.github/workflows/release.yml

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
name: Release CsvES Binary

on:
  push:
    tags:
      - "v*"

permissions:
  contents: write

jobs:
  build-and-release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: "1.23"

      - name: Build binaries
        run: |
          BINARY_NAME="csves"
          VERSION=${GITHUB_REF#refs/tags/}
          BUILD_PATH="cmd/csves/main.go"

          # Build for Linux
          GOOS=linux GOARCH=amd64 go build -o "${BINARY_NAME}-linux-amd64" ${BUILD_PATH}
          tar czf "${BINARY_NAME}-linux-amd64.tar.gz" "${BINARY_NAME}-linux-amd64"

          # Build for macOS (Intel)
          GOOS=darwin GOARCH=amd64 go build -o "${BINARY_NAME}-darwin-amd64" ${BUILD_PATH}
          tar czf "${BINARY_NAME}-darwin-amd64.tar.gz" "${BINARY_NAME}-darwin-amd64"

          # Build for macOS (Apple Silicon)
          GOOS=darwin GOARCH=arm64 go build -o "${BINARY_NAME}-darwin-arm64" ${BUILD_PATH}
          tar czf "${BINARY_NAME}-darwin-arm64.tar.gz" "${BINARY_NAME}-darwin-arm64"

          # Build for Windows
          GOOS=windows GOARCH=amd64 go build -o "${BINARY_NAME}-windows-amd64.exe" ${BUILD_PATH}
          zip "${BINARY_NAME}-windows-amd64.zip" "${BINARY_NAME}-windows-amd64.exe"

      - name: Create Release
        uses: softprops/action-gh-release@v1
        with:
          files: |
            csves-linux-amd64.tar.gz
            csves-darwin-amd64.tar.gz
            csves-darwin-arm64.tar.gz
            csves-windows-amd64.zip
          draft: false
          prerelease: false
          generate_release_notes: true

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
.env

README.md

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
# CSVES (CSV to Elasticsearch)

A flexible tool for importing CSV data into Elasticsearch with automatic field detection and mapping.

## Features

- 🔍 Automatic CSV delimiter detection
- 📄 Dynamic field mapping
- 🧹 Automatic whitespace and control character cleaning
- 🎯 Field selection and filtering
- ⚙️ Configurable through command-line flags or environment variables
- 🧪 Test mode for data verification
- 📝 Custom field mapping through JSON configuration

## Installation

### Prerequisites

- Go 1.23 or higher
- Elasticsearch 8.x
- Access to an Elasticsearch instance

### Build Steps

1. Clone the repository:
```bash
git clone https://github.com/githubesson/csves
cd csves
```

2. Build the binary:
```bash
go build -o csves cmd/csves/main.go
```

## Usage

### Basic Usage

```bash
# Using .env settings
./csves

# Test mode (no Elasticsearch connection) without .env settings
./csves -csv="data.csv" -test

# Select specific fields without .env settings
./csves -csv="data.csv" -select="email,phone,address"
```

### Command Line Flags

| Flag | Description | Default | Required |
|------|-------------|---------|----------|
| `-csv` | Path to CSV file | - | Yes |
| `-es-url` | Elasticsearch URL | http://localhost:9200 | No |
| `-index` | Elasticsearch index name | csv_data | No |
| `-fields` | Path to field configuration file | - | No |
| `-select` | Comma-separated list of fields to include | - | No |
| `-delimiter` | CSV delimiter character | auto-detect | No |
| `-test` | Run in test mode | false | No |

### Environment Variables

You can also configure the tool using environment variables in a `.env` file:

```env
ELASTICSEARCH_URL=http://localhost:9200
INDEX_NAME=my_index
CSV_FILE_PATH=data.csv
FIELD_CONFIG_PATH=fields.json
```

### Field Configuration

Create a `fields.json` file to specify field mappings and requirements:

```json
[
    {
        "name": "User Id",
        "required": true,
        "csv_name": "userid"
    },
    {
        "name": "Email",
        "required": true,
        "csv_name": "email"
    }
]
```

- `name`: Field name in Elasticsearch
- `required`: Whether the field must exist in CSV
- `csv_name`: Column header name in CSV file

## Examples

### 1. Basic Import
```bash
./csves -csv="users.csv"
```

### 2. Custom Elasticsearch Configuration
```bash
./csves -csv="users.csv" -es-url="http://elasticsearch:9200" -index="users_v1"
```

### 3. Field Selection
```bash
./csves -csv="users.csv" -select="email,phone" -test
```

### 4. Custom Field Mapping
```bash
./csves -csv="users.csv" -fields="fields.json"
```

### 5. Specific Delimiter
```bash
./csves -csv="users.csv" -delimiter=";"
```

## Data Cleaning

The tool automatically:
- Removes leading and trailing whitespace
- Removes control characters
- Normalizes internal spaces
- Skips empty fields
- Handles multi-line values
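
A minimal Go sketch of how these cleaning rules could be combined. It is not taken from the repository's `pkg/csv` package, and `cleanField` and `cleanRecord` are hypothetical names. It assumes control characters are replaced by spaces before whitespace is collapsed, so that the lines of a multi-line value do not run together.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// cleanField is a hypothetical helper that applies the cleaning rules
// listed above to a single CSV value.
func cleanField(s string) string {
	// Map every control character (tabs, newlines, NUL, ...) to a space.
	s = strings.Map(func(r rune) rune {
		if unicode.IsControl(r) {
			return ' '
		}
		return r
	}, s)
	// strings.Fields splits on runs of whitespace, which trims both ends
	// and collapses internal runs once the parts are rejoined.
	return strings.Join(strings.Fields(s), " ")
}

// cleanRecord cleans every value and drops fields that end up empty.
func cleanRecord(rec map[string]string) map[string]string {
	out := make(map[string]string, len(rec))
	for key, value := range rec {
		if cleaned := cleanField(value); cleaned != "" {
			out[key] = cleaned
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", cleanField("  12 Main\tSt\r\nSpringfield\x00 "))
	// prints "12 Main St Springfield"
}
```
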

## Error Handling

- Validates required fields
- Reports parsing errors
- Shows bulk indexing failures
- Provides detailed error messages
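
To illustrate what reporting bulk indexing failures involves, here is a sketch that uses only the standard library rather than the `go-elasticsearch` client the project depends on. It builds the newline-delimited JSON body that the Elasticsearch Bulk API expects and extracts per-item errors from a bulk response. `buildBulkBody` and `failedItems` are hypothetical names, and the document shape (`map[string]string`) is an assumption.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// buildBulkBody renders documents as an NDJSON Bulk API payload: one
// action line followed by one source line per document.
func buildBulkBody(index string, docs []map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing newline per value
	for _, doc := range docs {
		action := map[string]map[string]string{"index": {"_index": index}}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// bulkItem is the per-document result inside a bulk response.
type bulkItem struct {
	Status int `json:"status"`
	Error  *struct {
		Type   string `json:"type"`
		Reason string `json:"reason"`
	} `json:"error"`
}

// bulkResponse models only the response fields this sketch needs.
type bulkResponse struct {
	Errors bool                  `json:"errors"`
	Items  []map[string]bulkItem `json:"items"`
}

// failedItems returns one "type: reason" string per failed item.
func failedItems(raw []byte) ([]string, error) {
	var resp bulkResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var failures []string
	for _, item := range resp.Items {
		for _, result := range item {
			if result.Error != nil {
				failures = append(failures, result.Error.Type+": "+result.Error.Reason)
			}
		}
	}
	return failures, nil
}

func main() {
	body, err := buildBulkBody("csv_data", []map[string]string{{"Email": "a@example.com"}})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

A bulk request can return HTTP 200 while individual items fail, which is why the per-item `error` objects have to be inspected and not only the status code.
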

## Development

### Project Structure
```
csves/
├── cmd/
│   └── csves/
│       └── main.go       # Entry point
├── pkg/
│   ├── config/           # Configuration handling
│   ├── csv/              # CSV processing
│   ├── elasticsearch/    # ES operations
│   └── models/           # Data models
├── go.mod                # Go modules file
├── go.sum                # Dependencies checksum
└── README.md             # This file
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

cmd/csves/main.go

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
package main

import (
	"log"

	"csves/pkg/config"
	"csves/pkg/csv"
	"csves/pkg/elasticsearch"
)

func main() {
	// Load configuration
	cfg, err := config.LoadConfig()
	if err != nil {
		log.Fatalf("Error loading configuration: %s", err)
	}

	// Initialize CSV service
	csvService, err := csv.NewService(cfg)
	if err != nil {
		log.Fatalf("Error initializing CSV service: %s", err)
	}

	// Process header
	headerMap, err := csvService.ProcessHeader()
	if err != nil {
		log.Fatalf("Error processing CSV header: %s", err)
	}

	// Process records
	documents, err := csvService.ProcessRecords(headerMap)
	if err != nil {
		log.Fatalf("Error processing CSV records: %s", err)
	}

	if cfg.TestMode {
		csvService.PrintDocuments(documents, true)
		return
	}

	// Initialize Elasticsearch service
	esService, err := elasticsearch.NewService(cfg)
	if err != nil {
		log.Fatalf("Error initializing Elasticsearch service: %s", err)
	}

	// Setup Elasticsearch
	if err := esService.Setup(); err != nil {
		log.Fatalf("Error setting up Elasticsearch: %s", err)
	}

	// Print sample for verification
	csvService.PrintDocuments(documents, false)

	// Bulk index documents
	if err := esService.BulkIndex(documents); err != nil {
		log.Fatalf("Error bulk indexing documents: %s", err)
	}

	log.Println("All documents indexed successfully")
}

example.csv

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
test;ababa;username
test;ababa;email
test;ababa;phonenum
test;ababa;address
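
Note on `example.csv`: the rows are semicolon-separated, which is the kind of input the README's automatic delimiter detection has to handle. The detection code itself is not part of the files shown here, so the following Go sketch only illustrates one plausible heuristic (pick the candidate that occurs most often in the first line). `detectDelimiter` and its candidate list are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// detectDelimiter returns the candidate delimiter that occurs most often
// in the first line of the input, falling back to a comma when none of
// the candidates appears. It is an illustrative heuristic, not the
// repository's implementation.
func detectDelimiter(firstLine string) rune {
	candidates := []rune{',', ';', '\t', '|'}
	best, bestCount := ',', 0
	for _, c := range candidates {
		if n := strings.Count(firstLine, string(c)); n > bestCount {
			best, bestCount = c, n
		}
	}
	return best
}

func main() {
	fmt.Printf("%q\n", detectDelimiter("test;ababa;username"))
	// prints ';'
}
```

The detected rune could then be assigned to the `Comma` field of an `encoding/csv` `Reader`. Counting characters in a single line is fragile when quoted values contain delimiters, so a more careful version would compare counts across several lines.
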

fields.json

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
[
    {
        "name": "User Id",
        "required": true,
        "csv_name": "userid"
    },
    {
        "name": "Email",
        "required": true,
        "csv_name": "email"
    },
    {
        "name": "Phone",
        "required": false,
        "csv_name": "phonenum"
    },
    {
        "name": "Mailing address",
        "required": false,
        "csv_name": "address"
    },
    {
        "name": "Date of birth",
        "required": false,
        "csv_name": "dob"
    }
]
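
Note on `fields.json`: each entry maps a CSV column (`csv_name`) to an Elasticsearch field (`name`) and marks whether the column is required. As a rough sketch of how such a file could be decoded and checked against a CSV header in Go, assuming a hypothetical `FieldConfig` struct and helper names that are not taken from the repository's `pkg/models` or `pkg/config` packages:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FieldConfig mirrors one entry of fields.json. The struct name is an
// assumption; the JSON keys come from the file itself.
type FieldConfig struct {
	Name     string `json:"name"`
	Required bool   `json:"required"`
	CSVName  string `json:"csv_name"`
}

// parseFields decodes the JSON array of field definitions.
func parseFields(data []byte) ([]FieldConfig, error) {
	var fields []FieldConfig
	if err := json.Unmarshal(data, &fields); err != nil {
		return nil, fmt.Errorf("parsing field config: %w", err)
	}
	return fields, nil
}

// missingRequired returns the csv_name of every required field that is
// absent from the CSV header.
func missingRequired(fields []FieldConfig, header []string) []string {
	present := make(map[string]bool, len(header))
	for _, h := range header {
		present[h] = true
	}
	var missing []string
	for _, f := range fields {
		if f.Required && !present[f.CSVName] {
			missing = append(missing, f.CSVName)
		}
	}
	return missing
}

func main() {
	raw := []byte(`[{"name":"User Id","required":true,"csv_name":"userid"},
	                {"name":"Phone","required":false,"csv_name":"phonenum"}]`)
	fields, err := parseFields(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(missingRequired(fields, []string{"email", "phonenum"}))
	// prints [userid]
}
```
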

go.mod

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
module csves

go 1.23.0

require (
	github.com/elastic/go-elasticsearch/v8 v8.17.1
	github.com/joho/godotenv v1.5.1
)

require (
	github.com/elastic/elastic-transport-go/v8 v8.6.1 // indirect
	github.com/go-logr/logr v1.4.2 // indirect
	github.com/go-logr/stdr v1.2.2 // indirect
	go.opentelemetry.io/otel v1.28.0 // indirect
	go.opentelemetry.io/otel/metric v1.28.0 // indirect
	go.opentelemetry.io/otel/trace v1.28.0 // indirect
)

go.sum

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/elastic/elastic-transport-go/v8 v8.6.1 h1:h2jQRqH6eLGiBSN4eZbQnJLtL4bC5b4lfVFRjw2R4e4=
github.com/elastic/elastic-transport-go/v8 v8.6.1/go.mod h1:YLHer5cj0csTzNFXoNQ8qhtGY1GTvSqPnKWKaqQE3Hk=
github.com/elastic/go-elasticsearch/v8 v8.17.1 h1:bOXChDoCMB4TIwwGqKd031U8OXssmWLT3UrAr9EGs3Q=
github.com/elastic/go-elasticsearch/v8 v8.17.1/go.mod h1:MVJCtL+gJJ7x5jFeUmA20O7rvipX8GcQmo5iBcmaJn4=
github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
go.opentelemetry.io/otel v1.28.0 h1:/SqNcYk+idO0CxKEUOtKQClMK/MimZihKYMruSMViUo=
go.opentelemetry.io/otel v1.28.0/go.mod h1:q68ijF8Fc8CnMHKyzqL6akLO46ePnjkgfIMIjUIX9z4=
go.opentelemetry.io/otel/metric v1.28.0 h1:f0HGvSl1KRAU1DLgLGFjrwVyismPlnuU6JD6bOeuA5Q=
go.opentelemetry.io/otel/metric v1.28.0/go.mod h1:Fb1eVBFZmLVTMb6PPohq3TO9IIhUisDsbJoL/+uQW4s=
go.opentelemetry.io/otel/sdk v1.21.0 h1:FTt8qirL1EysG6sTQRZ5TokkU8d0ugCj8htOgThZXQ8=
go.opentelemetry.io/otel/sdk v1.21.0/go.mod h1:Nna6Yv7PWTdgJHVRD9hIYywQBRx7pbox6nwBnZIxl/E=
go.opentelemetry.io/otel/trace v1.28.0 h1:GhQ9cUuQGmNDd5BTCP2dAvv75RdMxEfTmYejp+lkx9g=
go.opentelemetry.io/otel/trace v1.28.0/go.mod h1:jPyXzNPg6da9+38HEwElrQiHlVMTnVfM3/yv2OlIHaI=
golang.org/x/sys v0.19.0 h1:q5f1RH2jigJ1MoAWp2KTp3gm5zAGFUTarQZ5U386+4o=
golang.org/x/sys v0.19.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
