| 1 | +# CSVES (CSV to Elasticsearch) |
| 2 | + |
| 3 | +A flexible tool for importing CSV data into Elasticsearch with automatic field detection and mapping. |
| 4 | + |
| 5 | +## Features |
| 6 | + |
| 7 | +- 🔍 Automatic CSV delimiter detection |
| 8 | +- 📄 Dynamic field mapping |
| 9 | +- 🧹 Automatic whitespace and control character cleaning |
| 10 | +- 🎯 Field selection and filtering |
| 11 | +- ⚙️ Configurable through command-line flags or environment variables |
| 12 | +- 🧪 Test mode for data verification |
| 13 | +- 📝 Custom field mapping through JSON configuration |
| 14 | + |
| 15 | +## Installation |
| 16 | + |
| 17 | +### Prerequisites |
| 18 | + |
| 19 | +- Go 1.23 or higher |
| 20 | +- Elasticsearch 8.x |
| 21 | +- Access to an Elasticsearch instance |
| 22 | + |
| 23 | +### Build Steps |
| 24 | + |
| 25 | +1. Clone the repository: |
| 26 | +```bash |
| 27 | +git clone https://github.com/githubesson/csves |
| 28 | +cd csves |
| 29 | +``` |
| 30 | + |
| 31 | +2. Build the binary: |
| 32 | +```bash |
| 33 | +go build -o csves ./cmd/csves |
| 34 | +``` |
| 35 | + |
| 36 | +## Usage |
| 37 | + |
| 38 | +### Basic Usage |
| 39 | + |
| 40 | +```bash |
| 41 | +# Using .env settings |
| 42 | +./csves |
| 43 | + |
| 44 | +# Test mode (no Elasticsearch connection) without .env settings |
| 45 | +./csves -csv="data.csv" -test |
| 46 | + |
| 47 | +# Select specific fields without .env settings |
| 48 | +./csves -csv="data.csv" -select="email,phone,address" |
| 49 | +``` |
| 50 | + |
| 51 | +### Command Line Flags |
| 52 | + |
| 53 | +| Flag | Description | Default | Required | |
| 54 | +|------|-------------|---------|----------| |
| 55 | +| `-csv` | Path to CSV file | - | Yes, unless `CSV_FILE_PATH` is set | |
| 56 | +| `-es-url` | Elasticsearch URL | http://localhost:9200 | No | |
| 57 | +| `-index` | Elasticsearch index name | csv_data | No | |
| 58 | +| `-fields` | Path to field configuration file | - | No | |
| 59 | +| `-select` | Comma-separated list of fields to include | - | No | |
| 60 | +| `-delimiter` | CSV delimiter character | auto-detect | No | |
| 61 | +| `-test` | Run in test mode | false | No | |
| 62 | + |
| 63 | +### Environment Variables |
| 64 | + |
| 65 | +You can also configure the tool using environment variables in a `.env` file: |
| 66 | + |
| 67 | +```env |
| 68 | +ELASTICSEARCH_URL=http://localhost:9200 |
| 69 | +INDEX_NAME=my_index |
| 70 | +CSV_FILE_PATH=data.csv |
| 71 | +FIELD_CONFIG_PATH=fields.json |
| 72 | +``` |
| 73 | + |
| 74 | +### Field Configuration |
| 75 | + |
| 76 | +Create a `fields.json` file to specify field mappings and requirements: |
| 77 | + |
| 78 | +```json |
| 79 | +[ |
| 80 | + { |
| 81 | + "name": "User Id", |
| 82 | + "required": true, |
| 83 | + "csv_name": "userid" |
| 84 | + }, |
| 85 | + { |
| 86 | + "name": "Email", |
| 87 | + "required": true, |
| 88 | + "csv_name": "email" |
| 89 | + } |
| 90 | +] |
| 91 | +``` |
| 92 | + |
| 93 | +- `name`: Field name in Elasticsearch |
| 94 | +- `required`: Whether the column must be present in the CSV file |
| 95 | +- `csv_name`: Column header name in CSV file |
| 96 | + |
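To make the three keys concrete, the sketch below decodes a `fields.json` document and checks a CSV header row for missing required columns. `FieldConfig`, `parseFieldConfig`, and `missingRequired` are hypothetical names chosen for illustration. The exact, case-sensitive header match is also an assumption, and the tool's own validation may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// FieldConfig is a hypothetical struct matching the JSON keys shown above.
type FieldConfig struct {
	Name     string `json:"name"`
	Required bool   `json:"required"`
	CSVName  string `json:"csv_name"`
}

// parseFieldConfig decodes the contents of a fields.json file.
func parseFieldConfig(data []byte) ([]FieldConfig, error) {
	var fields []FieldConfig
	if err := json.Unmarshal(data, &fields); err != nil {
		return nil, fmt.Errorf("parse field config: %w", err)
	}
	return fields, nil
}

// missingRequired returns the csv_name of every required field that is
// absent from the CSV header row (exact match after trimming whitespace).
func missingRequired(fields []FieldConfig, header []string) []string {
	present := make(map[string]bool, len(header))
	for _, h := range header {
		present[strings.TrimSpace(h)] = true
	}
	var missing []string
	for _, f := range fields {
		if f.Required && !present[f.CSVName] {
			missing = append(missing, f.CSVName)
		}
	}
	return missing
}

func main() {
	raw := []byte(`[
  {"name": "User Id", "required": true, "csv_name": "userid"},
  {"name": "Email", "required": true, "csv_name": "email"}
]`)
	fields, err := parseFieldConfig(raw)
	if err != nil {
		panic(err)
	}
	// The header below lacks an "email" column, so it is reported.
	fmt.Println(missingRequired(fields, []string{"userid", "phone"})) // prints [email]
}
```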
| 97 | +## Examples |
| 98 | + |
| 99 | +### 1. Basic Import |
| 100 | +```bash |
| 101 | +./csves -csv="users.csv" |
| 102 | +``` |
| 103 | + |
| 104 | +### 2. Custom Elasticsearch Configuration |
| 105 | +```bash |
| 106 | +./csves -csv="users.csv" -es-url="http://elasticsearch:9200" -index="users_v1" |
| 107 | +``` |
| 108 | + |
| 109 | +### 3. Field Selection |
| 110 | +```bash |
| 111 | +./csves -csv="users.csv" -select="email,phone" -test |
| 112 | +``` |
| 113 | + |
| 114 | +### 4. Custom Field Mapping |
| 115 | +```bash |
| 116 | +./csves -csv="users.csv" -fields="fields.json" |
| 117 | +``` |
| 118 | + |
| 119 | +### 5. Specific Delimiter |
| 120 | +```bash |
| 121 | +./csves -csv="users.csv" -delimiter=";" |
| 122 | +``` |
| 123 | + |
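When `-delimiter` is omitted, the delimiter is auto-detected. One simple way such detection can work is to count candidate characters in the header line and pick the most frequent one. The sketch below shows that heuristic only. `detectDelimiter` and its candidate list are assumptions, not the tool's actual algorithm, and the heuristic ignores delimiters that appear inside quoted values.

```go
package main

import (
	"fmt"
	"strings"
)

// detectDelimiter guesses the delimiter by counting each candidate
// character in the header line and returning the most frequent one.
// It falls back to a comma when no candidate appears.
func detectDelimiter(headerLine string) rune {
	candidates := []rune{',', ';', '\t', '|'}
	best, bestCount := ',', 0
	for _, c := range candidates {
		if n := strings.Count(headerLine, string(c)); n > bestCount {
			best, bestCount = c, n
		}
	}
	return best
}

func main() {
	fmt.Printf("%q\n", detectDelimiter("userid;email;phone")) // prints ';'
}
```

The returned rune could then be assigned to the `Comma` field of an `encoding/csv` reader.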
| 124 | +## Data Cleaning |
| 125 | + |
| 126 | +The tool automatically: |
| 127 | +- Removes leading and trailing whitespace |
| 128 | +- Removes control characters |
| 129 | +- Normalizes internal spaces |
| 130 | +- Skips empty fields |
| 131 | +- Handles multi-line values |
| 132 | + |
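The cleaning steps above can be sketched roughly as follows. `cleanValue` and `cleanRecord` are hypothetical helpers, not the tool's real functions. The sketch assumes that "normalizing internal spaces" means collapsing runs of whitespace into one space and that line breaks inside multi-line values are treated as whitespace.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// cleanValue turns every whitespace character (including tabs and
// newlines) into a space, drops the remaining control characters, then
// trims the ends and collapses runs of spaces into a single space.
func cleanValue(s string) string {
	mapped := strings.Map(func(r rune) rune {
		switch {
		case unicode.IsSpace(r):
			return ' '
		case unicode.IsControl(r):
			return -1 // a negative value removes the rune
		}
		return r
	}, s)
	return strings.Join(strings.Fields(mapped), " ")
}

// cleanRecord pairs header names with cleaned values and skips fields
// that are empty after cleaning.
func cleanRecord(header, row []string) map[string]string {
	doc := make(map[string]string)
	for i, name := range header {
		if i >= len(row) {
			break
		}
		if v := cleanValue(row[i]); v != "" {
			doc[name] = v
		}
	}
	return doc
}

func main() {
	fmt.Printf("%q\n", cleanValue("  John\t \x00Doe\n")) // prints "John Doe"
	fmt.Println(cleanRecord([]string{"name", "phone"}, []string{" Ann ", "   "}))
}
```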
| 133 | +## Error Handling |
| 134 | + |
| 135 | +- Validates required fields |
| 136 | +- Reports parsing errors |
| 137 | +- Shows bulk indexing failures |
| 138 | +- Provides detailed error messages |
| 139 | + |
| 140 | +## Development |
| 141 | + |
| 142 | +### Project Structure |
| 143 | +``` |
| 144 | +csves/ |
| 145 | +├── cmd/ |
| 146 | +│ └── csves/ |
| 147 | +│ └── main.go # Entry point |
| 148 | +├── pkg/ |
| 149 | +│ ├── config/ # Configuration handling |
| 150 | +│ ├── csv/ # CSV processing |
| 151 | +│ ├── elasticsearch/ # ES operations |
| 152 | +│ └── models/ # Data models |
| 153 | +├── go.mod # Go modules file |
| 154 | +├── go.sum # Dependencies checksum |
| 155 | +└── README.md # This file |
| 156 | +``` |
| 157 | + |
| 158 | +## License |
| 159 | + |
| 160 | +This project is licensed under the MIT License - see the LICENSE file for details. |