@@ -174,24 +177,24 @@ For more details, refer to the [documentation](https://olake.io/docs).
For a collection of 230 million rows (664.81GB) from [Twitter data](https://archive.org/details/archiveteam-twitter-stream-2017-11), here's how Olake compares to other tools:
Cost Comparison (considering a first full load of 230 million rows and 50 million incremental rows per month), as of 30th September: find more [here](https://olake.io/docs/connectors/mongodb/benchmarks).
+
Cost Comparison (considering a first full load of 230 million rows and 50 million incremental rows per month), as of 30th September 2025: find more [here](https://olake.io/docs/connectors/mongodb/benchmarks).
We have additionally planned the following sources - [AWS S3](https://github.com/datazip-inc/olake/issues/86) | [Kafka](https://github.com/datazip-inc/olake/issues/87)
Writers are integrated directly into the drivers so that reads and writes are not blocked on `os.Stdout` or any other type of I/O. This lets the records from each individual query be inserted directly into the destination.
| Azure Purview | Not Planned, [submit a request](https://github.com/datazip-inc/olake/issues/new?template=new-feature.md) |
+
| BigLake Metastore | Not Planned, [submit a request](https://github.com/datazip-inc/olake/issues/new?template=new-feature.md) |
+
See [Roadmap](https://github.com/orgs/datazip-inc/projects/5) for more details.
-
Writers are being planned in this order
-
- [x] Parquet Writer (Writes Parquet files on Local/S3)
-
- [ ] S3 Iceberg Parquet (Coming Soon!)
-
- [ ] Snowflake
-
- [ ] BigQuery
-
- [ ] RedShift
### Core
Core, or the framework, is the logic that has been abstracted out of the Connectors to follow DRY. It includes the base CLI commands, State logic, Validation logic, Type detection for unstructured data, handling of the Config, State, Catalog, and Writer config files, logging, etc.
-
Core includes http server that directly exposes live stats about running sync such as
+
Core includes an HTTP server that directly exposes live stats about the running sync, such as:
- Possible finish time
- Concurrently running processes
- Live record count
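
As a rough illustration of the three stats listed above, here is a minimal Python sketch of a stats endpoint. It is not OLake's implementation: `SyncStats`, `make_handler`, `serve`, the field names, the port, and the linear finish-time estimate are all assumptions made for this example.

```python
import json
import time
from dataclasses import dataclass, field
from http.server import BaseHTTPRequestHandler, HTTPServer


@dataclass
class SyncStats:
    """Hypothetical in-memory counters for one running sync."""
    total_records: int        # estimated rows to sync (assumed known up front)
    synced_records: int = 0   # live record count
    running_workers: int = 0  # concurrently running processes
    started_at: float = field(default_factory=time.time)

    def snapshot(self, now=None):
        """Return the current stats as a JSON-serializable dict."""
        now = time.time() if now is None else now
        elapsed = max(now - self.started_at, 1e-9)
        rate = self.synced_records / elapsed  # records per second so far
        remaining = max(self.total_records - self.synced_records, 0)
        # Linear extrapolation: assumes the rate seen so far continues.
        eta = remaining / rate if rate > 0 else None
        return {
            "synced_records": self.synced_records,
            "running_workers": self.running_workers,
            "seconds_remaining": eta,
        }


def make_handler(stats):
    """Build a request handler class bound to one SyncStats instance."""
    class StatsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(stats.snapshot()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return StatsHandler


def serve(stats, port=8080):
    """Block forever, answering every GET with the current snapshot."""
    HTTPServer(("127.0.0.1", port), make_handler(stats)).serve_forever()
```

Calling `serve(SyncStats(total_records=230_000_000))` in a background thread while the sync updates the counters would make the snapshot available over HTTP. The estimate returns `None` until at least one record has been synced.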
-
Core handles the commands to interact with a driver via these
-
- spec command: Returns render-able JSON Schema that can be consumed by rjsf libraries in frontend
-
- check command: performs all necessary checks on the Config, Catalog, State and Writer config
-
- discover command: Returns all streams and their schema
-
- sync command: Extracts data out of Source and writes into destinations
+
Core handles the following commands for interacting with a driver:
+
- `spec` command: Returns a renderable JSON Schema that rjsf libraries in the frontend can consume
+
- `check` command: Performs all necessary checks on the Config, Catalog, State and Writer config
+
- `discover` command: Returns all streams and their schemas
+
- `sync` command: Extracts data from the Source and writes it to the destinations
+
Find more about how OLake works [here](https://olake.io/docs/category/understanding-olake).
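
To make the four driver commands listed above more concrete, here is a minimal Python sketch of a command-line front end that accepts them. Only the command names come from the README. The program name `driver-sketch`, the flag names, and the mapping of which files each command requires are assumptions for illustration and do not describe OLake's actual CLI.

```python
import argparse

# Hypothetical mapping from command name to the input files it requires.
COMMAND_INPUTS = {
    "spec": [],
    "check": ["config", "catalog", "state", "writer"],
    "discover": ["config"],
    "sync": ["config", "catalog", "state", "writer"],
}


def build_parser():
    """Create one subcommand per driver command, each with its own flags."""
    parser = argparse.ArgumentParser(prog="driver-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, inputs in COMMAND_INPUTS.items():
        cmd = sub.add_parser(name)
        for flag in inputs:
            cmd.add_argument(f"--{flag}", required=True,
                             help=f"path to the {flag} file")
    return parser


def run(argv):
    """Parse argv and return the chosen command with its input paths."""
    args = build_parser().parse_args(argv)
    inputs = {k: v for k, v in vars(args).items() if k != "command"}
    return args.command, inputs
```

For example, `run(["discover", "--config", "c.json"])` returns the command name together with the paths it was given, and a missing required flag makes `argparse` exit with a usage message. A real driver would hand the result to the matching handler instead of returning it.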
### SDKs
@@ -267,15 +295,18 @@ Olake will be built on top of SDK providing persistent storage and a user interf
We ❤️ contributions big or small. Please read [CONTRIBUTING.md](CONTRIBUTING.md) to get started with making contributions to OLake.
+
- To contribute to Frontend, go to [OLake Frontend GitHub repo](https://github.com/datazip-inc/olake-frontend/).
+
- To contribute to the OLake website and documentation (olake.io), go to the [OLake Docs GitHub repo](https://github.com/datazip-inc/olake-docs).
Not sure how to get started? Just ping us on `#contributing-to-olake` in our [Slack community](https://olake.io/slack).
-
<br /><br />
+
## [Documentation](https://olake.io/docs)
-
## Documentation
+
If you need any clarification or find something missing, feel free to raise a GitHub issue with the label `documentation` at the [olake-docs](https://github.com/datazip-inc/olake-docs/) repo or reach out to us on the community Slack channel.
-
You can find docs at https://olake.io/docs. If you need any clarification or find something missing, feel free to raise a GitHub issue with the label `documentation` at [olake-docs](https://github.com/datazip-inc/olake-docs/) repo or reach out to us at the community slack channel.