Commit 523d356

[skip ci] Update README and move demo pics to the repository
1 parent a5fcb0b commit 523d356

3 files changed: +57 -15 lines changed

README.md

+57-15
@@ -5,27 +5,67 @@
55
[![Discord chat room](https://img.shields.io/discord/718534846472912936.svg)](https://discord.gg/4Qe2fYA)
66
[![Follow](https://img.shields.io/badge/[email protected])](https://twitter.com/Splitgraph)
77

8-
[Splitgraph](https://www.splitgraph.com) is a tool for building, versioning, querying and sharing datasets that works on top of [PostgreSQL](https://postgresql.org) and [integrates](https://www.splitgraph.com/product/splitgraph/integrations) seamlessly with anything that uses PostgreSQL.
8+
## Overview
99

10-
This repository contains most of the core code for the Splitgraph library,
11-
the [`sgr` command line client](https://www.splitgraph.com/docs/architecture/sgr-client) and the [Splitgraph Engine](engine/README.md).
10+
**Splitgraph** is a tool for building, versioning and querying reproducible datasets. It's inspired
11+
by Docker and Git, so it feels familiar. And it's powered by [PostgreSQL](https://postgresql.org), so it [works seamlessly with existing tools](https://www.splitgraph.com/product/splitgraph/integrations) in the Postgres ecosystem. Use Splitgraph to package your data into self-contained **data images** that you can [share with other Splitgraph instances](https://www.splitgraph.com/docs/getting-started/decentralized-demo).
1212

13-
See https://www.splitgraph.com/docs/getting-started/introduction for the full docs.
13+
[**Splitgraph.com**](https://www.splitgraph.com), or **Splitgraph Cloud**, is a public Splitgraph instance where you can share and discover data. It's a Splitgraph peer powered by the **Splitgraph Core** code in this repository, adding proprietary features like a data catalog, multitenancy, and a distributed SQL proxy.
1414

15-
![](https://www.mildbyte.xyz/asciicast/splitfiles.gif)
15+
You can explore [40k+ open datasets](https://www.splitgraph.com/explore) in the catalog. You can also connect directly to the [Data Delivery Network](https://www.splitgraph.com/docs/splitgraph-cloud/data-delivery-network) and query any of the datasets, without installing anything.
16+
17+
To install `sgr` (the command line client) or a local Splitgraph Engine, see the [Installation](#installation) section of this readme.
18+
19+
### Build and Query Versioned, Reproducible Datasets
20+
21+
[**Splitfiles**](https://www.splitgraph.com/docs/concepts/splitfiles) give you a declarative language, inspired by Dockerfiles, for expressing data transformations in ordinary SQL familiar to any researcher or business analyst. You can reference other images, or even other databases, with a simple JOIN.
22+
23+
![](pics/splitfile.png)
24+
25+
When you build data with Splitfiles, you get provenance tracking of the resulting data: it's possible to find out what sources went into every dataset and know when to rebuild it if the sources ever change. You can easily integrate Splitgraph into your existing CI pipelines, to keep your data up-to-date and stay on top of changes to upstream sources.
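As a rough illustration of the provenance idea described above, the following Python sketch models "which sources went into which image" as a plain dictionary and walks it to find everything that would need rebuilding when one source changes. This is a conceptual toy, not Splitgraph's implementation or API: the function name `images_to_rebuild`, the `provenance` mapping, and the `example/...` image names are all hypothetical.

```python
from collections import defaultdict, deque


def images_to_rebuild(provenance, changed):
    """Return every image that depends, directly or transitively, on `changed`.

    `provenance` maps an image name to the list of sources it was built from.
    """
    # Invert "image -> sources" into "source -> images built from it".
    dependents = defaultdict(set)
    for image, sources in provenance.items():
        for source in sources:
            dependents[source].add(image)

    # Breadth-first walk over the dependents graph, starting at the changed source.
    stale, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for image in dependents[current]:
            if image not in stale:
                stale.add(image)
                queue.append(image)
    return stale


# Hypothetical image names, for illustration only.
provenance = {
    "example/joined": ["example/raw_a", "example/raw_b"],
    "example/report": ["example/joined"],
    "example/other": ["example/raw_c"],
}

print(sorted(images_to_rebuild(provenance, "example/raw_a")))
```

A CI job could run a check of this shape after each upstream update and rebuild only the images it reports as stale.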
26+
27+
Splitgraph images are also version-controlled, and you can manipulate them with Git-like operations through a CLI. You can check out any image into a PostgreSQL schema and interact with it using any PostgreSQL client. Splitgraph will capture your changes to the data, and then you can commit them as delta-compressed changesets that you can package into new images.
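To make the idea of committing changes as changesets more concrete, here is a minimal Python sketch that stores a new version of a table as a delta against its parent and reconstructs it on checkout. It is only a conceptual model under simplifying assumptions (tables are dictionaries keyed by primary key, and nothing is actually compressed); the helper names `diff`, `apply_changeset`, and `checkout` are hypothetical and are not part of `sgr` or the Splitgraph Python library.

```python
def diff(old, new):
    """Compute a changeset that turns table `old` into table `new`.

    Both tables are dicts mapping a primary key to a row tuple.
    """
    upserts = {pk: row for pk, row in new.items() if pk not in old or old[pk] != row}
    deletes = [pk for pk in old if pk not in new]
    return {"upserts": upserts, "deletes": deletes}


def apply_changeset(table, changeset):
    """Return a new table with the changeset applied; the input is not modified."""
    result = dict(table)
    for pk in changeset["deletes"]:
        result.pop(pk, None)
    result.update(changeset["upserts"])
    return result


def checkout(base, changesets):
    """Rebuild a version by replaying its chain of changesets on top of the base."""
    table = base
    for changeset in changesets:
        table = apply_changeset(table, changeset)
    return table


# Illustrative data: version 2 changes one row, deletes one, and adds one.
v1 = {1: ("apple", 3), 2: ("pear", 5)}
v2 = {1: ("apple", 4), 3: ("plum", 1)}

delta = diff(v1, v2)
assert checkout(v1, [delta]) == v2
```

Only the rows that differ are recorded in `delta`, which is why a chain of small commits can stay much smaller than a full copy of the table per version.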
28+
29+
### Why Splitgraph?
30+
31+
Splitgraph isn't opinionated and doesn't break existing abstractions. To any existing PostgreSQL application, Splitgraph images are just another database. We have carefully designed Splitgraph to not break the abstraction of a PostgreSQL table and wire protocol, because doing otherwise would mean throwing away a vast existing ecosystem of applications, users, libraries and extensions. This means that a lot of tools that work with PostgreSQL work with Splitgraph out of the box.
32+
33+
![](pics/splitfiles.gif)
34+
35+
## Components
36+
37+
The code in this repository, known as **Splitgraph Core**, contains:
38+
39+
- **[`sgr` command line client](https://www.splitgraph.com/docs/architecture/sgr-client)**: `sgr` is the main command line tool used to work with Splitgraph "images" (data snapshots). Use it to ingest data, work with splitfiles, and push data to Splitgraph.com.
40+
- **[Splitgraph Engine](engine/README.md)**: a [Docker image](https://hub.docker.com/r/splitgraph/engine) of the latest Postgres with Splitgraph and other required extensions pre-installed.
41+
- **[Splitgraph Python library](https://www.splitgraph.com/docs/python-api/splitgraph.core)**: All Splitgraph functionality is available in the Python API, offering first-class support for data science workflows including Jupyter notebooks and Pandas dataframes.
42+
43+
## Docs
44+
45+
Documentation is available at https://www.splitgraph.com/docs, specifically:
46+
47+
- [Installation](https://www.splitgraph.com/docs/getting-started/installation)
48+
- [FAQ](https://www.splitgraph.com/docs/getting-started/frequently-asked-questions)
49+
50+
We also recommend reading our Blog, including some of our favorite posts:
51+
52+
- [Supercharging `dbt` with Splitgraph: versioning, sharing, cross-DB joins](https://www.splitgraph.com/blog/dbt)
53+
- [Querying 40,000+ datasets with SQL](https://www.splitgraph.com/blog/40k-sql-datasets)
54+
- [Foreign data wrappers: PostgreSQL's secret weapon?](https://www.splitgraph.com/blog/foreign-data-wrappers)
1655

1756
## Installation
1857

19-
You will need access to [Docker](https://docs.docker.com/install/) as Splitgraph uses it to run
20-
the Splitgraph Engine.
58+
Pre-requisites:
2159

22-
For Linux and OSX, there's a single script:
60+
- Docker is required to run the Splitgraph Engine. `sgr` must have access to Docker. You either need to [install Docker locally](https://docs.docker.com/install/) or have access to a remote Docker socket.
61+
62+
For Linux and OSX, once Docker is running, install Splitgraph with a single script:
2363

2464
```
2565
$ bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
2666
```
2767

28-
This script downloads the `sgr` binary and sets up the Splitgraph Engine Docker container.
68+
This will download the `sgr` binary and set up the Splitgraph Engine Docker container.
2969

3070
Alternatively, you can get the `sgr` single binary from [the releases page](https://github.com/splitgraph/splitgraph/releases) and run [`sgr engine add`](https://www.splitgraph.com/docs/sgr/engine-management/engine-add) to create an engine.
3171

@@ -39,13 +79,15 @@ Alternatively, Splitgraph comes with plenty of [examples](examples) to get you s
3979

4080
If you're stuck or have any questions, check out the [documentation](https://www.splitgraph.com/docs/) or join our [Discord channel](https://discord.gg/4Qe2fYA)!
4181

42-
## Setting up a development environment
82+
## Contributing
83+
84+
### Setting up a development environment
4385

4486
* Splitgraph requires Python 3.6 or later.
4587
* Install [Poetry](https://github.com/python-poetry/poetry): `curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python` to manage dependencies
4688
* Install pre-commit hooks (we use [Black](https://github.com/psf/black) to format code)
47-
* `git clone https://github.com/splitgraph/splitgraph.git`
48-
* `poetry install`
89+
* `git clone --recurse-submodules https://github.com/splitgraph/splitgraph.git`
90+
* `poetry install`
4991
* To build the [engine](https://www.splitgraph.com/docs/architecture/splitgraph-engine) Docker image: `cd engine && make`
5092

5193
### Running tests
@@ -66,11 +108,11 @@ docker-compose -f test/architecture/docker-compose.core.yml up -d
66108
poetry run pytest -m "not mounting and not example"
67109
```
68110

69-
To run the test suite related to "mounting" and importing data from other databases
111+
To run the test suite related to "mounting" and importing data from other databases
70112
(PostgreSQL, MySQL, Mongo), do
71113

72114
```
73-
docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.core.yml up -d
115+
docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.core.yml up -d
74116
poetry run pytest -m mounting
75117
```
76118

@@ -82,4 +124,4 @@ docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture
82124
poetry run pytest -m example
83125
```
84126

85-
All of these tests run in [CI](https://github.com/splitgraph/splitgraph/actions).
127+
All of these tests run in [CI](https://github.com/splitgraph/splitgraph/actions).

pics/splitfile.png

20.5 KB

pics/splitfiles.gif

185 KB
