6 changes: 6 additions & 0 deletions .gitignore
@@ -16,3 +16,9 @@ cookies.txt

# Don't commit logs
log/**


# Don't commit SSH keys
id_ed25519
id_rsa
*.pub
81 changes: 56 additions & 25 deletions README.md
@@ -46,11 +46,7 @@ library code.
graph of required docker images that represent the production environment.

`docker-compose-dev.yml` is a similar file which sets up a dev environment,
with Redis and a MariaDB server for the `enwp10` database. Use it like so

```bash
docker compose -f docker-compose-dev.yml up -d
```
with Redis and a MariaDB server for the `enwp10` database. Through profiles like `zimfarm` and `zimfarm-worker`, you can start the Zimfarm containers required to execute a task.

`docker-compose-test.yml` is another docker file which sets up the test db
for python "nosetests" (unit tests). Run it similarly:
@@ -199,11 +195,62 @@ Before you run the docker-compose command below, you must copy the file
section for `STORAGE`, if you wish to properly materialize builder lists into
backend selections.

After that is done, use the following command to run the dev environment:
### Setting up the development services

```bash
docker compose -f docker-compose-dev.yml up -d
```
The dev stack groups its optional containers into Compose profiles. The `zimfarm` profile sets up a local Zimfarm DB, API and UI.
The `zimfarm-worker` profile sets up a local Zimfarm worker manager and receiver that stores the results/files of tasks.

The first time you run the dev stack, you need to create offliners and a "virtual" worker in the Zimfarm DB. Start the services without the `zimfarm-worker` profile until you have registered a worker.

You may need to install the `jq` tool; see [these instructions](https://github.com/jqlang/jq/wiki/Installation).

#### Registering a worker

- Start the dev stack without a Zimfarm worker for now

```sh
docker compose -f docker-compose-dev.yml --profile zimfarm up --pull always --build
```

This starts the API and creates an admin user with username `admin` and password `admin`.

- Register offliners in the database

```sh
cd docker/zimfarm
./create_offliners.sh
```

This pulls the various versions of the mwoffliner definition schema from the Zimfarm API
and registers them with your Dockerized Zimfarm API. These definitions are
necessary as they contain the latest parameters needed to run the `mwoffliner`
scraper.

In your `credentials.py`, set the definition version to any of the versions pulled from the API. For example, if `1.17.2` was one of the downloaded definitions of the mwoffliner scraper, set `definition_version` under the `ZIMFARM` section:

```py
"ZIMFARM": {
    "definition_version": "1.17.2",
    "image": "ghcr.io/openzim/mwoffliner:1.17.2",
    # other configurations for zimfarm follow...
}
```
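
A quick sanity check for this setting might look like the sketch below. It assumes, as in the example above, that the tag of `image` should equal `definition_version`; that rule and the helper name `image_matches_definition` are illustrative and not taken from the project.

```python
def image_matches_definition(zimfarm_cfg: dict) -> bool:
    """Return True when the image tag equals the configured definition version.

    Assumption (not stated by the project): the two values are meant to agree.
    """
    image = zimfarm_cfg.get("image", "")
    # Look for a tag only in the last path component, so that a registry port
    # (e.g. localhost:5000/...) is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    _, sep, tag = last_component.partition(":")
    return bool(sep) and tag == zimfarm_cfg.get("definition_version")


if __name__ == "__main__":
    cfg = {
        "definition_version": "1.17.2",
        "image": "ghcr.io/openzim/mwoffliner:1.17.2",
    }
    print(image_matches_definition(cfg))
```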

- Register a test Zimfarm worker

```sh
cd docker/zimfarm
./create_worker.sh
```

This registers a worker with username `test_worker` and generates SSH keys for it to authenticate with the Zimfarm API. The worker is configured with 3 CPUs, 20 GB of RAM and 20 GB of disk.

- Restart the dev stack, this time with a Zimfarm worker

```sh
docker compose -f docker-compose-dev.yml --profile zimfarm --profile zimfarm-worker \
up -d
```
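
The two-phase startup described above can be sketched as a small helper that decides which profiles to enable. This is only an illustration: `profiles_for` and `compose_command` are hypothetical names, and checking for `docker/zimfarm/id_ed25519` assumes that `create_worker.sh` writes the worker key to the path mounted in `docker-compose-dev.yml`.

```python
from pathlib import Path

COMPOSE_FILE = "docker-compose-dev.yml"
# Assumption: create_worker.sh leaves the worker's private key here, which is
# the path docker-compose-dev.yml mounts into the worker manager.
WORKER_KEY = Path("docker/zimfarm/id_ed25519")


def profiles_for(worker_registered: bool) -> list[str]:
    """Return the Compose profiles to enable for this run."""
    profiles = ["zimfarm"]
    if worker_registered:
        profiles.append("zimfarm-worker")
    return profiles


def compose_command(profiles: list[str], detach: bool = True) -> list[str]:
    """Build the `docker compose ... up` argument list for the given profiles."""
    cmd = ["docker", "compose", "-f", COMPOSE_FILE]
    for profile in profiles:
        cmd += ["--profile", profile]
    cmd.append("up")
    if detach:
        cmd.append("-d")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(compose_command(profiles_for(WORKER_KEY.exists()))))
```

Printing the argument list rather than executing it keeps the sketch safe to run anywhere; passing the list to `subprocess.run` would be the natural next step.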

## Migrating and updating the dev database.

@@ -242,22 +289,6 @@ If you wish to connect to a wiki replica database on toolforge, you will need
to fill out your credentials in WIKIDB section. This is not required for
developing the frontend.

## Running a ZIM Farm

If you wish to run a ZIM Farm instance for testing purposes, the easiest way is to
clone the zimfarm repository and then setup a development instance of it:

```bash
git clone https://github.com/openzim/zimfarm.git
cd zimfarm/dev
docker compose -p zimfarm up -d
```

For detailed setup instructions, refer to `dev/README.md` in the zimfarm repository.
The `ZIMFARM` section in your `credentials.py` file contains pre-configured default
values for the development instance. If you encounter connection issues, verify
these credentials match your local setup.

## Development overlay

The API server has a built-in development overlay, currently used for manual
113 changes: 110 additions & 3 deletions docker-compose-dev.yml
@@ -4,7 +4,7 @@ services:
    image: redis
    container_name: wp1bot-redis-dev
    ports:
      - '9736:6379'
      - 9736:6379
    networks:
      - wp1bot-dev
    restart: always
@@ -19,7 +19,7 @@ services:
    build: docker/dev-db/
    container_name: wp1bot-db-dev
    ports:
      - '6300:3306'
      - 6300:3306
    networks:
      - wp1bot-dev
    restart: always
@@ -40,7 +40,7 @@ services:
      - wp1bot-dev
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      test: ['CMD', 'curl', '-f', 'http://localhost:9000/minio/health/live']
      start_period: 30s
      interval: 30s
      timeout: 10s
@@ -78,8 +78,115 @@ services:
      minio:
        condition: service_started

  dev-web:
    build:
      context: .
      dockerfile: docker/web/Dockerfile
    container_name: wp1bot-web-dev
    environment:
      - FLASK_DEBUG=1
      - FLASK_RUN_HOST=0.0.0.0
    command: flask --app wp1.web.app run
    networks:
      - wp1bot-dev
    ports:
      - 5000:5000
    volumes:
      - ./wp1/credentials.py.dev:/usr/src/app/wp1/credentials.py
    links:
      - redis
    restart: always
    depends_on:
      redis:
        condition: service_healthy

  zimfarm-db:
    image: postgres:17.3-bookworm
    container_name: zimfarm-postgresdb
    ports:
      - 127.0.0.1:2345:5432
    volumes:
      - zimfarm-data:/var/lib/postgresql/data
      - ./docker/zimfarm/postgres-initdb:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_DB=zimfarm
      - POSTGRES_USER=zimfarm
      - POSTGRES_PASSWORD=zimpass
    healthcheck:
      test: ['CMD', 'pg_isready', '-q', '-d', 'dbname=zimfarm user=zimfarm']
      interval: 10s
      timeout: 5s
      retries: 3
    networks:
      - wp1bot-dev
    profiles:
      - zimfarm

  zimfarm-api:
    image: ghcr.io/openzim/zimfarm-backend:latest
    container_name: zimfarm-api
    ports:
      - 127.0.0.1:8004:80
    environment:
      BINDING_HOST: 0.0.0.0
      JWT_SECRET: DH8kSxcflUVfNRdkEiJJCn2dOOKI3qfw
      POSTGRES_URI: postgresql+psycopg://zimfarm:zimpass@zimfarm-db:5432/zimfarm
      ALEMBIC_UPGRADE_HEAD_ON_START: true
      INIT_USERNAME: admin
      INIT_PASSWORD: admin
      ALLOWED_ORIGINS: http://localhost:8003
      ARTIFACTS_UPLOAD_URI: s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-artifacts
      LOGS_UPLOAD_URI: s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-logs
      ZIM_UPLOAD_URI: s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-zims
    networks:
      - wp1bot-dev
    depends_on:
      zimfarm-db:
        condition: service_healthy
    profiles:
      - zimfarm

  zimfarm-ui:
    image: ghcr.io/openzim/zimfarm-ui:latest
    container_name: zimfarm-ui
    ports:
      - 127.0.0.1:8003:80
    volumes:
      - ./docker/zimfarm/zimfarm_ui_dev/config.json:/usr/share/nginx/html/config.json:ro
    depends_on:
      zimfarm-api:
        condition: service_healthy
    networks:
      - wp1bot-dev
    profiles:
      - zimfarm

  zimfarm-worker-manager:
    image: ghcr.io/openzim/zimfarm-worker-manager:latest
    container_name: zimfarm-worker-manager
    depends_on:
      zimfarm-api:
        condition: service_healthy
    command: worker-manager --webapi-uri 'http://zimfarm-api:80/v2' --username test_worker --name test_worker
    environment:
      - DEBUG=true
      - TASK_WORKER_IMAGE=ghcr.io/openzim/zimfarm-task-worker:latest
      - ENVIRONMENT=development
      - ZIMFARM_DISK=20Gb
      - ZIMFARM_MEMORY=20Gb
      - ZIMFARM_CPU=3
      - POLL_INTERVAL=10
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./docker/zimfarm/id_ed25519:/etc/ssh/keys/zimfarm
    networks:
      - wp1bot-dev
    profiles:
      - zimfarm-worker

networks:
  wp1bot-dev:

volumes:
  minio-data:
  zimfarm-data:
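
The three upload URIs configured for `zimfarm-api` each name a bucket that has to exist in MinIO. A minimal sketch of cross-checking them against the `BUCKETS` list in `docker/minio/setup-buckets.sh` follows. The helpers `bucket_of` and `missing_buckets` are hypothetical; the URIs and bucket names are copied from this diff.

```python
from urllib.parse import parse_qs, urlsplit

# Values copied from the zimfarm-api environment in docker-compose-dev.yml.
UPLOAD_URIS = {
    "ARTIFACTS_UPLOAD_URI": "s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-artifacts",
    "LOGS_UPLOAD_URI": "s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-logs",
    "ZIM_UPLOAD_URI": "s3+http://minio:9000/?keyId=minio_key&secretAccessKey=minio_secret&bucketName=org-kiwix-dev-zims",
}

# Values copied from BUCKETS in docker/minio/setup-buckets.sh.
BUCKETS = [
    "org-kiwix-dev-wp1",
    "org-kiwix-dev-artifacts",
    "org-kiwix-dev-logs",
    "org-kiwix-dev-zims",
    "org-kiwix-dev-cache",
]


def bucket_of(upload_uri: str) -> str:
    """Extract the bucketName query parameter from an upload URI."""
    query = parse_qs(urlsplit(upload_uri).query)
    return query["bucketName"][0]


def missing_buckets(uris, buckets) -> list[str]:
    """Return buckets referenced by the URIs but absent from the bucket list."""
    return sorted({bucket_of(uri) for uri in uris} - set(buckets))


if __name__ == "__main__":
    print(missing_buckets(UPLOAD_URIS.values(), BUCKETS))
```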
15 changes: 9 additions & 6 deletions docker/minio/setup-buckets.sh
@@ -1,21 +1,24 @@
#!/bin/sh
#!/bin/bash

MINIO_ALIAS="dev_minio"
MINIO_URL="http://minio:9000"
MINIO_USER="minio_key"
MINIO_PASSWORD="minio_secret"
BUCKET_NAME="org-kiwix-dev-wp1"
BUCKETS=("org-kiwix-dev-wp1" "org-kiwix-dev-artifacts" "org-kiwix-dev-logs" "org-kiwix-dev-zims" "org-kiwix-dev-cache")

# Wait for MinIO
echo "Waiting for MinIO to be ready..."
for i in $(seq 1 30); do
  # Set up MinIO alias if it's ready
  /usr/bin/mc alias set $MINIO_ALIAS $MINIO_URL $MINIO_USER $MINIO_PASSWORD && echo "MinIO is ready!" && break
  sleep 2
  [ $i -eq 30 ] && { echo "ERROR: MinIO timeout"; exit 1; }
  [ "$i" -eq 30 ] && { echo "ERROR: MinIO timeout"; exit 1; }
done

# Setup bucket
/usr/bin/mc mb $MINIO_ALIAS/$BUCKET_NAME --ignore-existing # Create bucket if it doesn't exist
/usr/bin/mc anonymous set public $MINIO_ALIAS/$BUCKET_NAME # Set bucket to public
# Setup buckets
for bucket in "${BUCKETS[@]}"; do
  echo "Setting up bucket: $bucket"
  /usr/bin/mc mb "$MINIO_ALIAS/$bucket" --ignore-existing # Create bucket if it doesn't exist
  /usr/bin/mc anonymous set public "$MINIO_ALIAS/$bucket" # Set bucket to public
done
echo "MinIO setup complete"
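
The wait loop in this script is a bounded retry: up to 30 attempts, two seconds apart, exiting with an error once the attempts run out. The same pattern is sketched below in Python with a hypothetical `wait_until` helper. The `probe` callable stands in for the `mc alias set` call and is not part of the project.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe until it returns True; return the attempt number that succeeded.

    Raises TimeoutError when every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        # Unlike the shell loop, do not sleep after the final failed attempt.
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"probe did not succeed after {attempts} attempts")


if __name__ == "__main__":
    results = iter([False, False, True])  # simulated service: ready on the third try
    print(wait_until(lambda: next(results), delay=0.0))
```

Injecting `sleep` as a parameter is a design choice that lets the loop be exercised in tests without real waiting.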