
Commit fc16dd7

Merge pull request #6043 from gchq/5994-document-cdk-use
Issue 5994 - Document deployment with CDK directly
2 parents: 8149a17 + 975babb

File tree

34 files changed: +685 -510 lines changed

docs/deployment-guide.md

Lines changed: 41 additions & 253 deletions
Large diffs are not rendered by default.

docs/deployment/deploy-with-cdk.md

Lines changed: 136 additions & 0 deletions

Deployment with the CDK
=======================

Sleeper is deployed with the AWS Cloud Development Kit (CDK). This can be done either with scripts as described in the [deployment guide](../deployment-guide.md#scripted-deployment), or by using the CDK directly. This document covers deployment using the CDK CLI directly.

### Uploading artefacts to AWS

Some jars and Docker images must be uploaded to AWS before you can deploy an instance of Sleeper. We have a CDK app `SleeperArtefactsCdkApp` which creates an S3 bucket and ECR repositories to hold these artefacts, but does not upload them. You can also include this in your own CDK app with `SleeperArtefacts`. You can use our tools to upload the artefacts as a separate step, or implement your own way to do this that may be specific to your Maven and Docker repositories.

The scripted deployment uploads the jars from the local `scripts/jars` directory within the Git repository. The Docker images are either built from the local `scripts/docker` directory, or pulled from a remote repository if that is configured. You could replicate that behaviour yourself with the script `scripts/deploy/uploadArtefacts.sh`, use our Java classes `SyncJars` and `UploadDockerImagesToEcr`, or implement your own way to upload these artefacts.

As part of `scripts/build/build.sh`, the jars are built and output to `scripts/jars`, and the Docker builds are prepared in separate directories for each Docker image under `scripts/docker`. You can also use our [publishing tools](../development/publishing.md) to prepare the artefacts.

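As a minimal sketch, assuming you're at the root of the Git repository, this is how you might produce those artefacts locally before uploading them:

```bash
# Build the jars and prepare the Docker build directories
./scripts/build/build.sh

# The outputs to upload
ls scripts/jars
ls scripts/docker
```
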
It's important to upload artefacts from within AWS, as uploads from outside AWS can be lengthy. Usually this is done from an EC2 instance.

#### `uploadArtefacts.sh`

This script can upload artefacts to an existing CDK deployment. You can either pass in the deployment ID that you used for the CDK deployment, or pass in an instance properties file for an instance that is configured to use that artefacts deployment. In the latter case, Docker images will only be uploaded if they are required by your instance configuration. Run `uploadArtefacts.sh --help` for details.

By default, the artefacts deployment ID should match the instance ID. Alternatively, you can set the deployment ID in the instance property [`sleeper.artefacts.deployment`](../usage/properties/instance/user/common.md).

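As a hypothetical sketch, you could point an instance at a separately-named artefacts deployment by appending that property to its instance properties file:

```bash
# Hypothetical example: use the artefacts deployment "my-deployment" for this instance
echo "sleeper.artefacts.deployment=my-deployment" >> /path/to/instance.properties
```
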
Here's an example with a CDK command to create an artefacts deployment, and a call to the script to upload all artefacts to that deployment:

```bash
DEPLOYMENT_ID=my-deployment
cdk deploy --all -c id=$DEPLOYMENT_ID -a "java -cp ./scripts/jars/cdk-<version>.jar sleeper.cdk.SleeperArtefactsCdkApp"
./scripts/deploy/uploadArtefacts.sh --id $DEPLOYMENT_ID
```

#### Direct upload

If you prefer to implement this yourself, details of the Docker images to be uploaded can be found in the [Docker images document](/docs/deployment/docker-images.md). That document includes details of how to build and push the images to ECR, as is done by the automated scripts.

You'll also need to create an S3 bucket for jars, and upload the contents of the `scripts/jars` directory to it. That directory is created during a build, or during installation of a published version. The jars S3 bucket needs to have versioning enabled so we can tie a CDK deployment to specific versions of each jar.

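Here's a minimal sketch with the AWS CLI, assuming an example bucket name and the eu-west-2 region:

```bash
JARS_BUCKET=my-sleeper-jars # An example name, choose your own

# Create the bucket with versioning enabled
aws s3api create-bucket --bucket $JARS_BUCKET --region eu-west-2 \
    --create-bucket-configuration LocationConstraint=eu-west-2
aws s3api put-bucket-versioning --bucket $JARS_BUCKET \
    --versioning-configuration Status=Enabled

# Upload the jars built under scripts/jars
aws s3 sync ./scripts/jars s3://$JARS_BUCKET
```
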
When not using an artefacts CDK deployment, you can set the instance properties `sleeper.jars.bucket` and `sleeper.ecr.repository.prefix` instead of `sleeper.artefacts.deployment`.

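As a hypothetical sketch, with a bucket and repository prefix you created yourself, that could look like this:

```bash
# Hypothetical example values: reference artefacts uploaded without an artefacts CDK deployment
cat >> /path/to/instance.properties << EOF
sleeper.jars.bucket=my-sleeper-jars
sleeper.ecr.repository.prefix=my-ecr-prefix
EOF
```
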
### Including Sleeper in your CDK app

Sleeper supports deployment as part of your own CDK app, either as its own stack or as a nested stack under your stack. If you have published Sleeper to a Maven repository as described in the [publishing guide](../development/publishing.md), you can add the Sleeper CDK module as a Maven dependency like this:

```xml
<dependency>
    <groupId>sleeper</groupId>
    <artifactId>cdk</artifactId>
    <version>version.number.here</version>
</dependency>
```

Use the class `SleeperInstance` to add instances of Sleeper to your app. To load instance and table properties from the local file system you can use `SleeperInstanceConfiguration.fromLocalConfiguration`, as in this example:

```java
Stack stack = Stack.Builder.create(app, "MyStack")
        .stackName("my-stack")
        .env(environment)
        .build();
SleeperInstanceConfiguration myInstanceConfig = SleeperInstanceConfiguration.fromLocalConfiguration(
        workingDir.resolve("my-instance/instance.properties"));
SleeperInstance.createAsNestedStack(stack, "MyInstance",
        NestedStackProps.builder()
                .description("My instance")
                .build(),
        SleeperInstanceProps.builder(myInstanceConfig, s3Client, dynamoClient)
                .deployPaused(false)
                .build());
```

### Using the CDK CLI

To deploy a Sleeper instance to AWS with the CDK, you need an [instance configuration](instance-configuration.md) and a [suitable environment](environment-setup.md). The artefacts will need to be uploaded as described in the section above. You can either use the instance ID as the deployment ID for the artefacts, or you can point to your artefacts deployment with the instance property `sleeper.artefacts.deployment`.

You can use the same CDK apps used by the automated scripts, or your own CDK configuration. We'll give examples with the CDK apps used by the automated scripts. The following commands will deploy a Sleeper instance:

```bash
INSTANCE_PROPERTIES=/path/to/instance.properties
SCRIPTS_DIR=./scripts # This is from the root of the Sleeper Git repository
VERSION=$(cat "$SCRIPTS_DIR/templates/version.txt")
cdk deploy --all -c propertiesfile=$INSTANCE_PROPERTIES -c newinstance=true -a "java -cp $SCRIPTS_DIR/jars/cdk-$VERSION.jar sleeper.cdk.SleeperCdkApp"
```

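If you'd like to check which stacks the app defines before deploying, the CDK CLI can list them without deploying anything. Here's a sketch with the same app and context arguments as above:

```bash
# List the stacks that cdk deploy --all would create
cdk list -c propertiesfile=$INSTANCE_PROPERTIES -c newinstance=true -a "java -cp $SCRIPTS_DIR/jars/cdk-$VERSION.jar sleeper.cdk.SleeperCdkApp"
```
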
To avoid having to explicitly give approval for deploying all the stacks, you can add `--require-approval never` to the command.

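For example, the deploy command above becomes:

```bash
cdk deploy --all --require-approval never -c propertiesfile=$INSTANCE_PROPERTIES -c newinstance=true -a "java -cp $SCRIPTS_DIR/jars/cdk-$VERSION.jar sleeper.cdk.SleeperCdkApp"
```
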
If you'd like to include data generation for system tests, use the system test CDK app instead:

```bash
INSTANCE_PROPERTIES=/path/to/instance.properties
SCRIPTS_DIR=./scripts # This is from the root of the Sleeper Git repository
VERSION=$(cat "$SCRIPTS_DIR/templates/version.txt")
cdk deploy --all -c propertiesfile=$INSTANCE_PROPERTIES -c newinstance=true -a "java -cp $SCRIPTS_DIR/jars/system-test-$VERSION-utility.jar sleeper.systemtest.cdk.SystemTestApp"
```

#### Tear down

If the artefacts and the Sleeper instance are each deployed in their own CDK app, with `SleeperArtefactsCdkApp` and `SleeperCdkApp`, you can tear down an instance of Sleeper either by deleting the CloudFormation stacks, or with the CDK CLI. You may need to delete the Sleeper instance before deleting the artefacts used to deploy it. Here's an example:

```bash
INSTANCE_PROPERTIES=/path/to/instance.properties
ID=my-instance-id
SCRIPTS_DIR=./scripts # From the root of the Sleeper Git repository
VERSION=$(cat "$SCRIPTS_DIR/templates/version.txt")

cdk destroy --all -c propertiesfile=$INSTANCE_PROPERTIES -c validate=false -a "java -cp $SCRIPTS_DIR/jars/cdk-$VERSION.jar sleeper.cdk.SleeperCdkApp"
cdk destroy --all -c id=$ID -a "java -cp $SCRIPTS_DIR/jars/cdk-$VERSION.jar sleeper.cdk.SleeperArtefactsCdkApp"
```

docs/deployment/docker-images.md

Lines changed: 138 additions & 0 deletions

Docker images deployed in Sleeper
=================================

A deployment of Sleeper includes components that run in Docker containers. This document lists the Docker images that are used in Sleeper, how to build them, and how to make them available for deployment.

The easiest way to build and deploy these images is with our automated scripts. See the [deployment guide](../deployment-guide.md) and [deployment with the CDK](./deploy-with-cdk.md) for more information. The information below may be useful if you prefer to replicate this yourself.

## Docker deployment images

A build of Sleeper outputs several directories under `scripts/docker`. Each is a build directory for a Docker image, containing a Dockerfile. Some of these are used for parts of Sleeper that are always deployed from Docker images; those are listed in the table below, with the following columns:

* Deployment Name - Both the name of the image's directory under `scripts/docker`, and the name of the image when it's built and of the repository it's uploaded to.
* Optional Stack - Each image is associated with an optional stack, and will only be used when that optional stack is deployed in an instance of Sleeper.
* Multiplatform - Compaction job execution is built as a multiplatform image, so it can be deployed on both x86 and ARM architectures.

| Deployment Name             | Optional Stack     | Multiplatform |
|-----------------------------|--------------------|---------------|
| ingest                      | IngestStack        | false         |
| bulk-import-runner          | EksBulkImportStack | false         |
| compaction-job-execution    | CompactionStack    | true          |
| bulk-export-task-execution  | BulkExportStack    | false         |

## Lambda images

Most lambdas are deployed from a jar in the jars bucket. Some must be deployed as a Docker container instead, as there's a limit on the size of a jar that can be deployed as a lambda, and container images allow much larger deployments. There is also an option to deploy all lambdas as Docker containers.

All lambda Docker images are built from the Docker build directory that's output during a build of Sleeper at `scripts/docker/lambda`. To build a Docker image for a lambda, we copy its jar file from `scripts/jars` to `scripts/docker/lambda/lambda.jar`, and then run the Docker build for that directory. This results in a separate Docker image for each lambda jar. The table below lists these images, with the following columns:

* Filename - The name of the jar file that's output by the build in `scripts/jars`. It includes the version number you've built, shown as a placeholder here.
* Image Name - The name of the Docker image that's built, and of the repository it's uploaded to.
* Always Docker deploy - Whether the lambda will always be deployed with Docker, usually because the jar is too large to deploy directly.

| Filename                                             | Image Name                        | Always Docker deploy |
|------------------------------------------------------|-----------------------------------|----------------------|
| athena-`<version-number>`.jar                        | athena-lambda                     | true                 |
| bulk-import-starter-`<version-number>`.jar           | bulk-import-starter-lambda        | false                |
| bulk-export-planner-`<version-number>`.jar           | bulk-export-planner               | false                |
| bulk-export-task-creator-`<version-number>`.jar      | bulk-export-task-creator          | false                |
| ingest-taskrunner-`<version-number>`.jar             | ingest-task-creator-lambda        | false                |
| ingest-batcher-submitter-`<version-number>`.jar      | ingest-batcher-submitter-lambda   | false                |
| ingest-batcher-job-creator-`<version-number>`.jar    | ingest-batcher-job-creator-lambda | false                |
| lambda-garbagecollector-`<version-number>`.jar       | garbage-collector-lambda          | false                |
| lambda-jobSpecCreationLambda-`<version-number>`.jar  | compaction-job-creator-lambda     | false                |
| runningjobs-`<version-number>`.jar                   | compaction-task-creator-lambda    | false                |
| lambda-splitter-`<version-number>`.jar               | partition-splitter-lambda         | false                |
| query-`<version-number>`.jar                         | query-lambda                      | true                 |
| cdk-custom-resources-`<version-number>`.jar          | custom-resources-lambda           | false                |
| metrics-`<version-number>`.jar                       | metrics-lambda                    | false                |
| statestore-lambda-`<version-number>`.jar             | statestore-lambda                 | false                |

## Building and pushing

See the [deployment guide](../deployment-guide.md) and [deployment with the CDK](./deploy-with-cdk.md) for information on available scripts and code that automate building these images. This is done automatically in any of the deployment scripts. We'll look at some examples of how to match the behaviour of those scripts.

We'll start by creating some environment variables for convenience:

```bash
INSTANCE_ID=<insert-a-unique-id-for-the-sleeper-instance-here>
ACCOUNT=<your-account-id>
REGION=eu-west-2
DOCKER_REGISTRY=$ACCOUNT.dkr.ecr.$REGION.amazonaws.com
REPO_PREFIX=${DOCKER_REGISTRY}/${INSTANCE_ID}
SCRIPTS_DIR=./scripts # This is from the root of the Sleeper Git repository
VERSION=$(cat "$SCRIPTS_DIR/templates/version.txt")
```

Then log into ECR:

```bash
aws ecr get-login-password --region $REGION | docker login --username AWS --password-stdin $DOCKER_REGISTRY
```

The value of the `REPO_PREFIX` environment variable could later be used as the value of the instance property [`sleeper.ecr.repository.prefix`](../usage/properties/instance/user/common.md).

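For example, here's a sketch that records it against a hypothetical instance properties file path:

```bash
# Hypothetical example: record the repository prefix for this instance
echo "sleeper.ecr.repository.prefix=$REPO_PREFIX" >> /path/to/instance.properties
```
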
### Docker deployments

Here's an example of commands to build and push a non-multiplatform image from the `scripts/docker` directory:

```bash
TAG=$REPO_PREFIX/ingest:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/ingest
docker build -t $TAG $SCRIPTS_DIR/docker/ingest
docker push $TAG
```

### Multiplatform images

For a multiplatform image, e.g. to run on AWS Graviton on the ARM64 architecture, we need a Docker builder suitable for this.

These commands will create or recreate a builder:

```bash
docker buildx rm sleeper || true
docker buildx create --name sleeper --use
```

This also requires a slightly different command to build and push. This must be done as a single command, as the builder does not automatically add the image to the Docker Engine image store:

```bash
TAG=$REPO_PREFIX/compaction-job-execution:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/compaction-job-execution
docker buildx build --platform linux/amd64,linux/arm64 -t $TAG --push $SCRIPTS_DIR/docker/compaction-job-execution
```

### Lambdas

For a lambda, the jar must be copied into the build directory before the build. Provenance must also be disabled for the image to be supported by AWS Lambda. Here's an example:

```bash
TAG=$REPO_PREFIX/query-lambda:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/query-lambda
# All lambda images are built from the shared scripts/docker/lambda directory
cp $SCRIPTS_DIR/jars/query-$VERSION.jar $SCRIPTS_DIR/docker/lambda/lambda.jar
docker build --provenance=false -t $TAG $SCRIPTS_DIR/docker/lambda
docker push $TAG
```

docs/deployment/images-to-upload.md

Lines changed: 0 additions & 44 deletions
This file was deleted.
