59 changes: 40 additions & 19 deletions docs/Storage/Long_Term_Storage/Freezer_Guide.md
Expand Up @@ -8,7 +8,7 @@
!!! info s3cmd configuration required
Please ensure you have [configured](Configuring_s3cmd.md) the s3cmd tool.

## Using s3cmd tool to interact with Freezer

Check warning on line 11 (GitHub Actions / Check Prose, lexical_illusions): 'There's a lexical illusion in 'Freezer

Check warning on line 11 (GitHub Actions / Check page meta, walk_toc): Header 'Using s3cmd tool to interact with Freezer' is too long. Try to keep it under 32 characters to avoid word wrapping in the toc.

Freezer uses the AWS S3 standard as a protocol for temporarily hosting data prior to writing it to tape.
All data is stored temporarily in buckets before being written to tape. A bucket is similar to a folder on a filesystem, but designed for scalable storage.
Expand All @@ -20,11 +20,9 @@

Please note that your bucket has the same name as your Freezer allocation. If you have forgotten the name of your bucket, please <a href="mailto:[email protected]?subject=Forgot%20my%20Freezer%20bucket%20name">email us</a> and let us know which project this is for.



## List contents and buckets

### Get information about a Freezer bucket

Check warning on line 25 (GitHub Actions / Check page meta, walk_toc): Header 'Get information about a Freezer bucket' is too long. Try to keep it under 32 characters to avoid word wrapping in the toc.

To list all the users that have access to a Freezer bucket, type into the terminal:

Expand Down Expand Up @@ -80,7 +78,7 @@
s3cmd la
```

### Storage usage by specific bucket

Check warning on line 81 (GitHub Actions / Check page meta, walk_toc): Header 'Storage usage by specific bucket' is too long. Try to keep it under 32 characters to avoid word wrapping in the toc.

```sh
s3cmd du -H s3://<freezer-bucket>
Expand All @@ -90,12 +88,29 @@
`s3cmd du -H` without specifying a bucket is only available for project owners.

!!! warning

    If you have a large number of files, the `s3cmd du` command will fail. To get usage information from `s3cmd du`, we advise using an archiving command such as `tar` to reduce the total number of files before adding them to Freezer.

## Uploading objects

### Synchronise data
### Step 1: Tarballing files

Check warning on line 95 (GitHub Actions / Check Spelling, spelling): Word 'Tarballing' is misspelled.

If you have lots of small files (less than 1 GB each), we recommend that you tarball them before uploading them to Freezer, because many small files take a long time to upload to and download from Freezer. Tarballing combines all your files into one big file that is much easier for Freezer to handle.
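
To judge whether a folder is a good candidate for tarballing, it can help to count how many of its files fall under the 1 GB guideline. The following is a minimal sketch rather than an official tool: the function name `count_small_files` is hypothetical, 1 GiB is used as an approximation of 1 GB, and file names containing newlines would be miscounted.

```sh
# Count regular files under a directory that are smaller than 1 GiB.
# count_small_files is a hypothetical helper, not part of s3cmd or Freezer.
count_small_files() {
    dir="$1"
    # 1073741824 bytes = 1 GiB. The 'c' suffix makes find compare exact byte
    # counts instead of rounding sizes up to whole units.
    find "$dir" -type f -size -1073741824c | wc -l | tr -d ' '
}

# Usage (replace the placeholder with your own folder):
# count_small_files "<name of folder to tarball>"
```

A high count suggests that tarballing the folder first is worthwhile.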

Check warning on line 97 (GitHub Actions / Check Spelling, spelling): Word 'Tarballing' is misspelled.

To tarball your files, run the following on Mahuika:

Check warning on line 99 (GitHub Actions / Check Spelling, spelling): Word 'mahuika' is misspelled.

```sh
tar -cvf <name of tarball>.tar <name of folder to tarball>
```

> **Review comment** (Contributor, medium)
>
> Ahoy! This be a fine addition to the guide. But a true pirate saves his space on the high seas of storage! I reckon ye should tell the landlubbers to compress their treasures. It'll save 'em space and time. Addin' a `z` for gzip compression and a `v` to see the files bein' added to the chest would be a grand idea. The other scrolls, like `Other_Useful_Commands.md`, speak of usin' `tar -czf`. Let's keep our maps consistent, eh?
>
> Also, be sure to update the filename in the explanation below to `<name of tarball>.tar.gz` to match.
>
> Suggested change: replace `tar -cvf <name of tarball>.tar <name of folder to tarball>` with `tar -czvf <name of tarball>.tar.gz <name of folder to tarball>`.

where:

* `<name of tarball>`: Replace this with the name you want to give to the tarball
* `<name of folder to tarball>`: Replace this with the name of the folder containing all the small files you want to tarball.
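
Before uploading, it may be worth confirming that the tarball contains everything you expect. The following is a minimal sketch of one way to check this, using a scratch directory and made-up names (`my_small_files`, `my_small_files.tar`, `check`) that stand in for your own data:

```sh
# Work in a scratch directory so the demo does not touch real data.
# All names below are hypothetical placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small example folder to stand in for your own files.
mkdir -p my_small_files/raw_data
echo "sample" > my_small_files/raw_data/test1.txt
echo "sample" > my_small_files/test2.txt

# Create the tarball.
tar -cvf my_small_files.tar my_small_files

# List the archive contents without extracting anything.
tar -tf my_small_files.tar

# Extract into a separate directory and compare against the source folder.
mkdir check
tar -xf my_small_files.tar -C check
diff -r my_small_files check/my_small_files && echo "tarball matches the source folder"
```

`diff -r` prints nothing when the two directory trees are identical, so the final message only appears when the round trip succeeded.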

!!! tip
    If you are not sure whether you need to tarball your files, feel free to {% include "partials/support_request.html" %}. We can talk you through which files and folders are best to tarball.

### Step 2a: Synchronise data

Synchronise a directory tree to S3. File freshness is checked using size and MD5 checksum, unless overridden by options. If you wish to have additional informative output, please use the `--verbose` flag as well.

Expand All @@ -109,7 +124,7 @@
s3cmd sync --skip-existing --verbose yourfolder s3://<freezer-bucket>/your_directory/your_folder/
```

### Put objects
### Step 2b: Put objects

To transfer files or folders to the S3 gateway to be archived, `cd` into the directory that contains the file or folder on Mahuika and then use `s3cmd put`.

Expand All @@ -125,7 +140,7 @@
INFO: Cache file not found or empty, creating/populating it.
INFO: Compiling list of local files...
INFO: Running stat() and reading/calculating MD5 values on 1 files, this may take some time...
INFO: Summary: 1 local files to upload
upload: 'your_file' -> 's3://<freezer-bucket>/your_directory/your_file' [1 of 1]
172202 of 172202 100% in 0s 920.89 KB/s done
```
Expand All @@ -140,7 +155,7 @@
INFO: Cache file not found or empty, creating/populating it.
INFO: Compiling list of local files...
INFO: Running stat() and reading/calculating MD5 values on 1 files, this may take some time...
INFO: Summary: 1 local files to upload
upload: 'yourfolder/your_file' -> 's3://<freezer-bucket>/your_directory/your_folder/yourfolder/yourfile' [1 of 1]
172202 of 172202 100% in 0s 1691.71 KB/s done
```
Expand All @@ -150,8 +165,7 @@
Partially uploaded files will be deleted automatically.

!!! warning

    If `put` was interrupted before it could finish, use `s3cmd sync --skip-existing --verbose` from the same directory you were originally copying from to resume the transfer. See [Synchronise data](#step-2a-synchronise-data) for more information.

Check warning on line 168 (GitHub Actions / Check Prose, typography.diacritical_marks): 'Use diacritical marks in 'résumé'.'

### Preview or dry-run

Expand All @@ -160,7 +174,8 @@
This only shows what would be uploaded or downloaded, without actually doing it. It may still perform S3 requests to get bucket listings and other information (file transfer commands only).
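
As an illustration, a preview can be wrapped in a small helper so that it is never confused with a real transfer. This is only a sketch: `preview_upload` is a hypothetical name, and it assumes the preview is requested with s3cmd's `--dry-run` option.

```sh
# preview_upload is a hypothetical wrapper; it assumes s3cmd's --dry-run
# and --recursive options.
preview_upload() {
    src="$1"    # local file or folder
    dest="$2"   # for example s3://<freezer-bucket>/your_directory/
    s3cmd put --dry-run --recursive "$src" "$dest"
}

# Usage (lists what would be uploaded, transfers nothing):
# preview_upload yourfolder "s3://<freezer-bucket>/your_directory/your_folder/"
```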

## Restoring objects
### List objects before restore

### Step 1: List objects before restore

Check warning on line 178 (GitHub Actions / Check page meta, walk_toc): Header 'Step 1: List objects before restore' is too long. Try to keep it under 32 characters to avoid word wrapping in the toc.

List contained objects/files/folders:

Expand All @@ -184,13 +199,13 @@
``` out
2025-06-16 23:13 10G 8add0bf4f023e3dbd36a329d1eae5bbd-684 STANDARD s3://<freezer-bucket>/your_directory/your_folder/10G_test.file
2025-06-16 23:30 10G 8add0bf4f023e3dbd36a329d1eae5bbd-684 STANDARD s3://<freezer-bucket>/your_directory/your_folder/10G_copy.file
2025-06-17 01:31 14 95b28899a460dd8971705dfcd0f5f0d4 STANDARD s3://<freezer-bucket>/your_directory/your_folder/MY_TEST/annotations/3/4/test3.txt
2025-06-17 01:31 14 e76c3a8939fb031bab02a89f6fab520b STANDARD s3://<freezer-bucket>/your_directory/your_folder/MY_TEST/annotations/3/test2.txt
2025-06-17 01:31 14 be2520c884c1be55bab187374a982b12 STANDARD s3://<freezer-bucket>/your_directory/your_folder/MY_TEST/raw_data/test1.txt
2025-06-17 01:26 0 d41d8cd98f00b204e9800998ecf8427e STANDARD s3://<freezer-bucket>/your_directory/your_folder/test/test.txt
```
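
The fifth column of the listing above is the storage class, which indicates whether an object is ready to be retrieved. As a rough sketch, a listing can be filtered for objects that still sit on tape. The helper name `needs_restore` is hypothetical, the column layout is assumed to match the sample output above, tape objects are assumed to be reported as `GLACIER`, and object names are assumed to contain no spaces.

```sh
# needs_restore is a hypothetical helper. It reads a long listing on standard
# input and prints the URL of every object whose storage class is GLACIER.
# Assumed column layout (taken from the sample output above):
#   date  time  size  checksum  storage-class  s3-url
needs_restore() {
    awk '$5 == "GLACIER" { print $6 }'
}

# Example with two made-up listing lines; only the GLACIER object is printed.
needs_restore <<'EOF'
2025-06-16 23:13 10G 8add0bf4f023e3dbd36a329d1eae5bbd-684 GLACIER s3://example-bucket/your_directory/10G_test.file
2025-06-17 01:26 0 d41d8cd98f00b204e9800998ecf8427e STANDARD s3://example-bucket/your_directory/test.txt
EOF
```

In practice the listing command's output would be piped into `needs_restore` instead of the here-document.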

### Restore from tape
### Step 2a: Restore from tape

Data must be restored from tape (Glacier) before it can be retrieved. To restore a file from Glacier storage:

Expand All @@ -215,26 +230,32 @@
s3cmd restore --recursive s3://<freezer-bucket>/your_directory/data_folder/ --restore-days=1
```

### Get objects after restore
### Step 2b: Get objects after restore

Check warning on line 233 (GitHub Actions / Check page meta, walk_toc): Header 'Step 2b: Get objects after restore' is too long. Try to keep it under 32 characters to avoid word wrapping in the toc.

!!! info
Data needs to be restored (to storage class `STANDARD`) from the tape (storage class `GLACIER`), before it can be retrieved.

Example to get or download the directory `data_folder` and all contained objects/files/folders:

1. Create the local `data_folder` directory that will receive the files, and change into it.
    ```sh
    mkdir -p data_folder
    cd data_folder
    ```

2. Retrieve the data from Freezer.
    ```sh
    s3cmd get --recursive s3://<freezer-bucket>/your_directory/data_folder/
    ```

This will place all the files and subdirectories from the above `data_folder` into your current directory.
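
After a download, a retrieved file can be checked against the checksum shown in the bucket listing. The following is a minimal sketch, assuming the `md5sum` utility is available (as on most Linux systems); `verify_md5` is a hypothetical helper name. Listing values with a `-<number>` suffix, such as those shown for the 10G files, are typically multipart upload tags rather than plain MD5 sums, so this check does not apply to them.

```sh
# verify_md5 is a hypothetical helper: it compares a local file's MD5 checksum
# with an expected value copied from the bucket listing.
verify_md5() {
    file="$1"
    expected="$2"
    actual=$(md5sum "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}

# Usage, with the checksum of the empty test.txt from the listing:
# verify_md5 test.txt d41d8cd98f00b204e9800998ecf8427e
```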

### Step 3: Extract your tarball
CallumWalley marked this conversation as resolved.

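After retrieving your tarball, you can extract its contents with the following command. The `z` flag and the `.tar.gz` suffix assume a gzip-compressed tarball; for an uncompressed `.tar` file, drop both.

```sh
tar -xzvf <name of tarball>.tar.gz
```
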
## s3cmd reference
