diff --git a/astro.config.mjs b/astro.config.mjs
index 8c3b284..a557b3b 100644
--- a/astro.config.mjs
+++ b/astro.config.mjs
@@ -12,6 +12,9 @@ export default defineConfig({
site: "https://nf-neuro.github.io",
base: "/",
trailingSlash: 'never',
+ redirects: {
+ '/pipelines/download': 'https://raw.githubusercontent.com/nf-neuro/modules/main/assets/download_pipeline.sh'
+ },
integrations: [
starlight({
title: 'nf-neuro',
@@ -181,6 +184,13 @@ export default defineConfig({
link: 'pipelines',
icon: 'seti:pipeline',
items: [
+ {
+ label: 'Running pipelines',
+ items : [
+ { label: 'Common guidelines', slug: 'pipelines/run' },
+ { label: 'Offline execution', slug: 'pipelines/offline' }
+ ]
+ },
{ label: 'Add your pipeline', slug: 'pipelines/submit' }
]
}
diff --git a/src/content/docs/pipelines/offline.mdx b/src/content/docs/pipelines/offline.mdx
new file mode 100644
index 0000000..b12446c
--- /dev/null
+++ b/src/content/docs/pipelines/offline.mdx
@@ -0,0 +1,127 @@
+---
+title: Offline environments
+description: Running pipelines in offline environments
+---
+
+import { Steps } from '@astrojs/starlight/components';
+
+Pipelines backed by the nf-neuro (and [nf-core](https://nf-co.re)) framework are designed to run with internet access. This makes them
+easier to install and use. **They can also run completely offline**, with the help of a few commands to download everything required
+prior to execution.
+
+## Prerequisites
+
+|||
+|-|-|
+|**[Nextflow](https://www.nextflow.io/docs/latest/install.html) ≥ 23.10.0** | The download procedure uses [nextflow inspect](https://www.nextflow.io/docs/latest/reference/cli.html#inspect) to compute the **list of containers to download**. |
+| **Container engine** | **The container engine you will use to execute the pipeline needs to be installed**. The download procedure will populate its caches with the downloaded containers. We recommend [Docker](https://docs.docker.com/get-started/get-docker/) for local usage (where you have administrative rights), and [apptainer](https://apptainer.org/docs/admin/main/installation.html) anywhere else (computing clusters in the cloud or HPC infrastructures are typical use-cases). |
+
+
+## Setup using the `nf-core` command
+
+:::caution
+The `nf-core` framework is still under heavy development, as is the `nf-neuro` ecosystem. If you experience problems setting up with the
+`nf-core` command, we recommend you instead use the `nf-neuro` custom scripts through the procedure described [further down](#setup-using-nf-neuro-custom-scripts).
+:::
+
+
+
+<Steps>
+
+1. Install the `nf-core` command. We give an example below using `pip`; refer to the [official documentation](https://nf-co.re/docs/nf-core-tools/installation)
+   for detailed instructions.
+
+ ```bash
+ python -m venv nf-core-env
+ source nf-core-env/bin/activate
+ python -m pip install nf_core==3.5.2
+ ```
+
+ :::caution[Installation on HPC]
+   Most HPC facilities distribute custom builds of Python packages, which might conflict with `nf-core`. Refer to your facility's administrators if you have problems
+   with the installation, or fall back on the custom scripts below.
+ :::
+
+ :::caution[Alliance Canada users]
+   As of today, the [documentation for nf-core](https://docs.alliancecan.ca/wiki/Nextflow) provided by Alliance Canada is **outdated**. We've had success
+   installing the latest versions with the commands below :
+
+ ```bash
+ module purge
+   module load nextflow/23.10.0 # Refer to the pipeline you are running for its minimal nextflow version
+   module load apptainer        # Refer to the pipeline you are running for its minimal apptainer version
+ module load python/3.12
+ module load rust
+ module load postgresql
+ module load python-build-bundle
+ module load scipy-stack
+ python -m venv nf-core-env
+ source nf-core-env/bin/activate
+ python -m pip install nf_core==3.5.2
+ ```
+ :::
+
+2. Run the pipeline download command, replacing the `<...>` placeholders according to your configuration :
+
+ :::caution[Apptainer/Singularity users]
+ If the `NXF_APPTAINER_CACHEDIR` or `NXF_SINGULARITY_CACHEDIR` environment variable is found in the environment, containers will first be downloaded to its
+ location before **being copied to the specified download location** under the `singularity-containers` directory. This can be good for sharing cache between
+ users or pipelines. However, pipelines with **large containers or a large number of them** could fill up your system. **Refer to your pipeline's documentation
+   for the recommended procedure**. When in doubt, **unset those variables**.
+ :::
+
+ ```bash
+   nf-core pipelines download <pipeline> \
+       --revision <version> \
+       --outdir <download_location> \
+       --container-system <container_system> \
+       --parallel-downloads <parallel_downloads>
+ ```
+
+ :::danger[HPC users]
+   You **must guarantee that all download locations used are accessible to compute nodes** ! It is also **highly recommended to download all configurations** by adding
+ the argument `--download-configuration yes` to the command above.
+ :::
+
+ |||
+ |-|-|
+   | **`<pipeline>`** | Name of the pipeline to download. It must be the **name of the repository hosting it on GitHub** (for example, `scilus/sf-tractomics` refers to the `sf-tractomics` pipeline from the `scilus` organisation). |
+   | **`<version>`** | Can be the **tag** of a release, a **branch** name or a **commit SHA**. |
+   | **`<download_location>`** | The directory where to store the downloaded pipeline, configurations and containers. |
+   | **`<container_system>`** | Either **singularity** (which also stands for **apptainer**) or **docker**. It must align with the container engine you selected above. **If using apptainer or singularity, refer to the tip below for detailed configuration**. |
+   | **`<parallel_downloads>`** | Number of parallel downloads allowed. |
+
+ :::tip[Configuration for Apptainer/Singularity]
+ Finer configuration is available for **apptainer** and **singularity** :
+
+ |||
+ |-|-|
+   | **`--container-library`** | Remote library (registry) from which to pull containers. When in doubt, use `docker.io`. |
+ | **`--container-cache-utilisation`** | Set to `copy` by default, which copies containers to a `singularity-containers` directory placed aside the downloaded pipeline. Set to `amend` to disable the copy, **in which case ensure you have set valid cache locations for apptainer (`NXF_APPTAINER_CACHEDIR`) or singularity (`NXF_SINGULARITY_CACHEDIR`) in your environment before download**. |
+ :::
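+
+   As a concrete illustration, a filled-in command could look like the following sketch. It reuses the `scilus/sf-tractomics` example from the table above, with a hypothetical revision, output directory and download count to adapt to your own setup :
+
+   ```bash
+   nf-core pipelines download scilus/sf-tractomics \
+       --revision 1.0.0 \
+       --outdir sf-tractomics-offline \
+       --container-system singularity \
+       --parallel-downloads 4 \
+       --download-configuration yes
+   ```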
+
+
+
+</Steps>
+
+## Setup using `nf-neuro` custom scripts
+
+:::caution
+This setup procedure requires that you use the **Apptainer** or **Singularity** container engine !
+:::
+
+Only two additional prerequisites are necessary to run the script : `jq`, and either `curl` or `wget`. On **Debian** systems (such as Ubuntu), they can all be installed easily
+with `apt-get`. Once installed, use the command below to run the script, replacing every `<...>` placeholder according to your setup :
+
+```bash
+curl -fsSL https://nf-neuro.github.io/pipelines/download | bash -s -- \
+    -p <pipeline> \
+    -r <version> \
+    -o <output_location> \
+    -c <cache_location> \
+    -d <parallel_downloads>
+```
+
+|||
+|-|-|
+| **`<pipeline>`** | Name of the pipeline to download. It must be the **name of the repository hosting it on GitHub** (for example, `scilus/sf-tractomics` refers to the `sf-tractomics` pipeline from the `scilus` organisation). |
+| **`<version>`** | Can be the **tag** of a release, a **branch** name or a **commit SHA**. |
+| **`<output_location>`** | The directory where to copy the downloaded containers. |
+| **`<cache_location>`** | The directory where to cache the containers before the copy. |
+| **`<parallel_downloads>`** | Number of parallel downloads allowed. |
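+
+As with the `nf-core` command, here is a sketch of a filled-in invocation. The revision, output and cache directories and download count are hypothetical values to adapt to your own setup :
+
+```bash
+curl -fsSL https://nf-neuro.github.io/pipelines/download | bash -s -- \
+    -p scilus/sf-tractomics \
+    -r 1.0.0 \
+    -o sf-tractomics-offline \
+    -c container-cache \
+    -d 4
+```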
\ No newline at end of file
diff --git a/src/content/docs/pipelines/run.mdx b/src/content/docs/pipelines/run.mdx
new file mode 100644
index 0000000..f7bd0df
--- /dev/null
+++ b/src/content/docs/pipelines/run.mdx
@@ -0,0 +1,360 @@
+---
+title: Running pipelines
+description: Common guidelines to run nf-neuro pipelines
+---
+
+import CoffeeIcon from '~icons/codicon/coffee';
+import { Steps } from '@astrojs/starlight/components';
+
+## Prerequisites
+
+Pipelines built against the **nf-neuro** ecosystem and published through it **support the full extent
+of nextflow capabilities**. This means **you don't even have to download or install a thing !** Well, that is except :
+
+|||
+|-|-|
+|**[Nextflow](https://www.nextflow.io/docs/latest/install.html)** | The backbone pipeline executor, actually the only **required** dependency. |
+| **Container engine** | This is optional, but **without it you need to install all the software required to run the pipeline**. We recommend [Docker](https://docs.docker.com/get-started/get-docker/) for local usage (where you have administrative rights), and [apptainer](https://apptainer.org/docs/admin/main/installation.html) anywhere else (computing clusters in the cloud or HPC infrastructures are typical use-cases). |
+
+## Prepare I/O
+
+This is your **main task**. You need to prepare the spaces where your input data lives, according to
+your pipeline's input specification. You also need to allocate a space for the pipeline's outputs. You'll
+need to refer to its own documentation to get everything in order, as each pipeline has its own specificities.
+No matter what, here is a quick checklist to get everything in good shape, ready to address any pipeline's I/O
+peculiarities :
+
+<Steps>
+
+
+
+1. **Create a directory for your current project/processing**. It will act as a single entrypoint to access
+   the outputs from processing and to introspect into the pipeline's code and its executions on your data.
+ **All following commands and manipulations take place inside this directory**.
+
+ :::danger
+   **On HPC, this directory needs to be accessible from compute nodes. Otherwise, many errors might ensue !**
+ :::
+
+2. **Create an `input` directory in which to place and organize your input data**. If the data is light enough, or gathering it
+   all in one place makes sense to you, copy it there. Otherwise, a good way to get everything organized there is
+ with [symbolic links](https://www.linode.com/docs/guides/linux-symlinks/) between the actual locations of your
+ data and the `input` directory.
+
+ :::caution
+   **Symbolic links must be carefully verified on HPC**, to make sure they are accessible by compute nodes.
+ :::
+
+ :::tip
+   Most pipelines use the **globstar** (`**`) pattern to navigate their input directory. This means you can place your
+ input data as deep as you want (for example `.../input/data/I/want/to/process/subject-1/...`) and the pipeline
+ will find it. The downside is it can make **hiding a subject from processing** troublesome, for which you'll probably need
+ to take its data out of the input directory altogether.
+ :::
+
+3. **Create a `results` directory to store the pipeline's output**. Validate that enough disk space is available (no need to
+   be exact; when in doubt, ensure you have **a lot** of it). The pipeline's execution should not be affected if no space is left
+   to write results, but you won't have easy access to them, in which case you might need to re-execute
+   some of the steps or the whole pipeline a second time, wasting time and computing resources.
+
+ :::tip
+   Pipelines usually work in **overwrite mode**, meaning **subsequent pipeline runs will write over previous ones for the same
+   input subjects**. If unsure, consult the documentation for the specific pipeline you want to use.
+ :::
+
+4. **Create or edit `nextflow.config`** in the directory created at **step 1**. In it, set or replace :
+
+ ```groovy
+ params.input = 'input'
+ params.outdir = 'results'
+ ```
+
+ Refer to the documentation of the pipeline you are running for any other **input parameters** needing to be set, and for **execution
+ parameters** that might be of interest to set given your data, project or research question.
+
+ :::tip[Centralize your configuration !]
+   You can specify configuration for the pipeline in many different ways. **We cannot recommend enough that you centralize everything in the
+   `nextflow.config` you created above, for debugging purposes, but also for reuse, sharing and safekeeping**. A rule of thumb is to
+   compile all **static** configuration in that file, and supply parameters at the command line only to slightly adapt the execution
+   to specific use-cases. **Overriding parameters using the `-c` nextflow argument should be avoided at all costs !**
+ :::
+
+
+
+</Steps>
+
+:::caution[Validation before next steps]
+Before continuing, refer to the documentation of the pipeline you are using and validate its specificities for **I/O** and **configuration**,
+as the procedure defined here only sets up the common ground for its execution.
+:::
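+
+To make the checklist concrete, here is a minimal sketch of the resulting layout, assuming hypothetical project and data paths you should adapt to your own setup :
+
+```bash
+mkdir -p my-project/input my-project/results && cd my-project
+
+# Link existing data into the input directory instead of copying it (hypothetical paths).
+ln -s /data/study-A/subject-1 input/subject-1
+ln -s /data/study-A/subject-2 input/subject-2
+
+# Centralize static configuration in nextflow.config (step 4 above).
+cat > nextflow.config << 'EOF'
+params.input  = 'input'
+params.outdir = 'results'
+EOF
+```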
+
+
+## Configure execution
+
+Each pipeline comes with **its own set of parameters (`params`)** you can edit to tailor the execution to your data, your project
+or your research question. Each also prescribes a set of `profiles`, logical configuration groups you can use to apply **behaviors predefined
+by the developer**, such as :
+
+|||
+|-|-|
+| **`gpu`** | Enable GPU acceleration for modules supporting it. |
+| **`docker`** | Use the [Docker](https://docs.docker.com/get-started/get-docker/) engine for execution isolation. |
+| **`slurm`** | Dispatch modules execution using the [SLURM](https://slurm.schedmd.com/overview.html) scheduler (works on HPC infrastructures). |
+
+:::tip
+All parameters (`params`) and profiles (`profile`) are described in the documentation of the pipelines themselves. Below are lists of
+parameters and profiles common to all pipelines, made available through the `nf-core` pipeline template.
+:::
+
+### Common parameters
+
+#### Results publishing
+
+|||
+|-|-|
+| **`publish_dir_mode`** | Set to `copy` by default, which means results are copied from working directories to output. Refer to the [nextflow documentation](https://www.nextflow.io/docs/latest/reference/process.html#process-publishdir) for other options and their specificities. |
+
+#### Institutional configuration
+
+|||
+|-|-|
+| **`config_profile_name`** | If set, this configuration will be loaded to tailor the execution to the specified institution. Refer to [this page](https://nf-co.re/configs/) for a full list of available configurations. |
+
+#### Notifications
+
+|||
+|-|-|
+| **`email`** | If set, a summary is sent on pipeline completion, regardless of status. |
+| **`email_on_fail`** | If set, summary is only sent if the pipeline fails. |
+| **`plaintext_email`** | If set, disables `HTML` e-mail content. |
+| **`max_multiqc_email_size`** | Exclude MultiQC reports exceeding this size from summary e-mails. |
+
+#### Miscellaneous
+
+|||
+|-|-|
+| **`version`** | If set, prints the pipeline's version to terminal without execution. |
+| **`multiqc_title`** | Title displayed atop all MultiQC reports generated by the pipeline. |
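+
+These common parameters can be centralized in the `nextflow.config` created in the previous section. A short sketch, with hypothetical values :
+
+```groovy
+params.publish_dir_mode       = 'copy'
+params.email                  = 'you@example.org'
+params.email_on_fail          = 'you@example.org'
+params.plaintext_email        = true
+params.max_multiqc_email_size = '25.MB'
+params.multiqc_title          = 'My project - QC report'
+```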
+
+
+### Common profiles
+
+|||
+|-|-|
+| **`docker`** | Use [Docker](https://docker.com) containers to isolate process execution. |
+| **`apptainer`** | Use [Apptainer](https://apptainer.org/docs/admin/main/index.html) containers to isolate process execution. |
+| **`singularity`** | Use [Singularity](https://docs.sylabs.io/guides/latest/user-guide/) containers to isolate process execution. |
+| **`arm`** | Customize configuration for the ARM chipset. Enables container emulation from `amd64` builds. |
+| **`debug`** | Enables stricter validation, as well as the collection and preservation of runtime information. Disables post-execution cleanup tasks. |
+
+## Run pipelines locally
+
+:::tip[Running pipelines without web access]
+If for any reason you **must run a pipeline in an offline environment, we've got you covered** !
+Follow [these simple guidelines](/pipelines/offline) to deploy your offline setup and get back here.
+:::
+
+With all **I/O** and **configuration** done, running the pipeline takes a single command :
+
+```bash
+nextflow run <pipeline> -r <version> -profile <profiles>
+```
+
+**Replace :**
+
+|||
+|-|-|
+| **`<pipeline>`** | With the name of your pipeline. It must match the **name of the repository hosting it on GitHub** (for example, `scilus/sf-tractomics` refers to the `sf-tractomics` pipeline from the `scilus` organisation). |
+| **`<version>`** | With the version of the pipeline to use. This can be a **release**, a **branch** name or a full **commit SHA**. |
+| **`<profiles>`** | With the list of profiles to apply to the pipeline's configuration, in overwrite order : `-profile slurm,docker,gpu` first applies the `slurm` profile, then supersedes it with the configuration prescribed by the `docker` and `gpu` profiles, successively. |
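+
+For instance, a complete invocation could look like the sketch below, where the revision and profile list are hypothetical and must be adapted to your pipeline and setup :
+
+```bash
+nextflow run scilus/sf-tractomics -r 1.0.0 -profile docker,gpu
+```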
+
+:::tip[You can always download the pipeline locally]
+The above procedure differs slightly if you have downloaded the pipeline locally. In that case, no need to specify the
+**version** with `-r`, but you need to replace the `<pipeline>` section with the **full path to your pipeline location**.
+:::
+
+:::caution
+On the **first online pipeline run**, you might notice it takes some time before anything launches. **Containers are
+downloading**, which can take a while. Be patient, it's a good time for a hot drink <CoffeeIcon /> !
+:::
+
+## Run pipelines on HPC
+
+**Nextflow** knows how to schedule the pipeline's jobs using [SLURM](https://slurm.schedmd.com/overview.html), and most HPC
+infrastructures support it (for example, **Alliance Canada** advertises [full support](https://docs.alliancecan.ca/wiki/Nextflow), especially
+for pipelines distributed through the [nf-core](https://nf-co.re) toolchain). In **nf-neuro**, we are a bit more strict : most
+procedures prescribed by clusters still apply, but could require some adjustments. **As an example, we provide here a complete walkthrough for
+configuration and execution on Alliance Canada HPC**. While this configuration should align quite well with other HPC cluster deployments,
+inspect the full procedure thoroughly and **tailor it to your own cluster configuration**.
+
+### Validate before execution
+
+
+
+<Steps>
+
+1. You have access to an **institutional configuration** for your cluster. **We [maintain one](https://nf-co.re/configs/alliance_canada/)
+   for all clusters under the Alliance Canada umbrella**. You might [find one here](https://nf-co.re/configs/) for your own institution.
+   Otherwise, you should probably develop one. In that case, refer to the expertise available on the [nf-core configuration repository](https://github.com/nf-core/configs)
+   for guidance.
+
+2. Your **input data is located in a filesystem that is accessible to compute nodes**. This is **crucial**. Not only does this path need to be
+   accessible, it must be **optimal for read access**. The pipeline will pull the data to process from there, so **read efficiency must be on par**.
+
+ :::tip
+ On **Alliance Canada** clusters, the **project** directory is suitable for inputs. If you experience errors or reduced efficiency, create a
+ temporary copy into **scratch** and use it as input instead.
+ :::
+
+3. The **output location** where the processed data will be written is **optimally accessible for write**. As modules complete their execution, this
+ location will experience **heavy loads of write operations**.
+
+ :::tip
+ On **Alliance Canada** clusters, **use `scratch` for writing outputs**. Then, on completion, no matter the status, copy the results to your project directory.
+ :::
+
+4. You are not using **`$HOME`**, or **any other restricted file paths** for anything related to the execution of the pipeline. **HPC clusters
+ are really picky about this ! If you experience cryptic errors, you might be using one of those.**
+
+ :::tip
+ On **Alliance Canada** clusters, if you limit yourself to the `scratch` filesystem and your allocated `project` directory for everything, you
+   should experience no problems. If you do, [get in touch](https://github.com/nf-neuro/modules/issues) with us for rapid feedback !
+ :::
+
+5. **On compute nodes**, the temporary filesystem is a physical location, **not mounted from RAM**. If a RAM mount is used, you must create
+   a location yourself to host the temporary files produced by the pipeline. Then, tell nextflow to use this path by setting the `TMPDIR` environment
+   variable.
+
+ :::tip[How to detect a RAM mount]
+ The simplest way is with the `df` command, which displays the size and types of the filesystems mounted on the compute node. If you see `tmpfs`
+ as the type of the mount where `/tmp` is located, then the node is using a RAM mount.
+
+ ```bash
+ df -h | grep /tmp
+ ```
+ :::
+
+
+
+
+</Steps>
+
+### HPC in SLURM mode
+
+:::danger
+Exceptions aside, **compute nodes in HPC facilities don't have access to the web**. You **need** to first deploy the offline
+environment following [the guidelines here](/pipelines/offline).
+:::
+
+
+
+<Steps>
+
+1. Open a terminal **on a login node** on the cluster where you want to run the pipeline.
+
+ :::caution
+   **This terminal needs to survive as long as the pipeline runs**. Using a terminal multiplexer **on the node**, such as `tmux`
+   or `screen` on **Linux**, will ensure it, even if you get disconnected. However, validate against your cluster configuration for
+   potential limits that could be enforced on **process run time**.
+ :::
+
+2. Move to a suitable **working directory** (we recommend a directory under your personal `/scratch`).
+
+3. Load `nextflow` and `apptainer` in the environment at their **latest possible versions** :
+
+ ```bash
+ module load nextflow apptainer
+ ```
+
+4. Set the following environment variables :
+
+ ```bash
+   export NXF_APPTAINER_CACHEDIR="<container_location>"
+   export SLURM_ACCOUNT="<slurm_account>"
+ export SBATCH_ACCOUNT=$SLURM_ACCOUNT
+ export SALLOC_ACCOUNT=$SLURM_ACCOUNT
+ ```
+
+ |||
+ |-|-|
+   | **`<container_location>`** | The directory where you downloaded the pipeline's containers for offline usage. |
+   | **`<slurm_account>`** | The account nextflow will use to submit the pipeline's jobs. |
+
+5. **If your cluster uses a RAM mount for temporary files**, change its location to a directory on `/scratch`, or another
+ physical filesystem :
+
+ ```bash
+   export TMPDIR="<temporary_directory>"
+ ```
+
+ |||
+ |-|-|
+   | **`<temporary_directory>`** | Directory on a physical filesystem accessible to compute nodes. |
+
+6. Launch the pipeline with the command below, carefully replacing the variable fields :
+
+ ```bash
+   nextflow run <pipeline> -r <version> \
+       --input <input_location> \
+       --outdir <output_location> \
+       -profile slurm,apptainer \
+       -resume
+   ```
+
+</Steps>
+
+
+
+### HPC in Single Node mode
+
+:::danger
+Exceptions aside, **compute nodes in HPC facilities don't have access to the web**. You **need** to first deploy the offline
+environment following [the guidelines here](/pipelines/offline).
+:::
+
+
+
+<Steps>
+
+1. Open a terminal **on a login node** on the cluster where you want to run the pipeline.
+
+2. Move to a suitable **working directory** (we recommend a directory under your personal `/scratch`).
+
+3. Create a **`sbatch` submission script**. You can copy the one provided below and replace its `<...>` placeholders with values fitting
+   your environment :
+
+ ```bash
+ #!/bin/sh
+   #SBATCH --mail-user=<email>
+   #SBATCH --mail-type=ALL
+
+   #SBATCH --account=<account>
+   #SBATCH --nodes=1
+   #SBATCH --cpus-per-task=<cpus>
+   #SBATCH --mem=<memory>
+   #SBATCH --time=<time>
+
+ # Load the required modules.
+ module load nextflow apptainer
+
+ # Variables for containers, etc.
+   export NXF_APPTAINER_CACHEDIR=<container_location>
+
+ # Call for the pipeline execution.
+   nextflow run <pipeline> -r <version> \
+       --input <input_location> \
+       --outdir <output_location> \
+ -profile apptainer \
+ -resume
+ ```
+
+ |||
+ |-|-|
+   | **`<email>`** | E-mail address where to send notifications on the status of the pipeline's execution. |
+   | **`<account>`** | Slurm account used to access computing resources. |
+   | **`<cpus>`** | Number of cpus to reserve for processing. We recommend setting this to the maximum number of cpus available, but refer to your pipeline's documentation for details. |
+   | **`<memory>`** | Amount of RAM to reserve for processing. **This also includes all potential temporary mounts (`tmpfs`)**. |
+   | **`<time>`** | Amount of time allowed for the pipeline to run before cancellation. |
+   | **`<container_location>`** | The directory where you downloaded the pipeline's containers for offline usage. |
+   | **`<pipeline>`** | Name of your downloaded pipeline. On newer nextflow versions, use the **repository name** (usually from GitHub). **You can supply the path to the pipeline instead**, but you must then omit the `-r <version>` argument. |
+   | **`<version>`** | The version of the pipeline to use, if using its **repository name** above. |
+   | **`<input_location>`** | The directory containing the input files. **This directory must be optimally accessible for reading by compute nodes**. |
+   | **`<output_location>`** | The directory where the output files will be published. **This directory must be optimally accessible for writing by compute nodes**. |
+
+</Steps>
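+
+Once the script is saved (for example as `run_pipeline.sbatch`, a hypothetical name), submit it to the scheduler and monitor its progress from the login node :
+
+```bash
+sbatch run_pipeline.sbatch
+squeue -u $USER   # check on the submitted job
+```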
+
+
\ No newline at end of file
diff --git a/src/styles/custom.css b/src/styles/custom.css
index fe323b6..2f51abb 100644
--- a/src/styles/custom.css
+++ b/src/styles/custom.css
@@ -60,3 +60,8 @@ starlight-tabs {
background-color: light-dark(var(--color-gray-100), var(--color-gray-700));
padding: 10px;
}
+
+tr td:first-child {
+ width: 1%;
+ white-space: nowrap;
+}
\ No newline at end of file