
Commit 9a16fa5

feat: adding CVMFS docs (#33)
* feat: adding CVMFS docs
* fix: english typos
1 parent afb304a commit 9a16fa5

7 files changed

+123
-7
lines changed

docs/cvmfs.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# The ESCAPE CVMFS repository
2+
3+
The [CERN Virtual Machine File System](https://cernvm.cern.ch/fs/) (CVMFS) is a read-only, HTTP-based file system optimized for distributing scientific software at scale.
4+
Originally developed at CERN for the Worldwide LHC Computing Grid (WLCG), CVMFS is now widely adopted across many research infrastructures.<br/>
5+
Its main features are:
6+
7+
- **Global software delivery system**: Software is published once and then becomes available to all connected clients and compute sites.
8+
9+
- **HTTP and caching-based**: CVMFS fetches files over standard HTTP and relies on [caching proxies](https://www.squid-cache.org/) to reduce load and latency.
10+
11+
- **FUSE-mounted**: Software appears as a regular local file system on the client machine.
12+
13+
Find more at [the official CVMFS documentation](https://cvmfs.readthedocs.io/en/stable/).
14+
### Components
15+
The ESCAPE CVMFS infrastructure consists of three main components:
16+
1. The single source, called the Stratum 0 Repository Server
17+
2. Public mirrors, called Stratum 1 Replica Servers
18+
3. Caches, usually managed by Squid Proxy servers
19+
20+
One protected read/write Stratum 0 instance, called `sw.escape.eu`, feeds the public, distributed mirror servers. <br/>
21+
Then, a distributed hierarchy of proxy servers fetches content from the closest public mirror server.
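
On the client side, this hierarchy is reflected in a few configuration parameters. The following is a minimal sketch, not the ESCAPE setup itself: it prints the lines a client would typically place in `/etc/cvmfs/default.local`. The function name `render_client_config`, the proxy address, and the cache quota are assumptions made for illustration. Mounting `sw.escape.eu` also requires its server URLs and public key (`CVMFS_SERVER_URL`, `CVMFS_PUBLIC_KEY`), which are not given here and are therefore left out.

```sh
#!/bin/sh
# Print a minimal CVMFS client configuration.
# Arguments: repository name, site proxy URL, local cache quota in MB.
render_client_config() {
    repo=$1
    proxy=$2
    quota_mb=${3:-20000}   # hypothetical default quota
    cat <<EOF
CVMFS_REPOSITORIES=$repo
CVMFS_HTTP_PROXY="$proxy"
CVMFS_QUOTA_LIMIT=$quota_mb
EOF
}

# Hypothetical proxy address; replace it with your site's Squid proxy.
render_client_config "sw.escape.eu" "http://squid.example.org:3128"
```

Once the full configuration is in place, `cvmfs_config probe sw.escape.eu` can be used to check that the repository mounts.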
22+
23+
Find more details about `sw.escape.eu` and how to set up CVMFS in [the developer documentation](tech-docs/services/cvmfs).
24+
25+
A diagram of the infrastructure described is shown below.
26+
27+
![image](../static/img/CVMFS-diagram.png)
28+
29+
## CVMFS in the ESCAPE VRE
30+
31+
In the ESCAPE VRE, we use CVMFS to:
32+
33+
- Distribute pre-built scientific software and tools in a consistent, portable way across cloud, HPC, and local environments.
34+
35+
- Package domain-specific applications, such as Rucio clients or analysis environments, as tarballs that are extracted into CVMFS repositories.
36+
37+
- Ensure reproducibility by version-controlling software environments and decoupling runtime dependencies from local installations.
38+
39+
- Facilitate onboarding of new communities by removing the burden of local software setup and environment configuration.
40+
41+
This CVMFS-based software delivery system allows ESCAPE to scale up access to research tools, reduce setup time, and ensure consistency across diverse computing resources.

docs/tech-docs/services/cvmfs.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
# CVMFS
2+
:::warning
3+
These instructions are highly CERN-dependent.<br/>
4+
Please refer to the [Official CernVM-FS documentation](https://cvmfs.readthedocs.io/en/stable) for detailed instructions on how to start from scratch.
5+
:::
6+
## Setting up a Repository
7+
8+
At CERN, the repository and its release manager are created by the Storage and Data Management group, on request through the [CVMS Repository creation request](https://cern.service-now.com/service-portal?id=sc_cat_item&name=CVMS-repository&se=cvmfs).
9+
10+
Elsewhere, please follow the steps at [Creating a Repository (Stratum 0)](https://cvmfs.readthedocs.io/en/stable/cpt-repo.html).
11+
12+
Questions worth answering before submitting the request (or creating a repository yourself):
13+
1. What is the quota for the repository, and what is its expected growth over time?
14+
2. Does the repository need (periodic) garbage collection?
15+
3. Should the visibility of the repository be restricted (e.g., due to licensed software)?
16+
4. Should the repository be replicated on other Stratum 1 servers worldwide?
17+
18+
:::tip[Repository naming]
19+
*The repository name resembles a DNS scheme, but it does not need to reflect any real server name. It is supposed to be a globally unique name that indicates where/who the publishing of content takes place. A repository name must only contain alphanumeric characters plus `-`, `_`, or `.`, and it is limited to a length of 60 characters.*
20+
:::
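
The naming rule above can be checked before a request is submitted. This is a minimal sketch under stated assumptions: `is_valid_repo_name` is a hypothetical helper, not part of the CVMFS tooling, and it encodes only the two constraints quoted above (allowed characters and the 60-character limit).

```sh
#!/bin/sh
# Return 0 if the name uses only alphanumerics plus '-', '_' and '.',
# and is between 1 and 60 characters long; return 1 otherwise.
# The character ranges assume an ASCII (C) locale.
is_valid_repo_name() {
    [ -n "$1" ] && [ "${#1}" -le 60 ] || return 1
    case $1 in
        *[!A-Za-z0-9._-]*) return 1 ;;
    esac
}

# Example: report on two candidate names.
for candidate in "sw.escape.eu" "bad name"; do
    if is_valid_repo_name "$candidate"; then
        echo "ok:      $candidate"
    else
        echo "invalid: $candidate"
    fi
done
```

Global uniqueness cannot be checked locally, so it still has to be agreed with whoever operates the Stratum 0.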
21+
22+
The release manager is usually a VM where operators can upload, modify and delete the content of the repository.
23+
24+
## Publishing content
25+
26+
As described [here](https://cvmfs.readthedocs.io/en/stable/cpt-repo.html#content-publishing), there are four main steps needed to publish content:
27+
1. Initiate the transaction via `cvmfs_server transaction <repository name>`
28+
2. Install content into `/cvmfs/<repository name>`
29+
3. (optional) Create nested catalogs at proper locations
30+
4. Finalise the transaction via `cvmfs_server publish <repository name>`
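
The four steps above can be sketched as a small script, meant to be run as the shared user on the release manager. It is an illustration rather than the ESCAPE procedure: the tarball path, the `run` helper, and the `DRY_RUN` switch are hypothetical, and by default the script only prints the commands it would execute. If installation fails, the open transaction is discarded with `cvmfs_server abort`.

```sh
#!/bin/sh
# REPO matches the repository named on this page; TARBALL is a placeholder.
REPO="${REPO:-sw.escape.eu}"
TARBALL="${TARBALL:-/tmp/my-software-1.0.tar.gz}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

publish_tarball() {
    # 1. Open a transaction so that /cvmfs/$REPO becomes writable.
    run cvmfs_server transaction "$REPO" || return 1

    # 2. Install content; on failure, discard the open transaction.
    if ! run tar -xzf "$TARBALL" -C "/cvmfs/$REPO"; then
        run cvmfs_server abort -f "$REPO"
        return 1
    fi

    # 3. (optional) Nested catalogs would be marked here, for example by
    #    placing an empty .cvmfscatalog file in the chosen directories.

    # 4. Publish the changes as a new repository revision.
    run cvmfs_server publish "$REPO"
}

publish_tarball
```

Setting `DRY_RUN=0` executes the commands for real, so it is worth reviewing the printed commands first.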
31+
32+
#### Install content
33+
34+
At login, a CERN-hosted release manager displays a banner like this:
35+
```text
36+
* ********************************************************************
37+
* Welcome to xxxxx.cern.ch, AlmaLinux release 9.5 (Teal Serval)
38+
* Archive of news is available in /etc/motd-archive
39+
* Reminder: you have agreed to the CERN
40+
* computing rules, in particular OC5. CERN implements
41+
* the measures necessary to ensure compliance.
42+
* https://cern.ch/ComputingRules
43+
* Puppet environment: production, Roger state: build
44+
* Foreman hostgroup: xxxxxxx
45+
* #######################################################
46+
The CVMFS Stratum 0 for repo xxxxxx.
47+
Access is controlled via the e-group xxxxxxx.
48+
Shared local unix user for account xxxxxxx.
49+
50+
To become the shared user execute:
51+
sudo -i -u xxxxxxx
52+
To start a transaction:
53+
cvmfs_server transaction sw.escape.eu
54+
To publish a transaction
55+
cvmfs_server publish sw.escape.eu
56+
57+
* A cvmfs installation host.
58+
* ********************************************************************
59+
```
60+
61+
To publish content, first switch to the dedicated shared user, as the login banner suggests: `sudo -i -u xxxxxxx`.
62+
63+
How packages are provided and versioned is left to the operators.
64+
65+
:::danger[Pro Tip]
66+
The VRE strategy for content publishing relies on creating a tarball that contains the necessary software and versions, together with a setup script to be run when needed. <br/>
67+
More details can be found in the **[ESCAPE CVMFS GitHub Repository](https://github.com/vre-hub/escape-cvmfs)**.
68+
:::
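
As a rough illustration of the tarball-plus-setup-script idea, the sketch below packs a stand-in tool into a versioned tarball, extracts it, and sources a generated setup script. It does not reproduce the layout used in the ESCAPE CVMFS GitHub repository: the tool name, version, directory structure, and the `INSTALL_ROOT` variable are all assumptions. Here `INSTALL_ROOT` defaults to a temporary directory; on a release manager it would be `/cvmfs/sw.escape.eu` inside an open transaction.

```sh
#!/bin/sh
set -e

NAME="mytool"      # hypothetical package name
VERSION="1.0.0"    # hypothetical version
INSTALL_ROOT="${INSTALL_ROOT:-$(mktemp -d)}"
BUILD="$(mktemp -d)"

# Stand-in software tree with a single executable.
mkdir -p "$BUILD/$NAME/$VERSION/bin"
printf '#!/bin/sh\necho "%s %s"\n' "$NAME" "$VERSION" > "$BUILD/$NAME/$VERSION/bin/$NAME"
chmod +x "$BUILD/$NAME/$VERSION/bin/$NAME"

# Setup script with the install prefix baked in; users source it.
cat > "$BUILD/$NAME/$VERSION/setup.sh" <<EOF
export PATH="$INSTALL_ROOT/$NAME/$VERSION/bin:\$PATH"
EOF

# Pack the versioned tree into a tarball.
tar -czf "$BUILD/$NAME-$VERSION.tar.gz" -C "$BUILD" "$NAME"

# Extract it into the install root (the publishing step on a release manager).
tar -xzf "$BUILD/$NAME-$VERSION.tar.gz" -C "$INSTALL_ROOT"

# What a user would do: source the setup script, then locate the tool.
. "$INSTALL_ROOT/$NAME/$VERSION/setup.sh"
command -v "$NAME"
```

Keeping the version in the path lets several releases coexist, so users pick one by sourcing the matching setup script.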

docusaurus.config.js

Lines changed: 9 additions & 4 deletions
@@ -101,8 +101,13 @@ const config = {
101101
},
102102
{
103103
type: 'doc', // This is a preset regarding the type of document, please update if needed
104-
docId: 'rucio', // The ID of the corresponding document. This should match with what you wrote in the document header.
105-
label: 'Rucio', // The title that will appear on the menu
104+
docId: 'cvmfs', // The ID of the corresponding document. This should match with what you wrote in the document header.
105+
label: 'CVMFS', // The title that will appear on the menu
106+
},
107+
{
108+
type: 'doc', // This is a preset regarding the type of document, please update if needed
109+
docId: 'notebook', // The ID of the corresponding document. This should match with what you wrote in the document header.
110+
label: 'JupyterHub', // The title that will appear on the menu
106111
},
107112
{
108113
type: 'doc', // This is a preset regarding the type of document, please update if needed
@@ -111,8 +116,8 @@ const config = {
111116
},
112117
{
113118
type: 'doc', // This is a preset regarding the type of document, please update if needed
114-
docId: 'notebook', // The ID of the corresponding document. This should match with what you wrote in the document header.
115-
label: 'JupyterHub', // The title that will appear on the menu
119+
docId: 'rucio', // The ID of the corresponding document. This should match with what you wrote in the document header.
120+
label: 'Rucio', // The title that will appear on the menu
116121
},
117122
],
118123
},

sidebars.js

Lines changed: 1 addition & 0 deletions
@@ -159,6 +159,7 @@ const sidebars = {
159159
label: 'Services',
160160
items: [
161161
'tech-docs/services/aai',
162+
'tech-docs/services/cvmfs',
162163
'tech-docs/services/data-management',
163164
'tech-docs/services/jupyterhub',
164165
'tech-docs/services/computing-resources',

src/pages/LatestNews.jsx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@ const newsItems = [
4848
title: 'The Virtual Research Environment: A multi-science analysis platform',
4949
authors: 'E. Gazzarrini, E. Garcia Garcia, D. Gosein, and X. Espinal',
5050
journal: 'EPJ Web of Conferences 295, 08023 (2024)',
51-
links: [{ text: 'CHEP 2023 proceedings', url: 'https://doi.org/10.1051/epjconf/202429508023/' }],
51+
links: [{ text: 'CHEP 2023 proceedings', url: 'https://doi.org/10.1051/epjconf/202429508023' }],
5252
iconType: 'book-open',
5353
type: 'proceedings',
5454
},

src/pages/index.mdx

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -22,9 +22,10 @@ The Virtual Research Environment’s main components are:
2222

2323

2424
1. **[AAI](/docs/auth)**: A federated and reliable **Authentication and Authorization** layer
25-
2. **[The Rucio Data Lake](/docs/rucio)**: A **federated distributed storage** solution (the ESCAPE Data Lake), providing functionalities for data injection and replication through a Data Management framework (Rucio)
25+
2. **[The Rucio Data Lake](/docs/rucio)**: A **federated distributed storage** solution, providing functionalities for data injection and replication through a Data Management framework (Rucio)
2626
3. **[Reana](/docs/reana)**: A **computing** cluster supplying the processing power to run full analyses with Reana, a re-analysis software
27-
4. **[JupyterHub](/docs/notebook)**: An enhanced **notebook interface** with containerised environments to hide the infrastructure’s complexity from the user.
27+
4. **[CVMFS](/docs/cvmfs)**: A read-only file system designed to distribute software, and more.
28+
5. **[JupyterHub](/docs/notebook)**: An enhanced **notebook interface** with containerised environments to hide the infrastructure’s complexity from the user.
2829

2930
![image](../../static/img/VRE-diagram.png)
3031

static/img/CVMFS-diagram.png

132 KB
