![JupyterHub integration in HPC cluster](jupyterhub-diagram.png "JupyterHub integration in HPC cluster")

The hub and its HTTP proxy are run by a non-root user in a rootless container.
The container is managed on the host system by a service in
[systemd](https://systemd.io/) with [podman](https://podman.io/).

Notebooks are launched remotely, on the compute nodes of our HPC cluster.

The allocation of hardware resources for the notebook is done on-demand by
the resource manager [Slurm](https://slurm.schedmd.com/). Users can select the
resources for their notebooks from the JupyterHub interface thanks to the
[JupyterHub MOdular Slurm Spawner](https://github.com/silx-kit/jupyterhub_moss),
which leverages [batchspawner](https://github.com/jupyterhub/batchspawner) to
submit jobs to Slurm on the user's behalf that launch the single-user server.

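To make this wiring concrete, below is a minimal sketch of the relevant part of
a `jupyterhub_config.py`; the partition name, limits and paths are assumptions
for illustration (and the exact partition fields depend on the jupyterhub_moss
version), not our production values.

```python
# Minimal sketch of wiring jupyterhub_moss into jupyterhub_config.py.
# Partition names, limits and paths below are illustrative assumptions.
import batchspawner  # noqa: F401  (imported as in the jupyterhub_moss example config)
import jupyterhub_moss

c = get_config()  # noqa: F821  (injected by JupyterHub when loading the config)

# Registers MOSlurmSpawner and its resource-selection spawn page.
jupyterhub_moss.set_config(c)

# Slurm partitions offered to users on the spawn page (fields vary per version).
c.MOSlurmSpawner.partitions = {
    "batch": {                    # hypothetical partition name
        "max_nprocs": 40,         # cores a user may request
        "max_runtime": 8 * 3600,  # maximum walltime in seconds
        "venv": "/apps/jupyter/venv/bin",  # path holding jupyterhub-singleuser
    },
}
```
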
The main particularity of our setup is that such jobs are not submitted to
Slurm from the host running JupyterHub, but from the login nodes of the HPC
cluster via an SSH connection. This approach has the advantage that the system
running JupyterHub can be very minimal, avoiding the need for local users,
special file-system mounts and the complexity of provisioning a Slurm
installation capable of submitting jobs to the HPC cluster.

## Rootless

JupyterHub is run by a non-root user in a rootless container. Setting up a
rootless container is well described in the [podman rootless
tutorial](https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md).

We use a [system service](host/etc/systemd/system/jupyterhub.service) to
execute `podman` as the non-root user `jupyterhub` (*aka* the JupyterHub operator).
This service relies on a [custom shell script](host/usr/local/bin/jupyterhub-init.sh)
to automatically initialize a new image of the rootless container or start an
existing one.

The container [bind-mounts a few sensitive
files](host/usr/local/bin/jupyterhub-init.sh#L59-L66): configuration files for
JupyterHub, SSL certificates for the web server and SSH keys to connect to the
login nodes. Provisioning these files in the container through bind-mounts
keeps the container images free of secrets and allows updates to the
configuration of the hub to be deployed seamlessly.

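The JupyterHub side of this could look roughly like the sketch below; the mount
paths and the choice of traits are assumptions for illustration, not our actual
layout.

```python
# Sketch: consuming bind-mounted files from jupyterhub_config.py.
# All paths are hypothetical mount points inside the container.
c = get_config()  # noqa: F821  (injected by JupyterHub)

# SSL certificate and key for the HTTPS endpoint served by the proxy.
c.JupyterHub.ssl_cert = "/srv/jupyterhub/ssl/hub.crt"
c.JupyterHub.ssl_key = "/srv/jupyterhub/ssl/hub.key"

# Persistent cookie secret kept outside of the image.
c.JupyterHub.cookie_secret_file = "/srv/jupyterhub/secrets/cookie_secret"
```
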
## Network

[…] have a routable IP address, so they rely on the network interfaces of the
host system. The hub must be able to talk to the notebooks being executed on
the compute nodes in the internal network, as well as serve the HTTPS requests
(through its proxy) from users on the external network. Therefore, ports 8000
(HTTP proxy) and 8081 (REST API) in the [container are forwarded to the host
system](host/usr/local/bin/jupyterhub-init.sh#L53-L57).

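For reference, a hedged sketch of listening addresses consistent with these
ports is shown below; the trait values and the internal IP are placeholders,
since the actual settings live in the init script and `jupyterhub_config.py`.

```python
# Sketch: hub/proxy addresses matching the forwarded ports described above.
# The IP address and URLs are placeholders, not our real values.
c = get_config()  # noqa: F821

# Public entry point of the configurable-http-proxy (forwarded host port 8000).
c.JupyterHub.bind_url = "https://:8000"

# REST API of the hub (forwarded host port 8081, internal network only).
c.JupyterHub.hub_bind_url = "http://:8081"

# Address single-user servers on the compute nodes use to reach the hub API,
# i.e. the host's IP on the internal network.
c.JupyterHub.hub_connect_ip = "10.1.0.10"
```
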
The firewall on the host system blocks all connections through the external
network interface and forwards port 8000 on the internal interface (HTTP proxy)
to port 443 on the external one. This setup makes the web interface of the
hub/notebooks accessible from both the internal and external networks. The REST
API of the hub is only available on port 8081 of the internal network.

## Authentication

User authentication is handled through delegation via the
[OAuth](https://en.wikipedia.org/wiki/OAuth) service of the
[VSC](https://www.vscentrum.be/) accounts used by our users.

We use the [GenericOAuthenticator](https://github.com/jupyterhub/oauthenticator/)
from JupyterHub (a configuration sketch follows the list):

* it carries out a standard OAuth delegation with the VSC account page

* [URLs of the VSC OAuth](container/Dockerfile#L72-L76) are defined in the
  environment of the container

* [OAuth secrets](container/.config/jupyterhub_config.py#L40-L45) are
  defined in JupyterHub's configuration file

* local users beyond the non-root user running JupyterHub are **not needed**

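The sketch below shows what such a `GenericOAuthenticator` block could look
like; the environment variable names, claim name and URLs are illustrative
assumptions rather than the exact values used with the VSC OAuth service.

```python
# Sketch of the OAuth block in jupyterhub_config.py.
# Environment variable names, claim name and URLs are illustrative only.
import os

from oauthenticator.generic import GenericOAuthenticator

c = get_config()  # noqa: F821

c.JupyterHub.authenticator_class = GenericOAuthenticator

# OAuth endpoints taken from the container environment (see the Dockerfile).
c.GenericOAuthenticator.authorize_url = os.environ["OAUTH2_AUTHORIZE_URL"]
c.GenericOAuthenticator.token_url = os.environ["OAUTH2_TOKEN_URL"]
c.GenericOAuthenticator.userdata_url = os.environ["OAUTH2_USERDATA_URL"]

# Client credentials live in the bind-mounted configuration, not in the image.
c.GenericOAuthenticator.client_id = "jupyterhub"      # placeholder
c.GenericOAuthenticator.client_secret = "change-me"   # placeholder
c.GenericOAuthenticator.oauth_callback_url = "https://jupyter.example.org/hub/oauth_callback"

# Field of the userdata response used as the JupyterHub username
# (the trait is named `username_key` on older oauthenticator releases).
c.GenericOAuthenticator.username_claim = "id"
```
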
## Slurm

Integration with Slurm is handled by a custom spawner,
[VSCSlurmSpawner](container/.config/jupyterhub_config.py#L60), based on
[MOSlurmSpawner](https://github.com/silx-kit/jupyterhub_moss).
`VSCSlurmSpawner` allows JupyterHub to generate the user's environment needed
to spawn its single-user server without any local users. All user settings are
taken from `vsc-config`.

+ We modified the [ submission command] ( container/.config/jupyterhub_config.py#L295 )
88
+ to execute ` sbatch ` in the login nodes of the HPC cluster through SSH.
89
+ The login nodes already run Slurm and are the sole systems handling job
90
+ submission in our cluster. Delegating job submission to them avoids having to
91
+ install and configure Slurm in the container running JupyterHub. The hub
92
+ environment is passed over SSH with a strict control over the variables that
93
+ are [ sent] ( container/.ssh/config ) and [ accepted] ( slurm_login/etc/ssh/sshd_config )
94
+ on both ends.
95
+
96
+ The SSH connection is established by the non-root user running JupyterHub (the
97
+ hub container does not have other local users). This jupyterhub user has
98
+ special ` sudo ` permissions on the login nodes to submit jobs to Slurm as other
99
+ users. The specific group of users and list of commands allowed to the
100
+ jupyterhub user are defined in the [ sudoers file] ( slurm_login/etc/sudoers ) .
101
+
102
+ Single-user server spawn process:
103
+
104
+ 1 . user selects computational resources for the notebook in the
105
+ [ web interface of the hub] ( https://github.com/silx-kit/jupyterhub_moss )
106
+
107
+ 2 . ` VSCSlurmSpawner ` generates environment for the user without any local users
108
+ in the system of the hub
109
+
110
+ 3 . jupyterhub user connects to login node with SSH, environment is passed
111
+ through the wire
112
+
113
+ 4 . jupyterhub user submits new job to Slurm cluster as target user keeping the
114
+ hub environment
115
+
116
+ 5 . single-user server job fully [ resets the environment and
117
+ re-generates] ( container/.config/jupyterhub_config.py#L264-L285 ) specific
118
+ environment variables for the single-user server
0 commit comments