Commit 15bd4c0

Update block PV & persistent rootfs docs+examples
2 parents 96a228e + c54a172

6 files changed: +410 −11 lines

docs/volumes.md

+38 −1

@@ -457,7 +457,44 @@ VirtletCloudInitUserData: |

See also [block PV examples](../examples#using-block-pvs).

## Using block PVs for persistent root filesystem

If a persistent block volume is specified for a pod and listed in the
container's `volumeDevices` with a `devicePath` of `/`:
```yaml
volumeDevices:
- devicePath: /
  name: testpvc
```
the corresponding PV is used as a persistent root filesystem for the
pod. The persistent root filesystem is reused as long as the SHA256
hash of the VM image doesn't change; when the hash changes, the PV is
overwritten with the contents of the new image. Internally, Virtlet
uses sector 0 of the block device to store persistent root filesystem
metadata, and the block device visible inside the VM uses the sectors
starting from sector 1. Overall, the following algorithm is used:
1. The block device is checked for the presence of a Virtlet header.
2. If there's no Virtlet header, a new header is written to sector 0
   and the rest of the device is overwritten with the contents of the
   image.
3. If the header contains a future persistent root filesystem metadata
   version number, an error is logged and container creation fails.
4. If the header contains a mismatching image SHA256 hash, a new
   header is written to sector 0 and the rest of the device is
   overwritten with the contents of the image.

Unless this algorithm fails on step 3, the VM is booted using the
block PV, starting from sector 1, as its boot device.
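
As a quick illustration (not from the original docs): assuming the PV
is backed by a device such as `/dev/rootdev` from the examples, you
can dump the first sectors on the node after the first boot; sector 0
holds Virtlet's metadata header rather than the start of the VM image,
while the guest-visible data begins at sector 1:
```bash
# Hypothetical spot check; the header layout is internal to Virtlet.
# Sector 0 (512 bytes) holds the persistent rootfs metadata:
docker exec kube-node-1 dd if=/dev/rootdev bs=512 count=1 2>/dev/null | od -A x -t x1z | head -n 4
# The image contents as seen by the VM start at sector 1:
docker exec kube-node-1 dd if=/dev/rootdev bs=512 skip=1 count=1 2>/dev/null | od -A x -t x1z | head -n 4
```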

*IMPORTANT NOTE:* if a persistent root filesystem is used, cloud-init
based network setup is disabled for the VM. This is done because some
cloud-init implementations only apply the cloud-init network
configuration once, but the IP address given to the VM may change if
the persistent root filesystem is reused by another pod.

See also [block PV examples](../examples#using-the-persistent-root-filesystem).

## Consuming other types of Kubernetes volumes

examples/README.md

+212 −5

@@ -36,25 +36,25 @@ virtletctl ssh fedora@fedora-vm -- -i examples/vmkey [command...]
Kubernetes using `kubeadm` on it.

You can create the cluster like this:
```bash
kubectl create -f k8s.yaml
```

Watch the progress of the cluster setup via the VM console:
```bash
kubectl attach -it k8s-0
```

After it's complete you can log into the master node:

```bash
virtletctl ssh root@k8s-0 -- -i examples/vmkey
```

There you can wait a bit for the k8s nodes and pods to become ready.
You can list them using the following commands inside the VM:

```bash
kubectl get nodes -w
# Press Ctrl-C when all 3 nodes are present and Ready
kubectl get pods --all-namespaces -o wide -w
```
@@ -63,7 +63,7 @@ kubectl get pods --all-namespaces -o wide -w

You can then deploy and test nginx on the inner cluster:

```bash
kubectl run nginx --image=nginx --expose --port 80
kubectl get pods -w
# Press Ctrl-C when the pod is ready
```
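
To quickly verify that nginx is serving traffic (a hypothetical extra
check, not part of the original walkthrough; the `nginx` service name
comes from the `--expose` flag above):
```bash
# Run a one-off pod that fetches the nginx front page and is removed afterwards.
kubectl run curl-test --image=busybox --rm -it --restart=Never -- \
  wget -qO- http://nginx
```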
@@ -74,3 +74,210 @@

After that you can follow
[the instructions](../deploy/real-cluster.md) to install Virtlet on
the cluster if you want, but note that you'll have to disable KVM
because nested virtualization is not yet supported by Virtlet.

# Using local block PVs

To use the block PV examples, you need to enable the `BlockVolume`
[feature gate](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/)
for your Kubernetes cluster components. When using
[kubeadm-dind-cluster](https://github.com/Mirantis/kubeadm-dind-cluster)
for testing, you can use this command to start the cluster with
`BlockVolume` and Ceph support:
```bash
FEATURE_GATES="BlockVolume=true" \
KUBELET_FEATURE_GATES="BlockVolume=true" \
ENABLE_CEPH=1 \
./dind-cluster-v1.11.sh up
```

[ubuntu-vm-local-block-pv.yaml](ubuntu-vm-local-block-pv.yaml)
demonstrates the use of local block volumes. For the sake of
simplicity, it uses a file named `/var/lib/virtlet/looptest` instead
of a real block device, but from the user perspective the usage is
the same, except that in most real-world use cases a `/dev/...` path
must be specified instead of `/var/lib/virtlet/looptest`. The path is
chosen to be under `/var/lib/virtlet` because this directory is
mounted into the Virtlet pod by default, and Virtlet must have access
to the file or block device specified for the block PV.
First, you need to create the file to be used for the contents of the
local block PV:
```bash
docker exec kube-node-1 dd if=/dev/zero of=/var/lib/virtlet/looptest bs=1M count=1000
docker exec kube-node-1 mkfs.ext4 /var/lib/virtlet/looptest
```
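
Optionally (this check isn't part of the original instructions), you
can confirm that the filesystem was created; `blkid` works on plain
files, too:
```bash
# Should report TYPE="ext4" for the loop-backed file.
docker exec kube-node-1 blkid /var/lib/virtlet/looptest
```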

Let's create the PV, PVC and the pod that uses them:
```bash
kubectl apply -f examples/ubuntu-vm-local-block-pv.yaml
```
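
You can watch the pod until it reaches the `Running` state (an
optional step, not in the original text):
```bash
kubectl get pod ubuntu-vm -w
# Press Ctrl-C when the pod is Running
```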

After the VM boots, we can log into it and verify that the PV is
indeed mounted:

```console
$ virtletctl ssh ubuntu@ubuntu-vm -- -i examples/vmkey
...
ubuntu@ubuntu-vm:~$ sudo touch /mnt/foo
ubuntu@ubuntu-vm:~$ ls -l /mnt
total 16
-rw-r--r-- 1 root root     0 Oct  1 17:27 foo
drwx------ 2 root root 16384 Oct  1 14:41 lost+found
$ exit
```

Then we can delete and re-create the pod:
```bash
kubectl delete pod ubuntu-vm
# wait till the pod disappears
kubectl get pod -w
kubectl apply -f examples/ubuntu-vm-local-block-pv.yaml
```

And, after the VM boots, log in again to verify that the file `foo`
is still there:
```console
$ virtletctl ssh ubuntu@ubuntu-vm -- -i examples/vmkey
...
ubuntu@ubuntu-vm:~$ ls -l /mnt
total 16
-rw-r--r-- 1 root root     0 Oct  1 17:27 foo
drwx------ 2 root root 16384 Oct  1 14:41 lost+found
$ exit
```

# Using Ceph block PVs

For the Ceph examples you'll also need to start a Ceph test container
(the `--privileged` flag and the `-v` mounts of `/sys/bus` and `/dev`
are needed for `rbd map` to work from within the `ceph_cluster`
container; they're not needed for the persistent root filesystem
example in the next section):
```bash
MON_IP=$(docker exec kube-master route | grep default | awk '{print $2}')
CEPH_PUBLIC_NETWORK=${MON_IP}/16
docker run -d --net=host -e MON_IP=${MON_IP} \
       --privileged \
       -v /dev:/dev \
       -v /sys/bus:/sys/bus \
       -e CEPH_PUBLIC_NETWORK=${CEPH_PUBLIC_NETWORK} \
       -e CEPH_DEMO_UID=foo \
       -e CEPH_DEMO_ACCESS_KEY=foo \
       -e CEPH_DEMO_SECRET_KEY=foo \
       -e CEPH_DEMO_BUCKET=foo \
       -e DEMO_DAEMONS="osd mds" \
       --name ceph_cluster docker.io/ceph/daemon demo
# wait for the cluster to start
docker logs -f ceph_cluster
```
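
Once the demo daemons are up, you can optionally confirm the cluster
state (not part of the original steps):
```bash
# HEALTH_OK (or HEALTH_WARN for a single-node demo cluster) means it's ready.
docker exec ceph_cluster ceph -s
```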

Create a pool there:
```bash
docker exec ceph_cluster ceph osd pool create kube 8 8
```

Create an image for testing (it's important to use `rbd create` with
the `layering` feature here so as not to get a feature mismatch error
later when creating a pod):
```bash
docker exec ceph_cluster rbd create tstimg \
       --size 1G --pool kube --image-feature layering
```
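
You can optionally verify that only the `layering` feature is enabled
on the image (this check isn't in the original instructions):
```bash
# The "features:" line should list just "layering".
docker exec ceph_cluster rbd info tstimg --pool=kube
```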

Set up a Kubernetes secret for use with Ceph:
```bash
admin_secret="$(docker exec ceph_cluster ceph auth get-key client.admin)"
kubectl create secret generic ceph-admin \
        --type="kubernetes.io/rbd" \
        --from-literal=key="${admin_secret}"
```
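
Optionally, check that the secret has the expected type (not part of
the original steps):
```bash
# Should print: kubernetes.io/rbd
kubectl get secret ceph-admin -o jsonpath='{.type}'; echo
```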

To test the block PV, we also need to create a filesystem on the node
(this is not needed for testing the persistent rootfs below).
You may need to load the RBD module on the Docker host to be able to
do this:
```bash
modprobe rbd
```

Then we can map the RBD, create a filesystem on it and unmap it again:
```bash
rbd=$(docker exec ceph_cluster rbd map tstimg --pool=kube)
docker exec kube-node-1 mkfs.ext4 "${rbd}"
docker exec ceph_cluster rbd unmap tstimg --pool=kube
```

After that, you can create the block PV, PVC and the pod that uses
them, and verify that the PV is mounted into `ubuntu-vm` the same way
as in the previous section:

```bash
kubectl apply -f examples/ubuntu-vm-rbd-block-pv.yaml
```

# Using the persistent root filesystem

[cirros-vm-persistent-rootfs-local.yaml](cirros-vm-persistent-rootfs-local.yaml)
demonstrates the use of a persistent root filesystem. The most
important part is the `volumeDevices` section in the pod's container
definition:
```yaml
volumeDevices:
- devicePath: /
  name: testpvc
```

Unlike the local PV example above, we can't use a file instead of a
real block device, as Virtlet uses the device mapper internally,
which can't work with plain files. We don't need to run `mkfs.ext4`
this time though, as Virtlet will copy the VM image over the contents
of the device. Let's create a loop device to be used for the PV:

```bash
docker exec kube-node-1 dd if=/dev/zero of=/rawtest bs=1M count=1000
docker exec kube-node-1 /bin/bash -c 'ln -s $(losetup -f /rawtest --show) /dev/rootdev'
```
We use a symbolic link to the actual block device here so we don't
need to edit the example yaml.
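
To see which loop device the symlink points at (an optional check,
not in the original instructions):
```bash
# Typically prints something like: /dev/rootdev -> /dev/loop0
docker exec kube-node-1 ls -l /dev/rootdev
```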

After that, we create the PV, PVC and the pod:
```bash
kubectl apply -f examples/cirros-vm-persistent-rootfs-local.yaml
```

After the VM boots, we can log into it and create a file on the root
filesystem:

```console
$ virtletctl ssh cirros@cirros-vm-p -- -i examples/vmkey
...
$ echo foo >bar.txt
```

Then we delete the pod, wait for it to disappear, and then re-create it:
```bash
kubectl delete pod cirros-vm-p
# wait till the pod disappears
kubectl get pod -w
kubectl apply -f examples/cirros-vm-persistent-rootfs-local.yaml
```

After logging into the new VM pod, we see that the file is still
there:
```console
$ virtletctl ssh cirros@cirros-vm-p -- -i examples/vmkey
...
$ cat bar.txt
foo
```

[cirros-vm-persistent-rootfs-rbd.yaml](cirros-vm-persistent-rootfs-rbd.yaml)
demonstrates the use of a persistent root filesystem on a Ceph RBD.
To use it, you need to set up a test Ceph cluster and create a test
image as described in the [previous section](#using-ceph-block-pvs),
except that you don't have to run the Ceph test container as
`--privileged`, don't have to mount `/dev` and `/sys/bus` into the
Ceph test container, and don't have to map the RBD and run
`mkfs.ext4` on it. You can create the PV, PVC and the pod for the
example using this command:
```bash
kubectl apply -f examples/cirros-vm-persistent-rootfs-rbd.yaml
```

After that, you can verify that the persistent rootfs indeed works
using the same approach as with local PVs.

examples/cirros-vm-persistent-rootfs.yaml → examples/cirros-vm-persistent-rootfs-local.yaml

+3 −2

```diff
@@ -26,8 +26,8 @@ spec:
   volumeMode: Block
   local:
     # set up with:
-    # docker exec kube-node-1 /bin/bash -c 'dd if=/dev/zero of=/rawtest bs=1M count=1000 && losetup -f /rawtest --show'
-    path: /dev/loop0
+    # docker exec kube-node-1 /bin/bash -c 'dd if=/dev/zero of=/rawtest bs=1M count=1000 && ln -s $(losetup -f /rawtest --show) /dev/rootdev'
+    path: /dev/rootdev
   claimRef:
     name: local-block-pvc
     namespace: default
@@ -72,6 +72,7 @@ spec:
     tty: true
     stdin: true
     volumeDevices:
+    # Use persistent root
     - devicePath: /
       name: testpvc
   volumes:
```
examples/cirros-vm-persistent-rootfs-rbd.yaml (new file)

+77

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-block-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Mi
  claimRef:
    name: ceph-block-pvc
    namespace: default
  persistentVolumeReclaimPolicy: Delete
  volumeMode: Block
  rbd:
    image: tstimg
    monitors:
    - 10.192.0.1:6789
    pool: kube
    secretRef:
      name: ceph-admin
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-block-pvc
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Block
  # storageClassName: ceph-testnew
  resources:
    requests:
      storage: 10Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: cirros-vm-p
  annotations:
    kubernetes.io/target-runtime: virtlet.cloud
    # CirrOS doesn't load nocloud data from SCSI CD-ROM for some reason
    VirtletDiskDriver: virtio
    VirtletSSHKeys: |
      ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCaJEcFDXEK2ZbX0ZLS1EIYFZRbDAcRfuVjpstSc0De8+sV1aiu+dePxdkuDRwqFtCyk6dEZkssjOkBXtri00MECLkir6FcH3kKOJtbJ6vy3uaJc9w1ERo+wyl6SkAh/+JTJkp7QRXj8oylW5E20LsbnA/dIwWzAF51PPwF7A7FtNg9DnwPqMkxFo1Th/buOMKbP5ZA1mmNNtmzbMpMfJATvVyiv3ccsSJKOiyQr6UG+j7sc/7jMVz5Xk34Vd0l8GwcB0334MchHckmqDB142h/NCWTr8oLakDNvkfC1YneAfAO41hDkUbxPtVBG5M/o7P4fxoqiHEX+ZLfRxDtHB53 me@localhost
    # VirtletCloudInitUserData: |
    #   mounts:
    #   - ["/dev/testpvc", "/mnt"]
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: extraRuntime
            operator: In
            values:
            - virtlet
  # This is the number of seconds Virtlet gives the VM to shut down cleanly.
  # The default value of 30 seconds is ok for containers but probably too
  # low for VM, so overriding it here is strongly advised.
  terminationGracePeriodSeconds: 120
  containers:
  - name: cirros-vm
    image: virtlet.cloud/download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img
    imagePullPolicy: IfNotPresent
    # tty and stdin required for `kubectl attach -t` to work
    tty: true
    stdin: true
    volumeDevices:
    # Use persistent rootfs on testpvc
    - devicePath: /
      name: testpvc
  volumes:
  - name: testpvc
    persistentVolumeClaim:
      claimName: ceph-block-pvc
```
