Commit f9b12db

doc: update better VM revert proposal
This proposal is 11 years old and the architecture of xapi has changed enough that the proposal needs changing. In particular, SMAPIv3 has been introduced, which means that there is now a xapi storage interface shared between the two SMAPI backends, and that the fallback logic that was done at the SMAPI level must now be done at the xapi level. I've also elected to further restrict the fallback behaviour to VM reverts; VDI revert will require backend support. This reduces the chances of failures during revert resulting in wrong fields in the database.

Signed-off-by: Pau Ruiz Safont <[email protected]>
1 parent 55b52a9 commit f9b12db

1 file changed

+92
-50
lines changed


doc/content/design/snapshot-revert.md

Lines changed: 92 additions & 50 deletions
@@ -1,85 +1,128 @@
11
---
2-
title: Improving snapshot revert behaviour
2+
title: Better VM revert
33
layout: default
44
design_doc: true
5-
revision: 1
5+
revision: 2
66
status: confirmed
77
---
88

9-
Currently there is a XenAPI `VM.revert` which reverts a "VM" to the state it
10-
was in when a VM-level snapshot was taken. There is no `VDI.revert` so
11-
`VM.revert` uses `VDI.clone` to change the state of the disks.
9+
## Overview
1210

13-
The use of `VDI.clone` has the side-effect of changing VDI refs and uuids.
14-
This causes the following problems:
11+
XenAPI allows users to roll back the state of a VM to a previous state, which is
12+
stored in a snapshot, using the call `VM.revert`. Because there is no
13+
`VDI.revert` call, `VM.revert` uses `VDI.clone` on the snapshot to duplicate
14+
the contents of that disk and then uses the new clone as the storage for the VM.
1515

16-
- It is difficult for clients
17-
such as [Apache CloudStack](http://cloudstack.apache.org) to keep track
18-
of the disks it is actively managing
19-
- VDI snapshot metadata (`VDI.snapshot_of` et al) has to be carefully
20-
fixed up since all the old refs are now dangling
16+
Because `VDI.clone` creates new VDI refs and uuids, some problematic
17+
behaviours arise:
2118

22-
We will fix these problems by:
19+
- Clients such as
20+
[Apache CloudStack](http://cloudstack.apache.org) need to include complex
21+
logic to keep track of the disks they are actively managing
22+
- Because the snapshot is cloned and the original VDI is deleted, VDI
23+
references to the VDI become invalid, like `VDI.snapshot_of`. This means
24+
that the database has to be combed through to change these references.
25+
Because the database doesn't support transactions this operation is not atomic
26+
and can produce inconsistent database states.
2327

24-
1. adding a `VDI.revert` to the SMAPIv2 and calling this from `VM.revert`
25-
2. defining a new SMAPIv1 operation `vdi_revert` and a corresponding capability
26-
`VDI_REVERT`
27-
3. the Xapi implementation of `VDI.revert` will first try the `vdi_revert`,
28-
and fall back to `VDI.clone` if that fails
29-
4. implement `vdi_revert` for common storage types, including File and LVM-based
30-
SRs.
28+
Additionally, some filesystems support snapshots natively; for these, doing the clone
29+
procedure is much costlier than allowing the filesystem to do the revert.
3130

32-
XenAPI changes
33-
--------------
31+
We will fix these problems by:
3432

35-
We will add the function `VDI.revert` with arguments:
33+
- introducing the new feature `VDI_REVERT` in the SM interface (`xapi_smint`). This
34+
allows backends to advertise that they support the new functionality
35+
- defining a new storage operation `VDI.revert` in storage_interface, which is
36+
gated by the feature `VDI_REVERT`
37+
- proxying the storage operation to SMAPIv3 and SMAPIv1 backends accordingly
38+
- adding `VDI.revert` to xapi_vdi which will call the storage operation if the
39+
backend advertises it, and fall back to the previous method that uses
40+
`VDI.clone` if it doesn't advertise it, or issues are detected at runtime
41+
that prevent it
42+
- changing the Xapi implementation of `VM.revert` to use `VDI.revert`
43+
- implementing `vdi_revert` for common storage types, including File and LVM-based
44+
SRs
45+
- adding unit and quick tests to xapi to test that `VM.revert` does not regress
46+
47+
## Current VM.revert behaviour
48+
49+
The code that reverts the state of storage is located in
50+
[update_vifs_vbds_vgpus_and_vusbs](https://github.com/xapi-project/xen-api/blob/bc0ba4e9dc8dc4b85b7cbdbf3e0ba5915b4ad76d/ocaml/xapi/xapi_vm_snapshot.ml#L211).
51+
It performs the following steps:
52+
1. destroys the VM's VBDs (both disks and CDs)
53+
2. destroys the VM's VDIs (disks only), referenced by the snapshot's VDIs using
54+
`snapshot_of`; as well as the suspend VDI.
55+
3. clones the snapshot's VDIs (disks and CDs); if one clone fails, none remain.
56+
4. searches the database for all `snapshot_of` references to the deleted VDIs
57+
and replaces them with the references of the newly cloned snapshots.
58+
5. clones the snapshot's resume VDI
59+
6. creates copies of all the cloned VBDs and associates them with the cloned VDIs
60+
7. assigns the new resume VDI to the VM
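
The effect of steps 2 to 4 on VDI references can be illustrated with a minimal in-memory sketch. It is not xapi code: the `VDI` dataclass, the dictionary standing in for the database, and the function names are all hypothetical, and VBDs and suspend VDIs are left out. It only shows why the reverted disk ends up with a new reference and why `snapshot_of` has to be fixed up afterwards.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in for database references; real refs look different.
_ref_counter = itertools.count(1)


@dataclass
class VDI:
    contents: str
    snapshot_of: Optional[str] = None  # ref of the VDI this is a snapshot of


def clone(db: Dict[str, VDI], ref: str) -> str:
    """Create a new VDI with the same contents and a brand new reference."""
    new_ref = f"OpaqueRef:{next(_ref_counter)}"
    db[new_ref] = VDI(contents=db[ref].contents)
    return new_ref


def revert_by_clone(db: Dict[str, VDI], disk_ref: str, snapshot_ref: str) -> str:
    """Model of the clone-based revert for a single disk."""
    del db[disk_ref]                   # step 2: destroy the VM's VDI
    new_ref = clone(db, snapshot_ref)  # step 3: clone the snapshot's VDI
    for vdi in db.values():            # step 4: fix dangling snapshot_of refs
        if vdi.snapshot_of == disk_ref:
            vdi.snapshot_of = new_ref
    return new_ref
```

In this model the caller gets back a reference that differs from the one it was managing, which is the behaviour the proposal sets out to avoid.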
61+
62+
## XenAPI design
63+
64+
### API
65+
66+
The function `VDI.revert` will be added, with arguments:
3667

3768
- in: `snapshot: Ref(VDI)`: the snapshot to which we want to revert
3869
- in: `driver_params: Map(String,String)`: optional extra parameters
39-
- out: `Ref(VDI)` the new VDI
70+
- out: `Ref(VDI)` reference to the new VDI with the reverted contents
71+
72+
The function will extract the reference of the VDI whose contents need to be
73+
replaced, which is the snapshot's `snapshot_of` field. It will then call the
74+
storage function `VDI.revert` to have its contents replaced with the
75+
snapshot's. The VDI object will not be modified, and the reference returned is
76+
the VDI's original reference.
77+
If anything impedes the successful completion of an in-place revert, such as the SM
78+
backend not advertising the feature `VDI_REVERT`, not implementing the
79+
feature, or the `snapshot_of` reference being invalid, an exception will be
80+
raised.
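
The contract described above can be sketched as follows. This is an illustrative model only: the dictionary-based database, the `sm_features` set, and the `VdiRevertError` type are assumptions made for the example, since the proposal does not name the exception that is raised.

```python
from typing import Dict, Set


class VdiRevertError(Exception):
    """Hypothetical error type; the proposal only says an exception is raised."""


def vdi_revert(db: Dict[str, dict], sm_features: Set[str], snapshot_ref: str) -> str:
    """Revert in place and return the original VDI's reference."""
    snapshot = db[snapshot_ref]
    target_ref = snapshot.get("snapshot_of")
    # The snapshot must point at a VDI that still exists.
    if target_ref is None or target_ref not in db:
        raise VdiRevertError("snapshot_of does not reference a valid VDI")
    # No fallback here: the backend has to advertise the feature.
    if "VDI_REVERT" not in sm_features:
        raise VdiRevertError("backend does not advertise VDI_REVERT")
    # Stand-in for the storage-layer call that replaces the contents.
    db[target_ref]["contents"] = snapshot["contents"]
    return target_ref
```

The returned reference equals the snapshot's `snapshot_of`, so a client tracking the disk by reference does not need to update anything after the call.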
4081

41-
The function will look up the VDI which this is a `snapshot_of`, and change
42-
the VDI to have the same contents as the snapshot. The snapshot will not be
43-
modified. If the implementation is able to revert in-place, then the reference
44-
returned will be the VDI this is a `snapshot_of`; otherwise it is a reference
45-
to a fresh VDI (created by the `VDI.clone` fallback path)
82+
### Xapi Storage
4683

47-
References:
84+
The function `VDI.revert` is added, with the following arguments:
4885

49-
- @johnelse's [pull request](https://github.com/xapi-project/xen-api/pull/1963)
50-
which implements this
86+
- in: `dbg`: the task identifier, useful for tracing
87+
- in: `sr`: SR where the new VDI must be created
88+
- in: `snapshot_info`: metadata of the snapshot, the contents of which must be
89+
made available in the VDI indicated by the `snapshot_of` field
5190

52-
SMAPIv1 changes
53-
---------------
91+
#### SMAPIv1
5492

55-
We will define the function `vdi_revert` with arguments:
93+
The function `vdi_revert` is defined with the following arguments:
5694

5795
- in: `sr_uuid`: the UUID of the SR containing both the VDI and the snapshot
58-
- in: `vdi_uuid`: the UUID of the snapshot whose contents should be duplicated
59-
- in: `target_uuid`: the UUID of the target whose contents should be replaced
96+
- in: `vdi_uuid`: the UUID of the snapshot whose contents must be duplicated
97+
- in: `target_uuid`: the UUID of the target whose contents must be replaced
6098

6199
The function will replace the contents of the `target_uuid` VDI with the
62100
contents of the `vdi_uuid` VDI without changing the identity of the target
63101
(i.e. name-label, uuid and location are guaranteed to remain the same).
64102
The `vdi_uuid` is preserved by this operation. The operation is obviously
65103
idempotent.
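
To make the identity and idempotence guarantees concrete, here is a minimal sketch for a toy file-based SR. It assumes each VDI is a flat file named `<uuid>.img` inside the SR directory, which is a simplification chosen for the example and not how the real SM drivers lay out VHD trees. The function signature mirrors the arguments listed above, with `sr_path` standing in for `sr_uuid`.

```python
import os
import shutil
import tempfile


def vdi_revert(sr_path: str, vdi_uuid: str, target_uuid: str) -> None:
    """Replace the target's contents with the snapshot's, keeping its name."""
    snapshot = os.path.join(sr_path, f"{vdi_uuid}.img")
    target = os.path.join(sr_path, f"{target_uuid}.img")
    # Copy to a temporary file first so the target is never half-written.
    fd, tmp = tempfile.mkstemp(dir=sr_path)
    os.close(fd)
    try:
        shutil.copyfile(snapshot, tmp)
        # Swap the new contents in under the target's existing name.
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Running the function twice leaves the SR in the same state as running it once, and the snapshot file is only ever read, never written.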
66104

67-
Xapi changes
68-
------------
105+
#### SMAPIv3
69106

70-
Xapi will
107+
In an analogous way to SMAPIv1, the function `Volume.revert` is defined with the
108+
following arguments:
71109

72-
- use `VDI.revert` in the `VM.revert` code-path
73-
- expose a new `xe vdi-revert` CLI command
74-
- implement the `VDI.revert` by calling the SMAPIv1 function and falling back
75-
to `VDI.clone` if a `Not_implemented` exception is thrown
110+
- in: `dbg`: the task identifier, useful for tracing
111+
- in: `sr`: the UUID of the SR containing both the VDI and the snapshot
112+
- in: `snapshot`: the UUID of the snapshot whose contents must be duplicated
113+
- in: `vdi`: the UUID of the VDI whose contents must be replaced
76114

77-
References:
115+
### Xapi
78116

79-
- @johnelse's [pull request](https://github.com/xapi-project/xen-api/pull/1963)
117+
- add the capability `VDI_REVERT` so backends can advertise it
118+
- use `VDI.revert` in `VM.revert` after the VDIs have been destroyed, and
119+
before the snapshot's VDIs have been cloned. If any of the reverts fail
120+
because a `Not_implemented` exception is thrown, or the `snapshot_of`
121+
contains an invalid reference, add the affected VDIs to the list to be cloned
122+
and recovered, using the existing method
123+
- expose a new `xe vdi-revert` CLI command
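
The fallback in `VM.revert` can be sketched as a partition of the snapshot's VDIs into those reverted in place and those that go through the existing clone path. The sketch below is a simplified model, not the xapi implementation: the exception classes and the two callables passed in are hypothetical placeholders for `VDI.revert` and for the existing clone-and-fix-up logic, and it does not model the ordering relative to VDI destruction.

```python
from typing import Callable, Dict, Iterable, List


class NotImplementedByBackend(Exception):
    """Hypothetical stand-in for a `Not_implemented` error from the backend."""


class InvalidSnapshotOf(Exception):
    """Hypothetical stand-in for an invalid `snapshot_of` reference."""


def revert_disks(
    snapshot_refs: Iterable[str],
    revert_in_place: Callable[[str], str],
    clone_and_fix_up: Callable[[str], str],
) -> Dict[str, str]:
    """Map each snapshot ref to the ref of the VDI holding its contents."""
    result: Dict[str, str] = {}
    to_clone: List[str] = []
    for snap in snapshot_refs:
        try:
            # Preferred path: the VDI keeps its reference.
            result[snap] = revert_in_place(snap)
        except (NotImplementedByBackend, InvalidSnapshotOf):
            # Defer to the existing clone-based method.
            to_clone.append(snap)
    for snap in to_clone:
        result[snap] = clone_and_fix_up(snap)
    return result
```

Only the two listed failure modes trigger the fallback in this sketch; any other exception propagates to the caller.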
80124

81-
SM changes
82-
----------
125+
## SM changes
83126

84127
We will modify
85128

@@ -92,8 +135,7 @@ We will modify
92135
snapshot/clone machinery
93136
- LVHDoISCSISR.py and LVHDoHBASR.py to advertise the `VDI_REVERT` capability
94137

95-
Prototype code
96-
==============
138+
# Prototype code from the previous proposal
97139

98140
Prototype code exists here:
99141
