@@ -8,9 +8,10 @@ status: confirmed
88
99## Overview
1010
11- Currently there is a XenAPI ` VM.revert ` which reverts a VM to the state it was
12- in when a VM-level snapshot was taken. Because there is no ` VDI.revert ` ,
13- ` VM.revert ` uses ` VDI.clone ` on the snapshot to change the state of the disks.
11+ XenAPI allows users to roll back a VM to a previous state, which is
12+ stored in a snapshot, using the call ` VM.revert ` . Because there is no
13+ ` VDI.revert ` call, ` VM.revert ` uses ` VDI.clone ` on the snapshot to duplicate
14+ the contents of that disk and then uses the new clone as the storage for the VM.
1415
1516Because ` VDI.clone ` creates new VDI refs and uuids, some problematic
1617behaviours arise:
@@ -20,21 +21,44 @@ behaviours arise:
2021 logic to keep track of the disks they are actively managing
2122- Because the snapshot is cloned and the original vdi is deleted, VDI
2223 references to the VDI become invalid, like ` VDI.snapshot_of ` . This means
23- there the database has to be combed through to change these references.
24- Because the database doesn't support transaction this operation is not atomic
24+ that the database has to be combed through to change these references.
25+ Because the database doesn't support transactions this operation is not atomic
2526 and can produce inconsistent database states.
2627
28+ Additionally, some filesystems support snapshots natively; on these, the clone
29+ procedure is much costlier than letting the filesystem do the revert.
30+
2731We will fix these problems by:
2832
29- - defining a new storage operation ` VDI.revert ` in storage_interface and
30- calling this from ` VM.revert `
33+ - introducing the new feature ` VDI_REVERT ` in SM interface (` xapi_smint ` ). This
34+ allows backends to advertise that they support the new functionality
35+ - defining a new storage operation ` VDI.revert ` in storage_interface, which is
36+ gated by the feature ` VDI_REVERT `
3137- proxying the storage operation to SMAPIv3 and SMAPIv1 backends accordingly
32- - changing the Xapi implementation of ` VM.revert ` to first try the
33- ` VDI.revert ` , and fall back to ` VDI.clone ` if that fails
38+ - adding ` VDI.revert ` to xapi_vdi, which will call the storage operation if the
39+ backend advertises it, and fall back to the previous method that uses
40+ ` VDI.clone ` if it doesn't advertise it, or issues are detected at runtime
41+ that prevent it
42+ - changing the Xapi implementation of ` VM.revert ` to use ` VDI.revert `
3543- implementing ` vdi_revert ` for common storage types, including File and LVM-based
3544 SRs
3645- adding unit and quick tests to xapi to test that ` VM.revert ` does not regress
3746
47+ ## Current VM.revert behaviour
48+
49+ The code that reverts the state of storage is located in
50+ [ update_vifs_vbds_vgpus_and_vusbs] ( https://github.com/xapi-project/xen-api/blob/bc0ba4e9dc8dc4b85b7cbdbf3e0ba5915b4ad76d/ocaml/xapi/xapi_vm_snapshot.ml#L211 ) .
51+ It performs the following steps:
52+ 1. destroys the VM's VBDs (both disks and CDs)
53+ 2. destroys the VM's VDIs (disks only), referenced by the snapshot's VDIs using
54+ ` snapshot_of ` , as well as the suspend VDI
55+ 3. clones the snapshot's VDIs (disks and CDs); if one clone fails, none remain
56+ 4. searches the database for all ` snapshot_of ` references to the deleted VDIs
57+ and replaces them with the references of the newly cloned snapshots
58+ 5. clones the snapshot's resume VDI
59+ 6. creates copies of all the cloned VBDs and associates them with the cloned VDIs
60+ 7. assigns the new resume VDI to the VM
61+
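The VDI-related steps above (2 to 4) can be illustrated with a minimal in-memory sketch. It is not the xapi implementation: the dictionary `db`, the record fields, and the helpers `new_ref` and `clone_based_revert` are hypothetical stand-ins chosen for illustration. The point it shows is that the cloned disk receives a fresh reference, so every `snapshot_of` pointing at the deleted VDI has to be rewritten afterwards.

```python
import itertools

# Hypothetical reference generator; real references are allocated by xapi.
_refs = itertools.count(1)


def new_ref():
    return f"OpaqueRef:{next(_refs)}"


def clone_based_revert(db, snapshot_refs):
    """Revert by cloning. `db` maps a VDI ref to a record (a dict) with the
    keys 'snapshot_of' and 'contents'. Returns a map old ref -> new ref."""
    replaced = {}
    for snap_ref in snapshot_refs:
        old_ref = db[snap_ref]["snapshot_of"]
        if old_ref is None:
            continue  # not a snapshot of any disk, nothing to revert
        db.pop(old_ref, None)  # step 2: destroy the VM's current VDI
        fresh_ref = new_ref()  # step 3: the clone gets a new reference
        db[fresh_ref] = {
            "snapshot_of": None,
            "contents": db[snap_ref]["contents"],
        }
        replaced[old_ref] = fresh_ref
    # Step 4: comb through every record and repoint stale references.
    for record in db.values():
        if record["snapshot_of"] in replaced:
            record["snapshot_of"] = replaced[record["snapshot_of"]]
    return replaced
```

In this sketch the final loop touches every record in the database, one write at a time, which is the non-atomic rewrite that the overview identifies as a source of inconsistent states.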
3862## XenAPI design
3963
4064### API
@@ -43,16 +67,17 @@ The function `VDI.revert` will be added, with arguments:
4367
4468- in: ` snapshot: Ref(VDI) ` : the snapshot to which we want to revert
4569- in: ` driver_params: Map(String,String) ` : optional extra parameters
46- - out: ` Ref(VDI) ` the new VDI
47-
48- The function will look up the VDI which this is a ` snapshot_of ` , check that
49- the value is not invalid nor null, and use that VDI reference to call SM to
50- change the VDI to have the same contents as the snapshot. The snapshot object
51- will not be modified, and the reference returned is the reference in
52- ` snapshot_of ` .
53- If anything impedes the successful finish of an in-place revert, the previous
54- method of using ` VDI.clone ` is used as fallback, and the reference returned
55- refers to the fresh VDI created by the ` VDI.clone ` fallback.
70+ - out: ` Ref(VDI) ` reference to the new VDI with the reverted contents
71+
72+ The function will extract the reference of the VDI whose contents need to be
73+ replaced, which is the snapshot's ` snapshot_of ` field. It will then call the
74+ storage function ` VDI.revert ` to have its contents replaced with the
75+ snapshot's. The VDI object will not be modified, and the reference returned is
76+ the VDI's original reference.
77+ If anything impedes the successful finish of an in-place revert, such as the SM
78+ backend not advertising the feature ` VDI_REVERT ` , not implementing the
79+ feature, or the ` snapshot_of ` reference being invalid, an exception will be
80+ raised.
5681
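A minimal sketch of the behaviour described above, assuming a hypothetical in-memory database and a set of feature strings for the SR. The names `vdi_revert`, `RevertError`, `db` and `sr_features` are illustrative and are not part of XenAPI. The `contents` field stands in for the data on disk.

```python
class RevertError(Exception):
    """Hypothetical error type; the real API would raise its own errors."""


def vdi_revert(db, sr_features, snapshot_ref):
    """Replace the contents of the snapshot's parent VDI in place and
    return the parent's original reference."""
    snapshot = db.get(snapshot_ref)
    if snapshot is None:
        raise RevertError("unknown snapshot reference")
    target_ref = snapshot["snapshot_of"]
    if target_ref not in db:
        raise RevertError("snapshot_of is null or invalid")
    if "VDI_REVERT" not in sr_features:
        raise RevertError("backend does not advertise VDI_REVERT")
    # In-place revert: only the stored data changes, the reference survives.
    db[target_ref]["contents"] = snapshot["contents"]
    return target_ref
```

Because the returned reference is the one already stored in `snapshot_of`, no other database record needs to be rewritten after a successful call.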
5782### Xapi Storage
5883
@@ -62,7 +87,6 @@ The function `VDI.revert` is added, with the following arguments:
6287- in: ` sr ` : SR where the new VDI must be created
6388- in: ` snapshot_info ` : metadata of the snapshot, the contents of which must be
6489 made available in the VDI indicated by the ` snapshot_of ` field
65- - out: ` vdi_info ` : metadata of the resulting VDI
6690
6791#### SMAPIv1
6892
@@ -87,14 +111,16 @@ following arguments:
87111- in: ` sr ` : the UUID of the SR containing both the VDI and the snapshot
88112- in: ` snapshot ` : the UUID of the snapshot whose contents must be duplicated
89113- in: ` vdi ` : the UUID of the VDI whose contents must be replaced
90- -
114+
91115### Xapi
92116
93- - use ` VDI.revert ` in the ` VM.revert ` code-path and fall back to ` VDI.clone ` if
94- a ` Not_implemented ` exception is thrown, or the snapshot_of contains an
95- invalid reference
117+ - add the capability ` VDI_REVERT ` so backends can advertise it
118+ - use ` VDI.revert ` in the ` VM.revert ` code path after the VDIs have been destroyed and
119+ before the snapshot's VDIs have been cloned. If any of the reverts fails
120+ because a ` Not_implemented ` exception is thrown, or the ` snapshot_of `
121+ contains an invalid reference, add the affected VDIs to the list to be cloned
122+ and recovered using the existing method
96123- expose a new ` xe vdi-revert ` CLI command
97- - implement the ` VDI.revert `
98124
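The fallback in the ` VM.revert ` item above can be sketched as follows. This is an illustration under stated assumptions, not xapi code: `revert_disks`, the two exception classes, and the callables `revert_in_place` and `clone` are hypothetical names standing in for the storage call and the existing clone path.

```python
class NotImplementedByBackend(Exception):
    """Stand-in for the Not_implemented exception mentioned above."""


class InvalidReference(Exception):
    """Stand-in for an invalid snapshot_of reference."""


def revert_disks(snapshot_refs, revert_in_place, clone):
    """Try an in-place revert for each snapshot; snapshots whose revert
    fails are queued and handled by the clone-based method afterwards.
    Returns a map snapshot ref -> ref of the VDI holding its contents."""
    results = {}
    to_clone = []
    for snap_ref in snapshot_refs:
        try:
            results[snap_ref] = revert_in_place(snap_ref)
        except (NotImplementedByBackend, InvalidReference):
            to_clone.append(snap_ref)
    for snap_ref in to_clone:
        results[snap_ref] = clone(snap_ref)
    return results
```

Only the two expected failure kinds trigger the fallback in this sketch; any other exception propagates to the caller rather than being masked by a clone.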
99125## SM changes
100126