kvm: fix volume migration across cluster-scope pools #10266

Merged

Conversation

weizhouapache
Member

Description

This PR fixes #10078

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

codecov bot commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 0% with 29 lines in your changes missing coverage. Please review.

Project coverage is 15.14%. Comparing base (2aa2e92) to head (1f1eda9).
Report is 54 commits behind head on 4.19.

Files with missing lines | Patch % | Lines
...torage/motion/StorageSystemDataMotionStrategy.java | 0.00% | 14 Missing ⚠️
.../main/java/com/cloud/storage/MigrationOptions.java | 0.00% | 7 Missing ⚠️
...ud/hypervisor/kvm/storage/KVMStorageProcessor.java | 0.00% | 4 Missing ⚠️
...ervisor/kvm/resource/LibvirtComputingResource.java | 0.00% | 3 Missing ⚠️
...motion/KvmNonManagedStorageDataMotionStrategy.java | 0.00% | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               4.19   #10266   +/-   ##
=========================================
  Coverage     15.14%   15.14%           
+ Complexity    11283    11281    -2     
=========================================
  Files          5408     5408           
  Lines        473822   473843   +21     
  Branches      57825    57827    +2     
=========================================
+ Hits          71762    71769    +7     
- Misses       394037   394049   +12     
- Partials       8023     8025    +2     
Flag | Coverage Δ
uitests | 4.29% <ø> (ø)
unittests | 15.86% <0.00%> (+<0.01%) ⬆️


@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@DaanHoogland DaanHoogland added this to the 4.19.2 milestone Jan 24, 2025
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12202

@Pearl1594
Contributor

@weizhouapache is this ready for review?

@weizhouapache weizhouapache marked this pull request as ready for review February 21, 2025 14:16
@weizhouapache
Member Author

@weizhouapache is this ready for review?

yes @Pearl1594

@Pearl1594
Contributor

@blueorangutan package

@blueorangutan

@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12559

@Pearl1594
Contributor

@blueorangutan test keepEnv

@blueorangutan

@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@JoaoJandre
Contributor

@weizhouapache could you explain a bit about how your patch fixes the problem and which tests you ran to verify it?

@weizhouapache
Member Author

weizhouapache commented Feb 25, 2025

@weizhouapache could you explain a bit about how your patch fixes the problem and which tests you ran to verify it?

@JoaoJandre
The steps to reproduce the issue are described in #10078.

When migrating a volume from a cluster-wide storage pool to a cluster-wide storage pool in another cluster, we should treat it like a migration between local storages, because the source host cannot mount the destination pool even though it is shared (e.g. NFS).

I have verified volume migrations between nfs/nfs, nfs/local, and local/local storage across clusters.
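
The rule described above can be illustrated with a minimal Java sketch. This is not the code from this PR: the class `CrossClusterMigrationSketch`, the `Scope` enum, the `Pool` type, and both method names are hypothetical and exist only for illustration. It assumes that a host can mount a zone-wide pool, can mount a cluster-wide pool only in its own cluster, and cannot mount a host-scoped (local) pool that belongs to another host.

```java
public class CrossClusterMigrationSketch {

    /** Storage pool scopes, simplified for illustration (hypothetical enum). */
    enum Scope { HOST, CLUSTER, ZONE }

    /** Hypothetical, minimal description of a destination storage pool. */
    static final class Pool {
        final Scope scope;
        final Long clusterId; // null for zone-wide pools

        Pool(Scope scope, Long clusterId) {
            this.scope = scope;
            this.clusterId = clusterId;
        }
    }

    /** True if a host in the given cluster could mount the destination pool. */
    static boolean sourceHostCanMount(long sourceHostClusterId, Pool destination) {
        switch (destination.scope) {
            case ZONE:
                return true; // visible to every cluster in the zone
            case CLUSTER:
                // shared storage (e.g. NFS), but only within its own cluster
                return destination.clusterId != null
                        && destination.clusterId == sourceHostClusterId;
            default:
                return false; // HOST scope: assumed local to a different host
        }
    }

    /** If the source host cannot mount the destination, handle it like local storage. */
    static boolean treatLikeLocalStorageMigration(long sourceHostClusterId, Pool destination) {
        return !sourceHostCanMount(sourceHostClusterId, destination);
    }

    public static void main(String[] args) {
        Pool nfsInCluster2 = new Pool(Scope.CLUSTER, 2L);
        // Source host in cluster 1, destination NFS pool scoped to cluster 2
        System.out.println(treatLikeLocalStorageMigration(1L, nfsInCluster2));
        // Source host in cluster 2, same destination pool
        System.out.println(treatLikeLocalStorageMigration(2L, nfsInCluster2));
    }
}
```

In this sketch the decision depends on whether the source host can reach the destination pool, not on whether the pool is nominally shared, which is the distinction the explanation above draws.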

@blueorangutan

[SF] Trillian test result (tid-12490)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 44321 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10266-t12490-kvm-ol8.zip
Smoke tests completed. 132 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_01_secure_vm_migration | Error | 134.42 | test_vm_life_cycle.py

@Pearl1594
Contributor

@weizhouapache Could you please check whether the test failures are in any way related to this PR? I thought they weren't, but would like your confirmation.

2025-02-25 23:01:26,106 - CRITICAL - EXCEPTION: test_01_secure_vm_migration: ...  Execute cmd: deployvirtualmachine failed, due to: errorCode: 431, errorText:Unable to deploy the VM as the host: ol8.localdomain is not in the right state\n'

2025-02-25 23:01:26,111 - CRITICAL - EXCEPTION: test_01_secure_vm_migration: ... errorcode : 530, errortext : 'Failed to generate keystore and get CSR from the host/agent id=1'

@weizhouapache
Member Author

test_01_secure_vm_migration

@Pearl1594
I have seen the same failures with other PRs; let's re-kick the tests.

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@JoaoJandre
Contributor

@weizhouapache could you explain a bit about how your patch fixes the problem and which tests you ran to verify it?

@JoaoJandre The steps to reproduce the issue are described in #10078.

When migrating a volume from a cluster-wide storage pool to a cluster-wide storage pool in another cluster, we should treat it like a migration between local storages, because the source host cannot mount the destination pool even though it is shared (e.g. NFS).

I have verified volume migrations between nfs/nfs, nfs/local, and local/local storage across clusters.

I see, thank you for the explanations :-)

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

Contributor

@slavkap slavkap left a comment

code LGTM

@blueorangutan

[SF] Trillian test result (tid-12496)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 48574 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10266-t12496-kvm-ol8.zip
Smoke tests completed. 132 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_02_unsecure_vm_migration | Error | 492.75 | test_vm_life_cycle.py

Contributor

@Pearl1594 Pearl1594 left a comment

Tested. Cross-cluster migration works.

@Pearl1594 Pearl1594 merged commit f992ebb into apache:4.19 Feb 27, 2025
24 of 25 checks passed
@DaanHoogland DaanHoogland deleted the 4.19-fix-volume-migration-across-clusters branch February 27, 2025 14:24
@Pearl1594 Pearl1594 moved this to Done in ACS 4.20.1 Mar 17, 2025
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Jun 19, 2025
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

VM cross cluster migration not working
6 participants