Improve LinstorSR.py to handle thick SR creation #85

Open
wants to merge 2 commits into
base: 3.2.3-8.3
from

@rushikeshjadhav rushikeshjadhav commented Apr 21, 2025

In some cases, thick SR creation may fail because get_online_hosts relies on host metrics, which can take time to become available. This change therefore switches the mechanism to get_enabled_hosts.

Comment on lines 589 to 596
for attempt in range(3):
    online_hosts = util.get_online_hosts(self.session)
    if len(online_hosts) >= len(host_adresses):
        util.SMlog("DEBUG: All hosts online")
        break  # Ok and proceed to create
    else:
        util.SMlog("DEBUG: Online host: {} ; Adresses: {}".format(online_hosts, host_adresses))
        time.sleep(15)  # Sleep and retry
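
As a reading aid, the bounded retry in the snippet above can be sketched in isolation. This is a minimal sketch, not the code in the PR: `wait_for_hosts`, its parameters, and the injectable `sleep` argument are hypothetical names introduced for illustration. The quoted lines do not show what happens when all three attempts fail, so the sketch returns an explicit flag for that case.

```python
import time


def wait_for_hosts(get_hosts, expected_count, attempts=3, delay=15, sleep=time.sleep):
    """Poll get_hosts() until it reports at least expected_count hosts.

    Hypothetical helper for illustration only. Returns a tuple of
    (last host list seen, True if the expected count was reached).
    """
    hosts = []
    for attempt in range(attempts):
        hosts = get_hosts()
        if len(hosts) >= expected_count:
            return hosts, True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next poll, but not after the last one
    return hosts, False
```

Passing `sleep` as a parameter keeps the helper testable without real delays. In the worst case the default arguments still block the caller for 30 seconds, which is the cost the review comment below objects to.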
Member

I think it's better to modify the logic of get_online_hosts instead (without using host metrics, just the enabled flag, I guess) rather than adding a sleep call during creation.

Author

Can we look into why host metrics were used here? Alternatively, we can introduce get_enabled_hosts, named to match its semantics, and use it here while leaving get_online_hosts as is. I will go with your call.
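
The get_enabled_hosts idea can be sketched as follows. This is a minimal sketch, not the function that ends up in the PR: it assumes the XenAPI host record exposes a boolean `enabled` field and that `session.xenapi.host.get_all_records()` returns a dict of host ref to record. The `FakeSession` classes are hypothetical stand-ins so the example runs without a XAPI connection.

```python
def get_enabled_hosts(session):
    """Return the refs of all hosts whose record has the enabled flag set.

    Assumption: each host record carries a boolean "enabled" field.
    No host metrics are queried.
    """
    records = session.xenapi.host.get_all_records()
    return [ref for ref, record in records.items() if record["enabled"]]


# Hypothetical stand-ins for a XAPI session, for demonstration only.
class _FakeHostAPI:
    def __init__(self, records):
        self._records = records

    def get_all_records(self):
        return self._records


class _FakeXenAPI:
    def __init__(self, records):
        self.host = _FakeHostAPI(records)


class FakeSession:
    def __init__(self, records):
        self.xenapi = _FakeXenAPI(records)
```

Because the filter only reads the host record, it needs a single call and does not depend on metrics having been populated.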

The function get_online_hosts predates the git history, so it won't be easy to find out why.

Member

@rushikeshjadhav I agree with you, a new function is probably a good idea. This would avoid breaking the use of the current function in cleanup.py.

Author

Looking at the code, online_hosts seems necessary to fetch each host's hostname and IP address. There is also verification logic that checks for multiple hosts with the same hostname.

Further, it calls _prepare_sr_on_all_hosts with enabled=True, which subsequently calls _prepare_sr, which works if the host is enabled. So checking for enabled may be redundant.

This might have to be verified with someone from xapi: does it matter for fetching the IP address whether a host is enabled or online? We can switch to enabled if IP addresses and hostnames are available at this stage; otherwise we have to revert to online and wait for it.
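
The hostname verification mentioned in this comment can be illustrated with a short sketch. It is not the driver's actual code: `map_hostnames_to_addresses` is a hypothetical helper, and it assumes each host record provides `hostname` and `address` fields. Whether those fields are populated for a host that is enabled but not yet reported as online is exactly the open question raised above.

```python
def map_hostnames_to_addresses(host_records):
    """Build a hostname -> IP address map and reject duplicate hostnames.

    host_records: dict of host ref -> record, where each record is assumed
    to contain "hostname" and "address" keys.
    """
    addresses = {}
    for ref, record in host_records.items():
        hostname = record["hostname"]
        if hostname in addresses:
            raise ValueError(
                "Multiple hosts with same hostname: {}".format(hostname)
            )
        addresses[hostname] = record["address"]
    return addresses
```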

In some drivers, e.g. Linstor, we need to ensure that hosts are enabled before performing operations, hence this function is needed.

Signed-off-by: Rushikesh Jadhav <[email protected]>
…as the metrics could take time.

Thus, changed the mechanism to `get_enabled_hosts`.

Signed-off-by: Rushikesh Jadhav <[email protected]>
@rushikeshjadhav rushikeshjadhav force-pushed the 3.2.3-8.3-online-hosts branch from ea5a1e7 to 697676c Compare April 24, 2025 18:43
3 participants