Skip to content

Add memory to card 1007 #4284

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Add memory to card 1007 #4284

wants to merge 10 commits into from

Conversation

Oglopf
Copy link
Contributor

@Oglopf Oglopf commented Apr 15, 2025

Addresses #1007

I have only been able to test with SLURM. Added a bit to set this as I'm unsure if other schedulers would be incompatible with what I have here, so I want user's to be aware they are turning this on. Maybe it's not needed though.

@@ -433,6 +433,34 @@ def update_cache_completed!
end
end

# The min_memory value from Job::OodCore::Info.native
# @return [Float] of job's memory in GB's
def min_memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about converting this. This is not likely to work with other schedulers.

Even so - Slurm always reports in MB according to the squeue documentation, so we likely don't even need such a complex algorithm for it. I'm sure Rails' number_to_human_size would work fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol, i mean literally i get back "4556M" so the docs are as usual, not totally true for Slurm.

I can remove the conversions, but the point of the config to turn this on was to point out to the admin what they are doing, right? As in they see it's tested and works for slurm but not others yet, they would know this from the docs we provide.

But, I'll monkey with this some more today and see what i can figure out but slurm hands a string back, not numbers, so we have to do some massaging here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

, but the point of the config to turn this on was to point out to the admin what they are doing, right? As in they see it's tested and works for slurm but not others yet, they would know this from the docs we provide.

I guess my argument is it should "just work" if it can and continue to do nothing if it doesn't.

The issue I have with configuration proliferation is just that we have a lot already and it's just untenable to provide a toggle for every single little thing we do. I mean do we need a config for cores and nodes (and GPUs) now too?

Plus we answer discourse topics all the time with "here's the config for that", so I guess I'm trying to avoid that.

But, I'll monkey with this some more today and see what i can figure out but slurm hands a string back, not numbers, so we have to do some massaging here.

Casting to_i will remove it. Yea I'd say let's use number_to_human_size to our advantage. This simple script outputs 4.35 GB

rubyrequire 'rails'
require 'action_view'

include ActionView::Helpers::NumberHelper
puts number_to_human_size("4456M".to_i*1024*1024)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a little bit more, I created this ticket in ood_core so the Info object can just return the memory. Seems like that's the appropriate layer to actually do all these translations.

OSC/ood_core#878

Until we get that, we should just spot check the other adapters to see if any of them return native[:min_memory]. If no other scheduler does, then this expression return nil if info.native[:min_memory].empty? should work fine for now.

Comment on lines +390 to +396
def show_bc_card_memory?
if ENV['OOD_BC_CARD_MEMORY']
to_bool(ENV['OOD_BC_CARD_MEMORY'])
else
false
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a config for this, it's just another thing us to document and keep track for the admin to miss.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants