-
Notifications
You must be signed in to change notification settings - Fork 131
Add memory to card 1007 #4284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add memory to card 1007 #4284
Conversation
@@ -433,6 +433,34 @@ def update_cache_completed! | |||
end | |||
end | |||
|
|||
# The min_memory value from Job::OodCore::Info.native | |||
# @return [Float] of job's memory in GB's | |||
def min_memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure about converting this. This is not likely to work with other schedulers.
Even so - Slurm always reports in MB according to the squeue
documentation, so we likely don't even need such a complex algorithm for it. I'm sure Rails' number_to_human_size
would work fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol, i mean literally i get back "4556M" so the docs are as usual, not totally true for Slurm.
I can remove the conversions, but the point of the config to turn this on was to point out to the admin what they are doing, right? As in they see it's tested and works for slurm but not others yet, they would know this from the docs we provide.
But, I'll monkey with this some more today and see what i can figure out but slurm hands a string back, not numbers, so we have to do some massaging here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, but the point of the config to turn this on was to point out to the admin what they are doing, right? As in they see it's tested and works for slurm but not others yet, they would know this from the docs we provide.
I guess my argument is it should "just work" if it can and continue to do nothing if it doesn't.
The issue I have with configuration proliferation is just that we have a lot already and it's just untenable to provide a toggle for every single little thing we do. I mean do we need a config for cores and nodes (and GPUs) now too?
Plus we answer discourse topics all the time with "here's the config for that", so I guess I'm trying to avoid that.
But, I'll monkey with this some more today and see what i can figure out but slurm hands a string back, not numbers, so we have to do some massaging here.
Casting to_i
will remove it. Yea I'd say let's use number_to_human_size
to our advantage. This simple script outputs 4.35 GB
rubyrequire 'rails'
require 'action_view'
include ActionView::Helpers::NumberHelper
puts number_to_human_size("4456M".to_i*1024*1024)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this a little bit more, I created this ticket in ood_core so the Info object can just return the memory. Seems like that's the appropriate layer to actually do all these translations.
Until we get that, we should just spot check the other adapters to see if any of them return native[:min_memory]
. If no other scheduler does, then this expression return nil if info.native[:min_memory].empty?
should work fine for now.
def show_bc_card_memory? | ||
if ENV['OOD_BC_CARD_MEMORY'] | ||
to_bool(ENV['OOD_BC_CARD_MEMORY']) | ||
else | ||
false | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need a config for this, it's just another thing us to document and keep track for the admin to miss.
Addresses #1007
I have only been able to test with SLURM. Added a bit to set this as I'm unsure if other schedulers would be incompatible with what I have here, so I want user's to be aware they are turning this on. Maybe it's not needed though.