User-defined Output Folder #238

Merged
genevievestarke merged 27 commits into NatLabRockies:develop from elenya-grant:output_dir
May 4, 2026

Conversation

@elenya-grant
Collaborator

@elenya-grant elenya-grant commented Mar 23, 2026

User-defined Output Folder

Previously, the outputs of a Hercules simulation (the .h5 file and logger files) were written to Path.cwd()/"outputs". This PR adds the ability to define the output folder and the output filenames. If they are not specified, the default behavior is the same as before. This PR also lets the user specify more logging options.

This affects the location of the output .h5 file and of the logger file. It introduces some new inputs to the Hercules input file:

output_dir: "outputs"
output_file: "hercules_output.h5"
overwrite_outputs: True

logging:
  logger_name: "hercules"
  log_file: "log_hercules.log"
  console_output: True
  console_prefix: "HERCULES"
  log_level: "INFO"
  use_outputs_dir: True
  outputs_dir: "outputs"
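
As a rough illustration of "default behavior is the same as before", the sketch below fills in the three top-level inputs when they are missing. The helper name `output_settings_sketch` and the `DEFAULTS_SKETCH` dictionary are hypothetical and not part of Hercules; the default for `output_file` is assumed from the example values above, while `overwrite_outputs` defaulting to True is confirmed later in this conversation.

```python
from pathlib import Path

# Assumed defaults, taken from the example values in the PR description.
DEFAULTS_SKETCH = {
    "output_dir": "outputs",
    "output_file": "hercules_output.h5",
    "overwrite_outputs": True,
}


def output_settings_sketch(h_dict: dict) -> dict:
    """Merge user-supplied output settings over the assumed defaults."""
    settings = dict(DEFAULTS_SKETCH)
    for key in DEFAULTS_SKETCH:
        if key in h_dict:
            settings[key] = h_dict[key]
    # Full path of the .h5 file (relative paths resolve against the cwd).
    settings["output_path"] = Path(settings["output_dir"]) / settings["output_file"]
    return settings


default_settings = output_settings_sketch({})
custom_settings = output_settings_sketch({"output_dir": "outputs_07"})
print(default_settings["output_path"])
print(custom_settings["output_path"])
```

With an empty input dictionary the .h5 path falls back to the "outputs" folder, matching the earlier hard-coded location when run from the same working directory.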

Some notes on the logging logic:

  • If logging.use_outputs_dir is True, the log folder will be logging.outputs_dir if specified. If logging.outputs_dir is not input, then the top-level output_dir is used instead.
  • If a component.logging.outputs_dir is provided, then an error will be raised that output folder will be used for the component-level log, if not provided, it'll use the output dir used for the Hercules logger.

Changes since first review:

  • added make_unique_folder_name() function to hercules/utilities.py
  • If h_dict["overwrite_outputs"] is True, then all the files in h_dict["output_dir"] are deleted. If h_dict["overwrite_outputs"] is False and h_dict["output_dir"] already exists, then the output files are written to a new folder with the same base folder name plus a numeric suffix. For example, if the output dir is called "outputs", the output files will be written to a folder named "outputs0". If the "outputs0" folder exists, they will be written to a folder named "outputs1", etc.
  • Removed use_outputs_dir from h_dict["logging"]. In setup_logging(), a log file is written to logging_dir/log_file
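
The folder-naming rule described above can be sketched as follows. This is not the `make_unique_folder_name()` added to hercules/utilities.py (its signature is not shown in this conversation); `unique_folder_sketch` is a hypothetical stand-in that only follows the "outputs", "outputs0", "outputs1" pattern from the description.

```python
import tempfile
from pathlib import Path


def unique_folder_sketch(output_dir: Path) -> Path:
    """Return output_dir if unused, else the first free '<name><N>' sibling."""
    if not output_dir.exists():
        return output_dir
    n = 0
    while True:
        # Append the counter directly to the base folder name.
        candidate = output_dir.with_name(f"{output_dir.name}{n}")
        if not candidate.exists():
            return candidate
        n += 1


# Demo in a temporary directory: three consecutive runs with the same base name.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "outputs"
    first = unique_folder_sketch(base)
    first.mkdir()
    second = unique_folder_sketch(base)
    second.mkdir()
    third = unique_folder_sketch(base)
    names = [first.name, second.name, third.name]

print(names)
```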

TO-DO

  • add tests for make_unique_folder_name()
  • finish updating PR description
  • update documentation (?)
    • docs/h_dict.md
    • docs/examples/07_open_cycle_gas_turbine.md
  • clean-up (remove commented-out code, etc.)
  • [-] add inline comments to __init__() method of hercules/plant_components/component_base.py

Requested Feedback from Reviewers/Questions (out of date)

  1. What are additional test cases that I should add in for the tests? Test with example 7 as integration test
  2. What documentation should be updated? Maybe key_concepts/outputs or in example 7 doc page write-up. Also update to H_Dict Structure.
  3. Should I update any examples? maybe add to example 07
  4. I think the function prepare_output_directory should be called in HerculesModel.__init__() so that it can use the output dir specified in the input dictionary. Thoughts? If so, I think that another user-input parameter in the input dictionary should dictate whether that's called or not, like below:
if h_dict.get("remove_existing_output_dir", False):
    prepare_output_directory(Path(output_dir))

Or remove the function altogether and put the shutil.rmtree(output_dir) logic in HerculesModel.__init__()

  • Gen and Elenya talked - make it an option so that a user can turn on/off overwriting output files or renaming them.
    • this would require changing all the examples
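
A minimal sketch of the second option in question 4, with the removal logic inlined and gated by a flag. The function name `prepare_outputs_sketch` is hypothetical, and the key `remove_existing_output_dir` is the placeholder name from the question, not necessarily the name used in the merged code.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_outputs_sketch(h_dict: dict) -> Path:
    """Optionally wipe the output folder, then make sure it exists."""
    output_dir = Path(h_dict.get("output_dir", "outputs"))
    # Only delete earlier results when the user explicitly asks for it.
    if h_dict.get("remove_existing_output_dir", False) and output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir


# Demo: a stale file survives without the flag and is removed with it.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "outputs"
    out.mkdir()
    (out / "stale.h5").write_text("old run")
    prepare_outputs_sketch({"output_dir": str(out)})
    kept_without_flag = (out / "stale.h5").exists()
    prepare_outputs_sketch({"output_dir": str(out), "remove_existing_output_dir": True})
    kept_with_flag = (out / "stale.h5").exists()
    folder_exists = out.is_dir()

print(kept_without_flag, kept_with_flag, folder_exists)
```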

New files

  • tests/example_regression_tests/example_07_regression_test.py: tests 3 cases of different ways of specifying output directory and logger directories.

Files that were modified

  • hercules/utilities_examples.py
    • prepare_output_directory(): added deprecation warning
  • examples/hercules_input_example.yaml
  • hercules/hercules_model.py
    • HerculesModel.__init__()
      • updating logic for initializing the HDF5 output configuration
      • updated logic for getting/specifying output folder
      • updated inputs when calling self._setup_logging()
    • HerculesModel._setup_logging(): updated to take in (optional) other user-specified logger inputs
  • hercules/utilities.py
    • setup_logging()
  • hercules/plant_components/component_base.py
    • ComponentBase.__init__()
      • updated logic for getting/specifying output folder for logger
      • updated inputs when calling self._setup_logging()
    • ComponentBase._setup_logging(): updated to take in (optional) other user-specified logger inputs for the component
  • tests/hercules_model_test.py: updated some strings to Path objects.
    • test_specified_outputs_dir(): new test for user-specified output dir

@paulf81
Collaborator

paulf81 commented Apr 1, 2026

hi @elenya-grant, thanks for this! I took a first pass through the code and then reviewed the questions for reviewers. I think the answers included are from @genevievestarke, and I'm just noting that I agree. This is looking great, so let me know when you'd like me to do the actual review -> green tick. Thanks!

@elenya-grant elenya-grant marked this pull request as ready for review April 16, 2026 15:19
Collaborator

@genevievestarke genevievestarke left a comment

Really great work @elenya-grant!! Just a few comments and doc strings to clear up!

Comment thread hercules/hercules_model.py Outdated
self.starttime_utc = self.h_dict["starttime_utc"]

- def _setup_logging(self, logfile="log_hercules.log", console_output=True):
+ def _setup_logging(self, logfile="log_hercules.log", console_output=True, **kwargs):
Collaborator

Since this is called in two places, this seems like something we could move to utilities instead?

Collaborator

@paulf81, thoughts on this?

Collaborator

Yes I agree, I just note the two functions don't have identical inputs, but I think we can just adopt the more general form of the two?

Collaborator Author

Could y'all clarify what two functions you're referring to?

Collaborator

I'm referring to the function referenced with this comment, and then the function on line 105 of plant_components/component_base.py . They seem to have really similar functionality!
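
One way the shared helper discussed in this thread could look, using only the standard `logging` module. This is a sketch under assumptions: `setup_logging_sketch` is a hypothetical name, the parameter names are borrowed from the `logging` block of the input file in the PR description, and the actual signature of `setup_logging()` in hercules/utilities.py may differ.

```python
import logging
import tempfile
from pathlib import Path


def setup_logging_sketch(logger_name="hercules", log_file="log_hercules.log",
                         logging_dir="outputs", console_output=True,
                         console_prefix="HERCULES", log_level="INFO"):
    """Configure a named logger that writes to logging_dir/log_file."""
    log_path = Path(logging_dir) / log_file
    log_path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(logger_name)
    logger.setLevel(getattr(logging, log_level.upper()))

    # Drop handlers from earlier calls so messages are not duplicated.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(file_handler)

    if console_output:
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(
            logging.Formatter(f"[{console_prefix}] %(message)s")
        )
        logger.addHandler(console_handler)
    return logger


# Demo: both the model and a component could call the same helper.
with tempfile.TemporaryDirectory() as tmp:
    logger = setup_logging_sketch(logger_name="hercules_demo", log_file="log_demo.log",
                                  logging_dir=tmp, console_output=False)
    logger.info("simulation started")
    # Close the file handler before the temporary folder is removed.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    logged_text = (Path(tmp) / "log_demo.log").read_text()

print(logged_text)
```

Adopting the more general signature, as suggested above, means the model-level and component-level callers differ only in the arguments they pass.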


def test_dont_use_outputs_dir_logging():
test_n = "03"
# what happens with non-default output dir
Collaborator

Suggested change
- # what happens with non-default output dir
+ # what happens with non-default output dir and a different logging dir

Collaborator Author

done!

hercules_dict["output_dir"] = output_dir
hercules_dict["output_file"] = f"hercules_output_test{test_n}.h5"
hercules_dict["overwrite_outputs"] = True
hercules_dict["logging"] = {
Collaborator

This definition seems a little incongruous. We are saying to use the same outputs directory, but defining a separate logging directory?

Collaborator Author

I totally understand the confusion here - and it was confusing to me too. The use_outputs_dir is not a new input to the setup logging functions.

If use_outputs_dir is False, the logger's output filepath is basically Path(log_file), meaning that the log file will be written to the current working directory.

If use_outputs_dir is True, it'll use the output folder specified. If the user specifies logging['outputs_dir'], then that'll be used as the logging folder. If output_dir (top-level) is specified and logging['outputs_dir'] is not, then the log files will be written to the same folder as the output_dir.

I think the use_outputs_dir could be either renamed or that logic could be removed.
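
The resolution order described in this comment can be sketched as below. `resolve_log_path_sketch` is a hypothetical helper, and the defaults used for `use_outputs_dir`, `log_file`, and `output_dir` are assumptions based on the example input file, not confirmed behavior.

```python
from pathlib import Path


def resolve_log_path_sketch(h_dict: dict) -> Path:
    """Pick the log file path following the use_outputs_dir logic above."""
    log_cfg = h_dict.get("logging", {})
    log_file = log_cfg.get("log_file", "log_hercules.log")

    # False: write next to the current working directory.
    if not log_cfg.get("use_outputs_dir", True):
        return Path(log_file)

    # True: prefer logging['outputs_dir'], else fall back to the top-level output_dir.
    folder = log_cfg.get("outputs_dir", h_dict.get("output_dir", "outputs"))
    return Path(folder) / log_file


cwd_case = resolve_log_path_sketch(
    {"output_dir": "run_a", "logging": {"use_outputs_dir": False}}
)
shared_case = resolve_log_path_sketch(
    {"output_dir": "run_a", "logging": {"use_outputs_dir": True}}
)
separate_case = resolve_log_path_sketch(
    {"output_dir": "run_a", "logging": {"use_outputs_dir": True, "outputs_dir": "logs"}}
)
print(cwd_case, shared_case, separate_case)
```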

Comment thread tests/example_regression_tests/example_07_regression_test.py Outdated
"The method `prepare_output_directory()` will be deprecated in future versions. "
"Please specify the `output_folder` in the hercules input file. "
"To use the functionality of this function in future versions, please set the "
"`overwrite_outputs` flag in the hercules input file to True. ",
Collaborator

@paulf81 paulf81 Apr 16, 2026

Testing my own understanding: or don't specify overwrite_outputs at all, since True is the default, right?

Collaborator Author

Yes! The default behavior is to assume that overwrite_outputs is True if it is not input by the user. Do you think that should be added to the warning message?

Collaborator

@paulf81 paulf81 left a comment

hi @elenya-grant, thank you very much! I reviewed the code and think it's looking great; I like the generality you're adding everywhere. I added one small comment on a warning and agreed with one of @genevievestarke's comments on combining the two _setup_logging functions. After that I played around with the new example to confirm everything worked as I expected (it did!) and re-ran the tests.

log_every_n: 1

output_dir: "outputs_07"
overwrite_outputs: True
Collaborator

Ok, playing around with this a little more, what happens when overwrite_outputs = False? Example 07 doesn't seem to do anything different when I change this to false.

Collaborator Author

okay - it's a little weird here too.

If overwrite_outputs=False, then it won't delete all the files in output_dir. If you don't change output_file to a different name - it'll be overwritten anyway. So - if you keep overwrite_outputs=False and change the output_file name between runs - then you should get two output .h5 files in the output_dir. If you had overwrite_outputs=True, then you'd get one output .h5 file in that folder named whatever the output_file just was (because all earlier files in that folder would've been deleted)

Collaborator

What do you think of appending on to the h5 file an incremental number if one already exists? And the same for the log files?

Collaborator

@genevievestarke genevievestarke left a comment

I'm getting an error when I run example 07:

Image

I think the inputs for using the output directory for logging aren't quite aligned?

@genevievestarke
Collaborator

> I'm getting an error when I run example 07:
>
> Image
>
> I think the inputs for using the output directory for logging aren't quite aligned?

This is fixed by the last push updating the example yaml.

@genevievestarke genevievestarke self-requested a review May 4, 2026 21:18
@genevievestarke genevievestarke merged commit ac36609 into NatLabRockies:develop May 4, 2026
3 checks passed
3 participants