User-defined Output Folder#238
Conversation
…itten to non-default output_dir
|
hi @elenya-grant , thanks for this! I took a first pass through the code and then was reviewing the questions for reviewers. I think the answers included are from @genevievestarke and just noting I agree. This is looking great and let me know when you'd like me do the actual review -> green tick. Thanks! |
genevievestarke
left a comment
There was a problem hiding this comment.
Really great work @elenya-grant!! Just a few comments and doc strings to clear up!
| self.starttime_utc = self.h_dict["starttime_utc"] | ||
|
|
||
| def _setup_logging(self, logfile="log_hercules.log", console_output=True): | ||
| def _setup_logging(self, logfile="log_hercules.log", console_output=True, **kwargs): |
There was a problem hiding this comment.
Since this is called in two places, this seems like something we could move to utilities instead?
There was a problem hiding this comment.
Yes I agree, I just note the two functions don't have identical inputs, but I think we can just adopt the more general form of the two?
There was a problem hiding this comment.
Could y'all clarify what two functions you're referring to?
There was a problem hiding this comment.
I'm referring to the function referenced with this comment, and then the function on line 105 of plant_components/component_base.py . They seem to have really similar functionality!
|
|
||
| def test_dont_use_outputs_dir_logging(): | ||
| test_n = "03" | ||
| # what happens with non-default output dir |
There was a problem hiding this comment.
| # what happens with non-default output dir | |
| # what happens with non-default output dir and a different logging dir |
| hercules_dict["output_dir"] = output_dir | ||
| hercules_dict["output_file"] = f"hercules_output_test{test_n}.h5" | ||
| hercules_dict["overwrite_outputs"] = True | ||
| hercules_dict["logging"] = { |
There was a problem hiding this comment.
This definition seems a little incongruous. We are saying to use the same outputs directory, but defining a separate logging directory?
There was a problem hiding this comment.
I totally understand the confusion here - and it was confusing to me too. The use_outputs_dir is not a new input to the setup logging functions.
If use_outputs_dir is False - that means to basically do Path(log_file) as the output filepath for the logger (meaning that the logger will be written to the current working directory).
If use_outputs_dir is True - it'll use the output folder specified. If user specifies logging['outputs_dir'] then that'll be used as the logging folder. If output_dir (top-level) is specified and logging['outputs_dir'] is not specified , then the log files will be written to the same folder as the output_dir.
I think the use_outputs_dir could be either renamed or that logic could be removed.
| "The method `prepare_output_directory() will be depreciated in future versions" | ||
| "Please specify the `output_folder` in the hercules input file. " | ||
| "To use the functionality of this function in future versions, please set the " | ||
| "`overwrite_outputs` flag in the hercules input file to True. ", |
There was a problem hiding this comment.
Testing my own understanding, or don't specify overwrite_outputs as True is default right?
There was a problem hiding this comment.
Yes! The default behavior is to assume that overwrite_outputs is True if not user input. Do you think that should that be added to the warning message?
paulf81
left a comment
There was a problem hiding this comment.
hi @elenya-grant, thank you very much! I reviewed the code and think it's looking great, I like the generality you're adding everywhere. I added one small comment on a warning and then agreed with one of @genevievestarke on combining the two _setup_logging functions. After that I played around with the new example to confirm everything worked as I expected (it did!) and re-ran the tests.
| log_every_n: 1 | ||
|
|
||
| output_dir: "outputs_07" | ||
| overwrite_outputs: True |
There was a problem hiding this comment.
Ok, playing around with this a little more, what happens when overwrite_outputs = False? Example 07 doesn't seem to do anything different when I change this to false.
There was a problem hiding this comment.
okay - it's a little weird here too.
If overwrite_outputs=False, then it won't delete all the files in output_dir. If you don't change output_file to a different name - it'll be overwritten anyway. So - if you keep overwrite_outputs=False and change the output_file name between runs - then you should get two output .h5 files in the output_dir. If you had overwrite_outputs=True, then you'd get one output .h5 file in that folder named whatever the output_file just was (because all earlier files in that folder would've been deleted)
There was a problem hiding this comment.
What do you think of appending on to the h5 file an incremental number if one already exists? And the same for the log files?


User-defined Output Folder
Before, the outputs of a Hercules simulation (.h5 file and loggers) were written to
Path.cwd()/"outputs". This PR introduces the flexibility to define an output folder and output filenames. If one is not input, the default behavior is the same as before. This PR also enables the user to specify more logging options.This impacts the output
.h5file location and the logger file location. The introduces some new inputs to the Hercules input file:Some notes on the logging logic:
ifIflogging.use_outputs_diris True, the output folder will be thelogging.outputs_dirif specified.logging.outputs_diris not input, then it'll use the output dir specified for as the top-leveloutput_dir.component.logging.outputs_diris provided, then an error will be raisedthat output folder will be used for the component-level log, if not provided, it'll use the output dir used for the Hercules logger.Changes since first review:
make_unique_folder_name()function tohercules/utilities.pyh_dict["overwrite_outputs"]is True, then all the files inh_dict["output_dir"]are deleted. Ifh_dict["overwrite_outputs"]is False, then a new output folder name is created with the same base foldername ash_dict["output_dir"]. Ifh_dict["overwrite_outputs"]is False andh_dict["output_dir"]already exists (suppose the output dir is called "outputs"), then the output files will be written to a folder named "outputs0". If "outputs0" folder exists, output files will be written to a folder named "outputs1", etc.use_outputs_dirfromh_dict["logging"]. Insetup_logging(), a log file is written tologging_dir/log_fileTO-DO
make_unique_folder_name()docs/h_dict.mddocs/examples/07_open_cycle_gas_turbine.md__init__()method ofhercules/plant_components/component_base.pyRequested Feedback from Reviewers/Questions (out of date)
key_concepts/outputsor in example 7 doc page write-up. Also update to H_Dict Structure.prepare_output_directoryshould either be called inHerculesModel.__init__()so that it can use the output dir specified in the input dictionary. Thoughts? If so - I think that another user-input parameter in the input dictionary should dictate whether that's called or not, like below:Or - remove the function all together and put the
shutil.rmtree(output_dir)logic inHerculesModel.__init__()New files
tests/example_regression_tests/example_07_regression_test.py: tests 3 cases of different ways of specifying output directory and logger directories.Files that were modified
hercules/utilities_examples.pyprepare_output_directory(): added depreciation warningexamples/hercules_input_example.yamlhercules/hercules_model.pyHerculesModel.__init__()self._setup_logging()HerculesModel._setup_logging(): updated to take in (optional) other user-specified logger inputshercules/utilities.pysetup_logging()hercules/plant_components/component_base.pyComponentBase.__init__()self._setup_logging()ComponentBase._setup_logging(): updated to take in (optional) other user-specified logger inputs for the componenttests/hercules_model_test.py: updated some strings toPathobjects.test_specified_outputs_dir(): new test for user-specified output dir