Feature/substep pvwatts#254
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds an option to run PySAM/PVWatts at the native weather-file resolution (when coarser than the Hercules dt) and then upsample PVWatts outputs to the Hercules grid, reducing repeated PVWatts execution cost and clarifying the time-averaging convention around interpolation.
Changes:
- Add
use_native_solar_dt(defaultTrue) to drive a “run PVWatts once on native dt, then upsample outputs” pathway. - Refactor PVWatts precompute to collect compute-grid outputs and upsample them onto the Hercules grid.
- Add tests and documentation describing the new pathway and the time-convention rationale.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
hercules/plant_components/solar_pysam_base.py |
Adds native-dt compute-grid selection, dt_solar detection, and output upsampling helper. |
hercules/plant_components/solar_pysam_pvwatts.py |
Refactors output extraction to upsample PVWatts outputs onto Hercules dt. |
tests/solar_pysam_pvwatts_native_dt_test.py |
Adds tests for the new use_native_solar_dt behavior and fallback behavior. |
docs/timing.md |
Documents that the averaged→instantaneous conversion can apply to PVWatts outputs in native-dt mode. |
docs/solar_pv.md |
Documents use_native_solar_dt behavior and PVWatts time convention details. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Interpolate df_solar on to the time steps | ||
| time_steps_all = np.arange(self.starttime, self.endtime, self.dt, dtype=hercules_float_type) | ||
| # Determine the dt implied by the weather file (after sorting to be safe) | ||
| df_solar = df_solar.sort_values("time").reset_index(drop=True) |
There was a problem hiding this comment.
dt_solar is inferred via df_solar["time"].iloc[1] - iloc[0], which will raise an IndexError when the solar input file has fewer than 2 rows (and the resulting exception message won’t explain what’s wrong). Add an explicit length check with a clear ValueError (or define a fallback) before indexing iloc[1].
| df_solar = df_solar.sort_values("time").reset_index(drop=True) | |
| df_solar = df_solar.sort_values("time").reset_index(drop=True) | |
| if len(df_solar) < 2: | |
| raise ValueError( | |
| "Solar input file must contain at least two rows to infer the native " | |
| "solar timestep" | |
| ) |
There was a problem hiding this comment.
Good catch, seems like the edgiest edge case but taking the suggestion
| n_compute = max(int(np.ceil((self.endtime - self.starttime) / self._compute_dt)) + 1, 2) | ||
| self._compute_time_steps = self.starttime + np.arange( | ||
| n_compute, dtype=hercules_float_type | ||
| ) * hercules_float_type(self._compute_dt) |
There was a problem hiding this comment.
In the use_native_solar_dt branch the compute grid is re-created as starttime + n * dt_solar (anchored to the simulation start), rather than using the weather file’s native timestamp alignment. If starttime_utc is not aligned to the weather file’s reporting boundary, this shifts the meaning of each “start-of-period average” interval that PVWatts assumes (e.g., hourly data stamped on the hour but a simulation starting at :30 would cause PVWatts to treat interpolated values as averages over :30→:30). Consider building _compute_time_steps from the sorted df_solar["time"] values (sliced to cover the sim window, plus any extra point needed for upsampling), or snapping the compute grid to the nearest earlier native timestamp so PVWatts intervals stay consistent with the input file.
| n_compute = max(int(np.ceil((self.endtime - self.starttime) / self._compute_dt)) + 1, 2) | |
| self._compute_time_steps = self.starttime + np.arange( | |
| n_compute, dtype=hercules_float_type | |
| ) * hercules_float_type(self._compute_dt) | |
| # Preserve the native weather-file timestamp alignment instead of | |
| # rebuilding the compute grid as starttime + n * dt_solar. This | |
| # keeps PVWatts' start-of-period interval semantics consistent with | |
| # the source data even when the simulation start is off-boundary. | |
| native_time_steps = np.sort( | |
| np.unique(df_solar["time"].to_numpy(dtype=hercules_float_type)) | |
| ) | |
| start_idx = np.searchsorted(native_time_steps, self.starttime, side="right") - 1 | |
| start_idx = max(start_idx, 0) | |
| end_idx = np.searchsorted(native_time_steps, self.endtime, side="right") | |
| if end_idx < len(native_time_steps): | |
| end_idx += 1 | |
| self._compute_time_steps = native_time_steps[start_idx:end_idx] | |
| # If the weather data does not fully span the requested native grid, | |
| # fall back to the previous arithmetic construction to preserve the | |
| # existing behaviour and guarantee enough points for interpolation. | |
| if len(self._compute_time_steps) < 2: | |
| n_compute = max( | |
| int(np.ceil((self.endtime - self.starttime) / self._compute_dt)) + 1, 2 | |
| ) | |
| self._compute_time_steps = self.starttime + np.arange( | |
| n_compute, dtype=hercules_float_type | |
| ) * hercules_float_type(self._compute_dt) |
There was a problem hiding this comment.
I think I've now accounted for this but you should recheck
| starttime_utc = pd.to_datetime("2024-06-24T17:00:00Z") | ||
| endtime_utc = pd.to_datetime("2024-06-24T17:10:00Z") | ||
| dt = 1.0 | ||
| dt_native = 60.0 | ||
|
|
There was a problem hiding this comment.
The new use_native_solar_dt path isn’t exercised with a simulation start/end that’s not aligned to the weather file’s native reporting boundary (e.g., start at :30 with hourly data). Adding a regression test for an offset starttime would help catch interval-alignment bugs in _compute_time_steps/upsampling logic.
There was a problem hiding this comment.
I think this is now accounted for but you should recheck
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert spv._compute_dt == pytest.approx(dt) | ||
| expected_n = int((spv.endtime - spv.starttime) / dt) | ||
| assert spv.ac_power_available_array.shape[0] == expected_n | ||
| assert spv._compute_time_steps is spv._hercules_time_steps |
There was a problem hiding this comment.
This test asserts object identity (is) between _compute_time_steps and _hercules_time_steps. Identity isn’t required for correctness (only equality of values), so this makes the test brittle to harmless refactors (e.g., if _compute_time_steps becomes a copy/view). Prefer asserting value equality (e.g., np.array_equal/assert_allclose) instead of is.
| assert spv._compute_time_steps is spv._hercules_time_steps | |
| assert np.array_equal(spv._compute_time_steps, spv._hercules_time_steps) |
| / 1000.0, | ||
| "dc_power_available": np.array(self.system_model.Outputs.dc, dtype=hercules_float_type) | ||
| / 1000.0, |
There was a problem hiding this comment.
np.array(..., dtype=hercules_float_type) / 1000.0 upcasts the result to float64 (because 1000.0 is float64), so in the no-op path (_compute_dt == dt) the power arrays won’t remain hercules_float_type. To keep memory/precision consistent, cast after the division or divide by a hercules_float_type(1000.0) scalar.
| / 1000.0, | |
| "dc_power_available": np.array(self.system_model.Outputs.dc, dtype=hercules_float_type) | |
| / 1000.0, | |
| / hercules_float_type(1000.0), | |
| "dc_power_available": np.array(self.system_model.Outputs.dc, dtype=hercules_float_type) | |
| / hercules_float_type(1000.0), |
| if self.use_native_solar_dt and self.dt_solar > self.dt: | ||
| self._compute_dt = self.dt_solar | ||
| native_time = df_solar["time"].to_numpy(dtype=hercules_float_type) | ||
| i_start = max(np.searchsorted(native_time, self.starttime, side="right") - 1, 0) | ||
| i_end = min( | ||
| np.searchsorted(native_time, self.endtime, side="left"), len(native_time) - 1 | ||
| ) | ||
| self._compute_time_steps = native_time[i_start : i_end + 1] | ||
| interpolation_method = "instantaneous_to_instantaneous" |
There was a problem hiding this comment.
In the use_native_solar_dt branch, i_start uses searchsorted(..., side="right") - 1, which will not include the native timestamp immediately before starttime when starttime falls exactly on a native weather timestamp (it selects the starttime row itself). That makes the subsequent averaged_to_instantaneous upsampling extrapolate at the left edge (because the first midpoint is t0 + dt_solar/2), and can change the initial segment of the output compared to the intended midpoint-corrected interpolation when earlier weather data exists. Consider selecting the previous native stamp when available (e.g., using side="left" and subtracting 1) so the upsample has a point on both sides of the first Hercules time step.
The potential solar power is computed in advance in Hercules. #249 corrected a bit how this was computed by noting that when upsampling 15 min / 1 hour solar input data to the hercules 1/10S, it was most appropriate to interpolate from the midpoint of the time period, since the timestamp marks the beginning of the period.
The primary objective of this PR is to delay the upsampling of the solar data from before calling pysam/pvwatts to after. This is because recent experience shows that the over-running of pvwatts on identical inputs which was cheap previously can be expensive if bifaciality is applied. In adding this feature, the opportunity is taken to correct a slight issue from #249 that pvwatts is also assuming timestamps are start of period and shifting to the midpoint for solar calculations. This doesn't mean the previous code has to change but some of the comments were slightly confused and are clarified in detail here.
A new option for the solar module is included
use_native_solar_dt, which defaults toTrueand uses the new method of running pvwatts on the underlying solar time step before upsampling to the hercules time stamp. Ifuse_native_solar_dtis false, or the hercules time step is not smaller than the solar data time step, then the behavior should be equivalent to the beahavior from before. A new test module is added to confirm that while the new method saves a lot of computation time, it's output is very similar to the previous behavior and should probably be always used (since calling pvwatts multiple times with identical inputs and only small changes in solar position should not impact results too much)The documentation is also updated.