
Conversation

@ejsimley
Collaborator

@ejsimley ejsimley commented Nov 13, 2025

This PR enhances the operational wake loss estimation method openoa.analysis.wake_losses.WakeLosses by adding an option to correct the estimated potential power in the absence of wake interactions for freestream wind speed variations across a wind plant (rather than assuming the wind resource is the same at all turbines). This method was used for the wake loss estimates in the WP3 Benchmark wake loss analysis paper. An overview of the method is as follows (a short illustrative sketch follows the list):

  • A csv file must be provided by the user containing relative freestream wind speedup factors as a function of wind direction for each turbine (columns are turbines and rows are wind direction bins). The speedup factors are intended to be relative to the mean freestream wind speed over all turbine locations for a given wind direction bin (i.e., the mean speedup factor is 1 for each wind direction bin).
  • The speedup factor csv file is used to estimate the freestream wind speed at each turbine location for each timestamp by applying a correction to the mean wind speed measured by the unwaked turbines.
  • An empirical power curve for the wind plant is created using the wind speed and power from each turbine from SCADA data to determine the mean power in each wind speed bin.
  • The estimated freestream wind speed at each turbine is used with the empirical power curve to estimate the potential freestream power at each turbine location for each timestamp.
  • The estimated freestream power at each turbine is scaled by the total power measured by the unwaked turbines divided by the total estimated freestream power of the unwaked turbines to account for time-dependent factors that could cause higher or lower power across the entire wind plant (e.g., atmospheric conditions).
  • Finally, the estimated freestream power at each turbine is used to determine the potential wind plant power at each timestamp needed to compute wake losses.
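
To make the sequence above concrete, here is a minimal, self-contained sketch of the correction for a single timestamp. The variable names, the toy speedup-factor table, and the simple interpolated power curve are all illustrative assumptions; this is not the actual WakeLosses implementation.

    import numpy as np
    import pandas as pd

    # Illustrative speedup-factor table: rows are wind direction bins, columns are turbines.
    # By construction, the factors in each row average to 1.
    speedup_factors = pd.DataFrame(
        {"T1": [1.03, 0.98], "T2": [0.97, 1.02]},
        index=[0.0, 90.0],  # wind direction bin centers (deg)
    )

    # Toy empirical power curve (wind speed bin centers -> mean power), linearly interpolated.
    pc_ws = np.array([3.0, 6.0, 9.0, 12.0])
    pc_power = np.array([50.0, 400.0, 1200.0, 2000.0])

    def empirical_power_curve(ws):
        # Linear interpolation between wind speed bins (cf. the IEC interpolate option below).
        return np.interp(ws, pc_ws, pc_power)

    # Assumed inputs for one timestamp: mean wind speed and total power measured by the
    # unwaked (freestream) turbines, and the wind direction bin the timestamp falls into.
    unwaked_turbines = ["T1"]
    unwaked_ws_mean = 7.5              # m/s
    unwaked_power_measured = 1500.0    # kW, summed over the unwaked turbines
    wd_bin = 0.0

    # 1) Estimated freestream wind speed at every turbine from the speedup factors.
    ws_freestream = unwaked_ws_mean * speedup_factors.loc[wd_bin]

    # 2) Estimated freestream power at every turbine from the empirical power curve.
    power_freestream = pd.Series(empirical_power_curve(ws_freestream), index=ws_freestream.index)

    # 3) Scale by measured / estimated power of the unwaked turbines to absorb plant-wide,
    #    time-dependent effects (e.g., atmospheric conditions), then sum for potential power.
    scale = unwaked_power_measured / power_freestream[unwaked_turbines].sum()
    potential_plant_power = (scale * power_freestream).sum()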

Additionally, two other enhancements are made in this PR.

  1. When correct_for_derating=True in the WakeLosses method, the power curve outlier detection now flags abnormal wind speeds in addition to derating. Derating is still flagged when power production is below normal for a given wind speed, while abnormal wind speeds are flagged when the wind speed is below normal for a given power production. All wind speed-based calculations in the code now exclude both derated points and abnormal wind speed points, instead of filtering out only derated points.
  2. An option was added to the openoa.utils.power_curve.functions.IEC power curve model to linearly interpolate power between wind speed bins instead of returning the mean power of the wind speed bin closest to the input wind speed (a brief sketch follows this list). This enhancement supports the empirical power curve used by the WakeLosses freestream wind speed heterogeneity correction.
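
The difference between the existing nearest-bin behavior and the new interpolation option can be illustrated with plain NumPy; the bin values below are made up, and this is only a conceptual sketch, not the IEC function itself.

    import numpy as np

    # Assumed binned power curve: wind speed bin centers and the mean power in each bin.
    bin_centers = np.array([4.0, 5.0, 6.0, 7.0])
    bin_mean_power = np.array([150.0, 300.0, 500.0, 750.0])

    ws = 5.4  # query wind speed (m/s)

    # Nearest-bin behavior: return the mean power of the closest wind speed bin (-> 300.0).
    nearest = bin_mean_power[np.argmin(np.abs(bin_centers - ws))]

    # Interpolated behavior: linearly interpolate power between neighboring bins (-> 380.0).
    interpolated = np.interp(ws, bin_centers, bin_mean_power)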

Affected locations in the code are described below.

  • openoa.analysis.wake_losses.WakeLosses includes the enhancements described above. Two new arguments control the wind speed heterogeneity correction: correct_for_ws_heterogeneity (True or False) and ws_speedup_factor_map, which is either a DataFrame or a path to a csv file containing the speedup factors. A usage sketch follows this list.
  • openoa.utils.power_curve.functions.IEC now contains the argument interpolate (True or False) to return linearly interpolated points from the power curve, as described above.
  • The README file and sphinx documentation have been updated to describe the wake loss heterogeneity corrections and to cite the Wind Energy paper which uses the WakeLosses analysis method.
  • examples/06_wake_loss_analysis.ipynb demonstrates how to use the freestream wind speed heterogeneity option, relying on an example wind speedup factor csv file in the examples folder called example_la_haute_borne_ws_speedup_factors.csv.
  • test/regression/wake_losses.py includes a regression test for wake loss estimation using the heterogeneity corrections. Expected results for two other tests were updated slightly because abnormal wind speed points are now flagged and removed from wind speed-based calculations in the wake loss analysis method.
  • The schema used to validate the data for wake loss analysis using met tower data has been updated to remove the requirement that wind speed be provided in the met tower data.
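
For reference, a hypothetical call using the two new arguments might look like the following. Everything other than correct_for_derating, correct_for_ws_heterogeneity, and ws_speedup_factor_map is abbreviated or assumed; see examples/06_wake_loss_analysis.ipynb for the exact constructor and run arguments.

    from openoa.analysis.wake_losses import WakeLosses

    # `project` is assumed to be an already-loaded openoa PlantData object; other
    # constructor arguments are omitted and may differ from the real signature.
    wake = WakeLosses(
        plant=project,
        correct_for_derating=True,
        correct_for_ws_heterogeneity=True,
        ws_speedup_factor_map="examples/example_la_haute_borne_ws_speedup_factors.csv",
    )
    wake.run()  # run arguments omitted; see the example notebook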

The PR addresses issue #257, though it currently doesn't use WRG files. This could be added as an option later if there is interest.

@ejsimley
Collaborator Author

Note that some tests for parts of the code not modified in this PR are failing. For at least one test this is caused by an updated version of NumPy. The other failing tests might be caused by updated package dependencies, but I will need to investigate further to understand what's going on.

@ejsimley ejsimley linked an issue Nov 14, 2025 that may be closed by this pull request
@ejsimley ejsimley marked this pull request as ready for review November 14, 2025 23:09
@ejsimley ejsimley requested a review from RHammond2 November 14, 2025 23:09
@codecov-commenter

codecov-commenter commented Nov 19, 2025

Codecov Report

❌ Patch coverage is 84.94624% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.21%. Comparing base (a53308e) to head (6834ce9).
⚠️ Report is 18 commits behind head on develop.

Files with missing lines Patch % Lines
openoa/analysis/wake_losses.py 82.71% 8 Missing and 6 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #310      +/-   ##
===========================================
- Coverage    72.49%   70.21%   -2.28%     
===========================================
  Files           29       29              
  Lines         3690     3922     +232     
  Branches       796      586     -210     
===========================================
+ Hits          2675     2754      +79     
- Misses         826      975     +149     
- Partials       189      193       +4     

☔ View full report in Codecov by Sentry.
@ejsimley
Collaborator Author

Tests are passing now

Collaborator

@RHammond2 RHammond2 left a comment

Thanks for putting all these changes together, @ejsimley; there isn't really much to comment on from a software perspective! I left one comment on a type check that I think needs to be updated; the rest are just suggestions.

Comment on lines +400 to +402
if self.correct_for_ws_heterogeneity & (
type(self.ws_speedup_factor_map) not in [pd.DataFrame, str]
):

Suggested change
if self.correct_for_ws_heterogeneity & (
type(self.ws_speedup_factor_map) not in [pd.DataFrame, str]
):
if self.correct_for_ws_heterogeneity and not isinstance(self.ws_speedup_factor_map, (pd.DataFrame, str)):

In Python 3.10 or 3.11 (can't remember which) the preferred format for the multi-type check will use the union symbol, but the tuple will be fine since we haven't updated our minimum Python version targeting yet.
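
For reference, the union form mentioned above has been available in isinstance checks since Python 3.10 (PEP 604). A minimal sketch of the equivalent check, assuming the same attributes:

    # Python >= 3.10: the union symbol works directly in isinstance; the tuple form
    # remains valid on older versions.
    if self.correct_for_ws_heterogeneity and not isinstance(self.ws_speedup_factor_map, pd.DataFrame | str):
        ...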

self.power_curve_func = power_curve.IEC(
self.aggregate_df_sample.loc[:, "windspeed_normal"].stack(future_stack=True),
self.aggregate_df_sample.loc[:, "power_normal"].stack(future_stack=True),
windspeed_end=50.0,

Should this be an option? It doesn't have to be (especially if my thought process is underinformed on the mechanics at play), but it caught my attention as something that a user could want control over.
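
If it were exposed, one possible shape is sketched below; the attribute name power_curve_windspeed_end is hypothetical and only illustrates passing a user-settable value through instead of the hard-coded 50.0.

    # Sketch only: `power_curve_windspeed_end` is a hypothetical attribute defaulting to 50.0.
    self.power_curve_func = power_curve.IEC(
        self.aggregate_df_sample.loc[:, "windspeed_normal"].stack(future_stack=True),
        self.aggregate_df_sample.loc[:, "power_normal"].stack(future_stack=True),
        windspeed_end=self.power_curve_windspeed_end,
    )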

Comment on lines +666 to +667
self.aggregate_df_sample[("windspeed_freestream_estimate", t)] = np.nan
self.aggregate_df_sample[("power_freestream_estimate", t)] = np.nan

Suggested change
self.aggregate_df_sample[("windspeed_freestream_estimate", t)] = np.nan
self.aggregate_df_sample[("power_freestream_estimate", t)] = np.nan
new_cols = ["windspeed_freestream_estimate", "power_freestream_estimate"]
self.aggregate_df_sample[list(itertools.product(new_cols, self.turbine_ids))] = np.nan

It's generally preferable to do data frame manipulations like the above once rather than repeatedly. For larger data frames or more than a handful of turbines this will have a much more noticeable impact. import itertools or from itertools import product will also have to be added up top, but it'll be worth it with the reduced number of column additions.

Comment on lines +800 to +810
valid_inds_freestream_power = (
(self.aggregate_df_sample["power_mean_freestream"] > 0)
& (
(
~self.aggregate_df_sample["derate_flag"]
* self.aggregate_df_sample["power_freestream_estimate"]
).sum(axis=1)
> 0
)
& (self.aggregate_df_sample["power_mean_freestream_estimate"] > 1.0)
)

Suggested change
valid_inds_freestream_power = (
(self.aggregate_df_sample["power_mean_freestream"] > 0)
& (
(
~self.aggregate_df_sample["derate_flag"]
* self.aggregate_df_sample["power_freestream_estimate"]
).sum(axis=1)
> 0
)
& (self.aggregate_df_sample["power_mean_freestream_estimate"] > 1.0)
)
valid_ix = self.aggregate_df_sample["power_mean_freestream"] > 0
valid_ix &= (
(
~self.aggregate_df_sample["derate_flag"]
* self.aggregate_df_sample["power_freestream_estimate"]
).sum(axis=1)
> 0
)
valid_ix &= self.aggregate_df_sample["power_mean_freestream_estimate"] > 1.0

I would recommend shortening up the valid_inds_freestream_power to valid_ix and breaking this multicriteria check into 3 distinct steps just to make it a little bit easier to follow. I think the name simplification is justified by the thorough inline documentation about this process.

Comment on lines +822 to +830
total_potential_freestream_power.loc[
~valid_inds_freestream_power
] = self.aggregate_df_sample.loc[
~valid_inds_freestream_power, "power_mean_freestream"
] * (
~self.aggregate_df_sample.loc[~valid_inds_freestream_power, "derate_flag"]
).sum(
axis=1
)

Suggested change
total_potential_freestream_power.loc[
~valid_inds_freestream_power
] = self.aggregate_df_sample.loc[
~valid_inds_freestream_power, "power_mean_freestream"
] * (
~self.aggregate_df_sample.loc[~valid_inds_freestream_power, "derate_flag"]
).sum(
axis=1
)
total_potential_freestream_power.loc[~valid_ix] = (
self.aggregate_df_sample.loc[~valid_ix, "power_mean_freestream"]
* ~self.aggregate_df_sample.loc[~valid_ix, "derate_flag"]
).sum(axis=1)

If the above suggestion is accepted, then this piece will also need to be updated, which helps reduce the amount of line breakage from the automated formatting.

# turbines are greater than zero, and mean estimated freestream power is
# sufficiently large (treated as greater than 1 kW times the number of normally
# operating freestream turbines), allowing valid potential power corrections.
valid_inds_freestream_power = (

In comparison to the above shortening, there isn't much to be done to fit any of the nested checks onto fewer lines without butchering the naming or assigning temporary variables, so this whole block works well as-is.


df_wd_bin = self.aggregate_df_sample.groupby("wind_direction_bin").sum()

index = np.arange(0.0, 360.0, self.wd_bin_width_LT_corr).tolist()

I don't think index needs to be converted to a list since Pandas can handle arrays fairly easily.
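
For example, the array can be passed straight through; the reindex call below is only a sketch of a typical downstream use, not the method's actual code.

    # Pandas accepts the ndarray directly as an index, so .tolist() is unnecessary.
    index = np.arange(0.0, 360.0, self.wd_bin_width_LT_corr)
    df_wd_bin = df_wd_bin.reindex(index=index, fill_value=0.0)  # hypothetical downstream use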

# Save long-term corrected plant and turbine-level wake losses binned by wind direction
df_1hr_wd_bin = df_1hr_bin.groupby(level=[0]).sum()

index = np.arange(0.0, 360.0, self.wd_bin_width_LT_corr).tolist()

Same comment from before about not needing to convert this to a list.


Development

Successfully merging this pull request may close these issues.

Heterogeneous power assumptions for wake loss module?
