Add averages with time dimension removed #236

tomvothecoder · 2022-05-10T22:53:52Z

Description

Closes [FEATURE]: Add averages with time dimension removed #201

This PR adds the ability to calculate averages weighted by time with the time dimension removed (via ds.temporal.average()). It simply wraps xarray's multi-line function calls into an API, with an example below for comparison.

# xarray
month_length = ds.time.dt.days_in_month
weights = (
    month_length.groupby("time.month") / month_length.groupby("time.month").sum()
)
pr_avg_xarray = ds.pr.weighted(weights).mean(dim="time")

#xcdat
pr_avg_xcdat = ds.temporal.average("pr",  freq="month")

Summary of Changes

Add unit tests for temporal.average()
Rename previous .average() to .group_average()
Refactor many private methods in class TemporalAccessor
- Rename methods and variables, update docstrings
- Refactor ._validate_weights() for a significant performance increase of about 4-5x (for the monthly Dataset I was testing on)
Remove redundant tests for private methods

Related API docs: https://xcdat.readthedocs.io/en/feature-201-temporal-mean/generated/xarray.Dataset.temporal.average.html#

Floating point closeness comparison with CDAT/cdutil: https://github.com/xCDAT/xcdat_test/blob/pr-236-validation/validation/v1.0.0/temporal_average/qa_cdat_closeness.ipynb

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
My changes generate no new warnings
Any dependent changes have been merged and published in downstream modules

If applicable:

I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass with my changes (locally and CI/CD build)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

- Update docstrings of methods in `TemporalAccessor` class - Add `DEFAULT_SEASON_CONFIG` to reduce code duplication - Add comments for sections of code that is opaque - Rename `_averager()` to `_grouped_average()`

codecov-commenter · 2022-05-10T22:56:41Z

Codecov Report

Merging #236 (8447736) into main (1fdc8a9) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #236   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9         8    -1     
  Lines          742       723   -19     
=========================================
- Hits           742       723   -19

Impacted Files	Coverage Δ
xcdat/axis.py	`100.00% <ø> (ø)`
xcdat/utils.py	`100.00% <ø> (ø)`
xcdat/__init__.py	`100.00% <100.00%> (ø)`
xcdat/bounds.py	`100.00% <100.00%> (ø)`
xcdat/dataset.py	`100.00% <100.00%> (ø)`
xcdat/spatial.py	`100.00% <100.00%> (ø)`
xcdat/temporal.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a35d6a...8447736. Read the comment docs.

- Remove `data_var` arg from `_get_weights()` - Update `_convert_df_to_dt()` to use regular dt objects for mean mode and non-monthly frequencies Add `TemporalAccessor` attribute `_dim_name` - Update docstrings

- Rename `.mean()` to `.average()` - Rename existing `.average()` to `.group_average()` - Update "mean" to "average" and "average" to "group_average in `Mode` type alias

- Rename `_grouped_average()` to `_group_average()` - Rename `_time_grouped` to `_grouped_time` - Update docstring for `_convert_df_to_dt()` - Update conditional in `_convert_df_to_dt()` - Update `_validate_weights()` to get `num_groups` from `_get_weights()`instead of `self._time_grouped` - Rename classes in `test_temporal.py` to reflect private methods - Add placeholder tests in `test_temporal.py`

tomvothecoder

Initial review comments

xcdat/temporal.py

chengzhuzhang · 2022-05-12T20:36:19Z

Hey Tom, just to follow up the xcdat meeting that we had. Regarding to two points:

new average (or mean old name) method returns weighted means for a data variable with time dimension removed
potential API naming inconsistencies: differentiate ds.temporal.average() vs ds.temporal.mean(): with the former for grouped average, later to collapse on time dimension.

I'm thinking one possibility to resolve both: is to make the new method an option of average()/group_average().
i.e.:

Assume that mean only average based on original freqency: for instance, if target frequency is not specified, the frequency should be the original data frequency (based on length of time interval), and returns means for the data variable with time dimension removed. if original data frequency is monthly or yearly, then the mean is weighted, otherwise, unweighted.

example:
ds_month = ds.temporal.average("ts", freq=None, center_times=False)
or
ds_month = ds.temporal.average("ts", center_times=False)

- Update `_group_time_coords()` to `_label_time_coords()` - Update `_groupby_freq()` to `_group_data()` - Extract `_get_dt_components()` from `_group_time_coords()` - Rename `_process_season_dataframe()` to `_process_season_df()` - Move `_convert_df_to_dt()` further down the class - Rename `DATETIME_COMPONENTS` to `TIME_GROUPS` - Update season frequency in `TIME_GROUPS` - Add logic to `_get_dt_components()` to handle seasonal frequency

tomvothecoder · 2022-05-13T21:55:38Z

I'm thinking one possibility to resolve both: is to make the new method an option of average()/group_average().
i.e.:

Assume that mean only average based on original freqency: for instance, if target frequency is not specified, the frequency should be the original data frequency (based on length of time interval), and returns means for the data variable with time dimension removed. if original data frequency is monthly or yearly, then the mean is weighted, otherwise, unweighted.

Hi Jill, thanks for the suggestion. On the surface it seems like a relatively simple solution. It would be nice to have a single API that can easily solve the naming inconsistency!

After investigating more closely with a prototype implementation, I unfortunately found that combining these two different averaging operations under a single API makes the API less predictable because each operation produces a different output based on the freq argument.

For example, an end-user might expect a grouped average output, but they get a time collapsed average because they don't pass a freq argument (and vice versa). They might not understand why until they dig into the docs (which they should probably be reference first before using any API anyways haha). For the developer, we'd have to write more conditional logic based on the method arguments and more documentation.

My proposed (and current) solution is to have two well-defined APIs that serve different purposes because I think it is easier to use and can be cleanly implemented.

The name of the APIs/methods explains what they do, while their respective docstrings fills the gap in how to use them.

.average() -- returns averages for a data variable and collapses the time dimension
- Arguments include data_var, weighted, and center_times
- I think we should include the weighted boolean arg so that the user explicitly declares whether or not output is weighted or not, rather than handling whether it is weighted internally based on the time frequency. It makes the output more predictable to the user because they know exactly what the output reflects.
.group_average() -- returns grouped averages for a data variable
- Arguments include data_var, freq, weighted, center_times, and season_config

As mentioned before, we will write an API translation table for CDAT users to help smooth the transition process to xCDAT.

Let me know what you think! Also tagging @lee1043 to see if he has any thoughts or ideas.

chengzhuzhang · 2022-05-13T22:10:27Z

@tomvothecoder Thank you for the effort to explore the different options! Appreciate it. What you suggested about two APIs actually make sense to me. Tagging @lee1043 or @pochedls to see if they have some input on this implementation or perhaps better API names?

lee1043 · 2022-05-13T22:25:47Z

@tomvothecoder @chengzhuzhang thank you for the discussion. I was initially leaning toward what @chengzhuzhang suggested earlier -- single API on top of two capabilities -- but I am leaning back to the well-defined 2 separate APIs. I think the point @tomvothecoder raised is very important -- make API predictable, which I think is one of key to be a successful community tool. While we expect users do READ docs first, we know it often not the case. In my personal experiences, I often start directly from some sample of the code, and read docs when I get stumbled. When API's predictability is great and I know what exactly API does, I can trust the API more. I think group_average is fine, and also open to any other suggestions.

- Testing private methods introduces coupling to implementation details. We should be testing public methods, which tests behaviors and outputs

- Use dataset custom dataset fixtures with less coordinate points and values other than 1 for easier and more robust testing

- Add tests for `season_config` arg - Update names of tests

tomvothecoder

Hi @lee1043 and @chengzhuzhang, this PR is ready for review.

The PR's description should contain all of the relevant information to make the review process smooth. Thanks!

tomvothecoder · 2022-05-17T21:00:52Z

xcdat/temporal.py

+    def _validate_weights(self, weights: xr.DataArray, num_groups: int):
+        """Validates that the sum of the weights for each group equals 1.0.

        Parameters
        ----------
-        data_var : xr.DataArray
-            The data variable.
        weights : xr.DataArray
-            The data variable's time coordinates weights.
+            The weights for the time coordinates.
+        num_groups : int
+            The number of groups.
        """
-        freq_groups = self._groupby_freq(data_var).count()  # type: ignore
-        # Sum the frequency group counts by all the dims except the grouped time
-        # dimension to get a 1D array of counts.
-        summing_dims = tuple(
-            x for x in freq_groups.dims if x != self._time_grouped.name
-        )
-        freq_sums = freq_groups.sum(summing_dims)
+        actual_sum = self._group_data(weights).sum().values  # type: ignore
+        expected_sum = np.ones(num_groups)

-        # Replace all non-zero counts with 1.0 (total weight of 100%).
-        expected_sum = np.where(freq_sums > 0, 1.0, freq_sums)
-        actual_sum = self._groupby_freq(weights).sum().values  # type: ignore
        np.testing.assert_allclose(actual_sum, expected_sum)


This refactored implementation of _validate_weights() speeds up weighted averages significantly.
The bottleneck was due to the call freq_groups=self_groupby_freq(data_var).count() for getting the count of unique grouping labels. Now we just pass num_groups directly to the method from _get_weights().

For the dataset I was testing on and a frequency of month, _get_weights() is now about 4-5x faster (~10.8-11.6 secs down to 2.8-3.2 seconds).

tomvothecoder · 2022-05-17T21:01:30Z

tests/test_temporal.py

@@ -1,8 +1,5 @@
-from datetime import datetime


I removed a lot of the unit tests for private methods in test_temporal.py. I learned that testing private methods is unnecessary and introduces coupling to the implementation details, which breaks encapsulation. It makes refactoring significantly more tedious because the tests for the private methods need to also be updated (also risk of test duplication with public methods). Instead, tests should revolve around public methods and their behaviors/outputs.

More info on why testing private methods is considered "bad" practice:
https://softwareengineering.stackexchange.com/a/380296
https://stackoverflow.com/questions/105007/should-i-test-private-methods-or-only-public-ones

tomvothecoder · 2022-05-17T21:17:37Z

xcdat/temporal.py

    def average(
+        self,
+        data_var: str,
+        freq: Frequency,
+        center_times: bool = False,
+        season_config: SeasonConfigInput = DEFAULT_SEASON_CONFIG,
+    ):
+        """
+        Returns a Dataset with the time weighted average of a data variable
+        and the time dimension removed.
+
+        This method is particularly useful for monthly or yearly time series
+        data since the number of days per month can vary based on the calendar
+        type, which can affect weighting. For unweighted time averages or other
+        frequencies, call xarray's native ``.mean()`` method on the data
+        variable instead.
+
+        Weights are calculated by first determining the length of time for
+        each coordinate point using the difference of its upper and lower
+        bounds. The time lengths are grouped, then each time length is
+        divided by the total sum of the time lengths to get the weight of
+        each coordinate point.
+
+        Parameters
+        ----------
+        data_var: str
+            The key of the data variable for calculating averages
+        freq : Frequency
+            The time frequency for calculating weights.
+
+            * "year": weights by year
+            * "season": weights by season
+            * "month": weights by month
+            * "day": weights by day
+            * "hour": weights by hour
+
+        center_times: bool, optional
+            If True, center time coordinates using the midpoint between its
+            upper and lower bounds. Otherwise, use the provided time
+            coordinates by default False.
+
+        season_config: SeasonConfigInput, optional
+            A dictionary for "season" frequency configurations. If configs for
+            predefined seasons are passed, configs for custom seasons are
+            ignored and vice versa.
+
+            Configs for predefined seasons:
+
+            * "dec_mode" (Literal["DJF", "JFD"], by default "DJF")
+                The mode for the season that includes December.
+
+                * "DJF": season includes the previous year December.
+                * "JFD": season includes the same year December.
+                    Xarray labels the season with December as "DJF", but it is
+                    actually "JFD".
+
+            * "drop_incomplete_djf" (bool, by default False)
+                If the "dec_mode" is "DJF", this flag drops (True) or keeps
+                (False) time coordinates that fall under incomplete DJF seasons
+                Incomplete DJF seasons include the start year Jan/Feb and the
+                end year Dec.
+
+            Configs for custom seasons:
+
+            * "custom_seasons" ([List[List[str]]], by default None)
+                List of sublists containing month strings, with each sublist
+                representing a custom season.
+
+                * Month strings must be in the three letter format (e.g., 'Jan')
+                * Each month must be included once in a custom season
+                * Order of the months in each custom season does not matter
+                * Custom seasons can vary in length
+
+                >>> # Example of custom seasons in a three month format:
+                >>> custom_seasons = [
+                >>>     ["Jan", "Feb", "Mar"],  # "JanFebMar"
+                >>>     ["Apr", "May", "Jun"],  # "AprMayJun"
+                >>>     ["Jul", "Aug", "Sep"],  # "JunJulAug"
+                >>>     ["Oct", "Nov", "Dec"],  # "OctNovDec"
+                >>> ]
+
+        Returns
+        -------
+        xr.Dataset
+            Dataset with the time weighted average of the data variable and the
+            time dimension removed.
+
+        Examples
+        --------
+
+        Get weighted averages for a monthly time series data variable:
+
+        >>> ds_month = ds.temporal.average("ts", freq="month", center_times=False)
+        >>> ds_month.ts
+        """
+        return self._averager(
+            data_var, "average", freq, True, center_times, season_config
+        )
+


The new average() method.

I wonder freq can be automatically assigned with a new parameter if weighted=True. Having freq seems can confuse with group_average.

tomvothecoder · 2022-05-18T18:10:14Z

Hi @lee1043 and @chengzhuzhang, this PR is ready for review.

The PR's description should contain all of the relevant information to make the review process smooth. Thanks!

@chengzhuzhang won't be able to review this PR this week. @lee1043 can you give this a shot whenever you get a chance? Thanks

lee1043 · 2022-05-18T23:18:10Z

I am testing this PR at this notebook. While I can give an overview at our regular meeting or we can have a separate chat, just sharing in advance for your interest. Basically what I am doing in the notebook is trying every option, even if that does not make sense, so I can learn how the function work and respond.

While I think I am understanding average() which returns temporal averaged with time coordinate collapses to 1 time step, and I know the intention of having freq there which is for weighting adjustment, but I think there is a chance that user can confused it with group average be seeing the "freq" option. I wonder if the function automatically recognize temporal interval of the given dataset and its data array, and user only decide weighted=True/False without dealing with the freq parameter, and the function automatically apply proer freq if weighted=True.

In the notebook, I also tried some silly things, such as get temporal average of monthly time series with freq=day or hour (which does not make sense but just for curious what would return) and compare their differences in Section 1.6. theoretically they might have to return the same value as the original monthly data because they don't have higher frequency input, but It is interesting to see how different they are and there are some patterns in the differences.

Regarding the group average, in the notebook I am using monthly time series as input (shape: (1872, Y, X)), so when I was doing ds_gra_month = ds.temporal.group_average("ts", freq="month") in Section 2.1, my logical sense was expecting a shape: (12, Y, X) (group same months together and get average of each group, like getting an annual cycle). However what returned was shape: (1872, Y, X), so same as the original one.

For year and season, it seems right numbers were returned for the time dimension.

So far I mostly checked number in time dimensions and yet to cross-validate returned values with CDAT. I will continue to work on that.

tomvothecoder · 2022-05-19T17:13:34Z

Thanks for reviewing @lee1043! You have a great validation process going on.

While I think I am understanding average() which returns temporal averaged with time coordinate collapses to 1 time step, and I know the intention of having freq there which is for weighting adjustment, but I think there is a chance that user can confused it with group average be seeing the "freq" option. I wonder if the function automatically recognize temporal interval of the given dataset and its data array, and user only decide weighted=True/False without dealing with the freq parameter, and the function automatically apply proer freq if weighted=True.

I agree, and Jill mentioned a similar solution to what you outlined.

I think it might confuse users if there is a freq argument for specifying the frequency used to calculate weights. Users might unintentionally pass a frequency that is not aligned with the actual frequency of the time intervals, resulting in incorrect weighting.

I will try to figure out how cdutil recognizes the time interval. If cdutil's implementation isn't clear and/or robust, we can figure out our own implementation.

In the notebook, I also tried some silly things, such as get temporal average of monthly time series with freq=day or hour (which does not make sense but just for curious what would return) and compare their differences in Section 1.6. theoretically they might have to return the same value as the original monthly data because they don't have higher frequency input, but It is interesting to see how different they are and there are some patterns in the differences.

I noticed this too. This isn't necessarily an implementation bug, but if the user passes a frequency that isn't aligned with the actual frequency of the time intervals (e.g., freq="day" with monthly data), it can produce unintended odd behaviors.

For example, with the monthly time coordinates ['1850-01-16T12:00:00.000000000', '1850-02-15T00:00:00.000000000']:

If user passes freq="day": calculate weights using groups day 16 and day 15
If user passes freq="hour": calculates weights using groups hour 12 and hour 0

This is possible for the user if they want to calculate weights like that, but it doesn't really make sense. Removing the freq argument will eliminate the possibility of passing incorrect/unintended frequencies.

Regarding the group average, in the notebook I am using monthly time series as input (shape: (1872, Y, X)), so when I was doing ds_gra_month = ds.temporal.group_average("ts", freq="month") in Section 2.1, my logical sense was expecting a shape: (12, Y, X) (group same months together and get average of each group, like getting an annual cycle). However what returned was shape: (1872, Y, X), so same as the original one.

The grouping behaviors for group averages and climatologies are based on CDAT.

In the case of the monthly time series input, grouping on month will return the original data because freq=month groups on year and month.

Here is a code snippet to validate this behavior against CDAT:

import cdms2
import cdutil
import numpy as np
import xarray as xr

import xcdat

filepath = "/home/vo13/XCDAT/xcdat/input/demo_data/CMIP5_demo_data/pr_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc"

ds = xcdat.open_dataset(filepath)
ds_cdat = cdms2.openDataset(filepath)

pr_avg_cdat = cdutil.ANNUALCYCLE(ds_cdat["pr"])
# pr_avg_cdat.shape  == (1872, 145, 192)

pr_avg_xcdat = ds.temporal.group_average("pr", freq="month")["pr"]
# pr_avg_xcdat.shape == (1872, 145, 192)

The grouping behavior that produces (12, Y, X) is a result of calculating the climatology, not the group average:

pr_climo_cdat = cdutil.ANNUALCYCLE.climatology(ds_cdat["pr"])
pr_climo_cdat.shape == (12, 145, 192)

pr_climo_xcdat = ds.temporal.climatology("pr", freq="month")["pr"]
pr_climo_xcdat.shape == (12, 145, 192)

The code lines below show how grouping is done in xCDAT and CDAT, which was also validated in this notebook
https://github.com/xCDAT/xcdat_test/blob/master/validation/temporal_average/floating_point_comparisons.ipynb.

xcdat/xcdat/temporal.py

Lines 33 to 59 in da2cf9e

    
           #: A dictionary mapping temporal averaging mode and frequency to the time groups. 
        
           TIME_GROUPS: Dict[Mode, Dict[Frequency, Tuple[DateTimeComponent, ...]]] = { 
        
               "average": { 
        
                   "year": ("year",), 
        
                   "season": ("season",), 
        
                   "month": ("month",), 
        
                   "day": ("day",), 
        
                   "hour": ("hour",), 
        
               }, 
        
               "group_average": { 
        
                   "year": ("year",), 
        
                   "season": ("year", "season"), 
        
                   "month": ("year", "month"), 
        
                   "day": ("year", "month", "day"), 
        
                   "hour": ("year", "month", "day", "hour"), 
        
               }, 
        
               "climatology": { 
        
                   "season": ("season",), 
        
                   "month": ("month",), 
        
                   "day": ("month", "day"), 
        
               }, 
        
               "departures": { 
        
                   "season": ("season",), 
        
                   "month": ("month",), 
        
                   "day": ("month", "day"), 
        
               }, 
        
           }

- Update `TestAverage` unit tests to test different time frequencies

tomvothecoder · 2022-05-19T22:09:46Z

xcdat/temporal.py

+    def _infer_freq(self) -> Frequency:
+        """Infers the time frequency from the coordinates.
+
+        This method infers the time frequency from the coordinates by
+        calculating the minimum delta and comparing it against a set of
+        conditionals.
+
+        The native ``xr.infer_freq()`` method does not work for all cases
+        because the frequency can be irregular (e.g., different hour
+        measurements), which ends up returning None.
+
+        Returns
+        -------
+        Frequency
+            The time frequency.
+        """
+        time_coords = self._dataset[self._dim_name]
+        min_delta = pd.to_timedelta(np.diff(time_coords).min(), unit="ns")
+
+        if min_delta < pd.Timedelta(days=1):
+            return "hour"
+        elif min_delta >= pd.Timedelta(days=1) and min_delta < pd.Timedelta(days=28):
+            return "day"
+        elif min_delta >= pd.Timedelta(days=28) and min_delta < pd.Timedelta(days=365):
+            return "month"
+        else:
+            return "year"


@lee1043 I added this new method, _infer_freq() for inferring the freq of the coordinates. The frequency is used for calculating weights if weighed=True.

Let me know how the algorithm looks, thanks.

@tomvothecoder I think this is great and very straightforward, thank you for revising it. I am playing with the new function in the notebook now.

P.S: typo in above comment weighed=True : weighed --> weighted

tomvothecoder · 2022-05-19T22:11:30Z

xcdat/temporal.py

+    def _average(self, data_var: xr.DataArray) -> xr.DataArray:
+        """
+        Returns the weighted averages for a data variable with the time
+        dimension removed.
+
+        Parameters
+        ----------
+        data_var : xr.DataArray
+            The data variable.
+
+        Returns
+        -------
+        xr.DataArray
+        The weighted averages for a data variable with the time dimension
+        removed.
+        """
+        dv = data_var.copy()

-        This method groups the data variable's values by the time coordinates
-        and averages them with or without weights. The parameters for
-        ``self._temporal_average()`` are stored as DataArray attributes in the
-        averaged data variable.
+        with xr.set_options(keep_attrs=True):
+            if self._weighted:
+                self._weights = self._get_weights()
+                dv = dv.weighted(self._weights).mean(dim=self._dim_name)
+            else:
+                dv = dv.mean(dim=self._dim_name)
+
+        dv = self._add_operation_attrs(dv)
+
+        return dv


This is the main method that performs the work for ds.temporal.average().

lee1043 · 2022-05-20T19:13:30Z

@tomvothecoder thank you for the revision. I have tested the function in following notebooks in this PR that are using

And I think the function is working as expected!

tomvothecoder · 2022-05-23T16:19:32Z

@lee1043 Great, thanks for adding those notebooks! I will do a final review and then merge.

chengzhuzhang · 2022-05-24T00:20:16Z

@lee1043 thank you for reviewing and testing this PR, also great to get more validation notebooks for it! At some point, we can also do a similar test as Steve's, to loop over all CMIP models and compare results between cdat and xcdat.

tomvothecoder · 2022-05-24T15:45:58Z

@chengzhuzhang I agree, I'll ask Steve to post his scripts so we can reuse them for validation.

tomvothecoder added 2 commits May 10, 2022 13:09

Add mean() to TemporalAccessor class

56fa075

- Update docstrings of methods in `TemporalAccessor` class - Add `DEFAULT_SEASON_CONFIG` to reduce code duplication - Add comments for sections of code that is opaque - Rename `_averager()` to `_grouped_average()`

Add support for additional frequencies

1ed1906

tomvothecoder added type: enhancement New enhancement request Priority: Medium labels May 10, 2022

tomvothecoder self-assigned this May 10, 2022

tomvothecoder added 5 commits May 11, 2022 08:56

Update docstrings

6f8ad3e

Refactor _validate_weights() to optimize performance

360e370

- Remove `data_var` arg from `_get_weights()` - Update `_convert_df_to_dt()` to use regular dt objects for mean mode and non-monthly frequencies Add `TemporalAccessor` attribute `_dim_name` - Update docstrings

Rename TemporalAccessor methods for consistency

4a55ae8

- Rename `.mean()` to `.average()` - Rename existing `.average()` to `.group_average()` - Update "mean" to "average" and "average" to "group_average in `Mode` type alias

Update api.rst methods list

79efc1c

tomvothecoder commented May 12, 2022

View reviewed changes

tomvothecoder added 3 commits May 12, 2022 11:29

Apply suggestions from code review

360ef93

Fix docstrings from PR review commit

cd309f9

Update comment

eeb249a

chengzhuzhang self-requested a review May 12, 2022 20:22

tomvothecoder added 2 commits May 12, 2022 16:44

Update method names, docstrings, and comments

ff55341

tomvothecoder added 7 commits May 16, 2022 10:31

Update docstrings

d3714be

Update Test_CenterTimes to TestCenterTimes

cb8479b

Remove redundant tests for private methods

e6d7b1a

- Testing private methods introduces coupling to implementation details. We should be testing public methods, which tests behaviors and outputs

Update TestAverage and TestGroupAverage fixtures

5e8f7ab

- Use dataset custom dataset fixtures with less coordinate points and values other than 1 for easier and more robust testing

Rename test methods and update average docstring

13d34d9

Add season_config to average()

c957760

- Add tests for `season_config` arg - Update names of tests

Revert operation string update in test

131ff69

tomvothecoder force-pushed the feature/201-temporal-mean branch from 38ac565 to 131ff69 Compare May 17, 2022 18:38

tomvothecoder added 3 commits May 17, 2022 12:06

Fix failing test

d8b60ac

Add drop_dims in TestAverage tests

1cffe6d

Update docstrings for "JFD"

da2cf9e

This was referenced May 17, 2022

Add validation notebook for PR 236 xCDAT/xcdat-validation#30

Closed

Add validation notebook for PR 236 xCDAT/xcdat-validation#31

Merged

tomvothecoder changed the title ~~Add weighted time averages with time dimension removed~~ Add averages weighted by time with time dimension removed May 17, 2022

tomvothecoder added the breaking-change A breaking change, intended for a major release. label May 17, 2022

tomvothecoder requested a review from lee1043 May 17, 2022 22:25

tomvothecoder commented May 17, 2022

View reviewed changes

Add _infer_freq() for inferring time freq

f69df41

- Update `TestAverage` unit tests to test different time frequencies

tomvothecoder commented May 19, 2022

View reviewed changes

Revert accidental code change

8447736

tomvothecoder changed the title ~~Add averages weighted by time with time dimension removed~~ Add averages with time dimension removed May 23, 2022

Final review comments

2d94c2c

tomvothecoder merged commit 112eb58 into main May 23, 2022

tomvothecoder deleted the feature/201-temporal-mean branch May 23, 2022 17:53

tomvothecoder mentioned this pull request May 26, 2022

[FEATURE]: Refactor test suite that covers private tests #242

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add averages with time dimension removed #236

Add averages with time dimension removed #236

tomvothecoder commented May 10, 2022 •

edited

Loading

codecov-commenter commented May 10, 2022 •

edited

Loading

tomvothecoder left a comment

chengzhuzhang commented May 12, 2022 •

edited

Loading

tomvothecoder commented May 13, 2022

chengzhuzhang commented May 13, 2022

lee1043 commented May 13, 2022

tomvothecoder left a comment

tomvothecoder May 17, 2022

tomvothecoder May 17, 2022

tomvothecoder May 17, 2022

lee1043 May 18, 2022

tomvothecoder commented May 18, 2022

lee1043 commented May 18, 2022 •

edited

Loading

tomvothecoder commented May 19, 2022 •

edited

Loading

tomvothecoder May 19, 2022 •

edited

Loading

lee1043 May 20, 2022 •

edited

Loading

tomvothecoder May 19, 2022

lee1043 commented May 20, 2022 •

edited

Loading

tomvothecoder commented May 23, 2022

chengzhuzhang commented May 24, 2022

tomvothecoder commented May 24, 2022

Add averages with time dimension removed #236

Add averages with time dimension removed #236

Conversation

tomvothecoder commented May 10, 2022 • edited Loading

Description

Checklist

codecov-commenter commented May 10, 2022 • edited Loading

Codecov Report

tomvothecoder left a comment

Choose a reason for hiding this comment

chengzhuzhang commented May 12, 2022 • edited Loading

tomvothecoder commented May 13, 2022

chengzhuzhang commented May 13, 2022

lee1043 commented May 13, 2022

tomvothecoder left a comment

Choose a reason for hiding this comment

tomvothecoder May 17, 2022

Choose a reason for hiding this comment

tomvothecoder May 17, 2022

Choose a reason for hiding this comment

tomvothecoder May 17, 2022

Choose a reason for hiding this comment

lee1043 May 18, 2022

Choose a reason for hiding this comment

tomvothecoder commented May 18, 2022

lee1043 commented May 18, 2022 • edited Loading

tomvothecoder commented May 19, 2022 • edited Loading

tomvothecoder May 19, 2022 • edited Loading

Choose a reason for hiding this comment

lee1043 May 20, 2022 • edited Loading

Choose a reason for hiding this comment

tomvothecoder May 19, 2022

Choose a reason for hiding this comment

lee1043 commented May 20, 2022 • edited Loading

tomvothecoder commented May 23, 2022

chengzhuzhang commented May 24, 2022

tomvothecoder commented May 24, 2022

tomvothecoder commented May 10, 2022 •

edited

Loading

codecov-commenter commented May 10, 2022 •

edited

Loading

chengzhuzhang commented May 12, 2022 •

edited

Loading

lee1043 commented May 18, 2022 •

edited

Loading

tomvothecoder commented May 19, 2022 •

edited

Loading

tomvothecoder May 19, 2022 •

edited

Loading

lee1043 May 20, 2022 •

edited

Loading

lee1043 commented May 20, 2022 •

edited

Loading