|
| 1 | + |
| 2 | +<!-- README.md is generated from README.Rmd. Please edit that file --> |
| 3 | + |
| 4 | +# harpPoint <img src='man/figures/harp_logo_dark.svg' align="right" width = "80" /> |
| 5 | + |
| 6 | +<!-- badges: start --> |
| 7 | +<!-- badges: end --> |
| 8 | + |
| 9 | +harpPoint provides functionality for the verification of meteorological |
| 10 | +data at geographic points. Typically this would be the verification of |
| 11 | +forecasts interpolated to the locations of weather stations. Functions |
| 12 | +are provided for computing verification scores for both deterministic |
| 13 | +and ensemble forecasts. In addition, confidence intervals for scores, or |
| 14 | +for the differences between scores for different forecast models, can be |
| 15 | +computed using bootstrapping. |
| 16 | + |
| 17 | +## Installation |
| 18 | + |
| 19 | +You can install harpPoint from [GitHub](https://github.com/) with: |
| 20 | + |
| 21 | +``` r |
| 22 | +# install.packages("remotes") |
| 23 | +remotes::install_github("harphub/harpPoint") |
| 24 | +``` |
| 25 | + |
| 26 | +## Verification |
| 27 | + |
| 28 | +harpPoint functions for verification are designed to work with data read |
| 29 | +in using functions from [harpIO](https://harphub.github.io/harpIO). This |
| 30 | +means `harp_df` data frames and `harp_list`s. These data must include a |
| 31 | +column for observations against which the forecasts should be verified. |
| 32 | + |
| 33 | +There are two main functions for verification: |
| 34 | + |
| 35 | +[`det_verify()`](https://harphub.github.io/harpPoint/reference/det_verify.html) - |
| 36 | +for deterministic forecasts; |
| 37 | + |
| 38 | +[`ens_verify()`](https://harphub.github.io/harpPoint/reference/ens_verify.html) - |
| 39 | +for ensemble forecasts. |
| 40 | + |
| 41 | +Both of these functions will output a `harp_verif` list. This is a list |
| 42 | +of data frames with scores separated out into summary scores and |
| 43 | +threshold scores. Threshold scores are computed when thresholds are |
| 44 | +provided to the functions and are computed for probabilities of |
| 45 | +threshold exceedance. |
| 46 | + |
| 47 | +## Deterministic scores |
| 48 | + |
| 49 | +`det_verify()` computes scores with the following column names: |
| 50 | + |
| 51 | +### *Summary scores* |
| 52 | + |
| 53 | +- **bias** - The mean difference between forecasts and observations |
| 54 | +- **rmse** - The root mean squared error |
| 55 | +- **mae** - The mean of the absolute error |
| 56 | +- **stde** - The standard deviation of the error |
| 57 | +- **hexbin** - A heat map of paired hexagonal bins of forecasts and |
| 58 | + observations |
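
As an illustration of the summary scores listed above, here is a minimal base R sketch that computes them by hand for a handful of made-up forecast and observation pairs. It does not call `det_verify()`; the data and variable names are hypothetical, and using `sd()` (with its n - 1 denominator) for **stde** is an assumption rather than a statement about harpPoint's internals.

```r
# Hypothetical paired forecasts and observations (e.g. 2 m temperature)
fcst <- c(2.1, 3.4, 0.5, -1.2, 4.0)
obs  <- c(1.8, 3.9, 0.0, -0.7, 3.5)

# Forecast error for each pair
err <- fcst - obs

scores <- c(
  bias = mean(err),          # mean difference between forecasts and observations
  rmse = sqrt(mean(err^2)),  # root mean squared error
  mae  = mean(abs(err)),     # mean of the absolute error
  stde = sd(err)             # standard deviation of the error (n - 1 denominator)
)

print(round(scores, 3))
```

The mean squared error splits into the squared bias plus the population variance of the error, which is why bias and stde are often read together with rmse.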
| 59 | + |
| 60 | +### *Threshold scores* |
| 61 | + |
| 62 | +- **cont_tab** - A contingency table of forecast hits, misses, false |
| 63 | + alarms and correct rejections |
| 64 | +- **threat_score** - The ratio of hits to the sum of hits, misses and |
| 65 | + false alarms |
| 66 | +- **hit_rate** - The ratio of hits to the sum of hits and misses |
| 67 | +- **miss_rate** - The ratio of misses to the sum of hits and misses |
| 68 | +- **false_alarm_rate** - The ratio of false alarms to the sum of false |
| 69 | + alarms and correct rejections |
| 70 | +- **false_alarm_ratio** - The ratio of false alarms to the sum of false |
| 71 | + alarms and hits |
| 72 | +- **heidke_skill_score** - The fraction of correct forecasts after |
| 73 | + eliminating those forecasts that would be correct purely due to random |
| 74 | + chance |
| 75 | +- **pierce_skill_score** - 1 - miss rate - false alarm rate |
| 76 | +- **kuiper_skill_score** - How well the forecasts separate hits from false |
| 77 | + alarms |
| 78 | +- **percent_correct** - The ratio of the sum of hits and correct |
| 79 | + rejections to the total number of cases |
| 80 | +- **frequency_bias** - The ratio of the sum of hits and false alarms to |
| 81 | + the sum of hits and misses |
| 82 | +- **equitable_threat_score** - How well the forecast measures hits |
| 83 | + accounting for hits due to pure chance |
| 84 | +- **odds_ratio** - The ratio of the product of hits and correct |
| 85 | + rejections to the product of misses and false alarms |
| 86 | +- **log_odds_ratio** - The sum of the logs of hits and correct |
| 87 | + rejections minus the sum of the logs of misses and false alarms |
| 88 | +- **odds_ratio_skill_score** - The ratio of the product of hits and |
| 89 | + correct rejections minus the product of misses and false alarms to the |
| 90 | + product of hits and correct rejections plus the product of misses and |
| 91 | + false alarms |
| 92 | +- **extreme_dependency_score** - The ratio of the difference between the |
| 93 | + logs of observations climatology and hit rate to the sum of the logs |
| 94 | + of observations climatology and hit rate |
| 95 | +- **symmetric_eds** - The symmetric extreme dependency score, which is |
| 96 | + the ratio of the difference between the logs of forecast climatology |
| 97 | + and hit rate to the sum of the logs of forecast climatology and hit |
| 98 | + rate |
| 99 | +- **extreme_dependency_index** - The ratio of the difference between the |
| 100 | + logs of false alarm rate and hit rate to the sum of the logs of false |
| 101 | + alarm rate and hit rate |
| 102 | +- **symmetric_edi** - The symmetric extreme dependency index, which is |
| 103 | + the ratio of the sum of the difference between the logs of false alarm |
| 104 | + rate and hit rate and the difference between the logs of the inverse |
| 105 | + hit rate and false alarm rate to the sum of the logs of hit rate, |
| 106 | + false alarm rate, inverse hit rate and inverse false alarm rate. Here |
| 107 | + the inverse is 1 - the value. |
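
Several of the threshold score definitions above are easier to follow as arithmetic. The following base R sketch works through them for one made-up contingency table. The counts and variable names are hypothetical, the formulas are the textbook ones matching the descriptions in the list, and harpPoint's own implementation may differ in details such as the handling of zero counts.

```r
# Hypothetical contingency table counts for a single threshold
hits               <- 30
misses             <- 10
false_alarms       <- 20
correct_rejections <- 140
n <- hits + misses + false_alarms + correct_rejections

hit_rate          <- hits / (hits + misses)
miss_rate         <- misses / (hits + misses)
false_alarm_rate  <- false_alarms / (false_alarms + correct_rejections)
false_alarm_ratio <- false_alarms / (false_alarms + hits)
threat_score      <- hits / (hits + misses + false_alarms)
frequency_bias    <- (hits + false_alarms) / (hits + misses)
percent_correct   <- (hits + correct_rejections) / n

# Hits and correct forecasts expected from random chance
hits_random    <- (hits + misses) * (hits + false_alarms) / n
correct_random <- hits_random +
  (correct_rejections + misses) * (correct_rejections + false_alarms) / n

equitable_threat_score <- (hits - hits_random) /
  (hits + misses + false_alarms - hits_random)
heidke_skill_score <- (hits + correct_rejections - correct_random) /
  (n - correct_random)
pierce_skill_score <- 1 - miss_rate - false_alarm_rate

odds_ratio     <- (hits * correct_rejections) / (misses * false_alarms)
log_odds_ratio <- log(hits) + log(correct_rejections) -
  log(misses) - log(false_alarms)
odds_ratio_skill_score <-
  (hits * correct_rejections - misses * false_alarms) /
  (hits * correct_rejections + misses * false_alarms)

# Extreme dependency index and its symmetric form
log_f <- log(false_alarm_rate)
log_h <- log(hit_rate)
extreme_dependency_index <- (log_f - log_h) / (log_f + log_h)
symmetric_edi <-
  (log_f - log_h + log(1 - hit_rate) - log(1 - false_alarm_rate)) /
  (log_f + log_h + log(1 - hit_rate) + log(1 - false_alarm_rate))
```

With these counts the hit rate is 0.75 and the false alarm rate is 0.125, so `pierce_skill_score` equals their difference, 0.625.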
| 108 | + |
| 109 | +## Ensemble scores |
| 110 | + |
| 111 | +`ens_verify()` computes scores with the following column names: |
| 112 | + |
| 113 | +### *Summary scores* |
| 114 | + |
| 115 | +- **mean_bias** - The mean difference between the ensemble mean of |
| 116 | + forecasts and observations |
| 117 | +- **rmse** - The root mean squared error |
| 118 | +- **stde** - The standard deviation of the error |
| 119 | +- **spread** - The square root of the mean variance of the ensemble |
| 120 | + forecasts |
| 121 | +- **hexbin** - A heat map of paired hexagonal bins of forecasts and |
| 122 | + observations |
| 123 | +- **rank_histogram** - Observation counts ranked by ensemble member bins |
| 124 | +- **crps** - The continuous ranked probability score - the integrated |
| 125 | + squared difference between the cumulative distribution of the ensemble |
| 126 | + forecasts and the step function of the observations |
| 127 | +- **crps_potential** - The crps that could be achieved with a perfectly |
| 128 | + reliable ensemble |
| 129 | +- **crps_reliability** - Measures the ability of the ensemble to produce |
| 130 | + a cumulative distribution with desired statistical properties |
| 131 | +- **fair_crps** - The crps that would be achieved for either an ensemble |
| 132 | + with an infinite number of members, or for a number of members |
| 133 | + provided to the function |
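
To make some of the ensemble summary scores above concrete, here is a small base R sketch on made-up data. It does not call `ens_verify()`. The matrix layout (rows are cases, columns are members), the helper `crps_one()`, and the choice to compute **rmse** from the ensemble mean and **spread** with `var()` (n - 1 denominator) are all assumptions for illustration. The crps here is the standard estimator for an empirical ensemble distribution.

```r
# Hypothetical ensemble forecasts: rows are cases, columns are members
ens <- matrix(
  c(1.0, 2.0, 3.0,
    0.5, 1.5, 1.0,
    4.0, 2.0, 3.0,
    2.5, 3.5, 3.0),
  ncol = 3, byrow = TRUE
)
obs <- c(2.5, 0.0, 3.5, 4.0)

ens_mean  <- rowMeans(ens)
mean_bias <- mean(ens_mean - obs)
rmse      <- sqrt(mean((ens_mean - obs)^2))

# Square root of the mean ensemble variance
spread <- sqrt(mean(apply(ens, 1, var)))

# Rank of each observation among its ensemble members
# (1 = below all members, ncol(ens) + 1 = above all members)
obs_rank       <- rowSums(ens < obs) + 1
rank_histogram <- table(factor(obs_rank, levels = seq_len(ncol(ens) + 1)))

# CRPS of one empirical ensemble: mean absolute error of the members
# minus half the mean absolute difference between all member pairs
crps_one <- function(members, y) {
  m <- length(members)
  mean(abs(members - y)) -
    sum(abs(outer(members, members, "-"))) / (2 * m^2)
}
crps <- mean(vapply(
  seq_len(nrow(ens)),
  function(i) crps_one(ens[i, ], obs[i]),
  numeric(1)
))
```

For a single-member "ensemble" the pairwise term vanishes and `crps_one()` reduces to the absolute error, which is why the crps is often described as a generalisation of the mean absolute error.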
| 134 | + |
| 135 | +### *Threshold scores* |
| 136 | + |
| 137 | +- **brier_score** - The mean of the squared error of the ensemble in |
| 138 | + probability space |
| 139 | +- **fair_brier_score** - The Brier score that would be achieved for |
| 140 | + either an ensemble with an infinite number of members, or for a number |
| 141 | + of members provided to the function |
| 142 | +- **brier_skill_score** - The Brier score compared to that of a |
| 143 | + reference probabilistic forecast (usually the observed climatology) |
| 144 | +- **brier_score_reliability** - A measure of the ensemble’s ability to |
| 145 | + produce reliable (forecast probability = observed frequency) forecasts |
| 146 | +- **brier_score_resolution** - A measure of the ensemble’s ability to |
| 147 | + discriminate between “on the day” uncertainty and climatological |
| 148 | + uncertainty |
| 149 | +- **brier_score_uncertainty** - The inherent uncertainty of the events |
| 150 | +- **reliability** - The frequency of observations for bins of forecast |
| 151 | + probability |
| 152 | +- **roc** - The relative operating characteristic of the ensemble - the |
| 153 | + hit rates and false alarm rates for forecast probability bins |
| 154 | +- **roc_area** - The area under a roc curve - summarises the ability of |
| 155 | + the ensemble to discriminate between events and non events |
| 156 | +- **economic_value** - The relative improvement in economic value of the |
| 157 | + forecast compared to climatology for a range of cost / loss ratios |
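
The Brier score and its components described above can be sketched in base R as follows. This is an illustration on made-up data rather than harpPoint's implementation. The threshold, the use of `>=` to define an event, the use of the sample climatology as the reference forecast, and binning by the distinct forecast probabilities are all assumptions.

```r
# Hypothetical ensemble forecasts: rows are cases, columns are members
ens <- matrix(
  c(0.2, 0.5, 1.2, 0.8,
    1.5, 2.0, 1.1, 0.9,
    0.1, 0.3, 0.2, 0.6,
    1.4, 1.6, 2.2, 1.9,
    0.7, 1.3, 1.1, 0.4,
    0.9, 1.0, 0.3, 0.6),
  ncol = 4, byrow = TRUE
)
obs       <- c(0.4, 1.8, 0.0, 2.5, 1.2, 0.2)
threshold <- 1

# Forecast probability = fraction of members at or above the threshold
fcst_prob <- rowMeans(ens >= threshold)
obs_event <- as.numeric(obs >= threshold)

# Mean squared error in probability space
brier_score <- mean((fcst_prob - obs_event)^2)

# Reference forecast: the observed frequency of the event in this sample
obs_freq                <- mean(obs_event)
brier_score_uncertainty <- obs_freq * (1 - obs_freq)
brier_skill_score       <- 1 - brier_score / brier_score_uncertainty

# Decomposition over bins of identical forecast probability
n       <- length(obs_event)
bin_n   <- tapply(obs_event, fcst_prob, length)  # cases per bin
bin_obs <- tapply(obs_event, fcst_prob, mean)    # observed frequency per bin
bin_p   <- tapply(fcst_prob, fcst_prob, mean)    # forecast probability per bin

brier_score_reliability <- sum(bin_n * (bin_p - bin_obs)^2) / n
brier_score_resolution  <- sum(bin_n * (bin_obs - obs_freq)^2) / n
```

When the bins are the distinct forecast probabilities, as here, the Brier score equals reliability minus resolution plus uncertainty. The pairs of `bin_p` and `bin_obs` are also the points of a reliability diagram.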
| 158 | + |
| 159 | +## Getting gridded data to points |
| 160 | + |
| 161 | +For interpolation of gridded data to points see the [Interpolate |
| 162 | +section](https://harphub.github.io/harpIO/articles/transformations.html#interpolate) |
| 163 | +of the [Transforming model |
| 164 | +data](https://harphub.github.io/harpIO/articles/transformations.html) |
| 165 | +article on the [harpIO website](https://harphub.github.io/harpIO), or |
| 166 | +the documentation for |
| 167 | +[geo_points](https://harphub.github.io/harpCore/reference/geo_points.html). |