
Commit e1182ca

Add README
1 parent 6eae6b9 commit e1182ca

4 files changed

+649 -0 lines changed

.Rbuildignore

+1
@@ -1,3 +1,4 @@
 ^.*\.Rproj$
 ^\.Rproj\.user$
 ^LICENSE\.md$
+^README\.Rmd$

README.Rmd

+165
@@ -0,0 +1,165 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# harpPoint <img src='man/figures/harp_logo_dark.svg' align="right" width = "80" />

<!-- badges: start -->
<!-- badges: end -->

harpPoint provides functionality for the verification of meteorological data at
geographic points. Typically this would be the verification of forecasts
interpolated to the locations of weather stations. Functions are provided for
computing verification scores for both deterministic and ensemble forecasts. In
addition, confidence intervals for scores, or for the differences between
scores for different forecast models, can be computed using bootstrapping.

## Installation

You can install harpPoint from [GitHub](https://github.com/) with:

``` r
# install.packages("remotes")
remotes::install_github("harphub/harpPoint")
```

## Verification

harpPoint functions for verification are designed to work with
data read in using functions from [harpIO](https://harphub.github.io/harpIO).
This means `harp_df` data frames and `harp_lists`. These data must include a
column for observations against which the forecasts should be verified.

There are two main functions for verification:

[`det_verify()`](https://harphub.github.io/harpPoint/reference/det_verify.html)
- for deterministic forecasts;

[`ens_verify()`](https://harphub.github.io/harpPoint/reference/ens_verify.html)
- for ensemble forecasts.

Both of these functions output a `harp_verif` list. This is a list of
data frames with scores separated into summary scores and threshold scores.
Threshold scores are computed when thresholds are provided to the functions,
and are computed for probabilities of threshold exceedance.
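
As a minimal sketch (the object and column names `det_fcst`, `ens_fcst` and
`T2m` are illustrative assumptions, not part of the package), verification of
forecasts that have already been joined to their observations might look like
this:

``` r
library(harpPoint)

# det_fcst and ens_fcst are assumed to be harp_df / harp_list objects of point
# forecasts, read with harpIO and joined to observations, with the observed
# values in a column called T2m.
det_scores <- det_verify(det_fcst, T2m)

# Supplying thresholds additionally computes threshold scores for the
# probability of exceeding each threshold (values here are in Kelvin).
ens_scores <- ens_verify(ens_fcst, T2m, thresholds = c(273.15, 283.15, 293.15))

# Both return a harp_verif list of data frames.
str(det_scores, max.level = 1)
str(ens_scores, max.level = 1)
```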

## Deterministic scores

`det_verify()` computes scores with the following column names:

### _Summary scores_

* __bias__ - The mean difference between forecasts and observations
* __rmse__ - The root mean squared error
* __mae__ - The mean of the absolute error
* __stde__ - The standard deviation of the error
* __hexbin__ - A heat map of paired hexagonal bins of forecasts and observations
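
The summary scores above are the usual error statistics of the paired
forecast-observation differences. An illustrative sketch in plain R with
made-up data (not harpPoint code; `det_verify()` computes these for you):

``` r
# Made-up paired vectors of observations and forecasts (Kelvin).
set.seed(42)
obs <- rnorm(1000, mean = 285, sd = 5)
fc  <- obs + rnorm(1000, mean = 0.5, sd = 2)

err  <- fc - obs
bias <- mean(err)         # mean difference between forecasts and observations
rmse <- sqrt(mean(err^2)) # root mean squared error
mae  <- mean(abs(err))    # mean absolute error
stde <- sd(err)           # standard deviation of the error
```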

### _Threshold scores_

* __cont_tab__ - A contingency table of forecast hits, misses, false alarms and
correct rejections
* __threat_score__ - The ratio of hits to the sum of hits, misses and false
alarms
* __hit_rate__ - The ratio of hits to the sum of hits and misses
* __miss_rate__ - The ratio of misses to the sum of hits and misses
* __false_alarm_rate__ - The ratio of false alarms to the sum of false alarms
and correct rejections
* __false_alarm_ratio__ - The ratio of false alarms to the sum of false alarms
and hits
* __heidke_skill_score__ - The fraction of correct forecasts after eliminating
those forecasts that would be correct purely due to random chance
* __pierce_skill_score__ - The hit rate minus the false alarm rate
(equivalently, 1 - miss rate - false alarm rate)
* __kuiper_skill_score__ - How well the forecasts separate hits from false
alarms
* __percent_correct__ - The ratio of the sum of hits and correct rejections to
the total number of cases
* __frequency_bias__ - The ratio of the sum of hits and false alarms to the sum
of hits and misses
* __equitable_threat_score__ - The threat score adjusted to account for hits
that would occur purely by chance
* __odds_ratio__ - The ratio of the product of hits and correct rejections to
the product of misses and false alarms
* __log_odds_ratio__ - The sum of the logs of hits and correct rejections minus
the sum of the logs of misses and false alarms
* __odds_ratio_skill_score__ - The ratio of the product of hits and correct
rejections minus the product of misses and false alarms to the product of
hits and correct rejections plus the product of misses and false alarms
* __extreme_dependency_score__ - The ratio of the difference between the logs of
the observed climatology and the hit rate to the sum of the logs of the
observed climatology and the hit rate
* __symmetric_eds__ - The symmetric extreme dependency score, which is the ratio
of the difference between the logs of the forecast climatology and the hit rate
to the sum of the logs of the forecast climatology and the hit rate
* __extreme_dependency_index__ - The ratio of the difference between the logs of
the false alarm rate and the hit rate to the sum of the logs of the false alarm
rate and the hit rate
* __symmetric_edi__ - The symmetric extreme dependency index, which is the ratio
of the sum of the difference between the logs of the false alarm rate and the
hit rate and the difference between the logs of the inverse hit rate and the
inverse false alarm rate to the sum of the logs of the hit rate, false alarm
rate, inverse hit rate and inverse false alarm rate. Here the inverse is 1 -
the value.
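
Many of these threshold scores are simple functions of the four contingency
table counts. An illustrative sketch in plain R with made-up counts (not
harpPoint code; `det_verify()` builds the contingency tables for you):

``` r
# Made-up counts for the event "threshold exceeded".
hits               <- 90
misses             <- 30
false_alarms       <- 40
correct_rejections <- 840

hit_rate         <- hits / (hits + misses)
false_alarm_rate <- false_alarms / (false_alarms + correct_rejections)
threat_score     <- hits / (hits + misses + false_alarms)
frequency_bias   <- (hits + false_alarms) / (hits + misses)
pierce_skill     <- hit_rate - false_alarm_rate
```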

## Ensemble scores

`ens_verify()` computes scores with the following column names:

### _Summary scores_

* __mean_bias__ - The mean difference between the ensemble mean of forecasts and
observations
* __rmse__ - The root mean squared error
* __stde__ - The standard deviation of the error
* __spread__ - The square root of the mean variance of the ensemble forecasts
* __hexbin__ - A heat map of paired hexagonal bins of forecasts and observations
* __rank_histogram__ - Counts of observations falling in bins defined by the
ranked ensemble members
* __crps__ - The continuous ranked probability score - the difference between
the cumulative distribution of the ensemble forecasts and the step function of
the observations
* __crps_potential__ - The crps that could be achieved with a perfectly reliable
ensemble
* __crps_reliability__ - Measures the ability of the ensemble to produce a
cumulative distribution with the desired statistical properties
* __fair_crps__ - The crps that would be achieved for either an ensemble with an
infinite number of members, or for a number of members provided to the function
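
As an illustrative sketch of the mean bias and spread calculations in plain R
with a made-up ensemble (not harpPoint code; `ens_verify()` computes these for
you):

``` r
set.seed(42)
n_cases   <- 500
n_members <- 10

# Made-up observations and an ensemble matrix with one row per case and one
# column per member, centred on the observations.
obs <- rnorm(n_cases, mean = 285, sd = 5)
ens <- matrix(rnorm(n_cases * n_members, mean = obs, sd = 2),
              nrow = n_cases, ncol = n_members)

ens_mean  <- rowMeans(ens)
mean_bias <- mean(ens_mean - obs)            # ensemble mean minus observations
spread    <- sqrt(mean(apply(ens, 1, var)))  # square root of mean ensemble variance
```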

### _Threshold scores_

* __brier_score__ - The mean of the squared error of the ensemble in probability
space
* __fair_brier_score__ - The Brier score that would be achieved for either an
ensemble with an infinite number of members, or for a number of members provided
to the function
* __brier_skill_score__ - The Brier score compared to that of a reference
probabilistic forecast (usually the observed climatology)
* __brier_score_reliability__ - A measure of the ensemble's ability to produce
reliable (forecast probability = observed frequency) forecasts
* __brier_score_resolution__ - A measure of the ensemble's ability to
discriminate between "on the day" uncertainty and climatological uncertainty
* __brier_score_uncertainty__ - The inherent uncertainty of the events
* __reliability__ - The frequency of observations for bins of forecast
probability
* __roc__ - The relative operating characteristic of the ensemble - the hit
rates and false alarm rates for forecast probability bins
* __roc_area__ - The area under the ROC curve - summarises the ability of the
ensemble to discriminate between events and non-events
* __economic_value__ - The relative improvement in economic value of the
forecast compared to climatology for a range of cost/loss ratios
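
The threshold scores end up in the `ens_threshold_scores` data frame of the
returned `harp_verif` list, with one row per verification group and threshold.
A hedged sketch of inspecting a few of them (the `lead_time` grouping column
and the element name follow recent harpPoint behaviour and are assumptions
here):

``` r
library(dplyr)

# ens_scores is the output of ens_verify() from the earlier sketch.
ens_scores$ens_threshold_scores %>%
  select(lead_time, threshold, brier_score, brier_skill_score, roc_area) %>%
  arrange(threshold, lead_time)
```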

## Getting gridded data to points

For interpolation of gridded data to points see the
[Interpolate section](https://harphub.github.io/harpIO/articles/transformations.html#interpolate)
of the [Transforming model data](https://harphub.github.io/harpIO/articles/transformations.html)
article on the [harpIO website](https://harphub.github.io/harpIO), or the
documentation for [geo_points](https://harphub.github.io/harpCore/reference/geo_points.html).

README.md

+167
