---
knit: bookdown::preview_chapter
---
#### Weather in Ames (API Version)
In many cases, you are not the first person attempting to use an online dataset for some sort of analysis. Some sites maintain an API (Application Programming Interface), which allows data access via simplified commands. When an API is available (and free), it is generally better to use it than to write a custom scraping function, because APIs are typically more stable over time than web page structure.
In some cases, there are even simplified interfaces to the site's API: for weather data, the `rnoaa` package provides R bindings to several APIs maintained by the National Oceanic and Atmospheric Administration.
```{r loadnoaa}
# Install rnoaa only if it is not already available
if (!requireNamespace("rnoaa", quietly = TRUE)) {
  install.packages("rnoaa")
}
library(rnoaa)
```
First, we use the `meteo_nearby_stations` function to identify weather stations near Ames, IA:
```{r getnoaa}
lat_lon_df <- data.frame(id = "ames",
                         latitude = 42.034722,
                         longitude = -93.62)
# Get up to three weather stations within 10 km of Ames
# Not all stations have all variables, so make sure TMAX is included
nearby_stations <- meteo_nearby_stations(
  lat_lon_df = lat_lon_df, radius = 10,
  limit = 3, var = "TMAX")
```
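The result is a named list with one data frame per row of `lat_lon_df`, which is why the station IDs are reached through `nearby_stations$ames$id`. The sketch below mimics that shape with a hand-built stand-in so that it runs offline. The station IDs, names, coordinates, and distances are made up for illustration, and the column set is an assumption based on the `rnoaa` documentation.

```{r toystations}
# Hypothetical stand-in for the list returned by meteo_nearby_stations();
# every value below is invented for illustration only.
toy_stations <- list(
  ames = data.frame(
    id        = c("STATION_A", "STATION_B"),
    name      = c("Example station A", "Example station B"),
    latitude  = c(42.03, 42.05),
    longitude = c(-93.62, -93.60),
    distance  = c(0.4, 2.7),  # km from the query point
    stringsAsFactors = FALSE
  )
)

names(toy_stations)   # one list element per id in lat_lon_df
toy_stations$ames$id  # character vector of station IDs
```

Passing the whole `id` vector to a later call requests data for every station in the list, not only the closest one.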
The `meteo_pull_monitors` function pulls data for the specified variables and date range, for each station ID provided.
```{r librarynoaa, eval = T, include = F, echo = F}
library(dplyr)
```
```{r wranglenoaa, message = F, eval = -1}
library(dplyr)
# According to https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt,
# temperatures are reported in tenths of a degree Celsius
max_temperature <- meteo_pull_monitors(
  nearby_stations$ames$id,
  var = "TMAX",
  date_min = "2008-01-01",
  date_max = format(Sys.Date(), "%Y-%m-%d")) %>%
  mutate(tmax = tmax / 10) # Convert tenths of a degree to degrees Celsius
```
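To make the unit conversion concrete without calling the API, here is a minimal sketch on a made-up data frame. The station ID, dates, and temperatures are hypothetical; only the column names follow the data used above.

```{r toyconvert, message = FALSE}
library(dplyr)

# Hypothetical raw values in tenths of a degree Celsius
toy_raw <- data.frame(
  id   = "STATION_A",
  date = as.Date("2020-01-01") + 0:2,
  tmax = c(250, -15, 0)
)

# Dividing by 10 gives degrees Celsius, including fractional values
toy_celsius <- toy_raw %>%
  mutate(tmax = tmax / 10)

toy_celsius$tmax
```

A raw value of 250 becomes 25 °C and a raw value of -15 becomes -1.5 °C, so the converted column is not restricted to whole degrees.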
Formatting this data for plotting is relatively straightforward:
```{r plotnoaa, message = F}
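# Pair each day with the previous day's maximum, separately for each station
# (assumes one row per station per day, with no gaps in the dates)
max_temperature <- max_temperature %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(yday_tmax = lag(tmax, 1)) %>%
  ungroup()
library(ggplot2)
ggplot(data = max_temperature, aes(x = yday_tmax, y = tmax)) +
  geom_jitter(alpha = 0.5) +
  geom_smooth() +
  xlab("Yesterday's maximum temperature (°C)") +
  ylab("Today's maximum temperature (°C)") +
  coord_equal()
# geom_hex() requires the hexbin package to be installed
ggplot(data = max_temperature, aes(x = yday_tmax, y = tmax)) +
  geom_hex() +
  xlab("Yesterday's maximum temperature (°C)") +
  ylab("Today's maximum temperature (°C)") +
  coord_equal()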
```
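`lag()` shifts a column down by one row, so pairing each day with the previous day is only correct when the rows are sorted by date within a single station. As a sketch on made-up data (the station IDs and temperatures are hypothetical), grouping by `id` before lagging keeps values from one station from leaking into the next:

```{r toylag, message = FALSE}
library(dplyr)

# Hypothetical daily maxima for two stations over three days
toy_temps <- data.frame(
  id   = rep(c("STATION_A", "STATION_B"), each = 3),
  date = rep(as.Date("2020-01-01") + 0:2, times = 2),
  tmax = c(1.0, 2.5, 0.5, 3.0, 4.0, 2.0),
  stringsAsFactors = FALSE
)

toy_lagged <- toy_temps %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(yday_tmax = lag(tmax, 1)) %>%  # previous row within the same station
  ungroup()

toy_lagged$yday_tmax
```

The first day of each station has no previous day, so its `yday_tmax` is `NA`. An ungrouped `lag()` would instead fill the first row of the second station with the last value of the first station.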
Using the API makes for much simpler code!
#### Your turn
+ Use the API to get minimum and average daily temperatures in addition to maximum temperatures.
+ Reformat the weather data set in such a way that you can easily compare maximum and minimum temperatures in a scatterplot.
+ How do temperatures behave over the course of a year? Organize the data correspondingly and plot.