National Oceanic and Atmospheric Administration (NOAA) maintains one of the largest climate data archives, the National Climatic Data Center (NCDC). NCDC's Climate Data Online (CDO) offers web services that provide access to weather and climate data, which include (but is not limited to) minimum and maximum temperature in different zip codes by date, amount of precipitation, longitude and latitude of stations, and more.
It is the only known R package that enables the user to search historical weather data by zip code.
Install from CRAN:
install.packages("noaa")
Install from rOpenGov:
options(repos = c(
ropengov = "https://ropengov.r-universe.dev",
CRAN = "https://cloud.r-project.org"))
install.packages("noaa")
Accessing CDO data requires a token.
Get a token by going to www.ncdc.noaa.gov/cdo-web/token and entering your email address.
The get_climate_data()
function allows you to retrieve climate and weather data from the NOAA Climate Data Online (CDO) API for a wide range of dataset types. It supports pagination and can retrieve large numbers of records by iteratively querying the API.
get_climate_data(noaa_token, datasetid, stationid = NULL, locationid = NULL, startdate, enddate, n_results = Inf)
Argument | Type | Description |
---|---|---|
noaa_token |
string |
Your NOAA API token, available by registering at NOAA CDO. |
datasetid |
string |
The dataset identifier. Must be one of the valid NOAA dataset IDs (see below). |
stationid |
string or NULL |
The station ID (e.g., "USW00094728" ). Required for station-based datasets. |
locationid |
string or NULL |
The location ID (e.g., "FIPS:48" for Texas). Required for location-based datasets. |
startdate |
string |
Start date for the query in "YYYY-MM-DD" format. |
enddate |
string |
End date for the query in "YYYY-MM-DD" format. |
n_results |
numeric |
Maximum number of results to return. Use Inf (default) to fetch all available records. |
The function currently supports the following datasets:
GHCND
– Daily summariesGSOM
– Monthly summariesGSOY
– Yearly summariesNEXRAD2
,NEXRAD3
– Radar dataNORMAL_DLY
,NORMAL_MLY
,ANN
,ANNUAL
– Climate normalsPRECIP_15
,PRECIP_HLY
– Precipitation dataISD
– Integrated Surface DataCLIMDIV
– Climate division dataCPC
– Climate Prediction Center dataLCD
– Local Climatological DataAGRMET
– Agricultural Meteorological dataSTORM_EVENTS
– Storm event data
# Example: Get daily precipitation for Central Park, NY in January 2020
df <- get_climate_data(
noaa_token = "YOUR_API_KEY",
datasetid = "GHCND",
stationid = "USW00094728",
startdate = "2020-01-01",
enddate = "2020-01-31",
n_results = 1000
)
The get_locationid()
function retrieves location identifiers from the NOAA Climate Data Online (CDO) API based on a specified category. It supports pagination to return large sets of location data.
get_locationid(noaa_token, category_id, n_results = Inf)
Argument | Type | Description |
---|---|---|
noaa_token |
string |
NOAA API token used for authentication. |
category_id |
string |
The location category identifier. Must be one of the valid location categories (see below). |
n_results |
numeric |
Maximum number of results to retrieve. Defaults to Inf to fetch all available records. |
The function supports the following location categories:
ST
– U.S. States and territories (e.g.,"ST:TX"
for Texas)CITY
– Major U.S. cities with weather stationsCOUNTY
– U.S. counties (e.g.,"COUNTY:US36061"
for New York County, NY)ZIP
– ZIP Code areasCLIM_REG
– NOAA-defined climate regions (e.g., Southeast, Midwest)HYDROL_REG
– Hydrologic regions used for water resource planningFIPS
– Federal Information Processing Standards codes (e.g.,"FIPS:37"
for North Carolina)
# Example: Retrieve a list of U.S. states
df <- get_locationid(
noaa_token = "YOUR_API_KEY",
category_id = "ST",
n_results = 100
)
A data frame containing location metadata from the specified category. If no results are returned or the request fails, an error is thrown or an empty data frame is returned.
The get_stationid()
function retrieves weather station metadata from the NOAA Climate Data Online (CDO) API for a given dataset and time range. It supports pagination and can return a large number of station records.
get_stationid(noaa_token, datasetid, locationid = NULL, startdate, enddate, n_results = Inf)
Argument | Type | Description |
---|---|---|
noaa_token |
string |
NOAA API token used for authentication. |
datasetid |
string |
The dataset identifier. Must be one of the valid dataset IDs (see list used by valid_ids() ). |
locationid |
string or NULL |
Optional location identifier to filter stations geographically (e.g., "FIPS:48" for Texas). |
startdate |
string |
Start date in "YYYY-MM-DD" format. |
enddate |
string |
End date in "YYYY-MM-DD" format. |
n_results |
numeric |
Maximum number of station records to retrieve. Defaults to Inf to fetch all. |
# Example: Get stations in Texas for the GHCND dataset during 2020
df <- get_stationid(
noaa_token = "YOUR_API_KEY",
datasetid = "GHCND",
locationid = "FIPS:48",
startdate = "2020-01-01",
enddate = "2020-12-31",
n_results = 1000
)
A data frame containing weather station metadata, such as station IDs, names, geographic coordinates, and available coverage. If no results are returned or the request fails, an error is thrown or an empty data frame is returned.
A detailed list of NOAA acronyms can be found here: https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt