Skip to content

NOAA-Big-Data-Program/EpiNOAA-R

Repository files navigation

EpiNOAA

The EpiNOAA R package is designed as a programmatic interface to the nClimGrid daily data created by NOAA and made publicly available through the NOAA Open Data Dissemination Program on public cloud providers. The goal of this project is to allow data subsetting and facilitate access to an analysis ready dataset using the tidyverse framework for tabular data.

If you would like to know more about nClimGrid data and the background behind this dataset, please visit: AWS Open Data Registry

[toc]

Installation

# Install from CRAN (coming soon...)
install.packages("epinoaa")

# Or the development version from GitHub
# install.packages("devtools")
devtools::install_github("https://github.com/NOAA-Big-Data-Program/EpiNOAA-R")

Configuration

This relies on a number of underlying packages and data to facilitate access to nClimGrid data.

Arrow Arrow is used to facilitate fast reading of data files in parquet format. Installing Apache Arrow for R can sometimes be problematic if the underlying C++ libraries are not installed/configured appropriately. If you have difficulty, please consult the install docs here: Arrow Package Install

Example

This is a basic example which pulls in county data from 2021:

library(EpiNOAA)

# Make sure that you have enough memory available to support the calls - the entire dataset at the census tract level is several tens of gigabytes depending on how many columns you select

# Read in time series of state data for NC and SC
read_nclimgrid_epinoaa(
  spatial_res = 'ste',
  states = c('NC', 'SC'))

Data Considerations

The data pulled using this package are produced by NOAA and made available through the NOAA Open Data Dissemination Program. This specific dataset is available through the AWS Open Data Registry. All of the raw data can be accessed here. If you are interested in working with the data out-of-memory, raw csv and parquet data files can be accessed here.

Scaled and Preliminary Data As these data are made available through NOAA, they are first released as preliminary datasets, then adjusted through a quality control process. After that additional processing and scaling, they are released as scaled datasets. This package combines scaled and preliminary data and provides flags to allow for filtering if needed. Make sure you are aware of the differences for your use-case.

About

R Package to access NOAA nClimGrid Data

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published