Add vignette

Use the vignette to provide a detailed walkthrough of the package and as a template for producing the publication. Save a copy of the vignette used for each publication to document and comment on the analysis for each release.
DCMSstats · Oct 10, 2017 · 35da8af
1 parent 7fcd351
commit 35da8af

Showing 3 changed files with 199 additions and 9 deletions.
diff --git a/.gitignore b/.gitignore
@@ -6,29 +6,22 @@
 .Ruserdata
 *.Rapp.history
 *.nb.html
-
 # Temporary system files
 .DS_Store
 ~$*.xls*
 *.tmp
-
 # Compiled packrat libraries
-
 packrat/lib*/
-
 # Environmental variables
-
 GITHUB_PAT
 .Renviron
-
 # Ignore files created by tests
-
 tests/testthat/figures/test_figure*_test.png
 tests/testthat/testdata/mtcars.Rds
 tests/testthat/testdata/test_output*
 ._*
 .~*
 *.swp
-
 inst/GVA_by_sector.Rds
 eesectors.Rcheck
+inst/doc
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -19,7 +19,8 @@ Suggests:
     knitr (>= 1.15.1),
     testthat (>= 1.0.2),
     purrr (>= 0.2.2),
-    visualTest (>= 1.0.0)
+    visualTest (>= 1.0.0),
+    rmarkdown
 Imports:
     assertr (>= 1.0.2),
     dplyr (>= 0.5.0),

diff --git a/vignettes/publication_template.Rmd b/vignettes/publication_template.Rmd
@@ -0,0 +1,196 @@
+---
+title: "Publication Template"
+author: "Max Unsted"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Publication Template}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+
+NEED TO RUN devtools::load_all() IN CONSOLE BEFORE RUNNING THIS
+
+
+
+## Raw Data
+
+#### Extract Functions
+```{r, include=FALSE}
+#read in raw data
+rm(list = ls()[!(ls() %in% c("keep"))])
+#input <- "C:/Users/max.unsted/Documents/working_file_dcms_V13.xlsm"
+
+#testing example data
+input <- example_working_file("example_working_file.xlsx")
+
+ABS <- eesectors::extract_ABS_data(input)
+GVA <- eesectors::extract_GVA_data(input)
+SIC91 <- eesectors::extract_SIC91_data(input)
+tourism <- eesectors::extract_tourism_data(input)
+```
+
+
+#### ABS Data
+Annual Business Survey data.
+SIC is calculated from DOMVAL; contains a mixture of SIC and SIC group.
+
+```{r}
+library(knitr)
+#kable(head(ABS,3))
+head(ABS,3)
+```
+
+
+#### GVA Data
+Main dataset containing GVA.
+Contains a mixture of SIC and SIC group codes (for example 10.1 etc. but no 10), and
+some non-numeric values; see the "CP £ Millions" raw data sheet.
+
+```{r}
+head(GVA,3)
+```
+
+
+
+#### SIC91 Data
+SIC 91 sales data, used in place of the usual GVA values.
+Only contains 91-type SIC codes (91.01 etc.).
+Contains a mixture of SIC 91.xx and SIC group 91.
+
+```{r}
+#kable(head(SIC91,3))
+head(SIC91,3)
+
+```
+
+
+#### Tourism Data
+Tourism data: no sector column, just year.
+What are total, overlap, and perc?
+
+Read in SIC 91 sales data from the spreadsheet. This data is used in place of the usual GVA values.
+Keeps SIC, year and value = ABS.
+Only contains 91-type SIC codes, 91.01 etc. (but there is no SIC code here?)
+```{r}
+head(tourism,3)
+```
+
+
+#### eesectors::DCMS_sectors data
+This is a dictionary of all the SIC codes with description, sector, present, and SIC2.
+
+
+`present` flags whether there is a star in the column for that sector in the 'working file',
+i.e. it determines which SICs belong to which sector. Could we drop all FALSE rows
+and get rid of this column?
+`SIC2` is the integer part of `SIC`.
+SIC and SIC group are in separate columns (SIC 92 is the same in both).
+
+```{r}
+#identical(floor(as.numeric(DCMS_sectors$SIC)),round(as.numeric(DCMS_sectors$SIC2)))
+head(eesectors::DCMS_sectors,3)
+```
+
+
+## Summarising Data
+
+Combine the GVA datasets in long format and instantiate the `year_sector_data` class.
+
+```{r}
+combined_GVA <- combine_GVA(
+  ABS = ABS,
+  GVA = GVA,
+  SIC91 = SIC91,
+  DCMS_sectors = eesectors::DCMS_sectors,
+  tourism = tourism
+) %>%
+  # instantiate the year_sector_data class from the combined data
+  year_sector_data()
+
+head(combined_GVA$df,3)
+```
+
+
+## Produce Output for Publication
+
+
+Convert the data to the wide format used in the publication.
+```{r}
+year_period <- paste0(min(combined_GVA$years), " - ", max(combined_GVA$years))
+year_sector_table(combined_GVA) #how to add title? do in rmd?
+```
+
+
+Produce Figure 3.1.
+```{r, echo=FALSE, results='asis'}
+figure3.1(combined_GVA) +
+  ggplot2::ggtitle(paste0('Figure 3.1: GVA contribution by DCMS sectors: ', year_period))
+```
+
+
+Produce Figure 3.2.
+```{r}
+figure3.2(combined_GVA) +
+  ggplot2::ggtitle(paste0('Figure 3.2: Indexed growth in GVA (2010 = 100)\n in DCMS sectors and UK: ', year_period))
+```
+
+
+Produce Figure 3.3.
+```{r}
+figure3.3(combined_GVA) +
+  ggplot2::ggtitle(paste0('Figure 3.3: Indexed growth in GVA (2010 = 100) in each DCMS sector: ', year_period))
+```
+
+
+Produce Figure 3.4.
+```{r}
+figure3.4(combined_GVA) +
+  ggplot2::ggtitle(paste0('Figure 3.4: Indexed growth in GVA (2010 = 100) in each DCMS sector: ', year_period))
+```
+
+#### Run bookdown or markdown knitr code to produce PDF
+
+
+#### Produce Excel tables
+Tables in the PDF and Excel outputs should be the same: both in £ million, not £ billion.
+```{r}
+year_period <- paste0(min(combined_GVA$years), " - ", max(combined_GVA$years))
+year_sector_table(combined_GVA) #how to add title? do in rmd?
+
+# create GVA tables
+excel_workbook(
+  contents_page(),
+  year_sector_table(combined_GVA),
+  year_sector_table(index(combined_GVA))
+)
+
+# create GVA subsectors tables (creative_GVA and digital_GVA are not defined in this vignette yet)
+excel_workbook(
+  contents_page(),
+  year_sector_table(creative_GVA),
+  year_sector_table(digital_GVA)
+)
+
+
+```
+
+
+
+
+
+## Run Checks
+
+```{r}
+
+```
+
+
+
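
A note on the `SIC2` remark and the commented-out `identical()` check in the `DCMS_sectors` chunk: the relationship can be sketched on toy data without loading the package. This is only an illustration in base R. The data frame `toy_sectors` and the codes in it are hypothetical stand-ins, not the contents of `eesectors::DCMS_sectors`, and the sketch assumes every `SIC` value is a numeric string.

```r
# Toy stand-in for eesectors::DCMS_sectors; the codes are illustrative only.
toy_sectors <- data.frame(
  SIC = c("10.1", "58.21", "91.01", "92"),
  stringsAsFactors = FALSE
)

# Derive the SIC group as the integer part of the SIC code.
toy_sectors$SIC2 <- floor(as.numeric(toy_sectors$SIC))

# Same shape of check as the commented-out line in the vignette,
# applied to the toy data instead of the packaged dataset.
sic2_matches <- identical(
  floor(as.numeric(toy_sectors$SIC)),
  as.numeric(toy_sectors$SIC2)
)

print(toy_sectors)
print(sic2_matches)
```

If the real `SIC` column contains non-numeric values, as the GVA notes suggest the raw sheets do, `as.numeric()` would return `NA` with a warning, so a check like this would need those rows handled first.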