diff --git a/analysis/preprocessing/full_code.Rmd b/analysis/preprocessing/full_code.Rmd
index dce70d500bbfded47921b1ab24f86aac3ca21633..c4d7e846416b95279ba56d924919a16f017dd981 100644
--- a/analysis/preprocessing/full_code.Rmd
+++ b/analysis/preprocessing/full_code.Rmd
@@ -7,18 +7,20 @@ output:
   bookdown::html_document2:
     toc: yes
     toc_float: yes
-params:
-  analyze: TRUE
-  visualize: TRUE
 ---
 
 # Setup
 
-This file contains all code to estimate the European income-stratified footprints, the European consumption deciles, and generate the figures in both the main manuscript and the supplementary information. It is located in the 'code' sub-folder, pulling files from the 'data' sub-folder and outputting figures and tables into the 'figures' sub-folder. These three sub-folders are self-contained. Keeping 'params:analyze:' and 'params:visualize:' both set to 'TRUE' in the Rmarkdown yaml header above runs the entire code, except for those code chunks set to 'eval = FALSE' which all required a high performance cluster computer. In order to generate only the figures from the main manuscript, set 'params: analyze:' to 'FALSE' in the yaml header above. The data frame from which all main manuscript figures are produced ('mrio_results_eu_ntile_mapped_n_10') is saved in the 'data' sub-folder as both a .csv and .rds file. It can be read in for generating those figures without running the whole code. Both 'params:analyze:' and params:visualize:' need to be set to TRUE in order to generate the figures in the SI, as they require a few more input files than the main 'mrio_results_eu_ntile_mapped_n_10' data frame.
+This file contains all code to estimate the European income-stratified footprints and then the European expenditures deciles. There are three code chunks in this 'full code' Rmarkdown file. The first shows the EXIOBASE code, the second the income-stratified-footprints using the EUROSTAT HBS, and the third the estimation of the European expenditure deciles.
All three code chunks are currently set to 'eval = FALSE', and the necessary raw data files to run them are not included in the git due to their size. We have also run the first two code chunks on a high-performance cluster computer.
 
-We first load required R packages.
+The code in this 'full code' Rmarkdown file writes derived data files to the folder: 'analysis' > 'data' > 'derived'. These files can be accessed there, and own analysis performed on them without running any of the code in this Rmarkdown. The derived data files are used to create the figures in the main paper and SI (the code for the figures can be found directly within the paper and si Rmarkdown files: 'analysis' > 'paper').
+
+To run the code in this 'full code' Rmarkdown file, follow the instructions at the start of each section (before the code chunk) explaining which files must be downloaded from where, and in which folder they should be extracted. Then remove 'eval = FALSE' in the code chunk header before running it.
 
 ```{r setup, echo = FALSE, include = FALSE, message = FALSE}
+
+# first load required R packages
+
 knitr::opts_chunk$set(
   collapse = TRUE,
   warning = FALSE,
@@ -45,24 +47,11 @@ pacman::p_load(tidyverse,
 
 # EXIOBASE
 
-EE-MRIO: EXIOBASE version3 ixi, from: https://zenodo.org/record/3583071#.XjC7kSN4wpY [accessed on 12.03.2020] @stadler_exiobase_2018
-
-We have set an empty 'EXIOBASE' folder in the 'preprocessing'. To run the first code chunk, calculating total intensities in EXIOBASE, first access the website above. Download the input-output tables (IOT) ixi version .zip files for the relevant study years and extract them to the EXIOBASE folder. This creates a folder for each year with all of the relevant files. The first code chunk below will run only if the .zip files of all study years have been extracted to the EXIOBASE folder. These are large folders, we have used a high-performance cluster computer to run the first and second code chunks in this file.
-
-
-We use standard input-output calculations to calculate total intensity vectors in EXIOBASE. EXIOBASE publishes the A matrix, the final demand matrix, the satellite extensions matrix, and satellite extensions direct from final demand matrix. We use the industry by industry (ixi) EXIOBASE data tables from EXIOBASE version3. This means 163 industry production sectors and 6 final demand categories for 49 regions worldwide (44 countries and 5 rest-of-world regions), from 1995 - 2016. All monetary units are in million current Euros. Stadler et al. (2018) @stadler_exiobase_2018 describe the EXIOBASE version3 compilation procedure in detail, including nine supporting information documents with further detailed information on the compilation of the monetary tables (S1), energy (S2), emissions (S3), and others.
-
-For each year, we first load the A matrix and calculate the Leontief inverse (the inverse of the A matrix). We load the final demand matrix and calculate total output ($x$) by pre-multiplying the Leontief inverse ($L$) by the row sums of the final demand matrix ($Y$):
-
-We then load the satellite extensions matrix and extract the relevant extensions. To calculate direct intensity vectors ($DIV$) we divide the satellite extension vectors ($f$) by total output ($x$):
-
-The total intensity vectors ($TIV$) are calculated by pre-multiplying the direct intensity vectors ($DIV$) by the Leontief inverse ($L$):
-
-The footprint is then calculated by row-wise multiplying the TIV by final demand:
+The EXIOBASE files are publicly available online. For the results in the main paper we use EXIOBASE version3 industry-by-industry, which is available from: https://zenodo.org/record/3583071#.XjC7kSN4wpY [accessed on 12.03.2020]. In the SI, we also show some results from using EXIOBASE version3 product-by-product, which is available from: [accessed on 12.03.2020].
 
-Before that final step, however, we decompose national household final demand by income quintile according to the structure of the household budget survey, explained in the 'EUROSTAT HBS' section below.
+These files are large, global input-output tables, and we performed standard input-output calculations on them to calculate and save total intensity vectors using a high-performance cluster computer. In this document we show the code that was run on the cluster computer, but have not uploaded any EXIOBASE files to the git. Running the first code chunk would require downloading the industry-by-industry version for the years 2005, 2010 and 2015, and the product-by-product version for the years 2005 and 2010.
 
-The results in the main paper also present the footprint broken down by its domestic, other European, and non-European parts. To calculate these domestic and foreign parts of the footprint, we row-wise multiply the direct intensity vectors ($DIV$) by the Leontief inverse ($L$):
+Each year is available as a .zip file ('IOT_year_ixi' or 'IOT_year_pxp') from the websites above. If the .zip files for the relevant study years and versions have been downloaded, they can be extracted into the 'EXIOBASE' folder in this git, which is found in the 'analysis' > 'preprocessing' folder. We have set 'EXIOBASE' empty for this purpose. Extracting each year into the EXIOBASE folder creates a folder for each year with all of the relevant files for that year and version. All code in this first code chunk leaves all file names as they are after extraction.
 
 ```{r exiobase, eval = FALSE}