---
title: "SI"
output:
  html_document:
    toc: yes
    toc_depth: 4
    df_print: paged
  bookdown::html_document2:
    toc: yes
    toc_float: yes
  pdf_document:
    toc: yes
    toc_depth: '4'
  word_document:
    toc: yes
    toc_depth: '4'
---

# To-Do

- make sure final results are based off the most recent cluster run
- clean code (& explanations)

```{r setup, echo = FALSE, include = FALSE, message = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  message = FALSE,
  echo = FALSE,
  comment = "#>",
  fig.path = "../figures/",
  dpi = 300
)

if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse,
               janitor,
               here,
               wbstats,
               ISOcodes,
               viridis,
               hrbrthemes,
               wesanderson,
               glue,
               ggridges,
               patchwork,
               kableExtra)

pal <- wes_palette("Cavalcanti1", 5, type = "discrete")
extrafont::loadfonts()

# set knitr.table.format to "latex" when knitting to PDF and to "html" when knitting to HTML
options(knitr.table.format = "html", digits = 2)

```

```{r }

library(here)
```

# Supplementary Materials and Methods

In this paper our first aim was to decompose an Environmentally-Extended Multi-Regional Input-Output (EE-MRIO) model's household final demand expenditure by income quantile, and then multiply this income-stratified expenditure by 'total' intensities calculated in the EE-MRIO to estimate national income-stratified footprints. An EE-MRIO is used for estimating national footprints from final demand expenditure. Final demand expenditure on different production sectors, in monetary units, is multiplied by 'total' intensities per production sector to calculate the footprint (for one year). The 'total' intensities per production sector estimate the total physical amount of environmental extension, whether emissions, energy, etc., anywhere along the supply chain per monetary unit of final demand expenditure. In this way, the total amount of environmental pressure in that year is allocated to each country based on their final demand expenditure (this 'footprint' approach is also called consumption-based accounting as opposed to territorial-based accounting). The 'total' intensities per production sector are calculated using standard EE-MRIO calculations. 

The final demand expenditure, and thus the national footprint, is typically disaggregated into several final demand categories, including households, non-profits serving households, government, gross fixed capital formation, and changes in inventories and valuables. To the authors' knowledge, publicly available EE-MRIOs do not decompose their final demand expenditure by income quantile. The underlying distribution of final demand expenditure and footprint across income groups is therefore hidden. To decompose the EE-MRIO final demand expenditure and footprint by income quantile, a second data input with information on the income/expenditure distribution is needed. This could be a household budget survey (HBS) decomposed by income quantile or, at a more aggregate level, the national income distribution. In this paper we use:

1) EE-MRIO: EXIOBASE version3 ixi, from: https://zenodo.org/record/3583071#.XjC7kSN4wpY [accessed on 12.03.2020] @stadler_exiobase_2018

2) HBS: European household budget survey from EUROSTAT, macro-data, from: https://ec.europa.eu/eurostat/web/household-budget-surveys/database [accessed on 22.05.2020]

We discuss each of these in turn, including additional data inputs needed to complement the HBS. We then present our methodology for our results in the main paper, and data gaps and limitations. The final 'supplementary materials and methods' section presents an alternative methodology, and the final section of this supplementary information document presents supplementary results from the methodology used in the main paper and the alternative methodology. 

## EXIOBASE

We use standard input-output calculations to calculate total intensity vectors in EXIOBASE. EXIOBASE publishes the A matrix, the final demand matrix, the satellite extensions matrix, and the matrix of satellite extensions direct from final demand. We use the industry-by-industry (ixi) data tables from EXIOBASE version3. These cover 163 industry production sectors and 6 final demand categories for 49 regions worldwide (44 countries and 5 rest-of-world regions), for the years 1995 to 2016. All monetary units are in million current Euros. Stadler et al. (2018) @stadler_exiobase_2018 describe the EXIOBASE version3 compilation procedure in detail, including nine supporting information documents with further information on the compilation of the monetary tables (S1), energy (S2), emissions (S3), and others.

For each year, we first load the A matrix and calculate the Leontief inverse, $L = (I - A)^{-1}$, where $I$ is the identity matrix. We then load the final demand matrix ($Y$) and calculate total output ($x$) by multiplying the Leontief inverse ($L$) by the row sums of the final demand matrix, written here as $Y \mathbf{1}$ with $\mathbf{1}$ a column vector of ones:

$$
x = L \cdot (Y \mathbf{1})
$$

We then load the satellite extensions matrix and extract the relevant extensions. To calculate direct intensity vectors ($DIV$) we divide the satellite extension vectors ($f$) by total output ($x$): 

$$
DIV = f/x
$$

The total intensity vectors ($TIV$) are calculated by multiplying the direct intensity vectors ($DIV$), written as row vectors, by the Leontief inverse ($L$):

$$
TIV = DIV \cdot L
$$

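The footprint is then calculated by row-wise multiplying final demand by the TIV, so that each row of $Y$ (one production sector) is scaled by that sector's total intensity:

$$
fp = \mathrm{diag}(TIV) \cdot Y
$$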

Before that final step, however, we decompose national household final demand by income quintile according to the structure of the household budget survey, explained in the 'EUROSTAT HBS' section below. 
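The chain of calculations above can be sketched on a toy system. This is a minimal illustration in base R, assuming a made-up three-sector economy with two final demand categories; the numbers and the object names (`A`, `Y`, `f`) are hypothetical and are not taken from EXIOBASE.

```r
# Toy system: 3 production sectors, 2 final demand categories (made-up numbers)
A <- matrix(c(0.10, 0.20, 0.00,
              0.05, 0.10, 0.30,
              0.20, 0.00, 0.10),
            nrow = 3, byrow = TRUE)          # technical coefficients
Y <- matrix(c(50, 10,
              30, 20,
              20, 40),
            nrow = 3, byrow = TRUE)          # final demand, sectors x categories
f <- c(500, 120, 60)                         # hypothetical extension per sector

L   <- solve(diag(3) - A)                    # Leontief inverse (I - A)^-1
x   <- as.vector(L %*% rowSums(Y))           # total output
DIV <- f / x                                 # direct intensities
TIV <- as.vector(DIV %*% L)                  # total intensities
fp  <- TIV * Y                               # row i of Y scaled by TIV[i]

# The footprints of all final demand categories add up to the total extension
stopifnot(isTRUE(all.equal(sum(fp), sum(f))))
```

The final check reflects the allocation property of the footprint approach: the whole extension is distributed over final demand, with nothing lost or double counted.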

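The results in the main paper also present the footprint broken down by its domestic, other European, and non-European parts. To calculate these domestic and foreign parts of the footprint, we row-wise multiply the Leontief inverse ($L$) by the direct intensity vectors ($DIV$), so that each row of $L$ (one source sector) is scaled by that sector's direct intensity and the contributions of the source sectors remain separate:

$$
TIV_{breakdown} = \mathrm{diag}(DIV) \cdot L
$$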
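As a hedged illustration of this breakdown, the base R sketch below uses a made-up four-sector system in which each sector is assigned to one region of origin. The coefficients, intensities and the `origin` labels are assumptions for the example only.

```r
# Toy system: 4 sectors, each located in one region (made-up numbers)
A <- matrix(c(0.10, 0.05, 0.00, 0.02,
              0.05, 0.10, 0.10, 0.00,
              0.02, 0.00, 0.15, 0.05,
              0.00, 0.10, 0.05, 0.10),
            nrow = 4, byrow = TRUE)
DIV    <- c(2.0, 0.5, 1.0, 3.0)              # hypothetical direct intensities
origin <- c("domestic", "domestic", "other European", "non-European")

L <- solve(diag(4) - A)                      # Leontief inverse
TIV_breakdown <- DIV * L                     # row i of L scaled by DIV[i]

# Sum the source sectors by region of origin: regions x purchasing sectors
by_origin <- rowsum(TIV_breakdown, group = origin)

# Column sums recover the total intensity of each sector
TIV <- as.vector(DIV %*% L)
stopifnot(isTRUE(all.equal(unname(colSums(by_origin)), TIV)))
```

Each column of `by_origin` splits one sector's total intensity into the parts arising domestically, elsewhere in Europe, and outside Europe.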

### Satellite extensions

The satellite extensions we use are emissions in CO2-equivalents (in kilograms) and gross total energy use (in terajoules). We create the CO2-equivalent extension by summing CO2, CH4, N2O, SF6, HFCs, and PFCs from combustion, non-combustion, agriculture and waste. We use Global Warming Potential (GWP) values for a 100-year time horizon taken from the IPCC Fifth Assessment Report (AR5, 2014; values as listed at https://www.ghgprotocol.org/sites/default/files/ghgp/Global-Warming-Potential-Values%20%28Feb%2016%202016%29_1.pdf): 28 for CH4, 265 for N2O and 23500 for SF6 (HFCs and PFCs are already expressed in CO2-equivalents in the EXIOBASE satellite extensions).
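The aggregation to CO2-equivalents amounts to a weighted sum. The sketch below uses the GWP values stated above with made-up emissions for two hypothetical sectors; object names such as `emissions_kg` and `hfc_pfc_co2e` are illustrative.

```r
# GWP100 values from AR5, as stated in the text
gwp100 <- c(CO2 = 1, CH4 = 28, N2O = 265, SF6 = 23500)

# Hypothetical emissions in kg per sector (columns in the order of gwp100)
emissions_kg <- rbind(sector_a = c(1000, 10, 1, 0.01),
                      sector_b = c( 500,  0, 2, 0))
colnames(emissions_kg) <- names(gwp100)

# HFCs and PFCs are already in kg CO2-equivalent and are added unweighted
hfc_pfc_co2e <- c(sector_a = 20, sector_b = 0)

# Weighted sum per sector, in kg CO2-equivalent
co2e_kg <- as.vector(emissions_kg %*% gwp100) + hfc_pfc_co2e
```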

For gross total energy use, we use the EXIOBASE extension 'Energy Carrier Use: Total' (in TJ).

In their Supporting Information 2, Stadler et al. (2018) @stadler_exiobase_2018 describe the compilation of the energy extensions in EXIOBASE version3. Energy supply and use tables from the International Energy Agency (IEA) are converted from the territory to the residence principle, before being allocated to the EXIOBASE industries and final use categories. This affects four transportation types: international air transport (deliveries from international aviation bunkers), international maritime transport (deliveries from international marine bunkers), fishing, and international road transport. The conversion to the residence principle means that the EXIOBASE energy extensions refer to the functional border of a country's economy. In this case, the system border is defined by the 'residence' of the agent. This means that energy supply and use from international transport by ships, airplanes, fishing vessels, cars and trucks are allocated to the resident units of a country, independent from where these activities take place. In EXIOBASE version3, because emissions from these transport activities are estimated from the energy extensions via emission factors, the emissions extensions follow the residence principle as well. 

### Satellite extensions direct from final demand

For CO2-equivalent emissions and energy use direct from household final demand, we use two further EUROSTAT emissions and energy tables to split 'Total activities by households' between heating/cooling (HH_HEAT), transport activities (HH_TRA), and other (HH_OTH). Definitions of what is included in each can be found in EUROSTAT's 'manual for air emissions accounts' (2015, p.66) @eurostat_manual_2015. While this disaggregation exists for nearly all EUROSTAT HBS countries, 2015 is the earliest year in our sample with complete coverage (except for Turkey in energy, which we impute using the Bulgarian splits). We therefore also use the 2015 splits between these three categories for our 2005 and 2010 estimates. The two data tables are:

1) For energy, the EUROSTAT data table 'Energy supply and use by NACE Rev. 2 activity' [env_ac_pefasu] at: http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=env_ac_pefasu [accessed on 03.06.2020]. 

2) For emissions to air, we download the EUROSTAT data table 'Air emissions accounts by NACE Rev. 2 activity' [env_ac_ainah_r2] at: https://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=env_ac_ainah_r2&lang=en [accessed on 03.06.2020].
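A minimal sketch of how such a split could be applied, assuming that the EUROSTAT values are first converted to shares and then multiplied by the direct household extension from EXIOBASE. All numbers are made up, and the object names are hypothetical.

```r
# Hypothetical EUROSTAT 2015 values for one country (any unit; only shares matter)
eurostat_2015 <- c(HH_HEAT = 40, HH_TRA = 50, HH_OTH = 10)
shares_2015   <- eurostat_2015 / sum(eurostat_2015)

# Hypothetical EXIOBASE direct household emissions for the three years
exiobase_hh_direct <- c("2005" = 90, "2010" = 80, "2015" = 70)

# The 2015 shares are applied to every year: years x household activities
hh_direct_split <- outer(exiobase_hh_direct, shares_2015)

# The split leaves each year's total unchanged
stopifnot(isTRUE(all.equal(rowSums(hh_direct_split), exiobase_hh_direct)))
```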

## EUROSTAT HBS

The EUROSTAT HBS compiles national household budget surveys from European countries. Countries provide EUROSTAT with their HBSs in micro-data form, and EUROSTAT has aggregated and published these survey data in macro-data form every five years since 1988. Of the publicly available data tables, we use two:

1) 'Mean consumption expenditure by income quintile (hbs_exp_t133)'

2) 'Structure of consumption expenditure by income quintile and COICOP consumption purpose (hbs_str_t223)'

'Mean consumption expenditure by income quintile' is expressed in purchasing power standard (PPS) per year, country and income quintile, in two possible units: per household and per adult equivalent. PPS is a currency measure that adjusts the euro for national price-level differences. The income quintiles are determined by household, i.e. all households in the sample are ranked by income and evenly distributed into five quintiles, so that each income quintile contains the same number of households. Per adult equivalent, the mean consumption expenditure by income quintile is adjusted for differences in household size between countries, years, and income quintiles, using the modified OECD scale: the first adult in the household is given a weight of 1.0, each adult thereafter 0.5, and each child 0.3 @eurostat_description_2016.
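The modified OECD scale can be written as a one-line function. The sketch below applies it to three hypothetical households; the function name `adult_equivalents` and the expenditure values are made up for illustration.

```r
# Modified OECD scale: 1.0 for the first adult, 0.5 for each further adult,
# 0.3 for each child (assumes at least one adult per household)
adult_equivalents <- function(n_adults, n_children) {
  stopifnot(all(n_adults >= 1), all(n_children >= 0))
  1 + 0.5 * (n_adults - 1) + 0.3 * n_children
}

# Hypothetical households: single adult, couple, couple with two children
n_adults   <- c(1, 2, 2)
n_children <- c(0, 0, 2)
ae <- adult_equivalents(n_adults, n_children)

# Hypothetical expenditure per household (PPS) converted to per adult equivalent
pps_hh <- c(15000, 24000, 31500)
pps_ae <- pps_hh / ae
```

In this example the couple with two children spends the most per household but no more per adult equivalent than the single adult, which is the kind of difference the adjustment is meant to reveal.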

The 'Structure of consumption expenditure by income quintile and COICOP consumption purpose' table gives the distribution of consumption expenditure across COICOP consumption categories in each income quintile, expressed in 'parts per mille (pm)'. There are three 'levels' of COICOP breakdown. All countries have level 1 (12 consumption categories) and level 2, but only a few have level 3. For the current analysis, we select our own mix of COICOP levels 1 and 2. We use COICOP level 1, except for the categories of food (CP01), housing (CP04) and transport (CP07), where we use level 2. This is primarily because of the diversity of level 2 categories within those aggregate level 1 categories. This is most clearly seen in the housing category, which at level 2 is broken down into: 'Actual rentals for housing' (CP041), 'Imputed rentals for housing' (CP042), 'Maintenance and repair of the dwelling' (CP043), 'Water supply and miscellaneous services relating to the dwelling' (CP044), and 'Electricity, gas and other fuels' (CP045). Mapping all shelter-related EXIOBASE production sectors only to the HBS level 1 'housing' consumption category (CP04) would obscure the difference between level 2 categories with extremely different effects on the footprint (electricity production vs. rental payments). We do not take COICOP level 2 across all consumption categories, however, because a few COICOP categories are more disaggregated than the corresponding sectors in EXIOBASE (EXPLAIN THIS BETTER). The COICOP consumption categories we thus use are: CP011, CP012, CP02, CP03, 'rent' (we collapse CP041 with CP042), CP043, CP044, CP045, CP05, CP06, CP071, CP072, CP073, CP08, CP09, CP10, CP11, CP12 (INCLUDE NAMES).
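Collapsing the two rent categories is a simple sum of their shares. The sketch below uses made-up parts-per-mille values for one income quintile; `pm_cp04` and `pm_mix` are hypothetical names.

```r
# Hypothetical level 2 housing shares for one quintile, in parts per mille
pm_cp04 <- c(CP041 = 60, CP042 = 140, CP043 = 20, CP044 = 30, CP045 = 50)

# Collapse actual and imputed rent into one 'rent' category, keep the rest
pm_mix <- c(rent = sum(pm_cp04[c("CP041", "CP042")]),
            pm_cp04[c("CP043", "CP044", "CP045")])

# Collapsing must not change the total share of housing (CP04)
stopifnot(isTRUE(all.equal(sum(pm_mix), sum(pm_cp04))))
```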

National footprints are often expressed in per capita terms, i.e. the national footprint divided by total national population. This masks any underlying inequality in per capita footprints across income groups. As mentioned above, the EUROSTAT HBS 'mean consumption expenditure by income quintile' macro-data table is expressed only per household and per adult equivalent, not per capita. To express our footprint estimates per household, we need the total number of private households in each country and year. To adjust our footprint estimates for household size, we need the average adult equivalent units per household in each country, year and income quintile or, alternatively, the national population in adult equivalent units (using the same scale as the HBS). For the total number of private households in each country and year, we use:

1) The 'lfst_hhnhtych' table from EUROSTAT, selecting all available years and total households [accessed on 04.05.2020]. 

2) Norway is missing from the 'lfst_hhnhtych' EUROSTAT table. We download Norwegian data from the Norwegian statistical office: https://www.ssb.no/en/statbank/table/10986/. We select 'Private households', 'the whole country', no household type selection, all years (2005-2019), and continue with 'Table - Layout 1', then save the table as a 'Tab delimited without heading (csv)' file [accessed on 04.05.2020].
  
```{r, echo = FALSE, warning = FALSE, eval = FALSE}

options(digits=4)

total_private_households_Eurostat = read.csv("../data/lfst_hhnhtych_1_Data.csv") %>%
  filter(!(GEO %in% c("European Union - 27 countries (from 2020)",
                    "Euro area - 19 countries  (from 2015)",
                    "European Union - 28 countries (2013-2020)",
                    "European Union - 15 countries (1995-2004)"))) %>%
  mutate(geo = dplyr::recode(GEO,
                             "Belgium" = "BE",
                             "Bulgaria" = "BG",
                             "Czechia" = "CZ",
                             "Denmark" = "DK",
                             "Germany (until 1990 former territory of the FRG)" = "DE",
                             "Estonia" = "EE",
                             "Ireland" = "IE",
                             "Greece" = "EL",
                             "Spain" = "ES",
                             "France" = "FR",
                             "Croatia" = "HR",
                             "Italy" = "IT",
                             "Cyprus" = "CY",
                             "Latvia" = "LV",
                             "Lithuania" = "LT",
                             "Luxembourg" = "LU",
                             "Hungary" = "HU",
                             "Malta" = "MT",
                             "Netherlands" = "NL",
                             "Austria" = "AT",
                             "Poland" = "PL",
                             "Portugal" = "PT",
                             "Romania" = "RO",
                             "Slovenia" = "SI",
                             "Slovakia" = "SK",
                             "Finland" = "FI",
                             "Sweden" = "SE",
                             "United Kingdom" = "UK",
                             "Montenegro" = "ME",
                             "North Macedonia" = "MK",
                             "Serbia" = "RS",
                             "Turkey" = "TR")) %>%
  select(TIME, geo, Value) %>%
  rename(year = TIME, total_private_households = Value) %>%
  # Value is read as text with thousands separators and is reported in
  # thousands of households, so parse it and rescale to households
  mutate(total_private_households = parse_number(as.character(total_private_households)) * 1000)

total_private_households_Norway = read.csv("../data/Privathusholdninger.csv") %>%
  gather(year, total_private_households, Private.households.2005:Private.households.2019) %>%
  mutate(geo = dplyr::recode(region,
                             "0 The whole country" = "NO"),
         year = dplyr::recode(year,
                              "Private.households.2005" = 2005,
                              "Private.households.2006" = 2006,
                              "Private.households.2007" = 2007,
                              "Private.households.2008" = 2008,
                              "Private.households.2009" = 2009,
                              "Private.households.2010" = 2010,
                              "Private.households.2011" = 2011,
                              "Private.households.2012" = 2012,
                              "Private.households.2013" = 2013,
                              "Private.households.2014" = 2014,
                              "Private.households.2015" = 2015,
                              "Private.households.2016" = 2016,
                              "Private.households.2017" = 2017,
                              "Private.households.2018" = 2018,
                              "Private.households.2019" = 2019)) %>%
  select(year,geo,total_private_households)

total_private_households = rbind(total_private_households_Eurostat,
                                 total_private_households_Norway) %>%
  mutate(geo = as.character(geo),
         year = as.numeric(year),
         total_private_households = as.numeric(total_private_households))

write_csv(total_private_households, "../data/total_private_households.csv")

options(digits=2)

```
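Once the household totals are available, a quintile's footprint can be put on a per-household basis. The sketch below assumes, as described for the HBS above, that each income quintile contains one fifth of all private households. The footprint values and the household count are made up, and the exact normalisation used for the paper's results may differ.

```r
# Hypothetical national household footprint by income quintile (tonnes CO2e)
fp_quintile <- c(q1 = 4e6, q2 = 5e6, q3 = 6e6, q4 = 8e6, q5 = 12e6)

# Hypothetical total number of private households in that country and year
total_private_households <- 2.5e6

# Quintiles are formed by ranking households, so each holds one fifth of them
households_per_quintile <- total_private_households / 5
fp_per_household <- fp_quintile / households_per_quintile

# National mean per household, for comparison with the quintile values
fp_mean <- sum(fp_quintile) / total_private_households
```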

## Method for main paper results

For our results in the main paper, we first decompose the EXIOBASE national household final demand expenditure (before calculating the footprint) by income quintile, using the EUROSTAT HBS. We do this for 2005, 2010 and 2015, but only present 2015 in the main paper. The initial step is mapping the COICOP consumption categories in the HBS to the production sectors in EXIOBASE.

Because the share of each consumption category/production sector in the total amount of expenditure is not identical between the HBS and EXIOBASE, there are two possible methods for decomposing the EXIOBASE household final demand expenditure: one that keeps the EXIOBASE production sector shares of total EXIOBASE expenditure intact (breaking the HBS consumption category shares), and one that keeps the HBS consumption category shares of total HBS expenditure intact (breaking the EXIOBASE production sector shares). Our results in the main paper use the first method, keeping the EXIOBASE production sector shares of total EXIOBASE expenditure intact. This means that the total footprint is identical to the footprint calculated in EXIOBASE without any decomposition by income quantile. The alternative method, on the other hand, results in a different total footprint because a different amount of final demand expenditure in each production sector is now multiplied by the same original 'total' intensity for that production sector. We present the alternative method and some results in the last sections of this document. Oswald et al. (2020) @oswald_large_2020 discuss their use of a third option in between these two, in which they recalculate intensities using HBS expenditure; this still yields total footprints that differ from the non-decomposed footprint, but with good agreement between the two.

In this section we use a two-sector toy model to illustrate the method behind the main paper results. In the two-sector model, the term 'sector' represents HBS consumption categories/EXIOBASE production sectors that have already been mapped to each other.

Step 1) Multiply 'mean consumption expenditure by income quintile' in purchasing power standard per household (pps hh) by the 'structure of consumption expenditure by income quintile and COICOP consumption purpose' (in 'parts per mille' or pm) to calculate the consumption expenditure structure in 'pps hh'. Then calculate the share of each income quintile within each sector. Table S1 shows the two-sector example, where 'pps hh' is multiplied by the shares of each sector ('parts per mille' divided by 1000) to calculate the expenditure on each sector per income quintile in 'pps hh' ('s1 (pps hh)' and 's2 (pps hh)'). 's1 (q share)' and 's2 (q share)' are each income quintile's share of the total amount of expenditure (in 'pps hh') on that sector, according to the HBS.

```{r tableS1}

quintile = c("q1","q2","q3","q4","q5")
pps_hh = c(10,13,20,33,40)

pm_sector1 = c(200,300,400,500,600)
pm_sector2 = c(800,700,600,500,400)

hbs = data.frame(quintile, pps_hh, pm_sector1, pm_sector2) %>%
  mutate(pps_hh_sector1 = pps_hh*(pm_sector1/1000),
         sector_1_shares = pps_hh_sector1/sum(pps_hh_sector1),
         pps_hh_sector2 = pps_hh*(pm_sector2/1000),
         sector_2_shares = pps_hh_sector2/sum(pps_hh_sector2))

knitr::kable(hbs, caption = "Table S1: HBS structure with calculations of quintile shares per sector.", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh", 
                           "s1 (pm)", 
                           "s2 (pm)",
                           "s1 (pps hh)",
                           "s1 (q share)",
                           "s2 (pps hh)",
                           "s2 (q share)")) %>%
  kable_styling(latex_options = "HOLD_position")
                            
```

Step 2) Join the HBS income quintile shares per sector to the EE-MRIO by sector, and multiply these shares by the EE-MRIO household final demand expenditure per sector ('eemrio hh fd') to decompose it by income quintile (Table S2).

```{r tableS2}

q_share_of_sector = hbs %>%
  select(quintile,pps_hh_sector1,pps_hh_sector2) %>%
  gather(sector,value,-quintile) %>%
  mutate(sector = dplyr::recode(sector,
                                "pps_hh_sector1" = 1,
                                "pps_hh_sector2" = 2)) %>%
  group_by(sector) %>%
  mutate(q_share_of_sector = value/sum(value)) %>%
  select(-value) %>%
  spread(quintile,q_share_of_sector) 

eemrio_hh_fd = c(300,800)

eemrio = data.frame(q_share_of_sector, eemrio_hh_fd) %>%
  mutate(q1_eemrio = q1*eemrio_hh_fd,
         q2_eemrio = q2*eemrio_hh_fd,
         q3_eemrio = q3*eemrio_hh_fd,
         q4_eemrio = q4*eemrio_hh_fd,
         q5_eemrio = q5*eemrio_hh_fd)

knitr::kable(eemrio, caption = "Table S2: HBS income quintile shares per sector 
             multiplied by EE-MRIO household final demand expenditure vector to 
             calculate EE-MRIO household final demand expenditure per 
             income quintile and sector.",
             booktabs = TRUE,
             col.names = c("sector",
                           "q1 share",
                           "q2 share",
                           "q3 share",
                           "q4 share",
                           "q5 share",
                           "eemrio hh fd",
                           "q1 fd",
                           "q2 fd",
                           "q3 fd",
                           "q4 fd",
                           "q5 fd")) %>%
  kable_styling(latex_options = "HOLD_position")

```

Step 3) Once the EE-MRIO household final demand expenditure is decomposed by income quintile, it is multiplied by the EE-MRIO total intensities per sector to calculate the EE-MRIO household footprint per sector decomposed by income quintile (Table S3). The total footprint per sector ('total fp') is the same as if we had simply multiplied the non-decomposed EE-MRIO household final demand expenditure per sector ('eemrio hh fd') by the total intensity vector ('TIV'). 

```{r tableS3}

TIV = c(0.2,0.4)

footprint = data.frame(eemrio,TIV) %>%
  select(sector,
         eemrio_hh_fd,
         q1_eemrio,
         q2_eemrio,
         q3_eemrio,
         q4_eemrio,
         q5_eemrio,
         TIV) %>%
  mutate(q1_footprint = q1_eemrio*TIV,
         q2_footprint = q2_eemrio*TIV,
         q3_footprint = q3_eemrio*TIV,
         q4_footprint = q4_eemrio*TIV,
         q5_footprint = q5_eemrio*TIV,
         total_footprint = eemrio_hh_fd*TIV)

knitr::kable(footprint, caption = "Table S3: Calculation of EE-MRIO household 
             footprint decomposed by income quintile, through multiplication of 
             EE-MRIO household final demand expenditure per income quintile and 
             sector by the total intensity vector (TIV) calculated in the EE-MRIO 
             (with only one total intensity per sector).",
             booktabs = TRUE,
             col.names = c("sector",
                           "eemrio hh fd",
                           "q1 fd",
                           "q2 fd",
                           "q3 fd",
                           "q4 fd",
                           "q5 fd",
                           "TIV",
                           "q1 fp",
                           "q2 fp",
                           "q3 fp",
                           "q4 fp",
                           "q5 fp",
                           "total fp")) %>%
  kable_styling(latex_options = "HOLD_position")

```
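The three steps can also be written compactly with matrices. The following base R sketch reuses the toy numbers from Tables S1 to S3 and checks that the quintile footprints add up to the undecomposed footprint in each sector. It illustrates the logic only and is not the code used for the paper's results.

```r
# Toy inputs from Tables S1 to S3
pps_hh <- c(q1 = 10, q2 = 13, q3 = 20, q4 = 33, q5 = 40)
pm <- matrix(c(200, 300, 400, 500, 600,
               800, 700, 600, 500, 400),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("s1", "s2"), names(pps_hh)))
eemrio_hh_fd <- c(s1 = 300, s2 = 800)
TIV <- c(s1 = 0.2, s2 = 0.4)

# Step 1: expenditure per sector and quintile, then quintile shares per sector
spend   <- sweep(pm / 1000, 2, pps_hh, "*")
q_share <- spend / rowSums(spend)

# Step 2: decompose the EE-MRIO household final demand by quintile
fd_by_quintile <- q_share * eemrio_hh_fd

# Step 3: footprint per sector and quintile
fp_by_quintile <- fd_by_quintile * TIV

# Sector totals equal the undecomposed footprint
stopifnot(isTRUE(all.equal(rowSums(fp_by_quintile), eemrio_hh_fd * TIV)))
```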

## Mapping between EXIOBASE and EUROSTAT HBS

Table S4 shows our mapping between the EUROSTAT HBS consumption categories and the EXIOBASE production sectors, along with our grouping of the EXIOBASE production sectors into the five aggregated consumption categories we present in the main paper. 

```{r tableS4}

labels = read.csv(here("analysis", "preprocessing", "Exiobase_T_labels_ixi_w_coicop_mapping_no_rent.csv")) %>%
  select(V2,coicop,five_sectors) %>%
  unique()

knitr::kable(labels, caption = "Table S4: Mapping EXIOBASE industry 
             production sectors to COICOP consumption categories and 
             five aggregate consumption categories",
             longtable = TRUE,
             booktabs = TRUE,
             col.names = c("exiobase industry production sector",
                           "coicop consumption category",
                           "aggregate consumption category")) %>%
  kable_styling(latex_options = c("HOLD_position", "repeat_header")) %>%
  column_spec(1, width = "30em") %>%
  scroll_box(width = "100%", height = "200px")

```
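To show how such a mapping enters the decomposition, the sketch below lets each EXIOBASE sector inherit the quintile shares of the COICOP category it is mapped to. The three sector labels, their mapping, the shares and the final demand values are illustrative assumptions, not rows taken from Table S4.

```r
# Hypothetical mapping of three EXIOBASE sectors to COICOP categories
mapping <- data.frame(
  sector = c("Cultivation of wheat",
             "Production of electricity by coal",
             "Real estate activities"),
  coicop = c("CP011", "CP045", "rent"),
  stringsAsFactors = FALSE
)

# Hypothetical HBS quintile shares per COICOP category (each row sums to 1)
q_share <- rbind(CP011 = c(0.15, 0.18, 0.20, 0.22, 0.25),
                 CP045 = c(0.16, 0.18, 0.20, 0.22, 0.24),
                 rent  = c(0.10, 0.15, 0.20, 0.25, 0.30))
colnames(q_share) <- paste0("q", 1:5)

# Hypothetical household final demand per EXIOBASE sector
eemrio_hh_fd <- c(120, 80, 400)

# Each sector inherits the quintile shares of its COICOP category
fd_by_quintile <- q_share[mapping$coicop, ] * eemrio_hh_fd
rownames(fd_by_quintile) <- mapping$sector
```

Several EXIOBASE sectors mapped to the same COICOP category would simply repeat that category's row of shares.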

## Country and year coverage

EXIOBASE version3 industry-by-industry is available for the years 1995 to 2016, with the caveat that the original data series ends in 2011: the 2012-2016 estimates are based on trade and macro-economic data, so care must be taken, especially when analysing trends over time. The EUROSTAT HBS is available for the years 1988, 1994, 1999, 2005, 2010 and 2015, although not all countries are available for all years. Table S5 shows the country and year coverage between EXIOBASE and the EUROSTAT HBS. Rows with black text show countries that are represented in EXIOBASE, with an 'x' for those years where EUROSTAT HBS data also exist for that country. Rows with red text show countries where EUROSTAT HBS data exist but which are not represented individually in EXIOBASE (they are in 'rest-of-world' categories).

```{r tableS5}

geo = c("Austria",
        "Belgium",
        "Bulgaria",
        "Cyprus",
        "Czech Rep.",
        "Germany",
        "Denmark",
        "Estonia",
        "Greece",
        "Spain",
        "Finland",
        "France",
        "Croatia",
        "Hungary",
        "Ireland",
        "Italy",
        "Lithuania",
        "Luxembourg",
        "Latvia",
        "Montenegro",
        "North Macedonia",
        "Malta",
        "Netherlands",
        "Norway",
        "Poland",
        "Portugal",
        "Romania",
        "Serbia",
        "Sweden",
        "Slovenia",
        "Slovakia",
        "Turkey",
        "UK",
        "Kosovo")

year_2015 = c("x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x")

year_2010 = c("x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "x",
              "")
  
year_2005 = c("x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "x",
              "")
  
year_1999 = c("x",
              "x",
              "",
              "",
              "",
              "x",
              "x",
              "",
              "x",
              "x",
              "x",
              "x",
              "",
              "",
              "x",
              "x",
              "",
              "x",
              "",
              "",
              "",
              "",
              "x",
              "",
              "",
              "x",
              "",
              "",
              "x",
              "",
              "",
              "",
              "x",
              "")

country_year_coverage = data.frame(geo,
                                   year_2015,
                                   year_2010,
                                   year_2005,
                                   year_1999)

knitr::kable(country_year_coverage, caption = "Table S5: Country and year 
             coverage between EXIOBASE and the EUROSTAT HBS. Rows with black 
             text show countries that are represented in EXIOBASE, and an 'x' 
             for those years where EUROSTAT HBS data also exists for that country. 
             Rows with red text show countries where EUROSTAT HBS data exists, 
             but which are not represented individually in EXIOBASE (they are in 
             'rest-of-world' categories)",
             col.names = c("geo",
                           "2015",
                           "2010",
                           "2005",
                           "1999")) %>%
  kable_styling(latex_options = "HOLD_position") %>%
  row_spec(c(20,21,28,34), color = "red") %>%
  scroll_box(width = "100%", height = "200px")

```

## Limitations

Where do I say why we chose EXIOBASE?

While the EUROSTAT HBS is compiled for cross-country comparison purposes and aims for harmonization, harmonization remains imperfect in the frequency of surveys, timing, content and structure between countries and years @eurostat_eu_2020. Some types of households may also be excluded from the samples, including super-rich households; Germany, for example, excludes households with over €18,000 monthly net income @eurostat_eu_2020. Sensitive goods and services, such as alcohol, may be under-reported in household budget surveys, while expenditure on infrequent purchases such as a vehicle may create artificially large expenditure differences between households depending on the timing of the survey @eurostat_eu_2020. The EUROSTAT HBS macro-data also do not report direct foreign purchases, and we assumed that the expenditure shares between income quintiles of direct final demand for foreign goods and services were the same as for direct final demand for domestic goods and services.

There are also well-known limitations when using and selecting an EE-MRIO (ref). The production sectors in EXIOBASE are harmonized across countries and years, but the need to map the EUROSTAT HBS to EXIOBASE meant that, for the most recent year of 2015, we could only use the industry-by-industry version of EXIOBASE version 3. This version assumes fixed product sales. Furthermore, because EXIOBASE version 3 is extrapolated beyond 2011, caution is needed when comparing results over time. This, together with the fact that the harmonization guidelines of the EUROSTAT HBS have changed over time, is why we present only the 2015 results in the main paper, and the 2005 and 2010 results only here in the SI and in the data file. We also show 2010 results using the product-by-product version of EXIOBASE version 3 in the final section of this SI document. 

Mapping the EUROSTAT HBS to EXIOBASE means mapping the COICOP consumption categories in the HBS to industry production sectors in EXIOBASE, which is not one-to-one. Both the EUROSTAT HBS and EXIOBASE are limited in their consumption category/production sector level of detail. The share of each consumption category/production sector in the total amount of expenditure is also not identical between the HBS and EXIOBASE. As discussed in the 'methods for main paper results' section of this SI document, there are alternative methods for decomposing the EXIOBASE household final demand expenditure: one that keeps the EXIOBASE production sector shares of total expenditure intact, and one that keeps the HBS consumption category shares of total expenditure intact. Our results in the main paper use the first method, keeping the EXIOBASE sectoral shares of total expenditure intact, which means that the total footprint is identical to when it is calculated in EXIOBASE without any decomposition by income quantile. The alternative method, on the other hand, results in a different total footprint because a different amount of final demand expenditure in each sector is multiplied by the same original 'total' intensities, but stays faithful to the original HBS consumption category shares of total HBS expenditure. We show the alternative method and some results in the last sections of this SI document.
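
The two decompositions described above can be written compactly. The notation is introduced here for illustration only and is not used elsewhere in this document: $Y_s$ is the EXIOBASE household final demand expenditure in production sector $s$, with $Y = \sum_s Y_s$; $x_q$ is the HBS mean consumption expenditure of income quantile $q$; $m_{q,s}$ is the HBS parts per mille of that expenditure falling on sector $s$ after mapping COICOP categories to sectors (so $\sum_s m_{q,s} = 1000$); and $f_s$ is the 'total' intensity of sector $s$. The main method keeps the sector totals of EXIOBASE:

$$
y^{\text{main}}_{q,s} = Y_s \, \frac{x_q \, m_{q,s}}{\sum_{q'} x_{q'} \, m_{q',s}}, \qquad \sum_q y^{\text{main}}_{q,s} = Y_s
$$

The alternative method keeps the expenditure structure of the HBS and only preserves the overall total:

$$
y^{\text{alt}}_{q,s} = Y \, \frac{x_q}{\sum_{q'} x_{q'}} \, \frac{m_{q,s}}{1000}, \qquad \sum_q \sum_s y^{\text{alt}}_{q,s} = Y
$$

In both cases the total footprint is $F = \sum_s f_s \sum_q y_{q,s}$. Under the main method this reduces to $\sum_s f_s Y_s$, the original EXIOBASE footprint. Under the alternative method $\sum_q y^{\text{alt}}_{q,s}$ generally differs from $Y_s$, so the total footprint changes even though the intensities $f_s$ do not.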

### Purchaser price versus base price

The EUROSTAT HBS is reported in purchaser price (consumers report their expenditure in the prices they paid) while we use EXIOBASE version 3 in base price. Because we keep the EXIOBASE production sector shares the same (and only decompose by income quintile within each production sector), which also keeps the total footprint the same, the distinction between purchaser price and base price does not matter for our main results. The 'alternative method' section of this document shows that, when we instead break the EXIOBASE production sector shares and keep the consumption category shares of the HBS, the distinction between purchaser price and base price does matter.

In the methodology used for the main paper, once we have multiplied the HBS 'pps hh' per income quintile by the sectoral shares in 'pm' to get sectoral expenditure per income quintile in 'pps hh', we account for the fact that sector 1 and sector 2 have different base price to purchaser price ratios. The ratio for sector 1 ('s1 bp') is 0.5, whereas for sector 2 ('s2 bp') it is 1 (i.e. there are no trade, transport, tax or subsidy margins in sector 2). Multiplying by these ratios expresses sectoral expenditure per income quintile in 'pps hh' in base price ('s1 pps hh bp' and 's2 pps hh bp') (Table S6).

```{r tableS6}

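# Base price share of purchaser price, identical for every income quintile:
# sector 1 keeps 50% of its purchaser price, sector 2 has no margins at all
bp_share_in_pp_sector_1 = c(0.5,0.5,0.5,0.5,0.5)
bp_share_in_pp_sector_2 = c(1,1,1,1,1)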

hbs_bp = data.frame(quintile, 
                    pps_hh, 
                    pm_sector1, 
                    pm_sector2, 
                    bp_share_in_pp_sector_1, 
                    bp_share_in_pp_sector_2) %>%
  mutate(pps_hh_sector1 = pps_hh*(pm_sector1/1000),
         pps_hh_sector1_bp = pps_hh_sector1*bp_share_in_pp_sector_1,
         sector_1_shares = pps_hh_sector1/sum(pps_hh_sector1),
         sector_1_shares_bp = pps_hh_sector1_bp/sum(pps_hh_sector1_bp),
         pps_hh_sector2 = pps_hh*(pm_sector2/1000),
         pps_hh_sector2_bp = pps_hh_sector2*bp_share_in_pp_sector_2,
         sector_2_shares = pps_hh_sector2/sum(pps_hh_sector2),
         sector_2_shares_bp = pps_hh_sector2_bp/sum(pps_hh_sector2_bp),
         pps_hh_bp = pps_hh_sector1_bp + pps_hh_sector2_bp)
  
hbs_bp_pps = hbs_bp %>%
  select(quintile, 
         pps_hh,
         pm_sector1,
         pm_sector2,
         pps_hh_sector1,
         bp_share_in_pp_sector_1,
         pps_hh_sector1_bp,
         pps_hh_sector2,
         bp_share_in_pp_sector_2,
         pps_hh_sector2_bp)

knitr::kable(hbs_bp_pps, caption = "Table S6: Same as Table S1 but now 
             with base price shares of purchaser price per sector, and 
             sectoral expenditure per quintile in pps hh base price.", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh", 
                           "s1 (pm)", 
                           "s2 (pm)",
                           "s1 (pps hh)",
                           "s1 bp",
                           "s1 (pps hh bp)",
                           "s2 (pps hh)",
                           "s2 bp",
                           "s2 (pps hh bp)")) %>%
  kable_styling(latex_options = "HOLD_position")

```

Table S7 shows that, even though the sectoral expenditure for sector 1 is 50% lower in base price than in purchaser price, the income quintile shares per sector in base price are the same as they were in Table S1. This is because we have only one base price to purchaser price ratio per sector (i.e. we assume that every income quintile in a country faces the same ratio when purchasing a good or service from a given sector). 

```{r tableS7}

hbs_bp_shares = hbs_bp %>%
  select(quintile, 
         pps_hh_bp,
         pps_hh_sector1_bp,
         sector_1_shares_bp,
         pps_hh_sector2_bp,
         sector_2_shares_bp) %>%
  mutate(pm_sector1_bp = (pps_hh_sector1_bp/(pps_hh_sector1_bp + pps_hh_sector2_bp))*1000,
         pm_sector2_bp = (pps_hh_sector2_bp/(pps_hh_sector1_bp + pps_hh_sector2_bp))*1000)

knitr::kable(hbs_bp_shares, caption = "Table S7: Quintile shares per sector 
             in base price, with new 'pps hh' per quintile in base price and 
             new sectoral pm values in base price.", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh bp",
                           "s1 (pps hh bp)",
                           "s1 (q share bp)",
                           "s2 (pps hh bp)",
                           "s2 (q share bp)",
                           "s1 (pm bp)",
                           "s2 (pm bp)")) %>%
  kable_styling(latex_options = "HOLD_position")

```
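
The invariance shown in Table S7 follows from the ratio cancelling out of each share. The sketch below checks this for a single sector; the expenditure values and variable names are hypothetical and chosen only for illustration. It also shows that the shares would change if the ratio differed between income quintiles, which is the assumption we rule out.

```r
# Hypothetical purchaser price expenditure of five income quintiles in one sector
expenditure_pp <- c(q1 = 1, q2 = 2, q3 = 4, q4 = 8, q5 = 12)

# One base price to purchaser price ratio for the whole sector
bp_ratio <- 0.5
expenditure_bp <- expenditure_pp * bp_ratio

shares_pp <- expenditure_pp / sum(expenditure_pp)
shares_bp <- expenditure_bp / sum(expenditure_bp)

# The common ratio cancels, so the quintile shares are unchanged
same_shares <- isTRUE(all.equal(shares_pp, shares_bp))

# Counter-example: a ratio that varies by quintile (hypothetical values)
bp_ratio_by_quintile <- c(0.4, 0.45, 0.5, 0.55, 0.6)
expenditure_bp_varying <- expenditure_pp * bp_ratio_by_quintile
shares_bp_varying <- expenditure_bp_varying / sum(expenditure_bp_varying)

# With quintile-specific ratios the shares no longer match
shares_changed <- !isTRUE(all.equal(shares_pp, shares_bp_varying))
```

Both `same_shares` and `shares_changed` evaluate to `TRUE` for these values.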

If we now do the same as in Table S2, joining these income quintile shares per sector to the EE-MRIO by sector and multiplying them by the EE-MRIO household final demand expenditure per sector to decompose it by income quintile, we end up with the same decomposition (Table S8), and therefore the same decomposed footprint as before. 

```{r tableS8}

q_share_of_sector_bp = hbs_bp_shares %>%
  select(quintile,pps_hh_sector1_bp,pps_hh_sector2_bp) %>%
  gather(sector,value,-quintile) %>%
  mutate(sector = dplyr::recode(sector,
                                "pps_hh_sector1_bp" = 1,
                                "pps_hh_sector2_bp" = 2)) %>%
  group_by(sector) %>%
  mutate(q_share_of_sector_bp = value/sum(value)) %>%
  select(-value) %>%
  spread(quintile,q_share_of_sector_bp) 

eemrio_hh_fd = c(300,800)

eemrio_bp = data.frame(q_share_of_sector_bp, eemrio_hh_fd) %>%
  mutate(q1_eemrio = q1*eemrio_hh_fd,
         q2_eemrio = q2*eemrio_hh_fd,
         q3_eemrio = q3*eemrio_hh_fd,
         q4_eemrio = q4*eemrio_hh_fd,
         q5_eemrio = q5*eemrio_hh_fd)

knitr::kable(eemrio_bp, caption = "Table S8: Identical to Table S2.",
             booktabs = TRUE,
             col.names = c("sector",
                           "q1 share bp",
                           "q2 share bp",
                           "q3 share bp",
                           "q4 share bp",
                           "q5 share bp",
                           "eemrio hh fd",
                           "q1 fd",
                           "q2 fd",
                           "q3 fd",
                           "q4 fd",
                           "q5 fd")) %>%
  kable_styling(latex_options = "HOLD_position")

```

## European expenditure deciles

Those countries with data in 2005, 2010 and 2015. 

## Alternative method

Our methodology used for the main paper, and explained in the sections above, keeps the production sector shares of EE-MRIO household final demand expenditure (and subsequently the footprint) the same as they are found in the original EE-MRIO household final demand expenditure when not decomposed by income quantile. The alternative method is to keep the consumption category shares of total HBS expenditure the same as they are found in the HBS. This means taking the total sum of household final demand expenditure from the EE-MRIO and decomposing it first based on the share of each income quantile's consumption expenditure of the total consumption expenditure as found in the HBS, before decomposing into sectors as well using the HBS 'parts per mille' per sector. This leads to a different total footprint than the original EE-MRIO footprint (when not decomposed by income quantile), because a different amount of final demand expenditure in each sector is now multiplied by the same total intensities per sector that are originally calculated in the EE-MRIO.  

Here the distinction between purchaser price and base price matters, as we directly take the HBS expenditure structure (which is reported in purchaser price) and apply it to the aggregated sum of EE-MRIO household final demand expenditure which is originally in base price. To correct for this, we can either convert the household budget survey to base price or take the total EE-MRIO household final demand expenditure in purchaser price, decompose it according to the HBS structure, then convert to base price. Here we show how we transform the household budget survey from purchaser price to base price using EUROSTAT margin data tables. 

We use two data tables from the OECD statistics database, which are provided to the OECD by EUROSTAT (for European countries). The units are 'Trade and transport margins in percentage of final consumption expenditure by households' and 'Taxes less subsidies on product in percentage of final consumption expenditure by households':

- Trade and Transport margin tables, Taxes less Subsidies tables (National Accounts > Annual National Accounts > Supply and Use Tables > SUT Indicators):   https://stats.oecd.org/index.aspx?queryid=84864#  

  1) Trade and transport margins in percentage of final consumption expenditure by households: Export as text file (csv) - customize to include years 2005 to 2015 [accessed on: 15-04-2020]
  
  2) Taxes less subsidies on product in percentage of final consumption expenditure by households: Export as text file (csv) - customize to include years 2005 to 2015 [accessed on: 15-04-2020]

We first map the HBS COICOP consumption categories to the margin data 'PRODUCT' sectors. For most countries the margin data start only in 2010, so we take the 2010 margin data and apply them to all three years in our sample (the margins do not change significantly over time). This was possible for 20 countries. Two further countries, the Czech Republic and Slovakia, are missing only the 'rent' margins; here we apply zero to both the trade and transport margins and the taxes less subsidies margins. Five countries have some missing margin data (Estonia, Ireland, Lithuania, Norway and Sweden), and three countries have no margin data at all (Germany, Spain and Turkey). We impute their margin percentages using those of a similar, neighboring country: Latvia for Estonia and Lithuania, the UK for Ireland, Finland for Norway and Sweden, Austria for Germany, Portugal for Spain, and Bulgaria for Turkey. These countries appear to be close approximations based on an examination of the 2010 values of another table, 'trade and transport margins in percentage of total supply at purchaser's prices'. The need to impute margin percentage data is a potential limitation of this method. If we relied on this method to produce the main paper results, acquiring and applying more precise margin percentage data would arguably be a requirement. 
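
The imputation rule above can be expressed as a simple lookup. This is a minimal sketch rather than the code used for our results: the object `margin_donor` and the helper `margin_source()` are hypothetical names, and the country labels are written out in full for readability.

```r
# Countries with missing margin data (names) and the neighboring country
# whose margin percentages are borrowed (values), as listed in the text
margin_donor <- c(
  Estonia   = "Latvia",
  Lithuania = "Latvia",
  Ireland   = "United Kingdom",
  Norway    = "Finland",
  Sweden    = "Finland",
  Germany   = "Austria",
  Spain     = "Portugal",
  Turkey    = "Bulgaria"
)

# Hypothetical helper: return the country whose margin percentages to use.
# Countries without an entry in margin_donor use their own margins.
margin_source <- function(country) {
  ifelse(country %in% names(margin_donor),
         unname(margin_donor[country]),
         country)
}

margin_source(c("Germany", "France", "Estonia"))
```

For the five countries with only partly missing data, such a lookup would be applied only to the missing entries.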

```{r tableS9}

# Same toy base price shares of purchaser price as used for Table S6
bp_share_in_pp_sector_1 = c(0.5,0.5,0.5,0.5,0.5)
bp_share_in_pp_sector_2 = c(1,1,1,1,1)

hbs_bp = data.frame(quintile, 
                    pps_hh, 
                    pm_sector1, 
                    pm_sector2, 
                    bp_share_in_pp_sector_1, 
                    bp_share_in_pp_sector_2) %>%
  mutate(pps_hh_sector1 = pps_hh*(pm_sector1/1000),
         pps_hh_sector1_bp = pps_hh_sector1*bp_share_in_pp_sector_1,
         sector_1_shares = pps_hh_sector1/sum(pps_hh_sector1),
         sector_1_shares_bp = pps_hh_sector1_bp/sum(pps_hh_sector1_bp),
         pps_hh_sector2 = pps_hh*(pm_sector2/1000),
         pps_hh_sector2_bp = pps_hh_sector2*bp_share_in_pp_sector_2,
         sector_2_shares = pps_hh_sector2/sum(pps_hh_sector2),
         sector_2_shares_bp = pps_hh_sector2_bp/sum(pps_hh_sector2_bp),
         pps_hh_bp = pps_hh_sector1_bp + pps_hh_sector2_bp)
  
hbs_bp_pps = hbs_bp %>%
  select(quintile, 
         pps_hh,
         pm_sector1,
         pm_sector2,
         pps_hh_sector1,
         bp_share_in_pp_sector_1,
         pps_hh_sector1_bp,
         pps_hh_sector2,
         bp_share_in_pp_sector_2,
         pps_hh_sector2_bp)

knitr::kable(hbs_bp_pps, caption = "Table S9: Same as Table S1 but now 
             with base price shares of purchaser price per sector, and 
             sectoral expenditure per income quintile in pps hh base price.", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh", 
                           "s1 (pm)", 
                           "s2 (pm)",
                           "s1 (pps hh)",
                           "s1 bp",
                           "s1 (pps hh bp)",
                           "s2 (pps hh)",
                           "s2 bp",
                           "s2 (pps hh bp)")) %>%
  kable_styling(latex_options = "HOLD_position")

```

```{r tableS10}

hbs_bp_shares = hbs_bp %>%
  select(quintile, 
         pps_hh_bp,
         pps_hh_sector1_bp,
         pps_hh_sector2_bp) %>%
  mutate(pm_sector1_bp = (pps_hh_sector1_bp/(pps_hh_sector1_bp + pps_hh_sector2_bp))*1000,
         pm_sector2_bp = (pps_hh_sector2_bp/(pps_hh_sector1_bp + pps_hh_sector2_bp))*1000)

knitr::kable(hbs_bp_shares, caption = "Table S10: Income quintile shares per sector 
             in base price, with new 'pps hh' per income quintile in base price and 
             new sectoral pm values in base price ('pps hh bp' is the sum of 
             's1 (pps hh bp)' and 's2 (pps hh bp)').", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh bp",
                           "s1 (pps hh bp)",
                           "s2 (pps hh bp)",
                           "s1 (pm bp)",
                           "s2 (pm bp)")) %>%
  kable_styling(latex_options = "HOLD_position")

```

Table S11 shows the HBS data on the left-hand side ('pps hh', 's1 (pm)' and 's2 (pm)') as before, but now the income quintile shares of mean consumption expenditure (the sum of 'pps hh') are calculated and shown as 'mean exp'. These shares are multiplied by total EE-MRIO household final demand expenditure, which in this case is 1100 (300 in sector 1 plus 800 in sector 2) (see Table S2), to calculate the amount of EE-MRIO household final demand expenditure per income quintile. These are then multiplied by the sectoral parts per mille shares to calculate EE-MRIO household final demand expenditure per sector and income quintile ('hh fd s1' and 'hh fd s2'). 

```{r tableS11}

quintile = c("q1","q2","q3","q4","q5")
pps_hh = c(9,11,16,25,28)

pm_sector1 = c(111,176,250,333,429)
pm_sector2 = c(889,824,750,667,571)

hbs_alt_method = data.frame(quintile, 
                            pps_hh, 
                            pm_sector1, 
                            pm_sector2,
                            bp_share_in_pp_sector_1,
                            bp_share_in_pp_sector_2) %>%
  mutate(mean_expenditure_share = pps_hh/sum(pps_hh),
         eemrio_hh_fd = 1100*mean_expenditure_share,
         hh_sector1 = eemrio_hh_fd*(pm_sector1/1000),
         hh_sector1_bp = hh_sector1*bp_share_in_pp_sector_1,
         hh_sector2 = eemrio_hh_fd*(pm_sector2/1000),
         hh_sector2_bp = hh_sector2*bp_share_in_pp_sector_2)

hbs_alt_method_fd = hbs_alt_method %>%
  select(quintile,
         pps_hh,
         pm_sector1,
         pm_sector2,
         mean_expenditure_share,
         eemrio_hh_fd,
         hh_sector1,
         hh_sector2)

knitr::kable(hbs_alt_method_fd, caption = "Table S11: Same as Table S9 but 
             with total EE-MRIO household final demand in purchaser price 
             instead of base price.", 
             escape = F, 
             booktabs = TRUE,
             col.names = c("quintile",
                           "pps hh", 
                           "s1 (pm)", 
                           "s2 (pm)",
                           "mean exp",
                           "eemrio hh fd",
                           "hh fd s1",
                           "hh fd s2")) %>%
  kable_styling(latex_options = "HOLD_position")

```

Table S12 shows that we then have our EE-MRIO household final demand expenditure decomposed by income quintile. 

```{r tableS12}

eemrio_alt_method = hbs_alt_method %>%
  select(quintile,hh_sector1,hh_sector2) %>%
  gather(sector,value,-quintile) %>%
  spread(quintile,value) %>%
  mutate(sector = dplyr::recode(sector,
                                "hh_sector1" = "1",
                                "hh_sector2" = "2"))

knitr::kable(eemrio_alt_method, caption = "Table S12: EE-MRIO 
             household final demand per quintile and sector.", 
             booktabs = TRUE,
             col.names = c("sector",
                           "q1 fd", 
                           "q2 fd", 
                           "q3 fd",
                           "q4 fd",
                           "q5 fd")) %>%
  kable_styling(latex_options = "HOLD_position")

```

If we now multiply this by the total intensity vector ('TIV'), we obtain the footprint decomposed by income quintile. The total footprint differs from that of the main method, because different amounts of EE-MRIO household final demand expenditure per sector are now multiplied by the same total intensity vector as previously calculated inside the EE-MRIO (compare Table S13 with Table S3). 

```{r tableS13}

footprint_alt_method = data.frame(eemrio_alt_method, TIV) %>%
  mutate(q1_footprint = q1*TIV,
         q2_footprint = q2*TIV,
         q3_footprint = q3*TIV,
         q4_footprint = q4*TIV,
         q5_footprint = q5*TIV,
         total_footprint = q1_footprint +
           q2_footprint +
           q3_footprint +
           q4_footprint +
           q5_footprint)

knitr::kable(footprint_alt_method, caption = "Table S13: EE-MRIO household 
             final demand per quintile and sector multiplied by the TIV to 
             calculate footprint per quintile and sector.",
             booktabs = TRUE,
             col.names = c("sector",
                           "q1 fd",
                           "q2 fd",
                           "q3 fd",
                           "q4 fd",
                           "q5 fd",
                           "TIV",
                           "q1 fp",
                           "q2 fp",
                           "q3 fp",
                           "q4 fp",
                           "q5 fp",
                           "total fp")) %>%
  kable_styling(latex_options = "HOLD_position")

```
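
The reason for the different total footprint can be made explicit by comparing sector totals. The sketch below reuses the toy HBS values and the EE-MRIO household final demand of 300 and 800 from the tables above; the intensity vector `TIV_example` is a hypothetical stand-in and is not the 'TIV' used in Table S13.

```r
# Toy HBS values and EE-MRIO household final demand from the example tables
pps_hh       <- c(q1 = 9, q2 = 11, q3 = 16, q4 = 25, q5 = 28)
pm_sector1   <- c(111, 176, 250, 333, 429)
pm_sector2   <- c(889, 824, 750, 667, 571)
eemrio_hh_fd <- c(s1 = 300, s2 = 800)

# Alternative method: split the overall total by quintile, then by sector
fd_by_quintile <- sum(eemrio_hh_fd) * pps_hh / sum(pps_hh)
alt_sector_totals <- c(
  s1 = sum(fd_by_quintile * pm_sector1 / 1000),
  s2 = sum(fd_by_quintile * pm_sector2 / 1000)
)

# The overall total is preserved, but the sector totals are not:
# roughly 337 and 763 instead of 300 and 800
round(alt_sector_totals, 2)

# Hypothetical total intensities per sector, for illustration only
TIV_example <- c(s1 = 2, s2 = 1)

footprint_main <- sum(eemrio_hh_fd * TIV_example)       # sector totals kept
footprint_alt  <- sum(alt_sector_totals * TIV_example)  # sector totals changed
```

Because the alternative method shifts expenditure towards sector 1 in this toy example, any intensity vector that differs between the two sectors yields a different total footprint.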

# Supplementary Results

- brief description of each graphic. Can then say something tentative about whether or not inequality increased or decreased, the footprints increased or decreased, differences between methods and versions, and then call for more investigation of effects over time. 

## 2005 using main method, EXIOBASE industry-by-industry

```{r load-data, include=FALSE}
# load data wrangling functions
source(here("analysis", "r", "wrangler_functions.R"))

## load result data for EU deciles
eu_q_count = 10

# summary countries aggregated by country quintiles and eu ntile
dat_country_summary_by_cquint_and_euntile = get_country_summary_by_cquint_and_euntile(eu_q_count)
# pivot to long format for plotting and attach readable indicator names
cols_ex = c("year", "iso2", "quint", "eu_q_rank")
pdat_country_summary_by_cquint_and_euntile =
  pivot_results_longer_adorn(dat_country_summary_by_cquint_and_euntile, cols_ex)

# summary of countries by EU quantile without sectoral resolution
dat_country_summary_by_eu_ntile = get_country_summary_by_eu_ntile(eu_q_count)
# pivot to long format for plotting and attach readable indicator names
cols_ex = c("year", "iso2", "eu_q_rank")
pdat_country_summary_by_eu_ntile = 
  pivot_results_longer_adorn(dat_country_summary_by_eu_ntile, cols_ex)

# summary of countries by country quintile with aggregate sectoral resolution
dat_sector_summary_by_country_quintile = get_sector_summary_by_country_quintile(eu_q_count)
# pivot to long format for plotting and attach readable indicator names
cols_ex = c("year", "iso2", "quint", "eu_q_rank", "sector_agg_id")
pdat_sector_summary_by_country_quintile =
  pivot_results_longer_adorn(dat_sector_summary_by_country_quintile, cols_ex)

# summary of eu ntile with aggregate sectoral resolution
dat_sector_summary_by_eu_ntile = get_sector_summary_by_eu_ntile(eu_q_count)
# pivot to long format for plotting and attach readable indicator names
cols_ex = c("year", "eu_q_rank", "sector_agg_id")
pdat_sector_summary_by_eu_ntile =
  pivot_results_longer_adorn(dat_sector_summary_by_eu_ntile, cols_ex)

```

```{r ntiles-total-2005-ixi}

p1 = pdat_country_summary_by_eu_ntile %>%
  filter(year == 2005, 
         indicator == "total_fd_me") %>%
  group_by(eu_q_rank) %>%
  summarise(value = sum(value)*0.000001,
            eu_ntile_name = first(eu_ntile_name)) %>%
  ggplot(aes(x=eu_ntile_name, y=value)) +
    geom_col(position = position_dodge(), fill=pal[1]) +
    theme_minimal() +
    theme(text=element_text(family="Liberation Sans Narrow")) +
    labs(x="", y="Expenditure (trn€)") +
    theme(axis.text.x = element_text(angle = 90))

p2 = pdat_country_summary_by_eu_ntile %>%
  filter(year == 2005, 
         indicator == "total_energy_use_tj") %>%
  group_by(eu_q_rank) %>%
  summarise(value = sum(value)*0.000001,
            eu_ntile_name = first(eu_ntile_name)) %>%
  ggplot(aes(x=eu_ntile_name, y=value)) +
    geom_col(position = position_dodge(), fill=pal[1]) +
    theme_minimal() +
    theme(text=element_text(family="Liberation Sans Narrow")) +
    labs(x="", y="Energy footprint (EJ)") +