Merge pull request #70 from dklein-pik/develop

Update tutorials.

Unverified Commit b0272892 authored 5 years ago by Lavinia Baumstark Committed by GitHub 5 years ago
--- a/tutorials/1_GettingREMIND.md
+++ b/tutorials/1_GettingREMIND.md
-This guide will give you a brief technical introduction in how to run and use the model REMIND.
+Install the REMIND model and all software/data required
 ================
-Felix Schreyer (<felix.schreyer@pik-potsdam.de>), Lavinia Baumstark (<baumstark@pik-potsdam.de>)
-30 April, 2019
+Anastasis Giannousakis (<giannou@pik-potsdam.de>), Felix Schreyer (<felix.schreyer@pik-potsdam.de>)
+16 February, 2020

-Introduction
+HOW TO INSTALL
 --------------

-As normal runs with REMIND take quite a while (from a couple of hours to several days), you normally don't want to run them locally (i.e., on your own machine) but on the cluster provided by the IT-services. The first step is to access the Cluster. In general, there are three ways how to access the Cluster:
-	
-1. Putty console 
-2. WinSCP 
-3. Windows Explorer, click on network drive (only possible if you are in PIK LAN)
+To get the REMIND code you need to have git installed and then clone the model from <https://github.com/remindmodel/remind.git>.

-They all have their upsides and downsides. Don't worry! If they are new to you, you will figure out what is best for which kind of task after some time and get more famliar just by your practice. Using either Putty or the network drive in Windows Explorer, the first step is:
+REMIND requires *GAMS* (<https://www.gams.com/>) including licenses for the solvers *CONOPT* and (optionally) *CPLEX* for its core calculations. As the model benefits significantly from recent improvements in *GAMS* and *CONOPT4* it is recommended to work with the most recent versions of both. Please make sure that the GAMS installation path is added to the PATH variable of the system:

-adjust .Rprofile
-----------------
-First, log onto the Cluster via WinSCP and open the file `/home/username/.profile` in a text editor. Add these two lines and save the file.
+-   the easiest way is to check the "Use advanced installation mode" box at the beginning of the installation; at a later step, tick the checkbox that adds the GAMS path to your PATH variable
+-   you can also edit your computer's advanced settings and add the GAMS path to the PATH variable manually (applies also if GAMS is installed but not included in PATH).

-``` bash
-module load piam 
-umask 0002
-```
-This loads the piam environment once you log onto the Cluster via Putty the next time. This envrionment will enable you to manage the runs that you do on the Cluster. Next, you need to specify the kind of run you would like to do. 
-   	
-Find a Place to Start
-----------------------
+This tutorial shows how to check your PATH variable and add entries to it: <https://www.youtube.com/watch?v=5P9EDJwfXBo>

-Create a folder on the Cluster where you want to store REMIND. It is recommended not to use the `home` directory. For your first experiments you can use the /p/tmp/YourPIKName/ directory (only stored for 3 months) and create a following folder:
+Please add the GAMS training license you have been provided (gamslice.txt) by saving the file to your local GAMS folder; under Windows this is something like `C:\Program Files (x86)\GAMS\28.2`.

-``` bash
-p/tmp/YourPIKName/REMIND
-```
-(in case you are using Putty and are not familiar with unix commands, google a list of basic unix commands, you will need `mkdir` to create a folder). Go inside this folder.
+In addition, *R* (<https://www.r-project.org/>) is required for pre- and postprocessing and run management (it needs to be added to the user's PATH variable as well). It is also recommended to install RStudio (<https://www.rstudio.com>).

-Now, you need to download REMIND into this folder. The download works via git. In this way, different people can develop the model simultaneously and changes can be traced back and undone in case the merged code does not work as it is supposed to. Cloning a new REMIND version via git is always possible. However, before pushing your changes to the common version for the first time, please talk to the research software engineering group. They are happy to give you an introduction.
+For R, some packages are required to run REMIND. All are distributed either via the official R CRAN or via a separate repository hosted at PIK (PIK-CRAN). Before proceeding, PIK-CRAN should be added to the list of available repositories via:

-Cloning REMIND
--------------------
+``` r
+options(repos = c(CRAN = "@CRAN@", pik = "https://rse.pik-potsdam.de/r/packages"))
+```

-To clone REMIND via Windows Explorer: Right-click in your REMIND folder and choose `git clone` (if not availble, install tortoise git or ask the RSE group). Insert <https://gitlab.pik-potsdam.de/REMIND/REMIND}>
-as `URL of repository`. On command line use
+After that all remaining packages can be installed via `install.packages`

-``` bash
-			git clone git@gitlab.pik-potsdam.de:REMIND/REMIND.git
+``` r
+pkgs <- c("gdxrrw",
+          "ggplot2",
+          "curl",
+          "gdx",
+          "magclass",
+          "madrat",
+          "mip",
+          "lucode",
+          "remind",
+          "lusweave",
+          "luscale",
+          "goxygen",
+          "luplot",
+          "shinyresults")
+install.packages(pkgs)
 ```

-and hit enter. This will download the REMIND version in the current folder.
+For post-processing model outputs, *LaTeX* is required (<https://www.latex-project.org/get/>). To be seen by the model, it also needs to be added to the PATH variable of your system.

+If the following lines of code are executed without error, then you are all set!

-Great, you now have REMIND!
+``` r
+system("gams")
+library(gdxrrw)
+library(remind)
+print("")
+if(.Platform$OS.type == "unix") {
+  system("which pdflatex")
+} else {
+  system("where pdflatex")
+}
+```
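The same PATH check can be done from a Unix-like shell (for example on the cluster or in Git Bash) before starting R. The following is only a minimal sketch: the helper names `check_tool` and `check_all` are hypothetical, and the tool list in the usage comment is an assumption derived from the requirements listed in this tutorial. On a plain Windows console, `where` plays the role of `command -v`.

```bash
# Report whether a single tool is reachable through PATH.
check_tool() {
  # command -v prints the resolved path and returns non-zero if nothing is found
  if command -v "$1" >/dev/null 2>&1; then
    echo "found:   $1 -> $(command -v "$1")"
  else
    echo "missing: $1 (not on PATH)"
    return 1
  fi
}

# Check several tools; return non-zero if at least one is missing.
check_all() {
  status=0
  for tool in "$@"; do
    check_tool "$tool" || status=1
  done
  return "$status"
}

# Usage (not run automatically):
#   check_all git gams Rscript pdflatex
```

A tool reported as missing usually means that its installation folder still has to be added to the PATH variable as described above.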

+NOTE: If the model fails to start from the Windows console, try starting it from within RStudio.
--- a/tutorials/2_RunningREMIND.md
+++ b/tutorials/2_RunningREMIND.md
 Start running REMIND with default settings
 ================
-Felix Schreyer (<felix.schreyeru@pik-potsdam.de>), Lavinia Baumstark (<baumstark@pik-potsdam.de>)
+Felix Schreyer (<felix.schreyer@pik-potsdam.de>), Lavinia Baumstark (<baumstark@pik-potsdam.de>), David Klein (<dklein@pik-potsdam.de>)
 30 April, 2019

 -   [1. Your first run](#Your first run)
-    -   [Default Configurations](#Default_Configurations)
+    -   [Default configurations](#Default_Configurations)
    -   [Configuration with scenario_config.csv](#configuration_with_scenario_config)
-    -   [Starting the Run](#Starting the Run)
+    -   [Starting the run](#Starting the Run)
 -   [2. What happens during a REMIND run?](#2. What happens during a REMIND run?)
-   [3. What happens once you start REMIND on the Cluster? ](#3. What happens once you start REMIND on the Cluster? )
-    -   [a) Input Data Preparation](#Input Data Preparation)
+-   [3. What happens once you start REMIND on the cluster? ](#3. What happens once you start REMIND on the cluster? )
+    -   [a) Input data preparation](#Input Data Preparation)
    -   [b) Optimization](#Optimization)
-    -   [c) Output Processing](#Output Processing)
+    -   [c) Output processing](#Output Processing)


 1. Your first run
@@ -23,20 +23,20 @@ This section will explain how you start your first run in REMIND.
 Default Configurations (config/default.cfg)
 -------------------------------------------

-The **default.cfg** file is divided into four parts: MODULES, SWITCHES, FLAGS, and Explanations of Switches and Flags.
+The **default.cfg** file is divided into four parts: MODULES, SWITCHES, FLAGS, and Explanations of switches and flags.

-a. The first part, MODULES, contains the various modules used in REMIND and various realisations therein. Various realisations within the particular module differ from each other in their features, for e.g., bounds, parametric values, different policy cases etc. Depending on the module, you see set which one to use as default (# def), and which one you can change for your current run 
+a. The first part, MODULES, contains settings for the various modules and their realizations. The realizations within a module differ from each other in their features, e.g., bounds, parametric values, or policy cases. For each module you choose which realization is activated for your current run:

 ``` bash
-cfg$gms$<module name
+cfg$gms$module_name
 ```

-b. The SWITCHES and FLAGS section are various settings to control, for e.g., how many iterations to run, which technologies to run, which SSP to use, start and end year of model run etc. See the fourth section, explanations of switches and flags, to know more. 
+b. The SWITCHES and FLAGS sections contain settings that control, e.g., how many iterations to run, which technologies to include, which SSP to use, and the start and end year of the model run. See the fourth section, explanations of switches and flags, to learn more.

 Configuration with scenario_config.csv
 -------------------------------------------

-The folder **config** in your REMIND folder contains a number of csv files that start with **scenario_config**. Those files are used to start a set of runs each with different configurations. In your REMIND folder on the Cluster, open **config/scenario_config.csv** in Excel. The scenario_config files have `;` as their delimiter. In Excel, you might need to select the first column, choose *Data* and *Text to Columns* and set the right delimiter `;` to see the csv-file spread over the columns.   
+The folder **config** in your REMIND folder contains a number of csv files that start with **scenario_config**. Those files are used to start a set of runs each with different configurations. In your REMIND folder on the cluster, open **config/scenario_config.csv** in Excel. The scenario_config files have `;` as their delimiter. In Excel, you might need to select the first column, choose *Data* and *Text to Columns* and set the right delimiter `;` to see the csv-file spread over the columns.   
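If Excel is not at hand, a scenario_config file can also be inspected in the terminal. The sketch below lists the runs that are switched on. It assumes only what this tutorial states, namely a `;` delimiter and a header row with the columns `title` and `start`; the function name `list_active_runs` is hypothetical.

```bash
# Print the title of every run whose "start" column is 1.
# Columns are looked up by header name, so their position does not matter.
list_active_runs() {
  awk -F';' '
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i # map header name -> column index
      if (!("title" in col) || !("start" in col)) {
        print "header must contain title and start" > "/dev/stderr"
        exit 1
      }
      next
    }
    $col["start"] == 1 { print $col["title"] }
  ' "$1"
}

# Usage (not run automatically):
#   list_active_runs config/scenario_config.csv
```

Looking columns up by name rather than by position keeps the sketch usable for config files with a different number of columns.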

 In the config file, each line represents a different run. The `title` column labels the runs. The more runs you have, the more important it becomes to label them so that you can easily remember the specific settings you chose for each run. The `start` column lets you choose whether or not to start this run once you submit this config file to the modeling routine. It often makes sense to keep some runs in the csv file to remember their configurations for the next time, although you do not want to run them now and therefore switch them off. You do this by setting `start` to 0. The rest of the columns are the configurations that you can choose for the specific runs. You will see different config files with a different number of columns. If a specific setting is not specified in the config file, it takes a default value from the file **config/default.cfg**. Let us not worry too much about the many configurations you can do here. We will focus on this another time. We will start just one run:

@@ -53,48 +53,80 @@ Save the config file as a csv file with `;` as delimiter. You can check that, fo

 To finally start REMIND with this config file, you need to run the R-script ***start_bundle.R*** on the cluster on this config file. For this:

-Starting the Run
+Accessing the cluster
 ------------------
-Open a Putty session on the Cluster. Go to your REMIND folder (i.e. you have "config", "core", and "modules" as subfolders) and type directly:
+
+As normal runs with REMIND take quite a while (from a couple of hours to several days), you normally don't want to run them locally (i.e., on your own machine) but on the cluster provided by the IT-services. The first step is to access the cluster. In general, there are three ways to access the cluster:
+	
+1. Putty console 
+2. WinSCP 
+3. Windows Explorer, click on network drive (only possible if you are in PIK LAN)
+
+They all have their upsides and downsides. Don't worry! If they are new to you, you will figure out over time which is best for which kind of task and become more familiar with them through practice. Using either Putty or the network drive in Windows Explorer, the first step is:
+
+Adjust .profile
+-----------------
+First, log onto the cluster via WinSCP and open the file `/home/username/.profile` in a text editor. Add these two lines and save the file.
+
+``` bash
+module load piam 
+umask 0002
+```
+This loads the piam environment the next time you log onto the cluster via Putty. This environment will enable you to manage the runs that you do on the cluster. Next, you need to specify the kind of run you would like to do. 
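If you prefer the command line to a text editor, the two lines can also be appended from a shell session on the cluster. This is a sketch only: the helper `add_line_once` is hypothetical, and it assumes that the file ends with a newline. It adds a line only when that exact line is not present yet, so running it twice does not duplicate entries.

```bash
# Append a line to a file only if that exact line is not there yet.
add_line_once() {
  file=$1
  line=$2
  touch "$file"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (not run automatically):
#   add_line_once "$HOME/.profile" "module load piam"
#   add_line_once "$HOME/.profile" "umask 0002"
```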
+   	
+Start the run
+-----------------------
+
+Open a Putty session on the cluster and create a folder where you want to store REMIND. It is recommended not to use the `home` directory. For your first experiments you can use the /p/tmp/YourPIKName/ directory (files are only stored for 3 months) and create the following folder: `/p/tmp/YourPIKName/REMIND`
+
+If you are using Putty and are not familiar with Unix commands, look up a list of basic Unix commands; you will need, e.g., `mkdir` to create a folder. Go into this folder and download REMIND into it via git clone (see above and tutorial 0_Git_and_GitHub_workflow).
+
+Go to your REMIND main folder (i.e. you have "config", "core", and "modules" as subfolders) and start a REMIND run by typing:

 ``` bash
 Rscript start.R
 ```
-Also, on Windows, you can double-click the **start.cmd** file.
-NOTE: In order to use those scripts on local machines, you have to have R installed on your machine.
-Don't forget to update the R libraries from time to time (explained in the Wiki page above, you need to do it only on local machines, on the cluster it happens automatically)
+Without additional arguments this starts a single REMIND run using the settings from config/default.cfg. On Windows, you can also double-click the `start.cmd` file. NOTE: To use these scripts on a local machine, you need to have R installed. Remember to update the R libraries from time to time (explained in the Wiki page above; this is only necessary on local machines, on the cluster it happens automatically).
+
+Control the script's behavior by providing additional arguments:

-For starting one single or a bundle of runs via scenario_config.csv you use the file **start_bundle.R** and type: 
+Starting a single REMIND run in OneRegi mode using the settings from config/default.cfg (useful to quickly check if your changes to the code break the model):

-``` r
-nohup Rscript start_bundle.R config/scenario_config.csv &
+``` bash
+Rscript start.R --testOneRegi
 ```
-Now, keep your fingers crossed that everything works as it should.The process of your job submission is documented in the file nohup.out that you created with the nohup command. After a couple of minutes, you should see something like `Submitted Batch Job ...` in the nohup.out file. This means that your run has been started. To see how far your run is or whether it was stopped due to some problems, go to the `Output` folder and type 

+Starting a bundle of REMIND runs using the settings from a scenario_config_XYZ.csv:
+ 
 ``` bash
-rs
+Rscript start.R config/scenario_config_XYZ.csv
 ```
-into the Putty console. For more commands to manage your runs, type **piaminfo**. 

-NOTE: A few words on the scripts that we currently use to start runs. The scripts containing the string 'start' have a double functionality:
- they create the full.gms file and compile the needed files to start a run in a subfolder of the output folder
- they submit the run to the cluster or to your GAMS system if you work locally
-The scripts containing the string 'submit' have only the second of the above funcionalities, that's why they are found only inside the run folders and are used directly from there.
+Note: Please do not make changes to the REMIND code until the last run has started running GAMS (including subsequent runs).

-Shortly, a message will be displayed specifying the number of job submission; the message will look similar to the following:
+A message similar to the following confirms that your run has been submitted to the cluster: `The job "cwsa.iplex.pik-potsdam.de.65539" has been submitted.`

+You can check if the run has been accepted by the cluster just by using the command 

 ``` bash
-The job "cwsa.iplex.pik-potsdam.de.65539" has been submitted.
+sq
 ```
+in the terminal.

-You can check if the running has been accepted by the cluster just by using the command 
+To see how far your run is or whether it was stopped due to some problems, go to the `Output` folder and type 

 ``` bash
-squeue -u yourusername
+rs
 ```
-in the terminal.
-	
+into the Putty console. For more commands to manage your runs, type **piaminfo**. 
+
+NOTE: A few words on the scripts that we currently use to start runs. The scripts containing the string 'start' have a double functionality:
+- they submit the run to the cluster or to your GAMS system if you work locally
+- they create the full.gms file and compile the needed files to start a run in a subfolder of the output folder
+
+
+
+
 2. What happens during a REMIND run?
 =====================================
 	
@@ -107,14 +139,14 @@ REMIND modeling routine
 </p>
 	

-3. What happens once you start REMIND on the Cluster? 
+3. What happens once you start REMIND on the cluster? 
 =======================================================

-First, a number of R libraries like **madrat**, **moinput** and **remind** are loaded into your cache on the Cluster. These libraries were and are still developed at PIK. They contain the functions necessary for the input data preparation and the output processing. Let us go through each of the stages and briefly describe what happens:
+First, a number of R libraries like **madrat**, **moinput** and **remind** are loaded into your cache on the cluster. These libraries were and are still developed at PIK. They contain the functions necessary for the input data preparation and the output processing. Let us go through each of the stages and briefly describe what happens:
 	
 a) Input Data Preparation
 --------------------------
-The optimization in REMIND requires a lot of input data. For example, the model needs to know energy production capacities per region for its initial time steps. Furthermore, it builds on GDP, population and energy demand projections that are results of other models. These kind of data are stored on the Cluster in
+The optimization in REMIND requires a lot of input data. For example, the model needs to know energy production capacities per region for its initial time steps. Furthermore, it builds on GDP, population and energy demand projections that are results of other models. This kind of data is stored on the cluster in
 		
 ``` bash
 /p/projects/rd3mod/inputdata/sources.

--- a/tutorials/3_RunningBundleOfRuns.md
+++ b/tutorials/3_RunningBundleOfRuns.md
 Running more than one REMIND scenario
 ================
+
+Starting a bundle of REMIND runs using the settings from a scenario_config_XYZ.csv:
+ 
+``` bash
+Rscript start.R config/scenario_config_XYZ.csv
\ No newline at end of file
--- a/tutorials/4_RunningREMINDandMAgPIE.md
+++ b/tutorials/4_RunningREMINDandMAgPIE.md
 Running REMIND and MAgPIE in coupled mode
 ================
+David Klein (<dklein@pik-potsdam.de>)
+16 February, 2020
+
+### Clone the models
+
+```bash
+git clone https://github.com/magpiemodel/magpie.git
+git clone https://github.com/remindmodel/remind.git
+```
+
+### Switch to relevant branches
+
+For both models switch to the git branches you want to use for your runs.
+
+### Create snapshot of R libraries
+
+Coupled runs may take a bit longer. During the runtime some of the R packages that are used by the coupled runs might get updated.
+Updates might change the behaviour of functions in an unexpected way. To avoid this create a snapshot of the R libraries before starting
+the runs:
+
+```bash
+bash /p/projects/rd3mod/R/libraries/Scripts/create_snapshot_with_day.sh
+```
+
+### Activate snapshot for REMIND and MAgPIE
+
+Direct both models to the snapshot you created above by editing the .Rprofile in each model's main folder.
+
+### Configure start_bundle_coupled.R 
+
+See comments in the head section of the file.
+
+### Configure the scenario_config_coupled.csv of your choice
+
+### Use the latest GAMS version
+
+This step is optional.
+
+```bash
+module load gams/30.1.0
+```
+
+### Perform test start before actually submitting runs
+
+```bash
+Rscript start_bundle_coupled.R test
+```
+
+### After checking that the coupling script finds all gdxes and mifs, start the runs
+
+```bash
+Rscript start_bundle_coupled.R
+```
\ No newline at end of file