Introduction

Key Software & Terms

Accounts and Access

  • PIC Issues: Contact PNNL Computing Help: pic-support@pnnl.gov
  • Complete Training:
    • Need to complete training to receive username for PIC
    • You will receive an email from administration with a link to http://elearner.pnnl.gov
    • In the email you will have a temporary UN and PW
  • Receive PIC Username:
  • Create SecureID PIN:
    • When you arrive you will have received a SecureID token generator. This is a physical device like a USB stick which displays a 6 digit token. This token continuously changes and will become part of your new PW. You will use the secure token to generate a new SecureID PIN
      • Go to https://portal.pnnl.gov
      • Enter your username and in the SecureID Field enter the SecureID token from your token generator. Leave the PW field blank.
      • You will be put into a PIN setting screen, where you enter your choice of PIN twice. PINs must be 6 digits
      • Once your PIN is set, the system will ask for PASSCODE. This PASSCODE will be your PIN followed by the new six digits from the token (which constantly change). For example, if the PIN is xxxxx and 6 digits from token are 123456, the PASSCODE is xxxxx123456.
  • Login to PIC:
    • Now you can login to PIC by opening PuTTY and entering the correct connection settings with your UN and PW.
    • Open PuTTY and do the following:
      • In “Host Name (or IP address)” enter “constance.pnl.gov”
      • In “Port” enter “22”
      • Leave everything else as is and click “Open”
      • A command prompt will open and ask for login. Enter your username fro PIC
      • Then enter your PASSCODE as created in the step above.
PuTTY Configuration

PuTTY Configuration

  • Login to WinSCP:
    • Now you can login to WinSCP and connect to your folder on PIC.
    • After you login you will be able to view folders on your desktop on the left and your folder on the PIC supercomputer on the right.
    • Open WinSCP and do the following:
      • In “Host name” enter “constance.pnl.gov”
      • In “Port number” enter “22”
      • In “User name” enter your username
      • In “Password” enter your PASSCODE
PuTTY Configuration

PuTTY Configuration

  • Access to GCAM folder:
    • You may need access to the GCAM folder on PIC which has all the libraries and files to install.
    • Email pic-support@pnnl.gov to ask for access to the pic/projects/GCAM folder.
  • Remote access:
    • In case remote access is needed users will need to setup a virtual private network.
    • Go to https://remote.pnl.gov to download the Cisco Anyconnect client
    • vpn should be set to: remote.pnl.gov
    • You can login with your US and Passcode

Install Release Version

Compile GCAM on Desktop

Install GCAM on PIC

module load git
module load svn/1.8.13
module load R/3.2.0
module load java/1.8.0_31
module load gcc/5.2.0
  • Clone the core model version from stash
git clone https://github.com/JGCRI/gcam-core.git
# If using a stash version
#git clone https://stash.pnnl.gov/scm/jgcri/gcam-core.git
  • Enter the GCAM WorkSpace:
cd ./gcam-core
# ONLY IF USING A LOCAL STASH VERSION
# On the stash page check the version of GCAM that you want to work with. Ask the PNNL team member you are working with about which branch to use. Checkout the branch:
#git checkout ENTER_BRANCH_NAME_HERE
  • Load Hector
git submodule init ./cvs/objects/climate/source/hector
git submodule update ./cvs/objects/climate/source/hector
  • Build GCAM (Need to Load Modules again)
module load git
module load svn/1.8.13
module load R/3.2.0
module load java/1.8.0_31
module load gcc/5.2.0
make gcam -j 12
  • Note: You need access to GCAM libraries as stated earlier otherwise you will get an error.
  • After GCAm is built you should get a message which says “BUILD COMPLETED” and a date
  • Go to the exe folder and check that the date is the same as the build command output
cd ./exe
ls -l gcam.exe
  • Build Data System
    • Open WinSCP and connect to PIC using username and PASSCODE
    • Does not apply to most users outside of JGCRI. If you are using a stash version and need access to the EIA data then in gcam-core/input/gcam-data-system/energy-data/level0 need to add “en_nonOECD.sv” and “en_OECD.csv”. If you don’t have these files you will need to get these from a staff member at PNNL-JGCRI
    • In PuTTY:
cd ./gcam-core
module load R/3.3.3
make clean
make

Run GCAM

  • Key Files: In gcam-core/exe you will edit the following key files:
    • gcam.zsh
    • configuration.xml (can have another name eg. conifugration_ref.xml)
    • batch.xml (can have other names)
  • gcam.zsh:This file is in BASH and calls the configuration file. It will not be in your cloned version and you need to create or copy it. See the instructions in the “Installing GCAM on PIC”. Within this file will be a line “-Cconfig-$job.xml”. If you change this to “-C$job.xml” then you can call the configuration file by whatever name you call it. This line means the configuration file needs to be named config-NAME.xml where NAME is whatever you choose to name your configuration file.The gcam.zsh file includes the following lines of code (more details are explained in the Installing GCAM on PIC guide): Note: You will need the correct account name to run a GCAM run on PIC. This will be provided by the scientists working in your project.
#!/bin/zsh
#SBATCH -A ACCOUNTNAME
#SBATCH -t 90
#SBATCH -N 1
job=$SLURM_JOB_NAME
echo 'Library config:'
ldd ./gcam.exe
date
time ./gcam.exe -C$job.xml -Llog_conf.xml
date
  • configuration.xml: This will be named by the user and have a format based on how it was defined in the gcam.zsh as explained above. For example it could be config-ref.xml. Within this file a few key items are usually edited. key items to edit (Will need to search for these):
    • batch.xml (Make sure batch.xml is the same name as your batch file)
    • Reference (The Name of your reference scenario)
    • 1 (Make sure this is 1 if a batch file is included)
    • -1 (Leave -1 if want to run for all periods otherwise period 0 is upto 1975, 1 is upto 1990, 2 upto 2005, 3 upto 2010, 4 upto 2015 and so on in increments of 5 years per period. Setting this to 11 will stop the model in and including the year 2050.
  • Run Model: In PuTTY go to your gcam-core/exe folder and then type the following commands. After your run has been submitted the status will show “PD” for pending and “R” for running. After the run has been completed you can check the exe folder for a file named “slurmXXX” where XXX is the JOBID. This is a log of the model run. You can also see the complete log in “master.log”.
# You may have to load the various modules again if you reopen PIC before run
cd ../.././exe   # Need to be in  gcam-core/exe
sbatch -J configuration run-gcam.zsh  # Where "configuration" is the configuration file.
squeue -u USERNAME # to check the status of your run
# scancel JOBID  # This can be used to cancel a job which is in progress
# After run check "slurmXXX" where XX is JOBID  in the exe folder or "main_log.txt" in the exe/logs folder for model run stats.

Run Batch Scenarios

  • Key Files: In gcam-core/exe you will edit the following key files:
    • configuration.xml
    • batch.xml (can have other names)
  • configuration.xml: Change the following lines in the configuration file:
    • batch.xml (Make sure batch.xml is the same name as your batch file)
    • 1 (Make sure this is 1 if a batch file is included)
  • batch.xml: This file is used to run combination of multiple scenarios. In this file we include various “add-on” xml files which then replace existing data. This file may also not be in the cloned version and will need to be created or copied.It should also be place in the exe folder.
    • In this batch file we have “ComponentSet” and the “FileSet”. The component sets can be thought of as broader themes and the files within each component are different scenarios to be explored. Each element or “FileSet” within each component will be combined with elements from all other components. So in the example above the model will run GasRef+Theme2Ref, GasRef+Theme2Changes, GasHi+Theme2Ref and GasHi+Theme2Changes. The file set names are appended together.
    • Within each file set we include the “add-on” xml files which overwrite existing data inputs. These can be new population projects, carbon policies or caps, changes in technology prices or changing the shareweights of technologies like nuclear or coal for example. These changes are discuss in a separate section below.
      • The following syntax is used:
<!-- To comment out code -->

<BatchRunner xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="batch_runner.xsd">

<ComponentSet name="Gas">
    <FileSet name="GasRef">
    </FileSet>
    <FileSet name="GasHi">
        <Value name = "x">../input/policy/YOURPOLICY.xml</Value>
    </FileSet>
</ComponentSet>
 
 
<ComponentSet name="Theme2">
    <FileSet name="Theme2Ref">
    </FileSet>
    <FileSet name="Theme2Changes">
        <Value name = "x">../input/policy/YOURPOLICYTHEME2.xml</Value>
    </FileSet>
</ComponentSet>
 
</BatchRunner>

Build Batch Scenarios

  • In the batch file in each “FileSet” we can add the path to an “.xml” file which basically overwrites the existing data with new data.
  • The “add-on” .xml files are usually kept in gcam-core/input/policy. You will need to find out which .xml file needs to be changed to create the new policy depending on what you are trying to achieve. This is explained further in the section below.
  • Two ways to create these are:
    1. Make a copy of a known xml file that you want to change and then edit this by opening it in XML maker.
    2. Using Level 2 data:
      • Find out which data file needs to be changed. For example for not allowing any new nuclear technology to be installed in the system you can change the sahreweight of nuclear to 0 for the years that you want. This is changed in the file L223.SubsectorShrwt_nuc which is lying in gcam-core/input/energy-data/level2/.
      • Make a copy of this csv file in the gcam-core/input/policy folder and append an appropriate name.
      • Edit the csv by removing all the data that is not relevant and changing the values for the year, region and technology that is relevant to your case.
      • Then open Model Interface and open the modified .csv file.
      • Model Interface will ask for a header file. Point it to: “gcam-core-data-system_common_headers.txt"
      • Save the file as an .xml.
      • The path to this file can now be added in your bacth.xml as part of your scenario “FileSet”.

Key files to Modify Scenarios

Theme Description Files
Population
  • Change the level 2 gSSP2 data
  1. \input\gcam-data-system\socioeconomics-data\level2\L201.Pop_gSSP2.csv
Energy Technology Mix
  • After the base year has been calibrated, investments in new technologies are based on a logit function which has a shareweight. By changing the shareweight for a particular technology we can control how much of that technology is invested in future years. By making the shareweight ‘0’ none of the technology will be invested in while increasing the shareweight means more will be installed.
  1. \input\gcam-data-system\energy-data\level2\L223.SubsectorShrwt_nuc
  1. \input\gcam-data-system\energy-data\level2\L223.SubsectorShrwt_coal
Irrigation Expansion
Agricultural Production
NDCs and Emissions

View Data Model Interface

  • Once GCAM has run it will output all its data into a database to the gcam-core/output folder. The name of the database by default is database_basexdb but this can be changed in the configuration file described above.
  • For viewing the data there is a tool called “Model Interface” which is found in any of the release versions of GCAM. This can be downloaded from the github page https://github.com/JGCRI/gcam-core/releases. The “Model Interface” can be found in gcam-core/ModelInterface. Follow the instruction on the github page to install the release version. It may need additional installations of java, XML Maker and R.
  • Use WinSCP to drag the output database from your PIC output folder to a local folder on your computer. You will view the data offline locally on your computer.
  • The “Model Interface” read in different queries which are based on a query file which is an .xml file.
  • This query file is defined in gcam-core/ModelInterface in the “model_interface.properties”
  • The Main_queries file lying in “gcam-core/output/queries” can be modified to get relevant queries for the analysis.
  • Inside gcam-core/ModelInterface:
    • click on “run-model-interface.bat”
    • Click File>Open>DB Open and then point to the output database that you copied from PIC into your local drive.
    • This will open the ModelInterface and it will show you all the scenarios you ran from your batch.xml, the different regions and a list of queries as shown below.
    • To edit the query file being read open the “model_interface.properties” file and change the line which says: “.._queries.xml"
PuTTY Configuration

PuTTY Configuration

View Data in R

In order to view data in R the following is needed:

  • A folder where the R script is saved
  • A folder for outputs (can be the same as the R script folder)
  • A gcam database with the full location address
  • A query file which has been formatted correctly to be read by R
library(ggplot2);library(dplyr); library(rgcam) # Install packages first if necessary

wd0   <- "D:/Example/example_gcamFolder/output"  # Folder for save outputs
setwd(wd0)

myDB  <- "database_basexdb"          # Name of GCAM database
wddb  <- "D:/Example/example_gcamFolder/output"  # Location of GCAM database
connx <- localDBConn(wddb, myDB)    # Connect to database

# Running connx will give a list of the scenarios in the gcam database
scenariosAnalyze<-c("Scenario1","Scenario2") # Choose and create list of scenarios to analyze
myQueryfile <- "D:/Example/example_Folder/myQuery.xml"  # Full path to R formatted query.xml file

reReadData=0;
if(reReadData==1){
# Create a proj file and add data from each chosen scenario
for (scenario_i in scenariosAnalyze){
queryData.proj<<-addScenario(conn=connx, proj="queryData.proj",scenario=scenario_i,queryFile=myQueryfile)  # Check your queries file
}} else {queryData.proj<-loadProject("queryData.proj")
  }

lscen <-listScenarios(queryData.proj);lscen  # List of Scenarios that were read
lquer <-listQueries(queryData.proj);lquer    # List of queries that were read
getQuery(queryData.proj,"ag production by tech")   # Example to view a query


#Choose your parameter
param <- c("ag production by tech") # Choose your parameters

# Get Data
df <- getQuery(queryData.proj, param)%>%
  filter(region=="Argentina")

#Create Plot (Adjust x and y axis according to columns in df)
plot <- ggplot(data=df,aes(x=year,y=value,group=sector,fill=sector)) + 
  geom_bar(aes(),stat="identity",position="stack") + 
  facet_wrap(~scenario)

# Save Plot (Can change resolution and dimensions)
png(paste(param,".png",sep=""),width=13,height=9, units="in",res=300)
print(plot)
dev.off()

# Save Table
write.csv(df,file=paste(param,".csv",sep=""),row.names=F)
Example Figure

Example Figure

Input Data Sources & Files

Population

Historical: Historical population from 1975-2010 based on data from Maddison_population.csv and UN_popTot.csv

Future: Future population calculated as final historical year (2010) population times ratio of SSP future years to SSP 2010 from the IIASA SSP_database_v9.csv

GDP

Historical: Historical GDP from 1975-2010 is based on data from the USDA_GDP_MER.csv

Future: For SSP scenarios (without g) SSP GDP projections from SSP_database_v9.csv are scaled to match values from the historical year 2010. For the gSSP scenarios: IMF growth projections from IMF_GDP_growth.csv are applied from 2010-2020 and then SSP projections are scaled to match the 2020 values resulting from this process.

Sector Specific FAQ

This section is a summary of the key points from the more complete documentation available at: * GCAM User Documentation : http://jgcri.github.io/gcam-doc/ * GCAM Documentation TOC: http://jgcri.github.io/gcam-doc/toc.html

WATER

For futher details on water in GCAM: http://jgcri.github.io/gcam-doc/water.html

  • What are “Municipal” Water Withdrawals?
    GCAM uses the FAO definition of municipal water demands. From Hejazi et al. 2013 : The FAO defines municipal water as water that is provided by the public network system; thus, municipal water withdrawal includes domestic water use (drinking, cooking and cleaning) along with water used for urban industry, urban landscaping (including gardens) or urban irrigated agriculture.
    • [Hejazi et al. 2013] Hejazi, M., Edmonds, J., Chaturvedi, V., Davies, E., & Eom, J. (2013). Scenarios of global municipal water-use-demand projections over the 21st century. Hydrological sciences journal, 58(3), 519-538.
  • What does water use for the mining sector include?
    In GCAM the mining sector comprises primary energy processing which includes coal, oil, gas, ethanol and coal-to-liquids. From Hejazi et al. 2014: Water used for producing and processing primary energy is also dis-aggregated from the AQUASTAT industrial water use, using estimates derived from bottom-up water demand intensities assigned to various activities in the GCAM energy sector. Specifically, we assess water demands of the following processes: oil production and refining (conventional and unconventional), coal mining, gas production and processing, ethanol production, and coal-to-liquids fuel production
    • [Hejazi et al. 2014] Hejazi, M., Edmonds, J., Clarke, L., Kyle, P., Davies, E., Chaturvedi, V., … & Moss, R. (2014). Long-term global water projections using six socioeconomic scenarios in an integrated assessment modeling framework. Technological Forecasting and Social Change, 81, 205-226.
  • Does GCAM include blue, green and gray water footprints?
    As of November 2018 grey water is not tracked in GCAM. We only track blue and green water.

  • How are water withdrawals and consumption calculated in each sector?
    Water withdrawals and consumption is calculated for six different sectors in GCAM (Municipal, Mining, Livestock, industry, Electricity and Agriculture). Details for each will be added here. TO DO!!

LAND

For futher details on water in GCAM: http://jgcri.github.io/gcam-doc/aglu.html

Type Description
Pasture (Grazed) x
Pasture (Other) x
Grass x
  • Why is soy missing?
    We will investigate this further to see if we can distinguish soy from other crops in the model. TO DO!!

  • What is included in Biomass?
    Luckow et al. 2010 discusses the details of biomass crops as used in GCAM. Biomass crops represent purpose grown third generation biomass crops grown explicitly for bioenergy feedstocks. A representative cellulosic bioenergy crop based on switchgrass is used for the analysis.
    • [Luckow et al. 2010] Luckow, P., Wise, M. A., Dooley, J. J., & Kim, S. H. (2010). Large-scale utilization of biomass energy and carbon dioxide capture and storage in the transport and electricity sectors under stringent CO2 concentration limit scenarios. International Journal of Greenhouse Gas Control, 4(5), 865-877.
  • What are the “Sparse” and “Shurb” categories?
    Le Page et al. 2016 discusses the different kinds of land-use land cover (LULC) types used in GCAM. These include 22 different types of LULC (5 types of natural ecosystems and 17 types of managed land). The different classifications are based on MODIS data from Friedl et al. 2010 which are based on the land-use classification in Strahler e al. (1999). These are then grouped into the seven different land-use type you see in the presentation.
    The Sparse category includes land-use which is defined as Rock, Ice or Desert in the original GCAM LULC types which corresponds to the Snow and Ice and Barren categories from Friedl et al. 2010 in Table 1. These are defined as (Strahler et al. 1999):
    1. Snow and Ice=Lands under snow/ice cover throughout the year
    2. Barren=Lands with exposed soil, sand, rocks, or snow and never has more than 10% vegetated cover during any time of the year.
    3. The Shrubs category corresponds to the closed and open shrublands category from Friedl et al. 2010 which are defined as Lands with woody vegetation less than 2 meters tall. The shrub foliage can be either evergreen or deciduous.
    • [Le Page et al. 2016] Le Page, Y., West, T. O., Link, R., & Patel, P. (2016). Downscaling land use and land cover from the Global Change Assessment Model for coupling with Earth system models. Geoscientific Model Development, 9(9), 3055.
    • [Friedl et al. 2010] Friedl, M. A., Sulla-Menashe, D., Tan, B., Schneider, A., Ramankutty, N., Sibley, A., & Huang, X. (2010). MODIS Collection 5 global land cover: Algorithm refinements and characterization of new datasets. Remote sensing of Environment, 114(1), 168-182.
    • *[Strahler et al. 1999] Strahler et al. (1999). MODIS Land Cover Product Algorithm Theoretical Basis Document (ATBD) Version 5.0 MODIS Land Cover and Land-Cover Change https://modis.gsfc.nasa.gov/data/atbd/atbd_mod12.pdf*
  • How is pasture calculated?
    In GCAM the world is divided into 283 sub-regions based on the Agro-Economic Zones (AEZs) from the work performed in the GTAP project. In each of these regions the land is categorized into different land-use types based on a nested structure as shown in the figure below:

Sources for all data types mentioned in the following paragraphs are provided in the end.

  1. Production of all crops and forestry products are based on FAO and GTAP data.
  2. Consumption of animal feed is based on IMAGE data and consumption of all other products is based on FAO data.
  3. Land-use is based on data from the Hurtt-Hyde land-cover product from CMIP5, SAGE potential vegetation map and FAO/GTAP cropland area. The historical data is calibrated up to the year 2010. After this the model allows land-use to change based on changes in different parameters such as prices of crops and demands for meat etc. and elasticity to these parameters expressed as logit exponents. As seen in the diagram Pasture is a category of Arable Land which is divided into Intensively Grazed Pasture and Other Pasture. Intensively Grazed Pasture is pasture which feeds marketed livestock while Other Pasture is grass feed for other livestock.