NOAA Observation Station Analysis

Creating Station Indicators



Project Synopsis

According to a report published on January 18, 2017 by the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA):

…(the) Earth’s 2016 surface temperatures were the warmest since modern record keeping began in 1880.

Globally-averaged temperatures in 2016 were 1.78 degrees Fahrenheit (0.99 degrees Celsius) warmer than the mid-20th century mean. This makes 2016 the third year in a row to set a new record for global average surface temperatures.

Source: https://www.nasa.gov/press-release/nasa-noaa-data-show-2016-warmest-year-on-record-globally


The 2016 Weather Data Exploratory Analysis project was started to review the raw data from NOAA and identify areas of uncertainty and their potential impact on reaching a greater than 95% scientific certainty.

This is Part 7 of the 2016 Weather Data Exploratory Analysis.


This project is designed to add geographic indicators to the master station table in order to facilitate temperature data analysis in upcoming projects. In the previous project, Processing 1900-2016 Temperature Data, we found a relatively incomplete temperature record for all of the years analyzed. That is, half of all years analyzed failed to reach a 90% complete observation rate, and no year achieved a 95% completion rate.

In order to understand and quantify the impact of the missing data, it will be useful to summarize the findings by geographic means. In previous projects, we have reviewed station data by hemisphere, quadrant, temperature zone, and by global sector. As we begin to analyze the actual temperature data recorded, having these types of geographic indicators as part of the master station table will simplify the temperature analysis and speed the summarization of data.

The goals of the project are to update the master station data with the following geographic indicators:

  • Hemisphere (Hemi)
  • Quadrant (Quad)
  • Temperature zone (Zone)
  • Global sector (Sect)


Libraries Required

library(dplyr)          # Data manipulation
library(knitr)          # Dynamic report generation
setwd("~/NOAA Projects/NOAA Station Analysis")

Stations Data

The Stations Data to be used is the worldwide master list created in Part 1 of the 2016 Weather Data Exploratory Analysis.


Read Station Files

A brief review of the source data will follow the loading of the master station file.

Read Master Station Data

master_stations_ww <- readRDS("~/Temp Stations/master_stations_ww.rds")
ID           Latitude  Longitude  Elevation  Country               State  Location               FirstYear  LastYear
ACW00011604   17.1167   -61.7833      33.14  ANTIGUA AND BARBUDA   NA     ST JOHNS COOLIDGE FLD       1949      1949
ACW00011647   17.1333   -61.7833      62.99  ANTIGUA AND BARBUDA   NA     ST JOHNS                    1961      1961
AE000041196   25.3330    55.5170     111.55  UNITED ARAB EMIRATES  NA     SHARJAH INTER. AIRP         1944      2017
AEM00041194   25.2550    55.3640      34.12  UNITED ARAB EMIRATES  NA     DUBAI INTL                  1983      2017
AEM00041217   24.4330    54.6510      87.93  UNITED ARAB EMIRATES  NA     ABU DHABI INTL              1983      2017

Data Processing

The master station data will be processed in four steps that match the processes from the previous projects. The results will be stored in a new master stations table called master_stations_in in order to maintain continuity of data tables throughout the series of projects. The new table will be saved and used in future projects.


Hemisphere Processing

  • Northern Hemisphere consists of stations with latitude greater than 0 degrees
  • Southern Hemisphere consists of stations with latitude less than 0 degrees

No station has a latitude of exactly 0, so these two definitions cover every station.

Add Hemisphere Indicator

master_stations_in <- master_stations_ww %>%
                      mutate(Hemi = ifelse(Latitude > 0, "N", "S"))
ID           Latitude  Longitude  Elevation  Country               State  Location               FirstYear  LastYear  Hemi
ACW00011604   17.1167   -61.7833      33.14  ANTIGUA AND BARBUDA   NA     ST JOHNS COOLIDGE FLD       1949      1949  N
ACW00011647   17.1333   -61.7833      62.99  ANTIGUA AND BARBUDA   NA     ST JOHNS                    1961      1961  N
AE000041196   25.3330    55.5170     111.55  UNITED ARAB EMIRATES  NA     SHARJAH INTER. AIRP         1944      2017  N
AEM00041194   25.2550    55.3640      34.12  UNITED ARAB EMIRATES  NA     DUBAI INTL                  1983      2017  N
AEM00041217   24.4330    54.6510      87.93  UNITED ARAB EMIRATES  NA     ABU DHABI INTL              1983      2017  N
AEM00041218   24.2620    55.6090     869.09  UNITED ARAB EMIRATES  NA     AL AIN INTL                 1994      2017  N
AF000040930   35.3170    69.0170   11043.31  AFGHANISTAN           NA     NORTH-SALANG                1973      1992  N
AFM00040938   34.2100    62.2280    3206.04  AFGHANISTAN           NA     HERAT                       1973      1992  N
AFM00040948   34.5660    69.2120    5876.97  AFGHANISTAN           NA     KABUL INTL                  1966      2004  N
AFM00040990   31.5000    65.8500    3313.65  AFGHANISTAN           NA     KANDAHAR AIRPORT            1973      2013  N
## [1] "Total Worldwide Stations: 34285"
## 
##     N     S 
## 31715  2570
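
The tallies above are printed without the code that produced them. The following is a minimal sketch of how such a check might be written in base R. The data frame `demo_stations` is a small hypothetical sample, not the real master table, and the variable names are illustrative only.

```r
# Hypothetical sample; the real table is master_stations_ww.
demo_stations <- data.frame(
  ID       = c("A1", "A2", "A3", "A4"),
  Latitude = c(17.1167, 25.3330, -33.9000, -12.5000)
)

# Count stations sitting exactly on the equator. A two-way split with
# ifelse(Latitude > 0, "N", "S") would label any such station "S".
n_equator <- sum(demo_stations$Latitude == 0)

# Same rule as the Hemi indicator in the text.
demo_stations$Hemi <- ifelse(demo_stations$Latitude > 0, "N", "S")

print(paste("Total Stations:", nrow(demo_stations)))
hemi_counts <- table(demo_stations$Hemi)
print(hemi_counts)
```

If `n_equator` were ever non-zero on the real data, the two-way rule would need a third case.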

Quadrant Processing

  • Northwest Quadrant (positive latitude, negative longitude)
  • Northeast Quadrant (positive latitude, positive longitude)
  • Southwest Quadrant (negative latitude, negative longitude)
  • Southeast Quadrant (negative latitude, positive longitude)

There are 4 stations with a longitude of 0. These stations will be considered part of the Eastern Quadrants.

Add Quadrant Indicator

master_stations_in <- master_stations_in %>%
                      mutate(Quad = case_when(.$Latitude > 0 & .$Longitude <  0 ~ "NW",
                                              .$Latitude > 0 & .$Longitude >= 0 ~ "NE",
                                              .$Latitude < 0 & .$Longitude <  0 ~ "SW",
                                              .$Latitude < 0 & .$Longitude >= 0 ~ "SE"))
ID           Latitude  Longitude  Elevation  Country               State  Location               FirstYear  LastYear  Hemi  Quad
ACW00011604   17.1167   -61.7833      33.14  ANTIGUA AND BARBUDA   NA     ST JOHNS COOLIDGE FLD       1949      1949  N     NW
ACW00011647   17.1333   -61.7833      62.99  ANTIGUA AND BARBUDA   NA     ST JOHNS                    1961      1961  N     NW
AE000041196   25.3330    55.5170     111.55  UNITED ARAB EMIRATES  NA     SHARJAH INTER. AIRP         1944      2017  N     NE
AEM00041194   25.2550    55.3640      34.12  UNITED ARAB EMIRATES  NA     DUBAI INTL                  1983      2017  N     NE
AEM00041217   24.4330    54.6510      87.93  UNITED ARAB EMIRATES  NA     ABU DHABI INTL              1983      2017  N     NE
AEM00041218   24.2620    55.6090     869.09  UNITED ARAB EMIRATES  NA     AL AIN INTL                 1994      2017  N     NE
AF000040930   35.3170    69.0170   11043.31  AFGHANISTAN           NA     NORTH-SALANG                1973      1992  N     NE
AFM00040938   34.2100    62.2280    3206.04  AFGHANISTAN           NA     HERAT                       1973      1992  N     NE
AFM00040948   34.5660    69.2120    5876.97  AFGHANISTAN           NA     KABUL INTL                  1966      2004  N     NE
AFM00040990   31.5000    65.8500    3313.65  AFGHANISTAN           NA     KANDAHAR AIRPORT            1973      2013  N     NE
## [1] "Total Worldwide Stations: 34285"
## 
##    NE    NW    SE    SW 
##  5102 26613  2091   479
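
A quadrant label is a north/south letter joined to an east/west letter, so the same rule can be sketched without four separate conditions. The example below is an illustrative base R alternative, not the method used in this project. The coordinates in `demo` are made up to exercise the boundary at longitude 0.

```r
# Hypothetical coordinates, including two stations on the prime meridian.
demo <- data.frame(
  Latitude  = c(51.5, 5.6, -10.0, -34.6, 40.4),
  Longitude = c(0.0, -0.2, 0.0, -58.4, -3.7)
)

# North/south letter; a latitude of exactly 0 is left unassigned (NA),
# which mirrors the four-way rule in the text.
ns <- ifelse(demo$Latitude > 0, "N",
             ifelse(demo$Latitude < 0, "S", NA_character_))

# East/west letter; longitude 0 counts as East, as stated in the text.
ew <- ifelse(demo$Longitude >= 0, "E", "W")

demo$Quad <- ifelse(is.na(ns), NA_character_, paste0(ns, ew))
print(demo)
```

Splitting the label this way keeps the longitude-0 decision in a single place.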

Temperature Zone Processing

  • Polar Zone (latitude greater than 60 or less than -60)
  • Temperate Zone (latitude greater than 30 up to and including 60, or less than -30 down to and including -60)
  • Tropical Zone (latitude from -30 to 30, inclusive)

Add Temperature Zone Indicator

master_stations_in <- master_stations_in %>%
                      mutate(Zone = case_when(.$Latitude >  60 ~ "PO",
                                              .$Latitude < -60 ~ "PO",
                                              .$Latitude >  30 & .$Latitude <=  60 ~ "TE",
                                              .$Latitude < -30 & .$Latitude >= -60 ~ "TE",
                                              .$Latitude <= 30 & .$Latitude >= -30 ~ "TR"))
ID           Latitude  Longitude  Elevation  Country               State  Location               FirstYear  LastYear  Hemi  Quad  Zone
ACW00011604   17.1167   -61.7833      33.14  ANTIGUA AND BARBUDA   NA     ST JOHNS COOLIDGE FLD       1949      1949  N     NW    TR
ACW00011647   17.1333   -61.7833      62.99  ANTIGUA AND BARBUDA   NA     ST JOHNS                    1961      1961  N     NW    TR
AE000041196   25.3330    55.5170     111.55  UNITED ARAB EMIRATES  NA     SHARJAH INTER. AIRP         1944      2017  N     NE    TR
AEM00041194   25.2550    55.3640      34.12  UNITED ARAB EMIRATES  NA     DUBAI INTL                  1983      2017  N     NE    TR
AEM00041217   24.4330    54.6510      87.93  UNITED ARAB EMIRATES  NA     ABU DHABI INTL              1983      2017  N     NE    TR
AEM00041218   24.2620    55.6090     869.09  UNITED ARAB EMIRATES  NA     AL AIN INTL                 1994      2017  N     NE    TR
AF000040930   35.3170    69.0170   11043.31  AFGHANISTAN           NA     NORTH-SALANG                1973      1992  N     NE    TE
AFM00040938   34.2100    62.2280    3206.04  AFGHANISTAN           NA     HERAT                       1973      1992  N     NE    TE
AFM00040948   34.5660    69.2120    5876.97  AFGHANISTAN           NA     KABUL INTL                  1966      2004  N     NE    TE
AFM00040990   31.5000    65.8500    3313.65  AFGHANISTAN           NA     KANDAHAR AIRPORT            1973      2013  N     NE    TE
## [1] "Total Worldwide Stations: 34285"
## 
##    PO    TE    TR 
##  2280 28375  3630
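
Because the zones are symmetric about the equator, they depend only on the absolute latitude. The sketch below shows one possible base R alternative using `cut()`. It is offered as an illustration under that symmetry assumption rather than as the code used here, and the latitudes are hypothetical test values chosen to hit the boundaries.

```r
# Hypothetical latitudes, including the boundary values 30, -30 and 60.
lat <- c(17.1167, 35.3170, 60.0, 60.5, -30.0, -45.0, -75.0, 30.0)

# right = TRUE makes each interval closed on the right:
#   (-Inf, 30] -> TR, (30, 60] -> TE, (60, Inf] -> PO
# which matches the inclusive/exclusive boundaries of the case_when rules.
zone <- as.character(cut(abs(lat),
                         breaks = c(-Inf, 30, 60, Inf),
                         labels = c("TR", "TE", "PO"),
                         right  = TRUE))

print(data.frame(Latitude = lat, Zone = zone))
```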

Global Sector

Each station is assigned a global sector from S01 to S80, based on the Hansen-Lebedeff spatial map from the Aggregating Data project.

As a reminder, the Earth is divided into 8 latitudinal zones of 22.5 degrees each. Each zone is then subdivided into 4 to 16 sectors depending on its latitude range. The reason is the Earth's roughly spherical shape: the distance between lines of longitude is greatest at the equator and shrinks toward the poles. The result is 80 sectors, each roughly 2,500 kilometers on a side.
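
The arithmetic behind the grid can be sketched in a few lines of base R. The per-band sector counts below are read off the sector rules used in this project. The kilometer figures assume a spherical Earth with about 111.32 km per degree of longitude at the equator, and are rough estimates rather than values taken from the source.

```r
# Band layout: eight 22.5-degree latitude bands, split into
# 4, 8, 12, 16, 16, 12, 8 and 4 longitude sectors.
bands <- data.frame(
  Lzne    = paste0("G", 1:8),
  LatTop  = seq(90, -67.5, by = -22.5),
  Sectors = c(4, 8, 12, 16, 16, 12, 8, 4)
)
bands$LatBottom <- bands$LatTop - 22.5
bands$LonWidth  <- 360 / bands$Sectors   # degrees of longitude per sector

# Approximate east-west extent at the middle of each band (assumption:
# spherical Earth, 111.32 km per degree of longitude at the equator).
mid_lat <- (bands$LatTop + bands$LatBottom) / 2
bands$WidthKm <- round(111.32 * bands$LonWidth * cos(mid_lat * pi / 180))

total_sectors <- sum(bands$Sectors)
print(bands)
print(total_sectors)
```

Widening the sectors toward the poles is what keeps their east-west extent in the same general range across bands.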

Add Latitudinal Zones

master_stations_in <- master_stations_in %>%
                      mutate(Lzne = case_when(.$Latitude <=  90.0 & .$Latitude >   67.5 ~ "G1",
                                              .$Latitude <=  67.5 & .$Latitude >   45.0 ~ "G2",
                                              .$Latitude <=  45.0 & .$Latitude >   22.5 ~ "G3",
                                              .$Latitude <=  22.5 & .$Latitude >    0.0 ~ "G4",
                                              .$Latitude <=   0.0 & .$Latitude >  -22.5 ~ "G5",
                                              .$Latitude <= -22.5 & .$Latitude >  -45.0 ~ "G6",
                                              .$Latitude <= -45.0 & .$Latitude >  -67.5 ~ "G7",
                                              .$Latitude <= -67.5 & .$Latitude >= -90.0 ~ "G8"))

Add Sectors

master_stations_in <- master_stations_in %>%
                      mutate(Sect = case_when(
                              
                                      .$Lzne == "G1" & .$Longitude >= -180.0 & .$Longitude <  -90.0 ~ "S01",
                                      .$Lzne == "G1" & .$Longitude >=  -90.0 & .$Longitude <    0.0 ~ "S02",
                                      .$Lzne == "G1" & .$Longitude >=    0.0 & .$Longitude <   90.0 ~ "S03",
                                      .$Lzne == "G1" & .$Longitude >=   90.0 & .$Longitude <  180.0 ~ "S04",
                                        
                                      .$Lzne == "G2" & .$Longitude >= -180.0 & .$Longitude < -135.0 ~ "S05",
                                      .$Lzne == "G2" & .$Longitude >= -135.0 & .$Longitude <  -90.0 ~ "S06",
                                      .$Lzne == "G2" & .$Longitude >=  -90.0 & .$Longitude <  -45.0 ~ "S07",
                                      .$Lzne == "G2" & .$Longitude >=  -45.0 & .$Longitude <    0.0 ~ "S08",
                                      .$Lzne == "G2" & .$Longitude >=    0.0 & .$Longitude <   45.0 ~ "S09",
                                      .$Lzne == "G2" & .$Longitude >=   45.0 & .$Longitude <   90.0 ~ "S10",
                                      .$Lzne == "G2" & .$Longitude >=   90.0 & .$Longitude <  135.0 ~ "S11",
                                      .$Lzne == "G2" & .$Longitude >=  135.0 & .$Longitude <  180.0 ~ "S12",
                                        
                                      .$Lzne == "G3" & .$Longitude >= -180.0 & .$Longitude < -150.0 ~ "S13",
                                      .$Lzne == "G3" & .$Longitude >= -150.0 & .$Longitude < -120.0 ~ "S14",
                                      .$Lzne == "G3" & .$Longitude >= -120.0 & .$Longitude <  -90.0 ~ "S15",
                                      .$Lzne == "G3" & .$Longitude >=  -90.0 & .$Longitude <  -60.0 ~ "S16",
                                      .$Lzne == "G3" & .$Longitude >=  -60.0 & .$Longitude <  -30.0 ~ "S17",
                                      .$Lzne == "G3" & .$Longitude >=  -30.0 & .$Longitude <    0.0 ~ "S18",
                                      .$Lzne == "G3" & .$Longitude >=    0.0 & .$Longitude <   30.0 ~ "S19",
                                      .$Lzne == "G3" & .$Longitude >=   30.0 & .$Longitude <   60.0 ~ "S20",
                                      .$Lzne == "G3" & .$Longitude >=   60.0 & .$Longitude <   90.0 ~ "S21",
                                      .$Lzne == "G3" & .$Longitude >=   90.0 & .$Longitude <  120.0 ~ "S22",
                                      .$Lzne == "G3" & .$Longitude >=  120.0 & .$Longitude <  150.0 ~ "S23",
                                      .$Lzne == "G3" & .$Longitude >=  150.0 & .$Longitude <  180.0 ~ "S24",
                                        
                                      .$Lzne == "G4" & .$Longitude >= -180.0 & .$Longitude < -157.5 ~ "S25",
                                      .$Lzne == "G4" & .$Longitude >= -157.5 & .$Longitude < -135.0 ~ "S26",
                                      .$Lzne == "G4" & .$Longitude >= -135.0 & .$Longitude < -112.5 ~ "S27",
                                      .$Lzne == "G4" & .$Longitude >= -112.5 & .$Longitude <  -90.0 ~ "S28",
                                      .$Lzne == "G4" & .$Longitude >=  -90.0 & .$Longitude <  -67.5 ~ "S29",
                                      .$Lzne == "G4" & .$Longitude >=  -67.5 & .$Longitude <  -45.0 ~ "S30",
                                      .$Lzne == "G4" & .$Longitude >=  -45.0 & .$Longitude <  -22.5 ~ "S31",
                                      .$Lzne == "G4" & .$Longitude >=  -22.5 & .$Longitude <    0.0 ~ "S32",
                                      .$Lzne == "G4" & .$Longitude >=    0.0 & .$Longitude <   22.5 ~ "S33",
                                      .$Lzne == "G4" & .$Longitude >=   22.5 & .$Longitude <   45.0 ~ "S34",
                                      .$Lzne == "G4" & .$Longitude >=   45.0 & .$Longitude <   67.5 ~ "S35",
                                      .$Lzne == "G4" & .$Longitude >=   67.5 & .$Longitude <   90.0 ~ "S36",
                                      .$Lzne == "G4" & .$Longitude >=   90.0 & .$Longitude <  112.5 ~ "S37",
                                      .$Lzne == "G4" & .$Longitude >=  112.5 & .$Longitude <  135.0 ~ "S38",
                                      .$Lzne == "G4" & .$Longitude >=  135.0 & .$Longitude <  157.5 ~ "S39",
                                      .$Lzne == "G4" & .$Longitude >=  157.5 & .$Longitude <  180.0 ~ "S40",
                                        
                                      .$Lzne == "G5" & .$Longitude >= -180.0 & .$Longitude < -157.5 ~ "S41",
                                      .$Lzne == "G5" & .$Longitude >= -157.5 & .$Longitude < -135.0 ~ "S42",
                                      .$Lzne == "G5" & .$Longitude >= -135.0 & .$Longitude < -112.5 ~ "S43",
                                      .$Lzne == "G5" & .$Longitude >= -112.5 & .$Longitude <  -90.0 ~ "S44",
                                      .$Lzne == "G5" & .$Longitude >=  -90.0 & .$Longitude <  -67.5 ~ "S45",
                                      .$Lzne == "G5" & .$Longitude >=  -67.5 & .$Longitude <  -45.0 ~ "S46",
                                      .$Lzne == "G5" & .$Longitude >=  -45.0 & .$Longitude <  -22.5 ~ "S47",
                                      .$Lzne == "G5" & .$Longitude >=  -22.5 & .$Longitude <    0.0 ~ "S48",
                                      .$Lzne == "G5" & .$Longitude >=    0.0 & .$Longitude <   22.5 ~ "S49",
                                      .$Lzne == "G5" & .$Longitude >=   22.5 & .$Longitude <   45.0 ~ "S50",
                                      .$Lzne == "G5" & .$Longitude >=   45.0 & .$Longitude <   67.5 ~ "S51",
                                      .$Lzne == "G5" & .$Longitude >=   67.5 & .$Longitude <   90.0 ~ "S52",
                                      .$Lzne == "G5" & .$Longitude >=   90.0 & .$Longitude <  112.5 ~ "S53",
                                      .$Lzne == "G5" & .$Longitude >=  112.5 & .$Longitude <  135.0 ~ "S54",
                                      .$Lzne == "G5" & .$Longitude >=  135.0 & .$Longitude <  157.5 ~ "S55",
                                      .$Lzne == "G5" & .$Longitude >=  157.5 & .$Longitude <  180.0 ~ "S56",
                                      
                                      .$Lzne == "G6" & .$Longitude >= -180.0 & .$Longitude < -150.0 ~ "S57",
                                      .$Lzne == "G6" & .$Longitude >= -150.0 & .$Longitude < -120.0 ~ "S58",
                                      .$Lzne == "G6" & .$Longitude >= -120.0 & .$Longitude <  -90.0 ~ "S59",
                                      .$Lzne == "G6" & .$Longitude >=  -90.0 & .$Longitude <  -60.0 ~ "S60",
                                      .$Lzne == "G6" & .$Longitude >=  -60.0 & .$Longitude <  -30.0 ~ "S61",
                                      .$Lzne == "G6" & .$Longitude >=  -30.0 & .$Longitude <    0.0 ~ "S62",
                                      .$Lzne == "G6" & .$Longitude >=    0.0 & .$Longitude <   30.0 ~ "S63",
                                      .$Lzne == "G6" & .$Longitude >=   30.0 & .$Longitude <   60.0 ~ "S64",
                                      .$Lzne == "G6" & .$Longitude >=   60.0 & .$Longitude <   90.0 ~ "S65",
                                      .$Lzne == "G6" & .$Longitude >=   90.0 & .$Longitude <  120.0 ~ "S66",
                                      .$Lzne == "G6" & .$Longitude >=  120.0 & .$Longitude <  150.0 ~ "S67",
                                      .$Lzne == "G6" & .$Longitude >=  150.0 & .$Longitude <  180.0 ~ "S68",
                                        
                                      .$Lzne == "G7" & .$Longitude >= -180.0 & .$Longitude < -135.0 ~ "S69",
                                      .$Lzne == "G7" & .$Longitude >= -135.0 & .$Longitude <  -90.0 ~ "S70",
                                      .$Lzne == "G7" & .$Longitude >=  -90.0 & .$Longitude <  -45.0 ~ "S71",
                                      .$Lzne == "G7" & .$Longitude >=  -45.0 & .$Longitude <    0.0 ~ "S72",
                                      .$Lzne == "G7" & .$Longitude >=    0.0 & .$Longitude <   45.0 ~ "S73",
                                      .$Lzne == "G7" & .$Longitude >=   45.0 & .$Longitude <   90.0 ~ "S74",
                                      .$Lzne == "G7" & .$Longitude >=   90.0 & .$Longitude <  135.0 ~ "S75",
                                      .$Lzne == "G7" & .$Longitude >=  135.0 & .$Longitude <  180.0 ~ "S76",
                                       
                                      .$Lzne == "G8" & .$Longitude >= -180.0 & .$Longitude <  -90.0 ~ "S77",
                                      .$Lzne == "G8" & .$Longitude >=  -90.0 & .$Longitude <    0.0 ~ "S78",
                                      .$Lzne == "G8" & .$Longitude >=    0.0 & .$Longitude <   90.0 ~ "S79",
                                      .$Lzne == "G8" & .$Longitude >=   90.0 & .$Longitude <  180.0 ~ "S80"))
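
The 80 rules above follow a regular pattern, so the sector can also be computed directly from the coordinates. The function below is a hedged base R sketch of that idea. `assign_sector` is a hypothetical helper, not part of this project or of any package, and it is meant to reproduce the same boundaries: the upper latitude edge and the western longitude edge are inclusive.

```r
# Hypothetical helper: compute the sector label from latitude and longitude.
assign_sector <- function(lat, lon) {
  sectors_per_band <- c(4, 8, 12, 16, 16, 12, 8, 4)
  offsets <- c(0, cumsum(sectors_per_band)[-8])   # sectors before each band

  # Band 1..8 from north to south; latitude 67.5 falls in band 2 and
  # latitude -90 is folded into band 8, as in the rules above.
  band <- pmin(pmax(floor((90 - lat) / 22.5) + 1, 1), 8)

  # Sector index within the band, counted eastward from longitude -180.
  width <- 360 / sectors_per_band[band]
  idx   <- floor((lon + 180) / width) + 1

  sect <- sprintf("S%02d", as.integer(offsets[band] + idx))

  # Out-of-range or missing coordinates get no sector. Longitude 180 is
  # treated as out of range because the rules above use "< 180".
  invalid <- is.na(lat) | is.na(lon) |
             lat > 90 | lat < -90 | lon < -180 | lon >= 180
  sect[invalid] <- NA_character_
  sect
}

# Coordinates of three stations from the sample rows, plus two grid corners.
demo_sect <- assign_sector(
  lat = c(17.1167, 25.3330, 35.3170,   90, -90.0),
  lon = c(-61.7833, 55.5170, 69.0170, -180, 179.9)
)
print(demo_sect)
```

The explicit rule list is easier to audit line by line, while a computed version is shorter and harder to mistype. Either way, checking a handful of known stations against both is a cheap safeguard.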

Calculate Sum of Sectors

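sect_cts <- master_stations_in %>%
            filter(!is.na(Sect)) %>%       # drop stations that matched no sector rule
            group_by(Sect) %>%
            summarize(Cts = n()) %>%       # stations per sector
            ungroup() %>%
            summarize(Total = sum(Cts))    # should equal the worldwide station count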

Remove Latitudinal Zones

master_stations_in <- master_stations_in %>%
                      select(-Lzne)
ID           Latitude  Longitude  Elevation  Country               State  Location               FirstYear  LastYear  Hemi  Quad  Zone  Sect
ACW00011604   17.1167   -61.7833      33.14  ANTIGUA AND BARBUDA   NA     ST JOHNS COOLIDGE FLD       1949      1949  N     NW    TR    S30
ACW00011647   17.1333   -61.7833      62.99  ANTIGUA AND BARBUDA   NA     ST JOHNS                    1961      1961  N     NW    TR    S30
AE000041196   25.3330    55.5170     111.55  UNITED ARAB EMIRATES  NA     SHARJAH INTER. AIRP         1944      2017  N     NE    TR    S20
AEM00041194   25.2550    55.3640      34.12  UNITED ARAB EMIRATES  NA     DUBAI INTL                  1983      2017  N     NE    TR    S20
AEM00041217   24.4330    54.6510      87.93  UNITED ARAB EMIRATES  NA     ABU DHABI INTL              1983      2017  N     NE    TR    S20
AEM00041218   24.2620    55.6090     869.09  UNITED ARAB EMIRATES  NA     AL AIN INTL                 1994      2017  N     NE    TR    S20
AF000040930   35.3170    69.0170   11043.31  AFGHANISTAN           NA     NORTH-SALANG                1973      1992  N     NE    TE    S21
AFM00040938   34.2100    62.2280    3206.04  AFGHANISTAN           NA     HERAT                       1973      1992  N     NE    TE    S21
AFM00040948   34.5660    69.2120    5876.97  AFGHANISTAN           NA     KABUL INTL                  1966      2004  N     NE    TE    S21
AFM00040990   31.5000    65.8500    3313.65  AFGHANISTAN           NA     KANDAHAR AIRPORT            1973      2013  N     NE    TE    S21
## [1] "Total Worldwide Stations: 34285"
## [1] "Total Sector Count: 34285"
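
A complementary way to confirm that every station received a sector is to count missing labels directly. The base R sketch below uses a hypothetical vector of sector labels, `demo_labels`, with one deliberately missing value. It is an illustration, not output from the real table.

```r
# Hypothetical sector labels; the NA stands for a station that matched no rule.
demo_labels <- c("S30", "S30", "S20", NA, "S21")

n_total    <- length(demo_labels)
n_assigned <- sum(!is.na(demo_labels))
n_missing  <- n_total - n_assigned

# useNA = "ifany" lists the missing labels as their own row instead of
# silently dropping them from the tally.
sector_counts <- table(demo_labels, useNA = "ifany")

print(sector_counts)
print(paste("Stations without a sector:", n_missing))
```

On the real table, a non-zero `n_missing` would point to coordinates that fall outside the rule boundaries, such as a longitude of exactly 180.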

Save Master Station Table with Indicators

saveRDS(master_stations_in, "~/Temp Stations/master_stations_in.rds")
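
Since later projects depend on this file, it can be worth confirming that the saved table reads back unchanged. The following is a minimal base R sketch of such a round-trip check. It writes a small hypothetical data frame to a temporary file rather than to the project path used above.

```r
# Hypothetical two-row table standing in for master_stations_in.
demo_stations <- data.frame(
  ID       = c("A1", "A2"),
  Latitude = c(17.1167, -33.9000),
  Hemi     = c("N", "S"),
  stringsAsFactors = FALSE
)

# Write to a temporary file so the sketch does not touch real project data.
rds_path <- tempfile(fileext = ".rds")
saveRDS(demo_stations, rds_path)

# Read it back and compare column names, types and values.
restored      <- readRDS(rds_path)
round_trip_ok <- identical(demo_stations, restored)
print(round_trip_ok)

unlink(rds_path)   # clean up the temporary file
```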

Next Steps

With the master station data updated with the geographic indicators required to fully analyze the impacts of missing data from the historical temperature data, the next projects will explore the following:




sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 15063)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.16  dplyr_0.5.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.11     digest_0.6.12    rprojroot_1.2    assertthat_0.2.0
##  [5] R6_2.2.1         DBI_0.6-1        backports_1.1.0  magrittr_1.5    
##  [9] evaluate_0.10    highr_0.6        rlang_0.1.1      stringi_1.1.5   
## [13] lazyeval_0.2.0   rmarkdown_1.5    tools_3.4.0      stringr_1.2.0   
## [17] yaml_2.1.14      compiler_3.4.0   htmltools_0.3.6  tibble_1.3.3