Introduction

This report examines how education indicators vary across level 4 administrative units in Nepal in 2011. Literacy rates are the only education indicators in the dataset, so the analysis focuses on the female, male, and total literacy rates of the population aged 7 and over.

Load Libraries and Data

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## New names:
## • `` -> `...151`
## • `Area (sq. km.)` -> `Area (sq. km.)...172`
## • `Population (thousands)` -> `Population (thousands)...173`
## • `Population density (people per sq. km.)` -> `Population density (people per sq. km.)...174`
## • `Area (sq. km.)` -> `Area (sq. km.)...175`
## • `Population (thousands)` -> `Population (thousands)...176`
## • `Population density (people per sq. km.)` -> `Population density (people per sq. km.)...177`
## • `Area (sq. km.)` -> `Area (sq. km.)...178`
## • `Population (thousands)` -> `Population (thousands)...179`
## • `Population density (people per sq. km.)` -> `Population density (people per sq. km.)...180`
## Rows: 11952 Columns: 180
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr   (6): Unique ID, Total/Rural/Urban Division, Level 0 name, Level 1 name...
## dbl (173): Spatial database year, Level 0 code, Level 1 code, Level 2 code, ...
## lgl   (1): ...151
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
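
The code chunk that produced the messages above is not shown in this output. The following is a minimal sketch of how such a loading step might behave, assuming the readr package is installed. The temporary CSV and its area values are hypothetical stand-ins, not the real Nepal file; they only reproduce the blank and duplicated headers reported in the messages.

```r
library(readr)

# Hypothetical stand-in for the real file: one blank header and one
# duplicated header, as in the "New names" messages. Values are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Level 4 name,,Area (sq. km.),Area (sq. km.)",
  "Phungling,,10.5,10.5",
  "Hangdewa,,20.1,20.1"
), csv_path)

# read_csv() repairs blank and duplicated names by appending the column position.
demo <- read_csv(csv_path, show_col_types = FALSE)
names(demo)

# Drop columns that contain no data at all (such as the auto-named blank one).
is_empty <- vapply(demo, function(col) all(is.na(col)), logical(1))
demo <- demo[, !is_empty]
```

Dropping the empty column early keeps it out of later listings such as names(); whether the three repeated Area/Population blocks in the real file are true duplicates would need to be checked separately.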

Explore the structure and initial observations of the dataset.

# `data` on its own refers to base R's data() function, so inspect the
# loaded data frame, nepal2011, instead.
head(nepal2011)
str(nepal2011)
names(nepal2011)
##   [1] "Unique ID"                                                                       
##   [2] "Spatial database year"                                                           
##   [3] "Total/Rural/Urban Division"                                                      
##   [4] "Level 0 code"                                                                    
##   [5] "Level 0 name"                                                                    
##   [6] "Level 1 code"                                                                    
##   [7] "Level 1 name"                                                                    
##   [8] "Level 2 code"                                                                    
##   [9] "Level 2 name"                                                                    
##  [10] "Level 4 code"                                                                    
##  [11] "Level 4 name"                                                                    
##  [12] "Gender ratio, all ages (percent)"                                                
##  [13] "Gender ratio, 0-4 years (percent)"                                               
##  [14] "Dependency ratio, 0-14 or 65+ years (percent)"                                   
##  [15] "Dependency ratio, 65+ years (percent)"                                           
##  [16] "Dependency ratio, 0-14 years (percent)"                                          
##  [17] "Light intensity per 1000 people(Digital Numbers of light per 1000 people)"       
##  [18] "Light intensity per area (Digital Numbers of light per sq. km.)"                 
##  [19] "Literacy rate,7+ years,total(percent of population group)"                       
##  [20] "Literacy rate,7+ years,female(percent of population group)"                      
##  [21] "Literacy rate,7+ years,male(percent of population group)"                        
##  [22] "Monthly temperature - Jan (Co)"                                                  
##  [23] "Monthly temperature - Feb (Co)"                                                  
##  [24] "Monthly temperature - Mar (Co)"                                                  
##  [25] "Monthly temperature - Apr (Co)"                                                  
##  [26] "Monthly temperature - May (Co)"                                                  
##  [27] "Monthly temperature - Jun (Co)"                                                  
##  [28] "Monthly temperature - Jul (Co)"                                                  
##  [29] "Monthly temperature - Aug (Co)"                                                  
##  [30] "Monthly temperature - Sep (Co)"                                                  
##  [31] "Monthly temperature - Oct (Co)"                                                  
##  [32] "Monthly temperature - Nov (Co)"                                                  
##  [33] "Monthly temperature - Dec (Co)"                                                  
##  [34] "Decadal average of monthly temperature - Jan (Co)"                               
##  [35] "Decadal average of monthly temperature - Feb (Co)"                               
##  [36] "Decadal average of monthly temperature - Mar (Co)"                               
##  [37] "Decadal average of monthly temperature - Apr (Co)"                               
##  [38] "Decadal average of monthly temperature - May (Co)"                               
##  [39] "Decadal average of monthly temperature - Jun (Co)"                               
##  [40] "Decadal average of monthly temperature - Jul (Co)"                               
##  [41] "Decadal average of monthly temperature - Aug (Co)"                               
##  [42] "Decadal average of monthly temperature - Sep (Co)"                               
##  [43] "Decadal average of monthly temperature - Oct (Co)"                               
##  [44] "Decadal average of monthly temperature - Nov (Co)"                               
##  [45] "Decadal average of monthly temperature - Dec (Co)"                               
##  [46] "Decadal variation of monthly temperature - Jan (Co)"                             
##  [47] "Decadal variation of monthly temperature - Feb (Co)"                             
##  [48] "Decadal variation of monthly temperature - Mar (Co)"                             
##  [49] "Decadal variation of monthly temperature - Apr (Co)"                             
##  [50] "Decadal variation of monthly temperature - May (Co)"                             
##  [51] "Decadal variation of monthly temperature - Jun (Co)"                             
##  [52] "Decadal variation of monthly temperature - Jul (Co)"                             
##  [53] "Decadal variation of monthly temperature - Aug (Co)"                             
##  [54] "Decadal variation of monthly temperature - Sep (Co)"                             
##  [55] "Decadal variation of monthly temperature - Oct (Co)"                             
##  [56] "Decadal variation of monthly temperature - Nov (Co)"                             
##  [57] "Decadal variation of monthly temperature - Dec (Co)"                             
##  [58] "Temperature anomaly - Jan (Co)"                                                  
##  [59] "Temperature anomaly - Feb (Co)"                                                  
##  [60] "Temperature anomaly - Mar (Co)"                                                  
##  [61] "Temperature anomaly - Apr (Co)"                                                  
##  [62] "Temperature anomaly - May (Co)"                                                  
##  [63] "Temperature anomaly - Jun (Co)"                                                  
##  [64] "Temperature anomaly - Jul (Co)"                                                  
##  [65] "Temperature anomaly - Aug (Co)"                                                  
##  [66] "Temperature anomaly - Sep (Co)"                                                  
##  [67] "Temperature anomaly - Oct (Co)"                                                  
##  [68] "Temperature anomaly - Nov (Co)"                                                  
##  [69] "Temperature anomaly - Dec (Co)"                                                  
##  [70] "Precipitation - Jan (millimeters)"                                               
##  [71] "Precipitation - Feb (millimeters)"                                               
##  [72] "Precipitation - Mar (millimeters)"                                               
##  [73] "Precipitation - Apr (millimeters)"                                               
##  [74] "Precipitation - May (millimeters)"                                               
##  [75] "Precipitation - Jun (millimeters)"                                               
##  [76] "Precipitation - Jul (millimeters)"                                               
##  [77] "Precipitation - Aug (millimeters)"                                               
##  [78] "Precipitation - Sep (millimeters)"                                               
##  [79] "Precipitation - Oct (millimeters)"                                               
##  [80] "Precipitation - Nov (millimeters)"                                               
##  [81] "Precipitation - Dec (millimeters)"                                               
##  [82] "Decadal average of monthly precipitation - Jan (millimeters)"                    
##  [83] "Decadal average of monthly precipitation - Feb (millimeters)"                    
##  [84] "Decadal average of monthly precipitation - Mar (millimeters)"                    
##  [85] "Decadal average of monthly precipitation - Apr (millimeters)"                    
##  [86] "Decadal average of monthly precipitation - May (millimeters)"                    
##  [87] "Decadal average of monthly precipitation - Jun (millimeters)"                    
##  [88] "Decadal average of monthly precipitation - Jul (millimeters)"                    
##  [89] "Decadal average of monthly precipitation - Aug (millimeters)"                    
##  [90] "Decadal average of monthly precipitation - Sep (millimeters)"                    
##  [91] "Decadal average of monthly precipitation - Oct (millimeters)"                    
##  [92] "Decadal average of monthly precipitation - Nov (millimeters)"                    
##  [93] "Decadal average of monthly precipitation - Dec (millimeters)"                    
##  [94] "Decadal variation of monthly precipitation - Jan (millimeters)"                  
##  [95] "Decadal variation of monthly precipitation - Feb (millimeters)"                  
##  [96] "Decadal variation of monthly precipitation - Mar (millimeters)"                  
##  [97] "Decadal variation of monthly precipitation - Apr (millimeters)"                  
##  [98] "Decadal variation of monthly precipitation - May (millimeters)"                  
##  [99] "Decadal variation of monthly precipitation - Jun (millimeters)"                  
## [100] "Decadal variation of monthly precipitation - Jul (millimeters)"                  
## [101] "Decadal variation of monthly precipitation - Aug (millimeters)"                  
## [102] "Decadal variation of monthly precipitation - Sep (millimeters)"                  
## [103] "Decadal variation of monthly precipitation - Oct (millimeters)"                  
## [104] "Decadal variation of monthly precipitation - Nov (millimeters)"                  
## [105] "Decadal variation of monthly precipitation - Dec (millimeters)"                  
## [106] "Precipitation anomaly - Jan (millimeters)"                                       
## [107] "Precipitation anomaly - Feb (millimeters)"                                       
## [108] "Precipitation anomaly - Mar (millimeters)"                                       
## [109] "Precipitation anomaly - Apr (millimeters)"                                       
## [110] "Precipitation anomaly - May (millimeters)"                                       
## [111] "Precipitation anomaly - Jun (millimeters)"                                       
## [112] "Precipitation anomaly - Jul (millimeters)"                                       
## [113] "Precipitation anomaly - Aug (millimeters)"                                       
## [114] "Precipitation anomaly - Sep (millimeters)"                                       
## [115] "Precipitation anomaly - Oct (millimeters)"                                       
## [116] "Precipitation anomaly - Nov (millimeters)"                                       
## [117] "Precipitation anomaly - Dec (millimeters)"                                       
## [118] "Cropland (percent of area)"                                                      
## [119] "Forest (percent of area)"                                                        
## [120] "Mineral facilities - Total  (number of facilities)"                              
## [121] "Mineral facilities - Public  (number of facilities)"                             
## [122] "Mineral facilities - Private  (number of facilities)"                            
## [123] "Mineral facilities - Foreign  (number of facilities)"                            
## [124] "Mineral production capacity  (million metric tons per year )"                    
## [125] "Protected land - Total (percent of area)"                                        
## [126] "Protected land - National (percent of area)"                                     
## [127] "Protected land - International (percent of area)"                                
## [128] "Aerosol particle radius (percent of small particles)"                            
## [129] "Aerosol optical thickness (thickness scale  0-1 )"                               
## [130] "Carbon monoxide levels (parts per billion by volume)"                            
## [131] "Nitrogen dioxide levels (billion molecules/mm2)"                                 
## [132] "Elevation (meters)"                                                              
## [133] "Surface roughness (meters)"                                                      
## [134] "Land area equipped for irrigation - Total (percent of land area)"                
## [135] "Land area actually irrigated (percent of land area)"                             
## [136] "Land area equipped for irrigation  - Ground water (percent of land area)"        
## [137] "Land area equipped for irrigation  - Surface water (percent of land area)"       
## [138] "Land area with limited or no constraints (percent of land area)"                 
## [139] "Households' access to cellphone,total(percent of households)"                    
## [140] "Households' access to computer,total(percent of households)"                     
## [141] "Road length,Total(km.)"                                                          
## [142] "Road length,Major highways, primary and secondary(km.)"                          
## [143] "Road length,Tertiary and rural(km.)"                                             
## [144] "Road length,Other(km.)"                                                          
## [145] "Road intensity,Total(km. per 1000 sq. km.)"                                      
## [146] "Road intensity,Major highways, primary and secondary(km. per 1000 sq. km.)"      
## [147] "Road intensity,Tertiary and rural(km. per 1000 sq. km.)"                         
## [148] "Road intensity,Other(km. per 1000 sq. km.)"                                      
## [149] "Number of stations,Total(stations)"                                              
## [150] "Number of stations,Railway(stations)"                                            
## [151] "...151"                                                                          
## [152] "Number of stations,Total(stations per 1000 sq. km.)"                             
## [153] "Number of stations,Railway(stations per 1000 sq. km.)"                           
## [154] "Households' access to electricity, total(percent of households)"                 
## [155] "Households's use of fuel for lighting,kerosene(percent of households)"           
## [156] "Households's use of fuel for lighting,Biogas(percent of households)"             
## [157] "Households's use of fuel for lighting,Electricity(percent of households)"        
## [158] "Households's use of fuel for lighting,Others(percent of households)"             
## [159] "Households's use of fuel for lighting,No lightning arrangement(percent of househ"
## [160] "Households' use of fuel for cooking,biomass(percent of households)"              
## [161] "Households' use of fuel for Coal/lignite/charcoal ,(percent of households)"      
## [162] "Households' use of fuel for cooking,kerosene(percent of households)"             
## [163] "Households' use of fuel for Gas/LPG,(percent of households)"                     
## [164] "Households' use of fuel for cooking,electricity (percent of households)"         
## [165] "Households' use of fuel for Bio-gas,(percent of households)"                     
## [166] "Housing units built with substantial solid material, total(percent of households"
## [167] "Households' access to improved water,total(percent of households)"               
## [168] "Households' access to improved sanitation,total(percent of households)"          
## [169] "Households' access to enhanced improved sanitation,total(percent of households)" 
## [170] "People per household,total(people)"                                              
## [171] "Housing ownership(percent of households)"                                        
## [172] "Area (sq. km.)...172"                                                            
## [173] "Population (thousands)...173"                                                    
## [174] "Population density (people per sq. km.)...174"                                   
## [175] "Area (sq. km.)...175"                                                            
## [176] "Population (thousands)...176"                                                    
## [177] "Population density (people per sq. km.)...177"                                   
## [178] "Area (sq. km.)...178"                                                            
## [179] "Population (thousands)...179"                                                    
## [180] "Population density (people per sq. km.)...180"

Data Filtering and Formatting

# There are 180 columns in the dataset. Filter the column names by keywords related to education indicators.
column_names <- names(nepal2011)

education_keywords <- c("education", "school", "literacy", "enrollment", "degree")

education_columns <- column_names[grepl(paste(education_keywords, collapse = "|"), column_names, ignore.case = TRUE)]
print("Education Indicator Columns:")
## [1] "Education Indicator Columns:"
print(education_columns)
## [1] "Literacy rate,7+ years,total(percent of population group)" 
## [2] "Literacy rate,7+ years,female(percent of population group)"
## [3] "Literacy rate,7+ years,male(percent of population group)"

Create a new dataset containing the 2011 literacy indicators for each level 4 administrative unit in Nepal.

nepal2011_education <- data.frame(nepal2011$`Level 4 name`,nepal2011$`Literacy rate,7+ years,female(percent of population group)`,nepal2011$`Literacy rate,7+ years,male(percent of population group)`,nepal2011$`Literacy rate,7+ years,total(percent of population group)`)

names(nepal2011_education)
## [1] "nepal2011..Level.4.name."                                              
## [2] "nepal2011..Literacy.rate.7..years.female.percent.of.population.group.."
## [3] "nepal2011..Literacy.rate.7..years.male.percent.of.population.group.."  
## [4] "nepal2011..Literacy.rate.7..years.total.percent.of.population.group.."
nepal2011_education <- nepal2011_education %>%
  rename(
    female_literacy_rate = `nepal2011..Literacy.rate.7..years.female.percent.of.population.group..`,
    male_literacy_rate = `nepal2011..Literacy.rate.7..years.male.percent.of.population.group..`,
    total_literacy_rate = `nepal2011..Literacy.rate.7..years.total.percent.of.population.group..`
  )
head(nepal2011_education)
##   nepal2011..Level.4.name. female_literacy_rate male_literacy_rate
## 1                Phungling                 73.2               88.3
## 2                Phungling                 73.2               88.3
## 3                Phungling                   NA                 NA
## 4                 Hangdewa                 67.5               83.7
## 5                 Hangdewa                 67.5               83.7
## 6                 Hangdewa                   NA                 NA
##   total_literacy_rate
## 1                80.3
## 2                80.3
## 3                  NA
## 4                75.0
## 5                75.0
## 6                  NA
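
The data.frame() call above produces long, machine-generated column names and leaves the unit name column unrenamed. As a sketch of a more compact alternative, dplyr's select() can pick and rename columns in one step. The toy data frame below is a stand-in built from the first rows printed above, and level4_name is a name chosen here for illustration.

```r
library(dplyr)

# Toy stand-in for nepal2011, using values from the head() output above.
toy <- data.frame(
  check.names = FALSE,
  `Level 4 name` = c("Phungling", "Phungling", "Phungling", "Hangdewa"),
  `Literacy rate,7+ years,female(percent of population group)` = c(73.2, 73.2, NA, 67.5),
  `Literacy rate,7+ years,male(percent of population group)` = c(88.3, 88.3, NA, 83.7),
  `Literacy rate,7+ years,total(percent of population group)` = c(80.3, 80.3, NA, 75.0)
)

# select() keeps only the listed columns and renames them (new = old).
toy_education <- toy %>%
  select(
    level4_name = `Level 4 name`,
    female_literacy_rate = `Literacy rate,7+ years,female(percent of population group)`,
    male_literacy_rate = `Literacy rate,7+ years,male(percent of population group)`,
    total_literacy_rate = `Literacy rate,7+ years,total(percent of population group)`
  )

names(toy_education)
```

This avoids the intermediate names with doubled dots and gives every column, including the unit name, a short identifier.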

Summarize the education indicators at level 4.

summary(nepal2011_education)
##  nepal2011..Level.4.name. female_literacy_rate male_literacy_rate
##  Length:11952             Min.   :10.90        Min.   :20.80     
##  Class :character         1st Qu.:41.30        1st Qu.:64.33     
##  Mode  :character         Median :54.00        Median :73.50     
##                           Mean   :52.19        Mean   :71.53     
##                           3rd Qu.:63.30        3rd Qu.:80.80     
##                           Max.   :84.50        Max.   :95.80     
##                           NA's   :4010         NA's   :4010      
##  total_literacy_rate
##  Min.   :15.60      
##  1st Qu.:52.80      
##  Median :62.90      
##  Mean   :61.31      
##  3rd Qu.:71.00      
##  Max.   :89.20      
##  NA's   :4010
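
The repeated unit names in the head() output and the 4010 missing values suggest that each unit appears once per value of the `Total/Rural/Urban Division` column, so some units may be counted more than once in the summary. The sketch below shows one way to check this and to keep a single row per unit. The division labels "Total", "Rural" and "Urban" are an assumption; the actual labels in the data should be confirmed first.

```r
library(dplyr)

# Toy stand-in: the division labels are assumed, not confirmed by the output above.
toy <- data.frame(
  division = rep(c("Total", "Rural", "Urban"), times = 2),
  level4_name = rep(c("Phungling", "Hangdewa"), each = 3),
  total_literacy_rate = c(80.3, 80.3, NA, 75.0, 75.0, NA)
)

# Count rows and missing values per division to see where the NAs come from.
by_division <- toy %>%
  group_by(division) %>%
  summarize(
    n_rows = n(),
    n_missing = sum(is.na(total_literacy_rate))
  )

# Keep one row per unit so that no unit is counted more than once.
toy_total <- toy %>%
  filter(division == "Total")
```

If the real data follow this pattern, summaries computed on the "Total" rows alone would weight every unit equally.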

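Filter out rows with NA values so that the summaries and plots below use complete cases only. summary() already ignores NA values, so the statistics match those reported above.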

cleaned_data <- nepal2011_education %>%
  filter(!is.na(female_literacy_rate) & !is.na(male_literacy_rate) & !is.na(total_literacy_rate))
# Summary statistics for female literacy rate
female_summary <- cleaned_data %>%
  summarize(
    Min_Female_Literacy = min(female_literacy_rate, na.rm = TRUE),
    Median_Female_Literacy = median(female_literacy_rate, na.rm = TRUE),
    Mean_Female_Literacy = mean(female_literacy_rate, na.rm = TRUE),
    Max_Female_Literacy = max(female_literacy_rate, na.rm = TRUE)
  )

# Summary statistics for male literacy rate
male_summary <- cleaned_data %>%
  summarize(
    Min_Male_Literacy = min(male_literacy_rate, na.rm = TRUE),
    Median_Male_Literacy = median(male_literacy_rate, na.rm = TRUE),
    Mean_Male_Literacy = mean(male_literacy_rate, na.rm = TRUE),
    Max_Male_Literacy = max(male_literacy_rate, na.rm = TRUE)
  )

# Summary statistics for total literacy rate
total_summary <- cleaned_data %>%
  summarize(
    Min_Total_Literacy = min(total_literacy_rate, na.rm = TRUE),
    Median_Total_Literacy = median(total_literacy_rate, na.rm = TRUE),
    Mean_Total_Literacy = mean(total_literacy_rate, na.rm = TRUE),
    Max_Total_Literacy = max(total_literacy_rate, na.rm = TRUE)
  )

list(female_summary = female_summary, male_summary = male_summary, total_summary = total_summary)
## $female_summary
##   Min_Female_Literacy Median_Female_Literacy Mean_Female_Literacy
## 1                10.9                     54             52.18507
##   Max_Female_Literacy
## 1                84.5
## 
## $male_summary
##   Min_Male_Literacy Median_Male_Literacy Mean_Male_Literacy Max_Male_Literacy
## 1              20.8                 73.5           71.53062              95.8
## 
## $total_summary
##   Min_Total_Literacy Median_Total_Literacy Mean_Total_Literacy
## 1               15.6                  62.9            61.31032
##   Max_Total_Literacy
## 1               89.2
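
The three summaries above repeat the same four statistics for each column. As a sketch of a shorter equivalent, dplyr's across() can apply a list of functions to several columns at once. The toy data frame is a stand-in with values taken from the head() output, and the generated column names (for example mean_female_literacy_rate) differ from those used above.

```r
library(dplyr)

# Toy stand-in for nepal2011_education.
toy <- data.frame(
  female_literacy_rate = c(73.2, 67.5, NA),
  male_literacy_rate = c(88.3, 83.7, NA),
  total_literacy_rate = c(80.3, 75.0, NA)
)

# Apply the same four statistics to every literacy column in one call.
stats <- toy %>%
  summarize(across(
    ends_with("_literacy_rate"),
    list(
      min = ~ min(.x, na.rm = TRUE),
      median = ~ median(.x, na.rm = TRUE),
      mean = ~ mean(.x, na.rm = TRUE),
      max = ~ max(.x, na.rm = TRUE)
    ),
    .names = "{.fn}_{.col}"
  ))

names(stats)
```

With na.rm = TRUE inside each function, the separate NA filtering step becomes optional for the summaries.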

Visualize the distribution of literacy rates.

# Plot distribution of female literacy rates
ggplot(cleaned_data, aes(x = female_literacy_rate)) +
  geom_histogram(bins = 30, fill = "skyblue", color = "black") +
  labs(title = "Distribution of Female Literacy Rates (2011)",
       x = "Female Literacy Rate",
       y = "Frequency") +
  theme_minimal()

# Plot distribution of male literacy rates
ggplot(cleaned_data, aes(x = male_literacy_rate)) +
  geom_histogram(bins = 30, fill = "lightgreen", color = "black") +
  labs(title = "Distribution of Male Literacy Rates (2011)",
       x = "Male Literacy Rate",
       y = "Frequency") +
  theme_minimal()

# Plot distribution of total literacy rates
ggplot(cleaned_data, aes(x = total_literacy_rate)) +
  geom_histogram(bins = 30, fill = "lightcoral", color = "black") +
  labs(title = "Distribution of Total Literacy Rates (2011)",
       x = "Total Literacy Rate",
       y = "Frequency") +
  theme_minimal()
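
The three histograms use separate axes, which makes them harder to compare. One possible alternative, sketched below, reshapes the data to long form and draws the panels on a shared x-axis with facet_wrap(). It assumes the tidyr package is installed; the toy data frame and the names group and literacy_rate are illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for cleaned_data.
toy <- data.frame(
  female_literacy_rate = c(73.2, 67.5),
  male_literacy_rate = c(88.3, 83.7),
  total_literacy_rate = c(80.3, 75.0)
)

# One row per (observation, indicator) pair.
long <- toy %>%
  pivot_longer(everything(), names_to = "group", values_to = "literacy_rate")

# Stack the three histograms so they share the same x-axis.
p <- ggplot(long, aes(x = literacy_rate)) +
  geom_histogram(bins = 30, fill = "skyblue", color = "black") +
  facet_wrap(~ group, ncol = 1) +
  labs(title = "Distribution of Literacy Rates (2011)",
       x = "Literacy Rate (percent)",
       y = "Frequency") +
  theme_minimal()

print(p)
```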


Conclusion

Female Literacy Rate:

Minimum: 10.9%. Median: 54%. Mean: 52.19%. Maximum: 84.5%.

Female literacy rates vary widely, spanning 73.6 percentage points between the lowest and highest values. The median and mean point to moderate literacy overall: some areas perform well while others lag far behind.

Male Literacy Rate:

Minimum: 20.8%. Median: 73.5%. Mean: 71.53%. Maximum: 95.8%.

Male literacy rates have a higher central tendency than female literacy rates, and three quarters of the observations lie above 64%. The range is still wide (75 percentage points), so some areas have low male literacy, but overall male literacy is relatively high.

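Male literacy rates have a higher minimum and maximum than female literacy rates (20.8% vs. 10.9% and 95.8% vs. 84.5%), so male literacy is higher at both extremes. The two ranges are almost equally wide (75.0 vs. 73.6 percentage points), so the range alone does not show that male literacy is more uniformly distributed across the regions.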

The median and mean male literacy rates are substantially higher than the female ones (median 73.5% vs. 54%, mean 71.53% vs. 52.19%), a gap of roughly 19 percentage points. On average, male literacy is therefore higher across the regions.

The middle half of the female distribution is more spread out: the interquartile range is 22.0 percentage points (41.3% to 63.3%), against about 16.5 points (64.33% to 80.8%) for males. Male literacy rates are thus more concentrated around higher values. In both groups the mean lies slightly below the median, which suggests a mild left skew.

Male literacy rates in Nepal in 2011 are generally higher than female literacy rates and, judged by the interquartile range, less variable. The wider interquartile range and lower central tendency for female literacy suggest more pronounced regional disparities and lower literacy overall. The comparison highlights a need for targeted educational interventions to address gender disparities and improve female literacy in the regions where it lags.
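
The comparison of spread can be checked directly from the quartiles reported in the summary() output. The sketch below computes the range and interquartile range for each group from those printed values; the data frame name spread and its column names are chosen here for illustration.

```r
# Minimum, quartiles, and maximum copied from the summary() output above.
spread <- data.frame(
  group = c("female", "male"),
  min = c(10.9, 20.8),
  q1 = c(41.30, 64.33),
  q3 = c(63.30, 80.80),
  max = c(84.5, 95.8)
)

# Range covers the extremes; the interquartile range covers the middle half.
spread$range <- spread$max - spread$min
spread$iqr <- spread$q3 - spread$q1

spread[, c("group", "range", "iqr")]
```

The interquartile range is less sensitive to a few extreme units than the range, which makes it the more informative measure of variability here.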