Introduction

This report summarizes administrative data on cells: ownership of cell office buildings, existence of rainwater harvesting systems, electricity and internet connectivity, availability of a public TV for community use, and so on. For more details on how this report was generated, see https://github.com/birasafab/Cells-administrative-data-analysis.

Reading in the data to be analyzed directly from Excel files

## New names:
## * `` -> ...1
## * `` -> ...2
## * `` -> ...3
## * `` -> ...4
## * `` -> ...5
## * ... and 10 more problems
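
The "New names" messages are what readxl prints when a sheet has blank column names and repairs them to `...1`, `...2`, and so on. Below is a minimal sketch of how a folder of workbooks might be read into one named list, assuming one workbook per district. The temporary folder, the file names `district_a.xlsx` and `district_b.xlsx`, and their contents are made up for illustration.

```r
library(readxl)
library(purrr)

# Hypothetical input: two tiny workbooks in a fresh temporary folder
data_dir <- tempfile("districts_")
dir.create(data_dir)
openxlsx::write.xlsx(data.frame(Sector = c("S1", "S2"), Cells = c(6, 5)),
                     file.path(data_dir, "district_a.xlsx"))
openxlsx::write.xlsx(data.frame(Sector = "S3", Cells = 4),
                     file.path(data_dir, "district_b.xlsx"))

# List every workbook, name each path after its file, then read them all
files <- list.files(data_dir, pattern = "\\.xlsx$", full.names = TRUE)
all_districts <- files %>%
  set_names(tools::file_path_sans_ext(basename(files))) %>%
  map(read_excel)

names(all_districts)
```

Naming the paths before mapping means each element of the resulting list carries the name of the workbook it came from.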

Open question: how to rename the columns of every data frame in a list?

## Preparing the names of columns to be included in imported data from Excel

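library(tidyverse)  # provides read_csv(), map(), set_names() and %>%

Readme <- read_csv("C:/Users/user/Desktop/data analysis project/ReadMe.csv")

# The third column of ReadMe.csv holds the new column names
names_col <- Readme[[3]]

# Apply the same names to every data frame in the list and keep the result
# (each data frame must have as many columns as names_col has entries)
all_districts <- all_districts %>%
  map(set_names, names_col)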
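
A self-contained version of this renaming step can be sketched with made-up data frames standing in for `all_districts`. It maps `set_names()` over the list and assumes every data frame has exactly as many columns as there are new names. The names in `names_col` here are hypothetical.

```r
library(purrr)

# Made-up stand-ins for two imported sheets with placeholder column names
all_districts <- list(
  district_a = setNames(data.frame(1:2, c("S1", "S2")), c("...1", "...2")),
  district_b = setNames(data.frame(1L, "S3"), c("...1", "...2"))
)
names_col <- c("No", "Sector")  # hypothetical new names

# Fail early if any data frame has a different number of columns
stopifnot(all(map_int(all_districts, ncol) == length(names_col)))

# set_names() is applied to each element; names_col is its second argument
renamed <- map(all_districts, set_names, names_col)
map(renamed, colnames)
```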

Concatenate worksheets into one data frame

What if the datasets found on different sheets have the same variables? Then you’ll want to row-bind them, after import, to form one big, beautiful data frame.
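
A minimal sketch of that row-binding step, assuming a named list of data frames that share the same columns (the data are made up). `dplyr::bind_rows()` with `.id` records which sheet each row came from.

```r
library(dplyr)

# Made-up stand-ins for two imported sheets
sheets <- list(
  district_a = tibble(Sector = c("S1", "S2"), Cells = c(6, 5)),
  district_b = tibble(Sector = "S3", Cells = 4)
)

# Stack all sheets; the list names go into a new "District" column
combined <- bind_rows(sheets, .id = "District")
combined
```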

CSV caching and iterating over sheets

What if we want to read all the sheets in at once and simultaneously cache them to CSV? We define read_then_csv() as read_excel() %>% write_csv() and use purrr::map() again.
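
A sketch of what such a helper could look like. The `out_dir` argument and the temporary two-sheet workbook are assumptions added so the example runs on its own. The sheet names and values are made up.

```r
library(readxl)
library(readr)
library(purrr)

# Read one sheet, then cache it as "<workbook>-<sheet>.csv" in out_dir
read_then_csv <- function(sheet, path, out_dir = dirname(path)) {
  pathbase <- tools::file_path_sans_ext(basename(path))
  path %>%
    read_excel(sheet = sheet) %>%
    write_csv(file.path(out_dir, paste0(pathbase, "-", sheet, ".csv")))
}

# Hypothetical workbook with one sheet per district, in a fresh folder
work_dir <- tempfile("cache_")
dir.create(work_dir)
path <- file.path(work_dir, "example.xlsx")
openxlsx::write.xlsx(
  list(district_a = data.frame(Sector = "S1", Cells = 6),
       district_b = data.frame(Sector = "S2", Cells = 5)),
  path
)

# Iterate over the sheet names; the result is a named list of data frames
all_sheets <- path %>%
  excel_sheets() %>%
  set_names() %>%
  map(read_then_csv, path = path)

list.files(work_dir, pattern = "\\.csv$")
```

Because `write_csv()` returns its input invisibly, the pipeline both writes the cache file and hands the data frame back to `map()`.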

## New names:
## * `4` -> `4...5`
## * `3` -> `3...6`
## * `3` -> `3...7`
## * `4` -> `4...8`
## * `3` -> `3...9`
## * ... and 6 more problems
##  [1] "a.c"                                                               
##  [2] "all districts for analysis-Gatsibo.csv"                            
##  [3] "Cells-administrative-data-analysis.Rproj"                          
##  [4] "Learning-how-to-use-stringr-functions-to-manage-data-strings.html" 
##  [5] "Learning-how-to-use-stringr-functions-to-manage-data-strings_cache"
##  [6] "Learning-how-to-use-stringr-functions-to-manage-data-strings_files"
##  [7] "Learning how to use stringr functions to manage data strings.Rmd"  
##  [8] "README.md"                                                         
##  [9] "rsconnect"                                                         
## [10] "sample-data-analysis-report.docx"                                  
## [11] "sample-data-analysis-report.Rmd"                                   
## [12] "sample data analysis report.Rmd"
## Parsed with column specification:
## cols(
##   No = col_double(),
##   Province = col_character(),
##   District = col_character(),
##   Sector = col_character(),
##   `Total nuber of cells within the sector` = col_character(),
##   `Number of cells owning office building` = col_character(),
##   `Number of cells with office building to be rehabilitated` = col_character(),
##   `Number of staff to operate in the office building mentioned` = col_character(),
##   `Number of office building without rain water harvesting tanks` = col_character(),
##   `number of cells connected to the National Grid` = col_character(),
##   `number of cells only using Solar energy` = col_character(),
##   `Number of cells having atleast one functioning computer` = col_character(),
##   `Number of cells having functioning printer` = col_character(),
##   `Number of cells currently connected to fiber optic or using modems for internet` = col_character(),
##   `Number of cells with public TV screens readily available at the cell's waiting room` = col_character()
## )
## Parsed with column specification:
## cols(
##   Index = col_double(),
##   `Indicator name` = col_character(),
##   Abbreviation = col_character(),
##   observation = col_character()
## )

Renaming the columns of the loaded dataset based on the summarized list of variables
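
A sketch of renaming by lookup, assuming the variable list pairs each long indicator name with a short abbreviation. The two indicator names are copied from the column specification above. The abbreviations `cells_total` and `cells_office` and the data values are hypothetical.

```r
library(dplyr)

# Hypothetical excerpt of the lookup table
readme <- tibble(
  `Indicator name` = c("Total nuber of cells within the sector",
                       "Number of cells owning office building"),
  Abbreviation = c("cells_total", "cells_office")
)

# Made-up data that still uses the long column names
raw <- tibble(
  Sector = "S1",
  `Total nuber of cells within the sector` = "6",
  `Number of cells owning office building` = "4"
)

# rename() takes new = old pairs, so name the old names by the new ones
lookup <- setNames(readme$`Indicator name`, readme$Abbreviation)
renamed <- rename(raw, all_of(lookup))
names(renamed)
```

`all_of()` raises an error if any long name is missing from the data, which helps catch sheets that do not follow the template.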

Summarizing the existing dataset
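
The column specification above shows the count columns being parsed as character, so they need converting before they can be totalled. Below is a sketch of one possible summary, not necessarily the one used for this report. It uses made-up values and hypothetical short column names.

```r
library(dplyr)

# Made-up data: counts stored as text, one row per sector
cells <- tibble(
  District = c("D1", "D1", "D2"),
  cells_total = c("6", "5", "4"),
  cells_office = c("4", "5", NA)
)

summary_tbl <- cells %>%
  # Convert every count column from text to numbers
  mutate(across(starts_with("cells_"), as.numeric)) %>%
  group_by(District) %>%
  # Total each count per district, ignoring missing values
  summarise(across(starts_with("cells_"), ~ sum(.x, na.rm = TRUE)),
            .groups = "drop")
summary_tbl
```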

Data validation

## Parsed with column specification:
## cols(
##   `INTARA/MVK` = col_character(),
##   AKARERE = col_character(),
##   UMURENGE = col_character(),
##   AKAGARI = col_character(),
##   UMUDUGUDU = col_character()
## )

Adjusting the Nyanza data, which do not follow the common template

## # A tibble: 10 x 4
## # Groups:   Province, District [1]
##    Province District Sector     `(Cells = n())`
##    <chr>    <chr>    <chr>                <int>
##  1 South    Nyanza   BUSASAMANA               5
##  2 South    Nyanza   BUSORO                   6
##  3 South    Nyanza   CYABAKAMYI               5
##  4 South    Nyanza   KIBILIZI                 4
##  5 South    Nyanza   KIGOMA                   5
##  6 South    Nyanza   MUKINGO                  6
##  7 South    Nyanza   MUYIRA                   5
##  8 South    Nyanza   NTYAZO                   4
##  9 South    Nyanza   NYAGISOZI                5
## 10 South    Nyanza   RWABICUMA                6
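
The column header `(Cells = n())` in the output above is what dplyr produces when an expression is passed to `summarise()` without a name, which is likely what happens when the assignment is wrapped in extra parentheses. Below is a sketch of the same kind of count with a properly named column. The cell names and the number of rows are made up.

```r
library(dplyr)

# Made-up data: one row per cell
cells <- tibble(
  Province = "South",
  District = "Nyanza",
  Sector = c("BUSORO", "BUSORO", "KIGOMA"),
  Cell = c("cell_1", "cell_2", "cell_3")
)

cells_per_sector <- cells %>%
  group_by(Province, District, Sector) %>%
  # Name the summary column directly instead of wrapping it in parentheses
  summarise(Cells = n(), .groups = "drop")
cells_per_sector
```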

Example of removing unwanted text characters
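
A small sketch with stringr, assuming the unwanted characters are text mixed into numeric fields and stray spaces around names. The input values are made up.

```r
library(stringr)

# Made-up raw values as they might arrive from a spreadsheet
counts <- c(" 5 cells", "6", "4*")
sectors <- c("  BUSORO ", "KIGOMA  ")

# Keep only the digits, then convert to numbers
clean_counts <- as.numeric(str_remove_all(counts, "[^0-9]"))

# Trim leading and trailing spaces and collapse repeated inner spaces
clean_sectors <- str_squish(sectors)

clean_counts
clean_sectors
```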

Checking for errors in the submitted data against the available numbers of cells and villages in Rwanda

##        Cell
##  [1,]  TRUE
##  [2,]  TRUE
##  [3,]  TRUE
##  [4,]  TRUE
##  [5,]  TRUE
##  [6,]  TRUE
##  [7,] FALSE
##  [8,]  TRUE
##  [9,]  TRUE
## [10,]  TRUE
## [11,]  TRUE
## [12,]  TRUE
## [13,]  TRUE
## [14,]  TRUE
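
A sketch of how such a check could be done, assuming a reference table with one row per village (columns named as in the reference file read above) and a submitted table with one row per sector. All values are made up, and the real comparison may differ.

```r
library(dplyr)

# Made-up reference list: one row per village
reference <- tibble(
  UMURENGE = c("S1", "S1", "S1", "S2", "S2"),
  AKAGARI = c("c1", "c1", "c2", "c3", "c4"),
  UMUDUGUDU = c("v1", "v2", "v3", "v4", "v5")
)

# Made-up submitted data: reported number of cells per sector
submitted <- tibble(Sector = c("S1", "S2"), Cells = c(2, 3))

# Expected number of cells per sector according to the reference
expected <- reference %>%
  distinct(UMURENGE, AKAGARI) %>%
  count(UMURENGE, name = "Cells_ref")

# Flag each sector: TRUE when the reported count matches the reference
check <- submitted %>%
  left_join(expected, by = c("Sector" = "UMURENGE")) %>%
  mutate(Cell_ok = Cells == Cells_ref)
check
```

In this made-up example the second sector reports three cells while the reference lists two, so its `Cell_ok` value is `FALSE`.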

Diagnosing the dataset for typos and missing values
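
A sketch of two simple diagnostics, assuming a vector of valid sector names is available from the reference list: count the missing values per column, and list the rows whose names do not match the reference. The data, including the deliberate typo, are made up.

```r
library(dplyr)

# Made-up data with one missing value and one misspelled sector
cells <- tibble(
  Sector = c("Busoro", "Kigoma", "Kigomma"),
  cells_total = c(6, NA, 5)
)
valid_sectors <- c("Busoro", "Kigoma")

# Number of missing values in each column
missing_per_column <- colSums(is.na(cells))

# Rows whose sector name is not in the reference list (possible typos)
suspect_rows <- filter(cells, !Sector %in% valid_sectors)

missing_per_column
suspect_rows
```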

Exporting the data identified as having issues back to Excel

## Warning: package 'openxlsx' was built under R version 3.6.3
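
A sketch of the export step with openxlsx, the package named in the warning above. The flagged rows and the output file name are made up, and a temporary folder is used so the example does not touch the project directory.

```r
library(openxlsx)

# Made-up rows flagged during validation
flagged <- data.frame(
  Sector = c("S1", "S2"),
  Issue = c("missing value", "cell count does not match reference")
)

# Write the flagged rows to a workbook
out_file <- file.path(tempdir(), "data_with_issues.xlsx")
write.xlsx(flagged, out_file, overwrite = TRUE)

# Read the file back to confirm what was written
written_back <- read.xlsx(out_file)
written_back
```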


### Adjusting for Gatsibo raw data




```{}
# Standardize letter case in text variables (upper case to lower case, then capitalize the first letter)
# Preview the result for one column without modifying all_districts
all_districts %>%
  mutate(Province = tools::toTitleCase(tolower(Province))) %>%
  View()
# Apply the conversion in place to each text column
all_districts$Province <- tools::toTitleCase(tolower(all_districts$Province))
all_districts$District <- tools::toTitleCase(tolower(all_districts$District))
all_districts$Sector <- tools::toTitleCase(tolower(all_districts$Sector))