A. Stages of data cleaning
Load the necessary packages
For this step, I installed and then loaded the readr and readxl
packages: install.packages("readr"), install.packages("readxl"),
library(readr) and library(readxl)
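As a minimal sketch, the install step could be made conditional so that a package is only downloaded when it is not already present. The helper name `missing_packages` is hypothetical and not part of the original notebook.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("readr", "readxl")
to_install <- missing_packages(needed)

# Install only what is missing, and only in an interactive session,
# so a non-interactive run never triggers a download.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# Attach each package that is available.
for (p in needed) {
  if (requireNamespace(p, quietly = TRUE)) {
    library(p, character.only = TRUE)
  }
}
```

Calling install.packages() on every run re-downloads the package each time, which is why the check comes first in this sketch.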
Read the datasets
Here I read in the “rule of law” indicator CSV file on my desktop and
the ESG reporting Excel file in my downloads.
Read the “rule of law” indicator CSV file
rule_of_law_data <- read_csv("users/Nikhila/desktop/rule_of_law_data.csv")
Read the ESG reporting Excel file
esg_reporting_data <- read_excel("users/Nikhila/downloads/esg_reporting_data.xlsx")
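The reading step can be sketched on a throwaway file. This example uses base R's read.csv() as a stand-in for read_csv() so that it runs without extra packages. The file contents and the column name `RuleOfLaw` are made up for illustration.

```r
# Write a tiny CSV to a temporary file; the contents are made up for illustration.
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("Country,RuleOfLaw", "A,0.5", "B,1.2"), tmp_csv)

# Confirm the path resolves before reading; a relative path depends on getwd().
stopifnot(file.exists(tmp_csv))

# Base-R stand-in for read_csv(); stringsAsFactors = FALSE keeps Country as text.
toy_csv <- read.csv(tmp_csv, stringsAsFactors = FALSE)

# Inspect the column names and types that were read.
str(toy_csv)
```

Checking file.exists() first gives a clearer failure than a read error when a path is mistyped or relative to the wrong working directory.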
Merge the datasets
Since both datasets contain data for the same countries, I then
merged them based on a common identifier (here that was the country
name/code).
merged_data <- merge(rule_of_law_data, esg_reporting_data, by = "Country", all = TRUE)
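To illustrate what all = TRUE does in the merge above, here is a small sketch on toy data. The data frames, the column names `RuleOfLaw` and `ESGReporting`, and all values are made up for illustration and are not taken from the real datasets.

```r
# Toy stand-ins for the two datasets; all values are made up for illustration.
rule_of_law_toy <- data.frame(
  Country = c("A", "B", "C"),
  RuleOfLaw = c(0.5, 1.2, -0.3),
  stringsAsFactors = FALSE
)
esg_toy <- data.frame(
  Country = c("B", "C", "D"),
  ESGReporting = c(80, 65, 90),
  stringsAsFactors = FALSE
)

# all = TRUE performs a full outer join: every country from either table is kept.
merged_toy <- merge(rule_of_law_toy, esg_toy, by = "Country", all = TRUE)
print(merged_toy)

# Countries present in only one table get NA in the other table's column.
unmatched <- merged_toy$Country[!complete.cases(merged_toy)]
print(unmatched)
```

Because a full outer join keeps unmatched rows, it is a likely source of missing values in the merged table. Using all = FALSE instead would keep only countries present in both tables.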
Handle missing values
The merge left missing values for some countries, so I removed those
countries from the dataset (manually entering the missing values would
have introduced too much inaccuracy).
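The notebook does not show the code used for this step. One possible sketch, using base R's na.omit() on a toy table with made-up values and hypothetical column names, is:

```r
# Toy merged table; all values are made up for illustration.
merged_toy <- data.frame(
  Country = c("A", "B", "C", "D"),
  RuleOfLaw = c(0.5, 1.2, -0.3, NA),
  ESGReporting = c(NA, 80, 65, 90),
  stringsAsFactors = FALSE
)

# Count missing values per column before dropping anything.
na_counts <- colSums(is.na(merged_toy))
print(na_counts)

# Keep only the rows (countries) with no missing value in any column.
cleaned_toy <- na.omit(merged_toy)

# Record which countries were dropped so the removal is documented.
dropped <- setdiff(merged_toy$Country, cleaned_toy$Country)
print(dropped)
```

Keeping the list of dropped countries makes it possible to report how much of the sample was lost to missing data.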
Check data types
I then checked that each column has the correct data type (numeric for
numeric data and character for letter-coded values where necessary).
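A type check along these lines can be sketched in base R. The toy table, its column names, and its values are assumptions made for illustration. The `RuleOfLaw` column is deliberately stored as text to show a conversion.

```r
# Toy cleaned table; RuleOfLaw is stored as text to mimic a common import issue.
cleaned_toy <- data.frame(
  Country = c("B", "C"),
  RuleOfLaw = c("1.2", "-0.3"),
  ESGReporting = c(80, 65),
  stringsAsFactors = FALSE
)

# Report the class of every column.
col_classes <- vapply(cleaned_toy, function(x) class(x)[1], character(1))
print(col_classes)

# Convert the text column to numeric; as.numeric() returns NA (with a warning)
# for any entry that is not a valid number.
cleaned_toy$RuleOfLaw <- as.numeric(cleaned_toy$RuleOfLaw)

# Confirm the types are now as intended.
stopifnot(is.numeric(cleaned_toy$RuleOfLaw), is.character(cleaned_toy$Country))
```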
B. Cleaned dataset:
options(repos = c(CRAN = "https://cran.rstudio.com/"))
install.packages("readxl")
trying URL 'https://cran.rstudio.com/bin/macosx/big-sur-x86_64/contrib/4.3/readxl_1.4.3.tgz'
Content type 'application/x-gzip' length 1563699 bytes (1.5 MB)
==================================================
downloaded 1.5 MB
The downloaded binary packages are in
/var/folders/yd/c2zm5hrx3z9f72991hsqsbqh0000gn/T//RtmpCP2ekt/downloaded_packages
library(readxl)
# Path to the Excel file that holds the cleaned data
file_path <- "C:/Users/nikhila/Desktop/Data (KPMG & rule of law).xlsx"
# Read the first sheet of the Excel file into a data frame
data <- read_excel(file_path)
To verify that my data is correct:
# Show the first six rows
head(data)
# Show the last six rows
tail(data)
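head() and tail() work on character vectors as well as data frames, so their output alone does not confirm that a table was loaded. A minimal sketch of a stricter check follows. The helper name `check_loaded` and the toy data are hypothetical.

```r
# Hypothetical helper: basic sanity checks on a loaded dataset.
check_loaded <- function(df) {
  stopifnot(is.data.frame(df))  # fails if df is, for example, a file path string
  stopifnot(nrow(df) > 0)       # at least one row was read
  invisible(TRUE)
}

# A toy data frame (made-up values) passes the check.
toy <- data.frame(Country = c("B", "C"), Score = c(1.2, -0.3))
check_loaded(toy)

# A character path is rejected instead of being mistaken for data.
result <- tryCatch(
  check_loaded("some/file.xlsx"),
  error = function(e) "not a data frame"
)
print(result)
```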
Now my data is ready to use for analysis.