1 The World Bank’s Doing Business Report

  • The World Bank’s Doing Business report published ease of doing business rankings between 2004 and 2020.

  • Doing Business measured business regulations for local firms in 190 economies.

  • However, the World Bank decided to discontinue the Doing Business report after data irregularities in Doing Business 2018 and 2020 were reported internally in June 2020.

  • The complete Doing Business historical data up to Doing Business 2020, revised to correct the data irregularities, are available on the World Bank’s website.

2 Load the data

library(rio)
# Import the Stata (.dta) file with the historical Doing Business data
DB <- import("Data_raw/WB_Doing_Business/Historical-data.dta")

3 Describe the data

library(gtsummary)
library(dplyr)
# Summarise every variable except the economy identifiers (cod, economy)
DB %>%
  select(-cod, -economy) %>%
  tbl_summary()
Characteristic N = 3,606¹
Region
     77 (2.1%)
    East Asia & Pacific 479 (13%)
    Europe & Central Asia 420 (12%)
    High income: OECD 615 (17%)
    Latin America & Caribbean 615 (17%)
    Middle East & North Africa 340 (9.4%)
    South Asia 217 (6.0%)
    Sub-Saharan Africa 843 (23%)
Income group
     77 (2.1%)
    High income 1,042 (29%)
    Low income 510 (14%)
    Lower middle income 934 (26%)
    Upper middle income 1,043 (29%)
DB Year 2,012.0 (2,008.0, 2,016.0)
Paid-in Minimum capital (% of income per capita) 0 (0, 27)
Extent of disclosure index (0-10) 6.00 (4.00, 7.00)
    Unknown 424
Extent of director liability index (0-10) 5.00 (2.00, 6.00)
    Unknown 424
Ease of shareholder suits index (0-10) (DB06-14 methodology) 6.00 (4.00, 7.00)
    Unknown 1,698
Extent of corporate transparency index (0-7) (DB15-20 methodology)
    0 424 (29%)
    1 57 (3.8%)
    2 138 (9.3%)
    3 112 (7.5%)
    4 110 (7.4%)
    5 267 (18%)
    5.4 7 (0.5%)
    6 294 (20%)
    7 77 (5.2%)
    Unknown 2,120
Extent of shareholder rights index (0-6) (DB15-20 methodology)
    0 420 (28%)
    1 21 (1.4%)
    2 67 (4.5%)
    3 178 (12%)
    4 336 (23%)
    5 346 (23%)
    6 118 (7.9%)
    Unknown 2,120
Extent of ownership and control index (0-7) (DB15-20 methodology)
    0 415 (28%)
    1 83 (5.6%)
    2 169 (11%)
    3 171 (12%)
    3.4 7 (0.5%)
    4 204 (14%)
    5 228 (15%)
    6 157 (11%)
    7 52 (3.5%)
    Unknown 2,120
Ease of shareholder suits index (0-10) (DB15-20 methodology) 6.00 (5.00, 8.00)
    Unknown 2,120
Strength of investor protection index (0-30) (DB06-14 methodology) 16.0 (11.0, 18.0)
    Unknown 1,698
Strength of minority investor protection index (0-50) (DB15-20 methodology) 28 (18, 33)
    Unknown 2,120
Strength of insolvency framework index (0-16) (DB15-20 methodology) 7.5 (5.0, 10.5)
Commencement of proceedings index (0-3) (DB15-20 methodology)
    0 96 (2.7%)
    1 18 (0.5%)
    1.5 52 (1.4%)
    2 1,765 (49%)
    2.5 869 (24%)
    3 806 (22%)
Management of debtor's assets index (0-6) (DB15-20 methodology) 4.00 (2.00, 5.50)
Reorganization proceedings index (0-3) (DB15-20 methodology)
    0 1,596 (44%)
    0.5 728 (20%)
    1 513 (14%)
    1.5 75 (2.1%)
    2 204 (5.7%)
    2.5 151 (4.2%)
    3 339 (9.4%)
Creditor participation index (0-4) (DB15-20 methodology)
    0 689 (19%)
    1 1,571 (44%)
    2 800 (22%)
    3 512 (14%)
    4 34 (0.9%)
Procedures - Men (number) 8.0 (6.0, 10.0)
    Unknown 367
Time - Men (days) 20 (11, 39)
    Unknown 367
Cost - Men (% of income per capita) 15 (5, 44)
    Unknown 367
Procedures - Women (number) 8.0 (6.0, 11.0)
    Unknown 367
Time - Women (days) 20 (11, 39)
    Unknown 367
Cost - Women (% of income per capita) 15 (5, 44)
    Unknown 367
Procedures (number) 15.0 (12.0, 18.0)
    Unknown 667
Time (days) 161 (112, 219)
    Unknown 667
Cost (% of Warehouse value) 3 (1, 8)
    Unknown 667
Building quality control index (0-15) (DB16-20 methodology) 11.0 (8.0, 12.0)
    Unknown 2,333
Quality of building regulations index (0-2) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 83 (6.5%)
    0.385 3 (0.2%)
    0.5 38 (3.0%)
    1 240 (19%)
    1.5 51 (4.0%)
    1.55 3 (0.2%)
    1.78 2 (0.2%)
    2 825 (65%)
    Unknown 2,333
Quality control before construction index (0-1) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 155 (12%)
    1 1,090 (86%)
    Unknown 2,333
Quality control during construction index (0-3) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 191 (15%)
    1 124 (9.7%)
    1.35 5 (0.4%)
    2 824 (65%)
    2.53 3 (0.2%)
    3 98 (7.7%)
    Unknown 2,333
Quality control after construction index (0-3) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 55 (4.3%)
    2 281 (22%)
    2.17 6 (0.5%)
    2.39 6 (0.5%)
    2.65 4 (0.3%)
    2.77 6 (0.5%)
    3 887 (70%)
    Unknown 2,333
Liability and insurance regimes index (0-2) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 418 (33%)
    0.4 6 (0.5%)
    0.5 80 (6.3%)
    0.975 6 (0.5%)
    1 467 (37%)
    1.17 6 (0.5%)
    1.5 24 (1.9%)
    2 238 (19%)
    Unknown 2,333
Professional certifications index (0-4) (DB16-20 methodology)
    -9999 28 (2.2%)
    0 237 (19%)
    1 102 (8.0%)
    1.1 1 (<0.1%)
    2 267 (21%)
    3 128 (10%)
    3.32 6 (0.5%)
    3.47 5 (0.4%)
    4 499 (39%)
    Unknown 2,333
Procedures (number) 5.00 (4.00, 6.00)
    Unknown 1,382
Time (days) 83 (56, 120)
    Unknown 1,382
Cost (% of income per capita) 397 (79, 1,309)
    Unknown 1,382
Reliability of supply and transparency of tariff index (0-8) (DB16-20 methodolog 5.00 (0.00, 7.00)
    Unknown 2,333
Price of electricity (US cents per kWh) (DB16-20 methodology) 0.15 (0.11, 0.21)
    Unknown 2,333
Total duration and frequency of outages per customer a year (0-3) (DB16-20 metho 1.00 (0.00, 2.00)
    Unknown 2,333
System average interruption duration index (SAIDI) (DB16-20 methodology) 1 (-4, 6)
    Unknown 2,333
System average interruption frequency index (SAIFI) (DB16-20 methodology) 1 (-4, 5)
    Unknown 2,333
Minimum outage time (in minutes) (DB16-20 methodology) 3.00 (0.00, 5.00)
    Unknown 2,333
Mechanisms for monitoring outages (0-1) (DB16-20 methodology)
    -9999 20 (1.6%)
    0 335 (26%)
    1 918 (72%)
    Unknown 2,333
Mechanisms for restoring service (0-1) (DB16-20 methodology)
    -9999 20 (1.6%)
    0 372 (29%)
    1 881 (69%)
    Unknown 2,333
Regulatory monitoring (0-1) (DB16-20 methodology)
    -9999 20 (1.6%)
    0 330 (26%)
    0.6 6 (0.5%)
    1 917 (72%)
    Unknown 2,333
Financial deterrents aimed at limiting outages (0-1) (DB16-20 methodology)
    -9999 20 (1.6%)
    0 660 (52%)
    1 593 (47%)
    Unknown 2,333
Communication of tariffs and tariff changes (0-1) (DB16-20 methodology)
    -9999 20 (1.6%)
    0 312 (25%)
    0.77 2 (0.2%)
    1 939 (74%)
    Unknown 2,333
Procedures (number) 6.00 (5.00, 7.00)
    Unknown 512
Time (days) 39 (18, 69)
    Unknown 512
Cost (% of property value) 5.0 (2.5, 7.9)
    Unknown 512
Quality of land administration index (0-30) (DB17-20 methodology) 14 (8, 22)
    Unknown 2,544
Reliability of infrastructure index (0-8) (DB17-20 methodology) 4.61 (1.00, 7.00)
    Unknown 2,544
Transparency of information index (0-6) (DB17-20 methodology) 3.00 (1.50, 4.00)
    Unknown 2,544
Geographic coverage index (0-8) (DB17-20 methodology)
    -9999 20 (1.9%)
    0 494 (47%)
    2 76 (7.2%)
    2.34 5 (0.5%)
    4 240 (23%)
    6 25 (2.4%)
    8 202 (19%)
    Unknown 2,544
Land dispute resolution index (0-8) (DB17-20 methodology) 5.00 (3.50, 6.00)
    Unknown 2,544
Equal access to property rights index (-2-0) (DB17-20 methodology)
    -9999 20 (1.9%)
    -2 5 (0.5%)
    -1 66 (6.2%)
    0 971 (91%)
    Unknown 2,544
Strength of legal rights index (0-10) (DB05-14 methodology) 5.00 (3.00, 7.00)
    Unknown 1,785
Strength of legal rights index (0-12) (DB15-20 methodology) 5.0 (2.0, 7.0)
    Unknown 2,122
creditinformation6pnt
    0 443 (28%)
    1 141 (8.8%)
    2 141 (8.8%)
    3 120 (7.5%)
    4 234 (15%)
    5 326 (20%)
    6 205 (13%)
    Unknown 1,996
Depth of credit information index (0-6) (DB05-14 methodology)
    0 887 (49%)
    1 5 (0.3%)
    2 35 (1.9%)
    3 70 (3.8%)
    4 218 (12%)
    5 390 (21%)
    6 216 (12%)
    Unknown 1,785
Depth of credit information index (0-8) (DB15-20 methodology)
    0 400 (27%)
    1 8 (0.5%)
    2 31 (2.1%)
    3 18 (1.2%)
    4 35 (2.4%)
    5 88 (5.9%)
    6 237 (16%)
    7 381 (26%)
    8 286 (19%)
    Unknown 2,122
Credit registry coverage (% of adults) 0 (0, 8)
    Unknown 512
Credit bureau coverage (% of adults) 1 (0, 49)
    Unknown 512
tradeprocedures 8.00 (6.00, 9.00)
    Unknown 2,400
Time to export: Documentary compliance (hours) (DB16-20 methodology) 26 (4, 66)
    Unknown 2,333
Time to import: Documentary compliance (hours) (DB16-20 methodology) 37 (4, 96)
    Unknown 2,333
Time to export: Border compliance (hours) (DB16-20 methodology) 48 (17, 83)
    Unknown 2,333
Time to import: Border compliance (hours) (DB16-20 methodology) 56 (10, 100)
    Unknown 2,333
Cost to export: Documentary compliance (USD) (DB16-20 methodology) 86 (50, 160)
    Unknown 2,333
Cost to import: Documentary compliance (USD) (DB16-20 methodology) 100 (50, 183)
    Unknown 2,333
Cost to export: Border compliance (USD) (DB16-20 methodology) 335 (156, 547)
    Unknown 2,333
Cost to import: Border compliance (USD) (DB16-20 methodology) 386 (175, 660)
    Unknown 2,333
Documents to export (number) (DB06-15 methodology) 6.00 (5.00, 8.00)
    Unknown 1,740
Documents to import (number) (DB06-15 methodology) 7.00 (6.00, 9.00)
    Unknown 1,740
tradingexportcost 1,109 (810, 1,545)
    Unknown 1,729
Cost to export (US$ per container deflated) (DB06-15 methodology) 1,349 (968, 2,094)
    Unknown 1,729
Time to export (days) (DB06-15 methodology) 20 (13, 27)
    Unknown 1,729
tradingimportcost 1,297 (900, 1,910)
    Unknown 1,729
Cost to import (US$ per container deflated) (DB06-15 methodology) 1,560 (1,042, 2,547)
    Unknown 1,729
Time to import (days) (DB06-15 methodology) 21 (14, 33)
    Unknown 1,729
Filing and service (days) 30 (22, 50)
    Unknown 1,748
Trial and judgment (days) 365 (260, 488)
    Unknown 1,748
Enforcement of judgment (days) 180 (105, 260)
    Unknown 1,748
Attorney fees (% of claim) 18 (12, 24)
    Unknown 1,748
Court fees (% of claim) 5.2 (3.5, 7.8)
    Unknown 1,748
Enforcement fees (% of claim) 3.5 (1.2, 8.0)
    Unknown 1,748
Procedures (number) 38 (34, 42)
    Unknown 1,429
Time (days) 565 (447, 750)
    Unknown 367
Cost (% of claim) 28 (22, 39)
    Unknown 367
Quality of judicial processes index (0-18) (DB17-20 methodology) 8.0 (6.0, 10.5)
    Unknown 2,544
Court structure and proceedings (0-5) (DB17-20 methodology) 3.00 (2.50, 4.50)
    Unknown 2,544
Case management (0-6) (DB16-20 methodology) 1.50 (1.00, 3.00)
    Unknown 2,544
Court automation (0-4) (DB16-20 methodology) 0.50 (0.00, 2.00)
    Unknown 2,544
Alternative dispute resolution (0-3) (DB16-20 methodology)
    0 9 (0.8%)
    0.5 5 (0.5%)
    1 8 (0.8%)
    1.5 105 (9.9%)
    2 330 (31%)
    2.5 493 (46%)
    3 112 (11%)
    Unknown 2,544
Time (years) 2 (2, 3)
    Unknown 367
closingtime 2 (2, 3)
    Unknown 367
Recovery rate (cents on the dollar) 31 (18, 46)
    Unknown 367
Cost (% of estate) 12 (7, 20)
    Unknown 367
Outcome (0 as piecemeal sale and 1 as going concern)
    -9999 140 (4.3%)
    0 2,328 (72%)
    1 771 (24%)
    Unknown 367
Time to comply with VAT refund (hours) (DB17-20 methodology) 1 (-222, 12)
    Unknown 2,544
Time to obtain VAT refund (weeks) (DB17-20 methodology) 6 (-222, 27)
    Unknown 2,544
Time to comply with a corporate income tax correction (hours) (DB17-20 methodolo 7 (3, 17)
    Unknown 2,544
Time to complete a corporate income tax correction (weeks) (DB17-20 methodology) 0 (0, 18)
    Unknown 2,544
Payments (number per year) 28 (11, 40)
    Unknown 667
Time (hours per year) 221 (147, 317)
    Unknown 667
Total tax and contribution rate (% of profit) 40 (31, 50)
    Unknown 667
Profit tax (% of profit) 18 (10, 23)
    Unknown 667
Labor tax and contributions (% of profit) 14 (8, 24)
    Unknown 667
Other taxes (% of profit) 3 (1, 7)
    Unknown 667
¹ n (%); Median (IQR)
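
The summary above lists -9999 as a level of several indices and shows -222 inside the interquartile range of the VAT refund variables, which suggests that these values are placeholder codes rather than measurements. The following is a minimal sketch of recoding them to NA before summarising, assuming that reading is right (the source does not confirm it). The data frame `toy` and its values are made up for illustration; only the two column names come from the data.

```r
library(dplyr)

# Toy stand-in for DB; the values are invented for illustration only.
toy <- tibble(
  cod = c("AAA", "BBB", "CCC"),
  permits_SI_Beforeconstr = c(1, -9999, 0),
  taxesTimeObtainVATrefund = c(6, -222, 27)
)

# ASSUMPTION: -9999 and -222 are placeholder codes, not real measurements.
sentinels <- c(-9999, -222)

# Replace the placeholder codes with NA in every numeric column.
toy_clean <- toy %>%
  mutate(across(where(is.numeric), ~ replace(.x, .x %in% sentinels, NA)))

toy_clean
```

If the assumption holds for the real data, the same `mutate()` call could be applied to `DB` before `tbl_summary()`, so that the placeholder codes are counted as Unknown instead of distorting the medians and level counts.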

4 Select the necessary columns

# Display the names of columns
names(DB)
##   [1] "cod"                            "economy"                       
##   [3] "region"                         "incomegroup"                   
##   [5] "dbyear"                         "startbuscapital"               
##   [7] "investorsdisclosure"            "investorsliability"            
##   [9] "investorssuitsOld"              "investorstransparency"         
##  [11] "investorsrights"                "investorsownership"            
##  [13] "investorssuits"                 "investorsProtectingInv"        
##  [15] "investorsProtectingMinInves"    "closinglegal"                  
##  [17] "closing_SI_Commencement"        "closing_SI_ManagementAssets"   
##  [19] "closing_SI_Reorganization"      "closing_SI_CreditorParticipate"
##  [21] "startbusprocedures"             "startbustime"                  
##  [23] "startbuscost"                   "startbusproceduresW"           
##  [25] "startbustimeW"                  "startbuscostW"                 
##  [27] "permitprocedures"               "permittime"                    
##  [29] "permitcostWhV"                  "permitsLegalindex"             
##  [31] "permits_SI_Buildregulation"     "permits_SI_Beforeconstr"       
##  [33] "permits_SI_Duringconst"         "permits_SI_Afterconst"         
##  [35] "permits_SI_Liability"           "permits_SI_Certifications"     
##  [37] "electricityprocedures"          "electricitytime"               
##  [39] "electricitycost"                "electricitySAIDISAIFIscore"    
##  [41] "electricityPriceperkWhUSD"      "electricitySaidiSaifiPoints"   
##  [43] "electricitySaidiIndex"          "electricitySaifiIndex"         
##  [45] "electricityMinimumOutage"       "electricityOutagemoniscore"    
##  [47] "electricityOutagerestoscore"    "electricityRegulatormoniscore" 
##  [49] "electricityCompensationscore"   "electricityTransparencyscore"  
##  [51] "registerprocedures"             "registertime"                  
##  [53] "registercost"                   "registerLegalindexGener"       
##  [55] "registerReliabilityInfrast"     "registerTransparency"          
##  [57] "registerCoverage"               "registerLegalBackground"       
##  [59] "registerWEBEqualAccess"         "creditrights"                  
##  [61] "creditrights_new"               "creditinformation6pnt"         
##  [63] "creditinformation6pnt5pr"       "creditinformation8pnt5pr"      
##  [65] "creditinformationCoverPublic"   "creditinformationCoverPrivate" 
##  [67] "tradeprocedures"                "tradeXtimedocs"                
##  [69] "tradeMtimedocs"                 "tradeXtimeborder"              
##  [71] "tradeMtimeborder"               "tradeXcostdocs"                
##  [73] "tradeMcostdocs"                 "tradeXcostborder"              
##  [75] "tradeMcostborder"               "tradingexportdocs"             
##  [77] "tradingimportdocs"              "tradingexportcost"             
##  [79] "tradingexportcost_def"          "tradingexporttime"             
##  [81] "tradingimportcost"              "tradingimportcost_def"         
##  [83] "tradingimporttime"              "contracts_filing_days"         
##  [85] "contracts_trial_days"           "contracts_enforcement_days"    
##  [87] "contracts_attorney_fee"         "contracts_court_fee"           
##  [89] "contracts_enforcement_fee"      "contractsproceduresBunus"      
##  [91] "contractstime"                  "contractscost"                 
##  [93] "contractsLegalindexGender"      "contracts_SI_CourtStructureGen"
##  [95] "contracts_SI_CaseManagement"    "contracts_SI_CourtAutomation"  
##  [97] "contracts_SI_ADR"               "closingtime_web"               
##  [99] "closingtime"                    "closingrecovery"               
## [101] "closingcostpercentage"          "closingOutcome"                
## [103] "taxesTimeComplyVATrefund"       "taxesTimeObtainVATrefund"      
## [105] "taxesTimeComplyCITaudit"        "taxesTimeCompleteCITaudit"     
## [107] "taxespayments"                  "taxestime"                     
## [109] "taxestotal"                     "taxesTTRprofit"                
## [111] "taxesTTRlabor"                  "taxesTTROther"
# Keep cod, dbyear, startbustime, and taxestotal
DB <- DB[, c("cod", "dbyear", "startbustime", "taxestotal")]
# Rename "dbyear" to "year"
names(DB)[names(DB) == "dbyear"] <- "year"
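
The selection and the rename can also be written as a single dplyr step, since `select()` accepts `new_name = old_name` pairs. This is only a sketch on a made-up data frame (`toy` is hypothetical; the column names mirror the ones kept above), not a replacement for the base R code.

```r
library(dplyr)

# Toy stand-in for DB with the relevant columns plus one that is dropped.
toy <- data.frame(
  cod = c("AAA", "BBB"),
  economy = c("Economy A", "Economy B"),
  dbyear = c(2019, 2020),
  startbustime = c(10, 12),
  taxestotal = c(40, 35)
)

# Keep four columns and rename dbyear to year in the same call.
toy_small <- toy %>%
  select(cod, year = dbyear, startbustime, taxestotal)

names(toy_small)
```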

5 Write the data as CSV using rio package

export(DB, "Data_output/DB.csv")
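
A quick way to confirm that the export worked is to read the file back and compare it with the object in memory. The sketch below does this on a toy data frame and writes to a temporary file, so it does not touch `Data_output/DB.csv`; `toy` and its values are hypothetical.

```r
library(rio)

# Toy stand-in for the reduced DB data frame.
toy <- data.frame(
  cod = c("AAA", "BBB", "CCC"),
  year = c(2018, 2019, 2020),
  startbustime = c(10.5, NA, 12),
  taxestotal = c(40, 35.2, NA)
)

# Write to a temporary CSV file and read it back.
csv_path <- tempfile(fileext = ".csv")
export(toy, csv_path)
back <- import(csv_path)

# The round trip should preserve the dimensions and the column names.
dim(back)
names(back)
```

A CSV file stores only the values and column names, so variable labels attached to the columns are not written to it.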

6 Distribution of the observations

library(ggplot2)

# Plot the number of observations per year
DB %>%
  count(year) %>%
  ggplot(aes(x = year, y = n)) +
  geom_col() +
  labs(title = "Distribution of the observations by year",
       x = "Year",
       y = "Number of observations")
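
The bar chart counts rows per year, including rows where the indicators are missing. A minimal sketch of tabulating how many non-missing values each indicator has per year is given below; `toy` is a made-up data frame with the same column names as the reduced `DB`.

```r
library(dplyr)

# Toy stand-in for the reduced DB data frame; values are invented.
toy <- tibble(
  year = c(2004, 2004, 2005, 2005, 2005),
  startbustime = c(NA, 20, 15, NA, 30),
  taxestotal = c(NA, NA, 40, 35, NA)
)

# Rows per year, and non-missing values per indicator and year.
coverage <- toy %>%
  group_by(year) %>%
  summarise(
    n = n(),
    n_startbustime = sum(!is.na(startbustime)),
    n_taxestotal = sum(!is.na(taxestotal))
  )

coverage
```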

7 Describe the data again

library(gtsummary)
DB %>%
  select(-cod) %>%
  tbl_summary()
Characteristic N = 3,606¹
DB Year 2,012.0 (2,008.0, 2,016.0)
Time - Men (days) 20 (11, 39)
    Unknown 367
Total tax and contribution rate (% of profit) 40 (31, 50)
    Unknown 667
¹ Median (IQR)