STATA Assignment - Group7

Research report 1_Computing: COMH7189A/7198A/7294A/7308A

Authors

Vusumuzi Mabasa

Thandeka Mchunu

Nontobeko Mnisi

Phindile Stowe

Prosperity Hadebe

Shantel Mphogo


Connecting STATA to RStudio using Quarto

options(scipen = 999)
library(Statamarkdown)
Warning: package 'Statamarkdown' was built under R version 4.4.2
Stata found at C:/Program Files/Stata18/StataSE-64.exe
The 'stata' engine is ready to use.
stataexe <- "C:/Program Files/Stata18/StataSE-64.exe"

knitr::opts_chunk$set(engine.path = list(stata=stataexe))

Executing STATA commands in Quarto

Setting the working directory

cd "C:\Users\VUSI\Downloads\Group 7 Biostats Assignment"
C:\Users\VUSI\Downloads\Group 7 Biostats Assignment

Checking if we are working in the correct working directory

pwd
C:\Users\VUSI\Downloads\Group 7 Biostats Assignment

Data pre-processing and manipulation (Data cleaning)

Reading the data-set generated using REDCap and inspecting it before merging


use "demo.dta", clear

codebook
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


-------------------------------------------------------------------------------
record_id                                                             Record ID
-------------------------------------------------------------------------------

                  Type: Numeric (byte)

                 Range: [1,31]                        Units: 1
         Unique values: 24                        Missing .: 0/24

                  Mean: 15.2083
             Std. dev.: 9.25005

           Percentiles:     10%       25%       50%       75%       90%
                              3       7.5        15        23        27

-------------------------------------------------------------------------------
redcap_event_name                                                    Event Name
-------------------------------------------------------------------------------

                  Type: String (str24)

         Unique values: 1                         Missing "": 0/24

            Tabulation: Freq.  Value
                           24  "demographic_inform_arm_1"

-------------------------------------------------------------------------------
redcap_survey_identifier                                      Survey Identifier
-------------------------------------------------------------------------------

                  Type: Numeric (byte)

                 Range: [.,.]                         Units: .
         Unique values: 0                         Missing .: 24/24

            Tabulation: Freq.  Value
                           24  .

-------------------------------------------------------------------------------
demographic_informat_v_0                                       Survey Timestamp
-------------------------------------------------------------------------------

                  Type: String (str19)

         Unique values: 23                        Missing "": 0/24

              Examples: "2025-02-08 21:02:10"
                        "2025-02-09 20:40:32"
                        "2025-02-10 16:56:02"
                        "2025-02-11 14:32:22"

               Warning: Variable has embedded blanks.

-------------------------------------------------------------------------------
dob                                                               Date of birth
-------------------------------------------------------------------------------

                  Type: Numeric daily date (float)

                 Range: [10894,23875]                 Units: 1
       Or equivalently: [29oct1989,14may2025]         Units: days
         Unique values: 24                        Missing .: 0/24

                  Mean:   15287 = 08nov2001(+ 1 hour)
             Std. dev.: 3164.95
           Percentiles:       10%        25%        50%        75%        90%
                            12072    13174.5      15629    15883.5      18295
                        19jan1993  26jan1996  16oct2002  27jun2003  02feb2010

-------------------------------------------------------------------------------
consent_date                                                      Consent date.
-------------------------------------------------------------------------------

                  Type: Numeric daily date (float)

                 Range: [20005,23785]                 Units: 1
       Or equivalently: [09oct2014,13feb2025]         Units: days
         Unique values: 8                         Missing .: 0/24

            Tabulation: Freq.  Value
                            1  20005  09oct2014
                            1  21663  24apr2019
                            1  23779  07feb2025
                            5  23780  08feb2025
                            4  23781  09feb2025
                            2  23782  10feb2025
                            5  23783  11feb2025
                            5  23785  13feb2025

-------------------------------------------------------------------------------
age                                                                         Age
-------------------------------------------------------------------------------

                  Type: Numeric (byte)

                 Range: [0,35]                        Units: 1
         Unique values: 14                        Missing .: 0/24

                  Mean:      23
             Std. dev.: 7.72348

           Percentiles:     10%       25%       50%       75%       90%
                             15        21        22      28.5        32

-------------------------------------------------------------------------------
edu_level                              What is your highest level of education?
-------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: edu_level_

                 Range: [1,3]                         Units: 1
         Unique values: 3                         Missing .: 0/24

            Tabulation: Freq.   Numeric  Label
                            1         1  Primary school
                            5         2  High school
                           18         3  Tertiary education

-------------------------------------------------------------------------------
employ_status                           What is your current employment status?
-------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: employ_status_

                 Range: [1,5]                         Units: 1
         Unique values: 5                         Missing .: 0/24

            Tabulation: Freq.   Numeric  Label
                            5         1  Full-time
                            3         2  Part-time
                            7         3  Unemployed
                            3         4  Self-employed
                            6         5  Student

-------------------------------------------------------------------------------
monthly_income         What is your current household income (monthly, in ZAR)?
-------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: monthly_income_

                 Range: [1,5]                         Units: 1
         Unique values: 5                         Missing .: 0/24

            Tabulation: Freq.   Numeric  Label
                           10         1  Less than R3,500
                            1         2  R3, 500-R7,000
                            4         3  R7, 001-R15,000
                            4         4  R15, 001-R30,000
                            5         5  More than R30,000

-------------------------------------------------------------------------------
preg_complications                     Did you experience any pregnancy
                                       complications? (e.g., gestational
                                       diabetes, pre
-------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: preg_complications_

                 Range: [0,1]                         Units: 1
         Unique values: 2                         Missing .: 2/24

            Tabulation: Freq.   Numeric  Label
                           21         0  No
                            1         1  Yes
                            2         .  

-------------------------------------------------------------------------------
yes_specify                                             If yes, please specify.
-------------------------------------------------------------------------------

                  Type: String (str16)

         Unique values: 1                         Missing "": 23/24

            Tabulation: Freq.  Value
                           23  ""
                            1  "Placenta Previa "

               Warning: Variable has embedded and trailing blanks.

-------------------------------------------------------------------------------
demographic_informat_v_1                                              Complete?
-------------------------------------------------------------------------------

                  Type: Numeric (byte)
                 Label: demographic_informat_v_1_

                 Range: [2,2]                         Units: 1
         Unique values: 1                         Missing .: 0/24

            Tabulation: Freq.   Numeric  Label
                           24         2  Complete
. use "demo.dta"
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)

. merge 1:1  record_id  using "baseline.dta"

    Result                      Number of obs
    -----------------------------------------
    Not matched                             3
        from master                         3  (_merge==1)
        from using                          0  (_merge==2)

    Matched                                21  (_merge==3)
    -----------------------------------------

. drop if _merge==1
(3 observations deleted)

. save "Merged.dta", replace
file Merged.dta saved
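
A 1:1 merge requires record_id to identify each row uniquely in both files. The following is a minimal sketch of how that could be checked before merging, and how the unmatched records could be inspected before they are dropped. It reuses the file and variable names from the log above. Keeping only matched records gives the same result as the drop used above only because no records came from the using file alone.

```stata
* Sketch only: check the merge key, merge, then inspect the result
use "demo.dta", clear
isid record_id                        // stops with an error if record_id is not unique

merge 1:1 record_id using "baseline.dta"
tabulate _merge                       // counts of master-only, using-only and matched records
list record_id if _merge == 1         // demographic records with no baseline record

keep if _merge == 3                   // keep matched records only
```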


Describing the follow-up data-set to be merged with the merged data-set above

use Follow_up, clear

describe
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1528.csv)


Contains data from Follow_up.dta
 Observations:            25                  HonsMSc20257_DATA_NOHDRS_2025-0
                                                2-14_1528.csv
    Variables:            18                  14 Feb 2025 17:19
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
record_id       byte    %8.0g                 Record ID
redcap_event_~e str22   %22s                  Event Name
redcap_survey~r byte    %8.0g                 Survey Identifier
followup_time~p str19   %19s                  Survey Timestamp
baby_current_~t float   %9.0g                 What is your babys current weight
                                                (kg)?
baby_current_~h float   %9.0g                 What is your babys current length
                                                (cm)?
baby_bmi2       float   %9.0g                 Babys BMI
recent_growth~s byte    %8.0g      recent_growth_issues_
                                              Has your baby been diagnosed with
                                                growth-related issues?
yes_issues      byte    %8.0g                 If yes, please specify.
healthcare_vi~s byte    %17.0g     healthcare_visits_
                                              How many times has your baby
                                                visited a healthcare facility
                                                for a check-up since
feed_3month     byte    %39.0g     feed_3month_
                                              How is your baby currently fed?
feed_perday_3~h byte    %9.0g      feed_perday_3month_
                                              If breastfeeding, how many times
                                                per day does your baby feed?
bottles_perda~h byte    %11.0g     bottles_perday_3month_
                                              If formula feeding, how many
                                                bottles per day does your baby
                                                consume?
complementary~h byte    %8.0g      complementary_food_3month_
                                              Have you introduced any
                                                complementary foods (e.g.,
                                                porridge, purees)?
latest_illnes~h byte    %47.0g     latest_illness_3month_
                                              Has your baby experienced any of
                                                the following since the last
                                                survey? (Check all
hospitalizati~h byte    %8.0g      hospitalization_3month_
                                              Has your baby been hospitalized
                                                since the last survey?
maternal_meals  byte    %8.0g      maternal_meals_
                                              How many meals do you eat per
                                                day?
followup_comp~e byte    %10.0g     followup_complete_
                                              Complete?
-------------------------------------------------------------------------------
Sorted by: 

Removing duplicates for smooth merging

use Follow_up, clear
duplicates drop record_id, force

save Follow_upD.dta, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1528.csv)


Duplicates in terms of record_id

(12 observations deleted)

file Follow_upD.dta saved
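
Because duplicates drop with the force option keeps only the first observation of each record_id and discards the rest, it can help to look at the duplicated records first. A minimal sketch, using the file and variables shown by describe above (dup is a hypothetical helper variable, and followup_time~p is the abbreviated name exactly as describe prints it):

```stata
use Follow_up, clear
duplicates report record_id                 // how many copies each record_id has
duplicates tag record_id, generate(dup)     // dup = number of extra copies of each record_id
sort record_id
list record_id followup_time~p if dup > 0, sepby(record_id)
```

If the copies of a record differ, for example in their timestamps, a rule such as keeping the latest submission may be preferable to keeping whichever copy happens to come first.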

Merging in the de-duplicated follow-up data-set. After this step the entire data-set has been cleaned and merged and is ready for subsequent analysis


use Merged.dta, clear

merge 1:1 record_id using "Follow_upD", nogenerate

save "Merged1.dta", replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)

    Result                      Number of obs
    -----------------------------------------
    Not matched                             8
        from master                         8  
        from using                          0  

    Matched                                13  
    -----------------------------------------

file Merged1.dta saved

Summarizing the missing values in the data


use "Merged1", clear

misstable summarize
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)

                                                               Obs<.
                                                +------------------------------
               |                                | Unique
      Variable |     Obs=.     Obs>.     Obs<.  | values        Min         Max
  -------------+--------------------------------+------------------------------
  redcap_sur~r |        21                   0  |      0          .           .
  preg_compl~s |         2                  19  |      2          0           1
  weigh_faci~y |         1                  20  |      1          1           1
   last_weight |         4                  17  |     17        2.9        66.6
   last_length |         5                  16  |     13         10          96
      baby_bmi |         4                  17  |     17   9.608708         290
  yes_select~s |        20                   1  |      1          1           1
  maternal_d~t |         1                  20  |      4          1           4
   clean_water |         1                  20  |      2          0           1
  mental_hea~s |         1                  20  |      2          0           1
  baby_curre~t |         8                  13  |     12        3.5          70
  baby_curre~h |         8                  13  |     11      25.99          98
     baby_bmi2 |         8                  13  |     13     6.6482    458.7848
  recent_gro~s |         8                  13  |      1          0           0
    yes_issues |        21                   0  |      0          .           .
  healthcare~s |         8                  13  |      4          1           4
   feed_3month |         8                  13  |      3          2           4
  feed_perda~h |        20                   1  |      1          2           2
  bottles_pe~h |        20                   1  |      1          2           2
  complement~h |         8                  13  |      2          0           1
  latest_ill~h |         8                  13  |      4          3           6
  hospitaliz~h |         8                  13  |      2          0           1
  maternal_m~s |         8                  13  |      2          2           3
  followup_c~e |         8                  13  |      1          2           2
  -----------------------------------------------------------------------------
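
The table above also shows BMI values as large as 290 (baby_bmi) and 458.78 (baby_bmi2). One way to examine such values is to recompute BMI from the recorded weight and length and compare it with the stored value. The sketch below assumes last_weight is in kilograms (its label does not state a unit) and uses an arbitrary tolerance of 0.5; bmi_check and bmi_flag are hypothetical variable names.

```stata
use "Merged1", clear

* BMI = weight (kg) / [length (m)]^2; last_length is recorded in cm
generate bmi_check = last_weight / (last_length / 100)^2

* Flag records where the stored BMI differs from the recomputed BMI
generate byte bmi_flag = abs(baby_bmi - bmi_check) > 0.5 if !missing(baby_bmi, bmi_check)

list record_id last_weight last_length baby_bmi bmi_check if bmi_flag == 1
```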

Deliberately corrupting the data by re-creating duplicates and introducing errors


use "Merged1", clear
expand 2 if _n <= 10 


* Left commented out: record_id is numeric, so appending a blank to it would fail with a type mismatch
* replace record_id = record_id + " " if mod(_n, 5) == 0


replace redcap_event_name = "wrong_event" if mod(_n, 7) == 0  

save merged_data_with_errors.dta, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)

(10 observations created)

(4 real changes made)

file merged_data_with_errors.dta saved

Cleaning the errors

use merged_data_with_errors, clear

duplicates drop record_id redcap_event_name, force


* Restore the event name that every record carried before the errors were introduced
replace redcap_event_name = "demographic_inform_arm_1" if redcap_event_name == "wrong_event"


replace redcap_event_name = trim(redcap_event_name) 

save merged_data_clean.dta, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


Duplicates in terms of record_id redcap_event_name

(10 observations deleted)

(3 real changes made)

(0 real changes made)

file merged_data_clean.dta saved

Using the clean data-set, we now format the dates according to STATA's date conventions.

use merged_data_clean, clear

gen dob_str = string(dob, "%td")


gen dob_date = date(dob_str, "DMY")


format dob_date %td
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)
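
The codebook above reports dob and consent_date as numeric daily dates, so the round trip through a string is not strictly needed: changing the display format is enough. The sketch below shows that shorter route and adds a hypothetical check variable, age_check, that derives approximate completed years between birth and consent for comparison with the recorded age.

```stata
use merged_data_clean, clear

format dob consent_date %td            // display as e.g. 29oct1989

* Hypothetical check: approximate completed years between birth and consent
generate age_check = floor((consent_date - dob) / 365.25)

list record_id dob consent_date age age_check if age != age_check
```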

Converting the variable baby_bmi from a string variable to a numeric variable (describe shows that it is already stored as float, so destring makes no change)

use merged_data_clean, clear
describe
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


Contains data from merged_data_clean.dta
 Observations:            21                  HonsMSc20257_DATA_NOHDRS_2025-0
                                                2-14_1523.csv
    Variables:            45                  8 Mar 2025 09:51
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
record_id       byte    %8.0g                 Record ID
redcap_event_~e str24   %24s                  Event Name
redcap_survey~r byte    %8.0g                 Survey Identifier
demographic_i~0 str19   %19s                  Survey Timestamp
dob             float   %dM_d,_CY             Date of birth
consent_date    float   %dM_d,_CY             Consent date.
age             byte    %8.0g                 Age
edu_level       byte    %18.0g     edu_level_
                                              What is your highest level of
                                                education?
employ_status   byte    %13.0g     employ_status_
                                              What is your current employment
                                                status?
monthly_income  byte    %17.0g     monthly_income_
                                              What is your current household
                                                income (monthly, in ZAR)?
preg_complica~s byte    %8.0g      preg_complications_
                                              Did you experience any pregnancy
                                                complications? (e.g.,
                                                gestational diabetes, pre
yes_specify     str16   %16s                  If yes, please specify.
demographic_i~1 byte    %10.0g     demographic_informat_v_1_
                                              Complete?
baseline_info~p str19   %19s                  Survey Timestamp
feed_baseline   byte    %18.0g     feed_baseline_
                                              How is your baby currently fed?
feed_per_day    byte    %9.0g      feed_per_day_
                                              How often does your baby feed per
                                                day?
solid_foods     byte    %8.0g      solid_foods_
                                              Have you introduced any solid
                                                foods to your baby?
nutrition_cou~l byte    %8.0g      nutrition_counsel_
                                              Do you have access to nutritional
                                                counseling?
weigh_facility  byte    %8.0g      weigh_facility_
                                              Has your baby been weighed at a
                                                healthcare facility since
                                                birth?
last_weight     float   %9.0g                 What was your babys last recorded
                                                weight?
last_length     float   %9.0g                 What was your babys last recorded
                                                length (in cm)?
baby_bmi        float   %9.0g                 Babys BMI.
condition_aff~h byte    %8.0g      condition_affecting_growth_
                                              Does your baby have any diagnosed
                                                medical conditions affecting
                                                growth?
yes_select_op~s byte    %21.0g     yes_select_options_
                                              If yes, please select the
                                                following options.
other_specify   str9    %9s                   If other, please specify.
maternal_diet   byte    %121.0g    maternal_diet_
                                              What is your daily diet like?
clean_water     byte    %8.0g      clean_water_
                                              Do you have access to clean
                                                drinking water?
mental_health~s byte    %8.0g      mental_health_concerns_
                                              Have you experienced any mental
                                                health concerns since giving
                                                birth?
baseline_info~e byte    %10.0g     baseline_information_complete_
                                              Complete?
_merge          byte    %23.0g     _merge     Matching result from merge
followup_time~p str19   %19s                  Survey Timestamp
baby_current_~t float   %9.0g                 What is your babys current weight
                                                (kg)?
baby_current_~h float   %9.0g                 What is your babys current length
                                                (cm)?
baby_bmi2       float   %9.0g                 Babys BMI
recent_growth~s byte    %8.0g      recent_growth_issues_
                                              Has your baby been diagnosed with
                                                growth-related issues?
yes_issues      byte    %8.0g                 If yes, please specify.
healthcare_vi~s byte    %17.0g     healthcare_visits_
                                              How many times has your baby
                                                visited a healthcare facility
                                                for a check-up since
feed_3month     byte    %39.0g     feed_3month_
                                              How is your baby currently fed?
feed_perday_3~h byte    %9.0g      feed_perday_3month_
                                              If breastfeeding, how many times
                                                per day does your baby feed?
bottles_perda~h byte    %11.0g     bottles_perday_3month_
                                              If formula feeding, how many
                                                bottles per day does your baby
                                                consume?
complementary~h byte    %8.0g      complementary_food_3month_
                                              Have you introduced any
                                                complementary foods (e.g.,
                                                porridge, purees)?
latest_illnes~h byte    %47.0g     latest_illness_3month_
                                              Has your baby experienced any of
                                                the following since the last
                                                survey? (Check all
hospitalizati~h byte    %8.0g      hospitalization_3month_
                                              Has your baby been hospitalized
                                                since the last survey?
maternal_meals  byte    %8.0g      maternal_meals_
                                              How many meals do you eat per
                                                day?
followup_comp~e byte    %10.0g     followup_complete_
                                              Complete?
-------------------------------------------------------------------------------
Sorted by: 

use merged_data_clean, clear
destring baby_bmi, replace
destring age, replace

dtable age preg_complications
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)

baby_bmi already numeric; no replace

age already numeric; no replace


-----------------------------------------------------------------------------------------------
                                                                                     Summary   
-----------------------------------------------------------------------------------------------
N                                                                                            21
Age                                                                              22.857 (7.663)
Did you experience any pregnancy complications? (e.g., gestational diabetes, pre  0.053 (0.229)
-----------------------------------------------------------------------------------------------
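
preg_complications is coded 0/1, so the table above reports its mean and standard deviation as if it were continuous. A sketch of how it could be summarised as counts and percentages instead, assuming the same dtable command used above, where the i. prefix marks a factor variable:

```stata
use merged_data_clean, clear
dtable age i.preg_complications          // age as mean (SD), complications as n (%)
tabulate preg_complications, missing     // also shows the records with a missing answer
```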

Creating a logistic regression results table and saving it in the working directory

asdoc logistic preg_complications age 
Logistic regression                                     Number of obs =     22
                                                        LR chi2(1)    =   0.86
                                                        Prob > chi2   = 0.3532
Log likelihood = -3.6370114                             Pseudo R2     = 0.1059

------------------------------------------------------------------------------
preg_compl~s | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   1.180138   .2443618     0.80   0.424     .7864683     1.77086
       _cons |   .0006154   .0037186    -1.22   0.221     4.42e-09    85.67737
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.
(file Myfile.doc not found)
Click to Open File:  Myfile.doc
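
asdoc is a community-contributed command, and the chunk above does not load a data-set before fitting the model. The sketch below makes both steps explicit. The output file name logit_results.doc is hypothetical, and the save() and replace options should be checked against help asdoc for the installed version. Because it loads merged_data_clean, its estimates may differ from the output above, which reports 22 observations.

```stata
* ssc install asdoc                     // one-off installation, if not yet installed
use merged_data_clean, clear
asdoc logistic preg_complications age, save(logit_results.doc) replace
```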

Data Visualization in QUARTO using STATA commands


use merged_data_clean, clear
scatter last_length last_weight, mcolor(pink) xtitle("Weight (kg)") ytitle("Length (cm)") title("Length vs. Weight")

graph export "scatter.png",as(png) replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file scatter.png saved as PNG format
use merged_data_clean, clear
scatter last_length last_weight, mcolor(pink) xtitle("Weight (kg)") ytitle("Length (cm)") title("Length vs. Weight")
graph save graph1.gph, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file graph1.gph saved
knitr::include_graphics("scatter.png")
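
The scatter plot above is drawn twice, once to export the PNG and once to save the .gph file. Since graph save and graph export both act on the graph currently in memory, drawing it once should be enough. A sketch using the same file names:

```stata
use merged_data_clean, clear
scatter last_length last_weight, mcolor(pink) ///
    xtitle("Weight (kg)") ytitle("Length (cm)") title("Length vs. Weight")
graph save graph1.gph, replace               // used later by graph combine
graph export "scatter.png", as(png) replace  // used by knitr::include_graphics()
```

The same pattern would apply to the box plot, bar graph and pie chart.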

use merged_data_clean, clear
graph box last_weight, over(feed_baseline) box(1, fcolor(red)) ytitle("Weight (kg)") title("Box-Plot: Weight by Feeding type") medtype(line) medline(lcolor(black))

graph export "boxplot.png", as(png) replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file boxplot.png saved as PNG format
use merged_data_clean, clear
graph box last_weight, over(feed_baseline) box(1, fcolor(red)) ytitle("Weight (kg)") title("Box-Plot: Weight by Feeding type") medtype(line) medline(lcolor(black))
graph save graph2.gph, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file graph2.gph saved
knitr::include_graphics("boxplot.png")

use merged_data_clean, clear
graph bar (mean) last_weight, over(feed_baseline) bar(1, fcolor(orange)) title("Mean Weight by Feeding Type") ytitle("Mean Weight (kg)")
graph export bargraph.png, as(png) replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file bargraph.png saved as PNG format

use merged_data_clean, clear
graph bar (mean) last_weight, over(feed_baseline) bar(1, fcolor(orange)) title("Mean Weight by Feeding Type") ytitle("Mean Weight (kg)")
graph save graph3.gph, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file graph3.gph saved
plot3 <- knitr::include_graphics("bargraph.png")
plot3

use merged_data_clean, clear
graph pie, over(edu_level) plabel(_all percent, color(black)) ///
  pie(1, color(cranberry)) pie(2, color(yellow)) pie(3, color(lime)) ///
  title("Educational Level") legend(on)
graph export "piechart.png", as(png) replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file piechart.png saved as PNG format

use merged_data_clean, clear
graph pie, over(edu_level) plabel(_all percent, color(black)) ///
  pie(1, color(cranberry)) pie(2, color(yellow)) pie(3, color(lime)) ///
  title("Educational Level") legend(on)
graph save graph4.gph, replace
(HonsMSc20257_DATA_NOHDRS_2025-02-14_1523.csv)


file graph4.gph saved
plot4 <- knitr::include_graphics("piechart.png")
plot4

graph combine graph1.gph graph2.gph ///
              graph3.gph graph4.gph, ///
              title("Combined Graphs") cols(2)
              
graph export Combined_GraphsG7.png, as(png) replace
file Combined_GraphsG7.png saved as PNG format
knitr::include_graphics("Combined_GraphsG7.png")

Spatial Analysis - Visualization in Quarto using STATA commands

ssc install shp2dta
ssc install spmap
checking shp2dta consistency and verifying not already installed...
all files already exist and are up to date.

checking spmap consistency and verifying not already installed...
all files already exist and are up to date.

Converting the shapefiles into dta files


clear
shp2dta using "boundaries.shp", database(provinces) coordinates(zacoordinates) genid(id) replace
type: 5

Reading the data

use provinces, clear

describe
Contains data from provinces.dta
 Observations:            62                  
    Variables:            11                  8 Mar 2025 09:56
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
area_level      byte    %10.0g                area_level
area_lev_1      str8    %9s                   area_lev_1
area_id         str10   %10s                  area_id
area_name       str25   %25s                  area_name
parent_are      str9    %9s                   parent_are
spectrum_r      byte    %10.0g                spectrum_r
area_sort_      byte    %10.0g                area_sort_
center_x        double  %10.0g                center_x
center_y        double  %10.0g                center_y
name            str10   %10s                  name
id              byte    %12.0g                
-------------------------------------------------------------------------------
Sorted by: id

Importing the Spectrum calibration data to be merged with the shapefile data

import delimited "spectrum_calibration", clear
save "spectrum.dta", replace
(encoding automatically selected: ISO-8859-1)
(15 vars, 1,836 obs)

file spectrum.dta saved
use Spectrum, clear

describe
Contains data from Spectrum.dta
 Observations:         1,836                  
    Variables:            15                  8 Mar 2025 09:56
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
spectrum_reg~de byte    %8.0g                 
spectrum_reg~me str13   %13s                  
sex             str6    %9s                   
age_group       str8    %9s                   
calendar_quar~r str8    %9s                   
population_sp~m float   %9.0g                 
plhiv_spectrum  float   %9.0g                 
art_current_s~m float   %9.0g                 
infections_sp~m float   %9.0g                 
unaware_spect~m float   %9.0g                 
births_hivpop~m float   %9.0g                 
births_artpop~m float   %9.0g                 
time_step       str8    %9s                   
population_raw  float   %9.0g                 
population_ca~d float   %9.0g                 
-------------------------------------------------------------------------------
Sorted by: 

use Spectrum, clear

rename spectrum_reg~me area_name

save Spectrum1, replace
file Spectrum1.dta saved

Data pre-processing before merging

use provinces, clear
duplicates drop area_name, force

save ProvD.dta, replace
Duplicates in terms of area_name

(0 observations are duplicates)

file ProvD.dta saved
use Spectrum1, clear
duplicates drop area_name, force

save SpectrumD.dta, replace
Duplicates in terms of area_name

(1,827 observations deleted)

file SpectrumD.dta saved
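
The duplicates drop step above removed 1,827 of the 1,836 Spectrum rows, leaving one row per area. Because the file also carries sex, age_group and a calendar quarter, the retained row is the first one encountered for each area, and its values are what the maps will later display. A minimal sketch for checking which stratum that row belongs to, assuming the variable names shown in the describe output (row and first_row are helper names introduced here for illustration only):

```stata
use Spectrum1, clear

* Which strata sit behind each repeated area_name?
tabulate sex
tabulate age_group
tabulate calendar_quar~r

* Record the original row order so the within-area ordering is deterministic
generate long row = _n

* Flag the first row of each area in the original order, i.e. the row that
* "duplicates drop area_name, force" retains
bysort area_name (row): generate byte first_row = (_n == 1)
list area_name sex age_group calendar_quar~r if first_row, noobs
```

If the listed stratum is not the intended one, an explicit keep if condition on sex, age_group and calendar quarter before saving would make the choice visible in the code.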

Merging the datasets for spatial analysis


use ProvD, clear
merge 1:1 area_name using "SpectrumD.dta"

save map1.dta, replace
    Result                      Number of obs
    -----------------------------------------
    Not matched                            53
        from master                        53  (_merge==1)
        from using                          0  (_merge==2)

    Matched                                 9  (_merge==3)
    -----------------------------------------

file map1.dta saved
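
Only 9 of the 62 shapefile records matched a Spectrum row. One plausible reading, which the output above does not confirm, is that the shapefile holds several area levels while the Spectrum data cover a single level. A short check, sketched with variables already present in map1:

```stata
use map1, clear

* Cross-tabulate the merge result against the area level from the shapefile
tabulate area_level _merge

* List the matched areas to confirm they are the expected ones
list area_name area_level if _merge == 3, noobs
```

Unmatched areas have missing values for every Spectrum variable, so they receive no data class in the maps that follow.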

use map1, clear

describe
Contains data from map1.dta
 Observations:            62                  
    Variables:            26                  8 Mar 2025 09:56
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
area_level      byte    %10.0g                area_level
area_lev_1      str8    %9s                   area_lev_1
area_id         str10   %10s                  area_id
area_name       str25   %25s                  area_name
parent_are      str9    %9s                   parent_are
spectrum_r      byte    %10.0g                spectrum_r
area_sort_      byte    %10.0g                area_sort_
center_x        double  %10.0g                center_x
center_y        double  %10.0g                center_y
name            str10   %10s                  name
id              byte    %12.0g                
spectrum_regi~e byte    %8.0g                 
sex             str6    %9s                   
age_group       str8    %9s                   
calendar_quar~r str8    %9s                   
population_sp~m float   %9.0g                 
plhiv_spectrum  float   %9.0g                 
art_current_s~m float   %9.0g                 
infections_sp~m float   %9.0g                 
unaware_spect~m float   %9.0g                 
births_hivpop~m float   %9.0g                 
births_artpop~m float   %9.0g                 
time_step       str8    %9s                   
population_raw  float   %9.0g                 
population_ca~d float   %9.0g                 
_merge          byte    %23.0g     _merge     Matching result from merge
-------------------------------------------------------------------------------
Sorted by: area_name

Map Visualization

Antiretroviral treatment spectrum

use map1, clear

spmap art_current_s~m using zacoordinates, id(id) fcolor(Greens) clnumber(6)

graph export "map2.png", as(png) replace
file map2.png saved as PNG format
knitr::include_graphics("map2.png")

People living with HIV spectrum

use map1, clear

spmap plhiv_spectrum using zacoordinates, id(id) fcolor(Blues2) clnumber(6)

graph export "map3.png", as(png) replace
file map3.png saved as PNG format
knitr::include_graphics("map3.png")
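
The maps in this section are drawn without a title, and they include every area in map1 whether or not it received Spectrum values. A hedged variant of the map above, restricted to matched areas and given a title and legend position, might look like the following. The title wording and the output file name map3_titled.png are illustrative choices, not part of the original analysis:

```stata
use map1, clear

* Draw only the areas that matched a Spectrum row (_merge == 3);
* in spmap the if qualifier comes before "using"
spmap plhiv_spectrum if _merge == 3 using zacoordinates, id(id) ///
    fcolor(Blues2) clnumber(6) clmethod(quantile) ///
    title("People living with HIV") ///
    legend(position(5))

graph export "map3_titled.png", as(png) replace
```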

Population spectrum

use map1, clear

spmap population_sp~m using zacoordinates, id(id) fcolor(Reds) clnumber(6)

graph export "map4.png", as(png) replace
file map4.png saved as PNG format
knitr::include_graphics("map4.png")

Spectrum region (spectrum_r)

use map1, clear

spmap spectrum_r using zacoordinates, id(id) fcolor(Spectral) clnumber(6)

graph export "map5.png", as(png) replace
file map5.png saved as PNG format
knitr::include_graphics("map5.png")