qsn3_09 (Chronic illnesses).
A. Education Variables:
* qsn2_05: Ever attended school (Binary).
* qsn2_13: Highest level of education completed (Categorical: Primary, Secondary, Tertiary).
* qsn2_03/04: Literacy status (Can read/write).
B. Agriculture & Livelihood Variables:
* lab11: Engagement in household farming/livestock (Binary).
* _v2631: Ownership of agricultural land.
* _v2670 to _v2672: Livestock ownership (Camels, Cattle, Shoats), representing traditional rural wealth.
C. Technology Variables:
* qsn6_04: Personal mobile phone ownership.
* qsn6_08: Internet usage in the last 3 months.
* qsn6_11: Access to mobile banking (Financial technology).
D. Control/Covariate Variables:
* qsn1_05: Age (Crucial for NCD analysis).
* qsn1_03: Sex (Male/Female).
* ea_type_n: Residence type (Urban, Rural, Nomadic).
* qsn3_22 / qsn3_25: Lifestyle risks (Smoking and Qat chewing).
* _v2311: Body Mass Index (BMI).
You requested code that uses replace and focuses on the SIHBS dataset. The
steps below clean the data and generate your analysis-ready variables.
* 1. Load the dataset
use "SIHBS.dta", clear
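* Optional check (a sketch, not part of the requested steps): inspect how the
* key source variables are coded before recoding them. The recodes below
* assume 1 = "Yes"; confirm this against the value labels in your copy of the data.
codebook qsn3_09 qsn2_05 qsn2_13 lab11 qsn6_04 qsn6_08 qsn3_22 qsn3_25, compact
tab qsn3_09, missing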
* 2. Create the Dependent Variable: NCD Status
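* Assuming qsn3_09 codes 1-14 are specific illnesses and another code (such as 0 or 95) means "None"
* Start from 0 only where qsn3_09 was answered, so nonresponse stays missing instead of counting as "No"
gen ncd_status = 0 if !missing(qsn3_09)
replace ncd_status = 1 if inrange(qsn3_09, 1, 14)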
label define ncd_lab 0 "No Chronic Illness" 1 "Has Chronic Illness"
label values ncd_status ncd_lab
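* Optional check (a sketch): cross-tabulate the source codes against the new
* indicator to confirm that only the intended illness codes map to 1.
tab qsn3_09 ncd_status, missing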
* 3. Clean Education Variable
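* Leave edu_level missing when the school-attendance question (qsn2_05) was not answered
gen edu_level = 0 if !missing(qsn2_05)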
replace edu_level = 1 if qsn2_13 == 1 // Primary
replace edu_level = 2 if qsn2_13 == 2 // Secondary
replace edu_level = 3 if qsn2_13 == 3 // University/Higher
label define edu_lab 0 "None" 1 "Primary" 2 "Secondary" 3 "Tertiary"
label values edu_level edu_lab
* 4. Create Agriculture Engagement Variable
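* lab12 is not described in the variable list above; confirm what it measures before relying on it
* Leave is_farmer missing when neither question was answered
gen is_farmer = 0 if !missing(lab11) | !missing(lab12)
replace is_farmer = 1 if lab11 == 1 | lab12 == 1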
label var is_farmer "Engaged in Farming/Livestock"
* 5. Create Technology Adoption Index
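* Leave tech_access missing when the phone-ownership question was not answered
gen tech_access = 0 if !missing(qsn6_04)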
replace tech_access = 1 if qsn6_04 == 1 // Has mobile
replace tech_access = 2 if qsn6_04 == 1 & qsn6_08 == 1 // Has mobile + Internet
label define tech_lab 0 "None" 1 "Mobile Only" 2 "Mobile & Internet"
label values tech_access tech_lab
* 6. Handling Wealth/Livestock (Agriculture wealth)
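* A plain sum is missing whenever any one of the three variables is missing.
* rowtotal() treats a missing value as 0 (assumed here to mean "none owned");
* the missing option keeps the total missing only when all three are missing.
egen total_livestock = rowtotal(_v2670 _v2671 _v2672), missing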
* 7. Clean Lifestyle Controls (Smoking & Qat)
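* Leave risk_behavior missing when neither lifestyle question was answered
gen risk_behavior = 0 if !missing(qsn3_22) | !missing(qsn3_25)
replace risk_behavior = 1 if qsn3_22 == 1 | qsn3_25 == 1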
label var risk_behavior "Smokes or Chews Qat"
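* Optional check before modelling (a sketch): count missing values in the
* variables used by the model. logit drops any observation with a missing
* value in one of them, so this shows how much of the sample could be lost.
misstable summarize ncd_status edu_level is_farmer tech_access risk_behavior qsn1_05 qsn1_03 ea_type_n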
* 8. Basic Logistic Regression Model (Example)
logit ncd_status i.edu_level i.is_farmer i.tech_access qsn1_05 i.qsn1_03 i.ea_type_n risk_behavior [pweight=wgt_adj_pop]
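* Optional follow-up (a sketch, not part of the requested steps): the same
* model reported as odds ratios via the or option, which are often easier to
* interpret than log-odds coefficients.
logit ncd_status i.edu_level i.is_farmer i.tech_access qsn1_05 i.qsn1_03 i.ea_type_n risk_behavior [pweight=wgt_adj_pop], or
* If the survey documentation provides cluster and stratum identifiers,
* declaring them in svyset gives design-based standard errors. None are named
* here, so this sketch declares only the weight; add them if they exist.
svyset [pweight=wgt_adj_pop]
svy: logit ncd_status i.edu_level i.is_farmer i.tech_access qsn1_05 i.qsn1_03 i.ea_type_n risk_behavior, or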