title: "Health Insurance Charges"
author: "shannel"
date: "2025-11-20"
output: html_document

For those whose BMI is greater than 40, what is the median charge?
> median_charge <- median(health_insurance_charges$charges[
+   health_insurance_charges$bmi > 40
+ ],
+ na.rm = TRUE)
> median_charge
[1] 9748.911

What is the median of the bmi variable?
> median_bmi <- median(health_insurance_charges$bmi, na.rm = TRUE)
> median_bmi
[1] 30.4

How many female smokers are there in the dataset?
> female_smokers_count <- sum(health_insurance_charges$sex == "female" &
+   health_insurance_charges$smoker == "yes")
> female_smokers_count
[1] 115

What is the standard deviation of the BMI of smokers in the northwest?
> sd_bmi_smokers_northwest <- sd(
+   health_insurance_charges$bmi[
+     health_insurance_charges$smoker == "yes" &
+     health_insurance_charges$region == "northwest"
+   ],
+   na.rm = TRUE
+ )
> sd_bmi_smokers_northwest
[1] 4.726628

What is the age of the oldest female non-smoker from the southeast?
> oldest_female_nonsmoker_se <- max(
+   health_insurance_charges$age[
+     health_insurance_charges$sex == "female" &
+     health_insurance_charges$smoker == "no" &
+     health_insurance_charges$region == "southeast"
+   ],
+   na.rm = TRUE
+ )
> oldest_female_nonsmoker_se
[1] 64

What is the average charge for males aged above 60?
> avg_charges_males_over60 <- mean(
+   health_insurance_charges$charges[
+     health_insurance_charges$sex == "male" &
+     health_insurance_charges$age > 60
+   ],
+   na.rm = TRUE
+ )
> avg_charges_males_over60
[1] 21035.28

What is the correlation coefficient of age and charges of southeastern smokers?
> cor_age_charges_se_smokers <- cor(
+   health_insurance_charges$age[
+     health_insurance_charges$smoker == "yes" &
+     health_insurance_charges$region == "southeast"
+   ],
+   health_insurance_charges$charges[
+     health_insurance_charges$smoker == "yes" &
+     health_insurance_charges$region == "southeast"
+   ],
+   use = "complete.obs"
+ )
> cor_age_charges_se_smokers
[1] 0.3044483

What is the maximum number of children that a female non-smoker from the northeast has?
> max_children_female_nonsmoker_ne <- max(
+   health_insurance_charges$children[
+     health_insurance_charges$sex == "female" &
+     health_insurance_charges$smoker == "no" &
+     health_insurance_charges$region == "northeast"
+   ],
+   na.rm = TRUE
+ )
> max_children_female_nonsmoker_ne
[1] 4

What is the mode of the number of children for male non-smokers from the southeast region?
> # Filter the data
> children_vals <- health_insurance_charges$children[
+   health_insurance_charges$sex == "male" &
+   health_insurance_charges$smoker == "no" &
+   health_insurance_charges$region == "southeast"
+ ]
> # Compute the mode manually
> mode_children <- as.numeric(names(sort(table(children_vals), decreasing = TRUE)[1]))
> mode_children
[1] 0

What is the maximum number of children that participants below 20 have?
> max_children_under20 <- max(
+   health_insurance_charges$children[
+     health_insurance_charges$age < 20
+   ],
+   na.rm = TRUE
+ )
> max_children_under20
[1] 5

What is the minimum age for people from the southwest with 5 children?
> min_age_southwest_5children <- min(
+   health_insurance_charges$age[
+     health_insurance_charges$region == "southwest" &
+     health_insurance_charges$children == 5
+   ],
+   na.rm = TRUE
+ )
> min_age_southwest_5children
[1] 19

What is the standard deviation of the age variable?
> sd_age <- sd(health_insurance_charges$age, na.rm = TRUE)
> sd_age
[1] 14.04996
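
Every answer above follows the same pattern: build a logical filter, subset one column with it, then summarise. The sketch below shows that pattern on a small made-up data frame so it can be run without the real dataset. The `toy` data frame, its values, and the `stat_mode` helper are assumptions introduced only for illustration; they are not part of the original analysis, and the numbers do not correspond to `health_insurance_charges`.

```r
# Hypothetical toy data, NOT the real health_insurance_charges dataset.
toy <- data.frame(
  age      = c(19, 45, 62, 33, 64, 28),
  sex      = c("female", "male", "male", "female", "female", "male"),
  bmi      = c(27.9, 41.2, 30.1, 43.5, 25.0, 40.4),
  children = c(0, 2, 0, 1, 0, 3),
  smoker   = c("yes", "no", "no", "yes", "no", "no"),
  region   = c("southwest", "southeast", "northwest",
               "southeast", "southeast", "southeast"),
  charges  = c(16000, 8000, 13000, 39000, 14000, 5000)
)

# Filter, then summarise. which() keeps only rows where the condition is
# TRUE, so rows with a missing bmi are dropped rather than indexed as NA.
high_bmi <- which(toy$bmi > 40)
median_charge_high_bmi <- median(toy$charges[high_bmi], na.rm = TRUE)

# Counting rows that satisfy several conditions at once:
# sum() over a logical vector counts the TRUE values.
female_smokers <- sum(toy$sex == "female" & toy$smoker == "yes", na.rm = TRUE)

# Hypothetical helper: the mode, returning every value tied for the top count
# (the sort(...)[1] approach used in the transcript keeps only the first one).
stat_mode <- function(x) {
  counts <- table(x)  # table() ignores NA by default
  as.numeric(names(counts)[counts == max(counts)])
}
mode_children <- stat_mode(toy$children)

print(median_charge_high_bmi)
print(female_smokers)
print(mode_children)
```

Storing the filter in a variable such as `high_bmi` also makes it easy to reuse the same subset for several statistics, for example the two vectors passed to `cor()` in the correlation question.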