Short summary

This express analysis investigates smoking prevalence across different age groups for both men and women using the MHE23 dataset. The findings show that there are no noticeable differences in smoking status between men and women, particularly in the youngest age groups. Both genders exhibit similar smoking behaviors, with a high percentage of never-smokers and a relatively low prevalence of current smokers.

Data Preparation

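The data were categorized into age groups to facilitate analysis. The age variable (alder) was grouped with cut() into eight categories labelled <25, 25-30, 30-39, 40-49, 50-59, 60-69, 70-79, and 80+. Because cut() closes intervals on the right by default, each boundary age falls in the lower group (for example, age 25 is counted in "<25" and age 40 in "30-39"), and ages above 100 are left uncategorized. The smoking status variable (f58a) was grouped into three categories: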

  • Current Smoker: Includes participants who currently smoke (categories 1 and 2).

  • Ex-Smoker: Includes participants who used to smoke but have quit (category 3).

  • Never-Smoker: Includes participants who have never smoked (category 4).

The gender variable (kon) was used to split the data into male and female participants, enabling a comparative analysis by gender.
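
The grouping described above can be sketched in base R on a small made-up data frame. The values are hypothetical and only illustrate the rules; the variable names alder and f58a and the category codes follow the description above, while age_group, smoking, and smoking_labels are names chosen for this sketch.

```r
# Toy data (hypothetical values, not taken from MHE23)
toy <- data.frame(
  alder = c(19, 25, 26, 40, 41, 85),
  f58a  = c(4, 1, 2, 3, 4, NA)
)

# Age groups: cut() uses right-closed intervals by default,
# so age 25 falls in "<25" and age 40 in "30-39"
toy$age_group <- cut(
  toy$alder,
  breaks = c(0, 25, 30, 40, 50, 60, 70, 80, 100),
  labels = c("<25", "25-30", "30-39", "40-49", "50-59", "60-69", "70-79", "80+")
)

# Smoking status: codes 1 and 2 -> current, 3 -> ex, 4 -> never.
# Indexing the lookup vector with NA returns NA, so a missing code stays missing.
smoking_labels <- c("Current smoker", "Current smoker", "Ex-smoker", "Never-smoker")
toy$smoking <- smoking_labels[toy$f58a]

toy
```

Keeping missing codes as NA matters here: a recode that falls through to a default category would silently count non-respondents in that category.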

Summary Analysis

Separate summaries were generated for men and women to assess the smoking status within each gender group. The summaries were then merged to facilitate side-by-side comparison, providing a clear overview of the smoking behaviors across age groups for both men and women.
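
As a rough illustration of the split-summarize-merge idea, the sketch below computes row percentages per sex and binds them side by side. It uses base R instead of gtsummary, and the data, the helper row_percent(), and the column names age_group and smoking are assumptions made for this example only.

```r
# Toy data (hypothetical values, not taken from MHE23); kon: 1 = men, 0 = women
toy <- data.frame(
  kon = c(1, 1, 1, 1, 0, 0, 0, 0),
  age_group = factor(
    c("<25", "<25", "25-30", "25-30", "<25", "<25", "25-30", "25-30"),
    levels = c("<25", "25-30")
  ),
  smoking = factor(
    c("Never-smoker", "Current smoker", "Ex-smoker", "Never-smoker",
      "Never-smoker", "Never-smoker", "Current smoker", "Ex-smoker"),
    levels = c("Current smoker", "Ex-smoker", "Never-smoker")
  )
)

# Row percentages of smoking status within each age group, for one sex.
# Factors with fixed levels keep rows and columns aligned across sexes.
row_percent <- function(data, sex_code) {
  keep <- data$kon == sex_code
  counts <- table(data$age_group[keep], data$smoking[keep])
  round(100 * prop.table(counts, margin = 1), 1)
}

men   <- row_percent(toy, 1)
women <- row_percent(toy, 0)

# Side-by-side comparison: men's columns first, then women's
combined <- cbind(men, women)
colnames(combined) <- c(paste("Men:", colnames(men)),
                        paste("Women:", colnames(women)))
combined
```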

Code
library(haven)

# Convert a Stata-labelled variable (as read by haven) into an R factor
stata_to_r_labels <- function(data, variable_name) {
  variable <- data[[variable_name]]
  labels <- attr(variable, "labels")

  if (!is.null(labels)) {
    # Stata codes become the factor levels and their value labels the factor
    # labels; codes without a value label become NA
    data[[variable_name]] <- factor(variable, levels = labels, labels = names(labels))
  } else {
    warning("Variable '", variable_name, "' does not have Stata value labels.")
  }

  return(data)
}


# Function to extract variable labels and category labels and write to a text file
extract_labels_to_file <- function(data, output_file = "variable_labels.txt") {
  # Open connection to the output file and close it even if an error occurs
  file_conn <- file(output_file, open = "w")
  on.exit(close(file_conn), add = TRUE)

  # Iterate over all variables in the dataset
  for (variable_name in names(data)) {
    variable <- data[[variable_name]]
    labels <- attr(variable, "labels")

    # Only variables with value labels are written
    if (!is.null(labels)) {
      var_label <- attr(variable, "label")
      if (is.null(var_label)) var_label <- ""
      category_labels <- paste(paste0(names(labels), ": ", labels), collapse = " | ")

      # writeLines() appends the newline itself, so none is pasted on here
      writeLines(paste(variable_name, ";", var_label, ";", category_labels), file_conn)
    }
  }
}

# Example usage
# extract_labels_to_file(tmp, "output_labels.txt")
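
The conversion step inside stata_to_r_labels() can be tried on a toy vector without reading a Stata file. This is a minimal sketch: the codes and English labels below are hypothetical, and the vector only mimics the "labels" attribute that haven attaches to labelled variables.

```r
# Toy variable carrying Stata-style value labels in a "labels" attribute
# (hypothetical codes and labels)
x <- structure(
  c(1, 4, 3, 2, 4),
  labels = c("Daily" = 1, "Occasionally" = 2, "Quit" = 3, "Never" = 4)
)

value_labels <- attr(x, "labels")

# Same idea as in stata_to_r_labels(): codes become the levels,
# the label names become the factor labels
x_factor <- factor(as.vector(x), levels = value_labels, labels = names(value_labels))

x_factor
levels(x_factor)
```

The level order follows the order of the codes in the "labels" attribute, which is why the resulting table columns appear in questionnaire order and not alphabetically.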

Smoking Status by Age

Code
library(gt)
library(gtsummary)
library(dplyr)

# extract_labels_to_file(dt_all, "variable_labels.txt")

# Categorize age into groups. cut() closes intervals on the right by default,
# so e.g. age 25 falls in "<25" and age 40 in "30-39".
dt_all <- dt_all %>%
  mutate(`Age group` = cut(
    alder,
    breaks = c(0, 25, 30, 40, 50, 60, 70, 80, 100),
    labels = c("<25", "25-30", "30-39", "40-49", "50-59", "60-69", "70-79", "80+")
  ))

# Summarize smoking status (f58a) by age group, with row percentages
summary_table <- dt_all %>%
  stata_to_r_labels("f58a") %>%
  select(`Age group`, f58a) %>%
  tbl_summary(by = f58a,
              missing = "no",
              percent = "row") %>%
  add_overall()

# Print the summary table
summary_table
Table 1: Summary of Smoking Status by Age Group

Age group   Overall        Ja, dagligen   Ja, men inte dagligen   Nej, jag har slutat röka   Nej, jag har aldrig rökt
            N = 88,472     N = 4,424      N = 2,766               N = 30,984                 N = 50,298
<25         4,808 (100%)   125 (2.6%)     444 (9.2%)              539 (11%)                  3,700 (77%)
25-30       3,151 (100%)   108 (3.4%)     195 (6.2%)              671 (21%)                  2,177 (69%)
30-39       9,943 (100%)   352 (3.5%)     462 (4.6%)              2,371 (24%)                6,758 (68%)
40-49       12,266 (100%)  479 (3.9%)     446 (3.6%)              3,257 (27%)                8,084 (66%)
50-59       16,519 (100%)  840 (5.1%)     493 (3.0%)              4,775 (29%)                10,411 (63%)
60-69       17,944 (100%)  1,320 (7.4%)   411 (2.3%)              7,469 (42%)                8,744 (49%)
70-79       19,142 (100%)  1,029 (5.4%)   278 (1.5%)              9,694 (51%)                8,141 (43%)
80+         4,699 (100%)   171 (3.6%)     37 (0.8%)               2,208 (47%)                2,283 (49%)

Values are n (%); percentages are row percentages.
Response labels (Swedish): Ja, dagligen = Yes, daily; Ja, men inte dagligen = Yes, but not daily; Nej, jag har slutat röka = No, I have quit smoking; Nej, jag har aldrig rökt = No, I have never smoked.

Table 1 shows that daily smoking rises with age, from 2.6% in the youngest group (<25) to a peak of 7.4% at 60-69, and then falls to 3.6% among those aged 80+. Occasional (non-daily) smoking moves in the opposite direction, declining steadily from 9.2% in the <25 group to 0.8% in the 80+ group. The percentage of never-smokers is highest in the youngest age group, with 77% reporting that they have never smoked. In the 25-30 age group the proportion of never-smokers is also high, at 69%; daily smoking is slightly more common than among the under-25s (3.4% versus 2.6%), while occasional smoking is less common (6.2% versus 9.2%).
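
Because Table 1 reports daily and non-daily smokers separately, the overall share of current smokers per age group has to be derived by adding the two columns. The sketch below does this with the counts copied from Table 1; the vector names are chosen for this example.

```r
# Counts copied from Table 1
age_group  <- c("<25", "25-30", "30-39", "40-49", "50-59", "60-69", "70-79", "80+")
total      <- c(4808, 3151, 9943, 12266, 16519, 17944, 19142, 4699)
daily      <- c(125, 108, 352, 479, 840, 1320, 1029, 171)
occasional <- c(444, 195, 462, 446, 493, 411, 278, 37)

# Share of current smokers (daily + non-daily) in each age group, in percent
current_pct <- round(100 * (daily + occasional) / total, 1)
names(current_pct) <- age_group
current_pct
```

With these counts the combined share is highest in the <25 group (about 11.8%), even though daily smoking peaks at 60-69.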

Smoking Status by Sex (and Age)

Code
# Group f58a into three categories: 1 and 2 = current smoker,
# 3 = ex-smoker, 4 = never-smoker.
# Missing or unexpected codes match no condition and stay NA,
# so they are not counted as never-smokers.
dt_all <- dt_all %>%
  mutate(smoking = case_when(
    as.integer(f58a) %in% c(1, 2) ~ "Current smoker",
    as.integer(f58a) == 3 ~ "Ex-smoker",
    as.integer(f58a) == 4 ~ "Never-smoker"
  ))



# Create summary for men (kon == 1)
men_summary <- dt_all %>%
  filter(kon == 1) %>%
  select(`Age group`, smoking) %>%
  tbl_summary(by = smoking, 
              missing = "no",
              percent = "row") %>%  
  # add_overall() %>%
  modify_header(label ~ "**Men**")

# Create summary for women (kon == 0)
women_summary <- dt_all %>%
  filter(kon == 0) %>%
  select(`Age group`, smoking) %>%
  tbl_summary(by = smoking, 
              missing = "no",
              percent = "row") %>%  
  # add_overall() %>%
  modify_header(label ~ "**Women**")

# Merge men and women summaries side by side into one gt table
combined_summary <- tbl_merge(
  tbls = list(men_summary, women_summary),
  tab_spanner = c("**Men**", "**Women**")
)  

combined_summary
Table 2: Smoking Status (Current, Ex-, Never-Smokers) by Age Group

            Men                                            Women
Age group   Current smoker   Ex-smoker     Never-smoker    Current smoker   Ex-smoker     Never-smoker
            N = 3,319        N = 14,256    N = 22,721      N = 3,871        N = 16,728    N = 27,830
<25         243 (12%)        201 (9.8%)    1,609 (78%)     326 (12%)        338 (12%)     2,102 (76%)
25-30       135 (10%)        278 (21%)     902 (69%)       168 (9.1%)       393 (21%)     1,280 (70%)
30-39       424 (9.8%)       997 (23%)     2,896 (67%)     390 (6.9%)       1,374 (24%)   3,879 (69%)
40-49       449 (8.2%)       1,420 (26%)   3,631 (66%)     476 (7.0%)       1,837 (27%)   4,483 (66%)
50-59       600 (7.9%)       2,063 (27%)   4,914 (65%)     733 (8.2%)       2,712 (30%)   5,527 (62%)
60-69       781 (9.4%)       3,292 (39%)   4,270 (51%)     950 (9.9%)       4,177 (43%)   4,510 (47%)
70-79       588 (6.5%)       4,805 (53%)   3,589 (40%)     719 (7.0%)       4,889 (48%)   4,645 (45%)
80+         99 (4.5%)        1,200 (54%)   910 (41%)       109 (4.3%)       1,008 (40%)   1,404 (56%)

Values are n (%); percentages are row percentages within each sex.

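Description: Table 2 presents the distribution of smoking status across age groups for men and women. In both sexes the share of current smokers is highest in the youngest group (12%), declines through middle age (to 7.9% among men aged 50-59 and 6.9% among women aged 30-39), shows a secondary peak at 60-69 (9.4% of men, 9.9% of women), and is lowest at 80+ (4.5% and 4.3%). The share of ex-smokers rises with age: among men from 9.8% in the <25 group to 53-54% at ages 70 and over, and among women from 12% to 48% at 70-79, before dropping to 40% at 80+. Note that the never-smoker columns in Table 2 sum to 50,551, which is 253 more than the 50,298 never-smokers in Table 1, while the current and ex-smoker counts agree. This is consistent with respondents lacking a valid f58a answer having been counted as never-smokers in Table 2, so its never-smoker percentages are slightly overstated.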

Focus on Age Groups Below 30: In the youngest age group (<25), the percentage of never-smokers is high for both men (78%) and women (76%). About 12% of both men and women in this group report current smoking, which is the highest share of any age group; Table 1 indicates that most of this is non-daily smoking (9.2% versus 2.6% daily). In the 25-30 age group, the proportion of ex-smokers is roughly twice as high as in the <25 group, at 21% for both men and women, while current smoking is slightly lower (10% of men, 9.1% of women). The percentage of never-smokers remains high (69-70%). Because the data are cross-sectional, this pattern is compatible with, but does not by itself demonstrate, declining smoking initiation among younger individuals. Men and women show similar patterns of smoking behavior in these age groups.
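
To make the comparison between men and women concrete, the difference in the share of current smokers can be computed per age group from the counts in Table 2. This is a descriptive sketch only, not a statistical test, and the object names are chosen for this example.

```r
# Counts copied from Table 2 (current, ex-, never-smokers by age group)
age_group <- c("<25", "25-30", "30-39", "40-49", "50-59", "60-69", "70-79", "80+")
men <- cbind(
  current = c(243, 135, 424, 449, 600, 781, 588, 99),
  ex      = c(201, 278, 997, 1420, 2063, 3292, 4805, 1200),
  never   = c(1609, 902, 2896, 3631, 4914, 4270, 3589, 910)
)
women <- cbind(
  current = c(326, 168, 390, 476, 733, 950, 719, 109),
  ex      = c(338, 393, 1374, 1837, 2712, 4177, 4889, 1008),
  never   = c(2102, 1280, 3879, 4483, 5527, 4510, 4645, 1404)
)

# Share of current smokers within each age group, in percent
men_pct   <- 100 * men[, "current"] / rowSums(men)
women_pct <- 100 * women[, "current"] / rowSums(women)

# Men minus women, in percentage points
diff_pp <- round(men_pct - women_pct, 1)
names(diff_pp) <- age_group
diff_pp
```

With these counts the gap is close to zero in the <25 group and largest in the 30-39 group, where it is just under 3 percentage points.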

Conclusion

The analysis indicates that smoking patterns are similar between men and women, with no marked sex differences, especially among the younger age groups; no formal statistical test was performed in this analysis. The youngest participants (<25) show a high proportion of never-smokers and a low prevalence of daily smoking (2.6%), although occasional smoking is more common in this group (9.2%) than in any other. These findings are consistent with positive trends in smoking prevention among young adults and highlight the importance of continuing smoking prevention programs to maintain and enhance these trends.