Session Prep

Packages
dplyr
ggplot2
tidyr
readr

Introduction and Literature Review

Research on sexual and gender minority (SGM) populations has increased substantially in recent decades. Despite this progress, a significant knowledge gap remains in transgender (trans) research. Unless otherwise specified, in this paper the terms “transgender” and “trans” refer to anyone whose gender does not match the gender they were assigned at birth. The terms “cisgender” and “cis” refer to those whose gender matches the gender they were assigned at birth. Transgender research tends to focus on risks rather than strengths, and it often relies on measures validated in the general population, which is dominated by cisgender individuals, rather than on population-specific measures. Furthermore, the emphasis on trans youth leaves a gap in research on the adult trans population. The TransPop Study (Meyer et al., 2021) represents a significant milestone in addressing these gaps.

The TransPop data set includes the first national probability sample of the adult U.S. transgender population as well as a comparative cisgender sample (Meyer et al., 2021). Recruitment for transgender participants occurred in two periods: April to August 2016 and June 2017 to December 2018. Recruitment for cisgender participants also occurred in two periods: February 2018 and November to December 2018. Both samples were screened for gender status (trans or cis) using the same process: questions from the U.S. Gallup Poll. The surveys for both samples were largely the same, with a portion of population-specific questions included in each. Combined, the data include 612 variables covering demographics, health, social characteristics, life experiences, and perceptions. The data include some previously validated scales, some of which have been modified and some of which were used only in part. These include the Adverse Childhood Experiences (ACE) scale, the Drug Use Disorders Identification Test (DUDIT), and the Kessler-6, which measures psychological distress.

Of key interest in this paper is the Satisfaction with Life Scale (SWLS; Diener et al., 1985).

This paper explores the TransPop data in two phases. First, I ask whether there is a significant difference in life satisfaction between transgender and cisgender respondents. Then I compare the factors that contribute to life satisfaction in each sample to understand how the predictors of life satisfaction are similar and different between the trans and cis populations in the United States.

Methods and Sample

Data Cleaning

The data include 612 variables. Before I begin data analysis, I will remove several variables I know I will not be able to use. The data set includes surveys from both the transgender and cisgender samples. Since I am doing a comparative analysis of the transgender and cisgender responses, I will remove items that are not included in both surveys. I will also remove listed items for which the data are not displayed, typically because they are narrative data or were redacted from the public data set to protect confidentiality. I will use the subset() function to remove these variables.

# Load TransPop data set into R environment.
transpop <- read.csv("/Users/nicoleborunda/Downloads/ICPSR_37938 3/DS0005/TransPopDS0005.csv")

# Use subset() to remove variables that appear only in the cisgender survey.
transpop_clean1 <- subset(transpop, select = -c(CIS_Q52, CIS_Q142, CIS_Q143_1, CIS_Q143_2, CIS_Q143_3, CIS_Q143_4, CIS_Q143_5, CIS_Q143_6, CIS_Q143_7, CIS_Q144, CIS_Q145)) 

# Use subset() to remove variables that appear only in the transgender survey.
transpop_clean2 <- subset(transpop_clean1, select = -c(Q33, Q44:Q68, Q71, Q72, Q74:Q80, Q87, Q89:Q91, Q151:Q161, Q191A:Q191I, Q193_1:Q194, Q216, Q217, T1Q27:T1Q200)) 

# Use subset() to remove remaining variables whose data are redacted or in narrative form.
transpop_clean3 <- subset(transpop_clean2, select = -c(GMSANAME, Q21, Q31, Q34_VERB, Q81_T_VERB, Q83_T_VERB, Q86_VERB, Q202_VERB)) 
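
After a series of subset() calls like these, it can help to confirm which columns were dropped. The following is a minimal sketch on a toy data frame; the column names (Q1, Q2, Q3, CIS_Q1, NOTE_VERB) are hypothetical stand-ins, not TransPop variables. It shows that a negated select accepts both single names and contiguous ranges such as Q2:Q3, and that setdiff() on the column names lists what was removed.

```r
# Toy data frame standing in for the survey data (hypothetical column names).
toy <- data.frame(
  STUDYID   = 1:3,
  Q1        = c(1, 2, 3),
  Q2        = c(4, 5, 6),
  Q3        = c(7, 8, 9),
  CIS_Q1    = c(NA, 1, 2),
  NOTE_VERB = c("a", "b", "c")
)

# Drop a single column, a contiguous range (Q2 through Q3), and a narrative column.
toy_clean <- subset(toy, select = -c(CIS_Q1, Q2:Q3, NOTE_VERB))

# List the columns that were removed and the columns that remain.
removed <- setdiff(names(toy), names(toy_clean))
print(removed)
print(names(toy_clean))
```

The same check could be run between transpop and transpop_clean3 to verify that only the intended variables were removed.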

I also want R to treat STUDYID as a respondent identifier rather than a numeric variable, so that it is not used as a predictor in regression analysis.

# Change STUDYID to a character data type
transpop_clean3$STUDYID <- as.character(transpop_clean3$STUDYID) 
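
Converting the column to character marks it as an identifier, but a model formula that uses `.` for "all other columns" would still pick it up, as a categorical predictor. One option, sketched below on toy data, is to exclude the identifier explicitly in the formula. The AGE column and all values here are invented for illustration and are not TransPop data.

```r
# Toy data (invented values); AGE is a hypothetical predictor.
toy <- data.frame(
  STUDYID   = c(101, 102, 103, 104, 105, 106),
  LIFESAT_I = c(3.2, 4.0, 5.4, 2.8, 6.0, 4.6),
  AGE       = c(24, 31, 45, 19, 52, 38)
)

# Store the identifier as character, as in the step above.
toy$STUDYID <- as.character(toy$STUDYID)
stopifnot(is.character(toy$STUDYID))

# Use every remaining column as a predictor except the identifier.
fit <- lm(LIFESAT_I ~ . - STUDYID, data = toy)

# Only the intercept and AGE appear among the coefficients.
print(names(coef(fit)))
```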

Descriptive Statistics

I began my analysis with an overview of the average life satisfaction levels of the trans and cis samples.

library(knitr)
library(dplyr)
library(kableExtra)
# Subset the data for TRANS_CIS = 1
group_1 <- transpop_clean3 %>%
  filter(TRANS_CIS == 1)

# Subset the data for TRANS_CIS = 2
group_2 <- transpop_clean3 %>%
  filter(TRANS_CIS == 2)

# Calculate summary statistics for LIFESAT_I in each group
summary_stats <- data.frame(Group = c("TRANS_CIS = 1", "TRANS_CIS = 2"),
                            Mean = c(mean(group_1$LIFESAT_I), mean(group_2$LIFESAT_I)),
                            SD = c(sd(group_1$LIFESAT_I), sd(group_2$LIFESAT_I)),
                            Min = c(min(group_1$LIFESAT_I), min(group_2$LIFESAT_I)),
                            Max = c(max(group_1$LIFESAT_I), max(group_2$LIFESAT_I)))

# Print the table
kable(summary_stats, caption = "Summary Statistics for LIFESAT_I by TRANS_CIS",
      align = "c", format = "html", digits = 2) %>%
  kable_styling(full_width = FALSE)

Summary Statistics for LIFESAT_I by TRANS_CIS

Group           Mean   SD     Min   Max
TRANS_CIS = 1   3.93   1.74   1     7
TRANS_CIS = 2   5.03   1.56   1     7
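
The first research question asks whether the difference between these group means is statistically significant. As a rough sketch of one way to approach it, the block below runs a Welch two-sample t-test, which is what t.test() does by default. The toy values are invented for illustration and are not TransPop results. This sketch is also unweighted; if the data set supplies sampling weights, a weighted analysis would likely be more appropriate for population-level claims.

```r
# Toy data (invented values) with the same two columns used above.
toy <- data.frame(
  TRANS_CIS = rep(c(1, 2), each = 6),
  LIFESAT_I = c(3.2, 4.0, 2.8, 5.0, 3.6, 4.4,
                5.2, 4.8, 6.0, 5.4, 4.6, 5.8)
)

# Welch two-sample t-test (unequal variances is the default in t.test()).
res <- t.test(LIFESAT_I ~ TRANS_CIS, data = toy)

# Group means, test statistic, and p-value.
print(res$estimate)
print(res$statistic)
print(res$p.value)
```

On the real data, the same call with data = transpop_clean3 would compare the two TRANS_CIS groups directly.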