Final Paper_Borunda

Packages

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## 
## Attaching package: 'mice'
## 
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## 
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
## 
## 
## corrplot 0.92 loaded
## 
## 
## Attaching package: 'cowplot'
## 
## 
## The following object is masked from 'package:lubridate':
## 
##     stamp
## 
## 
## 
## Attaching package: 'gridExtra'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
## 
## 
## 
## Attaching package: 'kableExtra'
## 
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
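The startup messages above come from attaching the packages used in this paper. The chunk that loads them is not shown in the rendered output, so the following is only a minimal sketch of how such a chunk could be written to keep the messages out of the knitted document. The package list is read off the messages above; the helper names `pkgs` and `installed` are illustrative.

```r
# Packages named in the startup messages above (load order assumed).
pkgs <- c("tidyverse", "mice", "corrplot", "cowplot", "gridExtra", "kableExtra")

# Check which packages are available without printing startup output.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

if (any(!installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach the available packages while suppressing their startup messages.
for (pkg in pkgs[installed]) {
  suppressPackageStartupMessages(library(pkg, character.only = TRUE))
}
```

Setting the chunk options `message = FALSE` and `warning = FALSE` in R Markdown is another common way to hide this output.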

Introduction and Literature Review

Research on sexual and gender minority (SGM) populations has increased substantially in recent decades. Despite this progress, a significant knowledge gap remains in transgender (trans) research. Unless otherwise specified, in this paper the terms “transgender” and “trans” refer to anyone whose gender does not match the gender they were assigned at birth. The terms “cisgender” and “cis” refer to those whose gender matches the gender they were assigned at birth. Transgender research tends to focus on risks rather than strengths, and it often relies on measures validated in the general population, which is dominated by cisgender individuals, rather than on population-specific measures. Furthermore, the emphasis on trans youth leaves a gap in research on the adult trans population. The TransPop Study (Meyer et al., 2021) represents a significant milestone in addressing these gaps.

The TransPop data set includes the first national probability sample of the adult U.S. transgender population as well as a comparative cisgender sample (Meyer et al., 2021).

This paper explores the TransPop data in two phases. First, I ask whether life satisfaction differs significantly between transgender and cisgender respondents. Then I compare the factors that contribute to life satisfaction in each sample to understand how the predictors of life satisfaction are similar and different between the trans and cis populations in the United States.

Methods and Sample

Data Cleaning

The data include 612 variables. Before beginning the analysis, I remove several variables that I know I will not be able to use. The data set includes surveys from both the transgender and cisgender samples. Because I am doing a comparative analysis of transgender and cisgender responses, I remove items that are not included in both surveys. I also remove listed items for which the data are not displayed, typically because they are narrative data or were redacted from the public data set to protect confidentiality. I use the subset() function to remove these variables.

# Load TransPop data set into R environment.
transpop <- read.csv("/Users/nicoleborunda/Downloads/ICPSR_37938 3/DS0005/TransPopDS0005.csv")

# Use subset() to remove variables that appear only in the cisgender survey.
transpop_clean1 <- subset(transpop, select = -c(CIS_Q52, CIS_Q142, CIS_Q143_1, CIS_Q143_2, CIS_Q143_3, CIS_Q143_4, CIS_Q143_5, CIS_Q143_6, CIS_Q143_7, CIS_Q144, CIS_Q145)) 

# Use subset() to remove variables that appear only in the transgender survey.
transpop_clean2 <- subset(transpop_clean1, select = -c(Q33, Q44:Q68, Q71, Q72, Q74:Q80, Q87, Q89:Q91, Q151:Q161, Q191A:Q191I, Q193_1:Q194, Q216, Q217, T1Q27:T1Q200)) 

# Use subset() to remove remaining variables whose data are redacted or recorded in narrative form.
transpop_clean3 <- subset(transpop_clean2, select = -c(GMSANAME, Q21, Q31, Q34_VERB, Q81_T_VERB, Q83_T_VERB, Q86_VERB, Q202_VERB))
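The `select = -c(...)` pattern used above can be checked on a small example. The sketch below uses a toy data frame with hypothetical column names, not the TransPop data, to show how a leading minus drops columns and how a range such as `Q02:Q03` is expanded.

```r
# Toy data frame; the column names are hypothetical stand-ins for survey items.
toy <- data.frame(
  STUDYID = 1:3,
  Q01 = c(1, 2, 3),
  Q02 = c(4, 5, 6),
  Q03 = c(7, 8, 9),
  Q04 = c(1, 1, 2),
  CIS_Q52 = c(NA, NA, 1)
)

# Inside select, a:b expands to every column positioned from a through b,
# and a leading minus drops the listed columns instead of keeping them.
toy_clean <- subset(toy, select = -c(Q02:Q03, CIS_Q52))

# Columns that remain after the drop.
names(toy_clean)

# Columns that were dropped, found by comparing names before and after.
setdiff(names(toy), names(toy_clean))
```

Because a range follows column position rather than alphabetical or numeric order of the names, comparing `names()` before and after each step is a quick way to confirm that only the intended variables were removed.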

I also want to ensure that R treats STUDYID as a respondent identifier rather than as a numeric variable, so that it is not used as a numeric predictor in regression analysis.

# Change STUDYID to a character data type
transpop_clean3$STUDYID <- as.character(transpop_clean3$STUDYID)

transpop_clean3 <- as_tibble(transpop_clean3)
print(transpop_clean3)

## # A tibble: 1,436 × 502
##    STUDYID  WEIGHT_CISGENDER_TRA…¹ WEIGHT_CISGENDER WEIGHT_TRANSPOP GMETHOD_TYPE
##    <chr>                     <dbl>            <dbl>           <dbl> <chr>       
##  1 1517689…                0.0220                NA           0.986 " "         
##  2 1523572…                0.00849               NA           0.380 " "         
##  3 1524440…                0.0158                NA           0.705 " "         
##  4 1525252…                0.0357                NA           1.60  " "         
##  5 1528944…                0.0418                NA           1.87  " "         
##  6 1529256…                0.0213                NA           0.955 " "         
##  7 1530032…                0.00528               NA           0.236 " "         
##  8 1530368…                0.00993               NA           0.444 " "         
##  9 1531623…                0.0244                NA           1.09  " "         
## 10 1533182…                0.0210                NA           0.939 " "         
## # ℹ 1,426 more rows
## # ℹ abbreviated name: ¹WEIGHT_CISGENDER_TRANSPOP
## # ℹ 497 more variables: SURVEYCOMPLETED <int>, GRESPONDENT_DATE <chr>,
## #   GCENREG <int>, RACE <int>, RACE_RECODE <int>, RACE_RECODE_CAT5 <int>,
## #   SEXUALID <int>, SEXMINID <int>, HINC <int>, HINC_I <int>, PINC <int>,
## #   PINC_I <int>, GEDUC1 <int>, GEDUC2 <int>, GANN_INC <int>, GANN_INC2 <int>,
## #   GD74 <int>, GD75 <int>, GD76 <int>, GEDUCATION <int>, …
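One caveat about identifier columns: when a model is specified with `y ~ .`, `lm()` treats a character column as a categorical predictor with one level per value, so an identifier still has to be kept out of the model data. The sketch below shows two ways to do that on simulated data. The variable names `y`, `x1`, and `x2` are hypothetical, and the values are random numbers rather than survey responses.

```r
set.seed(1)

# Simulated data with a character identifier column (not TransPop data).
toy <- data.frame(
  STUDYID = as.character(1001:1020),
  y = rnorm(20),
  x1 = rnorm(20),
  x2 = rnorm(20),
  stringsAsFactors = FALSE
)

# Option A: drop the identifier before using the "all predictors" shorthand.
fit_a <- lm(y ~ ., data = subset(toy, select = -STUDYID))

# Option B: name the predictors explicitly so the identifier is never used.
fit_b <- lm(y ~ x1 + x2, data = toy)

# Both approaches estimate the same terms: intercept, x1, and x2.
names(coef(fit_a))
all.equal(coef(fit_a), coef(fit_b))
```

The stepwise model later in this paper follows the first approach, since its test data set is built from a selection of columns that does not include STUDYID.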

Next, I need to split the data set so I can compare trans and cis data. I will use subset() to create a trans data frame and a cis data frame.

# Subset data for trans sample
trans_data <- subset(transpop_clean3, TRANS_CIS == 1)

# Subset data for cis sample
cis_data <- subset(transpop_clean3, TRANS_CIS == 2)
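With the two samples separated, the first research question (whether life satisfaction differs between groups) could be approached with a two-sample comparison. The following is a minimal sketch on simulated data, assuming the same coding as above (`TRANS_CIS` equal to 1 for trans and 2 for cis) and a numeric outcome named `LIFESAT_I`. The numbers are randomly generated for illustration and are not TransPop results. The sketch is also unweighted, whereas the data set carries survey weights that a full analysis would likely need to use.

```r
set.seed(42)

# Simulated stand-in for the cleaned data (not TransPop values).
toy <- data.frame(
  TRANS_CIS = rep(c(1, 2), times = c(30, 50)),
  LIFESAT_I = c(rnorm(30, mean = 4, sd = 1.5), rnorm(50, mean = 4.5, sd = 1.5))
)

# Split by group, mirroring the subset() calls above.
toy_trans <- subset(toy, TRANS_CIS == 1)
toy_cis <- subset(toy, TRANS_CIS == 2)

# Every row should land in exactly one group; subset() silently drops
# rows where the condition is NA, so this check catches missing codes.
stopifnot(nrow(toy_trans) + nrow(toy_cis) == nrow(toy))

# Welch two-sample t-test (the default in t.test, unequal variances allowed).
welch <- t.test(toy_trans$LIFESAT_I, toy_cis$LIFESAT_I)

welch$estimate  # group means
welch$p.value   # p-value for the difference in means
```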

Finally, there are

# Create a test data set for stepwise regression.
transpoptest <- subset(transpop_clean3, select = c(LIFESAT_I, GMILESAWAY, Q03, Q04, Q05, Q06, Q07, Q08))

# Remove rows with missing values
transpoptest <- na.omit(transpoptest)

# Fit a stepwise regression model
model <- lm(LIFESAT_I ~ ., data = transpoptest)  # Initial model with all predictors

# Perform stepwise regression using the "step" function
step_model <- step(model, direction = "both")  # You can specify "both", "forward", or "backward" as the direction

## Start:  AIC=695.5
## LIFESAT_I ~ GMILESAWAY + Q03 + Q04 + Q05 + Q06 + Q07 + Q08
## 
##              Df Sum of Sq    RSS    AIC
## - GMILESAWAY  1      2.22 2205.2 694.82
## <none>                    2203.0 695.50
## - Q06         1      8.37 2211.4 698.48
## - Q08         1      9.40 2212.4 699.09
## - Q04         1     12.94 2216.0 701.19
## - Q05         1     13.41 2216.4 701.47
## - Q07         1     29.43 2232.4 710.92
## - Q03         1    544.48 2747.5 983.49
## 
## Step:  AIC=694.82
## LIFESAT_I ~ Q03 + Q04 + Q05 + Q06 + Q07 + Q08
## 
##              Df Sum of Sq    RSS    AIC
## <none>                    2205.2 694.82
## + GMILESAWAY  1      2.22 2203.0 695.50
## - Q06         1      8.14 2213.4 697.66
## - Q08         1      9.30 2214.5 698.35
## - Q04         1     12.84 2218.1 700.44
## - Q05         1     13.60 2218.8 700.89
## - Q07         1     30.09 2235.3 710.61
## - Q03         1    545.91 2751.2 983.23
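In the trace above, each row shows the AIC that would result from removing (-) or adding (+) one term, and `<none>` marks the current model. At each step, `step()` moves to the row with the lowest AIC and stops once `<none>` is at the top. Here, dropping GMILESAWAY lowers the AIC from 695.50 to 694.82, and no further change improves it. The sketch below reproduces the same mechanics on R's built-in `mtcars` data. The choice of predictors is arbitrary and purely illustrative.

```r
# Start from a model with several candidate predictors (built-in mtcars data).
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Run stepwise selection in both directions; trace = 0 hides the step table.
reduced <- step(full, direction = "both", trace = 0)

# extractAIC() returns the equivalent degrees of freedom and the AIC value
# on the same scale that step() prints in its trace.
aic_full <- extractAIC(full)[2]
aic_reduced <- extractAIC(reduced)[2]

# The selected model never has a higher AIC than the starting model.
c(full = aic_full, reduced = aic_reduced)

# Terms retained by the selection.
formula(reduced)
```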

# View the final selected model
summary(step_model)

## 
## Call:
## lm(formula = LIFESAT_I ~ Q03 + Q04 + Q05 + Q06 + Q07 + Q08, data = transpoptest)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.4366 -0.7821  0.1732  0.9318  4.7368 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.26555    0.24964  25.099  < 2e-16 ***
## Q03         -1.18918    0.06614 -17.981  < 2e-16 ***
## Q04         -0.06660    0.02415  -2.757  0.00591 ** 
## Q05          0.08620    0.03038   2.838  0.00462 ** 
## Q06          0.06824    0.03107   2.196  0.02825 *  
## Q07          0.09667    0.02290   4.222 2.59e-05 ***
## Q08         -0.05495    0.02341  -2.347  0.01907 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.299 on 1306 degrees of freedom
## Multiple R-squared:  0.3859, Adjusted R-squared:  0.383 
## F-statistic: 136.8 on 6 and 1306 DF,  p-value: < 2.2e-16
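The summary reports 1306 residual degrees of freedom for a model with seven estimated coefficients, which implies that 1,313 of the 1,436 rows remained after `na.omit()`. Because listwise deletion can change the composition of the sample, it may be worth recording how many rows are dropped and which variables are responsible. The sketch below shows one way to do that on a small made-up data frame. The column names and values are hypothetical.

```r
# Made-up data with a few missing values (not TransPop data).
toy <- data.frame(
  y  = c(1.2, 2.0, NA, 3.9, 5.3, 5.8),
  x1 = c(1, NA, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# Count missing values per column to see which variables drive the loss.
colSums(is.na(toy))

# Listwise deletion: keep only rows with no missing values.
complete <- na.omit(toy)
n_dropped <- nrow(toy) - nrow(complete)
n_dropped

# The number of observations used in the fit matches the complete rows.
fit <- lm(y ~ ., data = complete)
nobs(fit)
```

If a large share of rows is lost this way, multiple imputation (for example with the mice package loaded earlier) is one alternative to consider.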

Findings

Discussion

References

Barr, S. M., Budge, S. L., & Adelson, J. L. (2016). Transgender community belongingness as a mediator between strength of transgender identity and well-being. Journal of Counseling Psychology, 63(1), 87–97. https://doi.org/10.1037/cou0000127

Bränström, R., & Pachankis, J. E. (2021). Country-level structural stigma, identity concealment, and day-to-day discrimination as determinants of transgender people’s life satisfaction. Social Psychiatry and Psychiatric Epidemiology. https://doi.org/10.1007/s00127-021-02036-6

Carone, N., Rothblum, E. D., Bos, H. M. W., Gartrell, N. K., & Herman, J. L. (2021). Demographics and health outcomes in a U.S. probability sample of transgender parents. Journal of Family Psychology, 35(1), 57–68. https://doi.org/10.1037/fam0000776

Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The Satisfaction With Life Scale. Journal of Personality Assessment, 49(1), 71–75. https://doi.org/10.1207/s15327752jpa4901_13

Feldman, J. L., Luhur, W. E., Herman, J. L., Poteat, T., & Meyer, I. H. (2021). Health and health care access in the US transgender population health (TransPop) survey. Andrology, 9(6). https://doi.org/10.1111/andr.13052

Hardy, T. L. D., Rieger, J. M., Wells, K., & Boliek, C. A. (2021). Associations Between Voice and Gestural Characteristics of Transgender Women and Self-Rated Femininity, Satisfaction, and Quality of Life. American Journal of Speech-Language Pathology, 30(2), 663–672. https://doi.org/10.1044/2020_ajslp-20-00118

Krueger, E. A., Divsalar, S., Luhur, W., Choi, W. K., & Meyer, I. H. (2020). TransPop U.S. Transgender Population Health Survey Methodology and Technical Notes. https://static1.squarespace.com/static/55958472e4b0af241ecac34f/t/5ef1331ecc152d3b60f068d6/1592865567882/TransPop+Survey+Methods+v18+FINAL+copy.pdf

Lane, M., Waljee, J. F., & Stroumsa, D. (2022). Treatment Preferences and Gender Affirmation of Nonbinary and Transgender People in a National Probability Sample. Obstetrics & Gynecology, Publish Ahead of Print. https://doi.org/10.1097/aog.0000000000004802

Meyer, I. H., Wilson, B. D. M., & O’Neill, K. (2021). LGBTQ People in the US: Select findings from the Generations and TransPop Studies. https://williamsinstitute.law.ucla.edu/publications/generations-transpop-toplines/

Poteat, T. C., Divsalar, S., Streed, C. G., Feldman, J. L., Bockting, W. O., & Meyer, I. H. (2021). Cardiovascular Disease in a Population-Based Sample of Transgender and Cisgender Adults. American Journal of Preventive Medicine, 61(6), 804–811. https://doi.org/10.1016/j.amepre.2021.05.019

Sevelius, J. M., Poteat, T., Luhur, W. E., Reisner, S. L., & Meyer, I. H. (2020). HIV Testing and PrEP Use in a National Probability Sample of Sexually Active Transgender People in the United States. Journal of Acquired Immune Deficiency Syndromes (1999), 84(5), 437–442. https://doi.org/10.1097/QAI.0000000000002403

Sherman, A. D. F., Clark, K. D., Robinson, K., Noorani, T., & Poteat, T. (2019). Trans* Community Connection, Health, and Wellbeing: A Systematic Review. LGBT Health, 7(1). https://doi.org/10.1089/lgbt.2019.0014

Stanton, M. C., Ali, S., & Chaudhuri, S. (2016). Individual, social and community-level predictors of wellbeing in a US sample of transgender and gender non-conforming individuals. Culture, Health & Sexuality, 19(1), 32–49. https://doi.org/10.1080/13691058.2016.1189596

…
