Introduction

I am interested in examining the school factors that encourage African American/Black high school boys to choose to major in Science, Technology, Engineering, and Mathematics (STEM) major in post secondary education in other to enter into the STEM profession (Professions like Engineer, Scientist, Computer Scientist, Information Technology, Medicine, etc). It is known that one’s professional career defines how successful they are and one needs to have knowledge and interest in that field to decide to pursue it. In a study by Kurban and Cabrera, they investigated “the role of cognitive and contextual factors affecting the process of high school students’ readiness for and consideration of an academic major in a STEM field” (Kurban & Cabrera, 2020). In their future research section of the article, they suggested to investigate the “extent to which school climate, school culture, and teacher expectations impact the STEM readiness and intention developmental process” (Kurban & Cabrera, 2020). The authors conducted Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) to discover five distinct latent factors (SES, Parental Involvment, Math Self-Efficacy, Science Self-Efficacy, and STEM Readiness) and four single indicators (Math Ability, Math Interest, Science Interest, and Intention to Major in STEM)

Problem Statement

I propose to investigate the extent to which school opportunities for students impact the STEM readiness and intention development process for Black boys using the High School Longitudinal Study of 2009 (HSLS:09).

Data and Variables

The High School Longitudinal Study of 2009 (HSLS:09) was a study conducted by the National Center for Education Statistics (NCES) as part of a series of secondary longitudinal studies. “The core research question for HSLS:09 explore secondary to postsecondary transition plans and the evolution of those plans; the paths into and out of science, technology, engineering, and mathematics; and the educational and social experiences that affect these shifts” (Ingels et al., 2011, p. iii). The longitudinal study takes the student as the fundamental unit of analysis. The base year occured in the fall term of 2009-10 School year, with a randomly selected sample of 9th graders in more than 900 public and private high schools (Ingels et al., 2011). There was a follow up study in spring of 2012 when the participants will be in the spring of the 11th grade.

Some questions on the student questionnaire allows to investigate the school environment through student, teacher, and administrator reports. This includes the extent to which schools are expected to provide special services to selected groups of students to compensate for limitations and poor performance. For example, including special services to assist those lagging in their understanding of mathematics and science is of interest in this study. (Ingels et al., 2011)

A subset of the publicly available data set is used, specifically using variables related to questions answered by school counselors and administrators.

knitr::opts_chunk$set(warning = FALSE)
setwd("~/UB/CEP525/FinalExam")
df <- read.csv("CEP525FinalDataSet.csv")
library(DT)

## Warning: package 'DT' was built under R version 4.3.3

library(vtable)

## Warning: package 'vtable' was built under R version 4.3.3

## Loading required package: kableExtra

## Warning: package 'kableExtra' was built under R version 4.3.3

df <- df[, !(names(df) %in% c("X.1", "X"))]
st(df)

Summary Statistics
Variable	N	Mean	Std. Dev.	Min	Pctl. 25	Pctl. 75	Max
x1sex	23497
… Female	11524	49%
… Male	11973	51%
x1black	22938	0.16	0.37	0	0	0	1
x1white	22938	0.74	0.44	0	0	1	1
x1hispanic	22497	0.17	0.37	0	0	0	1
x1locale	23503
… City	6689	28%
… Rural	5559	24%
… Suburb	8467	36%
… Town	2788	12%
x1region	23503
… Midwest	6224	26%
… Northeast	3662	16%
… South	9587	41%
… West	4030	17%
c1cluster	23503	1.5	0.5	1	1	2	2
c1indvcrs	23503	2	0.22	1	2	2	2
c1intern	23503	1.4	0.49	1	1	2	2
c1jobfair	23503	1.3	0.45	1	1	2	2
c1jobguide	23503	1.3	0.46	1	1	2	2
c1employer	23503	1.6	0.49	1	1	2	2
c1awareness	23503	1.7	0.47	1	1	2	2
c1careerunit	23503	1.4	0.49	1	1	2	2
c1workstudy	23503	1.6	0.5	1	1	2	2
c1careerday	23503	1.4	0.49	1	1	2	2
c1assemblies	23503	1.4	0.49	1	1	2	2
c1jobinfocmp	23503	1.8	0.4	1	2	2	2
c1jobinfonon	23503	1.4	0.48	1	1	2	2
c1hstowrkoth	23503	1.2	0.43	1	1	1	2
c1hstoworkno	23503	1	0.15	1	1	1	2
a1mspdlearn	23503	1.6	0.48	1	1	2	2
a1mspdintrst	23503	1.4	0.48	1	1	2	2
a1msother	23503	1.2	0.41	1	1	1	2
a1msnone	23503	1	0.14	1	1	1	2
a1g9summer	23503	1.4	0.48	1	1	2	2
a1g9overage	23503	1.2	0.36	1	1	1	2
a1g9communty	23503	1.2	0.43	1	1	1	2
a1g9blocksch	23503	1.9	0.33	1	2	2	2
a1g9double	23503	1.6	0.49	1	1	2	2
a1g9study	23503	1.4	0.48	1	1	2	2
a1g9teacher	23503	1.4	0.48	1	1	2	2

One section of the counselor questionnaire focused on programs and services offered to students. Some questions pertained to enrichment courses, assistance for struggling students, dropout prevention programs, encouragement of the pursuit of mathematics and science education and employment, and assistance with the transition from high school to college or the workforce. “Other topics included the use of mathematics competency tests and options for failing students”(Ingels et al., 2011, p. 20)

My definition for “Opportunites for students” would be investigated with the following variables:

Counselor variables

Variable Name	Varialbe Lable
c1cluster	C1 B30 Career Clusters/Pathways/Programs of Study (POS) offered
c1indvcrs	C1 B31 Student not enrolled in career Clusters, etc. may take course in program
c1intern	C1 B32A School offers internships with local employers
c1jobfair	C1 B32B School offers job fairs
c1jobguide	C1 B32C School offers Career guides or skills assessments
c1employer	C1 B32D School offers school/classroom presentations by local employers
c1awareness	C1 B32E School offers career awareness activities
c1careerunit	C1 B32G School offers career information units in subject-matter courses
c1workstudy	C1 B32H School offers exploratory work experience programs/co-op/workstudy/EBCE
c1careerday	C1 B32I School offers career days or nights
c1assemblies	C1 B32J School offers vocational oriented assemblies
c1jobinfocmp	C1 B32Q School offers computerized career information resources
c1jobinfonon	C1 B32R School offers non-computerized career information resources
c1hstowrkoth	C1 B32S School assists students with transition from HS to work in other ways
c1hstoworkno	C1 B32T School doesn’t assist students with transition from HS to work

Administrator variables

Variable Name	Variable Label
a1mspdlearn	A1 A25I Requires teacher prof development in how students learn math/science
a1mspdintrst	A1 A25J Requires teacher prof development in increasing interest in math/science
a1msother	A1 A25K Raises students math/science interest/achievement in another way
a1msnone	A1 A25L Doesn’t do any of these to raise math/science interest/achievement
a1g9summer	A1 A26A Offers pre-HS summer reading/math instructions for struggling 9th graders
a1g9overage	A1 A26B Offers learning communities for over-age student lacking HS prerequisite
a1g9communty	A1 A26C Offers 9th grade learning communities separate from rest of school
a1g9blocksch	A1 A26D Offers block scheduling to assist struggling 9th graders
a1g9double	A1 A26E Offers catch-up courses / double-dosing to assist struggling 9th graders
a1g9study	A1 A26F Offers study skill seminar/class for struggling 9th graders
a1g9teacher	A1 A26G Offers assistance for teachers working with struggling 9th graders

Covariates

Variable Name	Variable Label
x1sex	Sex of student (Male, Female)
x1black	Race of student is Black
x1white	Race of student is White
x1hispanic	Race of student is Hispanic
x1locale	Location where student resides (city, Rural, Suburb, Town)
x1region	Region where student resides (Midwest, Northeast, South, West)

Sample size, description, excluded participants

In preparing the sample data, I downloaded the public available data set from the NCES website and then prep the data set as follows.

First, I choose the following variables from the HSLS:09 Student data set - sex, race (black, white, Hispanic), locale, region plus counselor variable of interest plus administrator variable of interest.

Next, I re coded the following variables - sex, locale and region. There were missing data within the data set and replaced these missed data with “NA”. The counselor and administrator variables were coded “O” for No and “1” for Yes. I had to recode the “0” to 1 (for No) and “1” to 2 (for Yes). The original variable type was character so I re coded them all to integer.

For missing values, I imputed with the mean of the column and rounded to the nearest whole integer. Then saved the final data set that needed to be used for latent class analysis (LCA)

The total sample size is 23503 with 32 variables

summary(as.factor(df$x1sex))

## Female   Male   NA's 
##  11524  11973      6

summary (as.factor(df$x1black))

##     0     1  NA's 
## 19175  3763   565

summary(as.factor(df$x1locale))

##   City  Rural Suburb   Town 
##   6689   5559   8467   2788

summary(as.factor(df$x1region))

##   Midwest Northeast     South      West 
##      6224      3662      9587      4030

Results

Multivariate Statistical Method

I will perform a Latent Class Analysis (LCA) on the variables responses by the school administrator and counselors to see the probability administrator and counselors answer yes to the “Opportunities for students”, and correlate it with gender, racial, place of residence (local and region) of the students.

knitr::opts_chunk$set(warning = FALSE)
library(poLCA)

## Loading required package: scatterplot3d

## Loading required package: MASS

library(vtable)
library(psych)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:MASS':
## 
##     select

## The following object is masked from 'package:kableExtra':
## 
##     group_rows

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

df %>%
  slice(1:10) %>%
  kbl(caption = "Opportunities for students in HSLS:09 Dataset", digits = 1) %>%
  kable_classic("hover", full_width = F, html_font = "Cambria")

Opportunities for students in HSLS:09 Dataset
x1sex	x1black	x1white	x1locale	x1region	c1cluster	c1indvcrs	c1intern	c1jobfair	c1jobguide	c1employer	c1awareness	c1careerunit	c1workstudy	c1careerday	c1assemblies	c1jobinfocmp	c1jobinfonon	c1hstowrkoth	c1hstoworkno	a1mspdlearn	a1mspdintrst	a1msother	a1msnone	a1g9summer	a1g9overage	a1g9communty	a1g9blocksch	a1g9double	a1g9study	a1g9teacher
Male	0	1	Rural	Midwest	1	2	1	2	1	2	2	2	1	2	2	2	1	2	1	1	2	1	1	2	1	1	2	1	1	1
Female	0	1	Rural	Northeast	1	2	1	1	1	2	2	2	1	1	2	2	2	2	1	1	1	2	1	1	1	1	1	1	1	1
Female	1	0	Suburb	West	1	2	1	2	1	2	1	1	1	1	1	1	1	1	1	1	1	1	1	2	1	1	2	1	1	2
Female	0	1	Suburb	South	1	2	1	1	1	2	2	1	2	1	1	2	1	1	1	2	1	1	1	2	1	2	1	2	1	1
Male	0	1	City	South	2	2	1	1	2	1	2	1	2	1	1	2	1	1	1	2	2	1	1	1	1	1	2	2	1	2
Female	0	1	Rural	South	2	2	1	1	2	2	2	1	2	2	2	2	2	1	1	1	1	1	1	1	1	1	2	1	1	1
Female	0	1	Rural	Northeast	2	2	2	1	1	1	2	2	2	2	1	2	1	1	1	2	1	1	1	1	1	1	2	2	2	2
Male	0	1	Town	Northeast	1	2	1	1	1	2	2	1	2	1	1	2	1	1	1	1	1	1	1	1	1	1	2	1	1	1
Male	0	1	Town	South	2	2	2	2	1	2	2	2	2	2	2	2	1	1	1	2	2	1	1	1	1	1	2	1	1	1
Female	0	1	City	Northeast	2	2	2	1	2	1	2	1	2	1	1	2	2	1	1	1	1	1	1	1	1	1	1	2	1	2

Preparation for LCA (Select Variables)

As for the measure of opportunities for students (for struggling students), fifteen items related to opportunities for students were used to represent responses by the school counselor, and eleven items were used to represent responses by the school administrator.

Rationale for using latent class analysis (LCA) is a person-oriented approach where a number of distinctly separable classes or types of people or groups are identified based on the probabilities of answering to various responses to a certain construct.

f1 <- as.formula(cbind(c1cluster, c1indvcrs, c1intern, c1jobfair, c1jobguide, c1employer, c1awareness, c1careerunit, c1workstudy, c1careerday, c1assemblies,
                           c1jobinfocmp, c1jobinfonon, c1hstowrkoth, c1hstoworkno)~1)

f2 <- as.formula(cbind(a1mspdlearn, a1mspdintrst, a1msother, a1msnone, a1g9summer, a1g9overage, a1g9communty, a1g9blocksch, a1g9double, a1g9study, a1g9teacher)~1)

max_II <- -100000
min_bic <- 100000
for (i in 1:4) {
  lc <- poLCA(f1, df, nclass = i, maxiter = 3000,
              tol = 1e-5, na.rm = FALSE, nrep = 10,
              verbose = TRUE, calc.se = TRUE)
  if(lc$bic < min_bic) {
    min_bic <- lc$bic
    LCA_best_model <- lc
  }
}

## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.4932 0.5068
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0489 0.9511
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.5709 0.4291
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.7213 0.2787
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.6897 0.3103
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.3934 0.6066
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.3341 0.6659
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.6101 0.3899
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.4348 0.5652
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.6049 0.3951
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.5795 0.4205
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.2024 0.7976
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.6445 0.3555
## 
## $c1hstowrkoth
##           Pr(1) Pr(2)
## class 1:  0.755 0.245
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.9783 0.0217
## 
## Estimated class population shares 
##  1 
##  
## Predicted class memberships (by modal posterior prob.) 
##  1 
##  
## ========================================================= 
## Fit for 1 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 15 
## residual degrees of freedom: 23488 
## maximum log-likelihood: -202373.5 
##  
## AIC(1): 404777
## BIC(1): 404898
## G^2(1): 124537.1 (Likelihood ratio/deviance statistic) 
## X^2(1): 3920120 (Chi-square goodness of fit) 
##  
## Model 1: llik = -187489.6 ... best llik = -187489.6
## Model 2: llik = -187489.6 ... best llik = -187489.6
## Model 3: llik = -187489.6 ... best llik = -187489.6
## Model 4: llik = -187489.6 ... best llik = -187489.6
## Model 5: llik = -187489.6 ... best llik = -187489.6
## Model 6: llik = -187489.6 ... best llik = -187489.6
## Model 7: llik = -187489.6 ... best llik = -187489.6
## Model 8: llik = -187489.6 ... best llik = -187489.6
## Model 9: llik = -187489.6 ... best llik = -187489.6
## Model 10: llik = -187489.6 ... best llik = -187489.6
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.6810 0.3190
## class 2:  0.2455 0.7545
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0375 0.9625
## class 2:  0.0640 0.9360
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.7624 0.2376
## class 2:  0.3186 0.6814
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.8888 0.1112
## class 2:  0.5003 0.4997
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.8201 0.1799
## class 2:  0.5177 0.4823
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.5443 0.4557
## class 2:  0.1943 0.8057
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.4679 0.5321
## class 2:  0.1578 0.8422
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.8323 0.1677
## class 2:  0.3171 0.6829
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.5173 0.4827
## class 2:  0.3260 0.6740
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.7731 0.2269
## class 2:  0.3832 0.6168
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.8236 0.1764
## class 2:  0.2577 0.7423
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.3217 0.6783
## class 2:  0.0450 0.9550
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.8692 0.1308
## class 2:  0.3482 0.6518
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  0.8681 0.1319
## class 2:  0.6058 0.3942
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.9618 0.0382
## class 2:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.5687 0.4313 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.5717 0.4283 
##  
## ========================================================= 
## Fit for 2 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 31 
## residual degrees of freedom: 23472 
## maximum log-likelihood: -187489.6 
##  
## AIC(2): 375041.2
## BIC(2): 375291.2
## G^2(2): 94769.25 (Likelihood ratio/deviance statistic) 
## X^2(2): 773166.1 (Chi-square goodness of fit) 
##  
## Model 1: llik = -181024 ... best llik = -181024
## Model 2: llik = -181024 ... best llik = -181024
## Model 3: llik = -181024 ... best llik = -181024
## Model 4: llik = -181024 ... best llik = -181024
## Model 5: llik = -186446.1 ... best llik = -181024
## Model 6: llik = -181024 ... best llik = -181024
## Model 7: llik = -181024 ... best llik = -181024
## Model 8: llik = -184451.8 ... best llik = -181024
## Model 9: llik = -181024 ... best llik = -181024
## Model 10: llik = -184451.8 ... best llik = -181024
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.6189 0.3811
## class 2:  0.2659 0.7341
## class 3:  0.9787 0.0213
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0499 0.9501
## class 2:  0.0593 0.9407
## class 3:  0.0000 1.0000
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.7113 0.2887
## class 2:  0.3432 0.6568
## class 3:  1.0000 0.0000
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.8681 0.1319
## class 2:  0.5223 0.4777
## class 3:  1.0000 0.0000
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.7871 0.2129
## class 2:  0.5295 0.4705
## class 3:  0.9970 0.0030
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.7127 0.2873
## class 2:  0.1902 0.8098
## class 3:  0.0039 0.9961
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.6165 0.3835
## class 2:  0.1521 0.8479
## class 3:  0.0000 1.0000
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.8074 0.1926
## class 2:  0.3392 0.6608
## class 3:  1.0000 0.0000
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.6627 0.3373
## class 2:  0.3251 0.6749
## class 3:  0.0032 0.9968
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.7315 0.2685
## class 2:  0.4009 0.5991
## class 3:  0.9865 0.0135
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.7980 0.2020
## class 2:  0.2821 0.7179
## class 3:  1.0000 0.0000
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.4209 0.5791
## class 2:  0.0485 0.9515
## class 3:  0.0000 1.0000
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.8410 0.1590
## class 2:  0.3823 0.6177
## class 3:  1.0000 0.0000
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  0.8340 0.1660
## class 2:  0.6259 0.3741
## class 3:  1.0000 0.0000
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.9491 0.0509
## class 2:  1.0000 0.0000
## class 3:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.4271 0.4655 0.1073 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.4201 0.4712 0.1088 
##  
## ========================================================= 
## Fit for 3 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 47 
## residual degrees of freedom: 23456 
## maximum log-likelihood: -181024 
##  
## AIC(3): 362142.1
## BIC(3): 362521.1
## G^2(3): 81838.14 (Likelihood ratio/deviance statistic) 
## X^2(3): 494691.4 (Chi-square goodness of fit) 
##  
## Model 1: llik = -178320.5 ... best llik = -178320.5
## Model 2: llik = -178316.4 ... best llik = -178316.4
## Model 3: llik = -178333.6 ... best llik = -178316.4
## Model 4: llik = -178316.4 ... best llik = -178316.4
## Model 5: llik = -178316.4 ... best llik = -178316.4
## Model 6: llik = -178320.5 ... best llik = -178316.4
## Model 7: llik = -178316.4 ... best llik = -178316.4
## Model 8: llik = -178333.6 ... best llik = -178316.4
## Model 9: llik = -178333.6 ... best llik = -178316.4
## Model 10: llik = -178316.4 ... best llik = -178316.4
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.9886 0.0114
## class 2:  0.2249 0.7751
## class 3:  0.4859 0.5141
## class 4:  0.9845 0.0155
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0053 0.9947
## class 2:  0.0609 0.9391
## class 3:  0.0581 0.9419
## class 4:  0.0000 1.0000
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.9219 0.0781
## class 2:  0.2890 0.7110
## class 3:  0.6067 0.3933
## class 4:  1.0000 0.0000
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.9560 0.0440
## class 2:  0.4356 0.5644
## class 3:  0.8083 0.1917
## class 4:  1.0000 0.0000
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.9476 0.0524
## class 2:  0.5019 0.4981
## class 3:  0.7049 0.2951
## class 4:  0.9992 0.0008
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.9505 0.0495
## class 2:  0.1115 0.8885
## class 3:  0.5716 0.4284
## class 4:  0.0089 0.9911
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.9326 0.0674
## class 2:  0.1150 0.8850
## class 3:  0.4560 0.5440
## class 4:  0.0000 1.0000
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.9215 0.0785
## class 2:  0.2440 0.7560
## class 3:  0.7131 0.2869
## class 4:  1.0000 0.0000
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.9873 0.0127
## class 2:  0.2814 0.7186
## class 3:  0.5419 0.4581
## class 4:  0.0057 0.9943
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.9356 0.0644
## class 2:  0.3410 0.6590
## class 3:  0.6418 0.3582
## class 4:  0.9892 0.0108
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.9218 0.0782
## class 2:  0.1837 0.8163
## class 3:  0.6903 0.3097
## class 4:  1.0000 0.0000
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.8252 0.1748
## class 2:  0.0103 0.9897
## class 3:  0.2756 0.7244
## class 4:  0.0000 1.0000
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.9377 0.0623
## class 2:  0.2713 0.7287
## class 3:  0.7620 0.2380
## class 4:  1.0000 0.0000
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  0.9411 0.0589
## class 2:  0.5794 0.4206
## class 3:  0.7867 0.2133
## class 4:  1.0000 0.0000
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.7042 0.2958
## class 2:  1.0000 0.0000
## class 3:  1.0000 0.0000
## class 4:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.0735 0.3175 0.5024 0.1067 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0701 0.3129 0.5089 0.1082 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 63 
## residual degrees of freedom: 23440 
## maximum log-likelihood: -178316.4 
##  
## AIC(4): 356758.8
## BIC(4): 357266.8
## G^2(4): 76422.84 (Likelihood ratio/deviance statistic) 
## X^2(4): 468266.4 (Chi-square goodness of fit) 
##

LCA_best_model <- lc
LCA_best_model

## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.9886 0.0114
## class 2:  0.2249 0.7751
## class 3:  0.4859 0.5141
## class 4:  0.9845 0.0155
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0053 0.9947
## class 2:  0.0609 0.9391
## class 3:  0.0581 0.9419
## class 4:  0.0000 1.0000
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.9219 0.0781
## class 2:  0.2890 0.7110
## class 3:  0.6067 0.3933
## class 4:  1.0000 0.0000
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.9560 0.0440
## class 2:  0.4356 0.5644
## class 3:  0.8083 0.1917
## class 4:  1.0000 0.0000
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.9476 0.0524
## class 2:  0.5019 0.4981
## class 3:  0.7049 0.2951
## class 4:  0.9992 0.0008
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.9505 0.0495
## class 2:  0.1115 0.8885
## class 3:  0.5716 0.4284
## class 4:  0.0089 0.9911
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.9326 0.0674
## class 2:  0.1150 0.8850
## class 3:  0.4560 0.5440
## class 4:  0.0000 1.0000
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.9215 0.0785
## class 2:  0.2440 0.7560
## class 3:  0.7131 0.2869
## class 4:  1.0000 0.0000
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.9873 0.0127
## class 2:  0.2814 0.7186
## class 3:  0.5419 0.4581
## class 4:  0.0057 0.9943
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.9356 0.0644
## class 2:  0.3410 0.6590
## class 3:  0.6418 0.3582
## class 4:  0.9892 0.0108
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.9218 0.0782
## class 2:  0.1837 0.8163
## class 3:  0.6903 0.3097
## class 4:  1.0000 0.0000
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.8252 0.1748
## class 2:  0.0103 0.9897
## class 3:  0.2756 0.7244
## class 4:  0.0000 1.0000
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.9377 0.0623
## class 2:  0.2713 0.7287
## class 3:  0.7620 0.2380
## class 4:  1.0000 0.0000
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  0.9411 0.0589
## class 2:  0.5794 0.4206
## class 3:  0.7867 0.2133
## class 4:  1.0000 0.0000
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.7042 0.2958
## class 2:  1.0000 0.0000
## class 3:  1.0000 0.0000
## class 4:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.0735 0.3175 0.5024 0.1067 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0701 0.3129 0.5089 0.1082 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 63 
## residual degrees of freedom: 23440 
## maximum log-likelihood: -178316.4 
##  
## AIC(4): 356758.8
## BIC(4): 357266.8
## G^2(4): 76422.84 (Likelihood ratio/deviance statistic) 
## X^2(4): 468266.4 (Chi-square goodness of fit) 
##

Brief Interpretation

For the best model, the probability for counselors to NOT provide opportunities for students should be: Class 3 > Class 1 > Class 2 > Class 4

fit.indices <- data.frame(
  Model = c("Model 1", "Model 2", "Model 3", "Model 4"),
  log_liklihood = c(-202373.5, -187489.6, -181024, -178316.4),
  resid.df = c(23488, 23472, 23456, 23440),
  BIC = c(404898, 375291.2, 362521.1, 357266.8),
  AIC = c(404777, 375041.2, 362142.1, 356758.8),
  likelihood_ratio = c(124537.1, 94769.25, 81838.14, 76422.84)
)

fit.indices %>%
  kbl(caption = "Fit indices for Opportunites for students by School Counselor") %>%
  kable_styling(full_width = F)

Fit indices for Opportunites for students by School Counselor
Model	log_liklihood	resid.df	BIC	AIC	likelihood_ratio
Model 1	-202373.5	23488	404898.0	404777.0	124537.10
Model 2	-187489.6	23472	375291.2	375041.2	94769.25
Model 3	-181024.0	23456	362521.1	362142.1	81838.14
Model 4	-178316.4	23440	357266.8	356758.8	76422.84

max_II <- -100000
min_bic <- 100000
for (i in 1:4) {
  lc2 <- poLCA(f2, df, nclass = i, maxiter = 3000,
              tol = 1e-5, na.rm = FALSE, nrep = 10,
              verbose = TRUE, calc.se = TRUE)
  if(lc2$bic < min_bic) {
    min_bic <- lc2$bic
    LCA_best_model2 <- lc2
  }
}

## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##           Pr(1) Pr(2)
## class 1:  0.355 0.645
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.6436 0.3564
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.7827 0.2173
## 
## $a1msnone
##            Pr(1)  Pr(2)
## class 1:  0.9811 0.0189
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.6469 0.3531
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  0.8496 0.1504
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.7526 0.2474
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.1222 0.8778
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.4038 0.5962
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.6392 0.3608
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.6227 0.3773
## 
## Estimated class population shares 
##  1 
##  
## Predicted class memberships (by modal posterior prob.) 
##  1 
##  
## ========================================================= 
## Fit for 1 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 11 
## residual degrees of freedom: 2036 
## maximum log-likelihood: -138994.1 
##  
## AIC(1): 278010.2
## BIC(1): 278098.9
## G^2(1): 40874.66 (Likelihood ratio/deviance statistic) 
## X^2(1): 151829.9 (Chi-square goodness of fit) 
##  
## Model 1: llik = -132319.4 ... best llik = -132319.4
## Model 2: llik = -132319.4 ... best llik = -132319.4
## Model 3: llik = -132319.4 ... best llik = -132319.4
## Model 4: llik = -132319.4 ... best llik = -132319.4
## Model 5: llik = -132319.4 ... best llik = -132319.4
## Model 6: llik = -132319.4 ... best llik = -132319.4
## Model 7: llik = -132319.4 ... best llik = -132319.4
## Model 8: llik = -132319.4 ... best llik = -132319.4
## Model 9: llik = -132319.4 ... best llik = -132319.4
## Model 10: llik = -132319.4 ... best llik = -132319.4
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##            Pr(1)  Pr(2)
## class 1:  0.1604 0.8396
## class 2:  0.4520 0.5480
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.2981 0.7019
## class 2:  0.8157 0.1843
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.6893 0.3107
## class 2:  0.8292 0.1708
## 
## $a1msnone
##            Pr(1)  Pr(2)
## class 1:  0.9970 0.0030
## class 2:  0.9732 0.0268
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.4582 0.5418
## class 2:  0.7409 0.2591
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  0.6233 0.3767
## class 2:  0.9623 0.0377
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.4739 0.5261
## class 2:  0.8915 0.1085
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0816 0.9184
## class 2:  0.1425 0.8575
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.3390 0.6610
## class 2:  0.4361 0.5639
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.4140 0.5860
## class 2:  0.7514 0.2486
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.1777 0.8223
## class 2:  0.8445 0.1555
## 
## Estimated class population shares 
##  0.3326 0.6674 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.3301 0.6699 
##  
## ========================================================= 
## Fit for 2 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 23 
## residual degrees of freedom: 2024 
## maximum log-likelihood: -132319.4 
##  
## AIC(2): 264684.9
## BIC(2): 264870.4
## G^2(2): 27525.34 (Likelihood ratio/deviance statistic) 
## X^2(2): 111654 (Chi-square goodness of fit) 
##  
## Model 1: llik = -131218.7 ... best llik = -131218.7
## Model 2: llik = -129611.7 ... best llik = -129611.7
## Model 3: llik = -129611.7 ... best llik = -129611.7
## Model 4: llik = -129611.7 ... best llik = -129611.7
## Model 5: llik = -131218.7 ... best llik = -129611.7
## Model 6: llik = -129611.7 ... best llik = -129611.7
## Model 7: llik = -129611.7 ... best llik = -129611.7
## Model 8: llik = -129611.7 ... best llik = -129611.7
## Model 9: llik = -129611.7 ... best llik = -129611.7
## Model 10: llik = -129611.7 ... best llik = -129611.7
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.1039 0.8961
## class 3:  0.6636 0.3364
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.9024 0.0976
## class 2:  0.2812 0.7188
## class 3:  0.8249 0.1751
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.9501 0.0499
## class 2:  0.7064 0.2936
## class 3:  0.7821 0.2179
## 
## $a1msnone
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  1.0000 0.0000
## class 3:  0.9606 0.0394
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.9983 0.0017
## class 2:  0.4828 0.5172
## class 3:  0.6488 0.3512
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.6526 0.3474
## class 3:  0.9448 0.0552
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.9983 0.0017
## class 2:  0.5149 0.4851
## class 3:  0.8456 0.1544
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.0752 0.9248
## class 3:  0.1991 0.8009
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.0603 0.9397
## class 2:  0.3305 0.6695
## class 3:  0.5759 0.4241
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.9610 0.0390
## class 2:  0.4448 0.5552
## class 3:  0.6738 0.3262
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.9790 0.0210
## class 2:  0.2347 0.7653
## class 3:  0.7898 0.2102
## 
## Estimated class population shares 
##  0.164 0.3569 0.4791 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.1756 0.3547 0.4696 
##  
## ========================================================= 
## Fit for 3 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 35 
## residual degrees of freedom: 2012 
## maximum log-likelihood: -129611.7 
##  
## AIC(3): 259293.3
## BIC(3): 259575.6
## G^2(3): 22109.79 (Likelihood ratio/deviance statistic) 
## X^2(3): 56826.11 (Chi-square goodness of fit) 
##  
## Model 1: llik = -128102.5 ... best llik = -128102.5
## Model 2: llik = -128102.5 ... best llik = -128102.5
## Model 3: llik = -128102.5 ... best llik = -128102.5
## Model 4: llik = -128102.5 ... best llik = -128102.5
## Model 5: llik = -128102.5 ... best llik = -128102.5
## Model 6: llik = -128102.5 ... best llik = -128102.5
## Model 7: llik = -128102.5 ... best llik = -128102.5
## Model 8: llik = -128102.5 ... best llik = -128102.5
## Model 9: llik = -128102.5 ... best llik = -128102.5
## Model 10: llik = -128102.5 ... best llik = -128102.5
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.1298 0.8702
## class 3:  1.0000 0.0000
## class 4:  0.1738 0.8262
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.9735 0.0265
## class 2:  0.3053 0.6947
## class 3:  1.0000 0.0000
## class 4:  0.4330 0.5670
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.9468 0.0532
## class 2:  0.6101 0.3899
## class 3:  0.7430 0.2570
## class 4:  0.8248 0.1752
## 
## $a1msnone
##           Pr(1) Pr(2)
## class 1:  1.000 0.000
## class 2:  1.000 0.000
## class 3:  0.928 0.072
## class 4:  1.000 0.000
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.9937 0.0063
## class 2:  0.4192 0.5808
## class 3:  0.6726 0.3274
## class 4:  0.5979 0.4021
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.4640 0.5360
## class 3:  0.9350 0.0650
## class 4:  0.9145 0.0855
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.9923 0.0077
## class 2:  0.3125 0.6875
## class 3:  0.8217 0.1783
## class 4:  0.8183 0.1817
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.0487 0.9513
## class 3:  0.1886 0.8114
## class 4:  0.1620 0.8380
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.0505 0.9495
## class 2:  0.1917 0.8083
## class 3:  0.5639 0.4361
## class 4:  0.5393 0.4607
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.9610 0.0390
## class 2:  0.3716 0.6284
## class 3:  0.6835 0.3165
## class 4:  0.6067 0.3933
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.9872 0.0128
## class 2:  0.0846 0.9154
## class 3:  0.7891 0.2109
## class 4:  0.6196 0.3804
## 
## Estimated class population shares 
##  0.1582 0.1861 0.2625 0.3932 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.1562 0.1659 0.2941 0.3839 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 47 
## residual degrees of freedom: 2000 
## maximum log-likelihood: -128102.5 
##  
## AIC(4): 256299
## BIC(4): 256678
## G^2(4): 19091.43 (Likelihood ratio/deviance statistic) 
## X^2(4): 37904.92 (Chi-square goodness of fit) 
##

LCA_best_model2 <- lc2
LCA_best_model2

## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.1298 0.8702
## class 3:  1.0000 0.0000
## class 4:  0.1738 0.8262
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.9735 0.0265
## class 2:  0.3053 0.6947
## class 3:  1.0000 0.0000
## class 4:  0.4330 0.5670
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.9468 0.0532
## class 2:  0.6101 0.3899
## class 3:  0.7430 0.2570
## class 4:  0.8248 0.1752
## 
## $a1msnone
##           Pr(1) Pr(2)
## class 1:  1.000 0.000
## class 2:  1.000 0.000
## class 3:  0.928 0.072
## class 4:  1.000 0.000
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.9937 0.0063
## class 2:  0.4192 0.5808
## class 3:  0.6726 0.3274
## class 4:  0.5979 0.4021
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.4640 0.5360
## class 3:  0.9350 0.0650
## class 4:  0.9145 0.0855
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.9923 0.0077
## class 2:  0.3125 0.6875
## class 3:  0.8217 0.1783
## class 4:  0.8183 0.1817
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.0487 0.9513
## class 3:  0.1886 0.8114
## class 4:  0.1620 0.8380
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.0505 0.9495
## class 2:  0.1917 0.8083
## class 3:  0.5639 0.4361
## class 4:  0.5393 0.4607
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.9610 0.0390
## class 2:  0.3716 0.6284
## class 3:  0.6835 0.3165
## class 4:  0.6067 0.3933
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.9872 0.0128
## class 2:  0.0846 0.9154
## class 3:  0.7891 0.2109
## class 4:  0.6196 0.3804
## 
## Estimated class population shares 
##  0.1582 0.1861 0.2625 0.3932 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.1562 0.1659 0.2941 0.3839 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## number of observations: 23503 
## number of estimated parameters: 47 
## residual degrees of freedom: 2000 
## maximum log-likelihood: -128102.5 
##  
## AIC(4): 256299
## BIC(4): 256678
## G^2(4): 19091.43 (Likelihood ratio/deviance statistic) 
## X^2(4): 37904.92 (Chi-square goodness of fit) 
##

Brief Interpretation

For the best model,

fit.indices2 <- data.frame(
  Model = c("Model 1", "Model 2", "Model 3", "Model 4"),
  log_liklihood = c(-138994.1, -132319.4, -129611.7, -128102.5),
  resid.df = c(2036, 2024, 2012, 2000),
  BIC = c(278098.9, 264870.4, 259575.6, 256678),
  AIC = c(278010.2, 264684.9, 259293.3, 256299),
  likelihood_ratio = c(40874.66, 27525.34, 22109.79, 19091.43)
)

fit.indices2 %>%
  kbl(caption = "Fit indices for Opportunites for students by School Administrator") %>%
  kable_styling(full_width = F)

Fit indices for Opportunites for students by School Administrator
Model	log_liklihood	resid.df	BIC	AIC	likelihood_ratio
Model 1	-138994.1	2036	278098.9	278010.2	40874.66
Model 2	-132319.4	2024	264870.4	264684.9	27525.34
Model 3	-129611.7	2012	259575.6	259293.3	22109.79
Model 4	-128102.5	2000	256678.0	256299.0	19091.43

set.seed(112)
plot(LCA_best_model)

plot(LCA_best_model2)

Adding student gender as a covariates

f3 <- as.formula(cbind(c1cluster, c1indvcrs, c1intern, c1jobfair, c1jobguide, c1employer, c1awareness, c1careerunit, c1workstudy, c1careerday, c1assemblies,c1jobinfocmp, c1jobinfonon, c1hstowrkoth, c1hstoworkno)~x1sex)

f4 <- as.formula(cbind(a1mspdlearn, a1mspdintrst, a1msother, a1msnone, a1g9summer, a1g9overage, a1g9communty, a1g9blocksch, a1g9double, a1g9study, a1g9teacher)~x1sex)

LCA <- poLCA(f3, data = df, nclass = 4, maxiter = 30000, nrep = 5)

## Model 1: llik = -180983 ... best llik = -180983
## Model 2: llik = -178272.7 ... best llik = -178272.7
## Model 3: llik = -178272.7 ... best llik = -178272.7
## Model 4: llik = -178272.7 ... best llik = -178272.7
## Model 5: llik = -178272.7 ... best llik = -178272.7
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.9845 0.0155
## class 2:  0.9887 0.0113
## class 3:  0.2252 0.7748
## class 4:  0.4869 0.5131
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.0054 0.9946
## class 3:  0.0610 0.9390
## class 4:  0.0580 0.9420
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9220 0.0780
## class 3:  0.2895 0.7105
## class 4:  0.6075 0.3925
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9567 0.0433
## class 3:  0.4362 0.5638
## class 4:  0.8088 0.1912
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.9992 0.0008
## class 2:  0.9476 0.0524
## class 3:  0.5020 0.4980
## class 4:  0.7057 0.2943
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.0088 0.9912
## class 2:  0.9511 0.0489
## class 3:  0.1121 0.8879
## class 4:  0.5727 0.4273
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.9353 0.0647
## class 3:  0.1155 0.8845
## class 4:  0.4567 0.5433
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9219 0.0781
## class 3:  0.2447 0.7553
## class 4:  0.7140 0.2860
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  0.0057 0.9943
## class 2:  0.9886 0.0114
## class 3:  0.2819 0.7181
## class 4:  0.5425 0.4575
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.9892 0.0108
## class 2:  0.9374 0.0626
## class 3:  0.3417 0.6583
## class 4:  0.6422 0.3578
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9222 0.0778
## class 3:  0.1843 0.8157
## class 4:  0.6912 0.3088
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.8266 0.1734
## class 3:  0.0107 0.9893
## class 4:  0.2764 0.7236
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9381 0.0619
## class 3:  0.2721 0.7279
## class 4:  0.7628 0.2372
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.9421 0.0579
## class 3:  0.5797 0.4203
## class 4:  0.7871 0.2129
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.7013 0.2987
## class 3:  1.0000 0.0000
## class 4:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.1066 0.0728 0.3186 0.502 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.1081 0.0699 0.3151 0.5069 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## 2 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)    -0.42871     0.05578   -7.686     0.000
## x1sexMale       0.09013     0.06868    1.312     0.189
## ========================================================= 
## 3 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)     1.12855     0.03572   31.590     0.000
## x1sexMale      -0.06607     0.04797   -1.377     0.168
## ========================================================= 
## 4 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)     1.56849     0.03357   46.729     0.000
## x1sexMale      -0.03650     0.04569   -0.799     0.424
## ========================================================= 
## number of observations: 23497 
## number of estimated parameters: 66 
## residual degrees of freedom: 23431 
## maximum log-likelihood: -178272.7 
##  
## AIC(4): 356677.4
## BIC(4): 357209.7
## X^2(4): 468140.2 (Chi-square goodness of fit) 
##  
## ALERT: estimation algorithm automatically restarted with new initial values 
##

probs.start <- LCA$probs.start
plot(LCA)

LCA2 <- poLCA(f4, data = df, nclass = 4, maxiter = 30000, nrep = 5)

## Model 1: llik = -132296.9 ... best llik = -132296.9
## Model 2: llik = -128084.6 ... best llik = -128084.6
## Model 3: llik = -128084.6 ... best llik = -128084.6
## Model 4: llik = -138969 ... best llik = -128084.6
## Model 5: llik = -138969 ... best llik = -128084.6
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.1298 0.8702
## class 3:  1.0000 0.0000
## class 4:  0.1737 0.8263
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.9738 0.0262
## class 2:  0.3053 0.6947
## class 3:  1.0000 0.0000
## class 4:  0.4328 0.5672
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.9467 0.0533
## class 2:  0.6100 0.3900
## class 3:  0.7430 0.2570
## class 4:  0.8247 0.1753
## 
## $a1msnone
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  1.0000 0.0000
## class 3:  0.9281 0.0719
## class 4:  1.0000 0.0000
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.9938 0.0062
## class 2:  0.4191 0.5809
## class 3:  0.6726 0.3274
## class 4:  0.5980 0.4020
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.4638 0.5362
## class 3:  0.9350 0.0650
## class 4:  0.9144 0.0856
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.9923 0.0077
## class 2:  0.3122 0.6878
## class 3:  0.8217 0.1783
## class 4:  0.8183 0.1817
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0000 1.0000
## class 2:  0.0487 0.9513
## class 3:  0.1886 0.8114
## class 4:  0.1618 0.8382
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.0508 0.9492
## class 2:  0.1916 0.8084
## class 3:  0.5639 0.4361
## class 4:  0.5390 0.4610
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.9609 0.0391
## class 2:  0.3714 0.6286
## class 3:  0.6835 0.3165
## class 4:  0.6068 0.3932
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.9873 0.0127
## class 2:  0.0845 0.9155
## class 3:  0.7891 0.2109
## class 4:  0.6194 0.3806
## 
## Estimated class population shares 
##  0.1579 0.186 0.2626 0.3935 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.156 0.1659 0.2942 0.3839 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## 2 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)     0.18072     0.04557    3.966     0.000
## x1sexMale      -0.03346     0.05061   -0.661     0.509
## ========================================================= 
## 3 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)     0.52177     0.03494   14.935     0.000
## x1sexMale      -0.02561     0.04463   -0.574     0.566
## ========================================================= 
## 4 / 1 
##             Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)     0.94016     0.03986   23.584     0.000
## x1sexMale      -0.05350     0.04562   -1.173     0.241
## ========================================================= 
## number of observations: 23497 
## number of estimated parameters: 50 
## residual degrees of freedom: 1997 
## maximum log-likelihood: -128084.6 
##  
## AIC(4): 256269.2
## BIC(4): 256672.5
## X^2(4): 37908.8 (Chi-square goodness of fit) 
##  
## ALERT: estimation algorithm automatically restarted with new initial values 
##

probs.start <- LCA2$probs.start
plot(LCA2)

LCA with covariates - student gender, race, locale and region

f5 <- as.formula(cbind(c1cluster, c1indvcrs, c1intern, c1jobfair, c1jobguide, c1employer, c1awareness, c1careerunit, c1workstudy, c1careerday, c1assemblies,c1jobinfocmp, c1jobinfonon, c1hstowrkoth, c1hstoworkno)~x1sex+x1black+x1white+x1hispanic+x1locale+x1region)

f6 <- as.formula(cbind(a1mspdlearn, a1mspdintrst, a1msother, a1msnone, a1g9summer, a1g9overage, a1g9communty, a1g9blocksch, a1g9double, a1g9study, a1g9teacher)~x1sex+x1black+x1white+x1hispanic+x1locale+x1region)

LCA3 <- poLCA(f5, data = df, nclass = 4, maxiter = 30000, nrep = 5)

## Model 1: llik = -172563.6 ... best llik = -172563.6
## Model 2: llik = -167186.3 ... best llik = -167186.3
## Model 3: llik = -167186.3 ... best llik = -167186.3
## Model 4: llik = -167186.3 ... best llik = -167186.3
## Model 5: llik = -167304 ... best llik = -167186.3
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $c1cluster
##            Pr(1)  Pr(2)
## class 1:  0.9897 0.0103
## class 2:  0.9845 0.0155
## class 3:  0.2283 0.7717
## class 4:  0.4980 0.5020
## 
## $c1indvcrs
##            Pr(1)  Pr(2)
## class 1:  0.0054 0.9946
## class 2:  0.0000 1.0000
## class 3:  0.0605 0.9395
## class 4:  0.0546 0.9454
## 
## $c1intern
##            Pr(1)  Pr(2)
## class 1:  0.9234 0.0766
## class 2:  1.0000 0.0000
## class 3:  0.2953 0.7047
## class 4:  0.6160 0.3840
## 
## $c1jobfair
##            Pr(1)  Pr(2)
## class 1:  0.9666 0.0334
## class 2:  1.0000 0.0000
## class 3:  0.4411 0.5589
## class 4:  0.8160 0.1840
## 
## $c1jobguide
##            Pr(1)  Pr(2)
## class 1:  0.9543 0.0457
## class 2:  0.9991 0.0009
## class 3:  0.5059 0.4941
## class 4:  0.7099 0.2901
## 
## $c1employer
##            Pr(1)  Pr(2)
## class 1:  0.9537 0.0463
## class 2:  0.0022 0.9978
## class 3:  0.1175 0.8825
## class 4:  0.5816 0.4184
## 
## $c1awareness
##            Pr(1)  Pr(2)
## class 1:  0.9399 0.0601
## class 2:  0.0000 1.0000
## class 3:  0.1188 0.8812
## class 4:  0.4623 0.5377
## 
## $c1careerunit
##            Pr(1)  Pr(2)
## class 1:  0.9261 0.0739
## class 2:  1.0000 0.0000
## class 3:  0.2581 0.7419
## class 4:  0.7187 0.2813
## 
## $c1workstudy
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  0.0002 0.9998
## class 3:  0.2866 0.7134
## class 4:  0.5464 0.4536
## 
## $c1careerday
##            Pr(1)  Pr(2)
## class 1:  0.9477 0.0523
## class 2:  0.9891 0.0109
## class 3:  0.3508 0.6492
## class 4:  0.6447 0.3553
## 
## $c1assemblies
##            Pr(1)  Pr(2)
## class 1:  0.9307 0.0693
## class 2:  1.0000 0.0000
## class 3:  0.1886 0.8114
## class 4:  0.7010 0.2990
## 
## $c1jobinfocmp
##            Pr(1)  Pr(2)
## class 1:  0.8393 0.1607
## class 2:  0.0000 1.0000
## class 3:  0.0157 0.9843
## class 4:  0.2808 0.7192
## 
## $c1jobinfonon
##            Pr(1)  Pr(2)
## class 1:  0.9387 0.0613
## class 2:  1.0000 0.0000
## class 3:  0.2794 0.7206
## class 4:  0.7726 0.2274
## 
## $c1hstowrkoth
##            Pr(1)  Pr(2)
## class 1:  0.9495 0.0505
## class 2:  1.0000 0.0000
## class 3:  0.5807 0.4193
## class 4:  0.7908 0.2092
## 
## $c1hstoworkno
##            Pr(1)  Pr(2)
## class 1:  0.6731 0.3269
## class 2:  1.0000 0.0000
## class 3:  1.0000 0.0000
## class 4:  1.0000 0.0000
## 
## Estimated class population shares 
##  0.0692 0.1016 0.3293 0.4999 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.0708 0.104 0.3284 0.4969 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## 2 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           0.16919     0.13567    1.247     0.212
## x1sexMale            -0.10118     0.07361   -1.375     0.169
## x1black              -0.09856     0.11575   -0.852     0.394
## x1white              -0.22175     0.09643   -2.300     0.021
## x1hispanic            0.46481     0.10536    4.411     0.000
## x1localeRural         0.77854     0.12282    6.339     0.000
## x1localeSuburb        1.00359     0.09471   10.596     0.000
## x1localeTown          0.37831     0.14497    2.610     0.009
## x1regionNortheast     0.81378     0.14419    5.644     0.000
## x1regionSouth        -0.40145     0.10213   -3.931     0.000
## x1regionWest         -0.40066     0.12935   -3.097     0.002
## ========================================================= 
## 3 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           0.93077     0.12037    7.733     0.000
## x1sexMale            -0.16552     0.06476   -2.556     0.011
## x1black               0.12657     0.10392    1.218     0.223
## x1white               0.07192     0.08745    0.822     0.411
## x1hispanic            0.18632     0.09632    1.934     0.053
## x1localeRural         1.32780     0.10881   12.203     0.000
## x1localeSuburb        0.94142     0.08531   11.035     0.000
## x1localeTown          1.02538     0.12559    8.165     0.000
## x1regionNortheast     0.53553     0.13639    3.926     0.000
## x1regionSouth        -0.14909     0.09025   -1.652     0.099
## x1regionWest         -0.21285     0.11640   -1.829     0.067
## ========================================================= 
## 4 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           1.58079     0.12069   13.098     0.000
## x1sexMale            -0.11699     0.06511   -1.797     0.072
## x1black               0.08705     0.10446    0.833     0.405
## x1white               0.05584     0.08766    0.637     0.524
## x1hispanic            0.28351     0.09669    2.932     0.003
## x1localeRural         1.29271     0.11082   11.665     0.000
## x1localeSuburb        1.04618     0.08605   12.159     0.000
## x1localeTown          0.94291     0.12661    7.447     0.000
## x1regionNortheast     0.21312     0.13709    1.555     0.120
## x1regionSouth        -0.52924     0.09042   -5.853     0.000
## x1regionWest         -0.58730     0.11671   -5.032     0.000
## ========================================================= 
## number of observations: 22046 
## number of estimated parameters: 93 
## residual degrees of freedom: 21953 
## maximum log-likelihood: -167186.3 
##  
## AIC(4): 334558.6
## BIC(4): 335302.6
## X^2(4): 445957.8 (Chi-square goodness of fit) 
##  
## ALERT: estimation algorithm automatically restarted with new initial values 
##

probs.start <- LCA3$probs.start
plot(LCA3)

LCA4 <- poLCA(f6, data = df, nclass = 4, maxiter = 30000, nrep = 5)

## Model 1: llik = -119605.3 ... best llik = -119605.3
## Model 2: llik = -120221.7 ... best llik = -119605.3
## Model 3: llik = -129656.2 ... best llik = -119605.3
## Model 4: llik = -120473.9 ... best llik = -119605.3
## Model 5: llik = -121543 ... best llik = -119605.3
## Conditional item response (column) probabilities,
##  by outcome variable, for each class (row) 
##  
## $a1mspdlearn
##           Pr(1) Pr(2)
## class 1:  0.120 0.880
## class 2:  0.000 1.000
## class 3:  0.184 0.816
## class 4:  1.000 0.000
## 
## $a1mspdintrst
##            Pr(1)  Pr(2)
## class 1:  0.3231 0.6769
## class 2:  0.9728 0.0272
## class 3:  0.4390 0.5610
## class 4:  1.0000 0.0000
## 
## $a1msother
##            Pr(1)  Pr(2)
## class 1:  0.6432 0.3568
## class 2:  0.9505 0.0495
## class 3:  0.8171 0.1829
## class 4:  0.7461 0.2539
## 
## $a1msnone
##            Pr(1)  Pr(2)
## class 1:  1.0000 0.0000
## class 2:  1.0000 0.0000
## class 3:  1.0000 0.0000
## class 4:  0.9282 0.0718
## 
## $a1g9summer
##            Pr(1)  Pr(2)
## class 1:  0.4408 0.5592
## class 2:  1.0000 0.0000
## class 3:  0.6033 0.3967
## class 4:  0.6690 0.3310
## 
## $a1g9overage
##            Pr(1)  Pr(2)
## class 1:  0.4934 0.5066
## class 2:  1.0000 0.0000
## class 3:  0.9293 0.0707
## class 4:  0.9321 0.0679
## 
## $a1g9communty
##            Pr(1)  Pr(2)
## class 1:  0.3012 0.6988
## class 2:  0.9916 0.0084
## class 3:  0.8553 0.1447
## class 4:  0.8209 0.1791
## 
## $a1g9blocksch
##            Pr(1)  Pr(2)
## class 1:  0.0587 0.9413
## class 2:  0.0000 1.0000
## class 3:  0.1596 0.8404
## class 4:  0.1883 0.8117
## 
## $a1g9double
##            Pr(1)  Pr(2)
## class 1:  0.2178 0.7822
## class 2:  0.0485 0.9515
## class 3:  0.5421 0.4579
## class 4:  0.5682 0.4318
## 
## $a1g9study
##            Pr(1)  Pr(2)
## class 1:  0.3877 0.6123
## class 2:  0.9585 0.0415
## class 3:  0.6175 0.3825
## class 4:  0.6803 0.3197
## 
## $a1g9teacher
##            Pr(1)  Pr(2)
## class 1:  0.1286 0.8714
## class 2:  0.9995 0.0005
## class 3:  0.6235 0.3765
## class 4:  0.7897 0.2103
## 
## Estimated class population shares 
##  0.206 0.1507 0.3812 0.2621 
##  
## Predicted class memberships (by modal posterior prob.) 
##  0.1901 0.1544 0.3603 0.2952 
##  
## ========================================================= 
## Fit for 4 latent classes: 
## ========================================================= 
## 2 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           0.11687     0.09417    1.241     0.215
## x1sexMale            -0.01375     0.05199   -0.264     0.792
## x1black               0.18737     0.07543    2.484     0.013
## x1white              -0.07310     0.06691   -1.093     0.275
## x1hispanic           -0.16485     0.07027   -2.346     0.019
## x1localeRural         0.53988     0.07698    7.013     0.000
## x1localeSuburb       -0.30381     0.06481   -4.688     0.000
## x1localeTown          0.09506     0.11126    0.854     0.393
## x1regionNortheast     0.47197     0.10192    4.631     0.000
## x1regionSouth        -0.67357     0.06849   -9.835     0.000
## x1regionWest         -0.76684     0.08770   -8.744     0.000
## ========================================================= 
## 3 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           0.62922     0.09620    6.540     0.000
## x1sexMale            -0.04634     0.05020   -0.923     0.356
## x1black              -0.24638     0.07777   -3.168     0.002
## x1white               0.26694     0.06655    4.011     0.000
## x1hispanic           -0.16612     0.06670   -2.491     0.013
## x1localeRural         0.50370     0.07715    6.529     0.000
## x1localeSuburb       -0.20002     0.06291   -3.179     0.001
## x1localeTown          0.83954     0.10009    8.388     0.000
## x1regionNortheast     0.78864     0.10201    7.731     0.000
## x1regionSouth        -0.52601     0.06798   -7.737     0.000
## x1regionWest         -0.40606     0.08336   -4.871     0.000
## ========================================================= 
## 4 / 1 
##                   Coefficient  Std. error  t value  Pr(>|t|)
## (Intercept)           0.51934     0.08760    5.929     0.000
## x1sexMale            -0.00341     0.04764   -0.072     0.943
## x1black              -0.30175     0.07491   -4.028     0.000
## x1white               0.39111     0.06366    6.144     0.000
## x1hispanic           -0.36457     0.06541   -5.574     0.000
## x1localeRural         0.83217     0.07159   11.624     0.000
## x1localeSuburb       -0.43264     0.05991   -7.222     0.000
## x1localeTown          0.72834     0.09614    7.576     0.000
## x1regionNortheast     0.41230     0.09639    4.277     0.000
## x1regionSouth        -1.31308     0.06525  -20.123     0.000
## x1regionWest         -0.38213     0.07310   -5.228     0.000
## ========================================================= 
## number of observations: 22046 
## number of estimated parameters: 77 
## residual degrees of freedom: 1970 
## maximum log-likelihood: -119605.3 
##  
## AIC(4): 239364.5
## BIC(4): 239980.6
## X^2(4): 34789.35 (Chi-square goodness of fit) 
##  
## ALERT: estimation algorithm automatically restarted with new initial values 
##

probs.start <- LCA4$probs.start
plot(LCA4)

Brief Interpretation

Limitations and Next Step Analysis

Concluding Remarks

Reference

Ingels, S. J., Pratt, D. J., Herget, D. R., Burns, L. J., Dever, J. A., Ottem, R., Rogers, J. E., Jin, Y., & Leinwand, S. (2011). High school longitudinal study of 2009 (HSLS: 09): Base-year data file documentation. NCES 2011-328. National Center for Education Statistics.

Kurban, E. R., & Cabrera, A. F. (2020). Building readiness and intention towards STEM fields of study: Using HSLS: 09 and SEM to examine this complex process among high school students. The Journal of Higher Education, 91(4), 620–650.

Final Project

Emmanuel Akinyele

2024-05-08

Introduction

Problem Statement

Data and Variables

Counselor variables

Administrator variables

Covariates

Sample size, description, excluded participants

Results

Multivariate Statistical Method

Preparation for LCA (Select Variables)

Brief Interpretation

Brief Interpretation

Adding student gender as a covariates

LCA with covariates - student gender, race, locale and region

Brief Interpretation

Limitations and Next Step Analysis

Concluding Remarks

Reference