class: center, middle, inverse, title-slide

.title[
# Advanced quantitative data analysis
]
.subtitle[
## Fixed effect II
]
.author[
### Mengni Chen
]
.institute[
### Department of Sociology, University of Copenhagen
]

---

<style type="text/css">
.remark-slide-content {
  font-size: 20px;
  padding: 20px 80px 20px 80px;
}
.remark-code, .remark-inline-code {
  background: #f0f0f0;
}
.remark-code {
  font-size: 14px;
}
</style>

#Let's get ready

```r
#install.packages("lmtest")
library(tidyverse) # Load the tidyverse packages.
library(haven) # Handle labelled data.
library(janitor) # Tabulations
library(texreg) # Output results
library(splitstackshape) # Transform wide data (with stacked variables) to long data
library(plm) # Linear models for panel data
```

---
#Does partnership make you happier?

- [Prepare the data](https://rpubs.com/fancycmn/1245850)

---
#Does partnership make you happier?

- Fixed effect

```r
sixwaves_long1 <- pdata.frame(sixwaves_long1, index=c("id", "wave"))
fixed1 <- plm(sat ~ partner, data=sixwaves_long1, model="within")
summary(fixed1)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ partner, data = sixwaves_long1, model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 10370
##
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max.
## -7.83333 -0.50000  0.00000  0.56371  5.66667
##
## Coefficients:
##            Estimate Std. Error t-value  Pr(>|t|)
## partnerYes 0.308867   0.037296  8.2815 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    12535
## Residual Sum of Squares: 12426
## R-Squared:      0.0087404
## Adj. R-Squared: -0.32147
## F-statistic: 68.5825 on 1 and 7778 DF, p-value: < 2.22e-16
```

---
#But this ignores the sequence and repetition of events

A within-person change from single to partnered is associated with an increase of about 0.31 scale points in life satisfaction.
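The single coefficient pools every change in partnership status a person goes through. To check how many such changes each person contributes, one could count status switches per person. A minimal sketch on made-up data (the `toy` tibble, its values, and the `n_changes` column are hypothetical and not part of the course data):

```r
library(dplyr)

# Hypothetical toy panel: person 1 partners once and stays partnered;
# person 2 partners, breaks up, and re-partners.
toy <- tibble(
  id      = c(1, 1, 1, 1, 2, 2, 2, 2),
  wave    = c(1, 2, 3, 4, 1, 2, 3, 4),
  partner = c("No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes")
)

# Count how often the status differs from the previous wave, within person.
# The first wave has no previous value (NA), so it is dropped by na.rm = TRUE.
changes <- toy %>%
  group_by(id) %>%
  arrange(wave, .by_group = TRUE) %>%
  summarise(n_changes = sum(partner != dplyr::lag(partner), na.rm = TRUE))

changes # person 1: 1 change; person 2: 3 changes
```

Anyone with more than one change feeds formations, dissolutions, or repeated formations into the same estimate.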
**Such an estimate ignores the sequence and repetition of events!**

<img src="https://github.com/fancycmn/slide10/blob/main/S10_Pic2.JPG?raw=true" width="50%" style="display: block; margin: 5px ;">

---
#Look at the setup of the fixed effect

`$$Sat_{i,t}-\overline{Sat}_{i}= \beta_{1}(partner_{i,t}-\overline{partner}_{i}) + (\epsilon_{i,t} -\overline{\epsilon}_{i})$$`

```r
sixwaves_long1 <- sixwaves_long1 %>% 
  group_by(id) %>% 
  mutate(
    partner1=as.numeric(partner), #factor codes: "No" becomes 1, "Yes" becomes 2
    pt_mean=mean(partner1), #generate the mean of partner1 by id
    pt_demean=partner1-pt_mean, #de-mean partner
    sat_mean=mean(sat), #generate the mean of sat by id
    sat_demean=sat-sat_mean #de-mean sat
  ) # within-person de-meaning
```

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure1.JPG?raw=true" width="85%" style="display: block; margin: 5px ;">

---
#Look at the setup of the fixed effect

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure2.JPG?raw=true" width="85%" style="display: block; margin: 5px ;">

---
#What is the problem with the estimation?

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure4.JPG?raw=true" width="70%" style="display: block; margin: 5px ;">

- Current setup:
  - Initial status is defined ("no partner")
  - Partner effect modelled by a simple dummy (yes vs no)
- Problem:
  - Subsequent order is not defined: e.g. breaking up, finding another partner
- Consequences:
  - Sequence of cause and effect unclear
  - The estimate mixes the effects of union formation and union dissolution
  - The estimate includes repeated events (multiple formations and dissolutions)

---
#Modelling time and temporal order

- How to define the sample?
  - Include only those who are at risk of experiencing the event. In this case, we only include those who are initially single.
- How to define the event?
  - Reverse an event
  - Repeat an event
  - Decide whether to remove or keep reverse and repeated transitions
    - Shall we count a breakup as a reverse of the event (back to single)?
    - Shall we count repartnering as a repeated transition?
  - Anchor the event in time

---
#How to define the event?

Whether to remove or keep reverse and repeated transitions

- Option 1
  - Keep reverse/repeated transitions, if the interest is in finding a partner
- Option 2
  - Remove reverse/repeated transitions, if the interest is in finding and keeping a partner
- Examples
  - In the partnership case, finding and keeping a partner is of interest.
    - Drop observations once the person breaks up or repartners
  - In the first childbearing case, having and raising a first child is of interest.
    - Drop observations once the person has a second child
  - In the breakup case, breaking up and remaining separated is of interest.
    - Drop observations once the person enters a new relationship
  - In a new case, such as experiencing unemployment, how would you handle it?

---
#How to define the event?
Remove reverse and repeated transitions

```r
sixwaves_long1 <- sixwaves_long1 %>% 
  group_by(id) %>% 
  mutate(
    wave=as.numeric(wave), #once sixwaves_long1 is defined as a panel structure, wave becomes a factor; so convert it back to numeric
    getpartner=case_when(
      partner!=dplyr::lag(partner, 1) & partner=="Yes" & dplyr::lag(partner, 1)=="No" ~ 1,
      TRUE ~ 0), #identify the event of getting a partner
    breakpartner=case_when(
      partner!=dplyr::lag(partner, 1) & partner=="No" & dplyr::lag(partner, 1)=="Yes" ~ 1,
      TRUE ~ 0), #identify the event of breaking up
    times_partner=cumsum(getpartner), #cumulative number of times a person has got a partner
    times_departer=cumsum(breakpartner) #cumulative number of times a person has broken up
  )
```

---

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure5.JPG?raw=true" width="120%" style="display: block; margin: 5px ;">

```r
sixwaves_long2 <- sixwaves_long1 %>% 
  filter(times_departer==0) #drop all observations from an individual's first breakup onwards
```

---
#How to define the event?

Remove reverse and repeated transitions

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure6.JPG?raw=true" width="120%" style="display: block; margin: 5px ;">

---
#How to define the event?

Remove reverse and repeated transitions

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure7.JPG?raw=true" width="120%" style="display: block; margin: 5px ;">

---
#How to define the event?
Remove reverse and repeated transitions

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure12.JPG?raw=true" width="70%" style="display: block; margin: 5px ;">

---
#Now run the fixed effect

.pull-left[

```r
sixwaves_long2 <- pdata.frame(sixwaves_long2, index=c("id", "wave"))
fixed2 <- plm(sat ~ partner, data=sixwaves_long2, model="within")
summary(fixed2)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ partner, data = sixwaves_long2, model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 9409
##
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -7.8333 -0.5000  0.0000  0.5172  5.6667
##
## Coefficients:
##            Estimate Std. Error t-value  Pr(>|t|)
## partnerYes 0.292990   0.040214  7.2858 3.557e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    10824
## Residual Sum of Squares: 10740
## R-Squared:      0.0077267
## Adj. R-Squared: -0.36942
## F-statistic: 53.0832 on 1 and 6817 DF, p-value: 3.557e-13
```
]

.pull-right[
Interpretation:
- Now we only look at one transition: from single to partnered.
- When a person's partnership status changes from single to partnered, life satisfaction increases by 0.29 on average.
]

---
#Does the effect of the event vary across time?

Anchoring the event in time

- For example, events such as childbirth, divorce, or marriage
  - Anticipation effects, such as pregnancy, the struggle period before divorce, or the marriage proposal
  - After the event happens, the effect will change over time
    - Experiencing family-work conflict after childbirth
    - Experiencing relief after the divorce, or loneliness, or economic struggle
    - Experiencing a honeymoon period and then getting used to normal life

---
#Does the effect of the event vary across time?
Anchoring the event in time

With panel data we can investigate the time path of a causal effect

- Termed "impact function" (IF)
- The major types of impact function are shown below

<img src="https://github.com/fancycmn/slide10/blob/main/S10_Pic10.JPG?raw=true" width="50%" style="display: block; margin: 5px ;">

---
#Test step impact

- Step impact

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure9.JPG?raw=true" width="70%" style="display: block; margin: 5px ;">

---
#Test step impact

- Step impact

```r
fixed2 <- plm(sat ~ partner, data=sixwaves_long2, model="within")
summary(fixed2)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ partner, data = sixwaves_long2, model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 9409
##
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -7.8333 -0.5000  0.0000  0.5172  5.6667
##
## Coefficients:
##            Estimate Std. Error t-value  Pr(>|t|)
## partnerYes 0.292990   0.040214  7.2858 3.557e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    10824
## Residual Sum of Squares: 10740
## R-Squared:      0.0077267
## Adj. R-Squared: -0.36942
## F-statistic: 53.0832 on 1 and 6817 DF, p-value: 3.557e-13
```

---
#Test linear and quadratic impact

- Linear or quadratic impact

```r
#sixwaves_long2 is the dataset that drops observations once individuals experience a breakup or repartnering.
sixwaves_long3 <- sixwaves_long2 %>% 
  group_by(id) %>% 
  mutate(
    wave=as.numeric(wave), #once sixwaves_long2 is defined as a panel structure, wave becomes a factor; so convert it back to numeric
    partnerwave=case_when(getpartner==1 ~ wave,
                          TRUE ~ 99), #identify the wave at which the person gets a partner; for all other rows, set it to 99.
    anchorwave=min(partnerwave), #anchor the time of the event (stays 99 for those who never get a partner)
    timeindex=wave - anchorwave, #index reflecting the time since the event happened
    duration=case_when(timeindex<0 ~ 0, #any wave before getting a partner is set to zero
                       timeindex>=0 ~ timeindex), #linear setup
    duration2=duration^2 #quadratic setup, the square of duration
  )
```

---
#Test linear and quadratic impact

- Linear or quadratic impact

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure10.JPG?raw=true" width="120%" style="display: block; margin: 5px ;">

---
#Test linear and quadratic impact

- Linear impact

```r
sixwaves_long3 <- pdata.frame(sixwaves_long3, index=c("id", "wave"))
fixed_linear <- plm(sat ~ partner + duration, data=sixwaves_long3, model="within") #linear impact
summary(fixed_linear)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ partner + duration, data = sixwaves_long3,
##     model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 9409
##
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -7.8333 -0.5000  0.0000  0.5000  5.6667
##
## Coefficients:
##             Estimate Std. Error t-value  Pr(>|t|)
## partnerYes  0.337858   0.043950  7.6873 1.714e-14 ***
## duration   -0.063071   0.024978 -2.5251   0.01159 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    10824
## Residual Sum of Squares: 10730
## R-Squared:      0.0086541
## Adj. R-Squared: -0.36834
## F-statistic: 29.7506 on 2 and 6816 DF, p-value: 1.3663e-13
```

---
#Test linear and quadratic impact

- Quadratic impact

```r
fixed_quadratic <- plm(sat ~ partner + duration + duration2, data=sixwaves_long3, model="within") #quadratic impact
summary(fixed_quadratic)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ partner + duration + duration2, data = sixwaves_long3,
##     model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 9409
##
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -7.8333 -0.5000  0.0000  0.5000  5.6667
##
## Coefficients:
##              Estimate Std. Error t-value  Pr(>|t|)
## partnerYes  0.3434487  0.0463117  7.4160 1.353e-13 ***
## duration   -0.0868023  0.0667851 -1.2997    0.1937
## duration2   0.0075053  0.0195886  0.3831    0.7016
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    10824
## Residual Sum of Squares: 10730
## R-Squared:      0.0086754
## Adj. R-Squared: -0.36851
## F-statistic: 19.8802 on 3 and 6815 DF, p-value: 7.9532e-13
```

---
#Test dummy impact

.pull-left[

```r
tabyl(sixwaves_long3, timeindex)
```

```
##  timeindex    n     percent
##        -98 1470 0.156233394
##        -97  907 0.096397067
##        -96  725 0.077053885
##        -95  601 0.063875013
##        -94  503 0.053459454
##        -93  447 0.047507705
##         -5   64 0.006801998
##         -4  203 0.021575088
##         -3  378 0.040174301
##         -2  646 0.068657668
##         -1 1048 0.111382719
##          0 1121 0.119141248
##          1  597 0.063449888
##          2  367 0.039005208
##          3  225 0.023913275
##          4  107 0.011372091
```
]

.pull-right[
- Dummy impact

```r
sixwaves_long3 <- sixwaves_long3 %>% 
  group_by(id) %>% 
  mutate(
    dummy=case_when(
      timeindex %in% c(-98:-93) ~ "2+ year before",
      timeindex %in% c(-2:-5) ~ "2+ year before",
      timeindex==-1 ~ "1 year before",
      timeindex==0 ~ "year of formation",
      timeindex==1 ~ "1 year after",
      timeindex>1 ~ "2+ year after",
    ) %>% as_factor()
  ) #setup for dummy impact
```
]

---
#Test dummy impact

- Dummy impact

<img src="https://github.com/fancycmn/2024-Session11/blob/main/Figure11.JPG?raw=true" width="120%" style="display: block; margin: 5px ;">

---
#Test dummy impact

- Dummy impact

```r
sixwaves_long3 <- pdata.frame(sixwaves_long3, index=c("id", "wave"))
fixed_dummy <- plm(sat ~ dummy, data=sixwaves_long3, model="within")
summary(fixed_dummy)
```

```
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = sat ~ dummy, data = sixwaves_long3, model = "within")
##
## Unbalanced Panel: n = 2591, T = 1-6, N = 9409
##
## Residuals:
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -7.8333 -0.5000  0.0000  0.5000  5.6667
##
## Coefficients:
##                         Estimate Std. Error t-value  Pr(>|t|)
## dummy1 year before     -0.004647   0.057680 -0.0806  0.935790
## dummyyear of formation  0.352088   0.056013  6.2859 3.462e-10 ***
## dummy1 year after       0.217217   0.069360  3.1317  0.001745 **
## dummy2+ year after      0.196011   0.073328  2.6731  0.007534 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares:    10824
## Residual Sum of Squares: 10730
## R-Squared:      0.0086872
## Adj. R-Squared: -0.36869
## F-statistic: 14.9284 on 4 and 6814 DF, p-value: 3.7633e-12
```

---
#Compare all kinds of impacts

Compare the step, linear, quadratic, and dummy impacts

```r
texreg::screenreg(list(fixed2, fixed_linear, fixed_quadratic, fixed_dummy), 
                  custom.model.names=c("step impact", "linear impact", "quadratic impact", "dummyimpact"),
                  include.ci = FALSE, 
                  single.row = TRUE)
```

```
##
## ======================================================================================================
##                         step impact         linear impact       quadratic impact    dummyimpact
## ------------------------------------------------------------------------------------------------------
## partnerYes                 0.29 (0.04) ***     0.34 (0.04) ***     0.34 (0.05) ***
## duration                                      -0.06 (0.02) *      -0.09 (0.07)
## duration2                                                          0.01 (0.02)
## dummy1 year before                                                                    -0.00 (0.06)
## dummyyear of formation                                                                 0.35 (0.06) ***
## dummy1 year after                                                                      0.22 (0.07) **
## dummy2+ year after                                                                     0.20 (0.07) **
## ------------------------------------------------------------------------------------------------------
## R^2                        0.01                0.01                0.01                0.01
## Adj. R^2                  -0.37               -0.37               -0.37               -0.37
## Num. obs.               9409                9409                9409                9409
## ======================================================================================================
## *** p < 0.001; ** p < 0.01; * p < 0.05
```

---
#Compare all kinds of impacts: interpretation

- Step impact: When a person's partnership status changes from single to partnered, the life satisfaction increases by 0.29 on average.
  This effect is assumed to remain constant over time.
- Linear impact: in the year a person forms a partnership, life satisfaction (LS) increases significantly by 0.34, and then declines by 0.06 per year.
- Quadratic impact: the results do not show a significant non-linear (quadratic) effect of forming a partnership.
- Dummy impact: compared to two or more years before the partnership forms, LS does not change significantly one year before the partnership, indicating no anticipation effect; it increases by 0.35 in the year of partnership formation; LS is lower in later years but remains significantly higher than the level observed two or more years before the partnership.

---
#Take home

- Report robust standard errors, using `coeftest()`
- Define your sample: keep only those who are at risk of experiencing the event
- Define the event: keep or remove reverse and repeated transitions
- Estimate the temporal impact of the event
  - Step impact
  - Linear or quadratic impact
  - Dummy impact

---
class: center, middle

#[Exercise](https://rpubs.com/fancycmn/1245902)
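
---
#Robust standard errors: a sketch

The take-home list mentions `coeftest()`. A minimal sketch of how it could be combined with a fixed effect model, assuming the `lmtest` package is installed. The data here are simulated (`toy` and `fe_toy` are hypothetical names), and clustering by individual with the Arellano estimator is one common choice, not necessarily the one intended in this course.

```r
library(plm)    # panel models and vcovHC() for plm objects
library(lmtest) # coeftest()

# Hypothetical toy panel: 50 people observed in 4 waves each
set.seed(123)
toy <- data.frame(id = rep(1:50, each = 4), wave = rep(1:4, times = 50))
toy$partner <- rbinom(nrow(toy), 1, 0.5)             # made-up partnership status (0/1)
toy$sat <- 7 + 0.3 * toy$partner +                   # made-up partnership effect
  rep(rnorm(50), each = 4) +                         # person-specific level
  rnorm(nrow(toy))                                   # idiosyncratic noise

toy <- pdata.frame(toy, index = c("id", "wave"))
fe_toy <- plm(sat ~ partner, data = toy, model = "within")

# Standard errors clustered by individual ("group")
robust <- coeftest(fe_toy,
                   vcov. = vcovHC(fe_toy, method = "arellano",
                                  type = "HC1", cluster = "group"))
robust
```

The same call pattern would apply to `fixed2` or the other models above by swapping in the fitted object.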