library(psych)
library(MPsychoR)
library(psychotools)
library(readr)
library(dplyr)
library(tidyverse)
library(corrplot)
library(GPArotation)
library(ggplot2)
library(kableExtra)

Introduction

Throughout the 20th century, psychologists showed an ongoing interest in the study of individual differences in humor (Martin, 1998). Since the early 1980s, much of this research has focused on potential beneficial effects of humor on physical and psychosocial health and well-being.

“Sense of humor” refers to humor as a stable personality trait or individual difference variable (see Ruch, 1998, for reviews of personality approaches to humor). Rather than being a single dimension, however, sense of humor is a multi-faceted construct. According to Martin (2003), sense of humor can be conceptualized as:

  1. a cognitive ability (you need to understand jokes, etc.)

  2. an aesthetic response (you need to like certain types of jokes)

  3. a habitual behavior pattern (some people have the habit of laughing often, or of telling many jokes)

  4. an emotion-related temperament trait

  5. an attitude

  6. a coping strategy or defense mechanism

The use of humor is not always related to healthy psychological behavior. Some forms of humor are, while others are not (e.g., sarcasm).

In the past two decades, researchers interested in relations between humor and various aspects of psychosocial and physical health and well-being have made use of a number of self-report measures that focus on some aspects of sense of humor considered to be germane to well-being. Recently, however, some researchers have begun to question the degree to which these measures adequately assess health-relevant dimensions of sense of humor.

In this paper, we present the development and initial validation of a new multidimensional measure, the Humor Styles Questionnaire (HSQ), which assesses four dimensions relating to different uses or functions of humor in everyday life. Two of these dimensions are considered to be conducive to psychosocial well-being, while two are hypothesized to be less benign and potentially even deleterious to well-being.

Presentation of the dataset

The dataset used in this project is composed of 1071 observations and 39 variables. The data come from the study by Martin, R. A., Puhlik-Doris, P., Larsen, G., Gray, J., & Weir, K. (2003), "Individual differences in uses of humor and their relation to psychological well-being: Development of the Humor Styles Questionnaire", Journal of Research in Personality, 37, 48-75. The questionnaire is expected to be useful for assessing forms of humor that may be deleterious to health, since humor can be used in different ways: as a coping or defense mechanism, for example, or as an attitude (a tendency to be always cheerful or to laugh frequently). It is also related to the Five Factor Model and may predict people's mood and self-esteem.

The study focuses on the more or less healthy uses of humor. The authors suggest that four main types of humor can be identified: two are considered good for mental health, while the other two are hypothesized to be deleterious to well-being. These dimensions are:

  1. affiliative

  2. self-enhancing

  3. aggressive

  4. self-defeating

In theory the four dimensions should be well separated from each other, although some overlap, and therefore some correlation between them, is possible.

humor <- read.csv("data.csv")
#View(humor)

The variables are statements rated on a five-point scale where:

The exact questions used are:

According to the authors, each question is related to the four dimensions as follows:

affiliative: \[Q_1+Q_5+Q_9+Q_{13}+Q_{17}+Q_{21}+Q_{25}+Q_{29}\]

self-enhancing: \[Q_2+Q_6+Q_{10}+Q_{14}+Q_{18}+Q_{22}+Q_{26}+Q_{30}\]

aggressive: \[Q_3+Q_7+Q_{11}+Q_{15}+Q_{19}+Q_{23}+Q_{27}+Q_{31}\]

self-defeating: \[Q_4+Q_8+Q_{12}+Q_{16}+Q_{20}+Q_{24}+Q_{28}+Q_{32}\]
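The item-to-dimension mapping above can be sketched in a few lines of base R. This is only an illustration on made-up answers: the `toy` data frame, the `items_for` helper and the `keys` list are hypothetical names, and the sums are plain sums of the raw answers, so reverse-keyed items would need recoding before such scores are interpreted.

```r
# Hypothetical toy data: 5 respondents answering Q1..Q32 on a 1-5 scale.
set.seed(1)
toy <- as.data.frame(matrix(sample(1:5, 5 * 32, replace = TRUE), nrow = 5))
names(toy) <- paste0("Q", 1:32)

# Each dimension takes every fourth item, starting at Q1, Q2, Q3 and Q4.
items_for <- function(start) paste0("Q", seq(start, 32, by = 4))
keys <- list(
  affiliative    = items_for(1),
  self_enhancing = items_for(2),
  aggressive     = items_for(3),
  self_defeating = items_for(4)
)

# Raw sum score per respondent and dimension (no reverse keying applied).
raw_scores <- sapply(keys, function(items) rowSums(toy[, items]))
raw_scores
```

Each dimension has 8 items, so every raw score lies between 8 and 40.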

Description of the population

We also have some information about the respondents, namely age, gender and accuracy:

chr <- humor %>%
  select(age, gender, accuracy)

psych::describe(chr)
##          vars    n  mean      sd median trimmed   mad min   max range  skew
## age         1 1071 70.97 1371.99     23   24.56  7.41  14 44849 44835 32.47
## gender      2 1071  1.46    0.52      1    1.44  0.00   0     3     3  0.24
## accuracy    3 1071 87.54   12.04     90   89.16 11.86   2   100    98 -2.24
##          kurtosis    se
## age       1056.43 41.92
## gender      -1.30  0.02
## accuracy    10.01  0.37

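The age variable contains implausible values (the maximum is 44849) and gender has codes other than 1 and 2, so I decided to keep only respondents with gender coded 1 or 2, an age between 15 and 70, and an accuracy of at least 90:

chr2 <- humor %>%
  filter(gender %in% c(1, 2)) %>%    # keep only gender codes 1 and 2
  filter(age > 14 & age < 71) %>%    # ages 15 to 70
  filter(accuracy >= 90)             # accuracy of at least 90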

chr3 <- chr2 %>%
  select(age, gender, accuracy)

Data Frame Summary: chr3

Dimensions: 608 x 3
Duplicates: 354

Variable            Stats / Values                   Freqs (% of Valid)   Missing
age [integer]       Mean (sd): 27.2 (11.7)           51 distinct values   0 (0.0%)
                    min ≤ med ≤ max: 15 ≤ 23 ≤ 70
                    IQR (CV): 13 (0.4)
gender [integer]    Min: 1, Mean: 1.5, Max: 2        1: 320 (52.6%)       0 (0.0%)
                                                     2: 288 (47.4%)
accuracy [integer]  Mean (sd): 94.8 (4.3)            11 distinct values   0 (0.0%)
                    min ≤ med ≤ max: 90 ≤ 95 ≤ 100
                    IQR (CV): 10 (0)

Generated by summarytools 1.0.1 (R version 4.2.0), 2022-10-28

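We can notice that 608 respondents remain after filtering, that the two gender codes are almost balanced (52.6% coded 1 and 47.4% coded 2), and that the sample is young, with a median age of 23.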

Assumptions

Since we have already filtered the respondents on these characteristics, we can move on to checking the assumptions before doing the EFA:

  1. Check for normality of the items

  2. Check that the correlation matrix is invertible and produce a correlogram

  3. Check for sphericity, that is, whether the correlation matrix differs from an identity matrix

Normality of data

We perform an Anderson-Darling test (ad.test from the nortest package) on each item in order to see whether it follows a Gaussian distribution.

library(nortest)

#?ad.test
shap = lapply(chr2[, 1:32], ad.test)
res = sapply(shap, `[`, c("statistic", "p.value"))
t(res)
##     statistic p.value
## Q1  46.22377  3.7e-24
## Q2  22.82413  3.7e-24
## Q3  18.27435  3.7e-24
## Q4  19.39204  3.7e-24
## Q5  28.21008  3.7e-24
## Q6  52.37541  3.7e-24
## Q7  21.84468  3.7e-24
## Q8  21.756    3.7e-24
## Q9  26.65949  3.7e-24
## Q10 18.66616  3.7e-24
## Q11 23.3029   3.7e-24
## Q12 19.26865  3.7e-24
## Q13 90.21046  3.7e-24
## Q14 19.94465  3.7e-24
## Q15 26.1858   3.7e-24
## Q16 18.65558  3.7e-24
## Q17 50.80109  3.7e-24
## Q18 18.78425  3.7e-24
## Q19 21.473    3.7e-24
## Q20 37.89628  3.7e-24
## Q21 79.23421  3.7e-24
## Q22 17.74357  3.7e-24
## Q23 18.29878  3.7e-24
## Q24 23.23633  3.7e-24
## Q25 82.06884  3.7e-24
## Q26 25.10195  3.7e-24
## Q27 36.45982  3.7e-24
## Q28 20.33028  3.7e-24
## Q29 29.88143  3.7e-24
## Q30 40.81459  3.7e-24
## Q31 20.10447  3.7e-24
## Q32 19.29943  3.7e-24

All p-values are far below 0.05, so normality is rejected for every item.

Correlation

Since the determinant is not equal to zero, we can consider the correlation matrix invertible:

r = cor(chr2[, 1:32])
det(r)
## [1] 2.283079e-05
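To make the link between the determinant and invertibility concrete, here is a small base R sketch on simulated data (the matrices `x`, `x_bad` and `r_toy` are made up for the illustration and are not part of the analysis). It shows a correlation matrix that can be inverted with `solve()`, and how adding a variable that is an exact linear combination of others drives the determinant to zero.

```r
set.seed(2)
x <- matrix(rnorm(200 * 4), ncol = 4)
x[, 4] <- x[, 1] + rnorm(200, sd = 0.5)   # correlated with column 1, but not redundant

r_toy <- cor(x)
det_ok <- det(r_toy)                      # clearly different from zero
r_inv <- solve(r_toy)                     # so the inverse exists
inv_error <- max(abs(r_toy %*% r_inv - diag(4)))   # numerically zero

# Adding an exact linear combination of two columns makes the matrix singular.
x_bad <- cbind(x, x[, 1] + x[, 2])
det_bad <- det(cor(x_bad))                # practically zero

c(det_ok = det_ok, inv_error = inv_error, det_bad = det_bad)
```

A determinant that is small but clearly non-zero, as in our data, signals strong correlations among the items without exact redundancy.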

Let's plot the correlation matrix:

corrplot(r, method="color",
         type="upper", order="hclust", 
         addCoef.col = "black",
         tl.col="black", tl.srt=45, sig.level = 0.01, 
         insig = "blank",
         tl.cex = 0.5,
         diag=FALSE,
         number.cex = 0.5
         )

We can see mainly three groups: two groups of positively correlated items and, in the upper part, one group of negatively correlated items. We would expect four groups, given the four factors mentioned in the theory; the fourth is probably connected to another group, which makes it difficult to identify.

Check for sphericity

Bartlett's test of sphericity assumes normally distributed data, which we do not have, so we use the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy instead:

KMO(r)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = r)
## Overall MSA =  0.86
## MSA for each item = 
##   Q1   Q2   Q3   Q4   Q5   Q6   Q7   Q8   Q9  Q10  Q11  Q12  Q13  Q14  Q15  Q16 
## 0.93 0.91 0.91 0.88 0.88 0.82 0.82 0.84 0.93 0.86 0.83 0.88 0.87 0.91 0.83 0.84 
##  Q17  Q18  Q19  Q20  Q21  Q22  Q23  Q24  Q25  Q26  Q27  Q28  Q29  Q30  Q31  Q32 
## 0.88 0.84 0.90 0.82 0.84 0.85 0.81 0.83 0.81 0.88 0.84 0.90 0.86 0.78 0.82 0.89

The KMO index measures the sampling adequacy of each item of our data matrix: the closer the value is to 1, the better suited the data are to factor analysis. Since the overall MSA is 0.86 and every item is at or above 0.78, a factor analysis is appropriate.
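For readers who want to see what the overall MSA is made of, the following base R sketch computes it by hand on simulated data. It compares the squared correlations with the squared partial correlations obtained from the inverse of the correlation matrix. The data and the names `r_toy`, `partial` and `kmo_overall` are assumptions made for the illustration; this is not the internal code of `psych::KMO`.

```r
set.seed(3)
n <- 300
f <- rnorm(n)                                   # one hypothetical common factor
x <- sapply(1:6, function(j) 0.7 * f + rnorm(n, sd = 0.7))
r_toy <- cor(x)

r_inv <- solve(r_toy)
# Partial correlation of each pair of variables, controlling for all the others.
partial <- -r_inv / sqrt(outer(diag(r_inv), diag(r_inv)))
off <- row(r_toy) != col(r_toy)                 # mask for the off-diagonal cells

# Overall KMO: share of squared correlations over correlations plus partials.
kmo_overall <- sum(r_toy[off]^2) /
  (sum(r_toy[off]^2) + sum(partial[off]^2))
kmo_overall
```

When the items share common factors, the partial correlations are small compared with the raw correlations and the index approaches 1.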

EXPLORATORY FACTOR ANALYSIS

How many factors?

fa.parallel(chr2[, 1:32], fm = "pa")

## Parallel analysis suggests that the number of factors =  5  and the number of components =  5

Parallel analysis suggests 5 factors, but looking at the eigenvalues greater than 1 we can notice 4 factors. Since 4 is also the number expected from the theory, that is the one I am going to choose.
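The two criteria mentioned here can be sketched in base R on simulated data with a known two-factor structure. This is a simplified, component-based version meant only to show the logic; `fa.parallel` also works on factor eigenvalues and uses its own resampling settings. All object names below are hypothetical.

```r
set.seed(4)
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
# Eight simulated items: four driven by f1 and four driven by f2.
x <- cbind(
  sapply(1:4, function(j) 0.8 * f1 + rnorm(n, sd = 0.6)),
  sapply(1:4, function(j) 0.8 * f2 + rnorm(n, sd = 0.6))
)
ev <- eigen(cor(x))$values

# Kaiser criterion: count the eigenvalues greater than 1.
n_kaiser <- sum(ev > 1)

# Crude parallel analysis: keep the leading eigenvalues that exceed the
# 95th percentile of eigenvalues obtained from uncorrelated random data.
ev_random <- replicate(100, eigen(cor(matrix(rnorm(n * ncol(x)), n)))$values)
ev_threshold <- apply(ev_random, 1, quantile, probs = 0.95)
n_parallel <- which(ev <= ev_threshold)[1] - 1

c(kaiser = n_kaiser, parallel = n_parallel)
```

On clean simulated data the two criteria agree; on real questionnaire data they can differ, as they do here, and theory is then a reasonable tie-breaker.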

Which method?

We use principal axis factoring (fm = 'pa'), in this case without rotation.

res <- fa(chr2[, 1:32], fm = 'pa', nfactors = 4, rotate = 'none')
## see the loadings
#res$loadings
print.psych(res, sort = TRUE)
## Factor Analysis using method =  pa
## Call: fa(r = chr2[, 1:32], nfactors = 4, rotate = "none", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item   PA1   PA2   PA3   PA4   h2   u2 com
## Q10   10  0.55 -0.23  0.22  0.37 0.54 0.46 2.5
## Q17   17 -0.53  0.18  0.21  0.37 0.50 0.50 2.4
## Q14   14  0.52 -0.35  0.15  0.23 0.47 0.53 2.4
## Q26   26  0.52 -0.29  0.20  0.27 0.47 0.53 2.5
## Q1     1 -0.51  0.20  0.16  0.30 0.42 0.58 2.2
## Q5     5  0.51 -0.24 -0.14 -0.27 0.41 0.59 2.2
## Q19   19  0.51  0.17 -0.21  0.15 0.35 0.65 1.8
## Q18   18  0.50 -0.25  0.26  0.44 0.57 0.43 3.0
## Q2     2  0.49 -0.29  0.13  0.25 0.40 0.60 2.4
## Q13   13  0.48 -0.26 -0.10 -0.20 0.35 0.65 2.0
## Q21   21  0.46 -0.24  0.06 -0.26 0.34 0.66 2.2
## Q25   25 -0.46  0.29  0.16  0.31 0.42 0.58 2.8
## Q32   32  0.46  0.32  0.40 -0.14 0.50 0.50 3.0
## Q3     3  0.46  0.29 -0.30  0.13 0.40 0.60 2.7
## Q8     8  0.45  0.39  0.39 -0.16 0.53 0.47 3.2
## Q28   28  0.44  0.14  0.09  0.09 0.23 0.77 1.4
## Q6     6  0.44 -0.26  0.01  0.13 0.28 0.72 1.9
## Q9     9 -0.42  0.14  0.04  0.25 0.26 0.74 1.9
## Q12   12  0.39  0.35  0.33 -0.22 0.43 0.57 3.6
## Q22   22 -0.38  0.15  0.04 -0.05 0.17 0.83 1.4
## Q4     4  0.36  0.33  0.32 -0.09 0.34 0.66 3.1
## Q20   20  0.39  0.49  0.39 -0.10 0.56 0.44 2.9
## Q16   16 -0.33 -0.39 -0.16  0.17 0.31 0.69 2.7
## Q7     7 -0.19 -0.36  0.35 -0.21 0.32 0.68 3.1
## Q27   27  0.31  0.35 -0.28  0.20 0.33 0.67 3.6
## Q31   31 -0.31 -0.30  0.54 -0.09 0.48 0.52 2.3
## Q15   15 -0.36 -0.39  0.42 -0.11 0.46 0.54 3.1
## Q24   24  0.16  0.31  0.37  0.04 0.27 0.73 2.4
## Q29   29 -0.35  0.19  0.37  0.24 0.35 0.65 3.3
## Q23   23 -0.34 -0.21  0.37 -0.07 0.30 0.70 2.7
## Q11   11  0.24  0.29 -0.31  0.28 0.32 0.68 3.9
## Q30   30  0.24 -0.23  0.12  0.30 0.22 0.78 3.2
## 
##                        PA1  PA2  PA3  PA4
## SS loadings           5.67 2.67 2.36 1.63
## Proportion Var        0.18 0.08 0.07 0.05
## Cumulative Var        0.18 0.26 0.33 0.39
## Proportion Explained  0.46 0.22 0.19 0.13
## Cumulative Proportion 0.46 0.68 0.87 1.00
## 
## Mean item complexity =  2.6
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  496  and the objective function was  10.69 with Chi Square of  6364.35
## The degrees of freedom for the model are 374  and the objective function was  1.77 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  608 with the empirical chi square  913.95  with prob <  4.2e-47 
## The total number of observations was  608  with Likelihood Chi Square =  1049.32  with prob <  2.2e-65 
## 
## Tucker Lewis Index of factoring reliability =  0.847
## RMSEA index =  0.054  and the 90 % confidence intervals are  0.051 0.058
## BIC =  -1348.09
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    PA1  PA2  PA3  PA4
## Correlation of (regression) scores with factors   0.95 0.91 0.90 0.86
## Multiple R square of scores with factors          0.91 0.82 0.81 0.74
## Minimum correlation of possible factor scores     0.82 0.65 0.61 0.49
fa.diagram(res)

With oblique rotation

Since the authors of the original paper allow for possible correlations between factors, we now use promax, an oblique rotation that does not force the factors to be orthogonal.

res2<-fa(chr2[, 1:32],fm='pa',nfactors = 4, rotate = 'promax')
print.psych(res2,sort = T)
## Factor Analysis using method =  pa
## Call: fa(r = chr2[, 1:32], nfactors = 4, rotate = "promax", fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item   PA1   PA4   PA2   PA3   h2   u2 com
## Q17   17  0.73  0.11 -0.04  0.03 0.50 0.50 1.1
## Q25   25  0.67  0.01  0.05 -0.05 0.42 0.58 1.0
## Q1     1  0.65  0.03 -0.02  0.01 0.42 0.58 1.0
## Q5     5 -0.63  0.03  0.01  0.01 0.41 0.59 1.0
## Q29   29  0.59  0.11  0.18  0.15 0.35 0.65 1.4
## Q13   13 -0.55  0.11 -0.02  0.03 0.35 0.65 1.1
## Q21   21 -0.53  0.10  0.12  0.18 0.34 0.66 1.4
## Q9     9  0.50  0.00 -0.10 -0.04 0.26 0.74 1.1
## Q18   18  0.16  0.80  0.03  0.00 0.57 0.43 1.1
## Q10   10  0.06  0.74  0.06 -0.02 0.54 0.46 1.0
## Q26   26 -0.05  0.66  0.02  0.05 0.47 0.53 1.0
## Q14   14 -0.13  0.64 -0.04  0.07 0.47 0.53 1.1
## Q2     2 -0.08  0.61 -0.03  0.01 0.40 0.60 1.0
## Q30   30  0.11  0.52 -0.09  0.01 0.22 0.78 1.2
## Q6     6 -0.19  0.43 -0.07 -0.02 0.28 0.72 1.5
## Q22   22  0.21 -0.26  0.02  0.07 0.17 0.83 2.1
## Q20   20  0.07 -0.05  0.76 -0.04 0.56 0.44 1.0
## Q8     8 -0.06 -0.02  0.73  0.02 0.53 0.47 1.0
## Q32   32 -0.07  0.04  0.69  0.06 0.50 0.50 1.0
## Q12   12 -0.12 -0.10  0.66  0.05 0.43 0.57 1.1
## Q4     4  0.00  0.01  0.58  0.00 0.34 0.66 1.0
## Q16   16  0.09  0.16 -0.53  0.11 0.31 0.69 1.3
## Q24   24  0.22  0.08  0.51  0.04 0.27 0.73 1.4
## Q28   28 -0.03  0.22  0.27 -0.17 0.23 0.77 2.7
## Q31   31  0.11  0.14  0.12  0.69 0.48 0.52 1.2
## Q15   15  0.04  0.10 -0.04  0.67 0.46 0.54 1.1
## Q7     7 -0.14  0.04  0.02  0.60 0.32 0.68 1.1
## Q11   11  0.17  0.09 -0.05 -0.59 0.32 0.68 1.2
## Q27   27  0.09  0.03  0.06 -0.57 0.33 0.67 1.1
## Q3     3 -0.07  0.04  0.08 -0.57 0.40 0.60 1.1
## Q23   23  0.13  0.04  0.03  0.51 0.30 0.70 1.2
## Q19   19 -0.09  0.18  0.08 -0.46 0.35 0.65 1.5
## 
##                        PA1  PA4  PA2  PA3
## SS loadings           3.31 3.09 3.07 2.85
## Proportion Var        0.10 0.10 0.10 0.09
## Cumulative Var        0.10 0.20 0.30 0.39
## Proportion Explained  0.27 0.25 0.25 0.23
## Cumulative Proportion 0.27 0.52 0.77 1.00
## 
##  With factor correlations of 
##       PA1   PA4   PA2   PA3
## PA1  1.00 -0.46 -0.21  0.30
## PA4 -0.46  1.00  0.29 -0.20
## PA2 -0.21  0.29  1.00 -0.26
## PA3  0.30 -0.20 -0.26  1.00
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 4 factors are sufficient.
## 
## The degrees of freedom for the null model are  496  and the objective function was  10.69 with Chi Square of  6364.35
## The degrees of freedom for the model are 374  and the objective function was  1.77 
## 
## The root mean square of the residuals (RMSR) is  0.04 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  608 with the empirical chi square  913.95  with prob <  4.2e-47 
## The total number of observations was  608  with Likelihood Chi Square =  1049.32  with prob <  2.2e-65 
## 
## Tucker Lewis Index of factoring reliability =  0.847
## RMSEA index =  0.054  and the 90 % confidence intervals are  0.051 0.058
## BIC =  -1348.09
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    PA1  PA4  PA2  PA3
## Correlation of (regression) scores with factors   0.92 0.93 0.92 0.91
## Multiple R square of scores with factors          0.85 0.86 0.85 0.83
## Minimum correlation of possible factor scores     0.71 0.72 0.71 0.66

Fortunately the mean item complexity is 1.2, which is good since it means that almost every item needs just one factor to be explained. The Cumulative Var row gives the proportion of variability explained by the four factors, like the R^2 of a regression: together they explain 39% of the variability of our data.
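Both quantities can be checked by hand from the printed output. The sketch below recomputes the complexity of one item with Hofmann's index (the index that psych reports in the `com` column) and the variance proportions from the SS loadings. The numbers are copied from the rounded promax output above, so the results are approximate.

```r
# Pattern loadings of Q17 on PA1, PA4, PA2 and PA3, copied from the promax output.
q17 <- c(0.73, 0.11, -0.04, 0.03)

# Hofmann's complexity index: (sum of squared loadings)^2 / sum of fourth powers.
complexity <- sum(q17^2)^2 / sum(q17^4)
round(complexity, 1)

# SS loadings copied from the same output; the analysis uses 32 items.
ss <- c(PA1 = 3.31, PA4 = 3.09, PA2 = 3.07, PA3 = 2.85)
prop_var <- ss / 32          # "Proportion Var" for each factor
cum_var <- sum(ss) / 32      # "Cumulative Var" after the fourth factor
prop_var
cum_var
```

A complexity close to 1 means that a single factor accounts for almost all of the item's loading pattern.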

In addition, a diagram shows more clearly how the items are distributed among the factors:

fa.diagram(res2)

The negative correlation between PA1 (affiliative) and PA4 (self-enhancing) is read here as consistent with the theory: the first uses humor to enhance relationships with other people, while the second uses humor to enhance the self. The sign of a factor is arbitrary, however, so the sign of this correlation depends on which items load positively on each factor.

The items related to each factor are the same as those stated in the theory; the lines drawn in red in the diagram are negative loadings, which we interpret as reversed items. We can identify the factors as follows:

  1. PA1: Affiliative. The related items are Q1, Q5, Q9, Q13, Q17, Q21, Q25 and Q29.

  2. PA2: Self-defeating. The related items are Q4, Q8, Q12, Q16, Q20, Q24 and Q32.

  3. PA3: Aggressive. The related items are Q3, Q7, Q11, Q15, Q19, Q23, Q27 and Q31.

  4. PA4: Self-enhancing. The related items are Q2, Q6, Q10, Q14, Q18, Q26 and Q30.

Items Q22 and Q28 do not load clearly on any factor (their largest loadings are -0.26 and 0.27). That is because they are only slightly correlated with every other item, which can also be seen in the correlogram.

Reliability

At this point our main concern is the reliability of the factors we have just found, so let's compute Cronbach's alpha for each of them.
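As a reminder of what the coefficient measures, here is a minimal base R sketch that computes the raw alpha by hand on simulated items. The data and the helper `reverse_1_to_5` are hypothetical; the function only illustrates the kind of recoding that `check.keys = TRUE` applies to negatively keyed items, and is not how `psych::alpha` is implemented.

```r
set.seed(5)
n <- 200
trait <- rnorm(n)                                     # hypothetical latent trait
items <- sapply(1:8, function(j) trait + rnorm(n))    # 8 simulated items
k <- ncol(items)

# Raw alpha: k / (k - 1) * (1 - sum of item variances / variance of the total).
alpha_raw <- k / (k - 1) *
  (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
alpha_raw

# A negatively keyed item has to be flipped before it enters the total.
# On a 1-5 scale this amounts to 6 - x.
reverse_1_to_5 <- function(x) 6 - x
reverse_1_to_5(c(1, 3, 5))
```

Alpha grows with the number of items and with their average inter-item correlation, which is why each subscale is assessed separately.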

AFFILIATIVE

sel<- c("Q1","Q5","Q9","Q13","Q17","Q21","Q25","Q29")

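alpha1 <- psych::alpha(chr2 %>% select(all_of(sel)), check.keys = TRUE) # check.keys = TRUE automatically reverses negatively keyed items

Overall reliability:

raw_alpha  std.alpha
    0.812      0.819

Reliability if an item is dropped (a trailing "-" marks an automatically reversed item):

      raw_alpha  std.alpha
Q1        0.783      0.792
Q5-       0.782      0.793
Q9        0.805      0.810
Q13-      0.793      0.799
Q17       0.774      0.785
Q21-      0.795      0.803
Q25       0.790      0.795
Q29       0.801      0.808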

SELF-DEFEATING

sel2<- c("Q4","Q8","Q12","Q16","Q20","Q24","Q32")

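alpha2 <- psych::alpha(chr2 %>% select(all_of(sel2)), check.keys = TRUE) # check.keys = TRUE automatically reverses negatively keyed items

Overall reliability:

raw_alpha  std.alpha
    0.819      0.819

Reliability if an item is dropped (a trailing "-" marks an automatically reversed item):

      raw_alpha  std.alpha
Q4        0.799      0.799
Q8        0.779      0.780
Q12       0.794      0.794
Q16-      0.808      0.807
Q20       0.780      0.779
Q24       0.820      0.821
Q32       0.783      0.784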

AGGRESSIVE

sel3<- c("Q3","Q7","Q11","Q15","Q19","Q23","Q27","Q31")

alpha3 <- psych::alpha(chr2 %>% select(all_of(sel3)), check.keys = TRUE) # check.keys = TRUE automatically reverses negatively keyed items

Overall reliability:

raw_alpha  std.alpha
    0.806      0.805

Reliability if an item is dropped (a trailing "-" marks an automatically reversed item):

      raw_alpha  std.alpha
Q3-       0.779      0.778
Q7        0.791      0.791
Q11-      0.790      0.789
Q15       0.772      0.772
Q19-      0.790      0.790
Q23       0.790      0.790
Q27-      0.786      0.785
Q31       0.774      0.773

SELF-ENHANCING

sel4<- c("Q2","Q6","Q10","Q14","Q18","Q26","Q30")

alpha4 <- psych::alpha(chr2 %>% select(all_of(sel4)), check.keys = TRUE) # check.keys = TRUE automatically reverses negatively keyed items

Overall reliability:

raw_alpha  std.alpha
    0.821      0.819

Reliability if an item is dropped:

      raw_alpha  std.alpha
Q2        0.797      0.794
Q6        0.813      0.811
Q10       0.783      0.782
Q14       0.790      0.787
Q18       0.782      0.781
Q26       0.790      0.788
Q30       0.823      0.820

POST ANALYSIS

Differences between males and females

scores<-as.data.frame(res2$scores)

scores <- cbind(scores, sex = chr2$gender, age=chr2$age) %>%
  filter(sex=="1" | sex=="2")

male<-scores %>% 
  filter(sex=="1") 

female<-scores %>% 
  filter(sex=="2") 

Assumptions

In order to decide whether we can use a parametric test, we should check some assumptions.

1) Normality of data

library(kableExtra)

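# Shapiro-Wilk p-value of each factor score, separately for males and females
# (across() replaces the deprecated funs() helper)
pvalues_male <- male %>%
  select(PA1, PA2, PA3, PA4) %>%
  summarise(across(everything(), ~ shapiro.test(.x)$p.value))

pvalues_female <- female %>%
  select(PA1, PA2, PA3, PA4) %>%
  summarise(across(everything(), ~ shapiro.test(.x)$p.value))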


shap_test<-rbind(pvalues_male,pvalues_female)

rownames(shap_test)<-c("MALE","FEMALE")

kable(shap_test, row.names = T, col.names = c("PA1","PA2","PA3","PA4"), digits = 100 ) %>% 
  kable_styling(bootstrap_options = c("bordered","condensed","hover","striped"),font_size = 14)
        PA1           PA2         PA3          PA4
MALE    1.498442e-12  0.02165507  0.030174914  0.004919009
FEMALE  4.816281e-09  0.01591319  0.005216865  0.040084496

All p-values are below 0.05, so none of the factor scores follows a normal distribution in either group. We can therefore go directly to a non-parametric test without further exploration of the assumptions.

WILCOXON RANK SUM TEST

library(rstatix)

PA1<- scores %>% 
  wilcox_test(PA1~sex) %>% 
  add_significance() 

PA2<-scores %>% 
  wilcox_test(PA2~sex) %>% 
  add_significance() 

PA3<-scores %>% 
  wilcox_test(PA3~sex) %>% 
  add_significance() 

PA4<-scores %>% 
  wilcox_test(PA4~sex) %>% 
  add_significance() 

test=rbind(PA1,PA2,PA3,PA4) %>% 
  select(.y.,statistic,p,p.signif)

kable(test,digits = 8) %>% 
   kable_styling(bootstrap_options = c("bordered","condensed","hover","striped"),font_size = 14)
.y.   statistic   p          p.signif
PA1   40849       1.56e-02   *
PA2   52348       3.75e-03   **
PA3   33817       1.00e-08   ****
PA4   46207       9.53e-01   ns

As the table shows, there are statistically significant differences between males and females on all factors except the last one, PA4 (self-enhancing). So there is no evidence of a difference between the sexes in the use of humor to enhance the self.

In the other uses of humor, instead, the difference is always significant, most clearly for PA3 (aggressive).
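A p-value says that a difference exists but not how large it is. The base R sketch below, on simulated scores, shows how the W statistic of the rank sum test is built and how it can be turned into a rank-biserial correlation as a simple effect size. The two groups and all object names are made up for the illustration; the real analysis uses `rstatix::wilcox_test` on the factor scores.

```r
set.seed(6)
n1 <- 40
n2 <- 45
male_scores   <- rnorm(n1, mean = 0)      # simulated scores, group 1
female_scores <- rnorm(n2, mean = 0.6)    # simulated scores, group 2

wt <- wilcox.test(male_scores, female_scores)
w <- unname(wt$statistic)
wt$p.value

# W is the rank sum of the first group minus its smallest possible value.
ranks <- rank(c(male_scores, female_scores))
w_by_hand <- sum(ranks[1:n1]) - n1 * (n1 + 1) / 2

# Rank-biserial correlation: share of pairs where group 1 is higher
# minus share of pairs where group 2 is higher, between -1 and 1.
r_rb <- 2 * w / (n1 * n2) - 1
c(W = w, W_by_hand = w_by_hand, rank_biserial = r_rb)
```

Reporting such an effect size next to each p-value would show which of the significant differences are also large in practical terms.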

Differences in age

age<-scores[,-c(5)]
cor_matrix<-as.matrix(round(cor(age),2))

Let's also plot the correlation matrix:

col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(cor_matrix, method="color", col=col(200),  
         type="upper", 
         addCoef.col = "black", # Add coefficient of correlation
         tl.col="black", tl.srt=45, 
         # hide correlation coefficient on the principal diagonal
         diag=FALSE 
         )

As the plot shows, age has no notable correlation with the factors. However, we should also take into consideration the initial description of the data, where it is possible to see that the great majority of our respondents are between 15 and 30 years old. Since we do not have enough older respondents, the results may be misleading and may not generalize.
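A correlogram shows the size of the correlations but not their significance. One way to back the statement above with a test is `cor.test`, sketched here in base R on simulated data. The `toy` data frame is hypothetical, and Spearman's method is an assumption chosen because the factor scores were found not to be normally distributed.

```r
set.seed(7)
toy <- data.frame(
  age = sample(15:70, 300, replace = TRUE),   # simulated ages
  PA1 = rnorm(300)                            # simulated factor scores
)

# exact = FALSE avoids the warning about ties, which are unavoidable with ages.
ct <- cor.test(toy$age, toy$PA1, method = "spearman", exact = FALSE)
rho <- unname(ct$estimate)
c(rho = rho, p.value = ct$p.value)
```

Applied to the real scores, the same call could be repeated for PA1 to PA4 to check whether any of the small correlations with age differs from zero.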