Question 1 of the exam shows you a study based on Likert-scale ratings. Although the Likert scores are integers, they represent the levels of a factor (item x). A cross-tabulation (contingency table, "table de contingence") then counts the paired responses for every possible combination of the factor levels (ordinal data in this case: "moyen", "faible", ...). Each cell holds the joint count for one level of item1 and one level of item2; together the cells give the joint distribution of the two items. Marginal counts are not provided but are easily calculated.
The problem is how to start: should I construct the two-way table?
The answer depends on the research question, and I acknowledge that in R this is not straightforward, because of the way functions apply to factors. Let us see it in an example:
Several ways are possible and are shown in the code below: a two-way table (item1 crossed with item2) with 4-level factors.
Let's start with a data frame:
Note: table() is a very old function (inherited from S-PLUS) that counts raw observations.
For a data frame that already has a count column, xtabs() works much better (see ?xtabs).
MYTAB=data.frame(expand.grid(item1=c("faible", "moy", "for", "tresfort"),
item2=c("faible", "moy", "for", "tresfort")),
count=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16))
# the counts 1 to 16 are chosen for simplicity of coding
str(MYTAB) # count is numeric (num), not a factor
## 'data.frame': 16 obs. of 3 variables:
## $ item1: Factor w/ 4 levels "faible","moy",..: 1 2 3 4 1 2 3 4 1 2 ...
## $ item2: Factor w/ 4 levels "faible","moy",..: 1 1 1 1 2 2 2 2 3 3 ...
## $ count: num 1 2 3 4 5 6 7 8 9 10 ...
MYTAB
## item1 item2 count
## 1 faible faible 1
## 2 moy faible 2
## 3 for faible 3
## 4 tresfort faible 4
## 5 faible moy 5
## 6 moy moy 6
## 7 for moy 7
## 8 tresfort moy 8
## 9 faible for 9
## 10 moy for 10
## 11 for for 11
## 12 tresfort for 12
## 13 faible tresfort 13
## 14 moy tresfort 14
## 15 for tresfort 15
## 16 tresfort tresfort 16
mytable=xtabs(MYTAB$count~MYTAB$item1+MYTAB$item2)
mytable
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
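The same table can also be built with the data argument of xtabs(), which avoids repeating MYTAB$ and gives shorter dimension names. A minimal sketch, rebuilding the data frame so that it runs on its own (the name mytable2 is only illustrative):

```r
# Rebuild the example data frame (same content as MYTAB above)
MYTAB <- data.frame(
  expand.grid(item1 = c("faible", "moy", "for", "tresfort"),
              item2 = c("faible", "moy", "for", "tresfort")),
  count = 1:16
)

# Formula interface: the counts on the left, the crossed factors on the right
mytable2 <- xtabs(count ~ item1 + item2, data = MYTAB)
mytable2

# The dimensions are now simply named item1 and item2
names(dimnames(mytable2))
```

The cell counts are identical to mytable; only the labels of the two dimensions change.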
# let's check that the two factors are fully crossed, i.e. that every pair of levels occurs:
table(MYTAB$item1,MYTAB$item2) # each combination appears once: 12 mixed pairs + 4 same-level pairs make 16 cells
##
## faible moy for tresfort
## faible 1 1 1 1
## moy 1 1 1 1
## for 1 1 1 1
## tresfort 1 1 1 1
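table() returns 1 in every cell because it counts rows, and MYTAB has exactly one row per combination. If you want table() to reproduce the counts, one possible approach (my own illustration, not taken from the course) is to expand the data frame to one row per paired answer first. MYLONG and the two counts_ objects are illustrative names:

```r
# Rebuild the example data frame (same content as MYTAB above)
MYTAB <- data.frame(
  expand.grid(item1 = c("faible", "moy", "for", "tresfort"),
              item2 = c("faible", "moy", "for", "tresfort")),
  count = 1:16
)

# Repeat each row as many times as its count: one row per paired answer
MYLONG <- MYTAB[rep(seq_len(nrow(MYTAB)), times = MYTAB$count),
                c("item1", "item2")]
nrow(MYLONG)

# table() now counts answers, so it should agree with xtabs() on the counts
counts_from_rows  <- table(MYLONG$item1, MYLONG$item2)
counts_from_xtabs <- xtabs(count ~ item1 + item2, data = MYTAB)
all(as.vector(counts_from_rows) == as.vector(counts_from_xtabs))
```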
# add the marginal totals
addmargins(mytable) # a total of 136 paired answers (item1 & item2)
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort Sum
## faible 1 5 9 13 28
## moy 2 6 10 14 32
## for 3 7 11 15 36
## tresfort 4 8 12 16 40
## Sum 10 26 42 58 136
# that is why the grand total is usually written in maths as a double summation, ∑_i ∑_j n_ij (row-wise over i, then column-wise over j)
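The double summation can be checked directly: summing each row gives the item1 margin, summing each column gives the item2 margin, and either margin adds up to the grand total. A small sketch in base R (the table is rebuilt as a plain matrix here so that the block runs on its own; levels4, n, row_totals and col_totals are illustrative names):

```r
levels4 <- c("faible", "moy", "for", "tresfort")

# Same counts as mytable: matrix() fills column by column
n <- matrix(1:16, nrow = 4,
            dimnames = list(item1 = levels4, item2 = levels4))

row_totals <- rowSums(n)  # sum over the columns j, for each row i
col_totals <- colSums(n)  # sum over the rows i, for each column j
row_totals
col_totals

# All three ways of summing give the same grand total
c(sum(row_totals), sum(col_totals), sum(n))
```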
addmargins(prop.table(mytable)) # a table of proportions adding up to a total of 1 (100%)
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort Sum
## faible 0.007352941 0.036764706 0.066176471 0.095588235 0.205882353
## moy 0.014705882 0.044117647 0.073529412 0.102941176 0.235294118
## for 0.022058824 0.051470588 0.080882353 0.110294118 0.264705882
## tresfort 0.029411765 0.058823529 0.088235294 0.117647059 0.294117647
## Sum 0.073529412 0.191176471 0.308823529 0.426470588 1.000000000
# Note that there is another type of total, called a conditional margin (not explained here to avoid confusion)
Avant=factor(rep(c("oui","non"),c(20,30)))
Apres=factor(rep(c("oui","non"),c(30,20)))
Ma=data.frame(Avant,Apres)
str(Ma)
## 'data.frame': 50 obs. of 2 variables:
## $ Avant: Factor w/ 2 levels "non","oui": 2 2 2 2 2 2 2 2 2 2 ...
## $ Apres: Factor w/ 2 levels "non","oui": 2 2 2 2 2 2 2 2 2 2 ...
table(Ma)
## Apres
## Avant non oui
## non 20 10
## oui 0 20
# CAUTION: WRONG CODING
prop.table(addmargins(mytable))
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort Sum
## faible 0.001838235 0.009191176 0.016544118 0.023897059 0.051470588
## moy 0.003676471 0.011029412 0.018382353 0.025735294 0.058823529
## for 0.005514706 0.012867647 0.020220588 0.027573529 0.066176471
## tresfort 0.007352941 0.014705882 0.022058824 0.029411765 0.073529412
## Sum 0.018382353 0.047794118 0.077205882 0.106617647 0.250000000
# the Sum row and column are treated as if they were levels of the factors, so every cell is divided by 4 x 136 = 544 and the calculation is WRONG!
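The reason the order of the two calls matters can be verified numerically. In this sketch (table rebuilt so that the block runs on its own; with_margins, wrong and right are illustrative names), the table with margins sums to four times the real total, which is why the bottom-right cell shows 0.25 instead of 1:

```r
levels4 <- c("faible", "moy", "for", "tresfort")
tab <- as.table(matrix(1:16, nrow = 4,
                       dimnames = list(item1 = levels4, item2 = levels4)))

with_margins <- addmargins(tab)
sum(tab)           # the real total of paired answers
sum(with_margins)  # cells + row sums + column sums + grand total

wrong <- prop.table(with_margins)     # margins counted as if they were data
right <- addmargins(prop.table(tab))  # proportions first, margins second
wrong["Sum", "Sum"]
right["Sum", "Sum"]
```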
library(waffle)
## Warning: package 'waffle' was built under R version 4.2.3
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.2.2
mytable
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
mosaicplot(mytable,col=rainbow(4)) # the appropriate way of visualizing a two-way table of factors with more than 2 levels
barplot(mytable) ### WRONG COMMAND here: it simply stacks the cell counts
# as we will see, a bar plot is only suitable for the vector of counts of a single item!
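A bar plot becomes meaningful once the table is reduced to a single item. A minimal sketch, assuming that the marginal counts of item1 are what you want to display (item1_counts is an illustrative name, and the table is rebuilt so that the block runs on its own):

```r
levels4 <- c("faible", "moy", "for", "tresfort")
tab <- as.table(matrix(1:16, nrow = 4,
                       dimnames = list(item1 = levels4, item2 = levels4)))

# Collapse the two-way table onto item1: one count per level
item1_counts <- margin.table(tab, 1)
item1_counts

# One bar per level of item1
barplot(item1_counts, xlab = "item1", ylab = "number of answers")
```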
### via a matrix
mytable
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
c(1,5,9,13,2,6,10,14,3,7,11,15,4,8,12,16)
## [1] 1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16
# use rbind() or cbind()
tab <- as.table(rbind(c(1,5,9,13),c(2,6,10,14),c(3,7,11,15),c(4,8,12,16)))
tab
## A B C D
## A 1 5 9 13
## B 2 6 10 14
## C 3 7 11 15
## D 4 8 12 16
dimnames(tab) <- list(item1 = c("faible", "moy", "for", "tresfort"),
item2 = c("faible", "moy", "for", "tresfort"))
tab
## item2
## item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
mytable
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
# the original table is reproduced correctly: OK
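The rbind() call can also be replaced by matrix(), which fills column by column by default and accepts the dimension names directly. A sketch, with tab2 and tab_rbind as illustrative names:

```r
levels4 <- c("faible", "moy", "for", "tresfort")

# matrix() fills down the columns, so 1:16 gives rows 1 5 9 13, 2 6 10 14, ...
tab2 <- as.table(matrix(1:16, nrow = 4,
                        dimnames = list(item1 = levels4, item2 = levels4)))
tab2

# Same cell values as the row-by-row construction with rbind()
tab_rbind <- rbind(c(1, 5, 9, 13), c(2, 6, 10, 14),
                   c(3, 7, 11, 15), c(4, 8, 12, 16))
all(as.vector(tab2) == as.vector(tab_rbind))
```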
t(mytable) # transpose the table: rows and columns swap as in a matrix transpose, and the item1 / item2 labels swap with them
## MYTAB$item1
## MYTAB$item2 faible moy for tresfort
## faible 1 2 3 4
## moy 5 6 7 8
## for 9 10 11 12
## tresfort 13 14 15 16
mytable
## MYTAB$item2
## MYTAB$item1 faible moy for tresfort
## faible 1 5 9 13
## moy 2 6 10 14
## for 3 7 11 15
## tresfort 4 8 12 16
A quantile is a measure of position that applies to ranked (ordered) variables.
On a data frame it works easily on a numeric column (mtcars$cyl below), but on the table we coded it is a bit trickier:
quantile(mtcars$cyl)
## 0% 25% 50% 75% 100%
## 4 4 6 8 8
quantile(tab)
## 0% 25% 50% 75% 100%
## 1.00 4.75 8.50 12.25 16.00
# wrong: these are the quantiles of the 16 cell counts
# with ordinal data, the quantile is a position among all N paired respondents,
# here out of the 136 in total
quantile(mytable)
## 0% 25% 50% 75% 100%
## 1.00 4.75 8.50 12.25 16.00
#wrong again
quantile(MYTAB$count)
## 0% 25% 50% 75% 100%
## 1.00 4.75 8.50 12.25 16.00
#again wrong
# the solution: build the full vector of answers, keep it sorted in level order, and read off the value at the position you want, as such:
# item1, i.e. each level repeated by its marginal count
myvector1=rep(c("faible","moy","for","tresfort"),c(28 , 32 , 36 , 40 ))
summary(myvector1)
## Length Class Mode
## 136 character character
myvector1=factor(myvector1,levels=c("faible","moy","for","tresfort"),ordered=TRUE) # explicit levels: the default alphabetical order would put "for" before "moy"
sum(table(myvector1))
## [1] 136
q=quantile(order(myvector1)) # quantile(myvector1) fails with the default type, so take the quantiles of the sorted positions instead
q # these are positions within a vector of length 136
## 0% 25% 50% 75% 100%
## 1.00 34.75 68.50 102.25 136.00
# read off the value at the position of each desired quantile, here 25%, 50% and 75%
myvector1[c(35,68,102)] # ordinal values at the 25%, 50% and 75% quantiles
## [1] moy for tresfort
## Levels: faible < moy < for < tresfort
# re-check that position 35 is indeed moy
myvector1[35]
## [1] moy
## Levels: faible < moy < for < tresfort
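The same answers can be found without indexing by hand. The sketch below is my own illustration, not the course's method: it takes the first level whose cumulative proportion reaches the desired probability, then cross-checks with quantile() using type = 1, which R accepts for ordered factors. The names levels4, counts, cum_prop, probs, by_cumsum, x and by_quantile are illustrative.

```r
levels4 <- c("faible", "moy", "for", "tresfort")
counts  <- c(28, 32, 36, 40)  # item1 margins of the two-way table

# Cumulative proportion of answers up to and including each level
cum_prop <- cumsum(counts) / sum(counts)
names(cum_prop) <- levels4
cum_prop

# For each probability, the first level whose cumulative proportion reaches it
probs <- c(0.25, 0.50, 0.75)
by_cumsum <- sapply(probs, function(p) levels4[which(cum_prop >= p)[1]])
by_cumsum

# Cross-check: quantile() on an ordered factor requires type 1 or 3
x <- factor(rep(levels4, counts), levels = levels4, ordered = TRUE)
by_quantile <- quantile(x, probs = probs, type = 1)
by_quantile
```

Both routes should agree with the manual indexing above: moy, for and tresfort for the 25%, 50% and 75% quantiles.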
NOTE: in your course, Kappa is presented as a covariance (symmetry) Kappa, but there are other procedures for calculating a Kappa, derived from the chi-squared (khi-deux) statistic (see T. Ancelle, YouTube, epidemiology).
In a two-way table, n11 and n22 form the diagonal of the matrix: YES-YES and NO-NO. Perfect concordance, or agreement between the crossed answers (VAR), does not interest you for now. Conversely, for a treatment (e.g. crises before and after a drug), it is the YES/NO or NO/YES answers that define the effect of the treatment (depending, of course, on the design of the experiment). It is n12 and n21 that are of interest (COV): for this asymmetry, the difference between these two values gives its extent.
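To make the n11, n22, n12 and n21 notation concrete, here is a small sketch on a 2 x 2 before/after table built with the same counts as the Avant/Apres example. Which cell counts as n12 or n21 depends on the order of the factor levels; fixing the order as oui, non is my own assumption, and tab2x2, concordant, n12 and n21 are illustrative names.

```r
# Before/after answers, with the level order fixed as oui, non (assumption)
Avant <- factor(rep(c("oui", "non"), c(20, 30)), levels = c("oui", "non"))
Apres <- factor(rep(c("oui", "non"), c(30, 20)), levels = c("oui", "non"))
tab2x2 <- table(Avant, Apres)
tab2x2

# Diagonal: answers that did not change (n11 = oui-oui, n22 = non-non)
concordant <- diag(tab2x2)

# Off-diagonal: answers that changed between the two measurements
n12 <- tab2x2["oui", "non"]  # oui before, non after
n21 <- tab2x2["non", "oui"]  # non before, oui after

concordant
c(n12 = n12, n21 = n21, difference = n21 - n12)
```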
library(vcd)
## Loading required package: grid
Kappa(mytable)
## value ASE z Pr(>|z|)
## Unweighted -0.02361 0.04795 -0.4924 0.6224
## Weighted -0.04938 0.05494 -0.8988 0.3688
K=Kappa(tab)
summary(K)
## value ASE z Pr(>|z|)
## Unweighted -0.02361 0.04795 -0.4924 0.6224
## Weighted -0.04938 0.05494 -0.8988 0.3688
##
## Weights:
## [,1] [,2] [,3] [,4]
## [1,] 1.0000000 0.6666667 0.3333333 0.0000000
## [2,] 0.6666667 1.0000000 0.6666667 0.3333333
## [3,] 0.3333333 0.6666667 1.0000000 0.6666667
## [4,] 0.0000000 0.3333333 0.6666667 1.0000000
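The two values can be reproduced by hand with the usual Cohen's kappa formula, (observed agreement - expected agreement) / (1 - expected agreement). The sketch below is an illustration in base R, not the internals of vcd; for the weighted version it assumes the linear weights 1 - |i - j| / (k - 1), which is what the Weights matrix printed above shows. All object names are illustrative.

```r
n <- matrix(1:16, nrow = 4)  # same counts as mytable / tab
p <- n / sum(n)              # joint proportions
r <- rowSums(p)              # item1 margins
s <- colSums(p)              # item2 margins
expected <- outer(r, s)      # proportions expected under independence

# Unweighted kappa: only the diagonal counts as agreement
po <- sum(diag(p))
pe <- sum(diag(expected))
kappa_unweighted <- (po - pe) / (1 - pe)

# Weighted kappa: near-misses get partial credit through linear weights
k <- nrow(n)
w <- 1 - abs(outer(1:k, 1:k, "-")) / (k - 1)
po_w <- sum(w * p)
pe_w <- sum(w * expected)
kappa_weighted <- (po_w - pe_w) / (1 - pe_w)

round(c(unweighted = kappa_unweighted, weighted = kappa_weighted), 5)
```

Rounded to five decimals, the two results should match the value column reported by Kappa().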