1 Load libraries

library(table1)
library(likert)
library(grid)

2 Exploring responses to a questionnaire

The questionnaire was developed by Khanh Bui My Thuy for her master's thesis, and was published in: https://doi.org/10.1016/j.vprsr.2024.101024

Dog owners were interviewed about care practices, diet, living patterns, health awareness, and their relationship with their dogs.

2.1 Get data file

quest_dogs <- readxl::read_excel("data.questionnaire.dogs.khanh.xlsx")

2.2 Table of responses using library “table1”

table1( ~ sex + age_group + occupation + number_dog + dog_inside_outside_house | village,
                data = quest_dogs, render.missing = NULL)
| | 1 Na Non (N=14) | 2 Na Sai (N=12) | 3 Poh (N=25) | 4 Huak (N=32) | 5 Nam Krai (N=25) | 6 Huay Muang (N=14) | 7 Sun Ti Suk (N=16) | 8 Hae (N=18) | Overall (N=156) |
|---|---|---|---|---|---|---|---|---|---|
| sex | | | | | | | | | |
| F | 6 (42.9%) | 6 (50.0%) | 8 (32.0%) | 9 (28.1%) | 3 (12.0%) | 5 (35.7%) | 7 (43.8%) | 4 (22.2%) | 48 (30.8%) |
| M | 8 (57.1%) | 6 (50.0%) | 15 (60.0%) | 23 (71.9%) | 22 (88.0%) | 9 (64.3%) | 9 (56.3%) | 14 (77.8%) | 106 (67.9%) |
| NA | 0 (0%) | 0 (0%) | 2 (8.0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 2 (1.3%) |
| age_group | | | | | | | | | |
| 35 to 60 | 10 (71.4%) | 10 (83.3%) | 14 (56.0%) | 23 (71.9%) | 8 (32.0%) | 7 (50.0%) | 13 (81.3%) | 14 (77.8%) | 99 (63.5%) |
| less than 35 | 1 (7.1%) | 0 (0%) | 2 (8.0%) | 1 (3.1%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (5.6%) | 5 (3.2%) |
| over 60 | 2 (14.3%) | 2 (16.7%) | 5 (20.0%) | 8 (25.0%) | 16 (64.0%) | 7 (50.0%) | 3 (18.8%) | 3 (16.7%) | 46 (29.5%) |
| occupation | | | | | | | | | |
| farmer | 13 (92.9%) | 9 (75.0%) | 13 (52.0%) | 28 (87.5%) | 22 (88.0%) | 14 (100%) | 15 (93.8%) | 13 (72.2%) | 127 (81.4%) |
| non farmer | 1 (7.1%) | 3 (25.0%) | 11 (44.0%) | 4 (12.5%) | 3 (12.0%) | 0 (0%) | 1 (6.3%) | 4 (22.2%) | 27 (17.3%) |
| number_dog | | | | | | | | | |
| Mean (SD) | 1.79 (1.25) | 1.50 (0.522) | 1.20 (0.408) | 1.94 (1.01) | 2.88 (1.90) | 1.64 (0.497) | 2.00 (1.10) | 1.22 (0.428) | 1.82 (1.18) |
| Median [Min, Max] | 1.00 [1.00, 4.00] | 1.50 [1.00, 2.00] | 1.00 [1.00, 2.00] | 2.00 [1.00, 4.00] | 2.00 [1.00, 6.00] | 2.00 [1.00, 2.00] | 2.00 [1.00, 4.00] | 1.00 [1.00, 2.00] | 2.00 [1.00, 6.00] |
| dog_inside_outside_house | | | | | | | | | |
| mostly inside house | 2 (14.3%) | 1 (8.3%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 3 (16.7%) | 6 (3.8%) |
| only outside house | 4 (28.6%) | 5 (41.7%) | 22 (88.0%) | 27 (84.4%) | 25 (100%) | 14 (100%) | 16 (100%) | 14 (77.8%) | 127 (81.4%) |
| sometime inside or outside | 8 (57.1%) | 6 (50.0%) | 3 (12.0%) | 5 (15.6%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (5.6%) | 23 (14.7%) |

2.3 Customize table

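The appearance of the table can be customized with the “topclass” argument of table1(), which applies one or more built-in styles; see: https://cran.r-project.org/web/packages/table1/vignettes/table1-examples.html

Unlike the previous call, render.missing = NULL is not set here, so rows for missing values are shown.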

table1( ~ sex + age_group + occupation + number_dog + dog_inside_outside_house | village,
                data = quest_dogs, topclass="Rtable1-grid Rtable1-shade Rtable1-times")
| | 1 Na Non (N=14) | 2 Na Sai (N=12) | 3 Poh (N=25) | 4 Huak (N=32) | 5 Nam Krai (N=25) | 6 Huay Muang (N=14) | 7 Sun Ti Suk (N=16) | 8 Hae (N=18) | Overall (N=156) |
|---|---|---|---|---|---|---|---|---|---|
| sex | | | | | | | | | |
| F | 6 (42.9%) | 6 (50.0%) | 8 (32.0%) | 9 (28.1%) | 3 (12.0%) | 5 (35.7%) | 7 (43.8%) | 4 (22.2%) | 48 (30.8%) |
| M | 8 (57.1%) | 6 (50.0%) | 15 (60.0%) | 23 (71.9%) | 22 (88.0%) | 9 (64.3%) | 9 (56.3%) | 14 (77.8%) | 106 (67.9%) |
| NA | 0 (0%) | 0 (0%) | 2 (8.0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 2 (1.3%) |
| age_group | | | | | | | | | |
| 35 to 60 | 10 (71.4%) | 10 (83.3%) | 14 (56.0%) | 23 (71.9%) | 8 (32.0%) | 7 (50.0%) | 13 (81.3%) | 14 (77.8%) | 99 (63.5%) |
| less than 35 | 1 (7.1%) | 0 (0%) | 2 (8.0%) | 1 (3.1%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (5.6%) | 5 (3.2%) |
| over 60 | 2 (14.3%) | 2 (16.7%) | 5 (20.0%) | 8 (25.0%) | 16 (64.0%) | 7 (50.0%) | 3 (18.8%) | 3 (16.7%) | 46 (29.5%) |
| Missing | 1 (7.1%) | 0 (0%) | 4 (16.0%) | 0 (0%) | 1 (4.0%) | 0 (0%) | 0 (0%) | 0 (0%) | 6 (3.8%) |
| occupation | | | | | | | | | |
| farmer | 13 (92.9%) | 9 (75.0%) | 13 (52.0%) | 28 (87.5%) | 22 (88.0%) | 14 (100%) | 15 (93.8%) | 13 (72.2%) | 127 (81.4%) |
| non farmer | 1 (7.1%) | 3 (25.0%) | 11 (44.0%) | 4 (12.5%) | 3 (12.0%) | 0 (0%) | 1 (6.3%) | 4 (22.2%) | 27 (17.3%) |
| Missing | 0 (0%) | 0 (0%) | 1 (4.0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (5.6%) | 2 (1.3%) |
| number_dog | | | | | | | | | |
| Mean (SD) | 1.79 (1.25) | 1.50 (0.522) | 1.20 (0.408) | 1.94 (1.01) | 2.88 (1.90) | 1.64 (0.497) | 2.00 (1.10) | 1.22 (0.428) | 1.82 (1.18) |
| Median [Min, Max] | 1.00 [1.00, 4.00] | 1.50 [1.00, 2.00] | 1.00 [1.00, 2.00] | 2.00 [1.00, 4.00] | 2.00 [1.00, 6.00] | 2.00 [1.00, 2.00] | 2.00 [1.00, 4.00] | 1.00 [1.00, 2.00] | 2.00 [1.00, 6.00] |
| dog_inside_outside_house | | | | | | | | | |
| mostly inside house | 2 (14.3%) | 1 (8.3%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 3 (16.7%) | 6 (3.8%) |
| only outside house | 4 (28.6%) | 5 (41.7%) | 22 (88.0%) | 27 (84.4%) | 25 (100%) | 14 (100%) | 16 (100%) | 14 (77.8%) | 127 (81.4%) |
| sometime inside or outside | 8 (57.1%) | 6 (50.0%) | 3 (12.0%) | 5 (15.6%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (5.6%) | 23 (14.7%) |

2.3.1 Chi-square test for occupation

We can test several hypotheses, for example whether the proportion of dog owners who are farmers or non-farmers differs between villages. For this, we can use a chi-square test. The null hypothesis (H0) is that there are no differences in occupation between villages (i.e. the proportion of farmers is the same in all villages).

The p-value is 0.004, below the 0.05 significance level, so we reject the null hypothesis in favor of the alternative hypothesis (H1), i.e. the proportion of farmers differs between villages. As a reminder, the degrees of freedom (df) for a chi-square test are (number of rows - 1) × (number of columns - 1). Here the number of rows is 2 (occupation types) and the number of columns is 8 (villages), so df = (2 - 1) × (8 - 1) = 7.

chisq.test(quest_dogs$occupation, quest_dogs$village)
## 
##  Pearson's Chi-squared test
## 
## data:  quest_dogs$occupation and quest_dogs$village
## X-squared = 20.701, df = 7, p-value = 0.00424
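
As a cross-check, the test can be reproduced in base R from the farmer and non-farmer counts reported in the table of section 2.2. This is only a sketch: the matrix occ_by_village is typed in by hand from that table (respondents with a missing occupation are left out), and the object names are illustrative, not part of the original analysis.

```r
# Counts of farmers / non-farmers per village, copied from the table in section 2.2
# (respondents with a missing occupation are not included).
occ_by_village <- rbind(
  farmer     = c(13, 9, 13, 28, 22, 14, 15, 13),
  non_farmer = c( 1, 3, 11,  4,  3,  0,  1,  4)
)
colnames(occ_by_village) <- c("Na Non", "Na Sai", "Poh", "Huak",
                              "Nam Krai", "Huay Muang", "Sun Ti Suk", "Hae")

# Degrees of freedom: (rows - 1) x (columns - 1)
df_manual <- (nrow(occ_by_village) - 1) * (ncol(occ_by_village) - 1)

# Expected counts under H0: row total x column total / grand total
expected <- outer(rowSums(occ_by_village), colSums(occ_by_village)) /
  sum(occ_by_village)

# Pearson chi-square statistic and its p-value
x2_manual <- sum((occ_by_village - expected)^2 / expected)
p_manual  <- pchisq(x2_manual, df = df_manual, lower.tail = FALSE)

round(c(X2 = x2_manual, df = df_manual, p = p_manual), 4)
```

The statistic, df and p-value should agree with the chisq.test() output above. Several expected counts for non-farmers are below 5, so chisq.test() may warn that the chi-square approximation is imprecise for these data.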

2.4 Power analysis for a Chi-square Test

In statistical hypothesis testing, the power of a test is the probability that it correctly rejects the null hypothesis when the alternative hypothesis is true. It is defined as: Power = 1 − β, where β is the probability of a type II error (failing to reject a false null hypothesis).

Power analysis helps determine the sample size needed to detect an effect of a given size with a given degree of confidence (see: https://reintech.io/blog/power-analysis-with-r-tutorial-for-developers).

2.4.1 Sample size

For a chi-square test, we use the function ‘pwr.chisq.test’ from the package “pwr”. For a medium effect size (w = 0.3), 80% power, 7 degrees of freedom (see above), and a significance level of 0.05, the required sample size is N = 159.45, i.e. 160 respondents after rounding up. The actual sample size, N = 156, falls slightly short of this.

pwr::pwr.chisq.test(w=0.3, power=0.8, df= 7, sig.level=0.05)
## 
##      Chi squared power calculation 
## 
##               w = 0.3
##               N = 159.4503
##              df = 7
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: N is the number of observations
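
The required sample size can also be approximated in base R by solving for the N at which the power reaches 0.80. The sketch below assumes the usual noncentral chi-square formulation of power, with noncentrality parameter N × w²; the helper chisq_power() is an illustrative name, not a function from “pwr”.

```r
# Power of a chi-square test with n observations, effect size w (Cohen's w),
# dof degrees of freedom and significance level alpha.
chisq_power <- function(n, w, dof, alpha = 0.05) {
  crit <- qchisq(alpha, df = dof, lower.tail = FALSE)        # critical value under H0
  pchisq(crit, df = dof, ncp = n * w^2, lower.tail = FALSE)  # P(reject H0) under H1
}

# Solve chisq_power(n) = 0.80 for n
n_required <- uniroot(function(n) chisq_power(n, w = 0.3, dof = 7) - 0.80,
                      interval = c(10, 1000), tol = 1e-8)$root

n_required           # should be close to the N reported by pwr.chisq.test()
ceiling(n_required)  # number of whole respondents needed
```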

2.4.2 Power

We use the function “power.chisq.test” from the library “DescTools” to compute the power of our chi-square test. Using the actual sample size, N = 156, 7 degrees of freedom (see above), and a medium effect size (w = 0.3), the power of the chi-square analysis is 0.789, just below the 80% target used above.

DescTools::power.chisq.test(w=0.30, df=((8-1) * (2-1)), n=156, sig.level=0.05)
## 
##      Chi squared power calculation 
## 
##               w = 0.3
##               n = 156
##              df = 7
##       sig.level = 0.05
##           power = 0.7895372
## 
## NOTE: n is the number of observations
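
To connect this output with the definition Power = 1 − β, the same value can be computed directly in base R. This is a minimal sketch that assumes the standard noncentral chi-square formulation (noncentrality parameter n × w²); the variable names are illustrative.

```r
n     <- 156    # actual sample size
w     <- 0.3    # Cohen's w (medium effect)
dof   <- 7      # degrees of freedom
alpha <- 0.05   # significance level

# Critical value of the test under the null hypothesis
crit <- qchisq(1 - alpha, df = dof)

# beta: probability of NOT rejecting H0 when the true effect size is w
beta <- pchisq(crit, df = dof, ncp = n * w^2)

power <- 1 - beta
round(power, 4)
```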

The effect size is a value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity, see: https://en.wikipedia.org/wiki/Effect_size.

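In a power analysis for a chi-square test, the effect size is given by Cohen’s w, which applies to goodness-of-fit tests and to tests of independence in contingency tables.

Rule of thumb for Cohen’s w

| w | Interpretation |
|---|---|
| 0.00 to < 0.10 | Negligible |
| 0.10 to < 0.30 | Small |
| 0.30 to < 0.50 | Medium |
| 0.50 and more | Large |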
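
For a chi-square test, Cohen's w can be estimated from the test statistic as w = sqrt(X² / N). The sketch below applies this to the occupation test from section 2.3.1 and looks up the rule-of-thumb label. It assumes N = 154, the number of respondents with a known occupation in the table (127 farmers + 27 non-farmers); the variable names are illustrative.

```r
x2         <- 20.701     # X-squared reported by chisq.test() in section 2.3.1
n_complete <- 127 + 27   # assumption: respondents with a known occupation

# Observed effect size
w_obs <- sqrt(x2 / n_complete)

# Rule-of-thumb label; intervals are closed on the left, e.g. [0.30, 0.50)
w_label <- cut(w_obs,
               breaks = c(0, 0.10, 0.30, 0.50, Inf),
               labels = c("Negligible", "Small", "Medium", "Large"),
               right  = FALSE)

round(w_obs, 3)
as.character(w_label)
```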

Below, we plot the relationship between the power and the effect size for our chi-square test, with n = 156, df = 7 and a significance level of 0.05.

# Parameters for power curve
total_sample_size <- 156  # Total sample size
effect_sizes_curve <- seq(0.2, 0.8, by = 0.1)  # Range of effect sizes (Cohen's w)
alpha_chisq <- 0.05  # Significance level

# Calculate power values
power_values_curve <- sapply(effect_sizes_curve, function(w) pwr::pwr.chisq.test(w = w, 
                                                    N = total_sample_size, 
                                                    df = 7,  # degrees of freedom
                                                    sig.level = alpha_chisq)$power)
# Plot power curve
plot(effect_sizes_curve, power_values_curve, type = "b",
     main = "Power Curve for Chi-Square Test",
     xlab = "Effect Size (Cohen's w)",
     ylab = "Power",
     ylim = c(0, 1))

2.5 Preparing data for figure using likert

The dataset should be a data frame: use as.data.frame()

The variables should be factors: use as.factor()

dogq <- as.data.frame(quest_dogs)
dogq$village <- as.factor(dogq$village)
dogq$bath_dog <- as.factor(dogq$bath_dog)
dogq$village_group_landscape <- as.factor(dogq$village_group_landscape)
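
Once the data are prepared this way, a likert object can be built with likert::likert() and plotted. The sketch below is illustrative only: it uses a small made-up data frame (toy) with hypothetical response levels, because the actual levels of bath_dog are not shown here, and the object names are assumptions.

```r
library(likert)

# Hypothetical toy data: one questionnaire item and a grouping variable.
# The response levels below are invented for illustration.
toy <- data.frame(
  village  = factor(rep(c("Village A", "Village B"), each = 6)),
  bath_dog = factor(c("never", "sometimes", "often", "often", "sometimes", "never",
                      "never", "never", "sometimes", "often", "never", "sometimes"),
                    levels = c("never", "sometimes", "often"))
)

# items must be a data frame of factors sharing the same levels;
# drop = FALSE keeps a single column as a data frame.
result_toy <- likert(items = toy[, "bath_dog", drop = FALSE],
                     grouping = toy$village)

summary(result_toy)
plot(result_toy)
```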

2.6 Customize (colors)

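# Custom colors for the response levels
myColor <- c("blue", "pink", "red")
# result_question is assumed to be a likert object created beforehand with likert::likert()
plot(result_question,
     include.histogram = TRUE, col = myColor)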


3 List of some tutorials