data("UCBAdmissions", package="datasets")
View(UCBAdmissions)

(a) Find the total number of cases contained in this table.

mydata=xtabs(formula = Freq~Admit +Gender+ Dept, data = UCBAdmissions)
summary(mydata)
## Call: xtabs(formula = Freq ~ Admit + Gender + Dept, data = UCBAdmissions)
## Number of cases in table: 4526 
## Number of factors: 3 
## Test for independence of all factors:
##  Chisq = 2000.3, df = 16, p-value = 0

The table contains 4526 cases in total, cross-classified by 3 factors (Admit, Gender, and Dept).
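As a cross-check, the same two numbers can be read directly from the table object without refitting it through xtabs(). This is a minimal sketch; the names total_cases and n_factors are illustrative and not part of the original solution.

```r
# Load the built-in 3-way contingency table (Admit x Gender x Dept)
data("UCBAdmissions", package = "datasets")

# Summing every cell of the table gives the total number of cases
total_cases <- sum(UCBAdmissions)

# Each dimension of the array corresponds to one classifying factor
n_factors <- length(dim(UCBAdmissions))

total_cases
n_factors
```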

(b) For each department, find the total number of applicants.

addmargins(UCBAdmissions,  FUN = sum)
## Margins computed over dimensions
## in the following order:
## 1: Admit
## 2: Gender
## 3: Dept
## , , Dept = A
## 
##           Gender
## Admit      Male Female  sum
##   Admitted  512     89  601
##   Rejected  313     19  332
##   sum       825    108  933
## 
## , , Dept = B
## 
##           Gender
## Admit      Male Female  sum
##   Admitted  353     17  370
##   Rejected  207      8  215
##   sum       560     25  585
## 
## , , Dept = C
## 
##           Gender
## Admit      Male Female  sum
##   Admitted  120    202  322
##   Rejected  205    391  596
##   sum       325    593  918
## 
## , , Dept = D
## 
##           Gender
## Admit      Male Female  sum
##   Admitted  138    131  269
##   Rejected  279    244  523
##   sum       417    375  792
## 
## , , Dept = E
## 
##           Gender
## Admit      Male Female  sum
##   Admitted   53     94  147
##   Rejected  138    299  437
##   sum       191    393  584
## 
## , , Dept = F
## 
##           Gender
## Admit      Male Female  sum
##   Admitted   22     24   46
##   Rejected  351    317  668
##   sum       373    341  714
## 
## , , Dept = sum
## 
##           Gender
## Admit      Male Female  sum
##   Admitted 1198    557 1755
##   Rejected 1493   1278 2771
##   sum      2691   1835 4526

OR

apply(UCBAdmissions, 3, sum)
##   A   B   C   D   E   F 
## 933 585 918 792 584 714

The first command is the more detailed one: for each department it shows the number of applicants by gender and admission status, together with all marginal sums. The second command returns only the total number of applicants in each department.
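A third option, sketched here as an assumption about what reads most directly, is margin.table(), which collapses the table over every dimension except the one requested. The name dept_totals is illustrative.

```r
data("UCBAdmissions", package = "datasets")

# Keep only dimension 3 (Dept); Admit and Gender are summed out
dept_totals <- margin.table(UCBAdmissions, margin = 3)
dept_totals
```

The result should agree with the apply(UCBAdmissions, 3, sum) output, but it stays a table object with its Dept label.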

(c) For each department, find the overall proportion of applicants who were admitted.

prop.table(ftable(UCBAdmissions ,row.vars = "Admit",col.vars = "Dept"))
##          Dept          A          B          C          D          E          F
## Admit                                                                          
## Admitted      0.13278833 0.08174989 0.07114450 0.05943438 0.03247901 0.01016350
## Rejected      0.07335395 0.04750331 0.13168361 0.11555457 0.09655325 0.14759169

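This shows the proportion of applicants who were admitted or rejected in each department. Note that prop.table() without a margin argument divides every cell by the grand total (4526), so the entries are shares of all applicants (for Dept A, 601/4526 = 0.1328), not admission rates within a department. The admission rate within a department is the admitted count divided by that department's total, e.g. 601/933 = 0.644 for Dept A.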
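One way to obtain the within-department admission rate is to collapse over Gender first and then normalize each department column so that it sums to 1. This is a sketch, not the original author's code; admit_by_dept and admit_rate are illustrative names.

```r
data("UCBAdmissions", package = "datasets")

# Collapse over Gender: a 2 x 6 table of Admit by Dept
admit_by_dept <- margin.table(UCBAdmissions, margin = c(1, 3))

# margin = 2 divides each column (department) by its own total,
# so every column of the result sums to 1
admit_rate <- prop.table(admit_by_dept, margin = 2)

# The "Admitted" row is the proportion admitted in each department
round(admit_rate["Admitted", ], 3)
```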

(d) Construct a tabular display of department (rows) and gender (columns), showing the proportion of applicants in each cell who were admitted relative to the total applicants in that cell.

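# aperm() reverses the dimension order to Dept x Gender x Admit
# (named tab rather than table to avoid masking base::table)
tab=aperm(UCBAdmissions)
table1 <- tab[,,"Admitted"]   # admitted counts, Dept x Gender
table2 <- tab[,,"Rejected"]   # rejected counts, Dept x Gender
table3 <- table1/(table1+table2)   # admitted / total applicants per cell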
table3
##     Gender
## Dept       Male     Female
##    A 0.62060606 0.82407407
##    B 0.63035714 0.68000000
##    C 0.36923077 0.34064081
##    D 0.33093525 0.34933333
##    E 0.27748691 0.23918575
##    F 0.05898123 0.07038123

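The first command uses aperm() to reverse the dimension order, so the array is arranged as Dept by Gender by Admit. table1 and table2 are the Admitted and Rejected slices of that array. table3 divides the admitted counts by the total number of applicants in each department-by-gender cell, which gives the proportion admitted in each cell.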
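The same display can also be sketched with prop.table() by conditioning on Gender and Dept together, which avoids building the intermediate slices by hand. The names cell_props and admit_prop are illustrative.

```r
data("UCBAdmissions", package = "datasets")

# Within every Gender x Dept cell, the Admitted and Rejected proportions sum to 1
cell_props <- prop.table(UCBAdmissions, margin = c(2, 3))

# Take the Admitted slice (Gender x Dept) and transpose it to Dept x Gender
admit_prop <- t(cell_props["Admitted", , ])
round(admit_prop, 3)
```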

Exercise 2.5 The data set UKSoccer in vcd gives the distributions of number of goals scored by the 20 teams in the 1995/96 season of the Premier League of the UK Football Association.

data("UKSoccer", package="vcd")
ftable(UKSoccer)
##      Away  0  1  2  3  4
## Home                    
## 0         27 29 10  8  2
## 1         59 53 14 12  4
## 2         28 32 14 12  4
## 3         19 14  7  4  1
## 4          7  8 10  2  0
View(UKSoccer)

(a) Verify that the total number of games represented in this table is 380.

margin.table(UKSoccer)
## [1] 380

(b) Find the marginal total of the number of goals scored by each of the home and away teams.

addmargins(UKSoccer,  FUN = sum)
## Margins computed over dimensions
## in the following order:
## 1: Home
## 2: Away
##      Away
## Home    0   1   2   3   4 sum
##   0    27  29  10   8   2  76
##   1    59  53  14  12   4 142
##   2    28  32  14  12   4  90
##   3    19  14   7   4   1  45
##   4     7   8  10   2   0  27
##   sum 140 136  55  38  11 380
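If only the two margins are wanted, without the full table, they can be extracted one at a time. A minimal sketch, assuming the vcd package is installed; home_totals and away_totals are illustrative names.

```r
data("UKSoccer", package = "vcd")

# Number of games in which the home team scored 0, 1, 2, 3, 4 goals
home_totals <- margin.table(UKSoccer, margin = 1)

# Number of games in which the away team scored 0, 1, 2, 3, 4 goals
away_totals <- margin.table(UKSoccer, margin = 2)

home_totals
away_totals
```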

(c) Express each of the marginal totals as proportions.

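# Note: prop.table() is applied after addmargins(), so every entry below is
# divided by the sum of the augmented table (4 x 380 = 1520), not by 380;
# this is why the bottom-right cell is 0.25 rather than 1
prop.table (addmargins(UKSoccer,  FUN = sum))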
## Margins computed over dimensions
## in the following order:
## 1: Home
## 2: Away
##      Away
## Home             0            1            2            3            4
##   0   0.0177631579 0.0190789474 0.0065789474 0.0052631579 0.0013157895
##   1   0.0388157895 0.0348684211 0.0092105263 0.0078947368 0.0026315789
##   2   0.0184210526 0.0210526316 0.0092105263 0.0078947368 0.0026315789
##   3   0.0125000000 0.0092105263 0.0046052632 0.0026315789 0.0006578947
##   4   0.0046052632 0.0052631579 0.0065789474 0.0013157895 0.0000000000
##   sum 0.0921052632 0.0894736842 0.0361842105 0.0250000000 0.0072368421
##      Away
## Home           sum
##   0   0.0500000000
##   1   0.0934210526
##   2   0.0592105263
##   3   0.0296052632
##   4   0.0177631579
##   sum 0.2500000000
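Because prop.table() normalizes by the sum of whatever table it receives, one way to get marginal proportions that sum to 1 is to convert to proportions first and add the margins afterwards, or to normalize each margin separately. The sketch below assumes the vcd package is installed; the object names are illustrative.

```r
data("UKSoccer", package = "vcd")

# Marginal proportions for each side: each vector sums to 1
home_prop <- prop.table(margin.table(UKSoccer, margin = 1))
away_prop <- prop.table(margin.table(UKSoccer, margin = 2))

# Joint proportions (cell / 380) with margins added afterwards,
# so the bottom-right cell equals 1
joint_with_margins <- addmargins(prop.table(UKSoccer))

round(home_prop, 3)
round(away_prop, 3)
round(joint_with_margins, 3)
```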

Optional: (d) Comment on the distribution of the numbers of home-team and away-team goals. Is there any evidence that home teams score more goals on average?

library(ggplot2)
library(GGally)
# Long format: one row per (Home, Away) combination with its frequency
a=as.data.frame(UKSoccer)
class(a)
## [1] "data.frame"
# as.data.frame() already names the columns Home, Away, Freq in that order,
# so no renaming is needed (assigning c("Away", "Home") would swap the labels)
GGally::ggpairs(a[,c(1:3)],
                title = "correlation of home team and away team goals",
                diag = list(continuous = 'density'), axisLabels='none')
## Warning in check_and_set_ggpairs_defaults("diag", diag, continuous =
## "densityDiag", : Changing diag$continuous from 'density' to 'densityDiag'
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

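There is evidence that home teams score more goals on average. Using the marginal totals from part (b), home teams scored 565 goals in 380 games (about 1.49 per game), while away teams scored 404 (about 1.06 per game). The away distribution is concentrated at 0 and 1 goals (276 of 380 games), whereas the home distribution is shifted toward higher counts. The median is 1 goal for both sides, so the difference shows up in the mean and the upper tail rather than in the median. Note that the box plots summarize the cell frequencies (Freq) rather than the goal counts themselves, so they give only indirect evidence.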
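The comparison of averages can be checked numerically by weighting each goal count by the number of games in which it occurred. This is a sketch under two assumptions: the vcd package is installed, and the category labels 0 to 4 are treated as exact goal counts. The names home_mean and away_mean are illustrative.

```r
data("UKSoccer", package = "vcd")

# Number of games at each goal count, for each side
home_n <- margin.table(UKSoccer, margin = 1)
away_n <- margin.table(UKSoccer, margin = 2)

# Goal counts taken from the category labels ("0", "1", ..., "4")
home_goals <- as.numeric(names(home_n))
away_goals <- as.numeric(names(away_n))

# Frequency-weighted mean number of goals per game
home_mean <- sum(home_goals * home_n) / sum(home_n)
away_mean <- sum(away_goals * away_n) / sum(away_n)

home_mean
away_mean
```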