I chose the CSV file “Suicide.csv” to work through the problems below. The file contains data on suicides in West Germany, with variables for age, age group, sex, method of suicide, a second, coarser method classification (for example, cookgas and toxicgas are both recorded as gas) and the frequency of suicides.

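# Read the file; stringsAsFactors=FALSE keeps the text columns as character.
suicide <- read.csv(file="Suicide.csv",header=TRUE,sep=",",stringsAsFactors=FALSE)
# Drop the first column, which is not shown in the output below.
suicide <- suicide[-1]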
head(suicide)
##   Freq  sex   method age age.group method2
## 1    4 male   poison  10    20-Oct  poison
## 2    0 male  cookgas  10    20-Oct     gas
## 3    0 male toxicgas  10    20-Oct     gas
## 4  247 male     hang  10    20-Oct    hang
## 5    1 male    drown  10    20-Oct   drown
## 6   17 male      gun  10    20-Oct     gun
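
The `age.group` value `20-Oct` printed above looks like the range `10-20` after a spreadsheet converted it to a date. That is an assumption, since the raw file is not shown here. A minimal sketch of undoing it, using a small stand-in vector whose other labels are illustrative only:

```r
# Stand-in for suicide$age.group. "20-Oct" is ASSUMED to be a
# spreadsheet-mangled "10-20"; the other labels are illustrative.
age_group <- c("20-Oct", "20-Oct", "25-35", "40-50")

# Replace only the mangled label and leave every other value untouched.
age_group[age_group == "20-Oct"] <- "10-20"

print(age_group)
```

If the assumption holds, the same replacement could be applied to `suicide$age.group` directly.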

Problem 1

Summary of data. Display mean and median for at least 2 factors.

summary(suicide)
##       Freq             sex               method               age    
##  Min.   :   0.00   Length:306         Length:306         Min.   :10  
##  1st Qu.:  10.25   Class :character   Class :character   1st Qu.:30  
##  Median :  59.00   Mode  :character   Mode  :character   Median :50  
##  Mean   : 173.80                                         Mean   :50  
##  3rd Qu.: 178.75                                         3rd Qu.:70  
##  Max.   :1381.00                                         Max.   :90  
##   age.group           method2         
##  Length:306         Length:306        
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 
# matrix() fills column by column: the two means form column 1, the two medians column 2.
mm = matrix(c(mean(suicide$Freq),
                mean(suicide$age),
                median(suicide$Freq),
                median(suicide$age)),ncol=2)
colnames(mm)= c("mean", "median")
rownames(mm)= c("Frequency", "Age")
mm = as.table(mm)
mm
##               mean   median
## Frequency 173.7974  59.0000
## Age        50.0000  50.0000
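
The same kind of table can be built without typing each value by hand. The sketch below is one possible alternative, not the method used above. It runs on a toy data frame holding only the six rows printed by `head(suicide)`, so its numbers differ from the full-data table. The names `toy` and `stats` are placeholders.

```r
# Toy data: the Freq and age values of the six rows shown by head(suicide).
toy <- data.frame(Freq = c(4, 0, 0, 247, 1, 17),
                  age  = c(10, 10, 10, 10, 10, 10))

# For each column, return a named vector holding its mean and median.
stats <- sapply(toy[c("Freq", "age")],
                function(x) c(mean = mean(x), median = median(x)))

# Transpose so variables are rows and statistics are columns, as in mm.
stats <- t(stats)
print(stats)
```

Because the row and column labels come from the data and the function, they cannot get out of step with the values.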

Problem 2

Create new data frame (subset of columns and rows). Rename it.

minisuic = subset(suicide, Freq>10 & sex=="female" & method2=="gas" & age<90, select = c(Freq, sex, method2, age))
minisuic
##     Freq    sex method2 age
## 165   11 female     gas  15
## 174   20 female     gas  20
## 183   27 female     gas  25
## 192   29 female     gas  30
## 201   44 female     gas  35
## 210   24 female     gas  40
## 219   24 female     gas  45
## 228   26 female     gas  50
## 237   14 female     gas  55
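
`subset()` is convenient interactively. The same selection can also be written with plain logical indexing, which makes the row filter and the column choice explicit. A minimal sketch on a toy data frame built from a few rows printed in this document (`toy`, `keep`, `by_index` and `by_subset` are placeholder names):

```r
# Toy data: a handful of rows taken from the outputs shown in this document.
toy <- data.frame(
  Freq    = c(4, 247, 11, 20, 44),
  sex     = c("male", "male", "female", "female", "female"),
  method2 = c("poison", "hang", "gas", "gas", "gas"),
  age     = c(10, 10, 15, 20, 35),
  stringsAsFactors = FALSE
)

# Row filter as a logical vector, one TRUE/FALSE per row.
keep <- toy$Freq > 10 & toy$sex == "female" & toy$method2 == "gas" & toy$age < 90

# Rows where keep is TRUE, and only the four named columns.
by_index <- toy[keep, c("Freq", "sex", "method2", "age")]

# The subset() call in the same form as used above, for comparison.
by_subset <- subset(toy, Freq > 10 & sex == "female" & method2 == "gas" & age < 90,
                    select = c(Freq, sex, method2, age))

print(all.equal(by_index, by_subset))
```

One difference to keep in mind: `subset()` drops rows where the condition is `NA`, while logical indexing returns a row of `NA`s for them. The toy data has no missing values, so the two agree here.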

Problem 3

Create new column names for new data frame.

colnames(minisuic) = c("Attempts","M/F","How","Old")
head(minisuic)
##     Attempts    M/F How Old
## 165       11 female gas  15
## 174       20 female gas  20
## 183       27 female gas  25
## 192       29 female gas  30
## 201       44 female gas  35
## 210       24 female gas  40
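
Assigning a whole vector of names depends on the column order, and a name like `M/F` has to be wrapped in backticks every time it is used. A hedged alternative is to rename by old name, using a lookup vector. In the sketch below `toy` and `new_names` are placeholders, and `Sex` is my substitute for `M/F`, not the name used in this report.

```r
# Toy data: the first two rows of minisuic as printed under Problem 2.
toy <- data.frame(Freq = c(11, 20),
                  sex = c("female", "female"),
                  method2 = c("gas", "gas"),
                  age = c(15, 20),
                  stringsAsFactors = FALSE)

# Lookup table: old name on the left, new name on the right.
new_names <- c(Freq = "Attempts", sex = "Sex", method2 = "How", age = "Old")

# Look up each current column name, so column order does not matter.
names(toy) <- unname(new_names[names(toy)])

print(names(toy))
```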

Problem 4

Summary of new data frame. Print mean and median for same factors. Compare.

summary(minisuic)
##     Attempts         M/F                How                 Old    
##  Min.   :11.00   Length:9           Length:9           Min.   :15  
##  1st Qu.:20.00   Class :character   Class :character   1st Qu.:25  
##  Median :24.00   Mode  :character   Mode  :character   Median :35  
##  Mean   :24.33                                         Mean   :35  
##  3rd Qu.:27.00                                         3rd Qu.:45  
##  Max.   :44.00                                         Max.   :55
mm2 = matrix(c(mean(suicide$Freq),
              mean(suicide$age),
              mean(minisuic$Attempts),
              mean(minisuic$Old),
              median(suicide$Freq),
              median(suicide$age),
              median(minisuic$Attempts),
              median(minisuic$Old)),ncol=2)
colnames(mm2)= c("mean", "median")
rownames(mm2)= c("Frequency", "Age","Attempts", "Old")
mm2 = as.table(mm2)
mm2
##                mean    median
## Frequency 173.79739  59.00000
## Age        50.00000  50.00000
## Attempts   24.33333  24.00000
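## Old        35.00000  35.00000

Comparison: in the full data the mean frequency (173.80) is almost three times the median (59), which points to a right-skewed distribution with a few very large counts. In the subset the mean (24.33) and median (24) are nearly equal. Age has the same mean and median in both data frames (50 in the full data, 35 in the subset).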

Problem 5

Rename a value in a column. Do this three times.

minisuic$`M/F`[minisuic$`M/F` == "female"] = "F"
# The condition includes 35, so the label is "35+" rather than ">35".
# Assigning text into a numeric column converts the whole column to character.
minisuic$Old[minisuic$Old >= 35] = "35+"
minisuic$Attempts[minisuic$Attempts >= 20 & minisuic$Attempts<30] = "twenties"
minisuic
##     Attempts M/F How Old
## 165       11   F gas  15
## 174 twenties   F gas  20
## 183 twenties   F gas  25
## 192 twenties   F gas  30
## 201       44   F gas 35+
## 210 twenties   F gas 35+
## 219 twenties   F gas 35+
## 228 twenties   F gas 35+
## 237       14   F gas 35+
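
Writing a text label into a numeric column turns the whole column into character, so `mean()` and `median()` no longer work on it. One way around this, sketched here as an alternative and not as the method used above, is to keep the numeric columns and put the labels in new columns. The column names `OldLabel` and `AttemptsBand` and the band labels are my own choices.

```r
# Toy data: the Attempts and Old values of minisuic as printed under Problem 3/4.
toy <- data.frame(Attempts = c(11, 20, 27, 29, 44, 24, 24, 26, 14),
                  Old      = c(15, 20, 25, 30, 35, 40, 45, 50, 55))

# Label ages of 35 and over; younger ages keep their value as text.
toy$OldLabel <- ifelse(toy$Old >= 35, "35+", as.character(toy$Old))

# Bin attempts into [-Inf, 20), [20, 30) and [30, Inf).
# right = FALSE makes each interval closed on the left, so 20 counts as "twenties".
toy$AttemptsBand <- cut(toy$Attempts,
                        breaks = c(-Inf, 20, 30, Inf),
                        right  = FALSE,
                        labels = c("under 20", "twenties", "30 or more"))

# The original columns are still numeric, so summaries keep working.
print(median(toy$Attempts))
str(toy)
```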