Load libraries
require(data.table)
## Loading required package: data.table
require(ggplot2)
## Loading required package: ggplot2
Load the CSV file
cats <- read.csv(url("https://raw.github.com/dburtsev/CUNYR/master/cats.csv"))
dt <- data.table(cats)
Data table columns:
colnames(dt)
## [1] "X" "Sex" "Bwt" "Hwt"
Number of rows in the table:
dt[,.N]
## [1] 144
Mean and median of body weight (kg) and heart weight (g):
print("mean for Body weight in kg:")
## [1] "mean for Body weight in kg:"
dt[, mean(Bwt)]
## [1] 2.723611
print("median for Body weight in kg:")
## [1] "median for Body weight in kg:"
dt[, median(Bwt)]
## [1] 2.7
print("mean for Heart weight in g:")
## [1] "mean for Heart weight in g:"
dt[, mean(Hwt)]
## [1] 10.63056
print("median for Heart weight in g:")
## [1] "median for Heart weight in g:"
dt[, median(Hwt)]
## [1] 10.1
Conclusion: The mean body weight (2.72 kg) is almost the same as the median (2.7 kg), which suggests that body weights cluster fairly symmetrically around 2.7 kg.
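A mean close to the median indicates a roughly symmetric distribution, but it does not by itself show how tightly the values cluster. A measure of spread is needed for that. The sketch below uses two made-up toy vectors (`narrow` and `wide` are illustrative values, not taken from the cats data, and `centre_and_spread` is a hypothetical helper name) that share the same mean and median yet differ in standard deviation.

```r
# Toy vectors for illustration only (not values from the cats data).
narrow <- c(2.5, 3.0, 3.5)
wide   <- c(1.0, 3.0, 5.0)

# Summarise centre and spread of a numeric vector.
centre_and_spread <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x), min = min(x), max = max(x))
}

narrow_summary <- centre_and_spread(narrow)
wide_summary   <- centre_and_spread(wide)

# Both vectors have mean == median, but their spread differs.
print(rbind(narrow = narrow_summary, wide = wide_summary))
```

Applied to the data above, `centre_and_spread(dt$Bwt)` would report the standard deviation and range alongside the mean and median.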
Data transformations
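# Keep only cats with a body weight of exactly 3 kg (compare as a number, not a string)
dt3 <- dt[Bwt == 3]
# Rename the weight columns
setnames(dt3, c("Bwt", "Hwt"), c("BodyWeight", "HeartWeight"))
# Overwrite Sex for every row in the subset
dt3[, Sex := "F"]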
dt3
##       X Sex BodyWeight HeartWeight
##  1:  46   F          3        10.6
##  2:  47   F          3        13.0
##  3: 100   F          3        10.0
##  4: 101   F          3        10.4
##  5: 102   F          3        10.6
##  6: 103   F          3        11.6
##  7: 104   F          3        12.2
##  8: 105   F          3        12.4
##  9: 106   F          3        12.7
## 10: 107   F          3        13.3
## 11: 108   F          3        13.8
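The same kind of result can be sketched as a single expression that filters in `i` and builds the renamed columns in `j`. This is an alternative formulation under stated assumptions, not the approach used above: `toy` is a hypothetical four-row table with made-up values standing in for `dt`.

```r
library(data.table)

# Hypothetical stand-in for dt; the values are made up for illustration.
toy <- data.table(
  X   = 1:4,
  Sex = c("F", "M", "M", "F"),
  Bwt = c(2.0, 3.0, 3.0, 3.5),
  Hwt = c(7.0, 10.0, 12.2, 13.0)
)

# Filter rows in i, then build the renamed columns in j.
# Nothing is modified by reference, so toy keeps its original column names.
toy3 <- toy[Bwt == 3, .(X, Sex = "F", BodyWeight = Bwt, HeartWeight = Hwt)]

print(toy3)
print(names(toy))
```

Because no `:=` or `setnames` call is involved, the source table is left untouched, which can make the transformation easier to rerun.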
Graphics
histogram:
hist(dt$Bwt, main = "Cats body weight in kg", xlab = "Weight")
scatter plot:
ggplot(dt, aes(x = Bwt, y = Hwt)) + geom_point()
violin plot:
ggplot(dt, aes(y = Bwt, x = Hwt)) + geom_violin()
Analysis: Heavier cats were expected to have larger hearts, but I found a strange anomaly for cats with a heart weight of about 11 g: their body weight is very diverse, ranging from 2.2 to 3.7 kg. Overall, the minimum body weight is 2 kg and the maximum is 3.9 kg.
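The relationship described in the analysis can also be checked numerically rather than by eye. The sketch below assumes a data frame with `Bwt` and `Hwt` columns like the one loaded above. `relationship_summary` and `bwt_range_near` are hypothetical helper names, and `toy` holds made-up values lying on an exact line so that the result is easy to verify.

```r
# Made-up values for illustration: heart weight is exactly 4 * body weight.
toy <- data.frame(Bwt = c(2.0, 2.5, 3.0, 3.5))
toy$Hwt <- 4 * toy$Bwt

# Correlation and least-squares slope of heart weight on body weight.
relationship_summary <- function(df) {
  fit <- lm(Hwt ~ Bwt, data = df)
  c(correlation = cor(df$Bwt, df$Hwt), slope = unname(coef(fit)["Bwt"]))
}

# Range of body weight among rows whose heart weight lies within
# `tol` grams of `centre`.
bwt_range_near <- function(df, centre, tol) {
  near <- df[abs(df$Hwt - centre) <= tol, ]
  range(near$Bwt)
}

toy_summary <- relationship_summary(toy)
toy_range   <- bwt_range_near(toy, centre = 11, tol = 1.5)

print(toy_summary)
print(toy_range)
```

On the real data, `relationship_summary(cats)` and `bwt_range_near(cats, centre = 11, tol = 0.5)` would give figures to compare against the plots. The tolerance of 0.5 g is an arbitrary choice.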