Find an interesting dataset and prepare a short report (in R Markdown) consisting of:

* a short description of the dataset,
* 3 scatterplots that present interesting relationships between variables,
* brief comments describing the results obtained.

Then edit the theme and all scales of the graphs to prepare publication-ready plots.

Introduction

Source: Library

Data Variables

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
boston <- read.csv('/Users/jeank4723/Desktop/Advance VR/1/Data/boston.csv', header = TRUE, dec = ',', sep = ';')

head(boston)
##      CRIM ZN INDUS CHAS   NOX    RM  AGE    DIS RAD TAX PTRATIO      B LSTAT
## 1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3 396.90  4.98
## 2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8 396.90  9.14
## 3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8 392.83  4.03
## 4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7 394.63  2.94
## 5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7 396.90  5.33
## 6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7 394.12  5.21
##   MEDV
## 1 24.0
## 2 21.6
## 3 34.7
## 4 33.4
## 5 36.2
## 6 28.7
str(boston)
## 'data.frame':    506 obs. of  14 variables:
##  $ CRIM   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
##  $ ZN     : num  18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
##  $ INDUS  : num  2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
##  $ CHAS   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ NOX    : num  0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
##  $ RM     : num  6.58 6.42 7.18 7 7.15 ...
##  $ AGE    : num  65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
##  $ DIS    : num  4.09 4.97 4.97 6.06 6.06 ...
##  $ RAD    : int  1 2 2 3 3 3 5 5 5 5 ...
##  $ TAX    : int  296 242 242 222 222 222 311 311 311 311 ...
##  $ PTRATIO: num  15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
##  $ B      : num  397 397 393 395 397 ...
##  $ LSTAT  : num  4.98 9.14 4.03 2.94 5.33 ...
##  $ MEDV   : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
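
The `sep = ';'` and `dec = ','` arguments matter because the file apparently uses semicolons between fields and commas as decimal marks. The following is a minimal sketch of that behaviour, assuming an inline two-row sample (the name `sample_csv` is hypothetical, and the values are copied from the first two rows printed above) rather than the real `boston.csv`:

```r
# Hypothetical inline sample in the assumed layout of boston.csv:
# semicolon-separated fields, comma as the decimal mark.
sample_csv <- "CRIM;RM;MEDV\n0,00632;6,575;24,0\n0,02731;6,421;21,6"

# With dec = "," the columns are parsed as numbers.
sample_df <- read.csv(text = sample_csv, header = TRUE, sep = ";", dec = ",")
str(sample_df)

# Without dec = "," the same columns are not numeric,
# which would break any later plotting or arithmetic.
wrong_df <- read.csv(text = sample_csv, header = TRUE, sep = ";")
sapply(wrong_df, class)
```

Checking `str()` right after import, as done above, is a quick way to catch this kind of parsing problem.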

3 Scatterplots

  1. TAX and CRIM. Most towns have a low per-capita crime rate. However, when the full-value property-tax rate per $10,000 exceeds 650, the per-capita crime rate rises sharply, reaching values above 75 in some towns.
p1 <- ggplot(data = boston, aes(x = TAX,
                           y = CRIM
                           ))
p1 + 
  geom_point() +
  theme_classic()
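
The threshold reading of the TAX and CRIM plot can be checked numerically. This is a rough sketch, assuming the `Boston` data frame from the `MASS` package is equivalent to `boston.csv` (its column names are lowercase, so they are upper-cased here); `boston_demo`, `high_tax`, and `crim_summary` are names introduced only for this illustration:

```r
# Assumption: MASS::Boston stands in for the report's boston.csv.
boston_demo <- MASS::Boston
names(boston_demo) <- toupper(names(boston_demo))  # match the report's column names

# Split towns at the TAX = 650 threshold mentioned in the comment above.
high_tax <- boston_demo$TAX > 650

# Compare group size, median, and maximum crime rate on each side of the threshold.
crim_summary <- data.frame(
  group       = c("TAX <= 650", "TAX > 650"),
  n           = as.vector(table(high_tax)),
  median_crim = as.vector(tapply(boston_demo$CRIM, high_tax, median)),
  max_crim    = as.vector(tapply(boston_demo$CRIM, high_tax, max))
)
crim_summary
```

A much larger median in the `TAX > 650` row would support the claim that crime rates jump for high-tax towns, although it says nothing about why.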

2. LSTAT and CRIM. According to the plot, the crime rate tends to increase with the percentage of lower-status population, although the relationship is weak. Once the lower-status share exceeds 10%, the per-capita crime rate by town rises rapidly.

p2 <- ggplot(data = boston, aes(x = LSTAT,
                           y = CRIM,
                           color = TAX
                           ))
p2 + geom_point()
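
Because CRIM is strongly right-skewed, one possible way to move this plot toward a publication-ready version is to put the y-axis on a log scale and label every scale explicitly. The sketch below is only one option, not the report's final figure; it assumes `MASS::Boston` as a stand-in for `boston.csv`, and the names `boston_demo` and `p2_pub` as well as the label wording are illustrative choices:

```r
library(ggplot2)

# Assumption: MASS::Boston stands in for the report's boston.csv.
boston_demo <- MASS::Boston
names(boston_demo) <- toupper(names(boston_demo))  # match the report's column names

p2_pub <- ggplot(boston_demo, aes(x = LSTAT, y = CRIM, color = TAX)) +
  geom_point(alpha = 0.7) +
  scale_y_log10() +            # spread out the many near-zero crime rates
  scale_color_viridis_c() +    # perceptually uniform continuous palette
  labs(x = "Lower-status population (%)",
       y = "Per-capita crime rate (log scale)",
       color = "Property-tax rate\nper $10,000",
       title = "Crime rate versus lower-status population") +
  theme_classic(base_size = 12)

p2_pub
```

The log scale is a design choice: it makes the low-crime towns distinguishable, at the cost of making the steep rise above 10% look less dramatic than on the raw scale.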

3. TAX and RM. The plot shows that the average number of rooms per dwelling and the full-value property-tax rate per $10,000 are only weakly related. In other words, a high number of rooms does not imply a high property-tax rate.

library(RColorBrewer)
p3 <- ggplot(data = boston, aes(x = TAX,
                           y = RM,
                           color = ZN
                           ))
p3 + 
  geom_point() +
  labs(x = "Full-value property-tax rate per $10,000",
       y = "Average number of rooms per dwelling",
       title = "Property-tax rate versus average number of rooms") +
  theme(axis.title = element_text(color = "black"), 
    panel.background = element_rect(fill = "gray")) +
  guides(color = guide_legend(label.theme = element_text(size = 10,
                                                         colour = "brown",
                                                         angle = 0),
                              label.position = "left"))
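
The "weakly related" reading of the TAX and RM plot can be backed with a correlation coefficient. This is a small sketch under the assumption that `MASS::Boston` matches `boston.csv`; `boston_demo`, `rm_tax_pearson`, and `rm_tax_spearman` are names introduced here for illustration:

```r
# Assumption: MASS::Boston stands in for the report's boston.csv.
boston_demo <- MASS::Boston
names(boston_demo) <- toupper(names(boston_demo))  # match the report's column names

# Pearson measures linear association; Spearman is rank-based and
# less sensitive to the cluster of towns with very high TAX values.
rm_tax_pearson  <- cor(boston_demo$RM, boston_demo$TAX)
rm_tax_spearman <- cor(boston_demo$RM, boston_demo$TAX, method = "spearman")

round(c(pearson = rm_tax_pearson, spearman = rm_tax_spearman), 2)
```

Coefficients close to zero would be consistent with the comment above; values near -1 or 1 would contradict it.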