Author: 带土; Daitu; Adam. Email:

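This chapter collects a set of auxiliary figures drawn to illustrate what data visualization is useful for.

Helping with data cleaning

1. Visualization lets a researcher quickly check for missing values and see how the data are distributed, which speeds up data cleaning.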

## Loading required package: colorspace
## Loading required package: grid
## Loading required package: data.table
## VIM is ready to use. 
##  Since version 4.0.0 the GUI is in its own package VIMGUI.
## 
##           Please use the package to use the new (and old) GUI.
## Suggestions and bug-reports can be submitted at: https://github.com/alexkowa/VIM/issues
## 
## Attaching package: 'VIM'
## The following object is masked from 'package:datasets':
## 
##     sleep
##    BodyWgt BrainWgt NonD Dream Sleep Span Gest Pred Exp Danger
## 1 6654.000   5712.0   NA    NA   3.3 38.6  645    3   5      3
## 2    1.000      6.6  6.3   2.0   8.3  4.5   42    3   1      3
## 3    3.385     44.5   NA    NA  12.5 14.0   60    1   1      1
## 4    0.920      5.7   NA    NA  16.5   NA   25    5   2      3
## 5 2547.000   4603.0  2.1   1.8   3.9 69.0  624    3   5      4
## 6   10.550    179.5  9.1   0.7   9.8 27.0  180    4   4      4
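
The missing-value check in point 1 can be sketched in base R alone. The original code chunk is not shown, so this is only an illustration: the six rows printed above are re-entered by hand so the block runs on its own, and the names `sleep_head`, `na_per_column` and `n_complete` are made up for this example.

```r
# The six rows of the `sleep` data printed above, re-entered by hand.
# Assumption: this is only the head of the data; the full set ships with VIM.
sleep_head <- data.frame(
  BodyWgt  = c(6654, 1, 3.385, 0.92, 2547, 10.55),
  BrainWgt = c(5712, 6.6, 44.5, 5.7, 4603, 179.5),
  NonD     = c(NA, 6.3, NA, NA, 2.1, 9.1),
  Dream    = c(NA, 2.0, NA, NA, 1.8, 0.7),
  Sleep    = c(3.3, 8.3, 12.5, 16.5, 3.9, 9.8),
  Span     = c(38.6, 4.5, 14.0, NA, 69.0, 27.0),
  Gest     = c(645, 42, 60, 25, 624, 180),
  Pred     = c(3, 3, 1, 5, 3, 4),
  Exp      = c(5, 1, 1, 2, 5, 4),
  Danger   = c(3, 3, 1, 3, 4, 4)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(sleep_head))

# Number of rows without any missing value
n_complete <- sum(complete.cases(sleep_head))

# Bar chart of the missing counts, one bar per variable
barplot(na_per_column, las = 2,
        ylab = "Number of missing values",
        main = "Missing values per variable")
```

With the VIM package loaded, its `aggr()` function draws a comparable summary, including the combinations of missing variables, for the full `sleep` data.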

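Using linear regression to illustrate the role of data visualization

2. Visualizing the data shows whether it contains outliers and helps in choosing an appropriate model.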

##   x1   y1 x2   y2 x3    y3 x4   y4
## 1 10 8.04 10 9.14 10  7.46  8 6.58
## 2  8 6.95  8 8.14  8  6.77  8 5.76
## 3 13 7.58 13 8.74 13 12.74  8 7.71
## 4  9 8.81  9 8.77  9  7.11  8 8.84
## 5 11 8.33 11 9.26 11  7.81  8 8.47
## 6 14 9.96 14 8.10 14  8.84  8 7.04
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
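
The point about the `anscombe` data can be made concrete with a short base R sketch. It fits one straight line per data set and then plots each set with its line. The original figure appears to have been drawn with `geom_smooth()`; base graphics are swapped in here so the block has no package dependencies, and the names `fits` and `coef_table` are illustrative.

```r
# Fit y_i ~ x_i separately for each of the four Anscombe data sets
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = anscombe)
})

# Collect intercept and slope, one row per data set
coef_table <- t(sapply(fits, coef))
dimnames(coef_table) <- list(paste0("set", 1:4), c("intercept", "slope"))
print(round(coef_table, 2))

# Scatter plot of each data set with its fitted regression line
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i), pch = 19)
  abline(fits[[i]], col = "red")
}
par(op)
```

The four fitted lines are nearly identical, yet the scatter plots look completely different: one set is roughly linear, one is curved, one has a single outlier, and one is driven by a single high-leverage point. Only the plots reveal this.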

3. Visualization helps in studying the relationships between features, for example through their correlation coefficients.

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

##        mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
## mpg   1.00 -0.85 -0.85 -0.78  0.68 -0.87  0.42  0.66  0.60  0.48 -0.55
## cyl  -0.85  1.00  0.90  0.83 -0.70  0.78 -0.59 -0.81 -0.52 -0.49  0.53
## disp -0.85  0.90  1.00  0.79 -0.71  0.89 -0.43 -0.71 -0.59 -0.56  0.39
## hp   -0.78  0.83  0.79  1.00 -0.45  0.66 -0.71 -0.72 -0.24 -0.13  0.75
## drat  0.68 -0.70 -0.71 -0.45  1.00 -0.71  0.09  0.44  0.71  0.70 -0.09
## wt   -0.87  0.78  0.89  0.66 -0.71  1.00 -0.17 -0.55 -0.69 -0.58  0.43
## qsec  0.42 -0.59 -0.43 -0.71  0.09 -0.17  1.00  0.74 -0.23 -0.21 -0.66
## vs    0.66 -0.81 -0.71 -0.72  0.44 -0.55  0.74  1.00  0.17  0.21 -0.57
## am    0.60 -0.52 -0.59 -0.24  0.71 -0.69 -0.23  0.17  1.00  0.79  0.06
## gear  0.48 -0.49 -0.56 -0.13  0.70 -0.58 -0.21  0.21  0.79  1.00  0.27
## carb -0.55  0.53  0.39  0.75 -0.09  0.43 -0.66 -0.57  0.06  0.27  1.00
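
A correlation matrix like the one above is easier to read as a heat map. The sketch below uses base R only; the original plotting code is not shown, so the choice of `image()` and the blue-white-red palette are assumptions for illustration, and `corr_mat` is an illustrative name.

```r
# Pairwise Pearson correlations, rounded as in the matrix above
corr_mat <- round(cor(mtcars), 2)

# Diverging palette: blue for negative, white near zero, red for positive
pal <- colorRampPalette(c("blue", "white", "red"))(21)

# image() places corr_mat[i, j] at position (x = i, y = j)
n <- ncol(corr_mat)
image(1:n, 1:n, corr_mat, zlim = c(-1, 1), col = pal,
      axes = FALSE, xlab = "", ylab = "",
      main = "Correlation matrix of mtcars")
axis(1, at = 1:n, labels = rownames(corr_mat), las = 2)
axis(2, at = 1:n, labels = colnames(corr_mat), las = 2)

# Print each coefficient inside its cell
text(row(corr_mat), col(corr_mat),
     labels = sprintf("%.2f", corr_mat), cex = 0.6)
```

Strongly coloured cells such as `mpg` against `wt` stand out at a glance, which is harder to see in the printed table.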

5. Visualization helps in judging whether a fitted analysis model is appropriate and whether it could still be improved.

## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
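
One way to judge a model visually is an elbow plot for clustering. The original code is not shown, so this sketch assumes, from the `factoextra` start-up message and the `iris` data, that a clustering model is being assessed; it uses base R `kmeans()` rather than any `factoextra` function, and the names `iris_scaled`, `wss` and `km3` are illustrative.

```r
set.seed(123)  # k-means starts from random centres

# Standardise the four numeric measurements; Species is left out
iris_scaled <- scale(iris[, 1:4])

# Total within-cluster sum of squares for k = 1..8
ks  <- 1:8
wss <- sapply(ks, function(k) {
  kmeans(iris_scaled, centers = k, nstart = 25)$tot.withinss
})

# Elbow plot: look for the k after which the curve flattens
plot(ks, wss, type = "b", pch = 19,
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")

# Compare a 3-cluster solution with the known species labels
km3 <- kmeans(iris_scaled, centers = 3, nstart = 25)
print(table(cluster = km3$cluster, species = iris$Species))
```

If the curve keeps dropping steeply past the chosen k, the model can probably be improved by adding clusters; if it has already flattened, extra clusters add little.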

Visualizing how well a time series model fits, in order to choose a suitable model

## Loading required package: Rcpp
## Loading required package: rlang
## 
## Attaching package: 'rlang'
## The following object is masked from 'package:data.table':
## 
##     :=
## Loading required package: timeDate
## 
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
## 
##     time<-
##     y      ds
## 1 112  1 1949
## 2 118  2 1949
## 3 132  3 1949
## 4 129  4 1949
## 5 121  5 1949
## 6 135  6 1949
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
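
The idea of checking a time series fit visually can be sketched without extra packages. The values printed above match the built-in `AirPassengers` series, and the messages indicate that the original used `prophet`; the sketch below swaps in a seasonal ARIMA model from base R's `stats::arima()` instead, so it illustrates the technique and does not reproduce the original model. The names `fit` and `fitted_vals` are illustrative.

```r
# AirPassengers: monthly airline passenger totals, shipped with base R
y <- AirPassengers

# Seasonal ARIMA(0,1,1)(0,1,1)[12] on the log scale.
# Assumption: this model is swapped in for prophet to avoid a dependency.
fit <- arima(log(y), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# One-step-ahead fitted values, transformed back to the original scale
fitted_vals <- exp(log(y) - residuals(fit))

# Overlay observed and fitted series to inspect the fit
plot(y, ylab = "Passengers", main = "Observed vs fitted")
lines(fitted_vals, col = "red", lty = 2)
legend("topleft", legend = c("observed", "fitted"),
       col = c("black", "red"), lty = c(1, 2), bty = "n")
```

Drawing several candidate models this way on the same axes makes it easy to see which one tracks the trend and the seasonal peaks most closely.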

6. Visualizing the results of a model makes the model easier to interpret and understand.

##   pclass survived    sex     age sibsp parch
## 1    1st survived female 29.0000     0     0
## 2    1st survived   male  0.9167     1     2
## 3    1st     died female  2.0000     1     2
## 4    1st     died   male 30.0000     1     2
## 5    1st     died female 25.0000     1     2
## 6    1st survived   male 48.0000     0     0

## n= 1309 
## 
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
## 
##   1) root 1309 500 died (0.6180290 0.3819710)  
##     2) sex=male 843 161 died (0.8090154 0.1909846)  
##       4) age>=9.5 796 136 died (0.8291457 0.1708543) *
##       5) age< 9.5 47  22 survived (0.4680851 0.5319149)  
##        10) sibsp>=2.5 20   1 died (0.9500000 0.0500000) *
##        11) sibsp< 2.5 27   3 survived (0.1111111 0.8888889) *
##     3) sex=female 466 127 survived (0.2725322 0.7274678)  
##       6) pclass=3rd 216 106 died (0.5092593 0.4907407)  
##        12) sibsp>=2.5 21   3 died (0.8571429 0.1428571) *
##        13) sibsp< 2.5 195  92 survived (0.4717949 0.5282051)  
##          26) age>=16.5 162  79 died (0.5123457 0.4876543)  
##            52) parch>=3.5 9   1 died (0.8888889 0.1111111) *
##            53) parch< 3.5 153  75 survived (0.4901961 0.5098039)  
##             106) age>=27.5 44  17 died (0.6136364 0.3863636) *
##             107) age< 27.5 109  48 survived (0.4403670 0.5596330)  
##               214) age< 21.5 28  11 died (0.6071429 0.3928571) *
##               215) age>=21.5 81  31 survived (0.3827160 0.6172840) *
##          27) age< 16.5 33   9 survived (0.2727273 0.7272727) *
##       7) pclass=1st,2nd 250  17 survived (0.0680000 0.9320000) *
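
A printed tree like the one above is hard to follow; drawn as a diagram it is much easier to read. The sketch below assumes that the data shown are the `ptitanic` data from the `rpart.plot` package (the columns and the 1309 rows match) and that the package is installed. The original tuning parameters are not shown, so the default settings used here will not necessarily reproduce the exact tree printed above.

```r
library(rpart)  # recommended package shipped with R

# Assumption: the data above are rpart.plot's ptitanic data set
data(ptitanic, package = "rpart.plot")

# Classification tree for survival, with rpart's default settings
fit <- rpart(survived ~ ., data = ptitanic, method = "class")

# Draw the tree; each node shows the predicted class and its share
rpart.plot::rpart.plot(fit, main = "Survival on the Titanic")
```

In the diagram, the first split on `sex` and the later splits on `age`, `pclass` and `sibsp` can be read from top to bottom as a set of decision rules.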
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2       1
## 2          4.9         3.0          1.4         0.2       1
## 3          4.7         3.2          1.3         0.2       1
## 4          4.6         3.1          1.5         0.2       1
## 5          5.0         3.6          1.4         0.2       1
## 6          5.4         3.9          1.7         0.4       1