Lesson 13: Plotting with ggplot

lidong

01/06/2021

The author and principles of ggplot

Hadley Wickham, the author of ggplot2

Click here for a concise walkthrough of this book

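Hadley's summary of a ggplot2 graphic: "A plot is a mapping from data to the aesthetic attributes (aes for short, including colour, shape, and size) of geometric objects (geom for short, including points, lines, and bars). In addition, a plot may contain statistical transformations of the data (stats for short), and it is drawn in a specific coordinate system (coord for short). Faceting (facet, which splits the plotting window into several sub-windows) can be used to produce plots of different subsets of the data."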
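As a rough illustration of the definition above, the sketch below labels each component in a single plot built from the built-in mtcars data. The choice of variables, the linear smoother, and the axis limits are assumptions made only for this example.

```r
library(ggplot2)

# data + aesthetic mapping: wt -> x, mpg -> y, cyl (as a factor) -> colour
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +                                             # geom: points
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # stat: fitted line per group
  coord_cartesian(xlim = c(1, 6)) +                          # coord: Cartesian system
  facet_wrap(~ am)                                           # facet: one panel per value of am

print(p)
```

Removing any one of the added components still leaves a valid plot, which is the practical meaning of building a graphic from layers.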

Features of ggplot

> library(ggplot2)
> ggplot(mtcars,aes(x=wt, y=mpg))+geom_point()

Basic ggplot syntax

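Data (variables) are mapped to the aesthetic attributes (aes: colour, shape, size, and so on) of geometric objects (geom: points, lines, areas, and so on). A plot can additionally involve statistical transformations of the data (stats), a specific coordinate system (coord), and facets (facet).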

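 – geom_point() draws a scatter plot

 – geom_bar() draws a bar chart

 – geom_line() draws a line plot

 – geom_histogram() draws a histogram

 – geom_boxplot() draws a box plot

 – geom_density() draws a probability density curve

 – There are roughly 40 geom functions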
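To make the list above concrete, here is a minimal sketch that attaches each listed geom to a suitable mapping of the mtcars data. The variable choices and `bins = 10` are assumptions for illustration only.

```r
library(ggplot2)

# Two reusable base plots: one with x and y, one with x only
p_xy <- ggplot(mtcars, aes(x = wt, y = mpg))
p_x  <- ggplot(mtcars, aes(x = mpg))

scatter   <- p_xy + geom_point()              # scatter plot
lines     <- p_xy + geom_line()               # line plot, points joined in order of x
histogram <- p_x + geom_histogram(bins = 10)  # histogram of a continuous variable
density   <- p_x + geom_density()             # smoothed density curve

# Discrete x axis: counts per category, and the distribution of mpg per category
bars  <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
boxes <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
```

Printing any of these objects, for example `print(boxes)`, draws the corresponding plot.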

> ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, size=cyl)) +
+       geom_point() 

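> # The plot above can be broken into steps
> p <-ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, size=cyl))
> p # produces only the base plot and frame, with no layers

> p+geom_point() # add the scatter-plot layer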

> # geom_point() is really just a layer wrapper function; behind it sits layer()
> p+layer(
+   mapping = NULL, 
+   data = NULL,
+   geom = "point", 
+   stat = "identity",
+   position = "identity"
+ )
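One way to check the claim that geom_point() wraps layer() is to compare the data each version computes for drawing. The sketch below uses `layer_data()`, a ggplot2 helper that returns the data frame a layer is drawn from. The object names are hypothetical.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg))

wrapped  <- p + geom_point()
explicit <- p + layer(geom = "point", stat = "identity", position = "identity")

# If geom_point() only fills in defaults for layer(), both plots
# should be drawn from the same computed data.
same <- isTRUE(all.equal(layer_data(wrapped), layer_data(explicit)))
print(same)
```

The wrapper is still the better choice in everyday code, because it supplies sensible defaults for `stat` and `position`.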

Components of a layer()

Other plot-adjustment components of ggplot

ggplot in practice



```{.r .watch-out}
> ggplot(mtcars, aes(x=wt, y=mpg, color=am))+geom_point() 
```


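 In the code above, what changes if color is mapped to as.factor(am) instead?

Compare the colors produced by the following plots and explain the differences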
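A minimal sketch for exploring this question: am is stored as a number, so ggplot2 treats it as continuous unless it is converted to a factor. The object names are hypothetical, and `layer_data()` is used only to inspect the colours assigned to the points.

```r
library(ggplot2)

# am is numeric (0/1), so color = am is mapped onto a continuous gradient
p_num <- ggplot(mtcars, aes(x = wt, y = mpg, color = am)) + geom_point()

# as.factor(am) is categorical, so each level gets its own distinct hue
p_fac <- ggplot(mtcars, aes(x = wt, y = mpg, color = as.factor(am))) + geom_point()

# Colours actually assigned to the points in each plot
cols_num <- unique(layer_data(p_num)$colour)
cols_fac <- unique(layer_data(p_fac)$colour)
```

Printing both plots should also show the difference in the legend: a colour bar for the numeric version and two discrete keys for the factor version.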

> ggplot(mtcars, aes(x=wt, y=mpg, color="blue"))+geom_point()

> ggplot(mtcars, aes(x=wt, y=mpg ),color="blue")+geom_point()

> ggplot(mtcars, aes(x=wt, y=mpg ))+geom_point(color="blue")

> ggplot(mtcars, aes(x=wt, y=mpg ))+geom_point(aes(color="blue"))

> ggplot(mtcars, aes(x=wt, y=mpg ))+geom_point(aes(color="blue"))+scale_colour_identity()
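The comparison above comes down to setting an aesthetic versus mapping it. The sketch below inspects the colour each variant ends up using. The object names are hypothetical, and `layer_data()` is used only for inspection.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Outside aes(): "blue" is a constant set directly on the layer
p_set <- base + geom_point(color = "blue")

# Inside aes(): "blue" is treated as a data value and passed through a colour scale
p_map <- base + geom_point(aes(color = "blue"))

# The identity scale tells ggplot2 to use the data value itself as the colour
p_identity <- base + geom_point(aes(color = "blue")) + scale_colour_identity()

col_set      <- unique(layer_data(p_set)$colour)
col_map      <- unique(layer_data(p_map)$colour)
col_identity <- unique(layer_data(p_identity)$colour)
```

Here `col_set` and `col_identity` should both be "blue", while `col_map` should be whatever the default discrete palette assigns to its first level. In the variant that passes color="blue" to ggplot() itself, the argument does not reach any layer, so the points should keep their default colour.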

> ggplot(data=iris,aes(x=Species, y=Sepal.Length)) + geom_boxplot(aes(fill=Species)) 

> ggplot(data=iris, aes(x=Sepal.Width)) + geom_histogram(binwidth=0.2, color="black", aes(fill=Species)) 

> ggplot(data=iris, aes(x=Species)) + geom_bar() 

> ggplot(data=iris, aes(x=Sepal.Length,y=Sepal.Width,group=Species,color=Species)) + geom_line(aes(linetype=Species)) 

> ggplot(data=iris, aes(x=Sepal.Length,y=Sepal.Width,group=Species,color=Species)) +  geom_line(aes(linetype=Species), size = 1.2) +
+   geom_point(aes(shape=Species), size = 3) +        
+   scale_shape_manual(values=c(6, 5, 4)) +               
+   scale_linetype_manual(values=c("dotdash", "solid", "dotted")) +
+   xlab("Sepal Length") + ylab("Sepal Width") + ggtitle("Line plot of sepal length and width")

> ggplot(data=iris, aes(x=Sepal.Length,y=Sepal.Width,color=Species)) +  
+   geom_point(aes(shape=Species), size = 3) +
+   facet_wrap(~Species)

The two faceting functions of ggplot(): facet_wrap() and facet_grid()
The difference between the two functions:
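As a hedged sketch of the distinction: facet_wrap() lays out the panels for (usually) one variable as a ribbon that wraps onto new rows, while facet_grid() builds a matrix of panels with one variable defining the rows and another the columns. The mtcars variables and `ncol = 2` are assumptions chosen for the example.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# facet_wrap(): one panel per value of cyl, wrapped into 2 columns
p_wrap <- base + facet_wrap(~ cyl, ncol = 2)

# facet_grid(): rows from am, columns from cyl
p_grid <- base + facet_grid(am ~ cyl)

# Number of panels that receive data in each layout
n_wrap <- length(unique(layer_data(p_wrap)$PANEL))
n_grid <- length(unique(layer_data(p_grid)$PANEL))
```

With these data, the wrapped layout has one panel per cyl value, and the grid has one panel per am and cyl combination.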

Statistical functions and plotting in ggplot

stat_summary() operates on unique x or y; stat_summary_bin() operates on binned x or y. They are more flexible versions of stat_bin(): instead of just counting, they can compute any aggregate.

> ggplot(data=iris, aes(x=Species,y=Sepal.Width,color=Species)) +  
+   geom_point(size = 3) 

> ggplot(data=iris, aes(x=Species,y=Sepal.Width,color=Species)) +  
+   geom_point(position="jitter", size = 3) 

> ggplot(data=iris, aes(x=Species,y=Sepal.Width,color=Species)) +  
+   geom_point(position="jitter", size = 3)+
+   stat_summary(
+     fun="mean",
+     geom='errorbar', 
+     aes(ymin=..y.., ymax=..y..), 
+     width=0.6, 
+     size=1.5,
+     colour="grey25"
+   ) 
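To show that stat_summary() can compute any aggregate, the sketch below swaps the mean for the median and compares the values the stat computes with the same aggregate from base R. It assumes ggplot2 3.3.0 or later, where the summary function is passed as `fun`. The object names are hypothetical.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  stat_summary(fun = median, geom = "point", size = 4)

# y values computed by the stat, one row per species
computed <- layer_data(p)
computed <- computed[order(computed$x), ]

# The same aggregate computed directly in base R
expected <- tapply(iris$Sepal.Width, iris$Species, median)

print(data.frame(species = names(expected),
                 from_stat = computed$y,
                 from_tapply = as.numeric(expected)))
```

The two value columns should agree, which confirms that the stat simply applies the supplied function within each x group.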

Visualise the distribution of a single continuous variable by dividing the x axis into bins and counting the number of observations in each bin. Histograms (geom_histogram) display the count with bars.

>  ggplot(data=iris, aes(x=Sepal.Width))+ geom_histogram(binwidth=0.2, color="black", fill="blue", aes(y=..density..)) # the paired dots (..) are ggplot notation: density is not a column in the data but a variable computed by the stat
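A short sketch of what the computed density variable means: with density on the y axis, each bar's area is its share of the observations, so the areas should sum to 1. The example uses `after_stat(density)`, the spelling that ggplot2 3.3.0 and later provide for the double-dot notation, and it assumes such a version is installed.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Width)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.2,
                 color = "black", fill = "blue")

# One row per bin, including the computed count and density
bins <- layer_data(p)

# Bar area = height (density) * width; the areas should add up to 1
total_area <- sum(bins$density * (bins$xmax - bins$xmin))
print(total_area)
```

This scaling is what makes a density histogram directly comparable with a kernel density curve drawn on the same axes.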

Computes and draws a kernel density estimate, which is a smoothed version of the histogram.

>  ggplot(data=iris, aes(x=Sepal.Width, fill=Species))+ geom_density(stat="density", alpha=I(0.2))

Aids the eye in seeing patterns in the presence of overplotting. geom_smooth() and stat_smooth() are effectively aliases: they both use the same arguments. Use stat_smooth() if you want to display the results with a non-standard geom.

>  ggplot(data=iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +   geom_point(aes(shape=Species), size=1.5) +geom_smooth(method="lm")
## `geom_smooth()` using formula 'y ~ x'

> ggplot(data=iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +   geom_point(aes(shape=Species), size=1.5) +geom_smooth(method="loess")
## `geom_smooth()` using formula 'y ~ x'
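To see what method="lm" draws, the sketch below compares the line geom_smooth() computes for one species with a linear model fitted by hand. Passing `formula = y ~ x` explicitly is an addition here that avoids the message shown above. The object names are hypothetical, and group 1 is assumed to be setosa because it is the first factor level of Species.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Layer 2 is the smoother; keep the fitted line for group 1 (setosa)
smooth <- layer_data(p, 2)
setosa_line <- smooth[smooth$group == 1, ]

# The same model fitted directly with lm()
fit <- lm(Sepal.Width ~ Sepal.Length, data = iris[iris$Species == "setosa", ])
by_hand <- predict(fit, newdata = data.frame(Sepal.Length = setosa_line$x))

print(isTRUE(all.equal(as.numeric(setosa_line$y), as.numeric(by_hand))))
```

Because color is mapped to Species, the smoother is fitted separately within each species rather than once for the whole data set.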

Final assessment questions

\[ \begin{bmatrix} 6 & 2 \\ 2 & 5 \end{bmatrix} \]

(Paste your code and results below. Do not use screenshots!)

\[ \begin{bmatrix} 6 & 2 \\ 2 & 5 \end{bmatrix} \]

(Paste your code and results below. Do not use screenshots!)