ggplot2 Assignment 1

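Download this week's assignment data from the course resources. The file holds statistics for a website over a period of time. Every three lines form one record: time, ip, and pv. Read the data into R, then use the qplot function to show the time-series trend in a suitable and attractive way.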

Solution:

Load the ggplot2 package:

library("ggplot2")

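Load the data file lesson8.txt. With sep = "\n" each line is one field, and the three-element what list distributes consecutive lines into three character vectors: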

x = scan("lesson8.txt", sep = "\n", what = list("", "", ""))
x[[1]][1:10]
##  [1] "Sun Jul  8 23:59:02 HKT 2012" "Mon Jul  9 23:59:01 HKT 2012"
##  [3] "Tue Jul 10 23:59:01 HKT 2012" "Wed Jul 11 23:59:01 HKT 2012"
##  [5] "Thu Jul 12 23:59:02 HKT 2012" "Fri Jul 13 23:59:01 HKT 2012"
##  [7] "Sat Jul 14 23:59:02 HKT 2012" "Sun Jul 15 23:59:01 HKT 2012"
##  [9] "Mon Jul 16 23:59:02 HKT 2012" "Tue Jul 17 23:59:01 HKT 2012"
x[[2]][1:10]
##  [1] "1922" "2345" "2255" "2179" "2225" "2392" "1957" "1809" "2277" "2682"
x[[3]][1:10]
##  [1] "91938"  "108521" "89036"  "84149"  "85583"  "79507"  "83188" 
##  [8] "77675"  "88402"  "100121"
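
To make the three-lines-per-record reading concrete, here is a minimal sketch that writes two records to a temporary file and reads them back the same way. The temporary file and the list names time, ip, and pv are illustrative assumptions, not part of the original solution. The values are the first two records shown above.

```r
# Write a tiny file in the same layout: three lines per record (time, ip, pv).
tmp <- tempfile(fileext = ".txt")
writeLines(c("Sun Jul  8 23:59:02 HKT 2012", "1922", "91938",
             "Mon Jul  9 23:59:01 HKT 2012", "2345", "108521"), tmp)

# sep = "\n" makes every line one field; the three-element `what` list
# makes scan() fill three character vectors in rotation.
# Naming the list elements (an assumption here) names the result.
rec <- scan(tmp, sep = "\n", what = list(time = "", ip = "", pv = ""),
            quiet = TRUE)
str(rec)
unlink(tmp)
```

Naming the components lets later code write rec$ip instead of x[[2]], which is easier to read.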

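Set the locale. The "C" locale turns off locale-specific formatting, so %a and %b match the English day and month abbreviations in the data (see ?Sys.setlocale). The current setting is saved first so it can be restored later: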

lct <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
## [1] "C"

Convert the dates:

date <- as.Date(strptime(x[[1]], "%a %b %d %H:%M:%S HKT %Y"))
date[1:10]
##  [1] "2012-07-08" "2012-07-09" "2012-07-10" "2012-07-11" "2012-07-12"
##  [6] "2012-07-13" "2012-07-14" "2012-07-15" "2012-07-16" "2012-07-17"
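
The conversion can be checked on a single timestamp. This is a minimal sketch rather than part of the original solution. The variable names stamp, parsed, and day are illustrative, and tz = "UTC" is an added assumption so the example does not depend on the machine's time zone.

```r
# Switch LC_TIME to "C" so %a / %b match English names, remembering the old value.
old <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

stamp <- "Sun Jul  8 23:59:02 HKT 2012"   # first timestamp from the data

# "HKT" in the format is matched as literal text, not parsed as a time zone.
parsed <- strptime(stamp, "%a %b %d %H:%M:%S HKT %Y", tz = "UTC")

# as.Date() keeps the calendar date and drops the time of day.
day <- as.Date(parsed)
print(day)

# Restore the previous locale setting.
invisible(Sys.setlocale("LC_TIME", old))
```

If the input did not match the format, strptime would return NA, so checking anyNA(date) after the real conversion is a cheap sanity test.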

Build the data frame:

website = data.frame(date, ip = as.numeric(x[[2]]), pv = as.numeric(x[[3]]), 
    stringsAsFactors = FALSE)

Plot the time-series trends:

qplot(date, ip, data = website, geom = c("line", "point"), main = "ip ~ date")

qplot(date, pv, data = website, geom = c("line", "point"), main = "pv ~ date")

qplot(date, pv/ip, data = website, geom = c("line", "point"), xlab = "date", 
    ylab = "pv / ip rate", main = "pv/ip ~ date")
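
The assignment asks for qplot, but qplot has been deprecated since ggplot2 3.4.0, so the same chart can also be written with ggplot() and explicit layers. The sketch below is one possible equivalent, not the original answer. To stay self-contained it rebuilds website from only the first ten records printed above, so the plot covers ten days rather than the full series.

```r
library(ggplot2)

# First ten records, copied from the output shown earlier (not the full file).
website <- data.frame(
  date = as.Date("2012-07-08") + 0:9,
  ip = c(1922, 2345, 2255, 2179, 2225, 2392, 1957, 1809, 2277, 2682),
  pv = c(91938, 108521, 89036, 84149, 85583, 79507, 83188, 77675, 88402, 100121)
)

# Same layers as qplot(..., geom = c("line", "point")), written explicitly.
p <- ggplot(website, aes(x = date, y = pv / ip)) +
  geom_line() +
  geom_point() +
  labs(x = "date", y = "pv / ip rate", title = "pv/ip ~ date")
print(p)
```

Swapping the y aesthetic to ip or pv gives the other two charts with the same layers.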

Restore the locale:

Sys.setlocale("LC_TIME", lct)
## [1] "Chinese (Simplified)_People's Republic of China.936"