由于本人水平有限,错漏和不妥之处在所难免,殷望批评指正!

分析工具:R

OpenStack日志结构

读取日志文件:nova-api.log

library(readr)
novaApi<-read_lines("G:/nova-api.log")
length(novaApi)
## [1] 3234903

日志文件普遍比较大,readr::read_lines是我目前找到读取日志文件最快的函数了,如果有谁知道更快的,告诉我一下

head(novaApi)
## [1] "2017-06-22 03:07:14.154 23317 DEBUG nova.osapi_compute.wsgi.server [-] (23317) accepted ('192.168.17.2', 56980) server /usr/lib/python2.7/site-packages/eventlet/wsgi.py:826"                                                                                       
## [2] "2017-06-22 03:07:14.155 23317 DEBUG nova.api.openstack.wsgi [-] Calling method '<bound method Versions.index of <nova.api.openstack.compute.versions.Versions object at 0x8437f50>>' _process_stack /usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:792"
## [3] "2017-06-22 03:07:14.156 23317 INFO nova.osapi_compute.wsgi.server [-] 192.168.17.2 \"OPTIONS / HTTP/1.0\" status: 200 len: 501 time: 0.0012259"                                                                                                                     
## [4] "2017-06-22 03:07:14.608 23351 DEBUG nova.metadata.wsgi.server [-] (23351) accepted ('192.168.17.2', 40228) server /usr/lib/python2.7/site-packages/eventlet/wsgi.py:826"                                                                                            
## [5] "2017-06-22 03:07:14.613 23351 INFO nova.api.ec2 [-] 0.311s 192.168.17.2 OPTIONS / None:None 200 [None] text/plain text/plain"                                                                                                                                       
## [6] "2017-06-22 03:07:14.613 23351 INFO nova.metadata.wsgi.server [-] 192.168.17.2 \"OPTIONS / HTTP/1.0\" status: 200 len: 234 time: 0.0014119"

基本结构:日期+时间+日志级别+模块+日志内容+方法+源码路径

也有些日志没有方法源码路径

## [1] "2017-06-22 03:07:14.156 23317 INFO nova.osapi_compute.wsgi.server [-] 192.168.17.2 \"OPTIONS / HTTP/1.0\" status: 200 len: 501 time: 0.0012259"

还有一些日志以多行的形式打出:

## [1] "2017-06-22 11:32:17.634 8748 DEBUG keystoneclient.session [-] RESP: [300] date: Thu, 22 Jun 2017 03:32:17 GMT vary: X-Auth-Token content-length: 599 content-type: application/json connection: close "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [2] "RESP BODY: {\"versions\": {\"values\": [{\"status\": \"stable\", \"updated\": \"2015-03-30T00:00:00Z\", \"media-types\": [{\"base\": \"application/json\", \"type\": \"application/vnd.openstack.identity-v3+json\"}], \"id\": \"v3.4\", \"links\": [{\"href\": \"http://192.168.19.252:35357/v3/\", \"rel\": \"self\"}]}, {\"status\": \"stable\", \"updated\": \"2014-04-17T00:00:00Z\", \"media-types\": [{\"base\": \"application/json\", \"type\": \"application/vnd.openstack.identity-v2.0+json\"}], \"id\": \"v2.0\", \"links\": [{\"href\": \"http://192.168.19.252:35357/v2.0/\", \"rel\": \"self\"}, {\"href\": \"http://docs.openstack.org/\", \"type\": \"text/html\", \"rel\": \"describedby\"}]}]}}"
## [3] " _http_log_response /usr/lib/python2.7/site-packages/keystoneclient/session.py:215"

日志的结构并不十分统一,而且开发人员修改源码的时候也会打一些自己定义的日志,所以有些方案是从源码出发,从源码中解析出日志的模版结构,如:

Large-Scale System Problems Detection by Mining Console Log

http://nma.berkeley.edu/ark:/28722/bk0005k9b6k

目前,日志结构统一的部分是时间+日志级别+模块。为了方便处理,把以多行形式打出的日志重新整理成一行,作为一条日志对象。这样处理的假定条件是:

1、每一条日志都以时间戳开始

2、忽略日志之间的干扰,认为日志的抛出是以条为单位且有序的

实际上,有时候会因为网络、日志收集方式等原因会导致我们看到的日志是残缺且无序的,这种情况的日志相对来讲量不多,所以一般被passed掉,如果量很多,说明这个系统问题挺大了

unifyLog<-function(logSet,pattern='[0-9]+[-]+[0-9]+[0-9]+[ ]+[0-9]+[:]+[0-9]+[:]+[0-9]',outPutTime=T){
  timeofLog<-substr(logSet,1,23)
  normalLogSerial<-grep(pattern,timeofLog)
  diffSerial<-diff(normalLogSerial)
  breakPoint<-which(diffSerial>1)
  lapply(breakPoint,
         function(i){
           logSet[normalLogSerial[i]]<<-paste0(logSet[normalLogSerial[i]:(normalLogSerial[i]+diffSerial[i]-1)],collapse = '')
         }
  )
  if(outPutTime){
    return(list(unifiedLog=logSet[normalLogSerial],timeofLog=timeofLog[normalLogSerial]))
  }else{
    return(logSet[normalLogSerial])
  }
}
timeConsuming<-system.time(unifiedLog<-unifyLog(novaApi))
str(unifiedLog)
## List of 2
##  $ unifiedLog: chr [1:2775321] "2017-06-22 03:07:14.154 23317 DEBUG nova.osapi_compute.wsgi.server [-] (23317) accepted ('192.168.17.2', 56980)"| __truncated__ "2017-06-22 03:07:14.155 23317 DEBUG nova.api.openstack.wsgi [-] Calling method '<bound method Versions.index of"| __truncated__ "2017-06-22 03:07:14.156 23317 INFO nova.osapi_compute.wsgi.server [-] 192.168.17.2 \"OPTIONS / HTTP/1.0\" statu"| __truncated__ "2017-06-22 03:07:14.608 23351 DEBUG nova.metadata.wsgi.server [-] (23351) accepted ('192.168.17.2', 40228) serv"| __truncated__ ...
##  $ timeofLog : chr [1:2775321] "2017-06-22 03:07:14.154" "2017-06-22 03:07:14.155" "2017-06-22 03:07:14.156" "2017-06-22 03:07:14.608" ...

3234903行日志处理完后,实际日志条数为2775321条,提取出日志的时间,整个处理过程耗时:

timeConsuming
##    user  system elapsed 
##   11.75    0.25   12.16

有更快处理方案的,告诉我一下