GBCC

This reads in a small-ish Wireshark dump in CSV format:

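# stringsAsFactors=TRUE reproduces the Factor columns shown below on R >= 4.0,
# where read.csv() no longer converts strings to factors by default
shark <- read.csv("https://gist.githubusercontent.com/hrbrmstr/42480de78a3ee1ed39bb/raw/ffe5c62ce91a8b40b77a7b12c28254e4b3c11325/smallshark.csv",
                  stringsAsFactors=TRUE)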

Here’s how to turn PCAPs into CSV: https://ask.wireshark.org/questions/2935/creating-a-csv-file-with-tshark
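As a rough sketch of the approach described at that link, tshark's field output mode can be driven from R. The file names capture.pcap and capture.csv are hypothetical, the helper tshark_args() is made up for this example, and the field list simply mirrors the columns of the CSV used here. The command only runs if tshark is installed and the capture exists.

```r
# Fields to export; these mirror the columns of smallshark.csv
fields <- c("frame.number", "frame.time", "eth.src", "eth.dst",
            "ip.src", "frame.len", "ip.dst", "ip.proto")

# Hypothetical helper: build the tshark arguments to read a capture and emit
# the chosen fields as comma-separated, double-quoted values with a header row
tshark_args <- function(pcap, fields) {
  c("-r", pcap,
    "-T", "fields",
    "-E", "header=y",
    "-E", "separator=,",
    "-E", "quote=d",
    as.vector(rbind("-e", fields)))  # interleaves "-e" with each field name
}

pcap <- "capture.pcap"  # hypothetical input file
args <- tshark_args(pcap, fields)

# Only run when tshark is on the PATH and the capture file exists
if (nzchar(Sys.which("tshark")) && file.exists(pcap)) {
  system2("tshark", args, stdout = "capture.csv")  # hypothetical output file
}
```

Adding another field to the export is then a matter of appending its name to fields.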

There are tons of other fields you can export alongside the eight in this dump.

Take a look at it:

str(shark)
#> 'data.frame':    347 obs. of  8 variables:
#>  $ frame.number: int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ frame.time  : Factor w/ 347 levels "Feb  9, 1970 02:04:19.354593000 EST",..: 1 2 3 4 5 6 7 8 9 10 ...
#>  $ eth.src     : Factor w/ 3 levels "52:54:00:12:34:56",..: 1 2 1 2 1 1 1 1 1 1 ...
#>  $ eth.dst     : Factor w/ 9 levels "01:00:5e:00:00:16",..: 9 6 7 9 3 1 1 3 3 1 ...
#>  $ ip.src      : Factor w/ 19 levels "","10.0.2.15",..: 1 1 2 3 1 2 2 1 1 2 ...
#>  $ frame.len   : int  42 64 342 590 90 54 54 90 90 54 ...
#>  $ ip.dst      : Factor w/ 23 levels "","10.0.2.15",..: 1 1 3 21 1 16 16 1 1 16 ...
#>  $ ip.proto    : int  NA NA 17 17 NA 2 2 NA NA 2 ...

summary(shark)
#>   frame.number                                 frame.time 
#>  Min.   :  1.0   Feb  9, 1970 02:04:19.354593000 EST:  1  
#>  1st Qu.: 87.5   Feb  9, 1970 02:04:19.354615000 EST:  1  
#>  Median :174.0   Feb  9, 1970 02:04:19.360721000 EST:  1  
#>  Mean   :174.0   Feb  9, 1970 02:04:19.360738000 EST:  1  
#>  3rd Qu.:260.5   Feb  9, 1970 02:04:19.617335000 EST:  1  
#>  Max.   :347.0   Feb  9, 1970 02:04:19.623168000 EST:  1  
#>                  (Other)                            :341  
#>               eth.src                 eth.dst                ip.src   
#>  52:54:00:12:34:56:207   52:54:00:12:34:56:139   10.0.2.15      :176  
#>  52:55:0a:00:02:02:138   52:55:0a:00:02:02:126                  : 36  
#>  52:55:0a:00:02:03:  2   ff:ff:ff:ff:ff:ff: 19   10.0.2.3       : 17  
#>                          52:55:0a:00:02:03: 18   131.253.34.240 : 14  
#>                          01:00:5e:00:00:fc: 12   134.170.184.137: 14  
#>                          33:33:00:01:00:03: 12   23.76.195.70   : 13  
#>                          (Other)          : 21   (Other)        : 77  
#>    frame.len                  ip.dst       ip.proto     
#>  Min.   :  42.0   10.0.2.15      :134   Min.   : 2.000  
#>  1st Qu.:  54.0                  : 36   1st Qu.: 6.000  
#>  Median :  66.0   10.0.2.3       : 17   Median : 6.000  
#>  Mean   : 253.5   134.170.184.137: 17   Mean   : 8.103  
#>  3rd Qu.: 257.5   131.253.34.240 : 15   3rd Qu.: 6.000  
#>  Max.   :1514.0   93.184.215.200 : 15   Max.   :17.000  
#>                   (Other)        :113   NA's   :36

It’s a data.frame (basically an Excel spreadsheet: one row per packet, one column per field).
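To make the spreadsheet comparison concrete, here is a minimal sketch using a toy data frame. The values are made up; only the column names are borrowed from shark.

```r
# A toy data frame shaped like a few shark columns (made-up values)
toy <- data.frame(
  frame.number = 1:4,
  ip.dst       = c("10.0.2.15", "10.0.2.3", "10.0.2.15", ""),
  frame.len    = c(54L, 342L, 66L, 42L),
  stringsAsFactors = TRUE
)

nrow(toy)                  # number of rows (packets)
toy$frame.len              # one column, as a plain integer vector
toy[toy$frame.len > 60, ]  # rows where the frame is longer than 60 bytes
levels(toy$ip.dst)         # distinct values of a factor column
```

The same operations work unchanged on shark itself.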

“Cool” plots are easier with ggplot2. You have no idea what ggplot2 is but perhaps this code can help a bit.

library(ggplot2)
library(dplyr) # provides count(); you have no idea what this is but google is ur friend

dest <- count(shark, ip.dst)

str(dest)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    23 obs. of  2 variables:
#>  $ ip.dst: Factor w/ 23 levels "","10.0.2.15",..: 1 2 3 4 5 6 7 8 9 10 ...
#>  $ n     : int  36 134 1 12 17 5 6 6 6 15 ...
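If count() feels like magic, a base R tally gives roughly the same two-column result. This is a sketch on made-up addresses, not the way dplyr implements it.

```r
# Made-up destination column standing in for shark$ip.dst
ip.dst <- factor(c("10.0.2.15", "10.0.2.3", "10.0.2.15", "", "10.0.2.15"))

# table() tallies each level; as.data.frame() turns the tally into rows,
# with the count column named "n" to match what count() produces
tally <- as.data.frame(table(ip.dst), responseName = "n")
tally
```

Each row of tally is one distinct address plus how many times it appeared, which is the shape the bar charts need.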

ggplot(dest) + 
  geom_bar(aes(x=reorder(ip.dst, n), y=n), stat="identity") + 
  coord_flip() +
  labs(title="Count of packets by destination address")


src <- count(shark, ip.src)

str(src)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    19 obs. of  2 variables:
#>  $ ip.src: Factor w/ 19 levels "","10.0.2.15",..: 1 2 3 4 5 6 7 8 9 10 ...
#>  $ n     : int  36 176 1 17 3 6 6 6 14 14 ...

ggplot(src) + 
  geom_bar(aes(x=reorder(ip.src, n), y=n), stat="identity") + 
  coord_flip() +
  labs(title="Count of packets by source address")
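Both charts include a bar with an empty label: the 36 rows where ip.src and ip.dst are blank, presumably frames with no IP header (ip.proto is NA for 36 rows too). A minimal sketch of dropping them before counting, shown on a made-up stand-in for shark:

```r
# Made-up stand-in for shark; "" marks a frame with no IP destination
toy <- data.frame(ip.dst = c("10.0.2.15", "", "10.0.2.3", "", "10.0.2.15"),
                  stringsAsFactors = TRUE)

# Keep only rows with an IP destination, then drop the now-unused "" level
ip_only <- droplevels(toy[toy$ip.dst != "", , drop = FALSE])

nrow(ip_only)
levels(ip_only$ip.dst)
```

The filtered data frame can be passed to count() in place of shark.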

Histogram & density plots of packet lengths:

ggplot(shark, aes(x=frame.len)) +
  geom_histogram(alpha=0.5, fill="steelblue")


ggplot(shark, aes(x=frame.len)) +
  geom_density(alpha=0.5, fill="steelblue")
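geom_histogram() falls back to 30 bins when no binwidth is given, which can hide what a bar actually covers. The sketch below does the binning by hand in base R on made-up frame lengths; the 250-byte bin width is an arbitrary choice for illustration.

```r
# Made-up frame lengths in bytes, standing in for shark$frame.len
frame.len <- c(42, 54, 54, 66, 90, 342, 590, 1514)

# Fixed 250-byte bins covering 0..1750; right = FALSE makes each bin [a, b)
breaks <- seq(0, 1750, by = 250)
bins <- table(cut(frame.len, breaks = breaks, right = FALSE))
bins
```

Passing binwidth=250 to geom_histogram() asks ggplot2 for bins of the same width, though its bin edges may be placed differently.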

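Packets per second over the capture:

# Parse each timestamp and keep only HHMMSS as a per-second bucket key
shark$sec <- format(as.POSIXct(as.character(shark$frame.time), 
                               format="%b  %d, %Y %H:%M:%OS EST"), "%H%M%S")

by_sec <- count(shark, sec)
# Rebuild a real timestamp for the x axis (the capture date is hardcoded)
by_sec$ts <- as.POSIXct(sprintf("1970-02-09 %s", by_sec$sec), format="%Y-%m-%d %H%M%S")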

ggplot(by_sec, aes(x=ts, y=n)) +
  geom_line(linewidth=0.25) + # ggplot2 >= 3.4.0 uses linewidth for lines; older versions use size
  geom_point(size=0.75)
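The timestamp handling is the fiddly part, so here is the same parse on a single value copied from the str() output. Using tz = "UTC" is an assumption of this sketch to keep it independent of the local time zone; the trailing EST is matched as literal text and not interpreted. Month names are parsed with %b, so this assumes an English locale.

```r
# One timestamp copied from the str() output
raw <- "Feb  9, 1970 02:04:19.354593000 EST"

# %OS reads seconds with a fractional part; "EST" is matched literally
parsed <- as.POSIXct(raw, format = "%b  %d, %Y %H:%M:%OS EST", tz = "UTC")

# Formatting with %H%M%S drops the fractional part, leaving the bucket key
sec <- format(parsed, "%H%M%S")
sec
```

Every packet that arrived within the same second ends up with the same key, which is what count() then tallies.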