I found some data on migraine headaches and thought plotting them might reveal a pattern. Each record has a start date, an end date, and a pain severity on a scale of 1 to 5.

Here are the packages we will use.

library(ggplot2)
library(tidyverse)
library(lubridate)
library(Jmisc)

Read in the data.

Pain<- read.csv("HeadacheData.csv")
head(Pain, 3)
##        Date      End Character Severity  X X.1 X.2
## 1 3/27/2018 4/2/2018    Actual        3 NA  NA  NA
## 2  5/4/2018 6/2/2018    Actual        5 NA  NA  NA
## 3 5/28/2018 6/2/2018    Actual        5 NA  NA  NA
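# Caution: "%y" reads only two year digits, so "2018" is parsed as year 20 (2020),
# as the str() output below shows; "%Y" would keep the four-digit year.
Pain$Date <- as.Date(Pain$Date, "%m/%d/%y")
Pain$End <- as.Date(Pain$End, "%m/%d/%y")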
str(Pain)
## 'data.frame':    26 obs. of  7 variables:
##  $ Date     : Date, format: "2020-03-27" "2020-05-04" ...
##  $ End      : Date, format: "2020-04-02" "2020-06-02" ...
##  $ Character: Factor w/ 1 level "Actual": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Severity : int  3 5 5 4 4 3 3 3 3 3 ...
##  $ X        : logi  NA NA NA NA NA NA ...
##  $ X.1      : logi  NA NA NA NA NA NA ...
##  $ X.2      : logi  NA NA NA NA NA NA ...

If only things worked like they do in the examples!

Initially, the date-conversion steps came back with a data frame full of NAs.

The solution was to return to the Excel sheet, select the Date column, and press Ctrl+I, which brought up a drop-down menu. I selected ‘date’, which brought up another menu where I selected the format “m/d/Y”. (Note: use a lower-case “y” if your data has the year in 2-digit format, or an upper-case “Y” if your data has the year in 4-digit format.)

After this, the conversion worked as advertised.
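To make the “y” versus “Y” distinction concrete, here is a minimal sketch using base R only. The vector `raw` is a hypothetical stand-in for the first two values of the Date column shown in the head() output above.

```r
# Two date strings in the same month/day/4-digit-year layout as the CSV
raw <- c("3/27/2018", "5/4/2018")

# "%y" consumes only the first two year digits ("20"), which R maps to 2020
two_digit <- as.Date(raw, format = "%m/%d/%y")

# "%Y" consumes the full four-digit year
four_digit <- as.Date(raw, format = "%m/%d/%Y")

format(two_digit)   # "2020-03-27" "2020-05-04"
format(four_digit)  # "2018-03-27" "2018-05-04"
```

This would explain why the str() output above shows 2020 even though the raw file says 2018. If the real years matter for your plot, “%Y” is probably the format you want here.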

order(Pain$Date)
##  [1] 11 21 12 22 13  1 14  2 23 15  3 24 16  4 17  5  6 25 18  7 26  8 19
## [24]  9 20 10
Pain<-Pain[order(Pain$Date) , ]

Somewhere along the way we picked up extraneous columns. Let's drop them.

Pain%>%select(Date, End, Character, Severity )

Alright, the data is formatted. Let's plot.

A<-ggplot(Pain, aes(x=Date, y=Severity)) +
    geom_point(shape=1) 
A

Let's make the size of the points a function of the Severity.

B<- ggplot(Pain, aes(x=Date, y=Severity)) +
    geom_point(aes(size=Severity)) 
B

There is not enough contrast between the point sizes. Let's exaggerate the differences by raising Severity to a power.

Pain<-Pain%>%mutate(Raised_Severity =Severity^4.25)
head(Pain,3)
##         Date        End Character Severity  X X.1 X.2 Raised_Severity
## 1 2020-01-20 2020-01-25    Actual        3 NA  NA  NA         106.602
## 2 2020-02-11 2020-02-15    Actual        3 NA  NA  NA         106.602
## 3 2020-02-13 2020-02-17    Actual        3 NA  NA  NA         106.602

Sure enough, there is the new column that we were after, but so are the extraneous ones, which we will drop.

Pain%>%select(Date, End, Character, Severity, Raised_Severity )
head(Pain,3)
##         Date        End Character Severity  X X.1 X.2 Raised_Severity
## 1 2020-01-20 2020-01-25    Actual        3 NA  NA  NA         106.602
## 2 2020-02-11 2020-02-15    Actual        3 NA  NA  NA         106.602
## 3 2020-02-13 2020-02-17    Actual        3 NA  NA  NA         106.602

OK, let us try replotting with our augmented Severity.

C<- ggplot(Pain, aes(x=Date, y=Raised_Severity)) +
    geom_point(aes(size=Raised_Severity)) 
C
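As an aside, an alternative to transforming the data (not the approach taken in this post) is to leave Severity alone and widen the range of point sizes with ggplot2's scale_size(). The sketch below assumes a small made-up data frame, `toy`, standing in for Pain; the range of 1 to 12 is an arbitrary choice for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the Pain data frame
toy <- data.frame(
  Date = as.Date(c("2020-01-20", "2020-02-11", "2020-03-27")),
  Severity = c(3, 4, 5)
)

# Keep the original 1-5 scale on the y-axis, but stretch the drawn point
# sizes so the smallest value is drawn at size 1 and the largest at size 12
p <- ggplot(toy, aes(x = Date, y = Severity)) +
  geom_point(aes(size = Severity)) +
  scale_size(range = c(1, 12))
p
```

One possible advantage is that the legend then still shows the original severity values rather than the raised ones.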

Let's change the points to red.

# A fixed colour goes outside aes(); inside aes() the string is treated as data
D<- ggplot(Pain, aes(x=Date, y=Raised_Severity)) +
    geom_point(aes(size=Raised_Severity), color="red") 
D

Now let's change the background for more contrast.

E<-D+
  theme(
    panel.background = element_rect(fill = "black",
                                colour = "lightblue",
                                size = 0.5, linetype = "solid"))+
  theme(panel.border = element_blank(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank())
E

We don't need the legends.

# Caution: F is also R's shorthand for FALSE; assigning to it masks that shorthand
F<-E+
  theme(legend.position = "none")
F

The up-and-down bouncing of the points contributes nothing, since point size already shows severity. Let's put the points on a straight line. We will add a column filled with a constant number.

Pain<- addCol(Pain,value = c(Constant=3))
Pain
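For readers without the Jmisc package, a constant column can also be added with plain base R. This is a minimal sketch on a made-up data frame, `toy`, not the Pain data itself.

```r
# Hypothetical stand-in for the Pain data frame
toy <- data.frame(Severity = c(3, 4, 5))

# Assigning a single value to a new column recycles it down every row
toy$Constant <- 3
toy
```

With dplyr already loaded, `mutate(Constant = 3)` should give the same result.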

Yet again, the extraneous columns show up. Why?

Pain%>%select(Date, End, Character, Severity, Raised_Severity, Constant )
head(Pain,3)
##         Date        End Character Severity  X X.1 X.2 Raised_Severity
## 1 2020-01-20 2020-01-25    Actual        3 NA  NA  NA         106.602
## 2 2020-02-11 2020-02-15    Actual        3 NA  NA  NA         106.602
## 3 2020-02-13 2020-02-17    Actual        3 NA  NA  NA         106.602
##   Constant
## 1        3
## 2        3
## 3        3

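Huh? Nothing was dropped by select(). The reason is that select() returns a new data frame rather than modifying Pain in place, and the result was never assigned back to Pain.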
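Here is a minimal sketch of that behaviour on a made-up two-column data frame. The names `toy`, `before`, and `after` are hypothetical and used only for illustration.

```r
library(dplyr)

# Hypothetical data frame with one wanted and one unwanted column
toy <- data.frame(Severity = c(3, 5), X = c(NA, NA))

# Without assignment, the selected result is printed and then discarded
toy %>% select(Severity)
before <- names(toy)   # still "Severity" and "X"

# With assignment, the trimmed data frame replaces the original
toy <- toy %>% select(Severity)
after <- names(toy)    # only "Severity"
```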

Let's try a more direct approach.

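# Assign the result back to Pain so the columns are actually dropped
Pain<- Pain%>%
  select(-c(X, X.1, X.2))
# A fixed colour goes outside aes(); inside aes() the string is treated as data
G<- ggplot(Pain, aes(x=Date ,y= Constant))+
  geom_point(aes(size= Raised_Severity), color="red")
G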

H<-G+
  theme(panel.background = element_rect(fill = "black"),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = "none")
H

Let us drop the y-axis and add a title.

I <- H+
  theme(axis.title.y=  element_blank(),
        axis.text.y =  element_blank(),
        axis.ticks.y = element_blank()
                )
I

J<-I +
  ggtitle("Headaches: Frequency and Severity")
J
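For reference, the incremental steps above could probably be collapsed into a single call. The sketch below is one way to do that, assuming a small made-up data frame, `toy`, in place of the cleaned Pain data; `final_plot` is a hypothetical name.

```r
library(ggplot2)

# Hypothetical stand-in for the cleaned Pain data frame
toy <- data.frame(
  Date = as.Date(c("2020-01-20", "2020-02-11", "2020-03-27")),
  Severity = c(3, 4, 5)
)
toy$Raised_Severity <- toy$Severity^4.25  # same power as used above
toy$Constant <- 3                         # puts every point on one line

final_plot <- ggplot(toy, aes(x = Date, y = Constant)) +
  geom_point(aes(size = Raised_Severity), color = "red") +
  ggtitle("Headaches: Frequency and Severity") +
  theme(
    panel.background = element_rect(fill = "black"),
    panel.border = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "none",
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )
final_plot
```

Building the plot step by step, as done in this post, makes it easier to see what each layer contributes; the single call is simply more compact once the design is settled.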