Original Email:

So here’s an example. Like I said, I am trying to compare the STABILITY measure within the couples. So for instance, a plot of interest would be the STABILITY measure for all the couples, with STABILITY for one person in the relationship on one axis, and the other person’s STABILITY on another axis. Is there a quick/dirty way to do this?? Thanks for your help!

Below, I’ve outlined the steps to analyze your data, from importing it and cleaning it up to making the plots you requested. You can copy and paste this code into R to replicate the results. Let me know if you have any questions!

Josh

My code and notes

1) Import data

danData <- read.csv("NRI_toydata Josh.csv")
summary(danData)
##  STABILITY_Romantic    CoupleID    
##  Min.   :-2.000     Min.   :  4.0  
##  1st Qu.: 1.267     1st Qu.:604.0  
##  Median : 2.233     Median :643.0  
##  Mean   : 1.973     Mean   :577.3  
##  3rd Qu.: 2.767     3rd Qu.:675.0  
##  Max.   : 4.000     Max.   :900.0


2) Add variable for partner 1 or 2

In this section, I create a new variable called partnerNum. It is set to 1 for the first row in which a CoupleID appears in the dataset and to 2 for any later row with the same CoupleID. I could also imagine using a different binary characteristic here, such as Male / Female. In any case, you need some variable with which to choose the first and second partner (even if it’s artificially created, as it is here).

# create new variable partnerNum and set all values equal to -1
danData$partnerNum <- -1 

#set partner number equal to 1 or 2, based on whether or not it's a duplicate
danData[!duplicated(danData$CoupleID), "partnerNum"] <- 1
danData[duplicated(danData$CoupleID), "partnerNum"] <- 2

head(danData)
##   STABILITY_Romantic CoupleID partnerNum
## 1          1.4000000        4          1
## 2          3.2000000        7          1
## 3          0.3333333        4          2
## 4          3.0000000        5          1
## 5          2.4333333        5          2
## 6          1.0000000        8          1
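
If the duplicated() trick feels opaque, here is a small self-contained sketch of what it does. The toy data frame just retypes the six rows shown in the head() output above; it is not the full dataset. The last line is an extra sanity check I'd suggest, on the assumption that every couple should have at most two rows.

```r
# Toy data: the six rows from the head() output above, retyped by hand
toy <- data.frame(
  STABILITY_Romantic = c(1.4, 3.2, 0.3333333, 3.0, 2.4333333, 1.0),
  CoupleID = c(4, 7, 4, 5, 5, 8)
)

# duplicated() is FALSE the first time a value appears and TRUE every time after
dup <- duplicated(toy$CoupleID)
print(dup)
# → FALSE FALSE  TRUE FALSE  TRUE FALSE

# first appearance becomes partner 1, any repeat becomes partner 2
toy$partnerNum <- ifelse(dup, 2, 1)

# sanity check: a couple with three or more rows would get several 2s
too_many <- any(table(toy$CoupleID) > 2)
print(too_many)
# → FALSE
```

If too_many comes back TRUE on the real data, the reshape in the next step would have more than one value per cell, so it is worth checking first.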


3) Melt and cast data set

Melting and casting is a great way to reshape data sets, which is what you need to do here. I don’t give a full explanation here since other people have already done this. I’d read this article and then Google around for ‘reshape2 melt cast’.

#load reshape library
library(reshape2)

dan.melt <- melt(danData, id.vars = c("CoupleID", "partnerNum"))
dan.cast <- dcast(dan.melt, formula = CoupleID ~ partnerNum)

#fix names of columns
names(dan.cast) <- c("CoupleID", "Partner1", "Partner2")

head(dan.cast)
##   CoupleID Partner1   Partner2
## 1        4 1.400000  0.3333333
## 2        5 3.000000  2.4333333
## 3        7 3.200000 -0.8000000
## 4        8 1.000000  2.2333333
## 5       12 2.300000  2.2333333
## 6      201 1.666667  2.8000000
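
To see the long-to-wide step in isolation, here is a minimal sketch on toy data. It only uses the first six rows shown earlier (retyped by hand), so couples 7 and 8 have no second partner in this toy version even though they do in the full data. It also skips melt() and names the value column directly through dcast()'s value.var argument, which is a shortcut I'm swapping in rather than what the code above does.

```r
library(reshape2)

# Toy long-format data: first six rows only, retyped by hand
toy <- data.frame(
  STABILITY_Romantic = c(1.4, 3.2, 0.3333333, 3.0, 2.4333333, 1.0),
  CoupleID   = c(4, 7, 4, 5, 5, 8),
  partnerNum = c(1, 1, 2, 1, 2, 1)
)

# one row per CoupleID, one column per partnerNum;
# value.var says which column fills the cells
wide <- dcast(toy, CoupleID ~ partnerNum, value.var = "STABILITY_Romantic")
names(wide) <- c("CoupleID", "Partner1", "Partner2")

# couples with only one row in the toy data get NA for Partner2
print(wide)
```

Any couple with only one row ends up with an NA in the wide table, which is the kind of row the plotting functions later drop with a warning.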


4) Create scatter plot of data

plot(dan.cast[,c("Partner1","Partner2")])


5) Plot in ggplot

library(ggplot2)
ggplot(dan.cast, aes(x = Partner1, y = Partner2)) +
  geom_point(size = 4, alpha = 0.7) + # fixed size and opacity go outside aes(), so no legends are drawn
  theme_minimal() +
  ggtitle("Comparison of values by CoupleID")
## Warning: Removed 1 rows containing missing values (geom_point).
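
That warning means one row has a missing value in Partner1 or Partner2, most likely a couple where only one partner has a score. Here is a rough sketch of how you could find and drop such rows before plotting. The data frame toy.cast and its values are made up for illustration; on your data you would use dan.cast instead.

```r
# Made-up wide-format data standing in for dan.cast
toy.cast <- data.frame(
  CoupleID = c(1, 2, 3, 4),
  Partner1 = c(1.4, 3.0, 3.2, 1.0),
  Partner2 = c(0.5, 2.4, NA, 2.2)
)

# rows where either partner's score is missing
incomplete <- toy.cast[!complete.cases(toy.cast[, c("Partner1", "Partner2")]), ]
print(incomplete$CoupleID)
# → 3

# keep only couples with both scores, so the plot has nothing to drop
toy.complete <- na.omit(toy.cast)
print(nrow(toy.complete))
# → 3
```

Whether to drop those couples or go back and track down the missing score is your call; the plot simply leaves them out either way.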


6) Example with estimation

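# regress Partner 2's stability on Partner 1's
m <- lm(Partner2 ~ Partner1, data = dan.cast)
a <- signif(coef(m)[1], digits = 2) # intercept, rounded to 2 significant digits
b <- signif(coef(m)[2], digits = 2) # slope, rounded to 2 significant digits
textlab <- paste("y = ",b,"x + ",a, sep="") # equation label for the plot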

ggplot(dan.cast, aes(x = Partner1, y = Partner2)) +
  geom_point(size = 4, alpha = 0.7) + # fixed size and opacity go outside aes(), so no legends are drawn
  theme_minimal() +
  geom_smooth(method = "lm") +
  annotate("text", x = -.6, y = 3.5, label = textlab, color="black", size = 8, parse=FALSE) +
  labs(title = "Comparison of Stability by CoupleID", x = "Partner 1 Stability", y = "Partner 2 Stability")
## Warning: Removed 1 rows containing missing values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_point).
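
One small wrinkle with the paste() label: if the intercept comes out negative, it prints as "x + -1". Here is a hedged sketch of a label that handles the sign. The helper name make_label and the toy x/y values are my own inventions for illustration, chosen so the fitted line is exactly y = 2x - 1.

```r
# Hypothetical helper: builds "y = bx + a" or "y = bx - a" depending on the sign of a
make_label <- function(model) {
  a <- coef(model)[1]  # intercept
  b <- coef(model)[2]  # slope
  sign_chr <- ifelse(a < 0, "-", "+")
  # %.2g keeps 2 significant digits, like signif(..., digits = 2)
  sprintf("y = %.2gx %s %.2g", b, sign_chr, abs(a))
}

# Toy data lying exactly on the line y = 2x - 1
toy <- data.frame(x = c(1, 2, 3, 4), y = c(1, 3, 5, 7))
toy.m <- lm(y ~ x, data = toy)

toy.label <- make_label(toy.m)
print(toy.label)
# → "y = 2x - 1"
```

On your data, you could pass make_label(m) as the label argument to annotate() in place of textlab.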