2024-07-23

Introduction

  • R Shiny app that allows you to generate, visualize, and download linear regression data

  • Could be useful for educators

  • Available at https://kristopherhuffman.shinyapps.io/Linear-Regression-Data-Generator/

  • Inputs

    • Seed: Numeric value used to seed the random number generator, so the same inputs reproduce the same data
    • Number of Observations: Number of observations to be generated
    • Mean/Variance of Predictor: Population mean/variance of the normally distributed predictor
    • Slope/Intercept: Population slope/intercept
    • Error Variance: Error variance of normally distributed errors with mean zero
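
Taken together, the inputs above describe a simple linear model. It can be written out as follows. The symbols (mu_x, sigma_x^2, beta_0, beta_1, sigma^2) are notation introduced here for illustration and are not taken from the app. The predictor and the errors are assumed to be drawn independently, as in the example code.

```latex
x_i \sim N(\mu_x, \sigma_x^2), \qquad
\varepsilon_i \sim N(0, \sigma^2), \qquad
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n
```

Here beta_0 is the population intercept, beta_1 is the population slope, and n is the number of observations.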

Outputs

  • Each set of inputs produces a plot of the generated data that includes the fitted regression line and the estimated slope, intercept, and R^2

  • Data can be downloaded as a .csv file that includes the generated predictor (x), the generated error term (error), the generated outcome (y), and the population slope (beta) and population intercept (int) used to generate the data

  • Plot can be downloaded as a .png file

  • All downloaded files are indexed with the input seed and number of observations
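
A minimal sketch of how a data file like the one described above could be assembled in base R. The column names follow the list above. The helper name `make_download_data` and the file name pattern are hypothetical, since the app's actual naming scheme is not shown here.

```r
# Hypothetical helper: builds a data frame with the columns described above.
make_download_data <- function(seed, n, b, int, v, xmu, xvar) {
  set.seed(seed)
  # rnorm() takes a standard deviation, so convert the variances with sqrt()
  x     <- rnorm(n, mean = xmu, sd = sqrt(xvar))
  error <- rnorm(n, mean = 0,   sd = sqrt(v))
  data.frame(x = x, error = error, y = b * x + int + error,
             beta = b, int = int)  # population slope and intercept, repeated per row
}

seed <- 1234; n <- 100
dat <- make_download_data(seed = seed, n = n, b = 2, int = 0, v = 1,
                          xmu = 0, xvar = 1)

# Assumed naming scheme: index the file by seed and number of observations.
fname <- file.path(tempdir(), paste0("lm_data_seed", seed, "_n", n, ".csv"))
write.csv(dat, fname, row.names = FALSE)
```

Writing to `tempdir()` keeps the sketch from touching the working directory; in a Shiny app the write step would typically sit inside a download handler instead.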

Example Code

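# packages used below: dplyr (%>%, mutate) and ggplot2 (plotting)
library(dplyr); library(ggplot2)
# inputs specified by the user
seed<-1234; set.seed(seed); n<-100 # set seed, number of obs
b<-2; int<-0; v<-1 # pop slope, pop int, error variance
xmu<-0; xvar<-1    # predictor mean and variance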

# generate data
# rnorm() takes a standard deviation, so convert the variances with sqrt()
dat<-data.frame(x=rnorm(n,xmu,sqrt(xvar)),error=rnorm(n,0,sqrt(v))) # generate x + noise
dat<-dat %>% mutate(y = b*x + int + error)              # generate outcome y
myfit     <-lm(y ~ x,data=dat)                          # fit line
intercept <-round(myfit$coefficients[1],3)              # fitted int
slope     <-round(myfit$coefficients[2],3)              # fitted slope
r2        <-round(summary(myfit)$r.squared,3)           # r2

# label for plot
mylab <- paste0("Slope Estimate: ",slope,", ","Intercept Estimate: ",
                intercept,", ","R-Squared Estimate: ",r2)
# plot data, regression line, and label
g<-dat %>% ggplot()
p<-g+geom_point(aes(x = x, y = y)) + 
   geom_smooth(aes(x=x,y=y),method=lm,color='blue',se=FALSE) + 
   annotate("label",x=mean(dat$x),y=quantile(dat$y,.99),label=mylab,size=4)  
p # display the plot

Example Output

## `geom_smooth()` using formula = 'y ~ x'