Overview

This document generates and visualizes a synthetic dataset in which payment is a function of two variables: n (integers from 1 to 20) and v (from 0 to 0.9 in steps of 0.01). We explore how payment varies with the product n × v and fit a quadratic regression model to describe this relationship.

Load Libraries

library(ggplot2)
library(dplyr)
library(knitr)

Define Parameters and Generate Data

# Define sequences for n and v
nValues <- seq(1, 20, 1)
vValues <- seq(0, 0.9, 0.01)

# Define the payment function
payment <- function(n, v) {
  9.99201 * n * (1 - v)
}

# Create a data frame with all combinations of n and v
df <- expand.grid(n = nValues, v = vValues)

# Compute nv and payment
df <- df %>%
  mutate(
    nv = n * v,
    payment = payment(n, v)
  )

# Show 8 randomly chosen rows (the selection changes on every run)
kable(df[sample(nrow(df), 8), ], caption = "sample values in df")
Table: sample values in df

row      n     v     nv     payment
929      9  0.46   4.14   48.561169
1162     2  0.58   1.16    8.393288
477     17  0.23   3.91  130.795411
1135    15  0.56   8.40   65.947266
246      6  0.12   0.72   52.757813
525      5  0.26   1.30   36.970437
1067     7  0.53   3.71   32.873713
1211    11  0.60   6.60   43.964844
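Because sample() draws different rows on every run, the table above will not reproduce exactly. A minimal sketch of one way to make the draw repeatable is to fix the random seed first. The seed value 42 and the names grid, firstDraw and secondDraw are arbitrary choices for illustration, not part of the original analysis.

```r
# Rebuild the grid in base R so this block runs on its own
grid <- expand.grid(n = seq(1, 20, 1), v = seq(0, 0.9, 0.01))
grid$nv <- grid$n * grid$v
grid$payment <- 9.99201 * grid$n * (1 - grid$v)

# Fixing the seed makes sample() return the same rows each time.
# The value 42 is an assumption; any fixed integer works.
set.seed(42)
firstDraw <- grid[sample(nrow(grid), 8), ]

set.seed(42)
secondDraw <- grid[sample(nrow(grid), 8), ]

# TRUE: both draws select the same 8 rows
identical(firstDraw, secondDraw)
```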

Scatterplot: Payment vs n × v

ggplot(df, aes(x = nv, y = payment)) +
  # Color encodes n; point size grows with v
  geom_point(aes(color = factor(n), size = sqrt(1 + 10 * v)), shape = 21, alpha = 0.7) +
  # Simpler alternative with fixed point size:
  # ggplot(df, aes(x = nv, y = payment, color = factor(n))) +
  #   geom_point(alpha = 0.5, size = 0.7) +
  labs(
    title = "Scatterplot of Payment vs n × v",
    x = "n × v",
    y = "Payment",
    size = "sqrt(1 + 10v)",  # size, not shape, is the aesthetic mapped to v
    color = "n"
  ) +
  theme_minimal()

Quadratic Regression: Payment vs n × v

Regression

# Fit quadratic model
model <- lm(payment ~ nv + I(nv^2), data = df)

# Show summary of the model
summary(model)
## 
## Call:
## lm(formula = payment ~ nv + I(nv^2), data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -58.230 -36.672  -8.738  26.596 144.866 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 54.97439    2.09272  26.269  < 2e-16 ***
## nv           3.12828    0.79809   3.920 9.19e-05 ***
## I(nv^2)     -0.30933    0.05664  -5.461 5.38e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 44.13 on 1817 degrees of freedom
## Multiple R-squared:  0.02443,    Adjusted R-squared:  0.02336 
## F-statistic: 22.76 on 2 and 1817 DF,  p-value: 1.735e-10

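Notice that the quadratic coefficient is negative while the linear coefficient is positive, so the fitted curve is concave: it rises at first and eventually turns downward. The value of nv at which it turns can be estimated from the coefficients as -b1 / (2 × b2), where b1 and b2 are the linear and quadratic estimates. With the values above this is 3.12828 / (2 × 0.30933) ≈ 5.06. Note, however, that the R-squared is only about 0.024, so this curve explains very little of the variation in payment.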
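The turning point of the fitted parabola can be read off the model object directly. The sketch below refits the same model on a rebuilt copy of the data so that it runs on its own; the names grid, fit and turningPoint are introduced here for illustration only.

```r
# Rebuild the grid in base R so this block runs on its own
grid <- expand.grid(n = seq(1, 20, 1), v = seq(0, 0.9, 0.01))
grid$nv <- grid$n * grid$v
grid$payment <- 9.99201 * grid$n * (1 - grid$v)

# Same quadratic model as above
fit <- lm(payment ~ nv + I(nv^2), data = grid)
b <- coef(fit)

# The vertex of b0 + b1*x + b2*x^2 lies at x = -b1 / (2 * b2)
turningPoint <- unname(-b["nv"] / (2 * b["I(nv^2)"]))

# Roughly 5.06, given the coefficients reported in the summary above
turningPoint
```

Since b2 is negative, the vertex is a maximum: the fitted payment increases for nv below this value and decreases above it.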

Regression Plot for “all” data

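ggplot(df, aes(x = nv, y = payment, color = factor(n))) +
  geom_point(alpha = 0.4, size = 0.7) +
  # aes(group = 1) overrides the grouping implied by color, so a single
  # regression is fitted to all points rather than one per value of n.
  # linewidth requires ggplot2 >= 3.4.0; use size = 1.2 on older versions.
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x + I(x^2),
              color = "black", linewidth = 1.2) +
  labs(
    title = "Quadratic Regression: Payment vs n × v",
    x = "n × v",
    y = "Payment",
    color = "n"
  ) +
  theme_minimal() +
  theme(legend.position = "right")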
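The wide vertical spread of points around the black curve follows from the payment function itself: 9.99201 × n × (1 - v) equals 9.99201 × n - 9.99201 × nv, so payment depends on n as well as on nv. The sketch below is an illustration rather than part of the original analysis. It checks this by adding n as a second predictor, and the names grid and fitBoth are hypothetical.

```r
# Rebuild the grid in base R so this block runs on its own
grid <- expand.grid(n = seq(1, 20, 1), v = seq(0, 0.9, 0.01))
grid$nv <- grid$n * grid$v
grid$payment <- 9.99201 * grid$n * (1 - grid$v)

# payment = 9.99201 * n - 9.99201 * nv, so a linear model in n and nv
# should reproduce the data up to floating-point error
fitBoth <- lm(payment ~ n + nv, data = grid)

# Coefficients: intercept near 0, n near 9.99201, nv near -9.99201
round(coef(fitBoth), 5)

# Largest absolute residual; effectively zero
max(abs(residuals(fitBoth)))
```

Calling summary() on such an exact fit may produce a warning about an essentially perfect fit, which is why the residuals are inspected directly here.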

Regression plot separated by v

# Use a manageable subset without overwriting df.
# round(10 * v) bins v into bands around 0.1, 0.4 and 0.7; the original list
# also contained 10, which never matches because v is at most 0.9.
dfSub <- df %>% filter(round(10 * v) %in% c(1, 4, 7))

ggplot(dfSub, aes(x = nv, y = payment, shape = factor(round(10 * v)), color = factor(n))) +
  geom_point(alpha = 0.4, size = 0.7) +
  # aes(group = 1) ensures that a single regression is fitted.
  # linewidth requires ggplot2 >= 3.4.0; use size = 1.2 on older versions.
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x + I(x^2),
              color = "black", linewidth = 1.2) +
  labs(
    title = "Quadratic Regression: Payment vs n × v (subset of v)",
    x = "n × v",
    y = "Payment",
    color = "n",
    shape = "round(10 × v)"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom") +  # Move legend to bottom for horizontal layout
  guides(
    color = guide_legend(nrow = 1, byrow = TRUE),
    shape = guide_legend(nrow = 1, byrow = TRUE)
  )