Case study 1 - Heliotronics: Estimating an Experience Curve

Heliotronics is a Massachusetts-based company that promotes innovative ideas and products to foster the adoption of utility grid-connected photovoltaic (PV) systems. The firm also develops educational programmes on sustainable energy production for the local population and business prospects.

To help Heliotronics determine a competitive price bid for the canton of Tessin, we need to estimate the manufacturing cost per solar panel for the 400 solar panels to be produced for Switzerland starting in April 2022.

# Reading the data

my_data <- read.csv("Case_Study_1.csv", header=TRUE)

# Exploring the data
head(my_data)
##   number_of_solar_panels manufacturing_cost
## 1                    100             1250.0
## 2                    200             1108.7
## 3                    300             1053.5
## 4                    400             1033.4
## 5                    500              989.8
## 6                    600              999.0
# Rename the columns
names(my_data) <- c("Total", "Cost")
head(my_data)
##   Total   Cost
## 1   100 1250.0
## 2   200 1108.7
## 3   300 1053.5
## 4   400 1033.4
## 5   500  989.8
## 6   600  999.0

QUESTION 1

# Plot the data

library(ggplot2)   # provides ggplot(), geom_point(), theme_minimal() and labs()

p <- ggplot(my_data, aes(x=Total, y=Cost)) +
  geom_point() +
  theme_minimal() +  
  labs(
    title = "Scatter Plot of Production VS Cost",
    x = "Total Number of Solar Panels Produced",
    y = "Average Manufacturing Cost per Solar Panel ($)"
  )

# Display the plot
print(p)

The scatter plot shows that the average manufacturing cost per solar panel (Y) decreases as the total number of solar panels produced (X) increases. This downward trend is the experience curve effect, in which cost reductions are realized through increased production.

The multiplicative learning model Y = AX^b, with a negative experience parameter b, describes exactly this kind of relationship, so it appears to be applicable to this data set.

# Transform the data with the logarithm
my_data$log_Total = log(my_data$Total)
my_data$log_Cost = log(my_data$Cost)

head(my_data)
##   Total   Cost log_Total log_Cost
## 1   100 1250.0  4.605170 7.130899
## 2   200 1108.7  5.298317 7.010943
## 3   300 1053.5  5.703782 6.959873
## 4   400 1033.4  5.991465 6.940610
## 5   500  989.8  6.214608 6.897503
## 6   600  999.0  6.396930 6.906755
# Plotting the transformed scatter plot using ggplot2
p <- ggplot(my_data, aes(x=log_Total, y=log_Cost)) +
  geom_point() +  
  theme_minimal() + 
  labs(
    title = "Log-Log Scatter Plot of Production VS Cost",
    x = "Log of Total Number of Solar Panels Produced",
    y = "Log of Average Manufacturing Cost per Solar Panel ($)"
  )

# Display the plot
print(p)

Transforming both the cumulative production and the average manufacturing cost to a logarithmic scale linearizes the relationship assumed by the multiplicative learning model. The points in the log-log scatter plot follow a more linear trend than the original data, which suggests that the log transformation is appropriate.

This transformation will allow us to apply linear regression techniques to estimate the parameters of the learning curve model effectively.
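For reference, the linearization follows from taking natural logarithms of both sides of the model, with Y the average manufacturing cost and X the cumulative production:

\[
Y = A X^{b} \quad\Longrightarrow\quad \ln Y = \ln A + b \,\ln X
\]

The intercept of a straight line fitted to the log-log data therefore estimates \(\ln A\), and its slope estimates \(b\).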

QUESTION 2

# Linear regression (transformed data)
transformed_model <- lm(log_Cost ~ log_Total, data = my_data)


# Summary of the model
model_summary <- summary(transformed_model)
print(model_summary)
## 
## Call:
## lm(formula = log_Cost ~ log_Total, data = my_data)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.054515 -0.019960 -0.004477  0.021216  0.047893 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.850325   0.054409  144.28  < 2e-16 ***
## log_Total   -0.154991   0.007936  -19.53  1.7e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0299 on 20 degrees of freedom
## Multiple R-squared:  0.9502, Adjusted R-squared:  0.9477 
## F-statistic: 381.4 on 1 and 20 DF,  p-value: 1.701e-14

Model Fit

The model fits the data well, with an R^2 value of 0.9502. Approximately 95% of the variability in the log of the average manufacturing cost is explained by the log of the cumulative production quantity. The adjusted R^2, which takes into account the number of predictors in the model, is 0.9477 and is also very high, indicating a good fit.

Coefficients

# Experience parameter and Cost of the first unit
A = exp(coef(transformed_model)["(Intercept)"])   # cost of the first unit: exp of the intercept
b = coef(transformed_model)["log_Total"]          # experience parameter: slope of the log-log fit

print(paste("The experience parameter (b) is approximately:", b))
## [1] "The experience parameter (b) is approximately: -0.154990618092004"
print(paste("The cost of the first unit (A), adjusted to the start of the period under consideration, is approximately:", A))
## [1] "The cost of the first unit (A), adjusted to the start of the period under consideration, is approximately: 2566.56946481684"

The intercept is estimated to be 7.85. Since we are working with a log-log model, this means that when the log of the total number of solar panels produced is zero (that is, at the first unit), the log of the average manufacturing cost is about 7.85. In original units, the cost of the first unit is therefore \(A = e^{7.850325} \approx 2566.57\) dollars.

The slope coefficient for log_Total, which represents the experience parameter (b), is -0.155. The estimate is highly statistically significant (p-value of about 1.7e-14), which is strong evidence of a relationship between cumulative production and cost.

Learning Rate

The learning rate can be derived from the experience parameter. In the context of the learning curve, the learning rate represents the percentage reduction in cost each time the cumulative production doubles. It can be calculated as follows:

b <- -0.154991

progress_ratio <- 2^b

learning_rate <- 1 - progress_ratio

print(learning_rate)
## [1] 0.101862

A learning rate of approximately 0.101862 indicates that for every doubling of the total number of solar panels produced by Heliotronics, the average manufacturing cost per solar panel decreases by about 10.19%. This reflects the efficiency gains attributable to learning and the experience gained through increased production. It’s a significant reduction, showing that the company becomes more cost-efficient as it scales up production.
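The link between the experience parameter and the learning rate can be made explicit with a short derivation from the model Y = AX^b. Doubling the cumulative production X multiplies the average cost by a constant factor, the progress ratio:

\[
\frac{Y(2X)}{Y(X)} = \frac{A\,(2X)^{b}}{A\,X^{b}} = 2^{b},
\qquad
\text{learning rate} = 1 - 2^{b}
\]

The factor does not depend on X or on A, which is why a single learning rate describes the whole curve.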

QUESTION 3

The cumulative production volume will be 4,600 units when the company starts to produce the 400 units for the canton of Tessin. We therefore estimate the average production cost per solar panel at cumulative volumes of 4,700, 4,800, 4,900 and 5,000 units and compute the mean of these estimates.

log_future_productions = log( c(4700, 4800, 4900, 5000) )   # Transform the future production with log

predicted_log_costs = predict(transformed_model, newdata = data.frame(log_Total = log_future_productions))

predicted_costs = exp(predicted_log_costs)   # Convert the predicted log cost

print(paste("The estimated average manufacturing cost per solar panel above 4,600 units is:",round( mean(predicted_costs),2), "$" ))
## [1] "The estimated average manufacturing cost per solar panel above 4,600 units is: 688.84 $"

Exact production

By comparison, the estimated average manufacturing cost per solar panel at the point where production of the 400 units begins is:

log_future_production = log(4600)   # Transform the future production with log

predicted_log_cost = predict(transformed_model, newdata = data.frame(log_Total = log_future_production))

predicted_cost = exp(predicted_log_cost)   # Convert the predicted log cost

print(paste("The estimated average manufacturing cost per solar panel when the total cumulative production reaches 4600 units is:", round( predicted_cost,2),"$" ))
## [1] "The estimated average manufacturing cost per solar panel when the total cumulative production reaches 4600 units is: 694.48 $"
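As a cross-check, the same predictions can be sketched directly from the power law Y = AX^b, without the fitted model object. The block below assumes the rounded coefficients printed in the regression summary above. The helper name `experience_cost` is hypothetical and not part of the original analysis.

```r
# Coefficients copied (rounded) from the regression summary above
log_A <- 7.850325    # intercept: log of the first-unit cost
b     <- -0.154991   # slope: experience parameter

# Hypothetical helper (not in the original analysis): Y = A * X^b,
# evaluated on the log scale and transformed back
experience_cost <- function(x, log_A, b) {
  exp(log_A + b * log(x))
}

# Cost at the start of the order and mean cost over the four checkpoints
cost_4600 <- experience_cost(4600, log_A, b)
cost_mean <- mean(experience_cost(c(4700, 4800, 4900, 5000), log_A, b))

print(round(c(at_4600 = cost_4600, mean_4700_5000 = cost_mean), 2))
```

Because the coefficients are rounded, the results should agree with the `predict()` values reported above (694.48 and 688.84) only up to rounding.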

The predicted cost per solar panel thus falls over the course of the order, from about $694.48 at 4,600 cumulative units to an average of about $688.84 over the 4,700 to 5,000 unit checkpoints.

QUESTION 4

To calculate a 95% confidence interval for the average manufacturing cost per solar panel for the panels produced for Switzerland:

  1. Set \(\alpha = 0.05\)
  2. Extract the standard error of the intercept from the model
  3. Calculate the lower and upper limits of the confidence interval
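alpha <- 0.05

fit_summary <- summary(transformed_model)   # model summary

df <- fit_summary$df[2]   # residual degrees of freedom

std_error <- fit_summary$coefficients[1, 2]   # standard error of the intercept

t_value <- qt(1 - alpha/2, df, lower.tail = TRUE)   # two-sided critical t value

intercept_inf = coef(transformed_model)[1] - std_error*t_value   # lower limit of the intercept

intercept_sup = coef(transformed_model)[1] + std_error*t_value   # upper limit of the intercept

Estimating the confidence interval from the uncertainty in the intercept only, with the experience parameter held at its point estimate:

cost_inf = exp( coef(transformed_model)[2]*log_future_productions + intercept_inf )

cost_sup = exp( coef(transformed_model)[2]*log_future_productions + intercept_sup )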
print(paste("The estimated confidence interval for the average manufacturing cost per solar panel above 4600 units is: [", round( mean(cost_inf),2),"$", round( mean(cost_sup),2),"$ ]" ) )
## [1] "The estimated confidence interval for the average manufacturing cost per solar panel above 4600 units is: [ 614.93 $ 771.63 $ ]"
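The interval above reflects the uncertainty in the intercept only. One alternative, which is not the approach used in this case study, is to let `predict()` build the interval from both coefficients and their covariance through `interval = "confidence"`. The sketch below is self-contained and therefore fits the model on only the six rows shown by `head(my_data)`, so its numbers will differ from those of the full data set. The names `demo_data`, `demo_model`, `new_points`, `ci_log` and `ci_cost` are illustrative.

```r
# Illustration only: the six rows shown by head(my_data) above,
# not the full Case_Study_1.csv data set
demo_data <- data.frame(
  Total = c(100, 200, 300, 400, 500, 600),
  Cost  = c(1250.0, 1108.7, 1053.5, 1033.4, 989.8, 999.0)
)
demo_data$log_Total <- log(demo_data$Total)
demo_data$log_Cost  <- log(demo_data$Cost)

# Same log-log specification as the model in the case study
demo_model <- lm(log_Cost ~ log_Total, data = demo_data)

# 95% confidence interval for the mean log cost at each future volume
new_points <- data.frame(log_Total = log(c(4700, 4800, 4900, 5000)))
ci_log <- predict(demo_model, newdata = new_points,
                  interval = "confidence", level = 0.95)

# Back-transform to dollars; the columns are fit, lwr and upr
ci_cost <- exp(ci_log)
print(round(ci_cost, 2))
```

With the full data set, the same call on `transformed_model` would give limits that can be compared with the intercept-only interval reported above.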