Martin Rico
2023-08-01
The project described here includes three parts that demonstrate the material covered in Developing Data Products:
Exploratory Analysis
Estimation
Documentation
The project uses the diamonds data frame. The goal is to let the user explore four variables (Carat, Cut, Color, Clarity) and plot them against price using ggplot.
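The user interface code is not shown here. As a minimal sketch, the two selectors read by the plotting code (`input$xaxis` and `input$coloraxis`) could be declared as below. The labels, the default selections, and the object names `variableChoices`, `xAxisInput` and `colorAxisInput` are assumptions made for illustration, not the application's actual definitions.

```r
library(shiny)

# The four diamonds variables the user can choose from (taken from the text above).
variableChoices <- c("carat", "cut", "color", "clarity")

# Hypothetical selectors; only the input ids "xaxis" and "coloraxis"
# are known from the server code, everything else is assumed.
xAxisInput <- selectInput("xaxis", "X axis variable",
                          choices = variableChoices, selected = "carat")
colorAxisInput <- selectInput("coloraxis", "Colour / fill variable",
                              choices = variableChoices, selected = "cut")
```

Both objects would then be placed inside the page layout, for example in a sidebar panel next to `plotOutput("distPlot")`.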
The code used to draw the graphics using ggplot and shiny is:
output$distPlot <- renderPlot({
  plot <- ggplot(data = diamonds)
  if(is.numeric(diamonds[[input$xaxis]])){
    # Numeric x variable: scatter plot coloured by the second variable
    plot <- plot + aes(x=diamonds[[input$xaxis]], y=diamonds$price, colour=diamonds[[input$coloraxis]]) +
      labs(y="Price (US dollars)", x=input$xaxis, colour=input$coloraxis)+
      labs(title = paste("Diamond prices using ",input$xaxis,"and",input$coloraxis))+
      geom_point()
  }else{
    # Categorical x variable: bar chart filled by the second variable
    plot <- plot + aes(x=diamonds[[input$xaxis]], y=diamonds$price, fill=diamonds[[input$coloraxis]]) +
      labs(y="Price (US dollars)", x=input$xaxis, fill=input$coloraxis)+
      labs(title = paste("Diamond prices using ",input$xaxis,"and",input$coloraxis))+
      # theme() centres the title for this plot only; theme_update() changes the global theme
      theme(plot.title = element_text(hjust = 0.5))+
      geom_bar(stat = "identity")
  }
  plot
})
This part estimates the diamond price from the Carat and Cut variables.
The linear regression model was built using the instruction:
##                Estimate Std. Error     t value      Pr(>|t|)
## (Intercept) -2701.37602   15.43108 -175.060759  0.000000e+00
## carat        7871.08213   13.97963  563.039505  0.000000e+00
## cut.L        1239.80045   26.10004   47.501852  0.000000e+00
## cut.Q        -528.59779   23.13239  -22.850983 5.040203e-115
## cut.C         367.90995   20.21416   18.200609  8.496080e-74
## cut^4          74.59427   16.23958    4.593361  4.371486e-06
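The fitting call itself does not appear with the coefficient table. A call consistent with the coefficient names in that table (an intercept, `carat`, and the polynomial contrasts of the ordered factor `cut`) would look like the sketch below. The formula is inferred from the table and is an assumption, not a quotation of the original code; only the name `modelFit` comes from the prediction code.

```r
library(ggplot2)  # provides the diamonds data frame

# Assumed formula: price explained by carat and cut.
# cut is an ordered factor, so lm() encodes it with polynomial
# contrasts (cut.L, cut.Q, cut.C, cut^4).
modelFit <- lm(price ~ carat + cut, data = diamonds)

# Coefficient matrix with estimates, standard errors, t values and p-values
modelCoefficients <- summary(modelFit)$coefficients
modelCoefficients
```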
The estimation is done with the function predict(). The application lets the user change the carat value and see the estimated price.
newDataForPrediction <- data.frame(carat = 3, cut = "Ideal")
predictedPrice <- predict(modelFit, newDataForPrediction)
as.double(predictedPrice)
## [1] 21538.7
For simplicity, the cut value is always fixed at “Ideal”.
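To illustrate how the carat value can vary while the cut stays fixed, the prediction step can be wrapped in a small helper. This is only a sketch: the function name `estimatePrice` is hypothetical, and the model formula is the assumed `price ~ carat + cut` inferred from the coefficient table.

```r
library(ggplot2)  # provides the diamonds data frame

# Assumed model, refitted here so the example is self-contained.
modelFit <- lm(price ~ carat + cut, data = diamonds)

# Hypothetical helper: estimate the price for a given carat,
# keeping cut fixed at "Ideal" as the application does.
estimatePrice <- function(carat) {
  newDataForPrediction <- data.frame(carat = carat, cut = "Ideal")
  as.double(predict(modelFit, newDataForPrediction))
}

estimatePrice(3)          # same inputs as the example above
estimatePrice(c(1, 2))    # several carat values at once
```

In the Shiny server, the argument would come from whichever input widget holds the carat value.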
The documentation is provided by loading two .txt files that describe how to use the application.
The code to load the .txt files with the information is: