---
title: "EDA - Used Cars from eBay Kleinanzeigen"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    social: menu
    source_code: embed
---
```{r setup, include=FALSE}
library(ggplot2)
library(plotly)
library(plyr)
library(flexdashboard)
library(data.table)
library(lubridate)
library(dplyr)
library(gridExtra)
auto <- read.csv("autoCleaned.csv")
```
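The chunks in this dashboard read six columns from `autoCleaned.csv`: `age`, `mileage`, `powerPS`, `vehicleType`, `price` and `sellingTime`. A minimal sketch of a guard that fails early when one of them is missing might look like the following. The helper name `check_columns()` and the toy data frame are assumptions made for illustration; they are not part of the original dashboard.

```r
# Hypothetical helper: stop early if any column the dashboard needs is missing
check_columns <- function(df, required) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Columns referenced by the plotting and correlation chunks
required_cols <- c("age", "mileage", "powerPS",
                   "vehicleType", "price", "sellingTime")

# Toy one-row data frame standing in for `auto` (illustrative values only)
toy <- data.frame(age = 5, mileage = 50000, powerPS = 90,
                  vehicleType = "kombi", price = 4500, sellingTime = 12)

check_columns(toy, required_cols)
```

In the dashboard itself the call would be `check_columns(auto, required_cols)` right after `read.csv()`.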
Frequency Distributions
=======================================================================
Row
-----------------------------------------------------------------------
### Histogram - Age
```{r}
# Map the column by name inside aes() instead of auto$age
ggplot(auto, aes(x = as.integer(age))) +
  geom_histogram(color = 'black', fill = '#F79420') +
  scale_x_continuous(limits = c(0, 35), breaks = seq(0, 35, 2)) +
  labs(x = 'Car Age', y = 'Count', title = 'Car Age Histogram')
```
### Histogram - Mileage
```{r}
# geom_bar() counts listings at each distinct mileage value
ggplot(auto, aes(x = mileage)) +
  geom_bar(color = 'black', fill = '#F79420') +
  scale_x_continuous(limits = c(0, 100000), breaks = seq(0, 100000, 25000)) +
  xlab("Mileage") +
  ylab("Count")
```
Row
-----------------------------------------------------------------------
### Histogram - Engine Power
```{r}
# 15 PS wide bins, restricted to 0-250 PS
ggplot(auto, aes(x = powerPS)) +
  geom_histogram(fill = '#F79420', color = 'black', binwidth = 15) +
  labs(x = 'Engine Power', y = 'Count') +
  ggtitle('Histogram of Engine Power (PowerPS)') +
  scale_x_continuous(limits = c(0, 250), breaks = seq(0, 250, 50))
```
### Bar Chart - Vehicle Type
```{r}
# fill is a fixed colour, so no fill scale is needed
ggplot(auto, aes(x = vehicleType)) +
  geom_bar(fill = '#F79420', color = 'black') +
  labs(x = 'Vehicle Type', y = 'Count') +
  ggtitle('Vehicle Type Frequency Diagram')
```
Price vs. Engine Power
=======================================================================
Column {.tabset}
-----------------------------------------------------------------------
### Price, Engine Power, Type
```{r}
ggplot(data = subset(auto, !is.na(powerPS)), aes(x = powerPS, y = price)) +
  geom_point(alpha = 1/50, color = "#F79420", position = 'jitter') +
  geom_smooth() +
  facet_wrap(~vehicleType) +
  xlab('Engine Power') +
  ylab('Price')
```
### Engine Power by Vehicle Type
```{r}
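# Boxplot per vehicle type; the black point marks the group mean
# (`fun` replaces the deprecated `fun.y` from ggplot2 3.3.0 onward)
ggplot(data = subset(auto, !is.na(powerPS)), aes(x = vehicleType, y = powerPS)) +
  geom_boxplot(alpha = 1/50, color = "#F79420") +
  stat_summary(fun = mean, geom = "point", size = 2) +
  xlab('Vehicle Type') +
  ylab('Engine Power')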
```
Price vs. Age
=======================================================================
Column {.tabset}
-----------------------------------------------------------------------
### Price, Type, Age
```{r}
ggplot(data = subset(auto, !is.na(age)), aes(x = age, y = price)) +
  geom_point(alpha = 1/50, color = "#F79420", position = 'jitter') +
  geom_smooth() +
  facet_wrap(~vehicleType) +
  xlab('Age of cars') +
  ylab('Price')
```
### Age by Vehicle Type
```{r}
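# Boxplot per vehicle type; the black point marks the group mean
# (`fun` replaces the deprecated `fun.y` from ggplot2 3.3.0 onward)
ggplot(data = subset(auto, !is.na(age)), aes(x = vehicleType, y = age)) +
  geom_boxplot(alpha = 1/50, color = "#F79420") +
  stat_summary(fun = mean, geom = "point", size = 2) +
  xlab('Vehicle Type') +
  ylab('Age')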
```
Selling Time vs. Price
=======================================================================
Column {.tabset}
-----------------------------------------------------------------------
### Selling Time vs. Price + Type
```{r}
ggplot(data = subset(auto, !is.na(sellingTime)), aes(x = price, y = sellingTime)) +
  geom_point(alpha = 1/50, color = "#F79420", position = 'jitter') +
  geom_smooth() +
  facet_wrap(~vehicleType) +
  xlab('Price') +
  ylab('Selling Time')
```
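### Correlation Between Price and Selling Time
Not a strong correlation! Pearson's r is about 0.148 (95% confidence interval 0.145 to 0.152). With more than 300,000 listings the p-value is tiny (< 2.2e-16), but price linearly accounts for only about 2% of the variance in selling time.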
```{r}
cor.test(auto$price, auto$sellingTime)
```
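A tiny p-value next to a small coefficient can look contradictory, so the arithmetic is worth spelling out. The sketch below starts from the two numbers reported by the test for this data (`cor` = 0.1484792, `df` = 300660) and recomputes the t statistic, the share of variance explained, and the 95% confidence interval via the Fisher z-transform, which is the method `cor.test()` uses for Pearson correlations. The variable names are chosen for illustration only.

```r
# Values taken from the cor.test() output for price vs. sellingTime
r  <- 0.1484792
df <- 300660

# Share of variance in one variable linearly explained by the other
r_squared <- r^2

# t statistic implied by r and df: t = r * sqrt(df / (1 - r^2))
t_stat <- r * sqrt(df / (1 - r^2))

# 95% confidence interval via the Fisher z-transform
n  <- df + 2                 # number of complete pairs
z  <- atanh(r)               # transform r to the z scale
se <- 1 / sqrt(n - 3)        # standard error on the z scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(c(r_squared = r_squared, t = t_stat, lower = ci[1], upper = ci[2]), 4)
```

The large t value comes almost entirely from the sample size (`sqrt(df)` is about 548), not from the strength of the relationship: `r_squared` stays near 0.02.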