X.4 X.3 X.2 X.1
Min. : 1.0 Min. : 1.0 Min. : 1.0 Min. : 1.0
1st Qu.: 500.8 1st Qu.: 500.8 1st Qu.: 500.8 1st Qu.: 500.8
Median :1000.5 Median :1000.5 Median :1000.5 Median :1000.5
Mean :1000.5 Mean :1000.5 Mean :1000.5 Mean :1000.5
3rd Qu.:1500.2 3rd Qu.:1500.2 3rd Qu.:1500.2 3rd Qu.:1500.2
Max. :2000.0 Max. :2000.0 Max. :2000.0 Max. :2000.0
X ID Marital.Status Gender
Min. : 1.0 Min. :11000 Length:2000 Length:2000
1st Qu.: 500.8 1st Qu.:15291 Class :character Class :character
Median :1000.5 Median :19744 Mode :character Mode :character
Mean :1000.5 Mean :19966
3rd Qu.:1500.2 3rd Qu.:24471
Max. :2000.0 Max. :29447
Income Children Education Occupation
Min. : 10000 Min. :0.000 Length:2000 Length:2000
1st Qu.: 30000 1st Qu.:0.000 Class :character Class :character
Median : 60000 Median :2.000 Mode :character Mode :character
Mean : 56215 Mean :1.901
3rd Qu.: 70000 3rd Qu.:3.000
Max. :170000 Max. :5.000
Home.Owner Cars Commute.Distance Region
Length:2000 Min. :0.000 Length:2000 Length:2000
Class :character 1st Qu.:1.000 Class :character Class :character
Mode :character Median :1.000 Mode :character Mode :character
Mean :1.454
3rd Qu.:2.000
Max. :4.000
Age Purchased.Bike
Min. :25.00 Length:2000
1st Qu.:35.00 Class :character
Median :43.00 Mode :character
Mean :44.18
3rd Qu.:52.00
Max. :89.00
X.4 X.3 X.2 X.1 X ID Marital.Status Gender Income Children Education
1 1 1 1 1 1 12496 Married Female 40000 1 Bachelors
2 2 2 2 2 2 24107 Married Male 30000 3 Partial College
3 3 3 3 3 3 14177 Married Male 80000 5 Partial College
4 4 4 4 4 4 24381 Single Male 70000 0 Bachelors
5 5 5 5 5 5 25597 Single Male 30000 0 Bachelors
6 6 6 6 6 6 13507 Married Female 10000 2 Partial College
Occupation Home.Owner Cars Commute.Distance Region Age Purchased.Bike
1 Skilled Manual Yes 0 0-1 Miles Europe 42 No
2 Clerical Yes 1 0-1 Miles Europe 43 No
3 Professional No 2 2-5 Miles Europe 60 No
4 Professional Yes 1 5-10 Miles Pacific 41 Yes
5 Clerical No 0 0-1 Miles Europe 36 Yes
6 Manual Yes 0 1-2 Miles Europe 50 No
'data.frame': 2000 obs. of 18 variables:
$ X.4 : int 1 2 3 4 5 6 7 8 9 10 ...
$ X.3 : int 1 2 3 4 5 6 7 8 9 10 ...
$ X.2 : int 1 2 3 4 5 6 7 8 9 10 ...
$ X.1 : int 1 2 3 4 5 6 7 8 9 10 ...
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ ID : int 12496 24107 14177 24381 25597 13507 27974 19364 22155 19280 ...
$ Marital.Status : chr "Married" "Married" "Married" "Single" ...
$ Gender : chr "Female" "Male" "Male" "Male" ...
$ Income : num 40000 30000 80000 70000 30000 10000 160000 40000 20000 60000 ...
$ Children : int 1 3 5 0 0 2 2 1 2 2 ...
$ Education : chr "Bachelors" "Partial College" "Partial College" "Bachelors" ...
$ Occupation : chr "Skilled Manual" "Clerical" "Professional" "Professional" ...
$ Home.Owner : chr "Yes" "Yes" "No" "Yes" ...
$ Cars : num 0 1 2 1 0 0 4 0 2 1 ...
$ Commute.Distance: chr "0-1 Miles" "0-1 Miles" "2-5 Miles" "5-10 Miles" ...
$ Region : chr "Europe" "Europe" "Europe" "Pacific" ...
$ Age : int 42 43 60 41 36 50 33 43 58 43 ...
$ Purchased.Bike : chr "No" "No" "No" "Yes" ...
---
title: "Assignment-2"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: scroll
    source_code: embed
    theme: spacelab
    social: menu
---
```{r setup, include=FALSE}
# reshape2 provides melt(); tidyverse loads ggplot2 and dplyr
library(reshape2)
library(tidyverse)
```
## Dataset Description {.tabset}
### Summary of dataset
```{r}
bike_buyers <- read.csv("bike_buyers_clean.csv", header = TRUE, na.strings = "")
summary(bike_buyers)
```
### First Rows of the Dataset
```{r}
head(bike_buyers)
```
### Structure of the Dataset
```{r}
str(bike_buyers)
```
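### Tidying the Columns (Sketch)
The summary and structure output show five columns (`X.4` to `X`) that only count rows, and character columns that `summary()` reports as `Length:2000` with no level counts. The chunk below is a minimal sketch of one way to tidy both. It assumes the `X*` columns are leftover row names from earlier CSV exports, which the data does not confirm. The toy data frame `demo` and the names `is_index` and `tidy` are illustrative only; the rows are copied from the first rows of the dataset.

```{r}
# Toy rows copied from the first rows of the dataset (illustrative only);
# in the dashboard, bike_buyers would take the place of `demo`.
demo <- data.frame(
  X.1 = 1:3,
  X = 1:3,
  ID = c(12496L, 24107L, 14177L),
  Marital.Status = c("Married", "Married", "Married"),
  Gender = c("Female", "Male", "Male"),
  Income = c(40000, 30000, 80000),
  stringsAsFactors = FALSE
)

# Flag columns named X, X.1, X.2, ... (assumed to be redundant row counters)
is_index <- grepl("^X(\\.[0-9]+)?$", names(demo))
tidy <- demo[, !is_index, drop = FALSE]

# Convert character columns to factors so summary() reports counts per level
chr_cols <- vapply(tidy, is.character, logical(1))
tidy[chr_cols] <- lapply(tidy[chr_cols], factor)

summary(tidy)
```

The same two steps could be applied to `bike_buyers` right after `read.csv()`, if the row counters are indeed not needed.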
## Univariate {.tabset}
### Histogram of Income
```{r}
hist(bike_buyers$Income, main = "Histogram of Income", xlab = "Income")
```
### Histogram of Age
```{r}
hist(bike_buyers$Age, main = "Histogram of Age", xlab = "Age")
```
### Density Plot of Income
```{r}
plot(density(bike_buyers$Income), main='Income Density Spread')
```
## Bivariate {.tabset}
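### Bar Plot of Cars by Gender
```{r}
# Rows are car counts; columns are genders
counts <- table(bike_buyers$Cars, bike_buyers$Gender)
barplot(counts,
        main = "Car Ownership by Gender",
        xlab = "Gender",
        ylab = "Number of people",
        legend.text = rownames(counts),
        args.legend = list(title = "Cars"))
```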
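### Scatter Plot of Income by Row Index
```{r}
# With a single vector, plot() draws the values against their row index
plot(bike_buyers$Income, type = "p", xlab = "Row index", ylab = "Income")
```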
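### Scatter Plot of Age vs. Gender
```{r}
# Jitter horizontally so points sharing the same gender and age do not overlap
ggplot(bike_buyers, aes(y = Age, x = Gender)) +
  geom_jitter(width = 0.2, height = 0, alpha = 0.4)
```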
### Scatter Plot of Age vs. Income
```{r}
ggplot(bike_buyers, aes(y = Age, x = Income)) +
  geom_point()
```
## Multivariate {.tabset}
### Scatter Plot with Color Gradient
```{r}
p3 <- ggplot(bike_buyers, aes(x = Age, y = Income)) +
  theme(legend.position = "top", axis.text = element_text(size = 6))
p4 <- p3 + geom_point(aes(color = Age), alpha = 0.5, size = 1.5,
                      position = position_jitter(width = 0.25, height = 0))
# Age is continuous, so the x axis needs a continuous scale labelled "Age"
p4 + scale_x_continuous(name = "Age") +
  scale_color_gradient(name = "Age", low = "blue", high = "red")
```
### Line Plot of Age vs. Occupation
```{r}
# Group by occupation so each line spans the age range of one occupation
p5 <- ggplot(bike_buyers, aes(x = Age, y = Occupation))
p5 + geom_line(aes(color = Age, group = Occupation)) + facet_wrap(~Gender)
```
### Heatmap
```{r}
# Select numeric variables, dropping the row counters (X, X.1, ...) and ID,
# which carry no information about the buyers
numeric_vars <- bike_buyers %>%
  select(where(is.numeric), -matches("^X(\\.[0-9]+)?$"), -ID)
# Compute the correlation matrix
corr_matrix <- cor(numeric_vars, use = "complete.obs")
# Melt the correlation matrix for ggplot
melted_corr_matrix <- melt(corr_matrix)
# Create the heatmap on a diverging scale centred at zero
ggplot(melted_corr_matrix, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(title = "Correlation Heatmap", x = "Variable", y = "Variable", fill = "Correlation")
```
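### Ranked Correlations (Sketch)
A heatmap shows the pattern of correlations but makes exact values hard to read. The chunk below is a minimal sketch of one way to list each variable pair once, sorted by the strength of its correlation. The toy data frame `demo_num` holds only the six rows shown in the first rows of the dataset, so its coefficients say nothing about the full data; `cm`, `idx`, and `pairs_tbl` are illustrative names.

```{r}
# Six rows copied from the first rows of the dataset (illustrative only);
# in the dashboard, numeric_vars would take the place of `demo_num`.
demo_num <- data.frame(
  Income   = c(40000, 30000, 80000, 70000, 30000, 10000),
  Children = c(1, 3, 5, 0, 0, 2),
  Cars     = c(0, 1, 2, 1, 0, 0),
  Age      = c(42, 43, 60, 41, 36, 50)
)

cm <- cor(demo_num)

# Keep the upper triangle so each pair of variables appears exactly once
idx <- which(upper.tri(cm), arr.ind = TRUE)
pairs_tbl <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Strongest relationships first, regardless of sign
pairs_tbl <- pairs_tbl[order(-abs(pairs_tbl$r)), ]
pairs_tbl
```

Sorting by absolute value keeps strong negative correlations near the top, where a sort on the raw coefficient would push them to the bottom.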