GitHub project link: https://github.com/RamanathReddy1/Ramanath-Reddy
Each bike vendor stocks a range of bike models. The models are categorized by usage, frame, price and mode. The objective of this project is to analyze the bike vendor dataset and draw insights on how prices vary across models and across vendors. Visualization tools are used to explore the relationships among the different parameters.
The dataset covers 97 bike models and 30 bike vendors. The value in each vendor column is the fraction of that vendor's total bikes made up by the model in that row. For example, the Bad Habit 1 model accounts for about 1.75% (0.0175) of the bikes available at Albuquerque Cycles at the time of data collection.
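Because each vendor column holds shares of that vendor's stock, every vendor column should add up to 1. The summary output further down is consistent with this: every vendor mean is 0.010309, which is 1/97. Below is a minimal sketch of that sanity check on a toy data frame. The model names, vendor names and values are made up; in the real data the descriptive columns to exclude would be model, category1, category2, frame and price.

```r
# Toy stand-in for bikevendors: three models, two vendors (all values are made up).
toy <- data.frame(
  model = c("Model A", "Model B", "Model C"),
  price = c(3200, 2660, 2770),
  Vendor.One = c(0.50, 0.30, 0.20),
  Vendor.Two = c(0.25, 0.00, 0.75)
)

# Vendor columns are everything except the descriptive columns.
vendor_cols <- setdiff(names(toy), c("model", "price"))

# Each vendor's shares should add up to 1 (allowing for floating-point error).
share_totals <- colSums(toy[, vendor_cols])
all_sum_to_one <- all(abs(share_totals - 1) < 1e-8)

# Express one share as a percentage, rounded to two decimals.
share_pct <- round(toy$Vendor.One[1] * 100, 2)
```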
Download the bikevendors.csv file to your local machine, then load the dataset in RStudio.
bikevendors<-read.csv('F:/THESIS/Tanya/R-Folder/data/bikevendors.csv')
anyDuplicated(bikevendors)
## [1] 0
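The absolute path above only works on the machine where the analysis was run. A more portable load might look like the sketch below, which writes a tiny made-up CSV to a temporary folder so that it runs anywhere; the file name and its rows are hypothetical. Note that since R 4.0.0, read.csv() no longer converts text columns to factors by default, so stringsAsFactors = TRUE is passed explicitly to get the factor columns that the later plot() calls rely on.

```r
# Write a tiny stand-in CSV (made-up rows) to a temporary file.
csv_path <- file.path(tempdir(), "bikevendors_toy.csv")
writeLines(
  c("model,category1,price,Vendor.One",
    "Model A,Mountain,3200,0.6",
    "Model B,Road,2660,0.4"),
  csv_path
)

# Build the path with file.path() and keep text columns as factors.
toy <- read.csv(csv_path, stringsAsFactors = TRUE)

# anyDuplicated() returns 0 when no row is repeated.
n_duplicates <- anyDuplicated(toy)
```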
Exploring raw data to clean it for further use
head(bikevendors)
## model category1 category2 frame price
## 1 Bad Habit 1 Mountain Trail Aluminum 3200
## 2 Bad Habit 2 Mountain Trail Aluminum 2660
## 3 Beast of the East 1 Mountain Trail Aluminum 2770
## 4 Beast of the East 2 Mountain Trail Aluminum 2130
## 5 Beast of the East 3 Mountain Trail Aluminum 1620
## 6 CAAD Disc Ultegra Road Elite Road Aluminum 2660
## Albuquerque.Cycles Ann.Arbor.Speed Austin.Cruisers Cincinnati.Speed
## 1 0.017482517 0.006644518 0.008130081 0.00511509
## 2 0.006993007 0.009966777 0.004065041 0.00000000
## 3 0.010489510 0.014950166 0.008130081 0.00000000
## 4 0.010489510 0.009966777 0.008130081 0.00000000
## 5 0.003496503 0.003322259 0.000000000 0.00000000
## 6 0.013986014 0.026578073 0.020325203 0.01534527
## Columbus.Race.Equipment Dallas.Cycles Denver.Bike.Shop Detroit.Cycles
## 1 0.010152284 0.012820513 0.01173403 0.009920635
## 2 0.000000000 0.017094017 0.01390700 0.015873016
## 3 0.000000000 0.004273504 0.01825293 0.011904762
## 4 0.005076142 0.004273504 0.01521078 0.005952381
## 5 0.002538071 0.004273504 0.01694915 0.011904762
## 6 0.010152284 0.000000000 0.01086484 0.007936508
## Indianapolis.Velocipedes Ithaca.Mountain.Climbers Kansas.City.29ers
## 1 0.006269592 0.01819620 0.01815039
## 2 0.003134796 0.01107595 0.01584558
## 3 0.009404389 0.02136076 0.01815039
## 4 0.009404389 0.01819620 0.01382887
## 5 0.000000000 0.01028481 0.01815039
## 6 0.009404389 0.00000000 0.01065975
## Las.Vegas.Cycles Los.Angeles.Cycles Louisville.Race.Equipment
## 1 0.001602564 0.006289308 0.007594937
## 2 0.000000000 0.009433962 0.000000000
## 3 0.001602564 0.025157233 0.000000000
## 4 0.000000000 0.022012579 0.005063291
## 5 0.003205128 0.000000000 0.005063291
## 6 0.011217949 0.015723270 0.027848101
## Miami.Race.Equipment Minneapolis.Bike.Shop Nashville.Cruisers
## 1 0.004213483 0.01826484 0.00867052
## 2 0.011235955 0.01674277 0.01734104
## 3 0.014044944 0.01674277 0.00867052
## 4 0.008426966 0.00761035 0.00867052
## 5 0.004213483 0.01522070 0.02023121
## 6 0.021067416 0.01826484 0.03757225
## New.Orleans.Velocipedes New.York.Cycles Oklahoma.City.Race.Equipment
## 1 0.018478261 0.007407407 0.012987013
## 2 0.002173913 0.007407407 0.009523810
## 3 0.008695652 0.017283951 0.024242424
## 4 0.009782609 0.017283951 0.008658009
## 5 0.004347826 0.004938272 0.005194805
## 6 0.015217391 0.017283951 0.010389610
## Philadelphia.Bike.Shop Phoenix.Bi.peds Pittsburgh.Mountain.Machines
## 1 0.024489796 0.01127555 0.01591512
## 2 0.004081633 0.01902748 0.00265252
## 3 0.000000000 0.01268499 0.00530504
## 4 0.000000000 0.02325581 0.01061008
## 5 0.020408163 0.01620860 0.00265252
## 6 0.016326531 0.01268499 0.00265252
## Portland.Bi.peds Providence.Bi.peds San.Antonio.Bike.Shop
## 1 0.01086956 0.009225092 0.021505376
## 2 0.01086956 0.023985240 0.000000000
## 3 0.01086956 0.009225092 0.005376344
## 4 0.01552795 0.014760148 0.010752688
## 5 0.02018633 0.007380074 0.032258065
## 6 0.01397515 0.007380074 0.005376344
## San.Francisco.Cruisers Seattle.Race.Equipment Tampa.29ers Wichita.Speed
## 1 0.002673797 0.0156250 0.019417476 0.005917160
## 2 0.002673797 0.0078125 0.000000000 0.000000000
## 3 0.000000000 0.0156250 0.009708738 0.000000000
## 4 0.002673797 0.0234375 0.029126214 0.001972387
## 5 0.000000000 0.0078125 0.009708738 0.000000000
## 6 0.002673797 0.0078125 0.000000000 0.009861933
class(bikevendors)
## [1] "data.frame"
dim(bikevendors)
## [1] 97 35
colnames(bikevendors)
## [1] "model" "category1"
## [3] "category2" "frame"
## [5] "price" "Albuquerque.Cycles"
## [7] "Ann.Arbor.Speed" "Austin.Cruisers"
## [9] "Cincinnati.Speed" "Columbus.Race.Equipment"
## [11] "Dallas.Cycles" "Denver.Bike.Shop"
## [13] "Detroit.Cycles" "Indianapolis.Velocipedes"
## [15] "Ithaca.Mountain.Climbers" "Kansas.City.29ers"
## [17] "Las.Vegas.Cycles" "Los.Angeles.Cycles"
## [19] "Louisville.Race.Equipment" "Miami.Race.Equipment"
## [21] "Minneapolis.Bike.Shop" "Nashville.Cruisers"
## [23] "New.Orleans.Velocipedes" "New.York.Cycles"
## [25] "Oklahoma.City.Race.Equipment" "Philadelphia.Bike.Shop"
## [27] "Phoenix.Bi.peds" "Pittsburgh.Mountain.Machines"
## [29] "Portland.Bi.peds" "Providence.Bi.peds"
## [31] "San.Antonio.Bike.Shop" "San.Francisco.Cruisers"
## [33] "Seattle.Race.Equipment" "Tampa.29ers"
## [35] "Wichita.Speed"
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.4.4
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
#check the structure of bikevendors
glimpse(bikevendors)
## Observations: 97
## Variables: 35
## $ model <fct> Bad Habit 1, Bad Habit 2, Beast o...
## $ category1 <fct> Mountain, Mountain, Mountain, Mou...
## $ category2 <fct> Trail, Trail, Trail, Trail, Trail...
## $ frame <fct> Aluminum, Aluminum, Aluminum, Alu...
## $ price <int> 3200, 2660, 2770, 2130, 1620, 266...
## $ Albuquerque.Cycles <dbl> 0.017482517, 0.006993007, 0.01048...
## $ Ann.Arbor.Speed <dbl> 0.006644518, 0.009966777, 0.01495...
## $ Austin.Cruisers <dbl> 0.008130081, 0.004065041, 0.00813...
## $ Cincinnati.Speed <dbl> 0.005115090, 0.000000000, 0.00000...
## $ Columbus.Race.Equipment <dbl> 0.010152284, 0.000000000, 0.00000...
## $ Dallas.Cycles <dbl> 0.012820513, 0.017094017, 0.00427...
## $ Denver.Bike.Shop <dbl> 0.011734029, 0.013906997, 0.01825...
## $ Detroit.Cycles <dbl> 0.009920635, 0.015873016, 0.01190...
## $ Indianapolis.Velocipedes <dbl> 0.006269592, 0.003134796, 0.00940...
## $ Ithaca.Mountain.Climbers <dbl> 0.018196203, 0.011075949, 0.02136...
## $ Kansas.City.29ers <dbl> 0.018150389, 0.015845578, 0.01815...
## $ Las.Vegas.Cycles <dbl> 0.001602564, 0.000000000, 0.00160...
## $ Los.Angeles.Cycles <dbl> 0.006289308, 0.009433962, 0.02515...
## $ Louisville.Race.Equipment <dbl> 0.007594937, 0.000000000, 0.00000...
## $ Miami.Race.Equipment <dbl> 0.004213483, 0.011235955, 0.01404...
## $ Minneapolis.Bike.Shop <dbl> 0.01826484, 0.01674277, 0.0167427...
## $ Nashville.Cruisers <dbl> 0.008670520, 0.017341040, 0.00867...
## $ New.Orleans.Velocipedes <dbl> 0.018478261, 0.002173913, 0.00869...
## $ New.York.Cycles <dbl> 0.007407407, 0.007407407, 0.01728...
## $ Oklahoma.City.Race.Equipment <dbl> 0.012987013, 0.009523810, 0.02424...
## $ Philadelphia.Bike.Shop <dbl> 0.024489796, 0.004081633, 0.00000...
## $ Phoenix.Bi.peds <dbl> 0.011275546, 0.019027484, 0.01268...
## $ Pittsburgh.Mountain.Machines <dbl> 0.01591512, 0.00265252, 0.0053050...
## $ Portland.Bi.peds <dbl> 0.010869565, 0.010869565, 0.01086...
## $ Providence.Bi.peds <dbl> 0.009225092, 0.023985240, 0.00922...
## $ San.Antonio.Bike.Shop <dbl> 0.021505376, 0.000000000, 0.00537...
## $ San.Francisco.Cruisers <dbl> 0.002673797, 0.002673797, 0.00000...
## $ Seattle.Race.Equipment <dbl> 0.0156250, 0.0078125, 0.0156250, ...
## $ Tampa.29ers <dbl> 0.019417476, 0.000000000, 0.00970...
## $ Wichita.Speed <dbl> 0.005917160, 0.000000000, 0.00000...
summary(bikevendors)
## model category1 category2
## Bad Habit 1 : 1 Mountain:51 Elite Road :21
## Bad Habit 2 : 1 Road :46 Cross Country Race:19
## Beast of the East 1: 1 Endurance Road :16
## Beast of the East 2: 1 Trail :13
## Beast of the East 3: 1 Sport : 9
## CAAD Disc Ultegra : 1 Over Mountain : 8
## (Other) :91 (Other) :11
## frame price Albuquerque.Cycles Ann.Arbor.Speed
## Aluminum:40 Min. : 415 Min. :0.000000 Min. :0.000000
## Carbon :57 1st Qu.: 1950 1st Qu.:0.003497 1st Qu.:0.003322
## Median : 3200 Median :0.006993 Median :0.009967
## Mean : 3954 Mean :0.010309 Mean :0.010309
## 3rd Qu.: 5330 3rd Qu.:0.013986 3rd Qu.:0.014950
## Max. :12790 Max. :0.048951 Max. :0.033223
##
## Austin.Cruisers Cincinnati.Speed Columbus.Race.Equipment
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.004065 1st Qu.:0.002558 1st Qu.:0.005076
## Median :0.008130 Median :0.010230 Median :0.010152
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.016260 3rd Qu.:0.015345 3rd Qu.:0.012690
## Max. :0.052846 Max. :0.033248 Max. :0.038071
##
## Dallas.Cycles Denver.Bike.Shop Detroit.Cycles
## Min. :0.000000 Min. :0.0004346 Min. :0.000000
## 1st Qu.:0.004274 1st Qu.:0.0073881 1st Qu.:0.005952
## Median :0.008547 Median :0.0104302 Median :0.009921
## Mean :0.010309 Mean :0.0103093 Mean :0.010309
## 3rd Qu.:0.012821 3rd Qu.:0.0134724 3rd Qu.:0.013889
## Max. :0.042735 Max. :0.0256410 Max. :0.029762
##
## Indianapolis.Velocipedes Ithaca.Mountain.Climbers Kansas.City.29ers
## Min. :0.000000 Min. :0.000000 Min. :0.0002881
## 1st Qu.:0.003135 1st Qu.:0.002373 1st Qu.:0.0072025
## Median :0.006270 Median :0.010285 Median :0.0100835
## Mean :0.010309 Mean :0.010309 Mean :0.0103093
## 3rd Qu.:0.015674 3rd Qu.:0.016614 3rd Qu.:0.0132527
## Max. :0.050157 Max. :0.027690 Max. :0.0247767
##
## Las.Vegas.Cycles Los.Angeles.Cycles Louisville.Race.Equipment
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.003205 1st Qu.:0.003145 1st Qu.:0.002532
## Median :0.011218 Median :0.009434 Median :0.010127
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.014423 3rd Qu.:0.012579 3rd Qu.:0.015190
## Max. :0.040064 Max. :0.047170 Max. :0.030380
##
## Miami.Race.Equipment Minneapolis.Bike.Shop Nashville.Cruisers
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.002809 1st Qu.:0.006088 1st Qu.:0.002890
## Median :0.009831 Median :0.009132 Median :0.008671
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.016854 3rd Qu.:0.013699 3rd Qu.:0.014451
## Max. :0.032303 Max. :0.030441 Max. :0.040462
##
## New.Orleans.Velocipedes New.York.Cycles Oklahoma.City.Race.Equipment
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.003261 1st Qu.:0.004938 1st Qu.:0.004329
## Median :0.009783 Median :0.009877 Median :0.010390
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.016304 3rd Qu.:0.014815 3rd Qu.:0.015584
## Max. :0.027174 Max. :0.027160 Max. :0.027706
##
## Philadelphia.Bike.Shop Phoenix.Bi.peds Pittsburgh.Mountain.Machines
## Min. :0.000000 Min. :0.0007047 Min. :0.000000
## 1st Qu.:0.004082 1st Qu.:0.0056378 1st Qu.:0.002653
## Median :0.008163 Median :0.0098661 Median :0.007958
## Mean :0.010309 Mean :0.0103093 Mean :0.010309
## 3rd Qu.:0.012245 3rd Qu.:0.0140944 3rd Qu.:0.015915
## Max. :0.057143 Max. :0.0232558 Max. :0.045093
##
## Portland.Bi.peds Providence.Bi.peds San.Antonio.Bike.Shop
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.006211 1st Qu.:0.005535 1st Qu.:0.005376
## Median :0.009317 Median :0.009225 Median :0.010753
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.013975 3rd Qu.:0.012915 3rd Qu.:0.016129
## Max. :0.031056 Max. :0.033210 Max. :0.053763
##
## San.Francisco.Cruisers Seattle.Race.Equipment Tampa.29ers
## Min. :0.000000 Min. :0.000000 Min. :0.000000
## 1st Qu.:0.002674 1st Qu.:0.000000 1st Qu.:0.000000
## Median :0.008021 Median :0.007812 Median :0.004854
## Mean :0.010309 Mean :0.010309 Mean :0.010309
## 3rd Qu.:0.016043 3rd Qu.:0.015625 3rd Qu.:0.014563
## Max. :0.042781 Max. :0.054688 Max. :0.048544
##
## Wichita.Speed
## Min. :0.000000
## 1st Qu.:0.003945
## Median :0.009862
## Mean :0.010309
## 3rd Qu.:0.015779
## Max. :0.047337
##
Answer 1: The median price of the bike models is $3200, the mean price is $3954, and the maximum price is $12790.
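The individual statistics that summary() reports for price can also be computed one at a time, which makes it harder to mix up the mean and the median. In the summary output the mean (3954) sits above the median (3200), which points to a right-skewed price distribution. The sketch below uses only the first six prices printed by head(), so its results differ from the full-sample figures.

```r
# First six prices as printed by head(bikevendors); not the full sample.
price <- c(3200, 2660, 2770, 2130, 1620, 2660)

price_median <- median(price)  # middle value of the sorted prices
price_mean   <- mean(price)    # arithmetic average
price_max    <- max(price)     # most expensive model in this subset
```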
hist(bikevendors$price)
From this histogram, we can conclude that the largest number of bike models in the sample fall in the $2000-$4000 price range.
plot(bikevendors$frame,bikevendors$price)
Answer 2: There are two types of bike frame, Aluminum and Carbon. The box plot gives an overview of the price spread within each frame type. The price spread among aluminum bikes is small compared with the large spread among carbon bikes. In each frame type, one outlier falls outside the range of the other prices.
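The spread that the box plot shows can also be put into numbers. The sketch below computes the interquartile range per frame type and lists the points that a box plot would flag as outliers. The prices are illustrative values only, not the real per-frame prices from the dataset.

```r
# Illustrative prices per frame type (not the real data).
toy <- data.frame(
  frame = factor(rep(c("Aluminum", "Carbon"), each = 4)),
  price = c(415, 705, 1520, 3200, 3200, 4260, 5330, 12790)
)

# Interquartile range of price within each frame type.
spread_by_frame <- tapply(toy$price, toy$frame, IQR)

# Points beyond the box plot whiskers (1.5 x box length), per frame type.
outliers_by_frame <- lapply(
  split(toy$price, toy$frame),
  function(p) boxplot.stats(p)$out
)
```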
plot(bikevendors$category1,bikevendors$price)
Similarly, the price spread is larger for mountain bikes than for road bikes, and both categories show outliers.
bikevendors_new<-subset(bikevendors, bikevendors$category1 == "Mountain"& bikevendors$category2 == "Sport" & bikevendors$frame == "Aluminum")
head(bikevendors_new)
## model category1 category2 frame price Albuquerque.Cycles
## 17 Catalyst 1 Mountain Sport Aluminum 705 0.020979021
## 18 Catalyst 2 Mountain Sport Aluminum 585 0.013986014
## 19 Catalyst 3 Mountain Sport Aluminum 480 0.031468531
## 20 Catalyst 4 Mountain Sport Aluminum 415 0.017482517
## 89 Trail 1 Mountain Sport Aluminum 1520 0.000000000
## 90 Trail 2 Mountain Sport Aluminum 1350 0.003496503
## Ann.Arbor.Speed Austin.Cruisers Cincinnati.Speed
## 17 0.006644518 0.012195122 0.000000000
## 18 0.008305648 0.008130081 0.000000000
## 19 0.006644518 0.004065041 0.000000000
## 20 0.004983389 0.028455285 0.000000000
## 89 0.009966777 0.016260163 0.000000000
## 90 0.014950166 0.016260163 0.002557545
## Columbus.Race.Equipment Dallas.Cycles Denver.Bike.Shop Detroit.Cycles
## 17 0.000000000 0.021367521 0.01694915 0.005952381
## 18 0.002538071 0.012820513 0.02564103 0.007936508
## 19 0.000000000 0.042735043 0.01651456 0.017857143
## 20 0.002538071 0.017094017 0.01999131 0.005952381
## 89 0.002538071 0.017094017 0.02042590 0.009920635
## 90 0.002538071 0.008547009 0.02086049 0.007936508
## Indianapolis.Velocipedes Ithaca.Mountain.Climbers Kansas.City.29ers
## 17 0.012539185 0.007120253 0.01094785
## 18 0.009404389 0.004746835 0.01872659
## 19 0.012539185 0.004746835 0.01901469
## 20 0.025078370 0.005537975 0.01843849
## 89 0.018808777 0.008702532 0.01642178
## 90 0.009404389 0.022151899 0.01584558
## Las.Vegas.Cycles Los.Angeles.Cycles Louisville.Race.Equipment
## 17 0.000000000 0.009433962 0.000000000
## 18 0.000000000 0.015723270 0.000000000
## 19 0.001602564 0.018867925 0.000000000
## 20 0.000000000 0.009433962 0.002531646
## 89 0.001602564 0.015723270 0.000000000
## 90 0.000000000 0.015723270 0.000000000
## Miami.Race.Equipment Minneapolis.Bike.Shop Nashville.Cruisers
## 17 0.016853933 0.02283105 0.008670520
## 18 0.014044944 0.00761035 0.014450867
## 19 0.002808989 0.01522070 0.011560694
## 20 0.011235955 0.00761035 0.014450867
## 89 0.014044944 0.00913242 0.011560694
## 90 0.018258427 0.00608828 0.005780347
## New.Orleans.Velocipedes New.York.Cycles Oklahoma.City.Race.Equipment
## 17 0.006521739 0.012345679 0.006060606
## 18 0.006521739 0.019753086 0.008658009
## 19 0.008695652 0.002469136 0.004329004
## 20 0.018478261 0.004938272 0.005194805
## 89 0.013043478 0.017283951 0.012987013
## 90 0.003260870 0.014814815 0.011255411
## Philadelphia.Bike.Shop Phoenix.Bi.peds Pittsburgh.Mountain.Machines
## 17 0.012244898 0.01127555 0.01326260
## 18 0.012244898 0.01338971 0.00265252
## 19 0.004081633 0.01409443 0.00000000
## 20 0.008163265 0.01268499 0.01061008
## 89 0.012244898 0.01620860 0.00795756
## 90 0.024489796 0.01338971 0.00795756
## Portland.Bi.peds Providence.Bi.peds San.Antonio.Bike.Shop
## 17 0.00931677 0.016605166 0.01612903
## 18 0.01086956 0.005535055 0.01612903
## 19 0.01863354 0.020295203 0.01612903
## 20 0.01086956 0.009225092 0.01075269
## 89 0.01863354 0.009225092 0.03763441
## 90 0.00621118 0.012915129 0.01612903
## San.Francisco.Cruisers Seattle.Race.Equipment Tampa.29ers Wichita.Speed
## 17 0.000000000 0.0000000 0.000000000 0.001972387
## 18 0.000000000 0.0000000 0.014563107 0.011834320
## 19 0.000000000 0.0078125 0.004854369 0.000000000
## 20 0.002673797 0.0000000 0.048543689 0.003944773
## 89 0.000000000 0.0078125 0.000000000 0.000000000
## 90 0.000000000 0.0078125 0.009708738 0.001972387
Here we keep only the bike models that meet certain conditions (Mountain category, Sport sub-category, Aluminum frame) for further analysis.
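Inside subset() the column names can be used directly, without repeating the data frame name in front of each one. A minimal sketch of the same filter on a five-row excerpt follows; the rows are copied from the output shown in this report and the vendor columns are left out for brevity.

```r
# Five rows taken from the printed output; vendor columns omitted.
toy <- data.frame(
  model     = c("Bad Habit 1", "Catalyst 1", "Trail 1",
                "CAAD12 Red", "Synapse Carbon Ultegra 3"),
  category1 = c("Mountain", "Mountain", "Mountain", "Road", "Road"),
  category2 = c("Trail", "Sport", "Sport", "Elite Road", "Endurance Road"),
  frame     = c("Aluminum", "Aluminum", "Aluminum", "Aluminum", "Carbon"),
  price     = c(3200, 705, 1520, 3200, 3200)
)

# Columns are referenced by bare name inside subset().
sport_aluminum <- subset(
  toy,
  category1 == "Mountain" & category2 == "Sport" & frame == "Aluminum"
)
```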
bikeCount<- bikevendors %>% group_by(category2) %>% summarize(count = n()) %>% arrange(count)
## Warning: package 'bindrcpp' was built under R version 3.4.4
bikeCount
## # A tibble: 9 x 2
## category2 count
## <fct> <int>
## 1 Fat Bike 2
## 2 Cyclocross 4
## 3 Triathalon 5
## 4 Over Mountain 8
## 5 Sport 9
## 6 Trail 13
## 7 Endurance Road 16
## 8 Cross Country Race 19
## 9 Elite Road 21
Answer 3: Using the group_by(), summarize() and arrange() functions, we counted the bike models in each category2 group and sorted the counts in ascending order. The sample contains only two Fat Bike models and 21 Elite Road models.
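For readers without dplyr, the same ascending count table can be sketched in base R with table() and sort(). The category2 vector below is rebuilt from the counts printed above rather than read from the CSV, so it is a stand-in for bikevendors$category2.

```r
# Rebuild a category2 vector from the counts shown in the tibble above.
category2 <- rep(
  c("Fat Bike", "Cyclocross", "Triathalon", "Over Mountain", "Sport",
    "Trail", "Endurance Road", "Cross Country Race", "Elite Road"),
  times = c(2, 4, 5, 8, 9, 13, 16, 19, 21)
)

# Count the models per group, then sort the counts in ascending order.
counts <- sort(table(category2))

# Convert to a plain two-column data frame.
bike_count <- data.frame(
  category2 = names(counts),
  count = as.integer(counts)
)
```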
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.4.4
g1<- ggplot(bikevendors, aes(category1, fill= category2))
g1+geom_bar(aes()) + ggtitle("Stacked bar plot Showing The Different bikes")
From this diagram, it is clear that Elite Road and Endurance Road bikes make up most of the Road category. Likewise, Cross Country Race bikes form the largest group in the Mountain category.
d1<- ggplot(bikevendors, aes(category2))
# geom_density() has no width parameter, so it is left out here
d1 + geom_density(aes(fill= category1))+
labs(title="Density graph Showing composition of different bikes")
In this diagram, the bike models are grouped by category1. The sample contains slightly more mountain bikes (51) than road bikes (46).
ggplot(data = bikevendors, aes(category2))+geom_bar()+facet_grid(category1~.) + ggtitle("Facet grid plot Showing The Different bikes")
From this diagram, we can say that Cross Country Race bikes are the largest group in the Mountain category and Elite Road bikes are the largest group in the Road category.
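The counts behind these three plots can also be tabulated directly, which makes the comparison between sub-categories exact instead of visual. The sketch below rebuilds the two category columns from the counts reported by summary() and the count table. The assignment of each sub-category to Mountain or Road is an assumption inferred from the totals (51 and 46), not something read from the CSV.

```r
# Rebuild category1 / category2 from the reported counts (assumed pairing).
category1 <- rep(c("Mountain", "Road"), times = c(51, 46))
category2 <- c(
  rep(c("Cross Country Race", "Trail", "Sport", "Over Mountain", "Fat Bike"),
      times = c(19, 13, 9, 8, 2)),
  rep(c("Elite Road", "Endurance Road", "Triathalon", "Cyclocross"),
      times = c(21, 16, 5, 4))
)

# Cross-tabulate the two columns: one row per category1.
counts <- table(category1, category2)

# Share of each sub-category within its category1 row.
within_category1 <- round(prop.table(counts, margin = 1), 2)
```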
library(rgdal)
## Warning: package 'rgdal' was built under R version 3.4.4
## Loading required package: sp
## Warning: package 'sp' was built under R version 3.4.4
## rgdal: version: 1.3-3, (SVN revision 759)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.2.3, released 2017/11/20
## Path to GDAL shared files: C:/Users/Ravindra/Documents/R/win-library/3.4/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: C:/Users/Ravindra/Documents/R/win-library/3.4/rgdal/proj
## Linking to sp version: 1.3-1
library(leaflet)
## Warning: package 'leaflet' was built under R version 3.4.4
library(dplyr)
library(mapview)
## Warning: package 'mapview' was built under R version 3.4.4
##
## Attaching package: 'mapview'
## The following object is masked from 'package:leaflet':
##
## addMapPane
plot(bikevendors$model,bikevendors$price, main = "Price variation across models")
We plotted the price of each bike model and found that the SuperX and Synapse models show the lowest variation in price compared with the other models.
bikecompetition<-bikevendors
bikecompetition[bikecompetition==0]<- NA
bikecompete<-bikecompetition[complete.cases(bikecompetition),]
head(bikecompete)
## model category1 category2 frame price
## 1 Bad Habit 1 Mountain Trail Aluminum 3200
## 11 CAAD12 Red Road Elite Road Aluminum 3200
## 64 Supersix Evo Hi-Mod Utegra Road Elite Road Carbon 4260
## 78 Synapse Carbon Ultegra 3 Road Endurance Road Carbon 3200
## Albuquerque.Cycles Ann.Arbor.Speed Austin.Cruisers Cincinnati.Speed
## 1 0.017482517 0.006644518 0.008130081 0.00511509
## 11 0.013986014 0.026578073 0.016260163 0.02813299
## 64 0.006993007 0.008305648 0.020325203 0.02301790
## 78 0.010489510 0.008305648 0.020325203 0.01278772
## Columbus.Race.Equipment Dallas.Cycles Denver.Bike.Shop Detroit.Cycles
## 1 0.010152284 0.012820513 0.011734029 0.009920635
## 11 0.017766497 0.029914530 0.009561060 0.019841270
## 64 0.012690355 0.008547009 0.000869187 0.001984127
## 78 0.002538071 0.021367521 0.007822686 0.009920635
## Indianapolis.Velocipedes Ithaca.Mountain.Climbers Kansas.City.29ers
## 1 0.006269592 0.018196203 0.018150389
## 11 0.050156740 0.013449367 0.007202535
## 64 0.015673981 0.005537975 0.000288101
## 78 0.006269592 0.007911392 0.008066840
## Las.Vegas.Cycles Los.Angeles.Cycles Louisville.Race.Equipment
## 1 0.001602564 0.006289308 0.007594937
## 11 0.006410256 0.009433962 0.020253165
## 64 0.012820513 0.003144654 0.022784810
## 78 0.012820513 0.015723270 0.025316456
## Miami.Race.Equipment Minneapolis.Bike.Shop Nashville.Cruisers
## 1 0.004213483 0.01826484 0.008670520
## 11 0.014044944 0.01674277 0.023121387
## 64 0.009831461 0.00761035 0.005780347
## 78 0.005617978 0.02435312 0.020231214
## New.Orleans.Velocipedes New.York.Cycles Oklahoma.City.Race.Equipment
## 1 0.018478261 0.007407407 0.01298701
## 11 0.021739130 0.009876543 0.01558442
## 64 0.008695652 0.002469136 0.02164502
## 78 0.014130435 0.009876543 0.02164502
## Philadelphia.Bike.Shop Phoenix.Bi.peds Pittsburgh.Mountain.Machines
## 1 0.024489796 0.011275546 0.01591512
## 11 0.004081633 0.009866103 0.01061008
## 64 0.008163265 0.009866103 0.01591512
## 78 0.012244898 0.020436927 0.00530504
## Portland.Bi.peds Providence.Bi.peds San.Antonio.Bike.Shop
## 1 0.010869565 0.009225092 0.02150538
## 11 0.007763975 0.009225092 0.01075269
## 64 0.004658385 0.001845018 0.01075269
## 78 0.010869565 0.014760148 0.01075269
## San.Francisco.Cruisers Seattle.Race.Equipment Tampa.29ers Wichita.Speed
## 1 0.002673797 0.0156250 0.019417476 0.00591716
## 11 0.042780749 0.0156250 0.009708738 0.01183432
## 64 0.016042781 0.0078125 0.004854369 0.03155819
## 78 0.013368984 0.0390625 0.019417476 0.01775148
When a vendor's share for a model is zero, I assumed that the model is not available for purchase from that vendor. From this analysis, only four bike models (Bad Habit 1, CAAD12 Red, Supersix Evo Hi-Mod Utegra and Synapse Carbon Ultegra 3) are available from or sold by all 30 bike vendors.
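Replacing every zero in the data frame with NA also touches the non-vendor columns, so a variant that looks only at the vendor columns may be safer. The sketch below keeps the rows with no zero share at any vendor; the data frame and the vendor names are made up for illustration.

```r
# Toy stand-in: three models, two vendors (all values are made up).
toy <- data.frame(
  model = c("Model A", "Model B", "Model C"),
  price = c(3200, 2660, 4260),
  Vendor.One = c(0.5, 0.0, 0.5),
  Vendor.Two = c(0.2, 0.3, 0.5)
)
vendor_cols <- c("Vendor.One", "Vendor.Two")

# Count the zero shares per row, restricted to the vendor columns.
zero_count <- rowSums(toy[, vendor_cols] == 0)

# Keep the models that every vendor stocks (no zero share anywhere).
stocked_everywhere <- toy[zero_count == 0, ]
```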
apply(bikevendors, 2, function(x) max(x, na.rm = TRUE))
## model category1
## "Trigger Carbon 4" "Road"
## category2 frame
## "Triathalon" "Carbon"
## price Albuquerque.Cycles
## "12790" "0.048951049"
## Ann.Arbor.Speed Austin.Cruisers
## "0.033222591" "0.052845528"
## Cincinnati.Speed Columbus.Race.Equipment
## "0.033248082" "0.038071066"
## Dallas.Cycles Denver.Bike.Shop
## "0.042735043" "0.025641026"
## Detroit.Cycles Indianapolis.Velocipedes
## "0.029761905" "0.050156740"
## Ithaca.Mountain.Climbers Kansas.City.29ers
## "0.027689873" "0.024776721"
## Las.Vegas.Cycles Los.Angeles.Cycles
## "0.040064103" "0.047169811"
## Louisville.Race.Equipment Miami.Race.Equipment
## "0.030379747" "0.032303371"
## Minneapolis.Bike.Shop Nashville.Cruisers
## "0.03044140" "0.040462428"
## New.Orleans.Velocipedes New.York.Cycles
## "0.027173913" "0.027160494"
## Oklahoma.City.Race.Equipment Philadelphia.Bike.Shop
## "0.027705628" "0.057142857"
## Phoenix.Bi.peds Pittsburgh.Mountain.Machines
## "0.023255814" "0.04509284"
## Portland.Bi.peds Providence.Bi.peds
## "0.031055901" "0.033210332"
## San.Antonio.Bike.Shop San.Francisco.Cruisers
## "0.053763441" "0.042780749"
## Seattle.Race.Equipment Tampa.29ers
## "0.0546875" "0.048543689"
## Wichita.Speed
## "0.047337278"
Tampa.29ersmodels<-subset(bikevendors, bikevendors$Tampa.29ers== max(bikevendors$Tampa.29ers,na.rm = TRUE))
Tampa.29ersmodels$model
## [1] Catalyst 4 Jekyll Carbon 4
## 97 Levels: Bad Habit 1 Bad Habit 2 ... Trigger Carbon 4
From this we can see that two bike models (Catalyst 4 and Jekyll Carbon 4) each have a 4.85% share of the total bikes at Tampa 29ers.
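The apply() call above converts the whole data frame to a character matrix because it contains text columns, which is why every maximum is printed as a quoted string. A sketch that restricts the calculation to the numeric vendor columns, and reports every model tied for the largest share, could look like this. The toy data frame and its vendor names are hypothetical.

```r
# Toy stand-in: three models, two vendors (all values are made up).
toy <- data.frame(
  model = c("Model A", "Model B", "Model C"),
  price = c(415, 3200, 12790),
  Vendor.One = c(0.5, 0.1, 0.4),
  Vendor.Two = c(0.4, 0.4, 0.2)
)

# Vendor columns: numeric columns other than price.
vendor_cols <- names(toy)[sapply(toy, is.numeric) & names(toy) != "price"]

# Largest share per vendor, kept as numbers rather than strings.
max_share <- sapply(toy[vendor_cols], max)

# Model(s) holding the largest share at each vendor, ties included.
top_models <- lapply(
  toy[vendor_cols],
  function(share) as.character(toy$model[share == max(share)])
)
```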
a<-bikevendors %>% filter(price>3000)
b<-bikevendors %>% filter(category1=="Road")
c<-bikevendors %>% filter(model=="Catalyst 1")
d<-bikevendors %>% filter(category2=="Sport")
pie(bikevendors$Albuquerque.Cycles, main = "Bike Model share")
From this we can see that models #10 and #49 have the largest shares of the inventory at Albuquerque Cycles.
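A pie chart with 97 slices is hard to read, so the largest shares can also be found by sorting. The sketch below ranks a named vector of shares and keeps the top two; the model names and values are hypothetical stand-ins for one vendor column.

```r
# Hypothetical shares for one vendor, named by model.
shares <- c("Model A" = 0.10, "Model B" = 0.45,
            "Model C" = 0.05, "Model D" = 0.40)

# Sort from largest to smallest and keep the two biggest shares.
top_two <- head(sort(shares, decreasing = TRUE), 2)
```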
cor.test(bikevendors$Albuquerque.Cycles,bikevendors$Austin.Cruisers)
##
## Pearson's product-moment correlation
##
## data: bikevendors$Albuquerque.Cycles and bikevendors$Austin.Cruisers
## t = 1.1915, df = 95, p-value = 0.2364
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08003616 0.31321123
## sample estimates:
## cor
## 0.1213462
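In the output above the estimated correlation is about 0.12, the p-value is 0.2364 and the 95% confidence interval spans zero, so the test gives no evidence of a linear association between the model shares at these two vendors. The pieces of a cor.test() result can be pulled out by name for use in later code, as in the sketch below. The two vectors are made-up numbers, not vendor columns.

```r
# Two short made-up vectors standing in for two vendor columns.
x <- c(0.10, 0.20, 0.15, 0.30, 0.25)
y <- c(0.12, 0.18, 0.20, 0.28, 0.22)

result <- cor.test(x, y)

r       <- unname(result$estimate)  # Pearson correlation coefficient
p_value <- result$p.value           # p-value of the two-sided test
ci      <- result$conf.int          # 95 percent confidence interval

# Conventional 5% threshold; an assumption, not a rule from this report.
significant <- p_value < 0.05
```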