Overview:
This data processing report is part of a series of tasks for my coursework at Full Stack Academy - Algoritma Data Science Education, Jakarta, Indonesia. This second task practises data visualization competencies, mostly with commands from the dplyr and ggplot2 libraries. The data can be downloaded from Kaggle.com.
Data source: https://www.kaggle.com/stephenofarrell/cost-of-living
Business Objective
This report aims to explore the preliminary feasibility of setting up a budget accommodation business (capsule apartment/hostel) in some of the most expensive cities, as measured by cost of living. Some indicators that will be analyzed are:
Indicators outside the 5 indicators above are still shown briefly to illustrate the pre-processing and exploratory steps, but they are not presented as graphs.
Lastly, some of the summaries are explained in short sentences to emphasize the report's findings.
01. Inspect data using glimpse
## Observations: 55
## Variables: 161
## $ X <fct> "Meal, Inexpensive Restaurant", ...
## $ Saint.Petersburg..Russia <dbl> 7.34, 29.35, 4.40, 2.20, 2.20, 0...
## $ Istanbul..Turkey <dbl> 4.58, 15.28, 3.82, 3.06, 3.06, 0...
## $ Izmir..Turkey <dbl> 3.06, 12.22, 3.06, 2.29, 2.75, 0...
## $ Helsinki..Finland <dbl> 12.00, 65.00, 8.00, 6.50, 6.75, ...
## $ Chisinau..Moldova <dbl> 4.67, 20.74, 4.15, 1.04, 1.43, 0...
## $ Milan..Italy <dbl> 15.00, 60.00, 8.00, 5.00, 5.00, ...
## $ Cairo..Egypt <dbl> 3.38, 17.48, 4.51, 1.69, 2.82, 0...
## $ Banja.Luka..Bosnia.And.Herzegovina <dbl> 3.58, 22.99, 3.58, 1.02, 1.53, 1...
## $ Baku..Azerbaijan <dbl> 5.27, 23.73, 4.22, 0.84, 2.11, 0...
## $ Guadalajara..Mexico <dbl> 5.25, 23.86, 4.25, 1.43, 2.39, 0...
## $ Kathmandu..Nepal <dbl> 1.99, 11.92, 5.56, 2.38, 3.18, 0...
## $ Hanoi..Vietnam <dbl> 1.94, 15.52, 3.88, 0.78, 1.55, 0...
## $ Ho.Chi.Minh.City..Vietnam <dbl> 1.94, 17.50, 3.87, 0.78, 1.36, 0...
## $ Mexico.City..Mexico <dbl> 4.77, 23.86, 4.77, 1.91, 2.86, 0...
## $ Rome..Italy <dbl> 15.00, 55.00, 8.00, 5.00, 4.00, ...
## $ Monterrey..Mexico <dbl> 5.75, 23.94, 4.31, 1.68, 2.39, 0...
## $ Yekaterinburg..Russia <dbl> 5.88, 29.42, 4.12, 1.47, 2.94, 0...
## $ Sarajevo..Bosnia.And.Herzegovina <dbl> 3.57, 20.42, 4.08, 1.79, 1.79, 1...
## $ Kharkiv..Ukraine <dbl> 4.50, 18.76, 3.75, 0.88, 1.29, 0...
## $ Kiev..Ukraine <dbl> 5.63, 22.51, 3.75, 1.13, 1.80, 0...
## $ Calgary..Canada <dbl> 13.75, 42.97, 6.87, 4.81, 5.50, ...
## $ Tunis..Tunisia <dbl> 1.92, 12.78, 3.19, 1.60, 1.60, 0...
## $ Edmonton..Canada <dbl> 13.75, 48.12, 6.53, 4.12, 4.12, ...
## $ Amsterdam..Netherlands <dbl> 15.00, 65.00, 8.00, 5.00, 4.00, ...
## $ Belgrade..Serbia <dbl> 5.39, 24.62, 5.09, 1.70, 1.95, 1...
## $ Odessa..Ukraine <dbl> 5.65, 20.67, 3.67, 0.94, 1.20, 0...
## $ Paris..France <dbl> 15.00, 50.00, 8.70, 7.00, 6.00, ...
## $ Eindhoven..Netherlands <dbl> 15.00, 56.50, 8.00, 5.00, 3.00, ...
## $ Plovdiv..Bulgaria <dbl> 5.10, 20.42, 4.59, 1.28, 1.53, 0...
## $ Thessaloniki..Greece <dbl> 10.00, 35.00, 6.50, 3.50, 4.50, ...
## $ Ottawa..Canada <dbl> 11.00, 55.00, 6.87, 4.81, 4.81, ...
## $ Sofia..Bulgaria <dbl> 6.14, 25.57, 4.60, 1.53, 1.97, 0...
## $ Rotterdam..Netherlands <dbl> 13.00, 50.00, 8.00, 4.00, 4.00, ...
## $ Varna..Bulgaria <dbl> 6.12, 20.42, 4.08, 1.02, 1.53, 0...
## $ Novi.Sad..Serbia <dbl> 4.84, 17.85, 4.42, 1.50, 1.80, 1...
## $ Utrecht..Netherlands <dbl> 15.00, 65.00, 8.00, 5.00, 4.50, ...
## $ Berlin..Germany <dbl> 8.50, 40.00, 7.00, 3.50, 3.10, 2...
## $ Beirut..Lebanon <dbl> 8.98, 53.91, 7.19, 3.22, 4.17, 0...
## $ Austin..TX..United.States <dbl> 13.48, 44.92, 7.19, 4.49, 5.39, ...
## $ Singapore..Singapore <dbl> 8.33, 39.98, 5.33, 6.66, 8.00, 1...
## $ Toronto..Canada <dbl> 13.75, 55.00, 7.56, 5.16, 5.50, ...
## $ Auckland..New.Zealand <dbl> 11.95, 50.78, 6.57, 5.68, 5.38, ...
## $ Podgorica..Montenegro <dbl> 5.00, 25.00, 3.00, 1.50, 1.90, 1...
## $ Vancouver..Canada <dbl> 11.69, 55.00, 6.87, 4.81, 4.81, ...
## $ Tokyo..Japan <dbl> 8.19, 40.97, 5.70, 4.10, 4.92, 1...
## $ Victoria..Canada <dbl> 12.05, 49.93, 6.89, 4.82, 4.82, ...
## $ Winnipeg..Canada <dbl> 10.31, 41.25, 6.80, 4.12, 4.81, ...
## $ Boston..MA..United.States <dbl> 13.47, 62.85, 7.18, 6.29, 6.73, ...
## $ Chicago..IL..United.States <dbl> 13.47, 53.87, 7.18, 4.94, 6.29, ...
## $ Almaty..Kazakhstan <dbl> 4.77, 23.84, 3.81, 1.07, 1.43, 0...
## $ Oslo..Norway <dbl> 18.70, 80.88, 11.12, 9.00, 9.10,...
## $ Frankfurt..Germany <dbl> 10.00, 50.00, 8.00, 4.00, 3.50, ...
## $ Bratislava..Slovakia <dbl> 6.00, 30.00, 6.00, 2.00, 2.00, 1...
## $ Dallas..TX..United.States <dbl> 13.48, 44.92, 6.51, 4.49, 5.39, ...
## $ Zagreb..Croatia <dbl> 6.72, 33.58, 5.37, 2.01, 2.42, 1...
## $ Hamburg..Germany <dbl> 11.00, 50.00, 8.00, 4.00, 3.50, ...
## $ Krakow..Cracow...Poland <dbl> 5.91, 23.62, 4.72, 2.13, 2.36, 1...
## $ Riga..Latvia <dbl> 7.00, 40.00, 5.00, 2.90, 2.20, 1...
## $ Gdansk..Poland <dbl> 5.91, 23.62, 4.25, 2.13, 2.36, 1...
## $ Santiago..Chile <dbl> 6.94, 34.68, 5.20, 2.95, 3.47, 0...
## $ Nairobi..Kenya <dbl> 4.43, 26.58, 6.20, 2.04, 2.66, 0...
## $ Abu.Dhabi..United.Arab.Emirates <dbl> 6.12, 48.92, 6.24, 9.78, 9.78, 0...
## $ Houston..TX..United.States <dbl> 13.48, 53.91, 6.74, 5.39, 5.84, ...
## $ Tbilisi..Georgia <dbl> 4.67, 18.69, 4.05, 0.93, 1.25, 0...
## $ Dubai..United.Arab.Emirates <dbl> 7.33, 48.89, 6.11, 11.00, 11.00,...
## $ Bogota..Colombia <dbl> 3.30, 19.24, 4.40, 0.96, 2.20, 0...
## $ Brno..Czech.Republic <dbl> 5.95, 25.19, 5.16, 1.39, 1.59, 1...
## $ Munich..Germany <dbl> 12.00, 60.00, 8.00, 3.80, 3.90, ...
## $ Poznan..Poland <dbl> 5.90, 23.61, 4.72, 2.01, 2.36, 0...
## $ Las.Vegas..NV..United.States <dbl> 13.47, 53.87, 7.18, 5.39, 6.29, ...
## $ London..United.Kingdom <dbl> 17.49, 64.15, 7.00, 5.83, 5.25, ...
## $ Los.Angeles..CA..United.States <dbl> 13.47, 58.36, 7.18, 5.39, 6.29, ...
## $ Panama.City..Panama <dbl> 8.98, 35.94, 6.06, 2.58, 3.59, 1...
## $ Seoul..South.Korea <dbl> 6.42, 31.11, 5.06, 3.11, 4.67, 1...
## $ Warsaw..Poland <dbl> 5.91, 28.35, 4.72, 2.36, 2.36, 0...
## $ Prague..Czech.Republic <dbl> 5.95, 29.75, 5.55, 1.59, 1.98, 1...
## $ Wroclaw..Poland <dbl> 5.91, 25.99, 4.72, 1.89, 2.36, 0...
## $ Kuala.Lumpur..Malaysia <dbl> 2.87, 16.60, 3.32, 3.32, 4.42, 0...
## $ New.York..NY..United.States <dbl> 17.97, 76.37, 8.09, 6.29, 7.19, ...
## $ Copenhagen..Denmark <dbl> 17.40, 80.29, 10.04, 6.69, 5.35,...
## $ Ljubljana..Slovenia <dbl> 8.00, 35.00, 5.00, 2.60, 3.40, 2...
## $ Chandigarh..India <dbl> 2.54, 10.16, 3.30, 1.90, 3.17, 0...
## $ Colombo..Sri.Lanka <dbl> 1.49, 14.90, 4.22, 1.49, 2.48, 0...
## $ Noida..India <dbl> 3.80, 12.66, 3.16, 1.39, 2.53, 0...
## $ Kaunas..Lithuania <dbl> 6.00, 30.00, 5.00, 2.70, 3.00, 1...
## $ Athens..Greece <dbl> 10.00, 40.00, 6.00, 4.00, 4.00, ...
## $ Phoenix..AZ..United.States <dbl> 10.77, 53.87, 6.73, 3.59, 4.49, ...
## $ Hong.Kong..Hong.Kong <dbl> 5.78, 46.26, 4.05, 5.78, 5.78, 1...
## $ Portland..OR..United.States <dbl> 12.58, 44.92, 6.74, 4.49, 4.49, ...
## $ Lisbon..Portugal <dbl> 8.50, 35.00, 6.00, 2.00, 2.50, 1...
## $ Beijing..China <dbl> 3.91, 26.04, 4.56, 1.95, 3.06, 0...
## $ Cape.Town..South.Africa <dbl> 8.91, 35.81, 3.74, 2.18, 2.49, 0...
## $ Tirana..Albania <dbl> 4.10, 24.62, 4.76, 1.27, 2.00, 1...
## $ Porto..Portugal <dbl> 7.00, 30.00, 6.00, 2.00, 2.50, 1...
## $ Durban..South.Africa <dbl> 7.47, 25.53, 3.74, 1.62, 1.68, 0...
## $ Budapest..Hungary <dbl> 5.99, 29.96, 4.79, 1.50, 1.80, 1...
## $ Vilnius..Lithuania <dbl> 7.00, 35.00, 5.00, 3.00, 3.00, 1...
## $ Johannesburg..South.Africa <dbl> 8.47, 31.37, 3.45, 1.73, 2.20, 0...
## $ Barcelona..Spain <dbl> 11.00, 40.00, 7.00, 2.50, 3.00, ...
## $ San.Diego..CA..United.States <dbl> 13.48, 53.91, 6.74, 5.39, 6.29, ...
## $ San.Francisco..CA..United.States <dbl> 15.95, 71.88, 8.98, 6.29, 7.19, ...
## $ Lima..Peru <dbl> 3.26, 21.71, 4.21, 1.90, 2.71, 0...
## $ Seattle..WA..United.States <dbl> 13.48, 62.89, 7.19, 5.39, 5.84, ...
## $ Brasov..Romania <dbl> 6.27, 20.91, 4.18, 1.25, 1.88, 1...
## $ Bucharest..Romania <dbl> 6.28, 27.21, 4.19, 1.78, 2.09, 1...
## $ Tashkent..Uzbekistan <dbl> 3.59, 15.72, 2.70, 0.90, 1.80, 0...
## $ Ahmedabad..India <dbl> 1.90, 8.86, 3.16, 1.90, 3.80, 0....
## $ Cluj.Napoca..Romania <dbl> 5.23, 23.00, 4.18, 1.57, 2.09, 1...
## $ Madrid..Spain <dbl> 12.00, 40.00, 8.00, 3.00, 3.00, ...
## $ Tallinn..Estonia <dbl> 8.00, 45.00, 6.00, 4.00, 4.00, 1...
## $ Bangalore..India <dbl> 2.53, 12.66, 3.16, 1.90, 3.80, 0...
## $ Iasi..Romania <dbl> 5.23, 20.93, 4.19, 1.26, 1.88, 1...
## $ Chennai..India <dbl> 1.52, 7.59, 3.35, 1.90, 3.16, 0....
## $ Doha..Qatar <dbl> 6.18, 49.47, 6.18, 11.13, 11.13,...
## $ Delhi..India <dbl> 3.80, 15.19, 3.16, 1.90, 3.10, 0...
## $ Gurgaon..India <dbl> 3.80, 12.66, 3.16, 1.96, 3.16, 0...
## $ Valencia..Spain <dbl> 8.75, 35.00, 7.00, 2.00, 2.50, 1...
## $ Vienna..Austria <dbl> 10.00, 45.00, 7.25, 4.00, 4.00, ...
## $ Hyderabad..India <dbl> 1.90, 8.86, 3.80, 1.52, 2.53, 0....
## $ Montevideo..Uruguay <dbl> 8.41, 31.23, 6.96, 1.92, 2.40, 1...
## $ Tel.Aviv.Yafo..Israel <dbl> 15.54, 64.75, 12.95, 7.77, 7.77,...
## $ Timisoara..Romania <dbl> 5.23, 20.91, 4.18, 1.25, 1.67, 0...
## $ Taipei..Taiwan <dbl> 3.61, 24.04, 3.91, 1.77, 2.10, 0...
## $ Kolkata..India <dbl> 2.53, 10.76, 3.79, 1.87, 3.48, 0...
## $ Skopje..Macedonia <dbl> 4.87, 16.25, 3.25, 1.62, 1.95, 1...
## $ Shanghai..China <dbl> 4.56, 26.04, 4.56, 1.30, 2.93, 0...
## $ Bangkok..Thailand <dbl> 1.78, 23.74, 5.04, 2.37, 4.45, 0...
## $ Mumbai..India <dbl> 3.80, 15.19, 3.80, 2.28, 3.92, 0...
## $ Reykjavik..Iceland <dbl> 17.96, 109.16, 11.64, 8.73, 7.28...
## $ Amman..Jordan <dbl> 6.34, 38.02, 6.34, 6.34, 6.34, 0...
## $ Pune..India <dbl> 3.16, 12.66, 3.16, 2.28, 3.54, 0...
## $ Stockholm..Sweden <dbl> 11.40, 66.47, 7.60, 6.55, 6.65, ...
## $ Buenos.Aires..Argentina <dbl> 5.99, 20.97, 4.94, 1.50, 1.95, 0...
## $ Minsk..Belarus <dbl> 8.93, 27.95, 4.49, 1.80, 2.70, 0...
## $ San.Jose..Costa.Rica <dbl> 6.29, 42.47, 6.29, 2.36, 3.81, 1...
## $ Casablanca..Morocco <dbl> 4.45, 20.63, 5.16, 2.58, 2.81, 0...
## $ Lodz..Poland <dbl> 4.72, 23.61, 4.25, 1.77, 1.65, 0...
## $ Montreal..Canada <dbl> 10.31, 44.69, 6.87, 4.81, 5.50, ...
## $ Sao.Paulo..Brazil <dbl> 6.58, 28.51, 5.92, 1.97, 3.29, 1...
## $ Gothenburg..Sweden <dbl> 10.45, 75.97, 7.60, 6.31, 6.17, ...
## $ Dublin..Ireland <dbl> 15.00, 60.00, 8.00, 5.80, 5.50, ...
## $ Moscow..Russia <dbl> 8.80, 36.69, 4.40, 2.93, 2.57, 0...
## $ Santo.Domingo..Dominican.Republic <dbl> 5.10, 34.01, 5.10, 1.70, 2.55, 0...
## $ Adelaide..Australia <dbl> 11.19, 40.39, 6.84, 4.97, 5.59, ...
## $ Zurich..Switzerland <dbl> 23.12, 92.46, 12.94, 6.24, 5.55,...
## $ Yerevan..Armenia <dbl> 4.70, 22.54, 4.60, 1.13, 1.69, 0...
## $ Manila..Philippines <dbl> 3.56, 17.82, 2.67, 1.25, 2.32, 0...
## $ Brisbane..Australia <dbl> 12.38, 49.54, 7.43, 4.95, 5.57, ...
## $ Jakarta..Indonesia <dbl> 2.62, 16.40, 3.28, 2.62, 3.44, 0...
## $ Ankara..Turkey <dbl> 3.82, 15.28, 3.51, 2.75, 3.06, 0...
## $ Lviv..Ukraine <dbl> 3.75, 18.76, 3.56, 1.50, 1.50, 0...
## $ Novosibirsk..Russia <dbl> 5.72, 22.01, 3.67, 1.10, 2.20, 0...
## $ Bursa..Turkey <dbl> 3.82, 11.47, 3.06, 2.37, 3.06, 0...
## $ Brussels..Belgium <dbl> 15.00, 60.00, 8.20, 4.00, 4.00, ...
## $ Jerusalem..Israel <dbl> 15.56, 62.24, 12.97, 7.26, 7.26,...
## $ Melbourne..Australia <dbl> 10.22, 49.54, 7.12, 5.57, 5.57, ...
## $ Perth..Australia <dbl> 12.43, 56.55, 7.32, 5.90, 5.59, ...
## $ Sydney..Australia <dbl> 11.81, 54.37, 7.15, 4.97, 4.97, ...
## $ Alexandria..Egypt <dbl> 2.81, 14.06, 3.38, 1.69, 2.81, 0...
## $ Quito..Ecuador <dbl> 3.59, 31.45, 5.39, 1.35, 2.70, 0...
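The code that produced the listing above is not shown in this report. A minimal sketch of how such a table could be loaded and inspected, assuming a CSV with items in rows and cities in columns. The temporary file and its three lines are a small stand-in built from values in the listing, not the Kaggle file itself, and `stringsAsFactors = TRUE` is an assumption made to reproduce the `<fct>` type of column `X`:

```r
library(dplyr)

# Hypothetical stand-in for the downloaded file: items in rows, cities in columns
path <- tempfile(fileext = ".csv")
writeLines(c(
  ',"Saint Petersburg, Russia","Istanbul, Turkey"',
  '"Meal, Inexpensive Restaurant",7.34,4.58',
  '"Meal for 2 People, Mid-range Restaurant, Three-course",29.35,15.28'
), path)

# read.csv() names the unnamed first column "X" and replaces spaces and
# commas in the city names with dots
cost <- read.csv(path, stringsAsFactors = TRUE)

# One line per column: name, type and the first few values
glimpse(cost)
```

With the real file, `path` would point to the CSV downloaded from the data source given in the overview.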
02. Check missing values using colSums
## [1] FALSE
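The heading names colSums, while the printed result is a single logical value. A sketch of both forms on a toy data frame (the object `toy` and its values are hypothetical). Which call produced the `FALSE` above is not shown in the report, so the use of `any(is.na())` here is an assumption:

```r
# Hypothetical toy data with one missing value
toy <- data.frame(
  X      = c("item 1", "item 2"),
  City.A = c(1.5, NA),
  City.B = c(2.0, 3.0)
)

# Per-column count of missing values
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Single TRUE/FALSE answer for the whole data frame
has_na <- any(is.na(toy))
print(has_na)
```

The per-column form is useful for locating the gaps; the single logical is enough to confirm that there are none.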
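03. Transpose columns to rows with t()
library(dplyr)
library(janitor)
library(data.table)
# Transpose so that cities become rows, then promote the first row (item names) to column names
cost_t <- data.frame(t(cost)) %>%
  row_to_names(row_number = 1)
# Rename column 5 so it does not clash with the other imported-beer column
names(cost_t)[5] <- "Imported Beer (0.33 litre bottles)"
setDT(cost_t, keep.rownames = "City in Country")
head(cost_t)
03. Converting data type in columns from factor to number with mutate function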
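For comparison, a minimal sketch of this conversion done in one pass with `across()` (dplyr 1.0 or later) on a toy table. The object `toy` is illustrative and uses values from the listing in section 01. Excluding the city label before converting keeps it from being coerced, which is a likely source of the "NAs introduced by coercion" warning in this step; that diagnosis is an assumption, not something the report states:

```r
library(dplyr)

# Toy table: city label as character, cost columns stored as factors
toy <- tibble(
  `City in Country` = c("Saint.Petersburg..Russia", "Istanbul..Turkey"),
  `Meal, Inexpensive Restaurant` = factor(c("7.34", "4.58")),
  `McMeal at McDonalds (or Equivalent Combo Meal)` = factor(c("4.40", "3.82"))
)

# Convert every column except the city label: factor -> character -> numeric
toy_num <- toy %>%
  mutate(across(-`City in Country`, ~ as.numeric(as.character(.x))))

glimpse(toy_num)
```

Because the label column is never touched, no split-and-merge by `id` is needed in this variant.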
# Convert all data to numeric and remove variable 'City in Country'
cos_n <- cost_t %>%
  mutate_if(is.factor, as.character) %>%
  mutate_if(is.character, as.numeric) %>%
  select(c(-`City in Country`)) %>%
  mutate(id = row_number())
## Warning: NAs introduced by coercion
# Keep only the 'City in Country' variable
cos_ci <- cost_t %>%
  select(c(`City in Country`)) %>%
  mutate(id = row_number())
head(cos_ci)
# Join the 'City in Country' variable with the numeric variables
cos <- merge(cos_ci, cos_n, by = "id") %>%
  select(c(-id))
head(cos)
04. Selecting data categories to be analyzed
a. Cost of food
cost.food <- cos %>%
  select(`City in Country`, `McMeal at McDonalds (or Equivalent Combo Meal)`, `Meal, Inexpensive Restaurant`, `Meal for 2 People, Mid-range Restaurant, Three-course`, `Loaf of Fresh White Bread (500g)`, `Eggs (regular) (12)`, `Local Cheese (1kg)`, `Chicken Breasts (Boneless, Skinless), (1kg)`, `Beef Round (1kg) (or Equivalent Back Leg Red Meat)`, `Rice (white), (1kg)`, `Potato (1kg)`, `Onion (1kg)`)
glimpse(cost.food)
## Observations: 160
## Variables: 12
## $ `City in Country` <chr> "Saint.Pete...
## $ `McMeal at McDonalds (or Equivalent Combo Meal)` <dbl> 4.40, 3.82,...
## $ `Meal, Inexpensive Restaurant` <dbl> 7.34, 4.58,...
## $ `Meal for 2 People, Mid-range Restaurant, Three-course` <dbl> 29.35, 15.2...
## $ `Loaf of Fresh White Bread (500g)` <dbl> 0.71, 0.36,...
## $ `Eggs (regular) (12)` <dbl> 1.18, 1.62,...
## $ `Local Cheese (1kg)` <dbl> 7.60, 5.32,...
## $ `Chicken Breasts (Boneless, Skinless), (1kg)` <dbl> 3.96, 3.50,...
## $ `Beef Round (1kg) (or Equivalent Back Leg Red Meat)` <dbl> 7.18, 9.73,...
## $ `Rice (white), (1kg)` <dbl> 0.92, 1.30,...
## $ `Potato (1kg)` <dbl> 0.56, 0.59,...
## $ `Onion (1kg)` <dbl> 0.48, 0.62,...
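The bar charts in this report all follow one pattern: sort by a cost column, keep the first 15 rows, and draw horizontal bars ordered by cost. A minimal sketch of a reusable helper for that pattern; the function name `plot_top_cities` and its arguments are hypothetical, and the toy input uses values from the listing in section 01:

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: bar chart of the n most expensive cities for one cost column
plot_top_cities <- function(data, column, title, n = 15) {
  top <- data %>%
    arrange(desc(.data[[column]])) %>%
    head(n)

  ggplot(top, aes(x = reorder(`City in Country`, .data[[column]]),
                  y = .data[[column]])) +
    geom_col(fill = "#06697A") +   # geom_col() is geom_bar(stat = "identity")
    coord_flip() +
    labs(x = NULL, y = "Cost", title = title) +
    theme(
      plot.background  = element_rect(fill = "lightgray"),
      panel.background = element_blank(),
      panel.grid       = element_blank(),
      text             = element_text(size = 9, colour = "black"),
      plot.title       = element_text(size = 12, face = "bold")
    )
}

# Toy input built from values in the section 01 listing
toy <- tibble(
  `City in Country` = c("Saint.Petersburg..Russia", "Istanbul..Turkey",
                        "Izmir..Turkey", "Helsinki..Finland"),
  `McMeal at McDonalds (or Equivalent Combo Meal)` = c(4.40, 3.82, 3.06, 8.00)
)

p <- plot_top_cities(
  toy,
  column = "McMeal at McDonalds (or Equivalent Combo Meal)",
  title  = "Cost of McDonalds Combo Meal (toy data)",
  n      = 3
)
print(p)
```

Passing the column as a string and looking it up through the `.data` pronoun keeps the long backtick-quoted names in one place per call.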
library(ggplot2)
cofmcd <- cost.food %>%
  arrange(desc(`McMeal at McDonalds (or Equivalent Combo Meal)`))
# Horizontal bar chart of the 15 most expensive cities, bars ordered by cost
ggplot(head(cofmcd, 15), mapping = aes(x = reorder(`City in Country`, `McMeal at McDonalds (or Equivalent Combo Meal)`), y = `McMeal at McDonalds (or Equivalent Combo Meal)`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Cost of McDonalds Combo Meal in big cities") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of a McDonald's combo meal:
cofmim <- cost.food %>%
  arrange(desc(`Meal, Inexpensive Restaurant`))
ggplot(head(cofmim, 15), mapping = aes(x = reorder(`City in Country`, `Meal, Inexpensive Restaurant`), y = `Meal, Inexpensive Restaurant`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Cost of Inexpensive meals in big cities") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of a meal in an inexpensive restaurant:
b. Cost of fruit & vegetables
cost.fruit.vegetables <- cos %>%
  select(`City in Country`, `Apples (1kg)`, `Oranges (1kg)`, `Banana (1kg)`, `Lettuce (1 head)`, `Tomato (1kg)`)
glimpse(cost.fruit.vegetables)
## Observations: 160
## Variables: 6
## $ `City in Country` <chr> "Saint.Petersburg..Russia", "Istanbul..Turkey", ...
## $ `Apples (1kg)` <dbl> 1.29, 0.85, 0.77, 2.10, 0.70, 2.13, 1.31, 0.86, ...
## $ `Oranges (1kg)` <dbl> 1.25, 0.86, 0.73, 1.75, 1.22, 2.05, 0.39, 1.18, ...
## $ `Banana (1kg)` <dbl> 0.89, 1.91, 1.78, 1.61, 1.37, 1.99, 0.66, 1.06, ...
## $ `Lettuce (1 head)` <dbl> 0.86, 0.61, 0.57, 2.30, 0.84, 1.27, 0.28, 0.55, ...
## $ `Tomato (1kg)` <dbl> 1.91, 0.80, 0.70, 2.91, 1.56, 2.45, 0.35, 0.94, ...
c. Cost of drinks
cost.drinks <- cos %>%
  select(`City in Country`, `Water (0.33 liter bottle) `, `Water (1.5 liter bottle)`, `Milk (regular), (1 liter)`, `Coke/Pepsi (0.33 liter bottle)`, `Cappuccino (regular)`, `Domestic Beer (0.5 liter draught)`, `Domestic Beer (0.5 liter bottle)`, `Imported Beer (0.33 liter bottle)`, `Imported Beer (0.33 litre bottles)`, `Bottle of Wine (Mid-Range)`)
glimpse(cost.drinks)
## Observations: 160
## Variables: 11
## $ `City in Country` <chr> "Saint.Petersburg..Russia", "I...
## $ `Water (0.33 liter bottle) ` <dbl> 0.53, 0.24, 0.22, 1.89, 0.44, ...
## $ `Water (1.5 liter bottle)` <dbl> 0.63, 0.33, 0.29, 1.54, 0.59, ...
## $ `Milk (regular), (1 liter)` <dbl> 0.98, 0.71, 0.65, 0.96, 0.68, ...
## $ `Coke/Pepsi (0.33 liter bottle)` <dbl> 0.76, 0.64, 0.61, 2.66, 0.64, ...
## $ `Cappuccino (regular)` <dbl> 1.96, 1.84, 1.56, 3.87, 1.25, ...
## $ `Domestic Beer (0.5 liter draught)` <dbl> 2.20, 3.06, 2.29, 6.50, 1.04, ...
## $ `Domestic Beer (0.5 liter bottle)` <dbl> 0.88, 1.79, 1.63, 2.23, 0.77, ...
## $ `Imported Beer (0.33 liter bottle)` <dbl> 1.89, 2.48, 2.09, 2.95, 1.38, ...
## $ `Imported Beer (0.33 litre bottles)` <dbl> 2.20, 3.06, 2.75, 6.75, 1.43, ...
## $ `Bottle of Wine (Mid-Range)` <dbl> 5.87, 7.64, 6.11, 12.00, 3.61,...
cofcoke <- cost.drinks %>%
  arrange(desc(`Coke/Pepsi (0.33 liter bottle)`))
ggplot(head(cofcoke, 15), mapping = aes(x = reorder(`City in Country`, `Coke/Pepsi (0.33 liter bottle)`), y = `Coke/Pepsi (0.33 liter bottle)`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  theme(axis.text = element_text(colour = "black")) +
  theme(axis.ticks = element_line(size = 2)) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Cost of Coke/Pepsi in big cities") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of Coke/Pepsi:
d. Cost of transportation
cost.transport <- cos %>%
  select(`City in Country`, `One-way Ticket (Local Transport)`, `Taxi Start (Normal Tariff)`, `Taxi 1km (Normal Tariff)`, `Taxi 1hour Waiting (Normal Tariff)`, `Volkswagen Golf`, `Toyota Corolla 1.6l 97kW Comfort (Or Equivalent New Car)`, `Gasoline (1 liter)`)
glimpse(cost.transport)
## Observations: 160
## Variables: 8
## $ `City in Country` <chr> "Saint.P...
## $ `One-way Ticket (Local Transport)` <dbl> 0.59, 0....
## $ `Taxi Start (Normal Tariff)` <dbl> 1.47, 0....
## $ `Taxi 1km (Normal Tariff)` <dbl> 0.26, 0....
## $ `Taxi 1hour Waiting (Normal Tariff)` <dbl> 4.40, 3....
## $ `Volkswagen Golf` <dbl> 19289.39...
## $ `Toyota Corolla 1.6l 97kW Comfort (Or Equivalent New Car)` <dbl> 19305.29...
## $ `Gasoline (1 liter)` <dbl> 0.67, 1....
e. Cost of clothing
cost.clothing <- cos %>%
  select(`City in Country`, `1 Pair of Nike Running Shoes (Mid-Range)`, `1 Pair of Men Leather Business Shoes`, `1 Pair of Jeans (Levis 501 Or Similar)`, `1 Summer Dress in a Chain Store (Zara, H&M, ...)`)
glimpse(cost.clothing)
## Observations: 160
## Variables: 5
## $ `City in Country` <chr> "Saint.Petersbur...
## $ `1 Pair of Nike Running Shoes (Mid-Range)` <dbl> 74.88, 61.31, 52...
## $ `1 Pair of Men Leather Business Shoes` <dbl> 100.72, 50.58, 4...
## $ `1 Pair of Jeans (Levis 501 Or Similar)` <dbl> 71.86, 36.15, 33...
## $ `1 Summer Dress in a Chain Store (Zara, H&M, ...)` <dbl> 38.25, 25.91, 22...
f. Apartment cost
cost.apartment <- cos %>%
  select(`City in Country`, `Apartment (1 bedroom) Outside of Centre`, `Apartment (1 bedroom) in City Centre`, `Apartment (3 bedrooms) Outside of Centre`, `Apartment (3 bedrooms) in City Centre`, `Price per Square Meter to Buy Apartment Outside of Centre`, `Price per Square Meter to Buy Apartment in City Centre`)
glimpse(cost.apartment)
## Observations: 160
## Variables: 7
## $ `City in Country` <chr> "Saint....
## $ `Apartment (1 bedroom) Outside of Centre` <dbl> 344.27,...
## $ `Apartment (1 bedroom) in City Centre` <dbl> 524.45,...
## $ `Apartment (3 bedrooms) Outside of Centre` <dbl> 615.19,...
## $ `Apartment (3 bedrooms) in City Centre` <dbl> 1012.53...
## $ `Price per Square Meter to Buy Apartment Outside of Centre` <dbl> 1507.70...
## $ `Price per Square Meter to Buy Apartment in City Centre` <dbl> 2476.05...
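Column groups with a shared prefix, like the apartment columns listed above, can also be picked by name pattern instead of being typed out. A minimal sketch on a one-row toy table (the object `toy` is illustrative; its values come from the listings in this report). On the full data set a prefix may match more columns than intended, so checking `names()` first is advisable:

```r
library(dplyr)

# One-row toy table with three apartment columns and one unrelated column
toy <- tibble(
  `City in Country` = "Saint.Petersburg..Russia",
  `Apartment (1 bedroom) Outside of Centre` = 344.27,
  `Apartment (1 bedroom) in City Centre` = 524.45,
  `Price per Square Meter to Buy Apartment in City Centre` = 2476.05,
  `Gasoline (1 liter)` = 0.67
)

# Keep the city label plus every column whose name starts with a given prefix
apartment_cols <- toy %>%
  select(`City in Country`,
         starts_with("Apartment"),
         starts_with("Price per Square Meter"))

print(names(apartment_cols))
```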
coa <- cost.apartment %>%
  arrange(desc(`Apartment (1 bedroom) in City Centre`))
ggplot(head(coa, 15), mapping = aes(x = reorder(`City in Country`, `Apartment (1 bedroom) in City Centre`), y = `Apartment (1 bedroom) in City Centre`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  theme(axis.text = element_text(colour = "black")) +
  theme(axis.ticks = element_line(size = 2)) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Apartment cost with 1 bedroom in City Centre") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of a 1-bedroom apartment in the city centre:
coa3 <- cost.apartment %>%
  arrange(desc(`Apartment (3 bedrooms) in City Centre`))
# Plot from coa3 (sorted by the 3-bedroom cost), not coa (sorted by the 1-bedroom cost)
ggplot(head(coa3, 15), mapping = aes(x = reorder(`City in Country`, `Apartment (3 bedrooms) in City Centre`), y = `Apartment (3 bedrooms) in City Centre`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  theme(axis.text = element_text(colour = "black")) +
  theme(axis.ticks = element_line(size = 2)) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Apartment Cost with 3 bedrooms in City Centre") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of a 3-bedroom apartment in the city centre:
g. Cost of utilities
cost.utilities <- cos %>%
  select(`City in Country`, `Basic (Electricity, Heating, Cooling, Water, Garbage) for 85m2 Apartment`, `1 min. of Prepaid Mobile Tariff Local (No Discounts or Plans)`, `Internet (60 Mbps or More, Unlimited Data, Cable/ADSL)`)
glimpse(cost.utilities)
## Observations: 160
## Variables: 4
## $ `City in Country` <chr> ...
## $ `Basic (Electricity, Heating, Cooling, Water, Garbage) for 85m2 Apartment` <dbl> ...
## $ `1 min. of Prepaid Mobile Tariff Local (No Discounts or Plans)` <dbl> ...
## $ `Internet (60 Mbps or More, Unlimited Data, Cable/ADSL)` <dbl> ...
coi <- cost.utilities %>%
  arrange(desc(`Internet (60 Mbps or More, Unlimited Data, Cable/ADSL)`))
ggplot(head(coi, 15), mapping = aes(x = reorder(`City in Country`, `Internet (60 Mbps or More, Unlimited Data, Cable/ADSL)`), y = `Internet (60 Mbps or More, Unlimited Data, Cable/ADSL)`)) +
  geom_bar(stat = "identity", fill = "#06697A") +
  coord_flip() +
  theme(plot.background = element_rect(fill = "light gray")) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(text = element_text(size = 9, colour = "black")) +
  theme(axis.text = element_text(colour = "black")) +
  theme(axis.ticks = element_line(size = 2)) +
  scale_x_discrete(name = "") +
  ylab("Cost") +
  ggtitle("Cost of Internet Unlimited Data/Cable/ADSL in Big cities") +
  theme(plot.title = element_text(size = 12, face = "bold"))
Summaries:
Top 5 big cities by the cost of internet (unlimited data, cable/ADSL):
h. Cost of leisure, health & education
cost.lhe <- cos %>%
  select(`City in Country`, `Cigarettes 20 Pack (Marlboro)`, `Fitness Club, Monthly Fee for 1 Adult`, `Tennis Court Rent (1 Hour on Weekend)`, `Cinema, International Release, 1 Seat`, `Preschool (or Kindergarten), Full Day, Private, Monthly for 1 Child`, `International Primary School, Yearly for 1 Child`)
glimpse(cost.lhe)
## Observations: 160
## Variables: 7
## $ `City in Country` <chr> ...
## $ `Cigarettes 20 Pack (Marlboro)` <dbl> ...
## $ `Fitness Club, Monthly Fee for 1 Adult` <dbl> ...
## $ `Tennis Court Rent (1 Hour on Weekend)` <dbl> ...
## $ `Cinema, International Release, 1 Seat` <dbl> ...
## $ `Preschool (or Kindergarten), Full Day, Private, Monthly for 1 Child` <dbl> ...
## $ `International Primary School, Yearly for 1 Child` <dbl> ...
i. Income & mortgage interest rates
income.and.mortgage <- cos %>%
  select(`City in Country`, `Average Monthly Net Salary (After Tax)`, `Mortgage Interest Rate in Percentages (%), Yearly, for 20 Years Fixed-Rate`)
glimpse(income.and.mortgage)
## Observations: 160
## Variables: 3
## $ `City in Country` <chr> ...
## $ `Average Monthly Net Salary (After Tax)` <dbl> ...
## $ `Mortgage Interest Rate in Percentages (%), Yearly, for 20 Years Fixed-Rate` <dbl> ...
From the preliminary analysis above, the cities with the highest cost of living as measured by those 5 indicators, especially apartment cost and cost of food, appear to have potential for establishing capsule hotels, apartments or stays: San Francisco (USA), New York (USA), Boston (USA), Hong Kong, London (UK), Los Angeles (USA) and Zurich (Switzerland).
To reach a firmer conclusion, further analysis is needed with more comprehensive data, together with predictions to anticipate upcoming trends.
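As one possible step towards that further analysis, several indicators can be combined into a single ordering instead of being read chart by chart. The sketch below ranks cities on each indicator and averages the ranks. Mean rank is a simple technique chosen here for illustration, not the method used in this report; the object `meals` and its short column names are hypothetical, with values taken from the listing in section 01:

```r
library(dplyr)

# Three meal indicators for five cities (values from the section 01 listing)
meals <- tibble(
  city             = c("Zurich", "New York", "San Francisco", "London", "Jakarta"),
  meal_inexpensive = c(23.12, 17.97, 15.95, 17.49, 2.62),
  meal_for_two     = c(92.46, 76.37, 71.88, 64.15, 16.40),
  mcmeal           = c(12.94, 8.09, 8.98, 7.00, 3.28)
)

ranked <- meals %>%
  # Rank 1 = most expensive city on that indicator
  mutate(across(-city, ~ min_rank(desc(.x)), .names = "rank_{.col}")) %>%
  # Average the per-indicator ranks into one score per city
  mutate(mean_rank = rowMeans(across(starts_with("rank_")))) %>%
  arrange(mean_rank)

print(select(ranked, city, mean_rank))
```

A lower `mean_rank` means the city is consistently among the most expensive across the chosen indicators. Weighting indicators differently, for example giving apartment cost more weight than food, would be a natural refinement.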