1 Explanation

Before travelling, we usually plan our holiday around the destination. This project gives a brief description of some tourist destinations in Yogyakarta.

2 Input Data

rating <- read.csv("Data Input/tourism_rating.csv")
place <- read.csv("Data Input/tourism_with_id.csv")
user <- read.csv("Data Input/user.csv")

This is our destination data. Let’s move on to the next step.

2.1 Data Inspection

There are many ways to inspect our data: we can use ‘head()’, ‘tail()’, ‘dim()’, or ‘names()’. We have three datasets, so we will check each one.
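
When several data frames need the same check, the inspection can also be done in one pass. The following is a minimal sketch using made-up toy tables (`toy_data` and its values are hypothetical, not the real CSV files); it collects the dimensions of each data frame into one small matrix.

```r
# Toy stand-ins for the real datasets (values are invented for illustration)
toy_data <- list(
  rating = data.frame(User_Id = 1:4,
                      Place_Id = c(3, 1, 2, 3),
                      Place_Ratings = c(5, 4, 3, 4)),
  user   = data.frame(User_Id = 1:2,
                      Age = c(20, 31))
)

# dim() returns c(rows, cols); sapply() stacks the results column-wise,
# so transpose to get one row per dataset
shape <- t(sapply(toy_data, dim))
colnames(shape) <- c("rows", "cols")
shape
```

The same pattern works with `names()` or `head()` through `lapply()`, which keeps the results in a list instead of a matrix.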

2.1.1 Rating Data

head(rating)
dim(rating)
#> [1] 10000     3
names(rating)
#> [1] "User_Id"       "Place_Id"      "Place_Ratings"

2.1.2 Place Data

head(place)
dim(place)
#> [1] 437  13
names(place)
#>  [1] "Place_Id"     "Place_Name"   "Description"  "Category"     "City"        
#>  [6] "Price"        "Rating"       "Time_Minutes" "Coordinate"   "Lat"         
#> [11] "Long"         "X"            "X.1"

2.1.3 User Data

head(user)
dim(user)
#> [1] 300   3
names(user)
#> [1] "User_Id"  "Location" "Age"

The rating, place and user data are related to each other through shared key columns. Rating and place data are linked by “Place_Id”, and rating and user data are linked by “User_Id”.
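
Before relying on these keys, it can help to confirm that every key in the rating data has a match in the other tables. This is a sketch on hypothetical toy data (the `toy_*` tables and their values are assumptions, not the real files); `setdiff()` lists the key values that appear in the ratings but not in the lookup table.

```r
# Toy tables that mimic the key columns of the real data
toy_rating <- data.frame(User_Id = c(1, 1, 2, 3),
                         Place_Id = c(10, 11, 10, 99),
                         Place_Ratings = c(3, 5, 4, 2))
toy_place <- data.frame(Place_Id = c(10, 11, 12),
                        Place_Name = c("A", "B", "C"))
toy_user <- data.frame(User_Id = c(1, 2, 3),
                       Age = c(20, 31, 27))

# Place_Id values used in ratings that do not exist in the place table
orphan_places <- setdiff(toy_rating$Place_Id, toy_place$Place_Id)

# User_Id values used in ratings that do not exist in the user table
orphan_users <- setdiff(toy_rating$User_Id, toy_user$User_Id)

orphan_places
orphan_users
```

An empty result means every rating can be matched; any values listed would be dropped by an inner join on that key.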

3 Data Cleansing

3.1 Check Data Types

str(rating)
#> 'data.frame':    10000 obs. of  3 variables:
#>  $ User_Id      : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ Place_Id     : int  179 344 5 373 101 312 258 20 154 393 ...
#>  $ Place_Ratings: int  3 2 5 3 4 2 5 4 2 5 ...
str(place)
#> 'data.frame':    437 obs. of  13 variables:
#>  $ Place_Id    : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ Place_Name  : chr  "Monumen Nasional" "Kota Tua" "Dunia Fantasi" "Taman Mini Indonesia Indah (TMII)" ...
#>  $ Description : chr  "Monumen Nasional atau yang populer disingkat dengan Monas atau Tugu Monas adalah monumen peringatan setinggi 13"| __truncated__ "Kota tua di Jakarta, yang juga bernama Kota Tua, berpusat di Alun-Alun Fatahillah, yaitu alun-alun yang ramai d"| __truncated__ "Dunia Fantasi atau disebut juga Dufan adalah tempat hiburan yang terletak di kawasan Taman Impian Jaya Ancol, J"| __truncated__ "Taman Mini Indonesia Indah merupakan suatu kawasan taman wisata bertema budaya Indonesia di Jakarta Timur. Area"| __truncated__ ...
#>  $ Category    : chr  "Budaya" "Budaya" "Taman Hiburan" "Taman Hiburan" ...
#>  $ City        : chr  "Jakarta" "Jakarta" "Jakarta" "Jakarta" ...
#>  $ Price       : int  20000 0 270000 10000 94000 25000 4000 180000 175000 150000 ...
#>  $ Rating      : num  4.6 4.6 4.6 4.5 4.5 4.5 4.5 4 4.4 4.5 ...
#>  $ Time_Minutes: int  15 90 360 NA 60 10 NA NA NA NA ...
#>  $ Coordinate  : chr  "{'lat': -6.1753924, 'lng': 106.8271528}" "{'lat': -6.137644799999999, 'lng': 106.8171245}" "{'lat': -6.125312399999999, 'lng': 106.8335377}" "{'lat': -6.302445899999999, 'lng': 106.8951559}" ...
#>  $ Lat         : num  -6.18 -6.14 -6.13 -6.3 -6.12 ...
#>  $ Long        : num  107 107 107 107 107 ...
#>  $ X           : logi  NA NA NA NA NA NA ...
#>  $ X.1         : int  1 2 3 4 5 6 7 8 9 10 ...
str(user)
#> 'data.frame':    300 obs. of  3 variables:
#>  $ User_Id : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ Location: chr  "Semarang, Jawa Tengah" "Bekasi, Jawa Barat" "Cirebon, Jawa Barat" "Bekasi, Jawa Barat" ...
#>  $ Age     : int  20 21 23 21 20 18 39 40 38 39 ...

3.2 Delete Unused Columns

place <- place[, !names(place) %in% c("X", "X.1", "Time_Minutes"), drop = FALSE]
head(place)
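
The report does not state why these three columns were dropped; the `str()` output suggests that `X` is empty and `Time_Minutes` has many missing values. As a hedged sketch on a hypothetical toy table (`toy_cols` and its values are made up), the share of missing values per column can be measured like this:

```r
# Toy table with the same kind of gaps seen in the str() output
toy_cols <- data.frame(
  Place_Id     = 1:4,
  Time_Minutes = c(15, NA, NA, NA),
  X            = c(NA, NA, NA, NA)
)

# is.na() gives a logical matrix; colMeans() turns it into the
# fraction of missing values in each column
na_share <- colMeans(is.na(toy_cols))
na_share

# Columns that contain no data at all
empty_cols <- names(na_share)[na_share == 1]
empty_cols
```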

3.3 Select Only Locations in Yogyakarta

place_yogya <- place[place$City=="Yogyakarta",]
head(place_yogya)

3.4 Convert Data Types

place_yogya$Place_Name <- as.factor(place_yogya$Place_Name)
place_yogya$Category <- as.factor(place_yogya$Category)

3.5 Merge Rating Data, Place Data & User Data

place_yogya_merge <- merge(place_yogya,rating,by.x = "Place_Id", by.y = "Place_Id")
place_yogya_merge <- merge(place_yogya_merge,user,by.x = "User_Id", by.y = "User_Id")
head(place_yogya_merge)
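
By default, `merge()` performs an inner join, so places without any rating (and ratings for places outside the filtered table) do not appear in the result. The sketch below uses hypothetical toy tables (`join_place` and `join_rating` are made up) to show the difference between the default and `all.x = TRUE`, which keeps every row of the first table.

```r
# Toy tables: place 3 has no rating, and rating for place 9 has no place
join_place  <- data.frame(Place_Id = c(1, 2, 3),
                          Place_Name = c("A", "B", "C"))
join_rating <- data.frame(Place_Id = c(1, 1, 2, 9),
                          Place_Ratings = c(4, 5, 3, 2))

# Inner join (default): only keys present in both tables survive
inner <- merge(join_place, join_rating, by = "Place_Id")
nrow(inner)

# Left join: every place is kept; unrated places get NA ratings
left <- merge(join_place, join_rating, by = "Place_Id", all.x = TRUE)
nrow(left)
sum(is.na(left$Place_Ratings))
```

For this report the inner join is a reasonable choice, because the aim is to analyse ratings of Yogyakarta places only.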

4 Data Manipulation & Transformation

4.1 Build Rating Frequency Data by Place

place_yogya_rating_freq <- aggregate(x=Place_Ratings~Place_Name,
                                     data = place_yogya_merge, 
                                     FUN = length)

4.1.1 Top 10 Most Rated Places

top_10_rated <- head(place_yogya_rating_freq[
                            order(place_yogya_rating_freq$Place_Ratings,
                            decreasing = TRUE),],10)

“Pantai Parangtritis” is the most rated place.

4.1.2 Top 10 Highest-Rated Places

top_10_rating <- head(place_yogya[order(place_yogya$Rating, decreasing = TRUE),
                                  names(place_yogya) %in% c("Place_Name", "Rating")],
                      10)

“Desa Wisata Sungai Code Jogja Kota” and “Kauman Pakualaman Yogyakarta” have the highest rating, 5 out of 5.

4.2 Number of Places by Category

number_category <- as.data.frame(sort(table(place_yogya$Category),decreasing = T))
number_category

4.2.1 Plot of Number of Places by Category

barplot(Freq~Var1, number_category)

The “Taman Hiburan” category has the most destinations.

4.3 Number of Location by Price Category

convert_price <- function(y) {
  # y is a single price; && is the scalar logical operator expected by if()
  if (y == 0) {
    "Free Charges"          # free locations
  } else if (y > 0 && y <= 10000) {
    "Below IDR 10k"         # prices up to IDR 10,000
  } else if (y > 10000 && y <= 100000) {
    "IDR 10k to IDR 100k"   # prices from IDR 10k to IDR 100k
  } else {
    "Above IDR 100k"        # prices above IDR 100k
  }
}
place_yogya$Price_Cat <- sapply(X = place_yogya$Price, 
                            FUN = convert_price) 

place_yogya$Price_Cat <- as.factor(place_yogya$Price_Cat)
place_yogya$Price_Cat <- factor(place_yogya$Price_Cat, levels= c("Free Charges", "Below IDR 10k", "IDR 10k to IDR 100k", "Above IDR 100k"))
as.data.frame(table(place_yogya$Price_Cat))
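
As an alternative to the element-by-element `convert_price()` with `sapply()`, base R's `cut()` can bin the whole price vector at once and return an ordered set of factor levels directly. This is a sketch on made-up sample prices (`sample_prices` is hypothetical), assuming prices are never negative; the labels are the same as above.

```r
# Made-up prices that sit on and around the category boundaries
sample_prices <- c(0, 2000, 10000, 10001, 100000, 500000)

# With the default right = TRUE each interval is (lower, upper],
# so 0 falls in (-Inf, 0] and 10000 falls in (0, 10000]
price_cat <- cut(sample_prices,
                 breaks = c(-Inf, 0, 10000, 100000, Inf),
                 labels = c("Free Charges", "Below IDR 10k",
                            "IDR 10k to IDR 100k", "Above IDR 100k"))

data.frame(sample_prices, price_cat)
```

Because `cut()` already returns a factor with levels in the order of `labels`, the separate `as.factor()` and `factor()` calls are not needed with this approach.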

4.4 Average Price by Category

aggregate(x=Price~Category,data = place_yogya, FUN = mean)

“Budaya” Category has the highest average price.

4.4.1 Five Most Expensive Places

place_price <- aggregate(x=Price~Place_Name,data = place_yogya, FUN = max)
place_price[order(place_price$Price, decreasing = TRUE), ][1:5, ]

“Goa Jomblang” is the most expensive place.

4.4.2 Five Cheapest Paid Places

place_price_charge <- place_price[place_price$Price!=0,]
place_price_charge[order(place_price_charge$Price, decreasing = FALSE), ][1:5, ]

4.4.3 Free Places

place_yogya[place_yogya$Price == 0, "Place_Name"]
#>  [1] Situs Warungboto                   Nol Kilometer Jl.Malioboro        
#>  [3] Desa Wisata Sungai Code Jogja Kota Alun Alun Selatan Yogyakarta      
#>  [5] Kampung Wisata Kadipaten           Taman Budaya Yogyakarta           
#>  [7] Kampung Wisata Sosro Menduran      Tugu Pal Putih Jogja              
#>  [9] Candi Donotirto                    Kawasan Malioboro                 
#> [11] Embung Tambakboyo                  Gedung Agung Yogyakarta           
#> [13] Kampung Wisata Rejowinangun        Kauman Pakualaman Yogyakarta      
#> [15] Alun-alun Utara Keraton Yogyakarta Gumuk Pasir Parangkusumo          
#> [17] Kawasan Wisata Sosrowijayan        Bendung Lepen                     
#> [19] Ledok Sambi                        Bentara Budaya Yogyakarta (BBY)   
#> [21] Desa Wisata Kelor                  Pasar Kebon Empring Bintaran      
#> [23] Geoforest Watu Payung Turunan      Pasar Beringharjo                 
#> [25] Desa Wisata Pulesari              
#> 126 Levels: Air Terjun Kedung Pedut ... Wisata Kraton Jogja

There are 25 places with free admission.

4.4.4 Free Places by Category

as.data.frame(table(place_yogya[place_yogya$Price == 0, "Category"]))

“Taman Hiburan” is the category with the most free places.

4.5 Top 10 Visitor Origins

visitor_origin <- as.data.frame(sort(table(place_yogya_merge$Location), decreasing = TRUE)[1:10])
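
Note that the merged data has one row per rating, so tabulating `Location` counts ratings given by visitors from each origin, not distinct visitors. If distinct visitors are wanted instead, one option is to deduplicate on `User_Id` first. The sketch below uses a hypothetical toy table (`toy_merge` and its values are made up) to show the difference.

```r
# Toy merged data: user 1 rated three places, users 2 and 3 rated one each
toy_merge <- data.frame(
  User_Id  = c(1, 1, 1, 2, 3),
  Location = c("Semarang", "Semarang", "Semarang", "Bekasi", "Bekasi")
)

# Counting rows counts ratings per origin
ratings_per_location <- table(toy_merge$Location)
ratings_per_location

# Keep one row per user, then count: distinct visitors per origin
toy_users <- unique(toy_merge[, c("User_Id", "Location")])
users_per_location <- table(toy_users$Location)
users_per_location
```

Either measure can be valid; which one fits depends on whether the question is about rating activity or about the number of people.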

5 Data Plot

library(ggplot2)

5.1 Top 10 Most Rated Destinations by Visitors

ggplot(data = top_10_rated, mapping = aes(x=Place_Ratings,
                                          y=reorder(Place_Name,Place_Ratings),
                                          fill=Place_Ratings))+
  geom_col()+
  scale_fill_gradient(low = "red",
                      high = "black")+
  geom_text(aes(label=Place_Ratings),
            position = position_dodge(width=0.9),
            size=3,
            hjust=0.01)+
  labs(title = "Top 10 Most Rated Destination",
       subtitle = "Count of Rating Given by Visitor",
       x="Count of Rating",
       y="")+
  theme_light() +
  theme(legend.position = "none")

“Pantai Parangtritis” is the most rated place.

5.2 Top 10 Highest Rating

ggplot(data = top_10_rating, mapping = aes(x=Rating,
                                          y=reorder(Place_Name,Rating),
                                          fill=Rating))+
  geom_col()+
  scale_fill_gradient(low = "green",
                      high = "orange")+
  geom_text(aes(label=Rating),
            position = position_dodge(width=0.9),
            size=3,
            hjust=0.01)+
  labs(title = "Top 10 Highest-Rated Destinations",
       x="Rating",
       y="")+
  theme_minimal() +
  theme(legend.position = "none")

“Desa Wisata Sungai Code Jogja Kota” and “Kauman Pakualaman Yogyakarta” have the highest rating, 5 out of 5.

5.3 Number of Locations by Price Category

ggplot(data = as.data.frame(table(place_yogya$Price_Cat)),
       mapping = aes(x=Var1, y=Freq))+
  geom_col(aes(fill=Var1))+
  labs(title = "Distribution of The Number of Locations by Price Category",
       x="Price Category",
       y="Number of Locations")+
  theme_light() +
  theme(legend.position = "none")

In Yogyakarta, most locations have a low price, and there are many free locations too.

6 Conclusion and Travelling Recommendation

Yogyakarta has many travel destinations, both free and paid. There are 25 destinations with free admission, so if you want to travel around Yogyakarta on a low budget, consider visiting those 25 places. There are also inexpensive paid destinations: prices start from IDR 2,000, and the most expensive place costs IDR 500,000. If budget is not a concern, you can choose your destination from the most rated and highest-rated places in Yogyakarta. Happy holidays!
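
To turn this recommendation into something reusable, a small helper can list the places that fit a given budget, best rated first. This is only a sketch: `places_within_budget()` and the `toy_places` table are hypothetical and not part of the analysis above, though the column names follow the place data.

```r
# Toy table with the same column names as the place data (values invented)
toy_places <- data.frame(
  Place_Name = c("A", "B", "C", "D"),
  Price      = c(0, 2000, 50000, 500000),
  Rating     = c(4.5, 4.2, 4.8, 4.6)
)

# Hypothetical helper: keep places priced at or below the budget,
# then sort them from highest to lowest rating
places_within_budget <- function(places, budget) {
  affordable <- places[places$Price <= budget, ]
  affordable[order(affordable$Rating, decreasing = TRUE), ]
}

budget_picks <- places_within_budget(toy_places, budget = 10000)
budget_picks
```

With a budget of 0 the helper returns only free places, which corresponds to the list of free destinations discussed above.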