library(ggplot2)
library(leaflet)
library(scales)
library(tidyr)
library(colorspace)
library(ggridges)
library(lubridate)
kopi <- read.csv("data_input/coffeeanalysis.csv")
View the first rows of the data
head(kopi)
View the last rows of the data (tail() returns the last 6 rows by default)
tail(kopi)
View the dimensions of the data
dim(kopi)
## [1] 2095 12
View the column names
names(kopi)
## [1] "name" "roaster" "roast" "loc_country" "origin_1"
## [6] "origin_2" "X100g_USD" "rating" "review_date" "desc_1"
## [11] "desc_2" "desc_3"
From this inspection we can conclude:
* The coffee data has 2095 rows and 12 columns.
* The column names are: "name", "roaster", "roast", "loc_country", "origin_1", "origin_2", "X100g_USD", "rating", "review_date", "desc_1", "desc_2", "desc_3".
str(kopi)
## 'data.frame': 2095 obs. of 12 variables:
## $ name : chr "â\200œSweetyâ\200\235 Espresso Blend" "Flora Blend Espresso" "Ethiopia Shakiso Mormora" "Ethiopia Suke Quto" ...
## $ roaster : chr "A.R.C." "A.R.C." "Revel Coffee" "Roast House" ...
## $ roast : chr "Medium-Light" "Medium-Light" "Medium-Light" "Medium-Light" ...
## $ loc_country: chr "Hong Kong" "Hong Kong" "United States" "United States" ...
## $ origin_1 : chr "Panama" "Africa" "Guji Zone" "Guji Zone" ...
## $ origin_2 : chr "Ethiopia" "Asia Pacific" "Southern Ethiopia" "Oromia Region" ...
## $ X100g_USD : num 14.32 9.05 4.7 4.19 4.85 ...
## $ rating : int 95 94 92 92 94 93 93 93 93 94 ...
## $ review_date: chr "November 2017" "November 2017" "November 2017" "November 2017" ...
## $ desc_1 : chr "Evaluated as espresso. Sweet-toned, deeply rich, chocolaty. Vanilla paste, dark chocolate, narcissus, pink grap"| __truncated__ "Evaluated as espresso. Sweetly tart, floral-toned. Honeysuckle, oak, dried apricot, dark chocolate, thyme in ar"| __truncated__ "Crisply sweet, cocoa-toned. Lemon blossom, roasted cacao nib, date, rice candy, white peppercorn in aroma and c"| __truncated__ "Delicate, sweetly spice-toned. Pink peppercorn, date, myrrh, lavender, roasted cacao nib in aroma and cup. Cris"| __truncated__ ...
## $ desc_2 : chr "An espresso blend comprised of coffees from Panama and Ethiopia. A.R.C., whose motto is â\200œmore than special"| __truncated__ "An espresso blend comprised of coffees from Africa and the Asia-Pacific. A.R.C., whose motto is â\200œmore than"| __truncated__ "This coffee tied for the third-highest rating in a tasting of 71 organic-certified coffees from Africa for Coff"| __truncated__ "This coffee tied for the third-highest rating in a tasting of 71 organic-certified coffees from Africa for Coff"| __truncated__ ...
## $ desc_3 : chr "A radiant espresso blend that shines equally in the straight shot and in milk, alive with notes of rich dark ch"| __truncated__ "A floral-driven straight shot, amplified with notes of stone fruit and chocolate in cappuccino-scaled milk." "A gently spice-toned, floral- driven wet-processed Ethiopia cup with pleasing notes of cocoa throughout." "Lavender-like flowers and hints of zesty pink peppercorn animate this crisply sweet wet-processed Ethiopia cup." ...
kopi$roast <- as.factor(kopi$roast)
#kopi$review_date <- as.Date(kopi$review_date, "%m/%y")
str(kopi)
## 'data.frame': 2095 obs. of 12 variables:
## $ name : chr "â\200œSweetyâ\200\235 Espresso Blend" "Flora Blend Espresso" "Ethiopia Shakiso Mormora" "Ethiopia Suke Quto" ...
## $ roaster : chr "A.R.C." "A.R.C." "Revel Coffee" "Roast House" ...
## $ roast : Factor w/ 6 levels "","Dark","Light",..: 6 6 6 6 4 3 6 6 6 4 ...
## $ loc_country: chr "Hong Kong" "Hong Kong" "United States" "United States" ...
## $ origin_1 : chr "Panama" "Africa" "Guji Zone" "Guji Zone" ...
## $ origin_2 : chr "Ethiopia" "Asia Pacific" "Southern Ethiopia" "Oromia Region" ...
## $ X100g_USD : num 14.32 9.05 4.7 4.19 4.85 ...
## $ rating : int 95 94 92 92 94 93 93 93 93 94 ...
## $ review_date: chr "November 2017" "November 2017" "November 2017" "November 2017" ...
## $ desc_1 : chr "Evaluated as espresso. Sweet-toned, deeply rich, chocolaty. Vanilla paste, dark chocolate, narcissus, pink grap"| __truncated__ "Evaluated as espresso. Sweetly tart, floral-toned. Honeysuckle, oak, dried apricot, dark chocolate, thyme in ar"| __truncated__ "Crisply sweet, cocoa-toned. Lemon blossom, roasted cacao nib, date, rice candy, white peppercorn in aroma and c"| __truncated__ "Delicate, sweetly spice-toned. Pink peppercorn, date, myrrh, lavender, roasted cacao nib in aroma and cup. Cris"| __truncated__ ...
## $ desc_2 : chr "An espresso blend comprised of coffees from Panama and Ethiopia. A.R.C., whose motto is â\200œmore than special"| __truncated__ "An espresso blend comprised of coffees from Africa and the Asia-Pacific. A.R.C., whose motto is â\200œmore than"| __truncated__ "This coffee tied for the third-highest rating in a tasting of 71 organic-certified coffees from Africa for Coff"| __truncated__ "This coffee tied for the third-highest rating in a tasting of 71 organic-certified coffees from Africa for Coff"| __truncated__ ...
## $ desc_3 : chr "A radiant espresso blend that shines equally in the straight shot and in milk, alive with notes of rich dark ch"| __truncated__ "A floral-driven straight shot, amplified with notes of stone fruit and chocolate in cappuccino-scaled milk." "A gently spice-toned, floral- driven wet-processed Ethiopia cup with pleasing notes of cocoa throughout." "Lavender-like flowers and hints of zesty pink peppercorn animate this crisply sweet wet-processed Ethiopia cup." ...
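The commented-out conversion above uses the format "%m/%y", which would not match values such as "November 2017" (full month name, four-digit year, no day). Below is a minimal sketch of one way to parse such values, assuming every entry has the form "<English month name> <year>". The helper name `parse_review_date` is hypothetical, and using the first day of the month is an arbitrary choice because the data carries no day.

```r
# Hypothetical helper: convert "November 2017" style strings to Date.
# Assumes English month names. month.name is a base R constant, so the
# lookup does not depend on the session locale.
parse_review_date <- function(x) {
  parts <- strsplit(trimws(x), " +")
  month <- match(vapply(parts, `[`, character(1), 1), month.name)
  year  <- as.integer(vapply(parts, `[`, character(1), 2))
  # The source has no day of month, so the 1st is used as a placeholder.
  # Supplying an explicit format makes unparseable input return NA.
  as.Date(sprintf("%04d-%02d-01", year, month), format = "%Y-%m-%d")
}

parse_review_date(c("November 2017", "March 2018"))
```

Under the same assumption, it could be applied to the data with `kopi$review_date <- parse_review_date(kopi$review_date)`.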
colSums(is.na(kopi))
##        name     roaster       roast loc_country    origin_1    origin_2
##           0           0           0           0           0           0
##   X100g_USD      rating review_date      desc_1      desc_2      desc_3
##           0           0           0           0           0           0
anyNA(kopi)
## [1] FALSE
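Note that `is.na()` and `anyNA()` only detect true `NA` values. The `str()` output shows that `roast` has an empty-string level (""), so blank entries exist and are not counted as missing. The sketch below shows one way to find and recode them. It runs on a small made-up data frame (`demo`), and the helper `is_blank` is hypothetical.

```r
# Toy data standing in for kopi; the values are illustrative only.
demo <- data.frame(
  roast  = c("Medium-Light", "", "Dark", ""),
  rating = c(95L, 94L, 92L, 93L),
  stringsAsFactors = FALSE
)

anyNA(demo)  # blank strings are not NA, so they are not detected here

# Hypothetical helper: TRUE for empty or whitespace-only character values.
is_blank <- function(x) is.character(x) & !is.na(x) & trimws(x) == ""

# Count blank strings per column.
blank_counts <- vapply(demo, function(col) sum(is_blank(col)), integer(1))
blank_counts

# Recode blanks as NA so that colSums(is.na()) reports them.
demo$roast[is_blank(demo$roast)] <- NA
colSums(is.na(demo))
```

The helper only inspects character columns, so it would need to run before `as.factor()` is applied. An alternative is to treat blanks as missing at import time with `read.csv(..., na.strings = c("", "NA"))`.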
Remove the columns desc_1, desc_2, and desc_3
kopi <- kopi[,-c(10:12)]
str(kopi)
## 'data.frame': 2095 obs. of 9 variables:
## $ name : chr "â\200œSweetyâ\200\235 Espresso Blend" "Flora Blend Espresso" "Ethiopia Shakiso Mormora" "Ethiopia Suke Quto" ...
## $ roaster : chr "A.R.C." "A.R.C." "Revel Coffee" "Roast House" ...
## $ roast : Factor w/ 6 levels "","Dark","Light",..: 6 6 6 6 4 3 6 6 6 4 ...
## $ loc_country: chr "Hong Kong" "Hong Kong" "United States" "United States" ...
## $ origin_1 : chr "Panama" "Africa" "Guji Zone" "Guji Zone" ...
## $ origin_2 : chr "Ethiopia" "Asia Pacific" "Southern Ethiopia" "Oromia Region" ...
## $ X100g_USD : num 14.32 9.05 4.7 4.19 4.85 ...
## $ rating : int 95 94 92 92 94 93 93 93 93 94 ...
## $ review_date: chr "November 2017" "November 2017" "November 2017" "November 2017" ...
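Dropping columns by position (10 to 12) depends on the column order staying the same. As a hedged alternative, the sketch below drops columns by name. It uses a small made-up data frame (`demo`) rather than the real data.

```r
# Toy data frame that reuses some of the kopi column names.
demo <- data.frame(
  name   = c("A", "B"),
  rating = c(95L, 94L),
  desc_1 = c("x", "y"),
  desc_2 = c("x", "y"),
  desc_3 = c("x", "y")
)

# Keep every column whose name is not in the drop list.
drop_cols  <- c("desc_1", "desc_2", "desc_3")
demo_small <- demo[, !(names(demo) %in% drop_cols), drop = FALSE]
names(demo_small)
```

Selecting by name also makes it easier to see which columns were removed when reading the code later.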
Brief summary
summary(kopi)
##      name             roaster                   roast      loc_country
##  Length:2095        Length:2095                    :  15   Length:2095
##  Class :character   Class :character   Dark        :   5   Class :character
##  Mode  :character   Mode  :character   Light       : 287   Mode  :character
##                                        Medium      : 259
##                                        Medium-Dark :  39
##                                        Medium-Light:1490
##    origin_1           origin_2           X100g_USD           rating
##  Length:2095        Length:2095        Min.   :  0.120   Min.   :84.00
##  Class :character   Class :character   1st Qu.:  4.930   1st Qu.:92.00
##  Mode  :character   Mode  :character   Median :  5.860   Median :93.00
##                                        Mean   :  9.323   Mean   :93.11
##                                        3rd Qu.:  8.785   3rd Qu.:94.00
##                                        Max.   :132.280   Max.   :98.00
##  review_date
##  Length:2095
##  Class :character
##  Mode  :character
##
##
##
Summary
* The most common roast level is Medium-Light, with 1490 coffees.
* The least common roast level is Dark, with 5 coffees (a further 15 coffees have no roast level recorded).
* The highest price is $132.28 per 100 g and the lowest is $0.12 per 100 g.
* The highest rating is 98 and the lowest is 84.
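The figures in this summary can also be computed directly instead of being read off the `summary()` printout. A minimal sketch on a small made-up data frame (`demo`, values illustrative only):

```r
# Toy data standing in for kopi.
demo <- data.frame(
  roast     = factor(c("Medium-Light", "Medium-Light", "Light", "Dark")),
  X100g_USD = c(14.32, 9.05, 4.70, 4.19),
  rating    = c(95L, 94L, 92L, 92L)
)

# Frequency of each roast level, most common first.
roast_counts <- sort(table(demo$roast), decreasing = TRUE)
roast_counts
names(roast_counts)[1]     # the most frequent roast level

# Minimum and maximum of price and rating.
price_range  <- range(demo$X100g_USD)
rating_range <- range(demo$rating)
price_range
rating_range
```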
aggregate(rating ~ origin_1, kopi, mean)
boxplot(kopi$rating)
Insight: there are outliers, and most ratings fall between 92 and 94 (the first and third quartiles).
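The outliers drawn by `boxplot()` can be listed numerically with `boxplot.stats()`, which applies the same rule (points more than 1.5 times the interquartile range beyond the hinges). A minimal sketch on a made-up rating vector, not the real data:

```r
# Illustrative ratings only; replace with kopi$rating for the real data.
ratings <- c(84L, 90L, 92L, 92L, 93L, 93L, 93L, 94L, 94L, 98L)

stats <- boxplot.stats(ratings)
stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # values flagged as outliers by the 1.5 * IQR rule
```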
Which coffee origin (origin_1 and origin_2) has the highest price?
kopi[kopi$X100g_USD == 132.280 ,]
Answer: the coffee from Boquete Growing Region (origin_1) and Western Panama (origin_2) has the highest price, $132.28 per 100 g, with a rating of 97.
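Filtering on a hard-coded price works only as long as the data does not change and the typed value matches the stored number exactly. As a hedged alternative, the maximum can be looked up from the data itself. The sketch uses a small made-up data frame (`demo`).

```r
# Toy data standing in for kopi; the values are illustrative only.
demo <- data.frame(
  name      = c("A", "B", "C"),
  origin_1  = c("Panama", "Africa", "Guji Zone"),
  X100g_USD = c(14.32, 9.05, 4.70),
  rating    = c(95L, 94L, 92L)
)

# All rows tied for the highest price.
most_expensive <- demo[demo$X100g_USD == max(demo$X100g_USD), ]
most_expensive

# which.max() returns only the first row holding the maximum.
demo[which.max(demo$X100g_USD), c("origin_1", "X100g_USD", "rating")]
```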
The mean rating across all coffees in the dataset is about 93
mean(kopi$rating)
## [1] 93.11408
Conclusions:
* A coffee's rating does not appear to be determined by its price.
* The most commonly used roast level is Medium-Light.
* Most ratings fall between 92 and 94.
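A claim about what does or does not influence rating can be checked numerically. As a minimal sketch, assuming the variable of interest is price (`X100g_USD`), a Spearman rank correlation measures how strongly the two move together. The five rows below are the first five price and rating values shown in the `str()` output. That is far too few to draw any conclusion, so the same call would need to be run on the full `kopi` data.

```r
# First five price and rating values from the str() output above.
demo <- data.frame(
  X100g_USD = c(14.32, 9.05, 4.70, 4.19, 4.85),
  rating    = c(95L, 94L, 92L, 92L, 94L)
)

# Spearman correlation uses ranks, so it is less sensitive to the
# extreme prices seen in the data than the default Pearson method.
rho <- cor(demo$X100g_USD, demo$rating, method = "spearman")
rho
```

A value near 0 would be consistent with no monotonic relationship, while values near 1 or -1 would suggest that the two variables do move together.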