Importing Packages
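The chunk that loaded the packages is not shown in this report. The following is only a minimal sketch, assuming the analysis relies on dplyr, tidyr, and lubridate (the package list is an assumption). Reinstalling a package that is already loaded in the session tends to fail on Windows with "in use" or "Permission denied" warnings, so this sketch only checks what is available and attaches it, leaving installation to a fresh session.

```r
# Assumed package list; adjust to what the analysis actually needs.
pkgs <- c("dplyr", "tidyr", "lubridate")

# Find packages that are not installed, without attaching anything yet.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  # Install these from a fresh R session, before any of them is loaded.
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# Attach the packages that are available.
for (p in pkgs[is_installed]) {
  library(p, character.only = TRUE)
}
```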
Importing Dataset
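The import chunk is not shown either. A minimal sketch, assuming base `read.csv()` was used: the column name `User.Rating` suggests so, because `read.csv()` replaces the space in a header such as "User Rating" with a dot by default. To stay self-contained, the sketch reads the six rows printed by `head()` from inline text. The header row is an assumption, and for the real file `text =` would be replaced by the path to the CSV, which this report does not give.

```r
# Six rows copied from the head() output; the header row is an assumption.
csv_text <- 'Name,Author,User Rating,Reviews,Price,Year,Genre
10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016,Non Fiction
11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction
12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018,Non Fiction
1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction
"5,000 Awesome Facts (About Everything!) (National Geographic Kids)",National Geographic Kids,4.8,7665,12,2019,Non Fiction
A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,11,2011,Fiction'

# check.names = TRUE (the default) turns "User Rating" into "User.Rating".
bestsellers_with_categories <- read.csv(text = csv_text, stringsAsFactors = FALSE)

names(bestsellers_with_categories)
```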
Exploring Dataset
head(bestsellers_with_categories)
## Name
## 1 10-Day Green Smoothie Cleanse
## 2 11/22/63: A Novel
## 3 12 Rules for Life: An Antidote to Chaos
## 4 1984 (Signet Classics)
## 5 5,000 Awesome Facts (About Everything!) (National Geographic Kids)
## 6 A Dance with Dragons (A Song of Ice and Fire)
## Author User.Rating Reviews Price Year Genre
## 1 JJ Smith 4.7 17350 8 2016 Non Fiction
## 2 Stephen King 4.6 2052 22 2011 Fiction
## 3 Jordan B. Peterson 4.7 18979 15 2018 Non Fiction
## 4 George Orwell 4.7 21424 6 2017 Fiction
## 5 National Geographic Kids 4.8 7665 12 2019 Non Fiction
## 6 George R. R. Martin 4.4 12643 11 2011 Fiction
is.null(bestsellers_with_categories)
## [1] FALSE
str(bestsellers_with_categories)
## 'data.frame': 550 obs. of 7 variables:
## $ Name : chr "10-Day Green Smoothie Cleanse" "11/22/63: A Novel" "12 Rules for Life: An Antidote to Chaos" "1984 (Signet Classics)" ...
## $ Author : chr "JJ Smith" "Stephen King" "Jordan B. Peterson" "George Orwell" ...
## $ User.Rating: num 4.7 4.6 4.7 4.7 4.8 4.4 4.7 4.7 4.7 4.6 ...
## $ Reviews : int 17350 2052 18979 21424 7665 12643 19735 19699 5983 23848 ...
## $ Price : int 8 22 15 6 12 11 30 15 3 8 ...
## $ Year : int 2016 2011 2018 2017 2019 2011 2014 2017 2018 2016 ...
## $ Genre : chr "Non Fiction" "Fiction" "Non Fiction" "Fiction" ...
The data looks good: it consists of 550 rows and 7 columns, with no
null or empty cells.
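Note that `is.null()` only tests whether the object itself is NULL, so it returns FALSE for any data frame and says nothing about individual cells. A minimal sketch of a cell-level check follows. It uses a small sample built from the first six rows printed by `head()`; `sample_books` is a hypothetical name, and the same calls would apply to `bestsellers_with_categories`.

```r
# Sample built from the head() output; two columns are enough here.
sample_books <- data.frame(
  Author = c("JJ Smith", "Stephen King", "Jordan B. Peterson",
             "George Orwell", "National Geographic Kids",
             "George R. R. Martin"),
  Price = c(8L, 22L, 15L, 6L, 12L, 11L),
  stringsAsFactors = FALSE
)

# Number of NA values in each column.
na_per_column <- colSums(is.na(sample_books))

# Number of empty or whitespace-only strings in each character column.
is_chr <- vapply(sample_books, is.character, logical(1))
empty_per_column <- vapply(
  sample_books[is_chr],
  function(x) sum(trimws(x) == "", na.rm = TRUE),
  integer(1)
)

na_per_column
empty_per_column
```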
Analyzing Dataset
The data set covers the years 2009 to 2019. The number of reviews
ranges from 37 to 87,841. Prices range from $0 to $105: 12 books cost
$0 and 2 books cost $105. User ratings range from 3.3 to 4.9.
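The code behind these figures is not shown. One way to obtain them, sketched in base R on the six rows printed by `head()` (`sample_books` is a hypothetical name), is below. On this sample the numbers differ from the full-data values quoted above; the same calls on `bestsellers_with_categories` would be expected to reproduce them.

```r
# Numeric columns of the six rows shown by head().
sample_books <- data.frame(
  User.Rating = c(4.7, 4.6, 4.7, 4.7, 4.8, 4.4),
  Reviews = c(17350L, 2052L, 18979L, 21424L, 7665L, 12643L),
  Price = c(8L, 22L, 15L, 6L, 12L, 11L),
  Year = c(2016L, 2011L, 2018L, 2017L, 2019L, 2011L)
)

# Minimum and maximum of every column, as a 2 x 4 matrix.
ranges <- sapply(sample_books, range)
rownames(ranges) <- c("min", "max")

# How many books sit at the lowest and at the highest price.
n_cheapest <- sum(sample_books$Price == min(sample_books$Price))
n_priciest <- sum(sample_books$Price == max(sample_books$Price))

ranges
```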
## # A tibble: 2 × 2
## Genre Name
## <chr> <int>
## 1 Fiction 240
## 2 Non Fiction 310
## [1] 248
There are 550 books, of which 240 are Fiction and 310 are Non
Fiction. There are 248 authors in this data. Jeff Kinney has the most
books in the list, with 12 over these years.
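The tibble output suggests these counts were produced with dplyr, but the code itself is not shown. A base R sketch of the same three quantities follows, again on the six rows printed by `head()` (`sample_books` is a hypothetical name), so the values differ from the full-data figures.

```r
# Author and Genre for the six rows shown by head().
sample_books <- data.frame(
  Author = c("JJ Smith", "Stephen King", "Jordan B. Peterson",
             "George Orwell", "National Geographic Kids",
             "George R. R. Martin"),
  Genre = c("Non Fiction", "Fiction", "Non Fiction",
            "Fiction", "Non Fiction", "Fiction"),
  stringsAsFactors = FALSE
)

# Number of books per genre.
genre_counts <- table(sample_books$Genre)

# Number of distinct authors.
n_authors <- length(unique(sample_books$Author))

# Books per author, most frequent first; the first name is the top author.
author_counts <- sort(table(sample_books$Author), decreasing = TRUE)
top_author <- names(author_counts)[1]

genre_counts
n_authors
```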
Summary
Fiction makes up 44% of the books and Non Fiction 56%. The number of
reviews increased from 2009 to 2019. Most of the books with a high
rating (4.5 to 4.9) are Non Fiction. In each year, there are more Non
Fiction books than Fiction books. Finally, there is no linear
relationship between price and rating: a higher price does not lead to
a higher rating in either Fiction or Non Fiction.
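Two of these statements can be checked numerically. The sketch below derives the genre shares from the counts reported in the analysis (240 and 310) and computes a Pearson correlation between price and rating within each genre. The correlation uses only the six rows printed by `head()` (`sample_books` is a hypothetical name), so it illustrates the method rather than the result; on the full data, values close to 0 would support the "no linear relationship" reading.

```r
# Genre counts as reported in the analysis.
genre_counts <- c(Fiction = 240, "Non Fiction" = 310)

# Share of each genre in percent, rounded to whole numbers.
genre_share <- round(100 * genre_counts / sum(genre_counts))

# Six rows shown by head(); three per genre, for illustration only.
sample_books <- data.frame(
  User.Rating = c(4.7, 4.6, 4.7, 4.7, 4.8, 4.4),
  Price = c(8, 22, 15, 6, 12, 11),
  Genre = c("Non Fiction", "Fiction", "Non Fiction",
            "Fiction", "Non Fiction", "Fiction"),
  stringsAsFactors = FALSE
)

# Pearson correlation between price and rating, computed per genre.
cor_by_genre <- sapply(
  split(sample_books, sample_books$Genre),
  function(d) cor(d$Price, d$User.Rating)
)

genre_share
cor_by_genre
```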