Central Valley Data Analytics Meetup - 10.30.2021
Zhenning "Jimmy" Xu, follow me on Twitter: https://twitter.com/MKTJimmyxu
10.29.2021
This presentation was created using RStudio and R Markdown. For more details on authoring R presentations, please visit https://support.rstudio.com/hc/en-us/articles/200486468.
Ref:
Why R? https://techvidvan.com/tutorials/r-tutorial/
R Career - Discover various Opportunities and Scope of R Programming! https://data-flair.training/blogs/r-careers/
EDA is used to understand the features of our data. The following are some common procedures:
Exploratory Data Analysis (EDA) and data visualization often go hand in hand. The following are some popular methods:
Top marketing skills to add to your resume:
# We will be using the avocado dataset for this exercise
urlfile <- 'https://raw.github.com/utjimmyx/resources/master/avocado_HAA.csv'
data <- read.csv(urlfile, fileEncoding = "UTF-8-BOM")
summary(data)
     date           average_price    total_volume         type
 Length:12628       Min.   :0.500   Min.   :    253   Length:12628
 Class :character   1st Qu.:1.100   1st Qu.:  15733   Class :character
 Mode  :character   Median :1.320   Median :  94806   Mode  :character
                    Mean   :1.359   Mean   : 325259
                    3rd Qu.:1.570   3rd Qu.: 430222
                    Max.   :2.780   Max.   :5660216
      year       geography
 Min.   :2017   Length:12628
 1st Qu.:2018   Class :character
 Median :2019   Mode  :character
 Mean   :2019
 3rd Qu.:2020
 Max.   :2020
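The summary above reports only Length, Class, and Mode for `date`, `type`, and `geography` because they are read in as character columns. A minimal sketch of one way to make them more informative follows. It runs on a small made-up data frame rather than the real file, and it assumes that dates are stored as `YYYY-MM-DD` and that `type` holds labels such as "conventional" and "organic"; neither is confirmed by the output above. The helper name `prepare_columns` is hypothetical.

```r
# Toy rows that reuse the column names from the summary above.
# All values are made up for illustration; they are not from the avocado file.
toy <- data.frame(
  date = c("2017-01-02", "2017-01-09", "2017-01-02", "2017-01-09"),
  average_price = c(1.10, 1.25, 1.60, 1.70),
  total_volume = c(400000, 350000, 20000, 18000),
  type = c("conventional", "conventional", "organic", "organic"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert character columns to types summary() can describe.
# The date format is an assumption; adjust date_format if the file differs.
prepare_columns <- function(df, date_format = "%Y-%m-%d") {
  df$date <- as.Date(df$date, format = date_format)
  df$type <- factor(df$type)
  df
}

toy_typed <- prepare_columns(toy)

# summary() now shows a date range and counts per type
summary(toy_typed)

# Mean price per type, a common first EDA step
price_by_type <- aggregate(average_price ~ type, data = toy_typed, FUN = mean)
print(price_by_type)
```

If the assumptions hold, the same helper could be applied to the real data with `prepare_columns(data)`.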
cor(data$total_volume, data$average_price, method = "pearson", use = "complete.obs")
[1] -0.4169306
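A Pearson coefficient of about -0.42 suggests a moderate negative linear association between volume and price. Because `total_volume` is strongly right-skewed in the summary (mean 325259 versus median 94806), it may be worth comparing Pearson with a rank-based measure. The sketch below is one possible way to do that with base R's `cor.test`. The helper name `compare_correlations` is hypothetical, and the toy values are made up so the block runs without downloading the file.

```r
# Hypothetical helper: Pearson and Spearman estimates for two numeric columns.
# cor.test() drops incomplete pairs on its own, similar to use = "complete.obs".
compare_correlations <- function(df, x, y) {
  pearson <- cor.test(df[[x]], df[[y]], method = "pearson")
  # exact = FALSE avoids the exact p-value computation, which warns on ties
  spearman <- cor.test(df[[x]], df[[y]], method = "spearman", exact = FALSE)
  c(
    pearson = unname(pearson$estimate),
    pearson_p = pearson$p.value,
    spearman = unname(spearman$estimate)
  )
}

# Made-up values for illustration only (not taken from the avocado data)
toy <- data.frame(
  total_volume = c(1000, 5000, 20000, 100000, 500000, 2000000),
  average_price = c(2.0, 1.9, 1.7, 1.8, 1.2, 1.0)
)

result <- compare_correlations(toy, "total_volume", "average_price")
print(result)
```

On the real data the call would be `compare_correlations(data, "total_volume", "average_price")`. A Spearman estimate that differs noticeably from the Pearson one would hint that the skew in volume is affecting the linear measure.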