Load the Hurricane Harvey resource contact info into R.

setwd("~/Desktop")
library(readxl)
Harvey <- read_excel("~/Desktop/Harvey.xlsx")
[110, 3]: expecting numeric: got 'anne.imber@gmail.com'
[113, 3]: expecting numeric: got '1415 CALIFORNIA ST Houston, TX 77006'
[114, 3]: expecting numeric: got '3317 Montrose Blvd. Houston, TX 77006'
[115, 3]: expecting numeric: got '2612 Smith St. Houston, TX 77006'
[116, 3]: expecting numeric: got '6400 Fannin St. Houston, TX77030'
[117, 3]: expecting numeric: got '6128 Broadway St Galveston, TX 77551'
head(Harvey)
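The warnings above suggest that read_excel guessed a numeric type for column 3 and then ran into text further down the sheet. One possible workaround, assuming it is acceptable to treat every contact field as text, is to pass col_types = "text". The helper name read_all_text is hypothetical, and the sketch simply returns NULL when readxl or the file is not available.

```r
# Hypothetical helper: read every column as text so mixed cells
# (phone numbers, emails, addresses) are not coerced to numeric.
read_all_text <- function(path) {
  if (!requireNamespace("readxl", quietly = TRUE) || !file.exists(path)) {
    return(NULL)  # nothing to read on this machine
  }
  readxl::read_excel(path, col_types = "text")
}

Harvey_text <- read_all_text("~/Desktop/Harvey.xlsx")
if (!is.null(Harvey_text)) head(Harvey_text)
```

Columns that really are numeric can be converted afterwards with as.numeric().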

Look at column names

names(Harvey)
[1] "Who"      "Number"   "Email"   
[4] "Category" "Reason"   "City"    
[7] "Notes"    "Source"  

Look for missing values

sum(is.na(Harvey))
[1] 481
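A single total does not show where the missing values sit. A minimal sketch of a per-column count follows; it uses a made-up data frame named contacts as a stand-in for Harvey, so the names and values are assumptions for illustration only.

```r
# Toy data frame standing in for Harvey (values are made up).
contacts <- data.frame(
  Who    = c("Shelter A", "Clinic B", "Food Bank C"),
  Number = c("555-0100", NA, "555-0102"),
  Email  = c(NA, NA, "info@example.org"),
  stringsAsFactors = FALSE
)

na_by_col <- colSums(is.na(contacts))   # NA count per column
na_share  <- colMeans(is.na(contacts))  # share of NAs per column
print(na_by_col)
print(round(na_share, 2))
```

Running colSums(is.na(Harvey)) on the real data would show which of the eight columns account for most of the 481 missing cells.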

This has too many missing values for practice. I’m going to switch to another data set.

Practice Guided by https://www.youtube.com/watch?v=S7l1z301G30

setwd("~/Desktop")
library(readxl)
Films <- read_excel("~/Desktop/Movies.xlsx")
head(Films)
names(Films)
[1] "Title"        "Year"        
[3] "Rating"       "Runtime"     
[5] "Critic.Score" "Box.Office"  

Rename a column

names(Films)[5]<- "Critic.Score"
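Renaming by position breaks silently if the column order changes. A small sketch of renaming by name instead is shown here; the data frame films and the old name CriticScore are hypothetical, chosen only to illustrate the pattern.

```r
# Toy data frame; "CriticScore" is a hypothetical original column name.
films <- data.frame(Title = "Example", CriticScore = 70)

# Rename by matching the old name rather than by column index.
names(films)[names(films) == "CriticScore"] <- "Critic.Score"
names(films)
```

If no column matches the old name, the assignment changes nothing, so a typo cannot overwrite the wrong column.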

Count Missing Values

sum(is.na(Films))
[1] 0

I’m going to delete some values in the Excel file for practice and reload it.

setwd("~/Desktop")
library(readxl)
Movies1 <- read_excel("~/Desktop/Movies.xlsx")
head(Movies1)
names(Movies1)[5]<- "Critic.Score"
sum(is.na(Movies1))
[1] 3

Delete rows with NA values.

q <- na.omit(Movies1)

Replace NA values with mean values

Movies2 <- Movies1  # work on a copy so Movies1 keeps its NAs
Movies2$Runtime[is.na(Movies2$Runtime)] <- mean(Movies2$Runtime, na.rm = TRUE)
sum(is.na(Movies2))
[1] 2
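The two approaches above, dropping rows versus filling in the mean, can be compared on a tiny example. This is a sketch on a made-up data frame named movies, not the real Movies.xlsx data.

```r
# Toy data with one missing Runtime (values are made up).
movies <- data.frame(
  Title   = c("A", "B", "C", "D"),
  Runtime = c(90, NA, 120, 150),
  stringsAsFactors = FALSE
)

# Option 1: drop every row that contains an NA.
dropped <- na.omit(movies)

# Option 2: keep all rows and fill the gap with the column mean.
imputed <- movies
runtime_mean <- mean(imputed$Runtime, na.rm = TRUE)
imputed$Runtime[is.na(imputed$Runtime)] <- runtime_mean

nrow(dropped)         # rows left after na.omit
sum(is.na(imputed))   # NAs left after imputation
```

Mean imputation keeps the row count but shrinks the variance of the column, which is worth keeping in mind before modelling.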

Create a categorical variable by binning Critic.Score

Movies3 <- cut(Movies2$Critic.Score, breaks = c(0, 50, 75, 100), labels = c("Bad", "Decent", "Good"))
Movies3
   [1] Bad    Good   Bad    Decent
   [5] Bad    Bad    Decent Bad   
   [9] Good   Decent Bad    Bad   
  [13] Good   Bad    Bad    Bad   
  [17] Bad    Bad    Good   Bad   
  [21] Bad    Good   Bad    Bad   
  [25] Decent Bad    Good   Bad   
  [29] Bad    Bad    Bad    Good  
  [33] Bad    Bad    Bad    Bad   
  [37] Bad    Good   Decent Bad   
  [41] Decent Good   Good   Bad   
  [45] Bad    Decent Good   Bad   
  [49] Bad    Good   Bad    Bad   
  [53] Bad    Bad    Decent Decent
  [57] Bad    Bad    Bad    Bad   
  [61] Bad    Decent Bad    Decent
  [65] Bad    Bad    Bad    Decent
  [69] Good   Good   Bad    Decent
  [73] Good   Bad    Decent Bad   
  [77] Good   Good   Good   Good  
  [81] Decent Decent Decent Decent
  [85] Good   Good   Decent Bad   
  [89] Good   Good   Decent Decent
  [93] Good   Good   Decent Bad   
  [97] Bad    Good   Decent Bad   
 [101] Bad    Bad    Bad    Bad   
 [105] Decent Bad    <NA>   Bad   
 [109] Bad    Bad    Bad    Decent
 [113] Bad    Bad    Bad    Bad   
 [117] Bad    Decent Bad    Decent
 [121] Decent Bad    Good   Bad   
 [125] Bad    Decent Bad    Bad   
 [129] Decent Bad    Bad    Bad   
 [133] Bad    Bad    Bad    Bad   
 [137] Good   Decent Good   Bad   
 [141] Good   Bad    Good   Bad   
 [145] Bad    Bad    Bad    Bad   
 [149] Decent Bad    Decent Bad   
 [153] Decent Decent Decent Decent
 [157] Decent Bad    Decent Bad   
 [161] Bad    Bad    Bad    Bad   
 [165] Decent Good   Decent Decent
 [169] Bad    Bad    Bad    Good  
 [173] Decent Bad    Bad    Bad   
 [177] Bad    Good   Decent Decent
 [181] Decent Good   Bad    Bad   
 [185] Bad    Bad    Bad    Bad   
 [189] Good   Bad    Bad    Decent
 [193] Bad    Bad    Bad    Decent
 [197] Decent Decent Bad    Bad   
 [201] Decent Bad    Decent Decent
 [205] Good   Bad    Bad    Bad   
 [209] Bad    Bad    Good   Decent
 [213] Bad    Decent Decent Bad   
 [217] Good   Good   Good   Decent
 [221] Bad    Bad    Bad    Decent
 [225] Good   Bad    Bad    Bad   
 [229] Good   Bad    Good   Decent
 [233] Bad    Bad    Bad    Decent
 [237] Good   Decent Bad    Bad   
 [241] Decent Good   Good   Bad   
 [245] Decent Good   Bad    Good  
 [249] Decent Good   Good   Bad   
 [253] Bad    Decent Bad    Good  
 [257] Bad    Decent Good   Decent
 [261] Decent Good   Decent Good  
 [265] Decent Bad    Bad    Decent
 [269] Decent Bad    Bad    Bad   
 [273] <NA>   Bad    Bad    Bad   
 [277] Decent Decent Bad    Bad   
 [281] Bad    Bad    Bad    Bad   
 [285] Bad    Decent Bad    Decent
 [289] Bad    Bad    Decent Bad   
 [293] Bad    Bad    Good   Bad   
 [297] Bad    Decent Bad    Good  
 [301] Bad    Good   Bad    Bad   
 [305] Good   Bad    Bad    Bad   
 [309] Bad    Bad    Bad    Bad   
 [313] Decent Decent Good   Decent
 [317] Bad    Good   Bad    Good  
 [321] Decent Bad    Bad    Good  
 [325] Bad    Bad    Decent Good  
 [329] Bad    Good   Bad    Bad   
 [333] Decent Bad    Decent Good  
 [337] Bad    Bad    Good   Decent
 [341] Bad    Decent Good   Decent
 [345] Bad    Bad    Decent Bad   
 [349] Good   Decent Bad    Decent
 [353] Bad    Bad    Decent Bad   
 [357] Decent Bad    Bad    Bad   
 [361] Bad    Good   Bad    <NA>  
 [365] Bad    Bad    Bad    Bad   
 [369] <NA>   Bad    Bad    Decent
 [373] Good   Decent Decent Bad   
 [377] Decent Decent Bad    Decent
 [381] Decent Decent Decent Decent
 [385] Bad    <NA>   Decent Bad   
 [389] Bad    Bad    Decent Good  
 [393] Good   Decent Good   Good  
 [397] Good   Good   Bad    Good  
 [401] Bad    Bad    Good   Decent
 [405] Bad    Decent Decent Bad   
 [409] Good   Bad    Bad    Good  
 [413] Bad    Bad    Bad    Good  
 [417] Good   Decent Bad    Good  
 [421] <NA>   Decent Good   Good  
 [425] Good   Good   Good   Decent
 [429] Good   Good   Bad    Good  
 [433] Good   Decent Decent Bad   
 [437] Bad    Bad    Decent Decent
 [441] Decent Good   Good   Decent
 [445] Good   Good   Bad    Good  
 [449] Good   Bad    Decent Bad   
 [453] Bad    Bad    Bad    Bad   
 [457] Bad    Decent Bad    Bad   
 [461] Bad    Bad    Decent Bad   
 [465] Bad    Bad    Bad    Decent
 [469] Bad    Bad    Bad    Bad   
 [473] Bad    Bad    Decent Bad   
 [477] Bad    Decent Bad    Bad   
 [481] Bad    Bad    Bad    Bad   
 [485] Bad    Bad    Bad    Bad   
 [489] Good   Bad    Good   Bad   
 [493] Decent Bad    Decent Bad   
 [497] Good   Good   Bad    Decent
 [501] Decent Bad    Bad    Bad   
 [505] Good   Decent Bad    Bad   
 [509] Bad    Bad    Bad    Decent
 [513] Bad    Bad    Bad    Bad   
 [517] Decent Good   Bad    Bad   
 [521] Bad    Bad    Bad    Good  
 [525] Bad    Decent Decent Decent
 [529] Good   Bad    Bad    Bad   
 [533] Bad    Good   Bad    Bad   
 [537] Bad    Decent Bad    Good  
 [541] Bad    Bad    Bad    Good  
 [545] Good   Decent Bad    Bad   
 [549] Good   Bad    Decent Bad   
 [553] Bad    Decent Decent Good  
 [557] Good   Bad    Bad    Decent
 [561] Good   Bad    Bad    Decent
 [565] Bad    Bad    Bad    Bad   
 [569] Bad    Bad    Bad    Good  
 [573] Good   Decent Bad    Decent
 [577] Decent Good   Bad    Bad   
 [581] Decent Bad    Good   Bad   
 [585] Good   Bad    Decent Decent
 [589] Bad    Decent Good   Bad   
 [593] Bad    Good   Bad    Bad   
 [597] Decent Decent Good   Decent
 [601] Bad    Good   Good   Good  
 [605] Good   Good   Good   Bad   
 [609] Decent Good   Bad    Decent
 [613] Decent Decent Bad    Decent
 [617] Bad    Good   Good   Decent
 [621] Bad    Good   Decent Good  
 [625] Bad    Bad    Good   Decent
 [629] Bad    Bad    Bad    Bad   
 [633] Good   Decent Bad    Bad   
 [637] Bad    Decent Bad    Bad   
 [641] Bad    Bad    Bad    Bad   
 [645] Bad    Decent Bad    Decent
 [649] Bad    Bad    Decent Bad   
 [653] Good   Decent Bad    Bad   
 [657] Good   Bad    Bad    Decent
 [661] Bad    Decent Bad    Bad   
 [665] Bad    Bad    Decent Bad   
 [669] Decent Bad    Bad    Bad   
 [673] Good   Bad    Decent Bad   
 [677] Good   Bad    Bad    Bad   
 [681] Bad    Good   Bad    Decent
 [685] Bad    Bad    Bad    Decent
 [689] Decent Bad    Good   Good  
 [693] Bad    Bad    Decent Bad   
 [697] Bad    Good   Bad    Bad   
 [701] Bad    Good   Bad    Decent
 [705] Good   Good   Bad    Decent
 [709] Good   Bad    Good   Bad   
 [713] Bad    Good   Bad    Bad   
 [717] Good   Bad    Bad    Decent
 [721] Bad    Bad    <NA>   Decent
 [725] Bad    Bad    Bad    Decent
 [729] Good   Decent Bad    Decent
 [733] Decent Decent Bad    Bad   
 [737] Good   Decent Good   Bad   
 [741] Good   Bad    Good   Bad   
 [745] Bad    Bad    Good   Good  
 [749] Bad    Bad    Bad    Bad   
 [753] Decent Decent Bad    Bad   
 [757] Bad    Bad    Good   Bad   
 [761] Decent Decent Decent Bad   
 [765] Bad    Decent Bad    Bad   
 [769] Bad    Bad    Bad    Good  
 [773] Bad    Decent Decent Decent
 [777] Bad    Decent Bad    Bad   
 [781] Bad    Good   Bad    Good  
 [785] Good   Good   Good   Bad   
 [789] Good   Bad    Bad    Good  
 [793] Bad    Good   Decent Good  
 [797] Decent Bad    Good   Decent
 [801] Decent Bad    Bad    Decent
 [805] Bad    Decent Good   Decent
 [809] Decent Decent Decent Decent
 [813] Bad    Decent Bad    Bad   
 [817] Good   Bad    Bad    Bad   
 [821] Decent Decent Bad    Bad   
 [825] Bad    Bad    Bad    Decent
 [829] Good   <NA>   Bad    Decent
 [833] Bad    Bad    Bad    Bad   
 [837] Bad    Bad    Bad    Decent
 [841] Bad    Decent Bad    Bad   
 [845] Bad    Decent Bad    Decent
 [849] Bad    Bad    Bad    Bad   
 [853] Decent Decent Bad    Decent
 [857] Bad    Bad    Bad    Good  
 [861] Bad    Bad    Bad    Decent
 [865] Bad    Decent Bad    Good  
 [869] Bad    Bad    Decent Good  
 [873] Good   Good   Decent Decent
 [877] Bad    Good   Bad    Bad   
 [881] Bad    Decent Good   Decent
 [885] Good   Bad    Bad    Bad   
 [889] Decent Good   Decent Good  
 [893] Bad    Good   Decent Bad   
 [897] Decent Bad    Decent Bad   
 [901] Bad    Good   Good   Bad   
 [905] Bad    Bad    Bad    Bad   
 [909] Good   Good   Bad    Good  
 [913] Good   Bad    Bad    Good  
 [917] Good   Bad    Bad    Decent
 [921] Decent Bad    Bad    Decent
 [925] Decent Bad    Decent Bad   
 [929] Good   Good   Bad    Bad   
 [933] Good   Good   Bad    Decent
 [937] Decent Decent Bad    Bad   
 [941] Bad    Decent Good   Bad   
 [945] Bad    Decent Bad    Decent
 [949] Bad    Bad    <NA>   Decent
 [953] Bad    Decent Bad    Bad   
 [957] Decent Decent Bad    Good  
 [961] Bad    Decent Bad    Good  
 [965] Good   Good   Bad    Bad   
 [969] Bad    Good   Bad    Bad   
 [973] Decent Bad    Decent Decent
 [977] Good   Good   Decent Good  
 [981] Bad    Bad    Bad    Bad   
 [985] Decent Bad    Good   Bad   
 [989] Bad    Decent Good   Bad   
 [993] Bad    Good   Bad    Good  
 [997] Decent Bad    Decent Decent
 [ reached getOption("max.print") -- omitted 2238 entries ]
Levels: Bad Decent Good
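By default cut() builds intervals that are open on the left, so with breaks 0, 50, 75, 100 a score of exactly 0 falls outside (0, 50] and comes back as NA. That may explain some of the NA entries in the output, though this has not been checked against the real data. The sketch below uses made-up scores to show the effect of include.lowest = TRUE, a table() summary as a shorter alternative to printing the whole factor, and model.matrix() for actual 0/1 indicator columns.

```r
scores <- c(0, 10, 50, 51, 75, 76, 100, NA)  # made-up scores
labels <- c("Bad", "Decent", "Good")

# Default: intervals (0,50], (50,75], (75,100]; a score of 0 becomes NA.
default_bins <- cut(scores, breaks = c(0, 50, 75, 100), labels = labels)

# include.lowest = TRUE closes the first interval: [0,50].
closed_bins <- cut(scores, breaks = c(0, 50, 75, 100), labels = labels,
                   include.lowest = TRUE)

# Counts per level, including NAs, instead of printing every entry.
table(default_bins, useNA = "ifany")
table(closed_bins, useNA = "ifany")

# Indicator (dummy) columns, one per level; rows with NA are dropped.
dummies <- model.matrix(~ closed_bins - 1)
colSums(dummies)
```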

Regress Box.Office on Critic.Score

reg1 <- lm(Box.Office ~ Critic.Score, data = Movies2)  # Movies3 is a factor vector, not a data frame
summary(reg1)

Call:
lm(formula = Box.Office ~ Critic.Score, data = Movies2)

Residuals:
   Min     1Q Median     3Q    Max 
-59.33 -38.04 -20.06  12.98 707.21 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  21.85118    2.32123   9.414   <2e-16 ***
Critic.Score  0.37880    0.04082   9.280   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 64.25 on 3234 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared:  0.02594,   Adjusted R-squared:  0.02564 
F-statistic: 86.12 on 1 and 3234 DF,  p-value: < 2.2e-16

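The fitted model is Box.Office = 21.85118 + 0.37880 * Critic.Score. The slope is statistically significant (p < 2e-16), but R-squared is only about 0.026, so Critic.Score explains very little of the variation in Box.Office.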
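To make the fitted line concrete, the coefficients from the summary can be plugged into a small function. The function name predict_box_office is hypothetical, and the second half fits a made-up, perfectly linear toy data set only to show that predict() performs the same arithmetic on a real lm object.

```r
# Coefficients copied from the regression summary.
b0 <- 21.85118  # intercept
b1 <- 0.37880   # slope on Critic.Score

# Hypothetical helper: predicted Box.Office for a given Critic.Score.
predict_box_office <- function(critic_score) b0 + b1 * critic_score
predict_box_office(c(0, 50, 100))

# Same idea with predict() on a toy fit (data are made up: y = 1 + 0.4 * x).
toy <- data.frame(Critic.Score = c(10, 20, 30, 40),
                  Box.Office   = c(5, 9, 13, 17))
toy_fit  <- lm(Box.Office ~ Critic.Score, data = toy)
toy_pred <- predict(toy_fit, newdata = data.frame(Critic.Score = 50))
toy_pred
```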

If we needed to reduce the right skew in Box.Office, we could try a log transform like the following:

summary(Movies1$Box.Office)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
  0.0002   1.0000  16.1000  40.6800  51.5000 760.5000        1 
summary(log10(Movies1$Box.Office+1))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
0.0000786 0.3010000 1.2330000 1.1060000 1.7200000 2.8820000         1 
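As a quick check of what the transform does, the sketch below applies log10(x + 1) to the five quantiles reported in the first summary (Min, 1st Qu., Median, 3rd Qu., Max). The + 1 keeps values near zero from turning into large negative numbers; log1p() is base R's numerically stable version of log(1 + x), and dividing by log(10) gives the same base-10 result.

```r
# Quantiles of Box.Office taken from the summary output above.
box_office <- c(0.0002, 1, 16.1, 51.5, 760.5)

logged <- log10(box_office + 1)
round(logged, 3)   # 0.000 0.301 1.233 1.720 2.882

# Equivalent form using log1p(), which is more accurate for tiny x.
logged_alt <- log1p(box_office) / log(10)
all.equal(logged, logged_alt)
```

The raw values span more than six orders of magnitude, while the transformed values stay between 0 and 3.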