I loaded the packages and data needed for the assignment, then listed the variable names and printed a summary of every variable.

##  [1] "work"       "hoursw"     "child6"     "child618"   "agew"      
##  [6] "educw"      "hearnw"     "wagew"      "hoursh"     "ageh"      
## [11] "educh"      "wageh"      "income"     "educwm"     "educwf"    
## [16] "unemprate"  "city"       "experience"
##   work         hoursw           child6          child618    
##  yes:325   Min.   :   0.0   Min.   :0.0000   Min.   :0.000  
##  no :428   1st Qu.:   0.0   1st Qu.:0.0000   1st Qu.:0.000  
##            Median : 288.0   Median :0.0000   Median :1.000  
##            Mean   : 740.6   Mean   :0.2377   Mean   :1.353  
##            3rd Qu.:1516.0   3rd Qu.:0.0000   3rd Qu.:2.000  
##            Max.   :4950.0   Max.   :3.0000   Max.   :8.000  
##       agew           educw           hearnw           wagew     
##  Min.   :30.00   Min.   : 5.00   Min.   : 0.000   Min.   :0.00  
##  1st Qu.:36.00   1st Qu.:12.00   1st Qu.: 0.000   1st Qu.:0.00  
##  Median :43.00   Median :12.00   Median : 1.625   Median :0.00  
##  Mean   :42.54   Mean   :12.29   Mean   : 2.375   Mean   :1.85  
##  3rd Qu.:49.00   3rd Qu.:13.00   3rd Qu.: 3.788   3rd Qu.:3.58  
##  Max.   :60.00   Max.   :17.00   Max.   :25.000   Max.   :9.98  
##      hoursh          ageh           educh           wageh        
##  Min.   : 175   Min.   :30.00   Min.   : 3.00   Min.   : 0.4121  
##  1st Qu.:1928   1st Qu.:38.00   1st Qu.:11.00   1st Qu.: 4.7883  
##  Median :2164   Median :46.00   Median :12.00   Median : 6.9758  
##  Mean   :2267   Mean   :45.12   Mean   :12.49   Mean   : 7.4822  
##  3rd Qu.:2553   3rd Qu.:52.00   3rd Qu.:15.00   3rd Qu.: 9.1667  
##  Max.   :5010   Max.   :60.00   Max.   :17.00   Max.   :40.5090  
##      income          educwm           educwf         unemprate     
##  Min.   : 1500   Min.   : 0.000   Min.   : 0.000   Min.   : 3.000  
##  1st Qu.:15428   1st Qu.: 7.000   1st Qu.: 7.000   1st Qu.: 7.500  
##  Median :20880   Median :10.000   Median : 7.000   Median : 7.500  
##  Mean   :23081   Mean   : 9.251   Mean   : 8.809   Mean   : 8.624  
##  3rd Qu.:28200   3rd Qu.:12.000   3rd Qu.:12.000   3rd Qu.:11.000  
##  Max.   :96000   Max.   :17.000   Max.   :17.000   Max.   :14.000  
##   city       experience   
##  no :269   Min.   : 0.00  
##  yes:484   1st Qu.: 4.00  
##            Median : 9.00  
##            Mean   :10.63  
##            3rd Qu.:15.00  
##            Max.   :45.00

Select Four Continuous Variables

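I converted the dataset to a dplyr table data frame with tbl_df(), which prints only the first ten rows and the columns that fit on screen. (In current dplyr releases tbl_df() is deprecated in favor of as_tibble().)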

Mroz <- tbl_df(Mroz)
Mroz
## Source: local data frame [753 x 18]
## 
##    work hoursw child6 child618 agew educw hearnw wagew hoursh ageh educh
## 1    no   1610      1        0   32    12 3.3540  2.65   2708   34    12
## 2    no   1656      0        2   30    12 1.3889  2.65   2310   30     9
## 3    no   1980      1        3   35    12 4.5455  4.04   3072   40    12
## 4    no    456      0        3   34    12 1.0965  3.25   1920   53    10
## 5    no   1568      1        2   31    14 4.5918  3.60   2000   32    12
## 6    no   2032      0        0   54    12 4.7421  4.70   1040   57    11
## 7    no   1440      0        2   37    16 8.3333  5.95   2670   37    12
## 8    no   1020      0        0   54    12 7.8431  9.98   4120   53     8
## 9    no   1458      0        2   48    12 2.1262  0.00   1995   52     4
## 10   no   1600      0        2   39    12 4.6875  4.15   2100   43    12
## ..  ...    ...    ...      ...  ...   ...    ...   ...    ...  ...   ...
## Variables not shown: wageh (dbl), income (int), educwm (int), educwf
##   (int), unemprate (dbl), city (fctr), experience (int)

I chose four continuous variables (hoursw, hoursh, income, and ageh) and used select() to keep only those columns in a new data frame, Mroz1. Then I ran a summary of Mroz1.

Mroz1 <- Mroz %>%
  select(hoursw, hoursh, income, ageh)
Mroz1
## Source: local data frame [753 x 4]
## 
##    hoursw hoursh income ageh
## 1    1610   2708  16310   34
## 2    1656   2310  21800   30
## 3    1980   3072  21040   40
## 4     456   1920   7300   53
## 5    1568   2000  27300   32
## 6    2032   1040  19495   57
## 7    1440   2670  21152   37
## 8    1020   4120  18900   53
## 9    1458   1995  20405   52
## 10   1600   2100  20425   43
## ..    ...    ...    ...  ...
summary(Mroz1)
##      hoursw           hoursh         income           ageh      
##  Min.   :   0.0   Min.   : 175   Min.   : 1500   Min.   :30.00  
##  1st Qu.:   0.0   1st Qu.:1928   1st Qu.:15428   1st Qu.:38.00  
##  Median : 288.0   Median :2164   Median :20880   Median :46.00  
##  Mean   : 740.6   Mean   :2267   Mean   :23081   Mean   :45.12  
##  3rd Qu.:1516.0   3rd Qu.:2553   3rd Qu.:28200   3rd Qu.:52.00  
##  Max.   :4950.0   Max.   :5010   Max.   :96000   Max.   :60.00

Estimate Pearson Product-Moment Correlations for four pairs of variables.

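I ran Pearson correlations with rquery.cormat(), which returns the correlation coefficients ($r), their p-values ($p), and a symbolic summary ($sym) for every pair of the four variables. Four of those pairs are tested below.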

rquery.cormat(Mroz1)

## $r
##        hoursh   ageh hoursw income
## hoursh      1                     
## ageh   -0.095      1              
## hoursw -0.056 -0.031      1       
## income   0.13  0.041   0.15      1
## 
## $p
##         hoursh ageh  hoursw income
## hoursh       0                    
## ageh    0.0088    0               
## hoursw    0.12 0.39       0       
## income 0.00042 0.27 5.6e-05      0
## 
## $sym
##        hoursh ageh hoursw income
## hoursh 1                        
## ageh          1                 
## hoursw             1            
## income                    1     
## attr(,"legend")
## [1] 0 ' ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
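
For a single pair, the same kind of estimate and p-value can be obtained with base R's cor() and cor.test(). The sketch below is illustrative only: it assumes a small data frame, mroz_head (a hypothetical name), built from the ten rows of Mroz1 printed earlier in this report, so its numbers will not match the full 753-row results above.

```r
# First ten rows of Mroz1 as printed in this report (NOT the full 753 rows).
# mroz_head is a hypothetical name used only for this illustration.
mroz_head <- data.frame(
  hoursw = c(1610, 1656, 1980, 456, 1568, 2032, 1440, 1020, 1458, 1600),
  hoursh = c(2708, 2310, 3072, 1920, 2000, 1040, 2670, 4120, 1995, 2100),
  income = c(16310, 21800, 21040, 7300, 27300, 19495, 21152, 18900, 20405, 20425),
  ageh   = c(34, 30, 40, 53, 32, 57, 37, 53, 52, 43)
)

# Matrix of pairwise Pearson correlations for all four variables.
r_matrix <- cor(mroz_head, method = "pearson")

# Test for one pair: the null hypothesis is that the population correlation is zero.
test_hh_age <- cor.test(mroz_head$hoursh, mroz_head$ageh, method = "pearson")

test_hh_age$estimate  # sample correlation r for hoursh and ageh
test_hh_age$p.value   # two-sided p-value for the null hypothesis of zero correlation
```

Running cor.test(Mroz1$hoursh, Mroz1$ageh) on the full data should correspond to the hoursh/ageh entries of the $r and $p tables above.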

Test null hypotheses that the population correlations = 0 for the four pairs of variables you selected.

Hypothesis (null): The population correlation between hoursh and ageh equals zero.

Hypothesis (alternative): The population correlation between hoursh and ageh does not equal zero.

A Pearson correlation was used to test the hypothesis. The null hypothesis is rejected at the .05 level, r = -.095, p = .0088.


Hypothesis (null): The population correlation between hoursh and income equals zero.

Hypothesis (alternative): The population correlation between hoursh and income does not equal zero.

A Pearson correlation was used to test the hypothesis. The null hypothesis is rejected at the .05 level, r = .13, p = .00042.


Hypothesis (null): The population correlation between ageh and income equals zero.

Hypothesis (alternative): The population correlation between ageh and income does not equal zero.

A Pearson correlation was used to test the hypothesis. We fail to reject the null hypothesis at the .05 level, r = .041, p = .27.


Hypothesis (null): The population correlation between hoursw and income equals zero.

Hypothesis (alternative): The population correlation between hoursw and income does not equal zero.

A Pearson correlation was used to test the hypothesis. The null hypothesis is rejected at the .05 level, r = .15, p < .001.
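
Each decision above rests on the usual t statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below recomputes the two-sided p-values under two assumptions: n = 753 (the row count reported for Mroz) and the rounded r values from the $r table. The helper name p_from_r is hypothetical. Because r is rounded, the results should be close to, but not identical to, the $p table.

```r
# Two-sided p-value for H0: population correlation = 0,
# given a sample correlation r and sample size n.
# p_from_r is a hypothetical helper written for this illustration.
p_from_r <- function(r, n) {
  df <- n - 2                              # degrees of freedom
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)   # t statistic for Pearson's r
  2 * pt(-abs(t_stat), df)                 # two-sided tail probability
}

n <- 753  # assumption: number of rows reported for Mroz

# Rounded correlations copied from the $r table for the four tested pairs.
r_values <- c(
  hoursh_ageh   = -0.095,
  hoursh_income =  0.13,
  ageh_income   =  0.041,
  hoursw_income =  0.15
)

p_values <- sapply(r_values, p_from_r, n = n)
reject   <- p_values < 0.05  # TRUE where the null hypothesis is rejected at .05

print(round(p_values, 5))
print(reject)
```

With only 753 observations needed to detect correlations this small, a rejected null hypothesis here indicates a statistically detectable association, not a strong one.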


Using ggvis, plot scatterplots containing points and a smooth line for the four pairs of variables you selected.

Scatterplot for hoursh and ageh

Mroz1 %>%
  ggvis(x = ~hoursh, y = ~ageh) %>%
  layer_points() %>%
  layer_smooths() %>%
  add_axis("x", title = "hoursh", title_offset = 50) %>%
  add_axis("y", title = "ageh", title_offset = 50)

Scatterplot for hoursh and income

Mroz1 %>%
  ggvis(x = ~hoursh, y = ~income) %>%
  layer_points() %>%
  layer_smooths() %>%
  add_axis("x", title = "hoursh", title_offset = 50) %>%
  add_axis("y", title = "income", title_offset = 50)

Scatterplot for ageh and income

Mroz1 %>%
  ggvis(x = ~ageh, y = ~income) %>%
  layer_points() %>%
  layer_smooths() %>%
  add_axis("x", title = "ageh", title_offset = 50) %>%
  add_axis("y", title = "income", title_offset = 50)

Scatterplot for hoursw and income

Mroz1 %>%
  ggvis(x = ~hoursw, y = ~income) %>%
  layer_points() %>%
  layer_smooths() %>%
  add_axis("x", title = "hoursw", title_offset = 50) %>%
  add_axis("y", title = "income", title_offset = 50)

Heat Map

The following code draws a heat map of the same correlations shown in the correlogram above.

cormat <- rquery.cormat(Mroz1, graphType = "heatmap")