CUNY SPS R WORKSHOP

Week 2 Assignment

Tage N Singh

This assignment uses a dataset of physician visits, which is hosted on GitHub and read directly from there.
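As a minimal sketch of the loading step, the snippet below reads a CSV into a data frame with base R's `read.csv()`. The report does not give the actual GitHub URL, so the commented-out `data_url` is only a placeholder, and the inline `csv_text` (three rows and three columns copied from the output shown later) stands in for the real file so the example runs offline.

```r
# Placeholder only: the real raw GitHub URL is not stated in this report.
# data_url <- "https://raw.githubusercontent.com/<user>/<repo>/<path>.csv"
# visits <- read.csv(data_url, stringsAsFactors = FALSE)

# Offline stand-in: a few values taken from the first rows of the dataset.
csv_text <- "Physician_Visits,Age,Region
0,79,Midwest
8,77,Midwest
0,92,Midwest"

# read.csv() accepts inline text via the `text` argument.
visits <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the column types: numeric counts and a character Region column.
str(visits)
```

With `stringsAsFactors = FALSE`, text columns such as `Region` stay as character vectors, which matches the `Class :character` entries in the summary output.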

Answer to Question 1

## Summary of the dataset:
##  Physician_Visits Non_Physician_Visits Outpatient_Visits 
##  Min.   : 0.000   Min.   :  0.000      Min.   :  0.0000  
##  1st Qu.: 1.000   1st Qu.:  0.000      1st Qu.:  0.0000  
##  Median : 4.000   Median :  0.000      Median :  0.0000  
##  Mean   : 5.774   Mean   :  1.618      Mean   :  0.7508  
##  3rd Qu.: 8.000   3rd Qu.:  1.000      3rd Qu.:  0.0000  
##  Max.   :89.000   Max.   :104.000      Max.   :141.0000  
##  Non_Physician_Outpatient_Visits   ER_Visits       Hospitalizations
##  Min.   :  0.0000                Min.   : 0.0000   Min.   :0.000   
##  1st Qu.:  0.0000                1st Qu.: 0.0000   1st Qu.:0.000   
##  Median :  0.0000                Median : 0.0000   Median :0.000   
##  Mean   :  0.5361                Mean   : 0.2635   Mean   :0.296   
##  3rd Qu.:  0.0000                3rd Qu.: 0.0000   3rd Qu.:0.000   
##  Max.   :155.0000                Max.   :12.0000   Max.   :8.000   
##  Chronic_Conditions   Disability         Age            Black          
##  Min.   :0.000      Min.   :0.000   Min.   : 66.00   Length:4406       
##  1st Qu.:1.000      1st Qu.:0.000   1st Qu.: 69.00   Class :character  
##  Median :1.000      Median :0.000   Median : 73.00   Mode  :character  
##  Mean   :1.542      Mean   :0.204   Mean   : 74.02                     
##  3rd Qu.:2.000      3rd Qu.:0.000   3rd Qu.: 78.00                     
##  Max.   :8.000      Max.   :1.000   Max.   :109.00                     
##      Sex              Married          Education_Years   Fam_Income     
##  Length:4406        Length:4406        Min.   : 0.00   Min.   :-1013.0  
##  Class :character   Class :character   1st Qu.: 8.00   1st Qu.:  912.2  
##  Mode  :character   Mode  :character   Median :11.00   Median : 1698.5  
##                                        Mean   :10.29   Mean   : 2527.2  
##                                        3rd Qu.:12.00   3rd Qu.: 3172.8  
##                                        Max.   :18.00   Max.   :54835.0  
##    Employed         Private_Ins          Medicaid            Region         
##  Length:4406        Length:4406        Length:4406        Length:4406       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Health_Status     
##  Length:4406       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
## The next two summaries give the mean and median of the Age and Education_Years fields.
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   66.00   69.00   73.00   74.02   78.00  109.00
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    8.00   11.00   10.29   12.00   18.00
## Verifying the row count
## [1] 4406
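The output for Question 1 can be produced with base R's `summary()`, `mean()`, `median()`, and `nrow()`. The sketch below assumes a small stand-in data frame named `visits` (a hypothetical name) built from the first five rows shown later in this report, not the full 4406-row dataset.

```r
# Stand-in data: Age and Education_Years of the first five records.
visits <- data.frame(
  Age = c(79, 77, 92, 66, 70),
  Education_Years = c(4, 5, 8, 7, 14),
  stringsAsFactors = FALSE
)

# Six-number summary of every column.
print(summary(visits))

# Mean and median of a single column.
age_mean <- mean(visits$Age)
age_median <- median(visits$Age)
edu_mean <- mean(visits$Education_Years)
edu_median <- median(visits$Education_Years)

# Row count, used to verify that all records were read.
n_rows <- nrow(visits)
cat("Rows:", n_rows, "\n")
```

Calling `summary()` on one numeric column, for example `summary(visits$Age)`, prints the same `Min. 1st Qu. Median Mean 3rd Qu. Max.` line seen above.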

Answer to Question 2

##  Hospitalizations Chronic_Conditions   Disability          Age        
##  Min.   :0.0000   Min.   :0.000      Min.   :0.0000   Min.   : 66.00  
##  1st Qu.:0.0000   1st Qu.:0.000      1st Qu.:0.0000   1st Qu.: 69.00  
##  Median :0.0000   Median :1.000      Median :0.0000   Median : 73.00  
##  Mean   :0.2233   Mean   :1.251      Mean   :0.1433   Mean   : 73.99  
##  3rd Qu.:0.0000   3rd Qu.:2.000      3rd Qu.:0.0000   3rd Qu.: 78.00  
##  Max.   :7.0000   Max.   :8.000      Max.   :1.0000   Max.   :109.00  
##     Black               Sex              Married          Education_Years
##  Length:1500        Length:1500        Length:1500        Min.   : 0.0   
##  Class :character   Class :character   Class :character   1st Qu.: 8.0   
##  Mode  :character   Mode  :character   Mode  :character   Median :11.0   
##                                                           Mean   :10.5   
##                                                           3rd Qu.:12.0   
##                                                           Max.   :18.0   
##    Fam_Income        Employed         Private_Ins          Medicaid        
##  Min.   :    0.0   Length:1500        Length:1500        Length:1500       
##  1st Qu.:  975.8   Class :character   Class :character   Class :character  
##  Median : 1771.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 2615.9                                                           
##  3rd Qu.: 3217.2                                                           
##  Max.   :54835.0                                                           
##     Region         
##  Length:1500       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
## Verifying the row count
## [1] 1500
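A subset like the one in Question 2 (fewer rows and fewer columns) can be taken with base R bracket indexing. This is only a sketch: the stand-in data frame `visits`, the row limit of 3, and the chosen columns are illustrative assumptions, and the report does not state exactly how its 1500 rows were selected.

```r
# Stand-in data with a few of the original columns.
visits <- data.frame(
  Physician_Visits = c(0, 8, 0, 1, 1),
  Hospitalizations = c(0, 0, 0, 0, 0),
  Age = c(79, 77, 92, 66, 70),
  Region = c("Midwest", "Midwest", "Midwest", "Midwest", "Midwest"),
  stringsAsFactors = FALSE
)

# Columns to keep, selected by name.
keep_cols <- c("Hospitalizations", "Age", "Region")

# Keep the first 3 rows (assumed rule) and only the named columns.
visits_sub <- visits[seq_len(3), keep_cols]

# Verify the shape of the result.
cat("Rows:", nrow(visits_sub), "Columns:", ncol(visits_sub), "\n")
```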

Answer to Question 3

##  [1] "Hospitalizations"   "Chronic_Conditions" "Disability"        
##  [4] "Age"                "Black"              "Sex"               
##  [7] "Married"            "Education_Years"    "Fam_Income"        
## [10] "Employed"           "Private_Ins"        "Medicaid"          
## [13] "Region"
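Column names can be listed with `names()` and changed by assigning to it. The sketch below renames three of the columns to the new names that appear in the Question 4 output; the data frame `visits_sub` is a hypothetical stand-in and the approach is one of several possible.

```r
# Stand-in for the subset from Question 2.
visits_sub <- data.frame(
  Hospitalizations = c(0, 0, 0),
  Age = c(79, 77, 92),
  Region = c("Midwest", "Midwest", "Midwest"),
  stringsAsFactors = FALSE
)

# List the current column names.
print(names(visits_sub))

# Map old names to new names, then rename by matching on the old names.
new_names <- c(Hospitalizations = "hosps", Age = "time_on_earth", Region = "where")
renamed <- visits_sub
names(renamed) <- unname(new_names[names(visits_sub)])

print(names(renamed))
```

Renaming through a named lookup vector avoids relying on column positions, so the mapping still holds if the column order changes.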

Answer to Question 4

## The Summary of the Q3 dataset with new column names is
##      hosps            chcond         disable       time_on_earth   
##  Min.   :0.0000   Min.   :0.000   Min.   :0.0000   Min.   : 66.00  
##  1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.: 69.00  
##  Median :0.0000   Median :1.000   Median :0.0000   Median : 73.00  
##  Mean   :0.2233   Mean   :1.251   Mean   :0.1433   Mean   : 73.99  
##  3rd Qu.:0.0000   3rd Qu.:2.000   3rd Qu.:0.0000   3rd Qu.: 78.00  
##  Max.   :7.0000   Max.   :8.000   Max.   :1.0000   Max.   :109.00  
##      race              gender           hooked_up             eduyrs    
##  Length:1500        Length:1500        Length:1500        Min.   : 0.0  
##  Class :character   Class :character   Class :character   1st Qu.: 8.0  
##  Mode  :character   Mode  :character   Mode  :character   Median :11.0  
##                                                           Mean   :10.5  
##                                                           3rd Qu.:12.0  
##                                                           Max.   :18.0  
##      faminc          working            privins            govtins         
##  Min.   :    0.0   Length:1500        Length:1500        Length:1500       
##  1st Qu.:  975.8   Class :character   Class :character   Class :character  
##  Median : 1771.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 2615.9                                                           
##  3rd Qu.: 3217.2                                                           
##  Max.   :54835.0                                                           
##     where          
##  Length:1500       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
## Verifying the row count
## [1] 1500
## --------------------------------------------------------------------------------------------------------------
## --------------------------------------------------------------------------------------------------------------
##  The mean and median of the Age column, now named time_on_earth, are
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   66.00   69.00   73.00   73.99   78.00  109.00
## --------------------------------------------------------------------------------------------------------------
##  The mean and median of the ORIGINAL AGE column are
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   66.00   69.00   73.00   74.02   78.00  109.00
##  Note that the difference in mean and median between the full dataset and the 1500-record subset is minuscule
## --------------------------------------------------------------------------------------------------------------
## --------------------------------------------------------------------------------------------------------------
##  The mean and median of the Education column, now named eduyrs, are
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     8.0    11.0    10.5    12.0    18.0
## --------------------------------------------------------------------------------------------------------------
##  The mean and median of the ORIGINAL EDUCATION column are
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    8.00   11.00   10.29   12.00   18.00
##  Note that the difference in mean and median between the full dataset and the 1500-record subset is minuscule
## --------------------------------------------------------------------------------------------------------------
## --------------------------------------------------------------------------------------------------------------
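To put a number on how small the differences are, the sketch below takes the means reported above (rounded as printed) and computes the absolute and relative gap between the full dataset and the 1500-record subset. The variable names are illustrative.

```r
# Means as printed in this report (full dataset vs. 1500-record subset).
full_mean <- c(Age = 74.02, Education_Years = 10.29)
sub_mean  <- c(Age = 73.99, Education_Years = 10.5)

# Absolute difference between subset and full-dataset means.
abs_diff <- abs(sub_mean - full_mean)

# Relative difference as a percentage of the full-dataset mean.
rel_diff_pct <- round(100 * abs_diff / full_mean, 2)

print(round(abs_diff, 2))
print(rel_diff_pct)
```

The Age means differ by about 0.03 years (roughly 0.04%), and the Education means by about 0.21 years (roughly 2%). The medians, 73 and 11, are identical in both datasets.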

Answer to Question 5

## Replacing values in column 'where'
## Replacing values of other with smoething_else
## Replacing values of Northeast with north_east
## Replacing values of Midwest with middle_country
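Value replacement in a character column can be sketched with a named lookup vector and logical indexing. The data frame `df` and the value `"West"` below are illustrative assumptions; only the Midwest and Northeast replacements are taken from the messages above.

```r
# Stand-in for the renamed dataset; "West" is an assumed extra value.
df <- data.frame(
  where = c("Midwest", "Northeast", "Midwest", "West"),
  stringsAsFactors = FALSE
)

# Old value -> new value.
recode_map <- c(Midwest = "middle_country", Northeast = "north_east")

# Replace only the entries that have a mapping; leave the rest unchanged.
hit <- df$where %in% names(recode_map)
df$where[hit] <- unname(recode_map[df$where[hit]])

print(df$where)
```

Matching is case-sensitive, so `"midwest"` would not match a stored value of `"Midwest"`.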

Answer to Question 6

## Our Original Dataset
##    Physician_Visits Non_Physician_Visits Outpatient_Visits
## 1                 0                    0                 0
## 2                 8                    0                 0
## 3                 0                    0                 0
## 4                 1                    0                 0
## 5                 1                    0                 0
## 6                 0                    0                 0
## 7                 2                    0                 0
## 8                 1                    0                 0
## 9                 5                    0                 0
## 10                2                    5                20
##    Non_Physician_Outpatient_Visits ER_Visits Hospitalizations
## 1                                0         0                0
## 2                                0         0                0
## 3                                0         0                0
## 4                                0         0                0
## 5                                0         0                0
## 6                                0         0                0
## 7                                0         0                0
## 8                                0         0                0
## 9                                0         0                0
## 10                               0         0                0
##    Chronic_Conditions Disability Age Black    Sex Married Education_Years
## 1                   1          1  79   yes Female      No               4
## 2                   0          0  77    no Female      No               5
## 3                   2          0  92    no Female      No               8
## 4                   0          0  66    no Female      No               7
## 5                   0          0  70    no Female     Yes              14
## 6                   0          0  67    no   Male     Yes              12
## 7                   0          1  92    no   Male     Yes              13
## 8                   0          0  68    no   Male     Yes              14
## 9                   0          0  66    no   Male     Yes              16
## 10                  0          0  70   yes   Male     Yes              12
##    Fam_Income Employed Private_Ins Medicaid  Region Health_Status
## 1         414       No          No       No Midwest     Excellent
## 2         384       No          No      Yes Midwest     Excellent
## 3         480       No          No       No Midwest     Excellent
## 4         625       No          No       No Midwest     Excellent
## 5        4175       No          No       No Midwest     Excellent
## 6        1992       No          No       No Midwest     Excellent
## 7        1857       No          No       No Midwest     Excellent
## 8        4175       No          No       No Midwest     Excellent
## 9       10300      Yes          No       No Midwest     Excellent
## 10      10421      Yes          No       No Midwest     Excellent
## Resultant dataset from Question 2
##    Hospitalizations Chronic_Conditions Disability Age Black    Sex Married
## 1                 0                  1          1  79   yes Female      No
## 2                 0                  0          0  77    no Female      No
## 3                 0                  2          0  92    no Female      No
## 4                 0                  0          0  66    no Female      No
## 5                 0                  0          0  70    no Female     Yes
## 6                 0                  0          0  67    no   Male     Yes
## 7                 0                  0          1  92    no   Male     Yes
## 8                 0                  0          0  68    no   Male     Yes
## 9                 0                  0          0  66    no   Male     Yes
## 10                0                  0          0  70   yes   Male     Yes
##    Education_Years Fam_Income Employed Private_Ins Medicaid  Region
## 1                4        414       No          No       No Midwest
## 2                5        384       No          No      Yes Midwest
## 3                8        480       No          No       No Midwest
## 4                7        625       No          No       No Midwest
## 5               14       4175       No          No       No Midwest
## 6               12       1992       No          No       No Midwest
## 7               13       1857       No          No       No Midwest
## 8               14       4175       No          No       No Midwest
## 9               16      10300      Yes          No       No Midwest
## 10              12      10421      Yes          No       No Midwest
## Resultant dataset from Question 3
##    hosps chcond disable time_on_earth race gender hooked_up eduyrs faminc
## 1      0      1       1            79  yes Female        No      4    414
## 2      0      0       0            77   no Female        No      5    384
## 3      0      2       0            92   no Female        No      8    480
## 4      0      0       0            66   no Female        No      7    625
## 5      0      0       0            70   no Female       Yes     14   4175
## 6      0      0       0            67   no   Male       Yes     12   1992
## 7      0      0       1            92   no   Male       Yes     13   1857
## 8      0      0       0            68   no   Male       Yes     14   4175
## 9      0      0       0            66   no   Male       Yes     16  10300
## 10     0      0       0            70  yes   Male       Yes     12  10421
##    working privins govtins   where
## 1       No      No      No Midwest
## 2       No      No     Yes Midwest
## 3       No      No      No Midwest
## 4       No      No      No Midwest
## 5       No      No      No Midwest
## 6       No      No      No Midwest
## 7       No      No      No Midwest
## 8       No      No      No Midwest
## 9      Yes      No      No Midwest
## 10     Yes      No      No Midwest
## Resultant dataset from Question 5
##    hosps chcond disable time_on_earth race gender hooked_up eduyrs faminc
## 1      0      1       1            79  yes Female        No      4    414
## 2      0      0       0            77   no Female        No      5    384
## 3      0      2       0            92   no Female        No      8    480
## 4      0      0       0            66   no Female        No      7    625
## 5      0      0       0            70   no Female       Yes     14   4175
## 6      0      0       0            67   no   Male       Yes     12   1992
## 7      0      0       1            92   no   Male       Yes     13   1857
## 8      0      0       0            68   no   Male       Yes     14   4175
## 9      0      0       0            66   no   Male       Yes     16  10300
## 10     0      0       0            70  yes   Male       Yes     12  10421
##    working privins govtins          where
## 1       No      No      No middle_country
## 2       No      No     Yes middle_country
## 3       No      No      No middle_country
## 4       No      No      No middle_country
## 5       No      No      No middle_country
## 6       No      No      No middle_country
## 7       No      No      No middle_country
## 8       No      No      No middle_country
## 9      Yes      No      No middle_country
## 10     Yes      No      No middle_country