Q1. What is a major difference between matrices and data frames?

Every element of a matrix must have the same data type. A data frame is more general: it works like a table in which each column can have its own data type, although all columns must still have the same length.
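
As a minimal sketch of this difference, the example below builds a matrix and a data frame from the same two columns. The names m and df and the values are made up for illustration; they are not taken from ncbirths.

```r
# Hypothetical values, for illustration only.
age   <- c(25, 31)
habit <- c("smoker", "nonsmoker")

# A matrix holds a single type, so the numbers are coerced to character.
m <- cbind(age, habit)
typeof(m)

# A data frame keeps one type per column.
# stringsAsFactors = TRUE mirrors the Factor columns shown by str() in this lab.
df <- data.frame(age = age, habit = habit, stringsAsFactors = TRUE)
sapply(df, class)
```

In the matrix, age is no longer numeric, so arithmetic on it would fail; in the data frame, age stays numeric and habit becomes a factor.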

ncbirths <- read.csv("/resources/rstudio/BusinessStatistics/Data/ncbirths.csv")
str(ncbirths)
## 'data.frame':    1000 obs. of  13 variables:
##  $ fage          : int  NA NA 19 21 NA NA 18 17 NA 20 ...
##  $ mage          : int  13 14 15 15 15 15 15 15 16 16 ...
##  $ mature        : Factor w/ 2 levels "mature mom","younger mom": 2 2 2 2 2 2 2 2 2 2 ...
##  $ weeks         : int  39 42 37 41 39 38 37 35 38 37 ...
##  $ premie        : Factor w/ 3 levels "<NA>","full term",..: 2 2 2 2 2 2 2 3 2 2 ...
##  $ visits        : int  10 15 11 6 9 19 12 5 9 13 ...
##  $ marital       : Factor w/ 3 levels "<NA>","married",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ gained        : int  38 20 38 34 27 22 76 15 NA 52 ...
##  $ weight        : num  7.63 7.88 6.63 8 6.38 5.38 8.44 4.69 8.81 6.94 ...
##  $ lowbirthweight: Factor w/ 2 levels "low","not low": 2 2 2 2 2 1 2 1 2 2 ...
##  $ gender        : Factor w/ 2 levels "female","male": 2 2 1 2 1 2 2 2 2 1 ...
##  $ habit         : Factor w/ 3 levels "<NA>","nonsmoker",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ whitemom      : Factor w/ 3 levels "<NA>","not white",..: 2 2 3 3 2 2 2 2 3 3 ...
summary(ncbirths)
##       fage            mage            mature        weeks      
##  Min.   :14.00   Min.   :13   mature mom :133   Min.   :20.00  
##  1st Qu.:25.00   1st Qu.:22   younger mom:867   1st Qu.:37.00  
##  Median :30.00   Median :27                     Median :39.00  
##  Mean   :30.26   Mean   :27                     Mean   :38.33  
##  3rd Qu.:35.00   3rd Qu.:32                     3rd Qu.:40.00  
##  Max.   :55.00   Max.   :50                     Max.   :45.00  
##  NA's   :171                                    NA's   :2      
##        premie        visits            marital        gained     
##  <NA>     :  2   Min.   : 0.0   <NA>       :  1   Min.   : 0.00  
##  full term:846   1st Qu.:10.0   married    :386   1st Qu.:20.00  
##  premie   :152   Median :12.0   not married:613   Median :30.00  
##                  Mean   :12.1                     Mean   :30.33  
##                  3rd Qu.:15.0                     3rd Qu.:38.00  
##                  Max.   :30.0                     Max.   :85.00  
##                  NA's   :9                        NA's   :27     
##      weight       lowbirthweight    gender          habit    
##  Min.   : 1.000   low    :111    female:503   <NA>     :  1  
##  1st Qu.: 6.380   not low:889    male  :497   nonsmoker:873  
##  Median : 7.310                               smoker   :126  
##  Mean   : 7.101                                              
##  3rd Qu.: 8.060                                              
##  Max.   :11.750                                              
##                                                              
##       whitemom  
##  <NA>     :  2  
##  not white:284  
##  white    :714  
##                 
##                 
##                 
## 

Q2. How many newborns are there in the dataset?

There are 1000 newborns in this dataset.

Hint: See the number of observations under the structure of the data set.
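
The row count can also be read directly with nrow(). The sketch below uses a small hypothetical stand-in called births, since the CSV path above may not exist on every machine; on the real data, nrow(ncbirths) should agree with the 1000 obs. reported by str().

```r
# Hypothetical three-row stand-in for ncbirths.
births <- data.frame(
  weeks  = c(39L, 42L, 37L),
  weight = c(7.63, 7.88, 6.63)
)

# Each row is one observation (one newborn).
n_newborns <- nrow(births)
n_newborns
```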

Q3. How many columns are there?

There are 13 columns, one for each of the 13 variables.

Hint: See the number of variables under the structure of the data set.
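
The column count can likewise be checked with ncol(), and names() lists the variables. This is a sketch on a hypothetical stand-in called births; on the real data, ncol(ncbirths) should agree with the 13 variables reported by str().

```r
# Hypothetical stand-in with three variables.
births <- data.frame(
  weeks  = c(39L, 42L, 37L),
  weight = c(7.63, 7.88, 6.63),
  gender = c("male", "male", "female")
)

# Each column is one variable.
n_columns <- ncol(births)
n_columns

# The variable names, in column order.
names(births)
```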

Q4. How many factor variables are there?

There are a total of 7 factor variables in the dataset.

Hint: See the description of variables under the structure of the data set.
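
Instead of counting the Factor lines by eye, the count can be computed by testing each column with is.factor(). The sketch below assumes a hypothetical stand-in called births; applied to the real data, sum(sapply(ncbirths, is.factor)) should match the number of Factor entries in the str() output.

```r
# Hypothetical stand-in with one integer column and two factor columns.
births <- data.frame(
  weeks  = c(39L, 42L, 37L),
  gender = factor(c("male", "male", "female")),
  habit  = factor(c("nonsmoker", "smoker", "nonsmoker"))
)

# TRUE for every column that is a factor.
is_fac <- sapply(births, is.factor)

# TRUE counts as 1, so the sum is the number of factor variables.
n_factors <- sum(is_fac)
n_factors

# Names of the factor variables.
names(births)[is_fac]
```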

Q5. How much does the heaviest newborn weigh?

The heaviest newborn weighs 11.75 lbs.

Hint: See the summary statistics of the data set.
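
The maximum can also be extracted directly rather than read off the summary table. The sketch below uses hypothetical weights in a stand-in called births; na.rm = TRUE is a defensive assumption in case a column contains missing values. On the real data, max(ncbirths$weight) should agree with the Max. row of summary().

```r
# Hypothetical weights, including one missing value.
births <- data.frame(weight = c(7.63, NA, 11.75, 6.63))

# Largest weight, ignoring missing values.
heaviest <- max(births$weight, na.rm = TRUE)
heaviest

# Row index of the heaviest newborn (which.max skips NA).
which.max(births$weight)
```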