R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

# Loading required libraries
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

# Load the dataset
adult_income_data <- read.csv("C:/Users/RAKESH REDDY/OneDrive/Desktop/adult_income_data.csv")

Random sampling of adult_income_data dataset:

Here we create random subsamples restricted to 9 columns: age, edunum, maritalstatus, occupation, relationship, sex, capitalgain, hoursperweek, and nativecountry. The number of subsamples is chosen at random between 5 and 10, and each subsample contains 50% as many rows as the full data set, drawn with replacement.

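# Number of subsamples: one random integer between 5 and 10
samples <- sample(5:10, 1)
columns <- c("age", "edunum", "maritalstatus", "occupation", "relationship",
             "sex", "capitalgain", "hoursperweek", "nativecountry")
# Each subsample holds half as many rows as the full data set
sample_size <- round(0.5 * nrow(adult_income_data))
subsample_list <- list()
for (i in 1:samples) {
  # Draw row indices with replacement, then keep only the selected columns
  sample_index <- sample(1:nrow(adult_income_data), size = sample_size, replace = TRUE)
  subsample_list[[i]] <- adult_income_data[sample_index, columns]
}
# View() opens a data viewer, so call it only in an interactive session
if (interactive()) View(subsample_list)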
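For a repeatable variant of the loop above, a minimal sketch follows. It assumes a small synthetic data frame (`toy_data`, hypothetical) in place of `adult_income_data`, a hypothetical helper `draw_subsample()`, and an arbitrary seed; none of these come from the original analysis.

```r
# Hypothetical stand-in for adult_income_data so the sketch runs on its own.
set.seed(42)  # arbitrary seed, chosen only to make the draws repeatable
toy_data <- data.frame(
  age          = sample(17:90, 200, replace = TRUE),
  edunum       = sample(1:16, 200, replace = TRUE),
  hoursperweek = sample(1:99, 200, replace = TRUE)
)

# Draw one subsample holding `fraction` of the row count, with replacement.
draw_subsample <- function(data, columns, fraction = 0.5) {
  sample_size <- round(fraction * nrow(data))
  sample_index <- sample(seq_len(nrow(data)), size = sample_size, replace = TRUE)
  data[sample_index, columns, drop = FALSE]
}

# Number of subsamples: one random integer between 5 and 10.
n_subsamples <- sample(5:10, 1)
toy_subsamples <- lapply(
  seq_len(n_subsamples),
  function(i) draw_subsample(toy_data, c("age", "edunum"))
)

# Row count of each subsample (half of the 200 source rows).
sapply(toy_subsamples, nrow)
```

Setting the seed once before any call to `sample()` means that re-knitting the document would reproduce the same subsamples, so printed tables and the surrounding discussion stay in step.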

Summary of the data samples

# Build one kable summary table per subsample
summary_table <- lapply(subsample_list, function(subsample) {
  summary_df <- summary(subsample)
  knitr::kable(summary_df, caption = "Stats Summary of data samples")
})
for (i in 1:samples) {
  cat("### Subsample", i, "summary statistics \n")
  print(summary_table[[i]])
}
## ### Subsample 1 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.0  |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.0  |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.0  |Mode  :character |
## |   |Mean   :38.65 |Mean   :10.02 |NA               |NA               |NA               |NA               |Mean   : 1180 |Mean   :40.6  |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.0  |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.0  |NA               |
## ### Subsample 2 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age     |    edunum   |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:------------|:------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.0 |Min.   : 1.0 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.0 |1st Qu.: 9.0 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.0 |Median :10.0 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.9 |Mean   :10.1 |NA               |NA               |NA               |NA               |Mean   : 1100 |Mean   :40.42 |NA               |
## |   |3rd Qu.:48.0 |3rd Qu.:13.0 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.0 |Max.   :16.0 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 3 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.57 |Mean   :10.06 |NA               |NA               |NA               |NA               |Mean   : 1142 |Mean   :40.34 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 4 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.92 |Mean   :10.07 |NA               |NA               |NA               |NA               |Mean   : 1178 |Mean   :40.55 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:13.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 5 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:27.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.36 |Mean   :10.06 |NA               |NA               |NA               |NA               |Mean   : 1098 |Mean   :40.37 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 6 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.0  |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.0  |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.0  |Mode  :character |
## |   |Mean   :38.81 |Mean   :10.04 |NA               |NA               |NA               |NA               |Mean   : 1198 |Mean   :40.7  |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.0  |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.0  |NA               |
## ### Subsample 7 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.78 |Mean   :10.06 |NA               |NA               |NA               |NA               |Mean   : 1159 |Mean   :40.45 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 8 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age     |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.0 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.0 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.0 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.7 |Mean   :10.08 |NA               |NA               |NA               |NA               |Mean   : 1060 |Mean   :40.32 |NA               |
## |   |3rd Qu.:48.0 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.0 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 9 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :38.89 |Mean   :10.12 |NA               |NA               |NA               |NA               |Mean   : 1067 |Mean   :40.54 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:13.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
## ### Subsample 10 summary statistics 
## 
## 
## Table: Stats Summary of data samples
## 
## |   |     age      |    edunum    |maritalstatus    | occupation      |relationship     |    sex          | capitalgain  | hoursperweek |nativecountry    |
## |:--|:-------------|:-------------|:----------------|:----------------|:----------------|:----------------|:-------------|:-------------|:----------------|
## |   |Min.   :17.00 |Min.   : 1.00 |Length:8140      |Length:8140      |Length:8140      |Length:8140      |Min.   :    0 |Min.   : 1.00 |Length:8140      |
## |   |1st Qu.:28.00 |1st Qu.: 9.00 |Class :character |Class :character |Class :character |Class :character |1st Qu.:    0 |1st Qu.:40.00 |Class :character |
## |   |Median :37.00 |Median :10.00 |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Median :    0 |Median :40.00 |Mode  :character |
## |   |Mean   :39.01 |Mean   :10.08 |NA               |NA               |NA               |NA               |Mean   : 1164 |Mean   :40.57 |NA               |
## |   |3rd Qu.:48.00 |3rd Qu.:12.00 |NA               |NA               |NA               |NA               |3rd Qu.:    0 |3rd Qu.:45.00 |NA               |
## |   |Max.   :90.00 |Max.   :16.00 |NA               |NA               |NA               |NA               |Max.   :99999 |Max.   :99.00 |NA               |
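
Because the tables above differ only in a few digits, a single comparison table can be easier to read. The sketch below is one possible way to build it. It assumes a hypothetical list `demo_subsamples` of synthetic data frames standing in for `subsample_list`; the helper name `numeric_means()` is also an assumption.

```r
# Hypothetical synthetic subsamples so the sketch runs on its own.
set.seed(1)
demo_subsamples <- lapply(1:3, function(i) {
  data.frame(
    age          = sample(17:90, 50, replace = TRUE),
    hoursperweek = sample(1:99, 50, replace = TRUE),
    sex          = sample(c("Male", "Female"), 50, replace = TRUE)
  )
})

# Mean of every numeric column in one data frame; character columns are skipped.
numeric_means <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  colMeans(df[, is_num, drop = FALSE])
}

# One row per subsample, one column per numeric variable.
means_table <- do.call(rbind, lapply(demo_subsamples, numeric_means))
rownames(means_table) <- paste("Subsample", seq_along(demo_subsamples))
round(means_table, 2)
```

For the character columns, `summary()` reports only length, class, and mode, which is why those cells show `NA` in the tables; `table()` on a single column gives category counts instead.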

Scrutinizing the subsample list:

Summary statistics are computed for each subsample, and a histogram of age is plotted for each.

summary_statistics <-lapply(subsample_list, summary)

histograms <- lapply(subsample_list, function(subsample) {
  ggplot(subsample, aes(x = age)) +
    geom_histogram(binwidth = 1, fill = 'orange', color = 'black') +
    labs(title = "Histogram for age distribution", x = 'age', y = 'count')
})


for (i in 1:samples) {
  cat("Subsample", i, "summary statistics:\n")
  print(summary_statistics[[i]])
  print(histograms[[i]])
}
## Subsample 1 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.65   Mean   :10.02                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek 
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.0  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.0  
##  Mode  :character   Mode  :character   Median :    0   Median :40.0  
##                                        Mean   : 1180   Mean   :40.6  
##                                        3rd Qu.:    0   3rd Qu.:45.0  
##                                        Max.   :99999   Max.   :99.0  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 2 summary statistics:
##       age           edunum     maritalstatus       occupation       
##  Min.   :17.0   Min.   : 1.0   Length:8140        Length:8140       
##  1st Qu.:28.0   1st Qu.: 9.0   Class :character   Class :character  
##  Median :37.0   Median :10.0   Mode  :character   Mode  :character  
##  Mean   :38.9   Mean   :10.1                                        
##  3rd Qu.:48.0   3rd Qu.:13.0                                        
##  Max.   :90.0   Max.   :16.0                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1100   Mean   :40.42  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 3 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.57   Mean   :10.06                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1142   Mean   :40.34  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 4 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.92   Mean   :10.07                                        
##  3rd Qu.:48.00   3rd Qu.:13.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1178   Mean   :40.55  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 5 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:27.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.36   Mean   :10.06                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1098   Mean   :40.37  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 6 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.81   Mean   :10.04                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek 
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.0  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.0  
##  Mode  :character   Mode  :character   Median :    0   Median :40.0  
##                                        Mean   : 1198   Mean   :40.7  
##                                        3rd Qu.:    0   3rd Qu.:45.0  
##                                        Max.   :99999   Max.   :99.0  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 7 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.78   Mean   :10.06                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1159   Mean   :40.45  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 8 summary statistics:
##       age           edunum      maritalstatus       occupation       
##  Min.   :17.0   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.0   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.0   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.7   Mean   :10.08                                        
##  3rd Qu.:48.0   3rd Qu.:12.00                                        
##  Max.   :90.0   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1060   Mean   :40.32  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 9 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :38.89   Mean   :10.12                                        
##  3rd Qu.:48.00   3rd Qu.:13.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1067   Mean   :40.54  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

## Subsample 10 summary statistics:
##       age            edunum      maritalstatus       occupation       
##  Min.   :17.00   Min.   : 1.00   Length:8140        Length:8140       
##  1st Qu.:28.00   1st Qu.: 9.00   Class :character   Class :character  
##  Median :37.00   Median :10.00   Mode  :character   Mode  :character  
##  Mean   :39.01   Mean   :10.08                                        
##  3rd Qu.:48.00   3rd Qu.:12.00                                        
##  Max.   :90.00   Max.   :16.00                                        
##  relationship           sex             capitalgain     hoursperweek  
##  Length:8140        Length:8140        Min.   :    0   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:    0   1st Qu.:40.00  
##  Mode  :character   Mode  :character   Median :    0   Median :40.00  
##                                        Mean   : 1164   Mean   :40.57  
##                                        3rd Qu.:    0   3rd Qu.:45.00  
##                                        Max.   :99999   Max.   :99.00  
##  nativecountry     
##  Length:8140       
##  Class :character  
##  Mode  :character  
##                    
##                    
## 
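
Separate histograms, one per subsample, are hard to compare side by side. As one possible alternative, the subsamples can be stacked and drawn in a single faceted plot. The sketch assumes `dplyr` and `ggplot2` are installed (both are loaded earlier in this document) and uses a hypothetical synthetic list `demo_ages` in place of `subsample_list`; the wider `binwidth` is also just an illustrative choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical synthetic subsamples so the sketch runs on its own.
set.seed(7)
demo_ages <- lapply(1:4, function(i) {
  data.frame(age = sample(17:90, 100, replace = TRUE))
})

# Stack the subsamples; .id records which list element each row came from.
combined <- bind_rows(demo_ages, .id = "subsample")
# Convert the id to an ordered factor so panel 10 would not sort before panel 2.
combined$subsample <- factor(as.integer(combined$subsample))

age_facets <- ggplot(combined, aes(x = age)) +
  geom_histogram(binwidth = 5, fill = "orange", color = "black") +
  facet_wrap(~ subsample) +
  labs(title = "Age distribution by subsample", x = "age", y = "count")

print(age_facets)
```

All panels share the same axes by default, so differences in shape between subsamples are visible at a glance.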

Anomalies and consistency

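# Mean age of each subsample
mean_age <- lapply(subsample_list, function(subsample) {
  mean(subsample$age)
})

# Standard deviation of age in each subsample
std_dev_age <- lapply(subsample_list, function(subsample) {
  sd(subsample$age)
})

# View() opens a data viewer, so call it only in an interactive session
if (interactive()) {
  View(mean_age)
  View(std_dev_age)
}

for (i in 1:samples) {
  cat("Subsample", i, "Mean of age:", mean_age[[i]], "\n")
  cat("Subsample", i, "SD of age:", std_dev_age[[i]], "\n")
}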
## Subsample 1 Mean of age: 38.65381 
## Subsample 1 SD of age: 13.68828 
## Subsample 2 Mean of age: 38.90455 
## Subsample 2 SD of age: 13.81402 
## Subsample 3 Mean of age: 38.57285 
## Subsample 3 SD of age: 13.86388 
## Subsample 4 Mean of age: 38.92359 
## Subsample 4 SD of age: 13.8591 
## Subsample 5 Mean of age: 38.35762 
## Subsample 5 SD of age: 13.61142 
## Subsample 6 Mean of age: 38.81167 
## Subsample 6 SD of age: 13.74207 
## Subsample 7 Mean of age: 38.78157 
## Subsample 7 SD of age: 13.69245 
## Subsample 8 Mean of age: 38.69926 
## Subsample 8 SD of age: 13.86312 
## Subsample 9 Mean of age: 38.89275 
## Subsample 9 SD of age: 13.82123 
## Subsample 10 Mean of age: 39.00971 
## Subsample 10 SD of age: 13.83545
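
One way to judge whether these differences are more than sampling noise is to compare the spread of the subsample means with the standard error expected for a mean of 8140 draws. The sketch below copies the means and standard deviations printed above; the variable names and the choice of the average of the ten means as a reference point are assumptions, not part of the original analysis.

```r
# Values copied from the printed output above.
mean_age_reported <- c(38.65381, 38.90455, 38.57285, 38.92359, 38.35762,
                       38.81167, 38.78157, 38.69926, 38.89275, 39.00971)
sd_age_reported <- c(13.68828, 13.81402, 13.86388, 13.8591, 13.61142,
                     13.74207, 13.69245, 13.86312, 13.82123, 13.83545)
n <- 8140  # rows per subsample, from the summary tables

# Standard error of a mean based on n draws, using each subsample's SD.
standard_error <- sd_age_reported / sqrt(n)

# Distance of each subsample mean from the average of all ten means,
# expressed in standard errors (assumed reference point, see lead-in).
z <- (mean_age_reported - mean(mean_age_reported)) / standard_error

range(mean_age_reported)          # smallest and largest subsample mean
round(range(standard_error), 3)   # expected noise in a single mean
round(z, 2)                       # standardized deviations
```

Comparing against the mean of the full data set, where it is available, would be a more direct reference than the average of the subsample means used here.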

Analysis and Conclusion

Across the ten subsamples, there are no substantial differences in the characteristics we examined. The summary statistics, distributions, and overall patterns exhibit a high degree of similarity: the mean age ranges from 38.36 to 39.01 and its standard deviation from 13.61 to 13.86, while the minimum, median, and maximum of age are identical in every subsample. No clear anomalies appeared in any of the subsamples. This suggests that the data set is relatively homogeneous with respect to these variables.
