R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(magrittr)
library(ggplot2)
imdb_movies <- read.csv('C:/Users/DELL/Downloads/imdb.csv')


# Set the seed for reproducibility
set.seed(123)

# Number of rows in the dataset
total_rows <- nrow(imdb_movies)

# Creating 5 random subsamples
num_subsamples <- 5
sample_size <- round(0.5 * total_rows)  # Approximately 50% of the data

subsamples <- list()  # Creating a list to store the subsamples

for (i in 1:num_subsamples) {
  # Randomly sampling row indices (with replacement) from 1:total_rows
  sample_rows <- sample(1:total_rows, size = sample_size, replace = TRUE)

  # Sampling column indices of msleep with replacement; this draws
  # sample_size indices, so each column is repeated many times
  columns_to_select <- sample(1:ncol(msleep), size = sample_size, replace = TRUE)

  # Creating a new data frame from msleep (not imdb_movies) with the selected
  # rows and columns; row indices beyond nrow(msleep) produce all-NA rows
  subsamples[[i]] <- msleep[sample_rows, columns_to_select]

  # Assigning generic column names ("col 1", "col 2", ...) to the subsample
  colnames(subsamples[[i]]) <- paste("col", 1:ncol(subsamples[[i]]))

  # Printing the first few rows of each subsample
  cat("Subsample", i, ":\n")
  print(head(subsamples[[i]]))
  cat("\n")
}
## Subsample 1 :
## # A tibble: 6 × 188
##   `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7`    `col 8` `col 9`
##     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>        <dbl> <chr>  
## 1 NA       NA        NA      NA   NA        NA    <NA>          NA   <NA>   
## 2  0.0064   0.117    12.5     1.5  0.0064    0.42 Chinchilla    11.5 herbi  
## 3 NA       NA        NA      NA   NA        NA    <NA>          NA   <NA>   
## 4 NA       NA        NA      NA   NA        NA    <NA>          NA   <NA>   
## 5 NA       NA        NA      NA   NA        NA    <NA>          NA   <NA>   
## 6 NA       NA        NA      NA   NA        NA    <NA>          NA   <NA>   
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <chr>, `col 12` <dbl>,
## #   `col 13` <dbl>, `col 14` <dbl>, `col 15` <dbl>, `col 16` <chr>,
## #   `col 17` <dbl>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <dbl>,
## #   `col 21` <dbl>, `col 22` <chr>, `col 23` <dbl>, `col 24` <dbl>,
## #   `col 25` <dbl>, `col 26` <dbl>, `col 27` <chr>, `col 28` <dbl>,
## #   `col 29` <dbl>, `col 30` <dbl>, `col 31` <chr>, `col 32` <dbl>,
## #   `col 33` <dbl>, `col 34` <chr>, `col 35` <dbl>, `col 36` <chr>, …
## 
## Subsample 2 :
## # A tibble: 6 × 188
##   `col 1`  `col 2` `col 3`  `col 4` `col 5` `col 6`  `col 7` `col 8` `col 9`
##   <chr>      <dbl> <chr>      <dbl>   <dbl> <chr>      <dbl>   <dbl>   <dbl>
## 1 <NA>        NA   <NA>        NA    NA     <NA>      NA        NA     NA   
## 2 Rodentia    NA   Rodentia    17     0.045 Rodentia  NA        17     NA   
## 3 <NA>        NA   <NA>        NA    NA     <NA>      NA        NA     NA   
## 4 <NA>        NA   <NA>        NA    NA     <NA>      NA        NA     NA   
## 5 Primates     1.1 Primates    14.2   0.2   Primates   0.005    14.2    0.55
## 6 <NA>        NA   <NA>        NA    NA     <NA>      NA        NA     NA   
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <dbl>, `col 12` <chr>,
## #   `col 13` <chr>, `col 14` <dbl>, `col 15` <dbl>, `col 16` <chr>,
## #   `col 17` <dbl>, `col 18` <dbl>, `col 19` <chr>, `col 20` <dbl>,
## #   `col 21` <dbl>, `col 22` <chr>, `col 23` <dbl>, `col 24` <chr>,
## #   `col 25` <dbl>, `col 26` <chr>, `col 27` <chr>, `col 28` <chr>,
## #   `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <chr>,
## #   `col 33` <chr>, `col 34` <chr>, `col 35` <chr>, `col 36` <chr>, …
## 
## Subsample 3 :
## # A tibble: 6 × 188
##   `col 1` `col 2`        `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
##   <chr>   <chr>          <chr>   <chr>   <chr>   <chr>   <chr>     <dbl> <chr>  
## 1 <NA>    Desert hedgeh… Desert… Desert… Erinac… <NA>    Paraec…     2.7 Paraec…
## 2 <NA>    <NA>           <NA>    <NA>    <NA>    <NA>    <NA>       NA   <NA>   
## 3 <NA>    <NA>           <NA>    <NA>    <NA>    <NA>    <NA>       NA   <NA>   
## 4 carni   Long-nosed ar… Long-n… Long-n… Cingul… carni   Dasypus     3.1 Dasypus
## 5 <NA>    <NA>           <NA>    <NA>    <NA>    <NA>    <NA>       NA   <NA>   
## 6 herbi   Round-tailed … Round-… Round-… Rodent… herbi   Neofib…    NA   Neofib…
## # ℹ 179 more variables: `col 10` <chr>, `col 11` <dbl>, `col 12` <dbl>,
## #   `col 13` <chr>, `col 14` <chr>, `col 15` <dbl>, `col 16` <dbl>,
## #   `col 17` <chr>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <chr>,
## #   `col 21` <dbl>, `col 22` <dbl>, `col 23` <chr>, `col 24` <chr>,
## #   `col 25` <dbl>, `col 26` <chr>, `col 27` <chr>, `col 28` <chr>,
## #   `col 29` <chr>, `col 30` <chr>, `col 31` <dbl>, `col 32` <dbl>,
## #   `col 33` <dbl>, `col 34` <dbl>, `col 35` <chr>, `col 36` <chr>, …
## 
## Subsample 4 :
## # A tibble: 6 × 188
##    `col 1` `col 2`       `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
##      <dbl> <chr>           <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>     <dbl>
## 1 NA       <NA>           NA     <NA>     NA      NA     <NA>    <NA>    NA     
## 2 NA       <NA>           NA     <NA>     NA      NA     <NA>    <NA>    NA     
## 3  0.00033 Soricomorpha    0.048 Musk s…   0.048   0.183 <NA>    <NA>     3.3e-4
## 4 NA       <NA>           NA     <NA>     NA      NA     <NA>    <NA>    NA     
## 5  0.0063  Didelphimorp…   1.7   North …   1.7     0.333 lc      lc       6.3e-3
## 6 NA       <NA>           NA     <NA>     NA      NA     <NA>    <NA>    NA     
## # ℹ 179 more variables: `col 10` <chr>, `col 11` <dbl>, `col 12` <chr>,
## #   `col 13` <chr>, `col 14` <chr>, `col 15` <chr>, `col 16` <dbl>,
## #   `col 17` <chr>, `col 18` <chr>, `col 19` <dbl>, `col 20` <chr>,
## #   `col 21` <dbl>, `col 22` <chr>, `col 23` <chr>, `col 24` <chr>,
## #   `col 25` <chr>, `col 26` <dbl>, `col 27` <dbl>, `col 28` <dbl>,
## #   `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <dbl>,
## #   `col 33` <chr>, `col 34` <dbl>, `col 35` <chr>, `col 36` <chr>, …
## 
## Subsample 5 :
## # A tibble: 6 × 188
##   `col 1` `col 2` `col 3`      `col 4` `col 5` `col 6`   `col 7` `col 8` `col 9`
##     <dbl> <chr>   <chr>        <chr>     <dbl> <chr>     <chr>     <dbl> <chr>  
## 1 NA      <NA>    <NA>         <NA>     NA     <NA>      <NA>     NA     <NA>   
## 2  0.025  insecti Tachyglossus <NA>     NA     Short-no… Short-…  NA     Tachyg…
## 3 NA      <NA>    <NA>         <NA>     NA     <NA>      <NA>     NA     <NA>   
## 4  0.0004 herbi   Mus          nt        0.183 House mo… House …   0.183 Mus    
## 5  0.025  insecti Tachyglossus <NA>     NA     Short-no… Short-…  NA     Tachyg…
## 6  0.0123 herbi   Heterohyrax  lc       NA     Gray hyr… Gray h…  NA     Hetero…
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <chr>, `col 12` <chr>,
## #   `col 13` <dbl>, `col 14` <chr>, `col 15` <chr>, `col 16` <chr>,
## #   `col 17` <chr>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <chr>,
## #   `col 21` <dbl>, `col 22` <chr>, `col 23` <chr>, `col 24` <dbl>,
## #   `col 25` <dbl>, `col 26` <dbl>, `col 27` <chr>, `col 28` <dbl>,
## #   `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <dbl>,
## #   `col 33` <chr>, `col 34` <dbl>, `col 35` <dbl>, `col 36` <dbl>, …
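
The printed subsamples are 6 × 188 tibbles in which many rows are entirely NA. This is consistent with the loop indexing `msleep` (the 83-row example dataset from `ggplot2`) with row numbers drawn from `1:total_rows` of `imdb_movies`, and with `sample_size` being reused as the number of columns to draw. Below is a minimal sketch of one possible correction, assuming the intent was to subsample a single data frame and keep at least 6 distinct columns. The helper `make_subsamples()` and its parameters `frac` and `min_cols` are hypothetical names introduced here, and the built-in `mtcars` data frame stands in for `imdb_movies` so the example runs without the CSV file.

```r
# Hypothetical helper (not part of the original analysis): draw `n` random
# subsamples from a data frame `df`. Each subsample keeps
# round(frac * nrow(df)) rows, sampled with replacement, and a random set of
# at least `min_cols` distinct columns.
make_subsamples <- function(df, n = 5, frac = 0.5, min_cols = 6) {
  stopifnot(ncol(df) >= min_cols)
  n_rows <- round(frac * nrow(df))

  lapply(seq_len(n), function(i) {
    # Row indices come from the same data frame that is being subset,
    # so no index can fall outside its range
    rows <- sample.int(nrow(df), size = n_rows, replace = TRUE)

    # Number of columns to keep: an integer between min_cols and ncol(df)
    n_cols <- min_cols + sample.int(ncol(df) - min_cols + 1, size = 1) - 1

    # Distinct column indices (no replacement), kept in their original order
    cols <- sort(sample.int(ncol(df), size = n_cols))

    df[rows, cols, drop = FALSE]
  })
}

set.seed(123)

# mtcars (built into R) is an assumed stand-in for imdb_movies
subsamples <- make_subsamples(mtcars, n = 5, frac = 0.5, min_cols = 6)

# Reporting the dimensions of each subsample
for (i in seq_along(subsamples)) {
  cat("Subsample", i, ":", nrow(subsamples[[i]]), "rows x",
      ncol(subsamples[[i]]), "columns\n")
}
```

With the real data, the call would be `make_subsamples(imdb_movies)`. Rows are still sampled with replacement, as in the original loop, while columns are drawn without replacement so that each subsample keeps its original column names and no column appears twice.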

Subsample 1: