This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown, see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(magrittr)
library(ggplot2)
imdb_movies <- read.csv('C:/Users/DELL/Downloads/imdb.csv')
# Set the seed for reproducibility
set.seed(123)
# Number of rows in the dataset
total_rows <- nrow(imdb_movies)
# Creating 5 random subsamples
num_subsamples <- 5
sample_size <- round(0.5 * total_rows) # Approximately 50% of the data
subsamples <- list() # Creating a list to store the subsamples
for (i in 1:num_subsamples) {
# Randomly sampling row indices from 1:total_rows (the row count of imdb_movies), with replacement
sample_rows <- sample(1:total_rows, size = sample_size, replace = TRUE)
# Sampling sample_size column indices from msleep with replacement (columns can repeat)
columns_to_select <- sample(1:ncol(msleep), size = sample_size, replace = TRUE)
# Creating a new data frame with the selected rows and columns
subsamples[[i]] <- msleep[sample_rows, columns_to_select]
# Renaming the columns to "col 1", "col 2", ... (optional)
colnames(subsamples[[i]]) <- paste("col", 1:ncol(subsamples[[i]]))
# Printing the first few rows of each subsample
cat("Subsample", i, ":\n")
print(head(subsamples[[i]]))
cat("\n")
}
## Subsample 1 :
## # A tibble: 6 × 188
## `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <chr>
## 1 NA NA NA NA NA NA <NA> NA <NA>
## 2 0.0064 0.117 12.5 1.5 0.0064 0.42 Chinchilla 11.5 herbi
## 3 NA NA NA NA NA NA <NA> NA <NA>
## 4 NA NA NA NA NA NA <NA> NA <NA>
## 5 NA NA NA NA NA NA <NA> NA <NA>
## 6 NA NA NA NA NA NA <NA> NA <NA>
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <chr>, `col 12` <dbl>,
## # `col 13` <dbl>, `col 14` <dbl>, `col 15` <dbl>, `col 16` <chr>,
## # `col 17` <dbl>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <dbl>,
## # `col 21` <dbl>, `col 22` <chr>, `col 23` <dbl>, `col 24` <dbl>,
## # `col 25` <dbl>, `col 26` <dbl>, `col 27` <chr>, `col 28` <dbl>,
## # `col 29` <dbl>, `col 30` <dbl>, `col 31` <chr>, `col 32` <dbl>,
## # `col 33` <dbl>, `col 34` <chr>, `col 35` <dbl>, `col 36` <chr>, …
##
## Subsample 2 :
## # A tibble: 6 × 188
## `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
## <chr> <dbl> <chr> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
## 1 <NA> NA <NA> NA NA <NA> NA NA NA
## 2 Rodentia NA Rodentia 17 0.045 Rodentia NA 17 NA
## 3 <NA> NA <NA> NA NA <NA> NA NA NA
## 4 <NA> NA <NA> NA NA <NA> NA NA NA
## 5 Primates 1.1 Primates 14.2 0.2 Primates 0.005 14.2 0.55
## 6 <NA> NA <NA> NA NA <NA> NA NA NA
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <dbl>, `col 12` <chr>,
## # `col 13` <chr>, `col 14` <dbl>, `col 15` <dbl>, `col 16` <chr>,
## # `col 17` <dbl>, `col 18` <dbl>, `col 19` <chr>, `col 20` <dbl>,
## # `col 21` <dbl>, `col 22` <chr>, `col 23` <dbl>, `col 24` <chr>,
## # `col 25` <dbl>, `col 26` <chr>, `col 27` <chr>, `col 28` <chr>,
## # `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <chr>,
## # `col 33` <chr>, `col 34` <chr>, `col 35` <chr>, `col 36` <chr>, …
##
## Subsample 3 :
## # A tibble: 6 × 188
## `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 <NA> Desert hedgeh… Desert… Desert… Erinac… <NA> Paraec… 2.7 Paraec…
## 2 <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA <NA>
## 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA <NA>
## 4 carni Long-nosed ar… Long-n… Long-n… Cingul… carni Dasypus 3.1 Dasypus
## 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA <NA>
## 6 herbi Round-tailed … Round-… Round-… Rodent… herbi Neofib… NA Neofib…
## # ℹ 179 more variables: `col 10` <chr>, `col 11` <dbl>, `col 12` <dbl>,
## # `col 13` <chr>, `col 14` <chr>, `col 15` <dbl>, `col 16` <dbl>,
## # `col 17` <chr>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <chr>,
## # `col 21` <dbl>, `col 22` <dbl>, `col 23` <chr>, `col 24` <chr>,
## # `col 25` <dbl>, `col 26` <chr>, `col 27` <chr>, `col 28` <chr>,
## # `col 29` <chr>, `col 30` <chr>, `col 31` <dbl>, `col 32` <dbl>,
## # `col 33` <dbl>, `col 34` <dbl>, `col 35` <chr>, `col 36` <chr>, …
##
## Subsample 4 :
## # A tibble: 6 × 188
## `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
## <dbl> <chr> <dbl> <chr> <dbl> <dbl> <chr> <chr> <dbl>
## 1 NA <NA> NA <NA> NA NA <NA> <NA> NA
## 2 NA <NA> NA <NA> NA NA <NA> <NA> NA
## 3 0.00033 Soricomorpha 0.048 Musk s… 0.048 0.183 <NA> <NA> 3.3e-4
## 4 NA <NA> NA <NA> NA NA <NA> <NA> NA
## 5 0.0063 Didelphimorp… 1.7 North … 1.7 0.333 lc lc 6.3e-3
## 6 NA <NA> NA <NA> NA NA <NA> <NA> NA
## # ℹ 179 more variables: `col 10` <chr>, `col 11` <dbl>, `col 12` <chr>,
## # `col 13` <chr>, `col 14` <chr>, `col 15` <chr>, `col 16` <dbl>,
## # `col 17` <chr>, `col 18` <chr>, `col 19` <dbl>, `col 20` <chr>,
## # `col 21` <dbl>, `col 22` <chr>, `col 23` <chr>, `col 24` <chr>,
## # `col 25` <chr>, `col 26` <dbl>, `col 27` <dbl>, `col 28` <dbl>,
## # `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <dbl>,
## # `col 33` <chr>, `col 34` <dbl>, `col 35` <chr>, `col 36` <chr>, …
##
## Subsample 5 :
## # A tibble: 6 × 188
## `col 1` `col 2` `col 3` `col 4` `col 5` `col 6` `col 7` `col 8` `col 9`
## <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <dbl> <chr>
## 1 NA <NA> <NA> <NA> NA <NA> <NA> NA <NA>
## 2 0.025 insecti Tachyglossus <NA> NA Short-no… Short-… NA Tachyg…
## 3 NA <NA> <NA> <NA> NA <NA> <NA> NA <NA>
## 4 0.0004 herbi Mus nt 0.183 House mo… House … 0.183 Mus
## 5 0.025 insecti Tachyglossus <NA> NA Short-no… Short-… NA Tachyg…
## 6 0.0123 herbi Heterohyrax lc NA Gray hyr… Gray h… NA Hetero…
## # ℹ 179 more variables: `col 10` <dbl>, `col 11` <chr>, `col 12` <chr>,
## # `col 13` <dbl>, `col 14` <chr>, `col 15` <chr>, `col 16` <chr>,
## # `col 17` <chr>, `col 18` <dbl>, `col 19` <dbl>, `col 20` <chr>,
## # `col 21` <dbl>, `col 22` <chr>, `col 23` <chr>, `col 24` <dbl>,
## # `col 25` <dbl>, `col 26` <dbl>, `col 27` <chr>, `col 28` <dbl>,
## # `col 29` <chr>, `col 30` <dbl>, `col 31` <dbl>, `col 32` <dbl>,
## # `col 33` <chr>, `col 34` <dbl>, `col 35` <dbl>, `col 36` <dbl>, …
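The output above has mostly NA rows and 188 columns per subsample. A likely reason is that the loop draws row indices from `1:total_rows` (the row count of `imdb_movies`) but indexes `msleep`, which has only 83 rows, and that it reuses `sample_size` as the number of columns to draw. The following is a minimal sketch of what the sampling step may have been meant to do, not a rerun of the analysis. The helper `draw_subsample()`, its arguments, and the `toy` data frame are hypothetical stand-ins introduced here so the block runs without `imdb.csv`; with the real data, `toy` would be replaced by `imdb_movies`.

```r
# Hypothetical helper (not part of the original report): draws one subsample.
# Rows are sampled with replacement; columns are sampled without replacement.
draw_subsample <- function(data, frac = 0.5, min_cols = 6) {
  n_rows <- nrow(data)
  n_cols <- ncol(data)

  # Row indices are drawn from the same data frame that is indexed below,
  # so no index can fall outside the table.
  rows <- sample.int(n_rows, size = round(frac * n_rows), replace = TRUE)

  # Pick how many columns to keep: between min_cols and n_cols,
  # capped at n_cols for narrow tables.
  k_min <- min(min_cols, n_cols)
  k <- k_min - 1 + sample.int(n_cols - k_min + 1, size = 1)

  # Distinct column positions, kept in their original order.
  cols <- sort(sample.int(n_cols, size = k))

  data[rows, cols, drop = FALSE]
}

set.seed(123)

# Stand-in data (assumption): 20 rows, 8 columns, no missing values.
toy <- data.frame(
  id          = 1:20,
  title       = paste("Movie", 1:20),
  year        = 2000 + 1:20,
  rating      = round(runif(20, min = 1, max = 10), 1),
  votes       = sample.int(1000, size = 20),
  runtime     = sample(80:180, size = 20),
  genre       = rep(c("Drama", "Comedy"), times = 10),
  certificate = rep(c("PG", "R", "U", "A"), times = 5)
)

# Five subsamples, each with about half of the rows.
subsamples <- lapply(1:5, function(i) draw_subsample(toy))

for (i in seq_along(subsamples)) {
  cat("Subsample", i, ":", nrow(subsamples[[i]]), "rows,",
      ncol(subsamples[[i]]), "columns\n")
}
```

Keeping the original column names, rather than renaming them to `col 1`, `col 2`, and so on, makes it easier to tell which variables each subsample contains when the subsamples are compared later.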
Subsample 1: