#Number 6

data1 <- data.frame(Titanic)
any(is.na(data1)) # Check whether the data set contains any NA at all
## [1] FALSE
colSums(is.na(data1)) # Count the NAs in each column
##    Class      Sex      Age Survived     Freq 
##        0        0        0        0        0

#Number 7

data1 <- data.frame(Titanic)
# Compute the boxplot statistics without drawing the plot
box <- boxplot(data1$Freq, plot = FALSE)

# Extract the outlier values
outliers <- box$out

# Count the outliers
length(outliers)
## [1] 2
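
The `out` component used above follows the usual 1.5 × IQR rule, with the quartiles taken as Tukey's hinges from `fivenum()`. The following is a minimal sketch that recomputes the outliers by hand. The names `hinges`, `spread`, `is_outlier`, and `manual_outliers` are illustrative and not part of the original answer.

```r
# Illustrative re-computation of the boxplot outliers (variable names are assumptions)
data1 <- data.frame(Titanic)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
hinges <- fivenum(data1$Freq)

# Distance between the two hinges, which boxplot.stats() uses as the IQR
spread <- hinges[4] - hinges[2]

# Values more than 1.5 * spread beyond either hinge are flagged as outliers
is_outlier <- data1$Freq < hinges[2] - 1.5 * spread |
  data1$Freq > hinges[4] + 1.5 * spread
manual_outliers <- data1$Freq[is_outlier]

# Should agree with length(boxplot(data1$Freq, plot = FALSE)$out)
length(manual_outliers)
```

Because the rule uses hinges rather than `quantile()` quartiles, a version built on `IQR()` can give slightly different fences on small samples.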


#Number 8

data1 <- data.frame(Titanic)
# Count the rows that are exact duplicates of an earlier row
sum(duplicated(data1))
## [1] 0


#Number 9

nilai <- c(70, 75, 80, 85, 85, 90, 95, 100, 60, 75, 77, 85, 90, 98, 68, 92, 85, 66, 75, 80, 72, 84, 50, 69, 76, 80, 90, 95, 88, 77)
mean(nilai)     # mean
## [1] 80.4
median(nilai)   # median
## [1] 80
sd(nilai)       # sample standard deviation
## [1] 11.48792
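
`sd()` reports the sample standard deviation, which divides by n - 1 rather than n. The following is a small sketch that checks this by computing the value by hand. The names `n_obs`, `deviations`, and `manual_sd` are illustrative only.

```r
# Illustrative check of sd() against the sample formula (names are assumptions)
nilai <- c(70, 75, 80, 85, 85, 90, 95, 100, 60, 75, 77, 85, 90, 98, 68,
           92, 85, 66, 75, 80, 72, 84, 50, 69, 76, 80, 90, 95, 88, 77)

n_obs <- length(nilai)

# Deviation of each score from the mean
deviations <- nilai - mean(nilai)

# Sample standard deviation: sum of squared deviations over (n - 1), then square root
manual_sd <- sqrt(sum(deviations^2) / (n_obs - 1))

# Compare the hand-computed value with the built-in result
all.equal(manual_sd, sd(nilai))
```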

#Number 10

# Load the package and the data
library(mlbench)
## Warning: package 'mlbench' was built under R version 4.4.3
data("BreastCancer")

# Drop the rows that contain NA
bc <- na.omit(BreastCancer)

# Total number of observations
n <- nrow(bc)

# Set the seed for reproducibility
set.seed(110)

# Sample 80% of the row indices for training
train_index <- sample(1:n, size = 0.8 * n)

# Split the data
train_data <- bc[train_index, ]
test_data <- bc[-train_index, ]

# Show the number of rows in the test set
nrow(test_data)
## [1] 137
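
The test-set size follows from how `sample()` handles a non-integer `size`: the fractional part is dropped. The sketch below works through that arithmetic. It assumes 683 complete rows remain after `na.omit()` (699 rows in `BreastCancer`, 16 of them with a missing value). The document never prints `n`, so treat that count, and the names `n_assumed`, `n_train`, and `n_test`, as assumptions.

```r
# Worked arithmetic for the 80/20 split (n_assumed is an assumption, see above)
n_assumed <- 683

# 0.8 * 683 = 546.4; dropping the fraction leaves 546 training rows
n_train <- floor(0.8 * n_assumed)

# Every remaining row goes to the test set
n_test <- n_assumed - n_train

c(train = n_train, test = n_test)
```

Writing `size = floor(0.8 * n)` in the call itself would make the truncation explicit without changing the result.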

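# Training data
# NOTE: `split` is not defined anywhere above; it is assumed here to be a
# logical flag marking the rows whose index was drawn into train_index.
split <- seq_len(n) %in% train_index
train <- subset(bc, split == TRUE)

# Testing data
test <- subset(bc, split == FALSE)

# Check the number of rows
nrow(train)
nrow(test)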



