#Number 6
data1 <- data.frame(Titanic)
any(is.na(data1)) # check whether the data set contains any NA
## [1] FALSE
colSums(is.na(data1)) # count NA values per column
## Class Sex Age Survived Freq
## 0 0 0 0 0
#Number 7
data1 <- data.frame(Titanic)
# Compute the boxplot statistics without drawing the plot
box <- boxplot(data1$Freq, plot = FALSE)
# Extract the outlier values
outliers <- box$out
# Count the outliers
length(outliers)
## [1] 2
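# A minimal sketch (not part of the original answer) of the rule behind box$out,
# assuming the default coef = 1.5: a value is flagged when it lies more than
# 1.5 * (upper hinge - lower hinge) beyond a hinge, with hinges from fivenum().
# The names freq, five, spread and manual_out are illustrative only.
freq <- data.frame(Titanic)$Freq
five <- fivenum(freq)                  # min, lower hinge, median, upper hinge, max
spread <- five[4] - five[2]            # hinge-based interquartile range
manual_out <- freq[freq < five[2] - 1.5 * spread | freq > five[4] + 1.5 * spread]
length(manual_out)                     # should agree with length(outliers) above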
#Number 8
data1 <- data.frame(Titanic)
sum(duplicated(data1)) # count fully duplicated rows
## [1] 0
#Number 9
nilai <- c(70, 75, 80, 85, 85, 90, 95, 100, 60, 75, 77, 85, 90, 98, 68, 92, 85, 66, 75, 80, 72, 84, 50, 69, 76, 80, 90, 95, 88, 77)
mean(nilai) # mean
## [1] 80.4
median(nilai) # median
## [1] 80
sd(nilai) # standard deviation
## [1] 11.48792
#Number 10
# Load the package and the data
library(mlbench)
## Warning: package 'mlbench' was built under R version 4.4.3
data("BreastCancer")
# Drop rows that contain NA
bc <- na.omit(BreastCancer)
# Total number of observations
n <- nrow(bc)
# Set the seed for reproducibility
set.seed(110)
# Sample 80% of the row indices for training
train_index <- sample(seq_len(n), size = floor(0.8 * n))
# Split the data
train_data <- bc[train_index, ]
test_data <- bc[-train_index, ]
# Show the number of rows in the test set
nrow(test_data)
## [1] 137
# Same split via a logical index; `split` is assumed to mark the training rows
split <- seq_len(n) %in% train_index
train <- subset(bc, split)
test <- subset(bc, !split)
nrow(train)
nrow(test)