Data set

The goal is to compare a given indicator across 5 regions.

set.seed(785)
ns <- c(15,28,10,20,75)
n <- length(ns)
region <- factor(rep(1:n,ns),labels=paste("Region ", 1:n, sep=""))
indic <- rnorm(length(region), mean=3+(as.numeric(region)-5)^2)
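
As a quick sanity check on the simulated data, the sketch below tabulates the group sizes and the theoretical mean that the formula 3 + (k - 5)^2 assigns to region k. It repeats the simulation so that it runs on its own; the names `sizes` and `theo_mean` are introduced here for illustration only.

```r
set.seed(785)
ns <- c(15, 28, 10, 20, 75)
n <- length(ns)
region <- factor(rep(1:n, ns), labels = paste("Region ", 1:n, sep = ""))
indic <- rnorm(length(region), mean = 3 + (as.numeric(region) - 5)^2)

# Observations per region (should match ns)
sizes <- table(region)
# Theoretical mean of each region implied by the simulation formula
theo_mean <- 3 + (seq_len(n) - 5)^2

print(sizes)
print(theo_mean)  # 19 12 7 4 3
```

The theoretical means decrease from Region 1 to Region 5, which is why the boxes step downward from left to right.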

Classical boxplot by region

boxplot(indic~region, border=1:n, xlab="Regions", ylab="Indicator")

1 Removing non-data

Removing the Colored Borders

Drop the border colors from the boxes so that only the default appearance remains.

boxplot(indic~region, xlab="Regions", ylab="Indicator")

Removing the Axes

  • Use axes = FALSE to suppress the axes.

  • Add the axes back manually with the axis() function:

side = 1: adds the horizontal axis.

side = 2: adds the vertical axis.

boxplot(indic~region, xlab="Regions", ylab="Indicator", 
        axes = FALSE)
axis(side = 1)
axis(side = 2)
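
When `axis(side = 1)` is called without tick positions or labels, R falls back to default numeric ticks, so the region names no longer appear. The following is a minimal sketch, assuming the region names should stay on the horizontal axis; it relies on the fact that `boxplot()` draws the k-th box at x = k.

```r
set.seed(785)
ns <- c(15, 28, 10, 20, 75)
n <- length(ns)
region <- factor(rep(1:n, ns), labels = paste("Region ", 1:n, sep = ""))
indic <- rnorm(length(region), mean = 3 + (as.numeric(region) - 5)^2)

boxplot(indic ~ region, xlab = "Regions", ylab = "Indicator", axes = FALSE)
# One tick per box, labelled with the factor levels
axis(side = 1, at = seq_len(n), labels = levels(region))
axis(side = 2)
```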

Removing Other Elements

boxcol = "white": draws the box outline in white, which makes it invisible on a white background.

boxplot(indic~region, xlab="Regions", ylab="Indicator", 
        axes = FALSE, 
        pars = list(boxcol = "white"))
axis(side = 1)
axis(side = 2)

Modifying Non-Data Elements

  • whisklty: line type of the whiskers (the lines running from the box to the most extreme non-outlying points).

  • staplelty = "blank": removes the staples (the short lines at the ends of the whiskers).

  • outcex = 0.5: scales the outlier points to half their default size.

boxplot(indic~region, xlab="Regions", ylab="Indicator", 
        axes = FALSE,
        pars = list(boxcol = "white",
                    whisklty = c(1, 1),
                    staplelty = "blank", 
                    outcex = 0.5) )
axis(side = 1)
axis(side=2)

Adding the "Tufte Style"

medlty = "blank": removes the median line.

medpch = 16: replaces the median line with a solid dot.

#Tufte Style 
boxplot(indic~region, xlab="Regions", ylab="Indicator",
        axes = FALSE,
        pars = list(boxcol = "white",
                    whisklty = c(1, 1),
                    staplelty = "blank",
                    outcex = 0.5,
                    medlty = "blank",
                    medpch = 16))
axis(side = 1)
axis(side=2)
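
Because the same `pars` list recurs in every call, it can be stored once and reused. The sketch below does that; the name `tufte_pars` is hypothetical and not part of the original code. It also keeps the value that `boxplot()` returns invisibly, which holds the statistics behind each box.

```r
set.seed(785)
ns <- c(15, 28, 10, 20, 75)
n <- length(ns)
region <- factor(rep(1:n, ns), labels = paste("Region ", 1:n, sep = ""))
indic <- rnorm(length(region), mean = 3 + (as.numeric(region) - 5)^2)

# Graphical parameters for the minimal style, defined once
tufte_pars <- list(boxcol = "white",
                   whisklty = 1,
                   staplelty = "blank",
                   outcex = 0.5,
                   medlty = "blank",
                   medpch = 16)

# boxplot() invisibly returns the per-group statistics
bp <- boxplot(indic ~ region, xlab = "Regions", ylab = "Indicator",
              axes = FALSE, pars = tufte_pars)
axis(side = 1)
axis(side = 2)

bp$n           # number of observations per region
bp$stats[3, ]  # row 3 holds the medians drawn as dots
```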

Showing the Mean or Median on the Boxplot

Adding the mean/median:

by(): computes the median (or mean) of indic for each region.

points(): draws a marker at each mean/median on the plot.

text(): adds a numeric label (the mean/median value) next to each marker.

boxplot(indic~region, xlab="Regions", ylab="Indicator",
         axes = FALSE,
         pars = list(boxcol = "white",
                    whisklty = c(1, 1),
                    staplelty = "blank", 
                    outcex = 0.5, 
                    medlty = "blank", 
                    medpch=16) )

# Compute the mean of each region
# (use median instead of mean to mark the medians)
means <- by(indic, region, mean)
# Plot a diamond for each mean, centered on its box
points(seq_len(n), means, pch = 23, cex = 0.75)
# Label each mean, formatted to one decimal place,
# to the left of its box
text(seq_len(n) - 0.1, means, labels = formatC(means, format = "f", digits = 1),
     pos = 2, cex = 1.0)
axis(side = 1)
axis(side=2)
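
`by()` returns an object of class "by". When a plain named vector is more convenient, `tapply()` is a common alternative. The sketch below computes both summaries per region so that they can be compared side by side; the names `grp_mean` and `grp_median` are introduced here for illustration.

```r
set.seed(785)
ns <- c(15, 28, 10, 20, 75)
n <- length(ns)
region <- factor(rep(1:n, ns), labels = paste("Region ", 1:n, sep = ""))
indic <- rnorm(length(region), mean = 3 + (as.numeric(region) - 5)^2)

# One value per factor level, named by region
grp_mean <- tapply(indic, region, mean)
grp_median <- tapply(indic, region, median)

# Side-by-side table, rounded to one decimal place
print(round(cbind(mean = grp_mean, median = grp_median), 1))
```

Either vector can be passed to `points()` and `text()` in place of the `by()` result.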

Final Boxplot with Additional Elements

  • The axes and axis labels are drawn in grey.

  • The mean symbol (a diamond) is colored dark blue.

  • The median is shown as a solid dot, a cleaner look than the default line.

boxplot(indic~region, xlab="Regions", ylab="Indicator",
        axes = FALSE,
        col.lab = "grey",
        pars = list(boxcol = "white",
                    whisklty = c(1, 1),
                    staplelty = "blank",
                    outcex = 0.5,
                    medlty = "blank",
                    medpch = 20))

# Compute the mean of each region
# (use median instead of mean to mark the medians)
means <- by(indic, region, mean)
# Plot a dark blue diamond for each mean, centered on its box
points(seq_len(n), means, pch = 23, cex = 0.75, bg = "darkblue")
# Label each mean, formatted to one decimal place,
# to the left of its box
text(seq_len(n) - 0.1, means, labels = formatC(means, format = "f", digits = 1),
     pos = 2, cex = 1.0, col = "darkblue")
axis(side = 1, col = "grey", col.axis = "grey")
axis(side=2, col ="grey", col.axis = "grey")

#axiscolors = "grey"

2 Comparing boxplot

Comparing a rug plot, a boxplot, a density plot, and a violin plot:

  • Rug plot: shows each observation as a small tick mark along an axis.

  • Boxplot: the standard summary of the data distribution.

  • Density plot: a curve showing the density of the data distribution.

  • Violin plot: a combination of a boxplot and a density plot.

library(hdrcde)
library(vioplot)  # vioplot()
library(Hmisc)    # bpplot(), used only in the commented-out line below

set.seed(785)
x <- rnorm(200, 2, 1)
# Four panels side by side; use mfrow = c(1, 5) if bpplot() is enabled
opar <- par(mfrow = c(1, 4), mar = c(3, 2, 4, 1))

# Theoretical N(2, 1) density evaluated over the range of x
xxx <- seq(min(x), max(x), length.out = 500)
yyy <- dnorm(xxx, mean = 2)

## Empty plot that only sets up the coordinates for the rug
plot(yyy, xxx, type = "n", main = "Rug", axes = FALSE)
rug(x, side = 2, lwd = 1, ticksize = 1)
boxplot(x, col = "pink", main = "standard\nboxplot", axes = FALSE)
plot(yyy, xxx, type = "l", main = "density", frame.plot = FALSE)
vioplot(x, axes = FALSE)
title("violin plot")

#bpplot(x)
par(opar)
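
The density panel above draws the theoretical normal curve with `dnorm()`, which is possible only because the data were simulated. With real data the true density is unknown, and a kernel density estimate from `density()` is a common substitute. The following is a minimal sketch under that assumption, drawn vertically to match the panels above; the name `d` is introduced here for illustration.

```r
set.seed(785)
x <- rnorm(200, 2, 1)

# Kernel density estimate (Gaussian kernel, default bandwidth)
d <- density(x)

# Swap the axes so the curve runs vertically, like the panels above
plot(d$y, d$x, type = "l", main = "estimated density",
     xlab = "density", ylab = "x", frame.plot = FALSE)
# Mark the individual observations along the vertical axis
rug(x, side = 2)
```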


Direktorat Statistik Kesejahteraan Rakyat, BPS,