How to Analyze Data with R

library(mosaicCalc)
## Loading required package: mosaic
## Registered S3 method overwritten by 'mosaic':
##   method                           from   
##   fortify.SpatialPolygonsDataFrame ggplot2
## 
## The 'mosaic' package masks several functions from core packages in order to add 
## additional features.  The original behavior of these functions should not be affected by this.
## 
## Attaching package: 'mosaic'
## The following objects are masked from 'package:dplyr':
## 
##     count, do, tally
## The following object is masked from 'package:Matrix':
## 
##     mean
## The following object is masked from 'package:ggplot2':
## 
##     stat
## The following objects are masked from 'package:stats':
## 
##     binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
##     quantile, sd, t.test, var
## The following objects are masked from 'package:base':
## 
##     max, mean, min, prod, range, sample, sum
## Loading required package: mosaicCore
## 
## Attaching package: 'mosaicCore'
## The following objects are masked from 'package:dplyr':
## 
##     count, tally
## 
## Attaching package: 'mosaicCalc'
## The following object is masked from 'package:stats':
## 
##     D

The t.test() function reports the t statistic, the degrees of freedom (df), the p-value, a confidence interval for the mean at the chosen confidence level, and the estimated sample mean. The wilcox.test() function reports two values: the test statistic (printed as V for the one-sample signed rank test and as W for the two-sample rank sum test) and the p-value derived from that statistic.

# Hypothesis test: mean ozone concentration = 40 ppb
# parametric
t.test(x = airquality$Ozone, alternative = "two.sided",
       mu = 40)
## 
##  One Sample t-test
## 
## data:  airquality$Ozone
## t = 0.69521, df = 115, p-value = 0.4883
## alternative hypothesis: true mean is not equal to 40
## 95 percent confidence interval:
##  36.06240 48.19622
## sample estimates:
## mean of x 
##  42.12931
# nonparametric
wilcox.test(x = airquality$Ozone, alternative = "two.sided",
            mu = 40)
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  airquality$Ozone
## V = 3188, p-value = 0.6826
## alternative hypothesis: true location is not equal to 40
# Two-population hypothesis test
# Rebuild the iris data frame from the 3-d array iris3 (as in ?iris);
# ii has the same row order as iris, so iris$Species can index ii below.
dni3 <- dimnames(iris3)
ii <- data.frame(matrix(aperm(iris3, c(1,3,2)), ncol = 4,
                        dimnames = list(NULL, sub(" L.",".Length",
                                        sub(" W.",".Width", dni3[[2]])))),
    Species = gl(3, 50, labels = sub("S", "s", sub("V", "v", dni3[[3]]))))
# parametric
t.test(x = iris$Sepal.Length[iris$Species == "setosa"],
       y = ii$Sepal.Length[iris$Species == "versicolor"])
## 
##  Welch Two Sample t-test
## 
## data:  iris$Sepal.Length[iris$Species == "setosa"] and ii$Sepal.Length[iris$Species == "versicolor"]
## t = -10.521, df = 86.538, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.1057074 -0.7542926
## sample estimates:
## mean of x mean of y 
##     5.006     5.936
# nonparametric
wilcox.test(x = iris$Sepal.Length[iris$Species == "setosa"],
            y = ii$Sepal.Length[iris$Species == "versicolor"])
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  iris$Sepal.Length[iris$Species == "setosa"] and ii$Sepal.Length[iris$Species == "versicolor"]
## W = 168.5, p-value = 8.346e-14
## alternative hypothesis: true location shift is not equal to 0
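The values described above can also be read programmatically, because both functions return a list of class "htest". The following is a minimal sketch, assuming the base stats versions of the functions (called with the stats:: prefix so the result does not depend on which packages are attached); the object names res_t and res_w are illustrative only.

```r
# One-sample tests on ozone, stored instead of printed.
# stats:: is spelled out so that masking by other packages does not matter.
res_t <- stats::t.test(airquality$Ozone, alternative = "two.sided", mu = 40)

# Ties in the data make wilcox.test() warn that the p-value is not exact;
# the warning is suppressed here only to keep the example quiet.
res_w <- suppressWarnings(
  stats::wilcox.test(airquality$Ozone, alternative = "two.sided", mu = 40)
)

# Components of the t-test result
res_t$statistic   # t statistic
res_t$parameter   # degrees of freedom
res_t$p.value     # p-value
res_t$conf.int    # confidence interval for the mean
res_t$estimate    # sample mean

# Components of the Wilcoxon result
res_w$statistic   # V (signed rank statistic)
res_w$p.value     # p-value
```

Storing the result this way is convenient when the p-value or interval is needed in later calculations or in a report table.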

Based on the output, the Pearson method reports the t statistic, the degrees of freedom, the p-value, a confidence interval for the correlation at the chosen confidence level, and the estimated correlation. The Kendall and Spearman methods, on the other hand, report the z and S test statistics respectively, together with the p-value based on that statistic and the estimated correlation coefficient.

# Pearson
cor.test(x = airquality$Ozone, y = airquality$Solar.R,
         alternative = "two.sided",
         method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  x and y
## t = 3.8798, df = 109, p-value = 0.0001793
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.173194 0.502132
## sample estimates:
##       cor 
## 0.3483417
# Kendall
cor.test(x = airquality$Ozone, y = airquality$Solar.R,
         alternative = "two.sided",
         method = "kendall")
## 
##  Kendall's rank correlation tau
## 
## data:  x and y
## z = 3.7096, p-value = 0.0002076
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
##       tau 
## 0.2403194
# Spearman
cor.test(x = airquality$Ozone, y = airquality$Solar.R,
         alternative = "two.sided",
         method = "spearman")
## Warning in cor.test.default(x, y, ...): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  x and y
## S = 148561, p-value = 0.0001806
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.3481865
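To compare the three methods side by side, the results can be collected into one table. This is a minimal sketch, assuming the base stats::cor.test(); the names methods, results, and summary_table are illustrative and do not come from the original text.

```r
# Run cor.test() once per method and keep the returned "htest" objects.
methods <- c("pearson", "kendall", "spearman")
results <- lapply(methods, function(m) {
  # Ties make the rank-based methods warn about inexact p-values;
  # the warning is suppressed only to keep the example quiet.
  suppressWarnings(
    stats::cor.test(airquality$Ozone, airquality$Solar.R,
                    alternative = "two.sided", method = m)
  )
})
names(results) <- methods

# One row per method: test statistic (t, z, or S), estimate, and p-value.
summary_table <- data.frame(
  method    = methods,
  statistic = sapply(results, function(r) unname(r$statistic)),
  estimate  = sapply(results, function(r) unname(r$estimate)),
  p_value   = sapply(results, function(r) r$p.value)
)
print(summary_table)
```

Note that the statistic column mixes three different quantities (t, z, and S), so its values are not comparable across rows; only the estimates and p-values are.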

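The Kruskal-Wallis test (p-value = 6.901e-06) indicates that ozone concentration is not the same in every month: at least one month differs significantly from the others. The aov() call does not show this at the 5% level (p = 0.0776) because Month is stored as a number, so the model spends a single degree of freedom on Month and tests a linear trend across months rather than differences among the five months; converting Month to a factor gives the usual one-way analysis of variance. To learn more about analysis of variance in R and how to read the output of both functions, see the STHDA page listed in the references.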

summary(aov(Ozone~Month, airquality))
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Month         1   3387    3387   3.171 0.0776 .
## Residuals   114 121756    1068                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 37 observations deleted due to missingness
kruskal.test(Ozone~Month, airquality)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Ozone by Month
## Kruskal-Wallis chi-squared = 29.267, df = 4, p-value = 6.901e-06
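The aov() output above shows 1 degree of freedom for Month, while kruskal.test() shows 4, which suggests that aov() treated Month as a numeric predictor. A minimal sketch of the one-way layout, assuming the intent is to compare the five months as groups (the names aq and fit are illustrative):

```r
# Work on a copy so the built-in airquality data set is left unchanged.
aq <- airquality
aq$Month <- factor(aq$Month)   # months 5 to 9 become five group labels

# One-way ANOVA with Month as a grouping factor.
# Rows with missing Ozone are dropped by the default na.action.
fit <- aov(Ozone ~ Month, data = aq)
summary(fit)

# The Month row of the ANOVA table, for programmatic use.
anova_table <- summary(fit)[[1]]
anova_table$Df[1]            # degrees of freedom for Month
anova_table[["Pr(>F)"]][1]   # p-value for Month
```

With Month as a factor, the degrees of freedom for Month match those of the Kruskal-Wallis test, so the two results address the same question.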

References

Bloomfield, V.A. 2014. Using R for Numerical Analysis in Science and Engineering. CRC Press.
Coghlan, A. n.d. Using R for Multivariate Analysis. https://little-book-of-r-for-multivariate-analysis.readthedocs.io/en/latest/src/multivariateanalysis.html#principal-component-analysis.
Primartha, R. 2018. Belajar Machine Learning Teori dan Praktik. Penerbit Informatika: Bandung.
Rosadi, D. 2016. Analisis Statistika dengan R. Gadjah Mada University Press: Yogyakarta.
Rosidi, M. 2019. Uji Hipotesis. https://environmental-data-modeling.netlify.com/tutorial/11_uji_hipotesis/.
STHDA. n.d. Comparing Means in R. http://www.sthda.com/english/wiki/comparing-means-in-r.