Ngay 2. So sanh 2 nhom

Viec 1. Doc du lieu Bone data

df <- read.csv("D:\\Giang\\Data_analysis_R_01_2026\\DATA\\DATA\\Bone_data.csv")

Viec 2. So sanh mat do xuong nam va nu

2.1. Ve bieu do Histogram danh gia mat do co xuong dui (fnbmd)

library(lessR)

## 
## lessR 4.5                            feedback: gerbing@pdx.edu 
## --------------------------------------------------------------
## > d <- Read("")  Read data file, many formats available, e.g., Excel
##   d is the default data frame, data= in analysis routines optional
## 
## Many examples of reading, writing, and manipulating data, graphics,
## testing means and proportions, regression, factor analysis,
## customization, forecasting, and aggregation to pivot tables.
##   Enter: browseVignettes("lessR")
## 
## View lessR updates, now including modern time series forecasting
##   and many, new Plotly interactive visualizations output. Most
##   visualization functions are now reorganized to three functions:
##      Chart(): type="bar", "pie", "radar", "bubble", "treemap", "icicle"
##      X(): type="histogram", "density", "vbs" and more
##      XY(): type="scatter" for a scatterplot, or "contour", "smooth"
##    Most previous function calls still work, such as:
##      BarChart(), Histogram, and Plot().
##   Enter: news(package="lessR"), or ?Chart, ?X, or ?XY
## There is also Flows() for Sankey flow diagrams, see ?Flows
## 
## Interactive data analysis for constructing visualizations.
##   Enter: interact()

Histogram(fnbmd,data=df)

## lessR visualizations are now unified over just three core functions:
##   - Chart() for pivot tables, such as bar charts. More info: ?Chart
##   - X() for a single variable x, such as histograms. More info: ?X
##   - XY() for scatterplots of two variables, x and y. More info: ?XY
## 
## Histogram() is deprecated, though still working for now.
## Please use X(..., type = "histogram") going forward.

## [Interactive plot from the Plotly R package (Sievert, 2020)]

## >>> Suggestions 
## bin_width: set the width of each bin 
## bin_start: set the start of the first bin 
## bin_end: set the end of the last bin 
## X(fnbmd, type="density")  # smoothed curve + histogram 
## X(fnbmd, type="vbs")  # Violin/Box/Scatterplot (VBS) plot 
## 
## --- fnbmd --- 
##  
##        n    miss      mean        sd       min       mdn       max 
##      2122      40     0.829     0.155     0.280     0.820     1.510 
##  
## 
##   
## --- Outliers ---     from the box plot: 33 
##  
## Small      Large 
## -----      ----- 
##  0.3      1.5 
##  0.3      1.5 
##  0.4      1.4 
##  0.4      1.4 
##  0.4      1.4 
##  0.4      1.4 
##  0.4      1.4 
##  0.4      1.4 
##  0.4      1.3 
##  0.4      1.3 
##  0.4      1.3 
##           1.3 
##           1.3 
##           1.3 
##           1.3 
##           1.2 
##           1.2 
##           1.2 
## 
## + 15 more outliers 
## 
## 
## Bin Width: 0.1 
## Number of Bins: 14 
##  
##        Bin  Midpnt  Count    Prop  Cumul.c  Cumul.p 
## --------------------------------------------------- 
##  0.2 > 0.3    0.25      1    0.00        1     0.00 
##  0.3 > 0.4    0.35      9    0.00       10     0.00 
##  0.4 > 0.5    0.45     15    0.01       25     0.01 
##  0.5 > 0.6    0.55    103    0.05      128     0.06 
##  0.6 > 0.7    0.65    306    0.14      434     0.20 
##  0.7 > 0.8    0.75    522    0.25      956     0.45 
##  0.8 > 0.9    0.85    534    0.25     1490     0.70 
##  0.9 > 1.0    0.95    371    0.17     1861     0.88 
##  1.0 > 1.1    1.05    183    0.09     2044     0.96 
##  1.1 > 1.2    1.15     48    0.02     2092     0.99 
##  1.2 > 1.3    1.25     21    0.01     2113     1.00 
##  1.3 > 1.4    1.35      6    0.00     2119     1.00 
##  1.4 > 1.5    1.45      2    0.00     2121     1.00 
##  1.5 > 1.6    1.55      1    0.00     2122     1.00 
##

Bieu do cho thay bien fnbmd tuan theo luat phan bo chuan (normal distribution)

2.2.So sanh mat do co xuong dui giua nam va nu

Vi fnbmd phan bo chuan, gia thiet mau giua 2 nhom nam va nu doc lap. Su dung phuong phap so sanh t-test cho 2 nhom

ttest(fnbmd~sex, data=df)

## 
## Compare fnbmd across sex with levels Male and Female 
## Grouping Variable:  sex
## Response Variable:  fnbmd
## 
## 
## ------ Describe ------
## 
## fnbmd for sex Male:  n.miss = 23,  n = 822,  mean = 0.910,  sd = 0.153
## fnbmd for sex Female:  n.miss = 17,  n = 1300,  mean = 0.778,  sd = 0.132
## 
## Mean Difference of fnbmd:  0.132
## 
## Weighted Average Standard Deviation:   0.141 
## 
## 
## ------ Assumptions ------
## 
## Note: These hypothesis tests can perform poorly, and the 
##       t-test is typically robust to violations of assumptions. 
##       Use as heuristic guides instead of interpreting literally. 
## 
## Null hypothesis, for each group, is a normal distribution of fnbmd.
## Group Male: Sample mean assumed normal because n > 30, so no test needed.
## Group Female: Sample mean assumed normal because n > 30, so no test needed.
## 
## Null hypothesis is equal variances of fnbmd, homogeneous.
## Variance Ratio test:  F = 0.023/0.018 = 1.336,  df = 821;1299,  p-value = 0.000
## Levene's test, Brown-Forsythe:  t = 3.449,  df = 2120,  p-value = 0.001
## 
## 
## ------ Infer ------
## 
## --- Assume equal population variances of fnbmd for each sex 
## 
## t-cutoff for 95% range of variation: tcut =  1.961 
## Standard Error of Mean Difference: SE =  0.006 
## 
## Hypothesis Test of 0 Mean Diff:  t-value = 21.080,  df = 2120,  p-value = 0.000
## 
## Margin of Error for 95% Confidence Level:  0.012
## 95% Confidence Interval for Mean Difference:  0.120 to 0.144
## 
## 
## --- Do not assume equal population variances of fnbmd for each sex 
## 
## t-cutoff: tcut =  1.961 
## Standard Error of Mean Difference: SE =  0.006 
## 
## Hypothesis Test of 0 Mean Diff:  t = 20.407,  df = 1560.981, p-value = 0.000
## 
## Margin of Error for 95% Confidence Level:  0.013
## 95% Confidence Interval for Mean Difference:  0.119 to 0.145
## 
## 
## ------ Effect Size ------
## 
## --- Assume equal population variances of fnbmd for each sex 
## 
## Standardized Mean Difference of fnbmd, Cohen's d:  0.939
## 
## 
## ------ Practical Importance ------
## 
## Minimum Mean Difference of practical importance: mmd
## Minimum Standardized Mean Difference of practical importance: msmd
## Neither value specified, so no analysis
## 
## 
## ------ Graphics Smoothing Parameter ------
## 
## Density bandwidth for sex Male: 0.044
## Density bandwidth for sex Female: 0.034

## Viec 3.So sanh 2 nhom RER ### 3.1 Nhap du lieu

placebo <- c(105, 119, 100, 97, 96, 101, 94, 95, 98)
cf <- c(96, 99, 94, 89, 96, 93, 88, 105, 88)

3.3. Dung bootstrap de so sanh RER trong 2 nhom

library(simpleboot)

## Simple Bootstrap Routines (1.1-8)

library(boot)

set.seed(123) #de ket qua duoc lap lai

boot_result <- two.boot(placebo, cf, FUN=mean, R=10000)
boot.ci(boot_result, type = c("perc", "bca","norm", "basic"))

## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 10000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = boot_result, type = c("perc", "bca", "norm", 
##     "basic"))
## 
## Intervals : 
## Level      Normal              Basic         
## 95%   ( 0.383, 12.188 )   ( 0.111, 11.778 )  
## 
## Level     Percentile            BCa          
## 95%   ( 0.889, 12.556 )   ( 0.778, 12.444 )  
## Calculations and Intervals on Original Scale

#bca bias-corrected accelerated boot CI dung cho co mau nho

Viec 4. so sanh ti le gay xuong giua nam va nu

library(table1)

## 
## Attaching package: 'table1'

## The following object is masked from 'package:lessR':
## 
##     label

## The following objects are masked from 'package:base':
## 
##     units, units<-

table1(~as.factor(fx)| sex, data=df)

	Female (N=1317)	Male (N=845)	Overall (N=2162)
as.factor(fx)
0	916 (69.6%)	701 (83.0%)	1617 (74.8%)
1	401 (30.4%)	144 (17.0%)	545 (25.2%)

chisq.test(df$fx, df$sex)

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  df$fx and df$sex
## X-squared = 48.363, df = 1, p-value = 3.542e-12

Day 2_Group comparision

Giang Phạm

2026-01-07