There are three types of analysis methods in scientific research:
1: Analysis of difference
t-test: the dependent variable is continuous; the independent variable has 2 groups
ANOVA: one continuous dependent variable; the independent variable has more than 2 groups (one-, two-, or three-way ANOVA)
MANOVA: multiple continuous dependent variables
Chi-square: both variables are categorical
2: Correlation analysis and prediction
Correlation analysis
Linear regression analysis
Logistic regression
3: Association analysis
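# Load the packages used below
library(readxl)    # read_excel()
library(ggplot2)   # scatter plots and fitted lines
library(car)
library(psych)     # pairs.panels()
library(relaimpo)  # calc.relimp(), boot.relimp()
# Read the data: 3 text columns followed by 6 numeric columns
Data <- read_excel("C:/Users/Admin/Desktop/R/Data.xlsx",
                   col_types = c("text", "text", "text", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
# attach() lets the calls below refer to the columns by name
attach(Data)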
Simple linear regression model: Y = α + βX + ε
Y = dependent variable (must be continuous)
X = independent variable (continuous or categorical)
α = intercept (value of Y when X = 0)
β = slope (Estimate): change in Y when X increases by 1 unit
ε = random error
R^2 = proportion of the variance in Y explained by the model
The regression line is estimated by the least squares method: it is the line that minimizes the sum of the squared vertical distances (d^2) between the observed points and the line.
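As a minimal sketch of the least squares idea, the slope and intercept can be computed by hand and compared with `lm()`. The data here are simulated purely for illustration (they are not the data set used in these notes), and the names `alpha_hat`, `beta_hat` and `sse` are chosen for this example only.

```r
# Simulated data (hypothetical values, for illustration only)
set.seed(1)
x <- runif(30, min = 10, max = 60)
y <- 400 + 38 * x + rnorm(30, sd = 300)

# Closed-form least squares estimates
beta_hat  <- cov(x, y) / var(x)            # slope
alpha_hat <- mean(y) - beta_hat * mean(x)  # intercept

# Sum of squared vertical distances (d^2) for a candidate line a + b * x
sse <- function(a, b) sum((y - (a + b * x))^2)

# lm() should return the same two numbers
fit <- lm(y ~ x)
coef(fit)
c(alpha_hat, beta_hat)

# Moving the line away from the least squares solution increases the sum of squares
sse(alpha_hat, beta_hat) < sse(alpha_hat, beta_hat + 1)
```

Any other choice of intercept or slope gives a larger sum of squared distances, which is what "least squares" refers to.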
cor.test(`Body Weight (g)`,`Liver (g)`)
##
## Pearson's product-moment correlation
##
## data: Body Weight (g) and Liver (g)
## t = 6.5252, df = 22, p-value = 1.455e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.6076240 0.9155086
## sample estimates:
## cor
## 0.8119908
Simple_linear_CO=lm(`Body Weight (g)`~`Liver (g)`)
summary(Simple_linear_CO)
##
## Call:
## lm(formula = `Body Weight (g)` ~ `Liver (g)`)
##
## Residuals:
## Min 1Q Median 3Q Max
## -462.32 -225.54 -40.23 147.18 921.06
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 417.752 235.647 1.773 0.0901 .
## `Liver (g)` 38.131 5.844 6.525 1.46e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 337.2 on 22 degrees of freedom
## Multiple R-squared: 0.6593, Adjusted R-squared: 0.6438
## F-statistic: 42.58 on 1 and 22 DF, p-value: 1.455e-06
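The correlation test and the simple regression above report the same t statistic (6.525) and p-value (1.455e-06), and the Multiple R-squared (0.6593) is the square of the correlation coefficient (0.8119908^2 ≈ 0.6593). The sketch below checks this relationship on the built-in `cars` data, which stands in for the data set used here; the object names are illustrative.

```r
# Built-in data set: stopping distance (dist) versus speed
r   <- cor(cars$speed, cars$dist)
ct  <- cor.test(cars$speed, cars$dist)
fit <- lm(dist ~ speed, data = cars)
s   <- summary(fit)

# In simple linear regression, R-squared is the squared Pearson correlation
all.equal(r^2, s$r.squared)

# The t statistic for the slope matches the t statistic of the correlation test
all.equal(unname(ct$statistic), s$coefficients["speed", "t value"])
```

This only holds with a single predictor; with several predictors R-squared no longer corresponds to one pairwise correlation.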
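# Scatter plot with the fitted regression line (theme_classic() replaces theme_bw(), so only one theme is needed)
p <- ggplot(Data, aes(x = `Liver (g)`, y = `Body Weight (g)`))
p + geom_point() + theme_classic() + geom_smooth(method = "lm", formula = y ~ x)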
Types of categorical variable:
1: Nominal: sex, location, nationality
2: Ordinal: level of something, stage of something
A t-test can be used in this case, but it only tests the difference between the group means; it does not give a regression model that can be used for prediction.
t.test(`Weight gain (g)`~Gender)
##
## Welch Two Sample t-test
##
## data: Weight gain (g) by Gender
## t = 2.6819, df = 16.312, p-value = 0.01617
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 14.32066 121.56268
## sample estimates:
## mean in group Female mean in group Male
## 207.5917 139.6500
simple_linear_CA=lm(`Weight gain (g)`~Gender)
summary(simple_linear_CA)
##
## Call:
## lm(formula = `Weight gain (g)` ~ Gender)
##
## Residuals:
## Min 1Q Median 3Q Max
## -109.49 -45.81 -3.15 47.64 116.41
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 207.59 17.91 11.589 7.75e-11 ***
## GenderMale -67.94 25.33 -2.682 0.0136 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 62.05 on 22 degrees of freedom
## Multiple R-squared: 0.2464, Adjusted R-squared: 0.2121
## F-statistic: 7.193 on 1 and 22 DF, p-value: 0.01362
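In the output above, the intercept (207.59) is the Female mean and the GenderMale estimate (-67.94) is the Male mean minus the Female mean (139.65 - 207.59). The p-values differ (0.01617 versus 0.0136) because `t.test()` applies the Welch correction by default. The sketch below, on the built-in `mtcars` data rather than the data set used here, suggests that the two approaches agree once `var.equal = TRUE` is set; the names `d`, `tt`, `fit` and `co` are illustrative.

```r
# Two-level factor from the built-in mtcars data: transmission type
d <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

# Classical (equal-variance) two-sample t-test
tt <- t.test(mpg ~ am, data = d, var.equal = TRUE)

# Linear regression with the same factor as the only predictor
fit <- lm(mpg ~ am, data = d)
co  <- summary(fit)$coefficients

# The slope is the difference between the two group means
co["ammanual", "Estimate"]
unname(tt$estimate[2] - tt$estimate[1])

# Same t statistic (up to sign) and same p-value
abs(unname(tt$statistic))
abs(co["ammanual", "t value"])
tt$p.value
co["ammanual", "Pr(>|t|)"]
```

The sign of t differs only because `t.test()` subtracts the second group from the first, while the regression coefficient is the second group minus the reference group.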
m=lm(`Weight gain (g)`~Gender+`Liver (g)`+Gender:`Liver (g)`)
m1=lm(`Weight gain (g)`~Gender)
m2= lm(`Weight gain (g)`~`Liver (g)`)
summary(m)
##
## Call:
## lm(formula = `Weight gain (g)` ~ Gender + `Liver (g)` + Gender:`Liver (g)`)
##
## Residuals:
## Min 1Q Median 3Q Max
## -102.797 -46.847 -0.179 45.105 113.197
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 188.7844 64.1593 2.942 0.00805 **
## GenderMale -65.9907 90.7864 -0.727 0.47572
## `Liver (g)` 0.4980 1.6252 0.306 0.76243
## GenderMale:`Liver (g)` -0.0699 2.2532 -0.031 0.97556
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 64.81 on 20 degrees of freedom
## Multiple R-squared: 0.2527, Adjusted R-squared: 0.1406
## F-statistic: 2.254 on 3 and 20 DF, p-value: 0.1133
summary(m1)
##
## Call:
## lm(formula = `Weight gain (g)` ~ Gender)
##
## Residuals:
## Min 1Q Median 3Q Max
## -109.49 -45.81 -3.15 47.64 116.41
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 207.59 17.91 11.589 7.75e-11 ***
## GenderMale -67.94 25.33 -2.682 0.0136 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 62.05 on 22 degrees of freedom
## Multiple R-squared: 0.2464, Adjusted R-squared: 0.2121
## F-statistic: 7.193 on 1 and 22 DF, p-value: 0.01362
summary(m2)
##
## Call:
## lm(formula = `Weight gain (g)` ~ `Liver (g)`)
##
## Residuals:
## Min 1Q Median 3Q Max
## -103.69 -47.85 -18.55 36.56 148.90
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 163.4915 49.8999 3.276 0.00345 **
## `Liver (g)` 0.2626 1.2374 0.212 0.83387
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 71.41 on 22 degrees of freedom
## Multiple R-squared: 0.002044, Adjusted R-squared: -0.04332
## F-statistic: 0.04505 on 1 and 22 DF, p-value: 0.8339
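The three summaries above can be compared by eye: the adjusted R-squared drops from 0.2121 (Gender only) to 0.1406 once the liver weight and the interaction are added. One possible way to make that comparison formal is an F-test between nested models with `anova()`. This is a sketch on the built-in `mtcars` data, not the analysis performed in these notes, and the model names are illustrative.

```r
# Categorical predictor (transmission) and continuous predictor (weight)
d <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

small <- lm(mpg ~ am, data = d)                # factor only
full  <- lm(mpg ~ am + wt + am:wt, data = d)   # factor, covariate and interaction

# F-test: do the extra terms reduce the residual sum of squares more than chance would?
cmp <- anova(small, full)
cmp

# Adjusted R-squared of both models, for comparison with the summaries
c(small = summary(small)$adj.r.squared, full = summary(full)$adj.r.squared)
```

A large p-value in the second row of the table would suggest that the smaller model is adequate.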
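# Colour the points and the fitted lines by Gender (fill alone does not colour the default points)
p2 <- ggplot(Data, aes(x = `Liver (g)`, y = `Weight gain (g)`, colour = Gender))
p2 + geom_point() + theme_classic() + geom_smooth(method = "lm", formula = y ~ x)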
pairs.panels(Data)
h_model=lm(`Body Weight (g)`~`Liver (g)`+`Viscera (g)`+`Fillet (g)`+`Abdomial fat (g)`)
metrics=calc.relimp(h_model,type=c("lmg"))
metrics
## Response variable: Body Weight (g)
## Total response variance: 319289.2
## Analysis based on 24 observations
##
## 4 Regressors:
## Liver (g) Viscera (g) Fillet (g) Abdomial fat (g)
## Proportion of variance explained by model: 81.94%
## Metrics are not normalized (rela=FALSE).
##
## Relative importance metrics:
##
## lmg
## Liver (g) 0.21505179
## Viscera (g) 0.18711632
## Fillet (g) 0.33505603
## Abdomial fat (g) 0.08215074
##
## Average coefficients for different model sizes:
##
## 1X 2Xs 3Xs 4Xs
## Liver (g) 38.130890 20.412847 1.339063 -8.170211
## Viscera (g) 9.543310 8.424649 7.186276 6.675346
## Fillet (g) 2.415306 2.456220 2.372892 2.238671
## Abdomial fat (g) 16.844015 -10.936325 -11.761811 -11.719022
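The lmg values above add up to the R-squared of the full model (0.2151 + 0.1871 + 0.3351 + 0.0822 ≈ 0.8194). The lmg metric averages the R-squared gained by each regressor over all orders in which the regressors can enter the model. As a minimal sketch with only two regressors, this can be computed by hand in base R; the built-in `mtcars` data and the helper `r2()` are assumptions made for this example.

```r
# Helper (illustrative): R-squared of a model fitted to mtcars
r2 <- function(formula) summary(lm(formula, data = mtcars))$r.squared

r2_wt   <- r2(mpg ~ wt)        # wt alone
r2_hp   <- r2(mpg ~ hp)        # hp alone
r2_both <- r2(mpg ~ wt + hp)   # both regressors

# Average the contribution of each regressor over the two possible entry orders:
# entering first (its own R-squared) and entering second (the gain in R-squared)
lmg_wt <- mean(c(r2_wt, r2_both - r2_hp))
lmg_hp <- mean(c(r2_hp, r2_both - r2_wt))

c(wt = lmg_wt, hp = lmg_hp)

# The shares sum to the R-squared of the full model
all.equal(lmg_wt + lmg_hp, r2_both)
```

With four regressors, as in the model above, the same averaging runs over all 24 orderings, which is what `calc.relimp()` does internally for `type = "lmg"`.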
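# Bootstrap the lmg shares (1000 replicates) and report 90% percentile intervals
boot <- boot.relimp(h_model, b = 1000, type = c("lmg"), fixed = FALSE)
booteval.relimp(boot, typesel = c("lmg"), level = 0.9, bty = "perc", nodiff = TRUE)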
## Response variable: Body Weight (g)
## Total response variance: 319289.2
## Analysis based on 24 observations
##
## 4 Regressors:
## Liver (g) Viscera (g) Fillet (g) Abdomial fat (g)
## Proportion of variance explained by model: 81.94%
## Metrics are not normalized (rela=FALSE).
##
## Relative importance metrics:
##
## lmg
## Liver (g) 0.21505179
## Viscera (g) 0.18711632
## Fillet (g) 0.33505603
## Abdomial fat (g) 0.08215074
##
## Average coefficients for different model sizes:
##
## 1X 2Xs 3Xs 4Xs
## Liver (g) 38.130890 20.412847 1.339063 -8.170211
## Viscera (g) 9.543310 8.424649 7.186276 6.675346
## Fillet (g) 2.415306 2.456220 2.372892 2.238671
## Abdomial fat (g) 16.844015 -10.936325 -11.761811 -11.719022
##
##
## Confidence interval information ( 1000 bootstrap replicates, bty= perc ):
## Relative Contributions with confidence intervals:
##
## Lower Upper
## percentage 0.9 0.9 0.9
## Liver (g).lmg 0.2151 _BC_ 0.1626 0.2777
## Viscera (g).lmg 0.1871 _BC_ 0.1305 0.2537
## Fillet (g).lmg 0.3351 A___ 0.2433 0.4171
## Abdomial fat (g).lmg 0.0822 ___D 0.0549 0.1456
##
## Letters indicate the ranks covered by bootstrap CIs.
## (Rank bootstrap confidence intervals always obtained by percentile method)
## CAUTION: Bootstrap confidence intervals can be somewhat liberal.