####### LAB EXERCISE: SPATIAL ANALYSIS ######
#### Built-in datasets in R
### iris
### This dataset contains measurements of the sepal
### and petal lengths and widths of 3 species of iris flowers
### It has 150 observations and 5 variables (four measurements plus Species).
## display the built-in "iris" dataset
iris
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
## 51 7.0 3.2 4.7 1.4 versicolor
## 52 6.4 3.2 4.5 1.5 versicolor
## 53 6.9 3.1 4.9 1.5 versicolor
## 54 5.5 2.3 4.0 1.3 versicolor
## 55 6.5 2.8 4.6 1.5 versicolor
## 56 5.7 2.8 4.5 1.3 versicolor
## 57 6.3 3.3 4.7 1.6 versicolor
## 58 4.9 2.4 3.3 1.0 versicolor
## 59 6.6 2.9 4.6 1.3 versicolor
## 60 5.2 2.7 3.9 1.4 versicolor
## 61 5.0 2.0 3.5 1.0 versicolor
## 62 5.9 3.0 4.2 1.5 versicolor
## 63 6.0 2.2 4.0 1.0 versicolor
## 64 6.1 2.9 4.7 1.4 versicolor
## 65 5.6 2.9 3.6 1.3 versicolor
## 66 6.7 3.1 4.4 1.4 versicolor
## 67 5.6 3.0 4.5 1.5 versicolor
## 68 5.8 2.7 4.1 1.0 versicolor
## 69 6.2 2.2 4.5 1.5 versicolor
## 70 5.6 2.5 3.9 1.1 versicolor
## 71 5.9 3.2 4.8 1.8 versicolor
## 72 6.1 2.8 4.0 1.3 versicolor
## 73 6.3 2.5 4.9 1.5 versicolor
## 74 6.1 2.8 4.7 1.2 versicolor
## 75 6.4 2.9 4.3 1.3 versicolor
## 76 6.6 3.0 4.4 1.4 versicolor
## 77 6.8 2.8 4.8 1.4 versicolor
## 78 6.7 3.0 5.0 1.7 versicolor
## 79 6.0 2.9 4.5 1.5 versicolor
## 80 5.7 2.6 3.5 1.0 versicolor
## 81 5.5 2.4 3.8 1.1 versicolor
## 82 5.5 2.4 3.7 1.0 versicolor
## 83 5.8 2.7 3.9 1.2 versicolor
## 84 6.0 2.7 5.1 1.6 versicolor
## 85 5.4 3.0 4.5 1.5 versicolor
## 86 6.0 3.4 4.5 1.6 versicolor
## 87 6.7 3.1 4.7 1.5 versicolor
## 88 6.3 2.3 4.4 1.3 versicolor
## 89 5.6 3.0 4.1 1.3 versicolor
## 90 5.5 2.5 4.0 1.3 versicolor
## 91 5.5 2.6 4.4 1.2 versicolor
## 92 6.1 3.0 4.6 1.4 versicolor
## 93 5.8 2.6 4.0 1.2 versicolor
## 94 5.0 2.3 3.3 1.0 versicolor
## 95 5.6 2.7 4.2 1.3 versicolor
## 96 5.7 3.0 4.2 1.2 versicolor
## 97 5.7 2.9 4.2 1.3 versicolor
## 98 6.2 2.9 4.3 1.3 versicolor
## 99 5.1 2.5 3.0 1.1 versicolor
## 100 5.7 2.8 4.1 1.3 versicolor
## 101 6.3 3.3 6.0 2.5 virginica
## 102 5.8 2.7 5.1 1.9 virginica
## 103 7.1 3.0 5.9 2.1 virginica
## 104 6.3 2.9 5.6 1.8 virginica
## 105 6.5 3.0 5.8 2.2 virginica
## 106 7.6 3.0 6.6 2.1 virginica
## 107 4.9 2.5 4.5 1.7 virginica
## 108 7.3 2.9 6.3 1.8 virginica
## 109 6.7 2.5 5.8 1.8 virginica
## 110 7.2 3.6 6.1 2.5 virginica
## 111 6.5 3.2 5.1 2.0 virginica
## 112 6.4 2.7 5.3 1.9 virginica
## 113 6.8 3.0 5.5 2.1 virginica
## 114 5.7 2.5 5.0 2.0 virginica
## 115 5.8 2.8 5.1 2.4 virginica
## 116 6.4 3.2 5.3 2.3 virginica
## 117 6.5 3.0 5.5 1.8 virginica
## 118 7.7 3.8 6.7 2.2 virginica
## 119 7.7 2.6 6.9 2.3 virginica
## 120 6.0 2.2 5.0 1.5 virginica
## 121 6.9 3.2 5.7 2.3 virginica
## 122 5.6 2.8 4.9 2.0 virginica
## 123 7.7 2.8 6.7 2.0 virginica
## 124 6.3 2.7 4.9 1.8 virginica
## 125 6.7 3.3 5.7 2.1 virginica
## 126 7.2 3.2 6.0 1.8 virginica
## 127 6.2 2.8 4.8 1.8 virginica
## 128 6.1 3.0 4.9 1.8 virginica
## 129 6.4 2.8 5.6 2.1 virginica
## 130 7.2 3.0 5.8 1.6 virginica
## 131 7.4 2.8 6.1 1.9 virginica
## 132 7.9 3.8 6.4 2.0 virginica
## 133 6.4 2.8 5.6 2.2 virginica
## 134 6.3 2.8 5.1 1.5 virginica
## 135 6.1 2.6 5.6 1.4 virginica
## 136 7.7 3.0 6.1 2.3 virginica
## 137 6.3 3.4 5.6 2.4 virginica
## 138 6.4 3.1 5.5 1.8 virginica
## 139 6.0 3.0 4.8 1.8 virginica
## 140 6.9 3.1 5.4 2.1 virginica
## 141 6.7 3.1 5.6 2.4 virginica
## 142 6.9 3.1 5.1 2.3 virginica
## 143 5.8 2.7 5.1 1.9 virginica
## 144 6.8 3.2 5.9 2.3 virginica
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
## Objects extracted from iris
species <- iris$Species
sepal.l <- iris$Sepal.Length
petal.l <- iris$Petal.Length
sepal.w <- iris$Sepal.Width
petal.w <- iris$Petal.Width
## create a data.frame
iris.df <- data.frame(species, sepal.l, sepal.w, petal.l, petal.w)
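# A minimal sketch of an alternative way to build the same table: select and
# rename the columns of iris in one step instead of extracting each vector.
# The names `cols` and `iris.alt` are hypothetical and not part of the lab.

```r
# Map the short names used in this lab to the original iris column names
cols <- c(species = "Species", sepal.l = "Sepal.Length", sepal.w = "Sepal.Width",
          petal.l = "Petal.Length", petal.w = "Petal.Width")

# Select the columns in that order, then apply the short names
iris.alt <- setNames(iris[, cols], names(cols))

# Inspect the result: 150 observations of 5 variables
str(iris.alt)
```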
### SECTION 1: DESCRIPTIVE STATISTICS #########
## 1.1. BASE R
# 1.1.1 Creating plots
plot(species, col="lightblue")

plot(sepal.l)

hist(sepal.l, col="red")

plot(species, sepal.l, col = "blue",
     xlab = "test1", ylab = "test2", main = "TEST3")

boxplot(sepal.l~species)
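# The numbers behind the boxplot can also be obtained without drawing it.
# A small sketch, assuming only base R; `med_by_species` and `bp` are
# hypothetical names introduced for illustration.

```r
# Median sepal length per species
med_by_species <- tapply(iris$Sepal.Length, iris$Species, median)
med_by_species

# boxplot() returns the statistics it would draw when plot = FALSE;
# each column of bp$stats holds lower whisker, lower hinge, median,
# upper hinge and upper whisker for one species
bp <- boxplot(Sepal.Length ~ Species, data = iris, plot = FALSE)
bp$stats
```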

# 1.1.2 Computing basic statistical measures
mean(petal.w)
## [1] 1.199333
median(petal.w)
## [1] 1.3
quantile(petal.w)
## 0% 25% 50% 75% 100%
## 0.1 0.3 1.3 1.8 2.5
summary(petal.w)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.100 0.300 1.300 1.199 1.800 2.500
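# The measures above describe location; measures of spread can be computed
# in the same way. A brief sketch using base R only (the name `pw` is a
# hypothetical shorthand for the petal width vector).

```r
pw <- iris$Petal.Width

sd(pw)     # standard deviation
var(pw)    # variance
IQR(pw)    # interquartile range: 3rd quartile minus 1st quartile
range(pw)  # minimum and maximum
```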
# 1.1.3 Showing basic statistical measures on a histogram
hist(petal.w)
abline(v = mean(petal.w), col = "red", lwd = 3)       # add a line for the mean
abline(v = median(petal.w), col = "blue", lwd = 3)    # add a line for the median
abline(v = quantile(petal.w), col = "black", lwd = 3) # add lines for min, quartiles and max
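# With several vertical lines on one histogram, a legend helps to tell them
# apart. A minimal sketch; the title, axis label and the name `centre` are
# assumptions made for this example.

```r
pw <- iris$Petal.Width

# Named vector so the same names can be reused in the legend
centre <- c(mean = mean(pw), median = median(pw))

hist(pw, main = "Petal width", xlab = "Petal width (cm)")
abline(v = centre, col = c("red", "blue"), lwd = 3)
legend("topright", legend = names(centre), col = c("red", "blue"), lwd = 3)
```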

### SECTION 2: DESCRIPTIVE STATISTICS - part 2 #########
## SPECIALIZED PACKAGES FOR DESCRIPTIVE STATISTICS
library(pastecs)
## Warning: package 'pastecs' was built under R version 4.2.3
#stat.desc(iris.df)
library(psych)
# whole table
describe(iris.df)
## vars n mean sd median trimmed mad min max range skew kurtosis
## species* 1 150 2.00 0.82 2.00 2.00 1.48 1.0 3.0 2.0 0.00 -1.52
## sepal.l 2 150 5.84 0.83 5.80 5.81 1.04 4.3 7.9 3.6 0.31 -0.61
## sepal.w 3 150 3.06 0.44 3.00 3.04 0.44 2.0 4.4 2.4 0.31 0.14
## petal.l 4 150 3.76 1.77 4.35 3.76 1.85 1.0 6.9 5.9 -0.27 -1.42
## petal.w 5 150 1.20 0.76 1.30 1.18 1.04 0.1 2.5 2.4 -0.10 -1.36
## se
## species* 0.07
## sepal.l 0.07
## sepal.w 0.04
## petal.l 0.14
## petal.w 0.06
# analysis by species
describeBy(iris.df, species)
##
## Descriptive statistics by group
## group: setosa
## vars n mean sd median trimmed mad min max range skew kurtosis se
## species* 1 50 1.00 0.00 1.0 1.00 0.00 1.0 1.0 0.0 NaN NaN 0.00
## sepal.l 2 50 5.01 0.35 5.0 5.00 0.30 4.3 5.8 1.5 0.11 -0.45 0.05
## sepal.w 3 50 3.43 0.38 3.4 3.42 0.37 2.3 4.4 2.1 0.04 0.60 0.05
## petal.l 4 50 1.46 0.17 1.5 1.46 0.15 1.0 1.9 0.9 0.10 0.65 0.02
## petal.w 5 50 0.25 0.11 0.2 0.24 0.00 0.1 0.6 0.5 1.18 1.26 0.01
## ------------------------------------------------------------
## group: versicolor
## vars n mean sd median trimmed mad min max range skew kurtosis
## species* 1 50 2.00 0.00 2.00 2.00 0.00 2.0 2.0 0.0 NaN NaN
## sepal.l 2 50 5.94 0.52 5.90 5.94 0.52 4.9 7.0 2.1 0.10 -0.69
## sepal.w 3 50 2.77 0.31 2.80 2.78 0.30 2.0 3.4 1.4 -0.34 -0.55
## petal.l 4 50 4.26 0.47 4.35 4.29 0.52 3.0 5.1 2.1 -0.57 -0.19
## petal.w 5 50 1.33 0.20 1.30 1.32 0.22 1.0 1.8 0.8 -0.03 -0.59
## se
## species* 0.00
## sepal.l 0.07
## sepal.w 0.04
## petal.l 0.07
## petal.w 0.03
## ------------------------------------------------------------
## group: virginica
## vars n mean sd median trimmed mad min max range skew kurtosis
## species* 1 50 3.00 0.00 3.00 3.00 0.00 3.0 3.0 0.0 NaN NaN
## sepal.l 2 50 6.59 0.64 6.50 6.57 0.59 4.9 7.9 3.0 0.11 -0.20
## sepal.w 3 50 2.97 0.32 3.00 2.96 0.30 2.2 3.8 1.6 0.34 0.38
## petal.l 4 50 5.55 0.55 5.55 5.51 0.67 4.5 6.9 2.4 0.52 -0.37
## petal.w 5 50 2.03 0.27 2.00 2.03 0.30 1.4 2.5 1.1 -0.12 -0.75
## se
## species* 0.00
## sepal.l 0.09
## sepal.w 0.05
## petal.l 0.08
## petal.w 0.04
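# The group means reported above can be cross-checked without extra packages.
# A sketch with base R's aggregate(); it works on iris directly, and `means`
# is a hypothetical name.

```r
# Mean of every numeric column, one row per species
means <- aggregate(. ~ Species, data = iris, FUN = mean)
means
```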
########## END OF SECTION 2
## 2.2 ggplots
# load the dataset
Orange
## Grouped Data: circumference ~ age | Tree
## Tree age circumference
## 1 1 118 30
## 2 1 484 58
## 3 1 664 87
## 4 1 1004 115
## 5 1 1231 120
## 6 1 1372 142
## 7 1 1582 145
## 8 2 118 33
## 9 2 484 69
## 10 2 664 111
## 11 2 1004 156
## 12 2 1231 172
## 13 2 1372 203
## 14 2 1582 203
## 15 3 118 30
## 16 3 484 51
## 17 3 664 75
## 18 3 1004 108
## 19 3 1231 115
## 20 3 1372 139
## 21 3 1582 140
## 22 4 118 32
## 23 4 484 62
## 24 4 664 112
## 25 4 1004 167
## 26 4 1231 179
## 27 4 1372 209
## 28 4 1582 214
## 29 5 118 30
## 30 5 484 49
## 31 5 664 81
## 32 5 1004 125
## 33 5 1231 142
## 34 5 1372 174
## 35 5 1582 177
# labels and summary
labels(Orange)
## [[1]]
## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
## [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
## [31] "31" "32" "33" "34" "35"
##
## [[2]]
## [1] "Tree" "age" "circumference"
summary(Orange)
## Tree age circumference
## 3:7 Min. : 118.0 Min. : 30.0
## 1:7 1st Qu.: 484.0 1st Qu.: 65.5
## 5:7 Median :1004.0 Median :115.0
## 2:7 Mean : 922.1 Mean :115.9
## 4:7 3rd Qu.:1372.0 3rd Qu.:161.5
## Max. :1582.0 Max. :214.0
range(Orange$Tree)
## [1] 3 4
## Levels: 3 < 1 < 5 < 2 < 4
# structure of the dataset
str(Orange)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 35 obs. of 3 variables:
## $ Tree : Ord.factor w/ 5 levels "3"<"1"<"5"<"2"<..: 2 2 2 2 2 2 2 4 4 4 ...
## $ age : num 118 484 664 1004 1231 ...
## $ circumference: num 30 58 87 115 120 142 145 33 69 111 ...
## - attr(*, "formula")=Class 'formula' language circumference ~ age | Tree
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "labels")=List of 2
## ..$ x: chr "Time since December 31, 1968"
## ..$ y: chr "Trunk circumference"
## - attr(*, "units")=List of 2
## ..$ x: chr "(days)"
## ..$ y: chr "(mm)"
# build the data.frame
tree <- as.factor(Orange$Tree)
age <- Orange$age
perim <- Orange$circumference
orange.tree <- data.frame(tree, age, perim)
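# Note: as.factor() returns an existing factor unchanged, so `tree` keeps the
# ordered levels 3 < 1 < 5 < 2 < 4 shown by str(Orange). If a plain factor in
# numeric order is preferred for legends, one possible sketch is the following
# (`tree_plain` is a hypothetical name, not used elsewhere in the lab).

```r
# Original level order of the ordered factor
levels(Orange$Tree)

# Rebuild as an unordered factor with levels in numeric order
tree_plain <- factor(as.character(Orange$Tree), levels = as.character(1:5))
levels(tree_plain)
is.ordered(tree_plain)
```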
# one variable
# First we create the corresponding plot with ggplot()
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.3
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
ggplot(data = orange.tree, aes(x = perim, fill = tree)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
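# The message above suggests choosing the bin width explicitly. A sketch of
# how that could look; the value binwidth = 20 (mm) is an assumption picked
# for illustration, and `orange.df` / `p_hist` are hypothetical names.

```r
library(ggplot2)

orange.df <- data.frame(tree = Orange$Tree, perim = Orange$circumference)

# Explicit bin width instead of the default 30 bins
p_hist <- ggplot(orange.df, aes(x = perim, fill = tree)) +
  geom_histogram(binwidth = 20, colour = "white")
p_hist
```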

ggplot(data = orange.tree, aes(x = age, y = perim)) +
  geom_point()
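# The scatter plot above pools all five trees. A possible variant colours the
# points by tree and joins each tree's measurements with a line. The axis
# labels follow the units reported by str(Orange); `orange.df` and `p_growth`
# are hypothetical names.

```r
library(ggplot2)

orange.df <- data.frame(tree = Orange$Tree,
                        age = Orange$age,
                        perim = Orange$circumference)

# One colour (and therefore one line) per tree
p_growth <- ggplot(orange.df, aes(x = age, y = perim, colour = tree)) +
  geom_point() +
  geom_line() +
  labs(x = "Age (days)", y = "Trunk circumference (mm)")
p_growth
```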

ggplot(data = orange.tree, aes(x = age, y = perim, fill = tree)) +
  geom_bar(stat = "identity", position = "dodge")

ggplot(data = orange.tree, aes(x = age, y = perim, fill = tree)) +
  geom_boxplot()
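# In the call above x = age is continuous, so the boxes are separated only by
# the fill variable. A more conventional layout, offered here as a sketch, puts
# the grouping factor on the x axis (`orange.df` and `p_box` are hypothetical
# names).

```r
library(ggplot2)

orange.df <- data.frame(tree = Orange$Tree, perim = Orange$circumference)

# One box per tree, summarising its seven circumference measurements
p_box <- ggplot(orange.df, aes(x = tree, y = perim, fill = tree)) +
  geom_boxplot()
p_box
```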

### SECTION 3: CORRELATION #########
## 3.1
cor(age,perim)
## [1] 0.9135189
cor.test(age,perim)
##
## Pearson's product-moment correlation
##
## data: age and perim
## t = 12.9, df = 33, p-value = 1.931e-14
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8342364 0.9557955
## sample estimates:
## cor
## 0.9135189
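# The t statistic reported by cor.test() follows from r and the sample size:
# t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom.
# A short sketch that reproduces it by hand (x, y, r, n, t_stat and p_val are
# hypothetical names).

```r
x <- Orange$age
y <- Orange$circumference

r <- cor(x, y)   # Pearson correlation coefficient
n <- length(x)   # number of observations

# Test statistic and two-sided p-value
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_val <- 2 * pt(-abs(t_stat), df = n - 2)

c(t = t_stat, df = n - 2, p = p_val)
```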
library(Hmisc)
## Warning: package 'Hmisc' was built under R version 4.2.3
##
## Attaching package: 'Hmisc'
## The following object is masked from 'package:psych':
##
## describe
## The following objects are masked from 'package:base':
##
## format.pval, units
rcorr(as.matrix(orange.tree))
## tree age perim
## tree 1.00 0.00 0.07
## age 0.00 1.00 0.91
## perim 0.07 0.91 1.00
##
## n= 35
##
##
## P
## tree age perim
## tree 1.000 0.699
## age 1.000 0.000
## perim 0.699 0.000
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.2.3
## corrplot 0.92 loaded
orange.tree2 <- orange.tree[,-1]
corrplot(cor(orange.tree2))

corrplot(cor(orange.tree2), method = "number")

corrplot.mixed(cor(orange.tree2))
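# corrplot() draws whatever matrix cor() returns, so it can help to look at
# that matrix first. A sketch with base R only; the Spearman variant is an
# optional addition and `orange.num` / `cor_mat` are hypothetical names.

```r
orange.num <- data.frame(age = Orange$age, perim = Orange$circumference)

# Pearson correlation matrix (the default method)
cor_mat <- cor(orange.num)
round(cor_mat, 2)

# Rank-based alternative
cor(orange.num, method = "spearman")
```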

##### END OF SECTION 3