[Data Structure]
str(data1)
'data.frame': 24866 obs. of 12 variables:
$ j2dt : POSIXct, format: "2016-09-01 09:26:02" "2016-09-01 09:26:12" "2016-09-01 09:26:23" "2016-09-01 09:26:33" ...
$ lat : num 35 35 35 35 35 ...
$ long : num 129 129 129 129 129 ...
$ pressure : int 0 0 0 0 0 0 0 0 0 0 ...
$ temp : num 25.3 25.3 25.3 25.3 25.3 ...
$ salinity : num 31.1 31.1 31.1 31.1 31.1 ...
$ density1 : num 20.3 20.3 20.3 20.3 20.3 ...
$ density2 : num 20.3 20.3 20.3 20.3 20.3 ...
$ soundv : num 1531 1531 1531 1531 1531 ...
$ flag : num 0 0 0 0 0 0 0 0 0 0 ...
$ temp_flag: logi NA NA NA NA NA NA ...
$ sal_flag : logi NA NA NA NA NA NA ...
[Data summary]
summary(data1)
 j2dt                          lat             long            pressure    temp            salinity        density1         density2
 Min.   :2016-09-01 09:26:02   Min.   :35.01   Min.   :129.0   Min.   :0   Min.   :22.10   Min.   :10.15   Min.   : 4.667   Min.   : 4.667
 1st Qu.:2016-09-02 02:42:05   1st Qu.:36.10   1st Qu.:129.5   1st Qu.:0   1st Qu.:24.72   1st Qu.:30.56   1st Qu.:20.009   1st Qu.:20.009
 Median :2016-09-02 19:58:08   Median :36.21   Median :129.6   Median :0   Median :24.95   Median :31.08   Median :20.328   Median :20.328
 Mean   :2016-09-02 19:58:08   Mean   :36.13   Mean   :129.6   Mean   :0   Mean   :24.95   Mean   :30.86   Mean   :20.228   Mean   :20.228
 3rd Qu.:2016-09-03 13:14:10   3rd Qu.:36.27   3rd Qu.:129.8   3rd Qu.:0   3rd Qu.:25.18   3rd Qu.:31.41   3rd Qu.:20.672   3rd Qu.:20.672
 Max.   :2016-09-04 06:30:13   Max.   :36.43   Max.   :130.0   Max.   :0   Max.   :30.48   Max.   :33.34   Max.   :21.954   Max.   :21.954
 soundv         flag        temp_flag       sal_flag
 Min.   :1507   Min.   :0   Mode :logical   Mode :logical
 1st Qu.:1529   1st Qu.:0   FALSE:24759     FALSE:24501
 Median :1530   Median :0   TRUE :76        TRUE :334
 Mean   :1530   Mean   :0   NA's :31        NA's :31
 3rd Qu.:1531   3rd Qu.:0
 Max.   :1543   Max.   :0
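Moving IQR test results (window size = 31: the center value at the 16th position plus 15 data points on each side, 31 data points in total). As a result, rows 1 to 15 and the rows from (last row - 15) onward receive no test result and are discarded.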
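The report does not show the M-IQR implementation itself, so the following is only a minimal sketch of a centered moving-IQR test with a 31-point window. The function name `moving_iqr_flag` and the fence multiplier `k = 1.5` (the usual Tukey convention) are assumptions, not taken from the report. In this sketch the 15 points at each end have no full window and stay `NA`; the report's own edge handling may differ slightly, since its summary lists 31 `NA` flags.

```r
# Hypothetical helper: flag points that fall outside the IQR fences of a
# centered moving window. k = 1.5 is an assumed multiplier.
moving_iqr_flag <- function(x, window = 31, k = 1.5) {
  stopifnot(window %% 2 == 1, length(x) >= window)
  half <- (window - 1) / 2            # 15 neighbours on each side for window = 31
  n <- length(x)
  flag <- rep(NA, n)                  # edge points without a full window stay NA
  for (i in (half + 1):(n - half)) {
    w <- x[(i - half):(i + half)]     # the 31 values centered on point i
    q <- quantile(w, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    fence <- k * (q[2] - q[1])
    flag[i] <- x[i] < q[1] - fence || x[i] > q[2] + fence
  }
  flag
}

# Toy series (made-up values): a repeating pattern with one spike at index 21
x <- rep(c(24.9, 25.0, 25.1), length.out = 41)
x[21] <- 30
flags <- moving_iqr_flag(x)
which(flags)          # index of the flagged point
sum(is.na(flags))     # number of untested edge points
```

A loop keeps the window logic explicit; for a series of about 25,000 rows it is still fast enough, though a rolling-window package could replace it.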
[Temperature’s M-IQR detection results]
cat("Temperature ->>", "Outliers(M-IQR) :", nrow(temp_Outliers), " , ", "Passed :", nrow(temp_Passed))
Temperature ->> Outliers(M-IQR) : 76 , Passed : 24759
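How `temp_Outliers` and `temp_Passed` are built is not shown in the report. Their row counts (76 and 24759) match the TRUE and FALSE counts of `temp_flag` in the summary, so one plausible construction is a split on that flag. The sketch below assumes this and uses a small made-up data frame, `toy`, in place of `data1`.

```r
# Toy stand-in for data1; the values are invented for illustration only
toy <- data.frame(
  temp      = c(25.3, 25.3, 30.5, 25.2, 25.1),
  temp_flag = c(NA, FALSE, TRUE, FALSE, NA)
)

# which() drops NA, so untested edge rows end up in neither subset
temp_Outliers <- toy[which(toy$temp_flag), ]
temp_Passed   <- toy[which(!toy$temp_flag), ]

cat("Outliers(M-IQR) :", nrow(temp_Outliers), " , ",
    "Passed :", nrow(temp_Passed), "\n")
```

Using `which()` rather than plain logical indexing avoids the all-`NA` rows that `toy[toy$temp_flag, ]` would otherwise return for the untested points.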
[Temperature’s visual check]
boxplot(data1$temp, col="lightgrey", horizontal = T, xlab="°C", ylab="", main="temperature")
hist(data1$temp, col="lightcyan", breaks = 100, probability = TRUE, xlab="°C")
[Temperature’s Dynamic Plotting]
p<-ggplot(data1, aes(x=j2dt, y=temp, col=temp_flag)) + geom_point(size=1) +
scale_color_manual(values=c("black", "red")) + ggtitle("temperature Plot")
ggplotly(p)
[Salinity’s M-IQR detection results]
cat("Salinity ->>", "Outliers(M-IQR) :", nrow(salinity_Outliers), " , ", "Passed :", nrow(salinity_Passed))
Salinity ->> Outliers(M-IQR) : 334 , Passed : 24501
[Salinity’s visual check]
boxplot(data1$salinity, col="lightgrey", horizontal = T, xlab="", ylab="", main="Salinity")
hist(data1$salinity, col="lightcyan", breaks = 100, probability = TRUE, xlab="")
[Salinity’s Dynamic Plotting]
p<-ggplot(data1, aes(x=j2dt, y=salinity, col=sal_flag)) + geom_point(size=1) +
scale_color_manual(values=c("black", "red")) + ggtitle("Salinity Plot")
ggplotly(p)