Download the ‘aula_organização_dados’ folder from this Google Drive link.
Download the whole folder: on the page the link opens, right-click the ‘aula_organização_dados’ directory and select ‘Download’. The zip file should land in your Downloads folder. Unzip it and place the ‘aula_organização_dados’ folder anywhere on your computer.
3. Activate the project and open the QMD script
In RStudio: File > Open Project, navigate to your ‘aula_organização_dados’ folder and double-click the file ‘Biometria_2024.Rproj’. Then File > Open File and double-click the file ‘exerc_organização_dados_no_R.qmd’.
The paths inside the qmd script are now relative to this folder. The folders above the project folder should not be included in the paths.
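A minimal sketch of what a project-relative path looks like, assuming the project is open and the data sit in the ‘tables’ subfolder used later in this exercise. It only builds the path and checks whether R can see the file from the current working directory.

```r
# Build a path relative to the project folder (the working directory
# once the .Rproj file is open)
rel_path <- file.path("tables", "dat_example.csv")
print(rel_path)

# Check where R is looking and whether the file is visible from there
print(getwd())
print(file.exists(rel_path))
```

If `file.exists()` returns FALSE, the project is probably not active or the folder was unzipped somewhere else.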
Exercise: Read collected data; classify leaves; location maps
1. Install and load packages
Run this first code block using ‘Run current chunk’.
# Vector with the package names
packages <- c("dplyr", "randomForest", "caret", "sf", "terra", "tidyterra", "tidyverse", "viridis", "ggplot2")

# Check whether each package is installed and, if not, install it
for (item in packages) {
  if (!item %in% installed.packages()[, "Package"]) {
    install.packages(item)
  }
  # Load each package (each item in packages) into memory
  library(item, character.only = TRUE)
}
Warning: package 'dplyr' was built under R version 4.3.2
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Warning: package 'randomForest' was built under R version 4.3.2
randomForest 4.7-1.1
Type rfNews() to see new features/changes/bug fixes.
Attaching package: 'randomForest'
The following object is masked from 'package:dplyr':
combine
Loading required package: ggplot2
Warning: package 'ggplot2' was built under R version 4.3.2
Attaching package: 'ggplot2'
The following object is masked from 'package:randomForest':
margin
Loading required package: lattice
Warning: package 'sf' was built under R version 4.3.2
Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
Warning: package 'terra' was built under R version 4.3.2
terra 1.7.65
Warning: package 'tidyterra' was built under R version 4.3.3
Attaching package: 'tidyterra'
The following object is masked from 'package:stats':
filter
Warning: package 'readr' was built under R version 4.3.2
Warning: package 'purrr' was built under R version 4.3.2
Warning: package 'stringr' was built under R version 4.3.2
Warning: package 'lubridate' was built under R version 4.3.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ randomForest::combine() masks dplyr::combine()
✖ tidyr::extract() masks terra::extract()
✖ tidyterra::filter() masks dplyr::filter(), stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ purrr::lift() masks caret::lift()
✖ ggplot2::margin() masks randomForest::margin()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning: package 'viridis' was built under R version 4.3.2
Loading required package: viridisLite
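The conflict list above means that a bare call such as `filter()` resolves to whichever package was loaded last. One way to stay unambiguous is to prefix the function with its package. The sketch below uses only base R's `stats::filter()` on a made-up vector; for row filtering of a data frame the equivalent would be `dplyr::filter()`.

```r
# A short, made-up numeric series
x <- c(1, 2, 3, 4, 5)

# Explicit namespace: always calls the moving-average filter from stats,
# even when dplyr or tidyterra are loaded and mask filter()
ma <- stats::filter(x, rep(1 / 3, 3))
print(ma)
```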
2. Import the collected data (CSV) into a data frame
# Load the dataset from a specific CSV file
# Dataset with 8 collected variables and 30 replicates per species (3 species), totaling 90 observations
dat <- read.csv("./tables/dat_example.csv", header = TRUE, sep = ",")
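To see what `read.csv()` returns without depending on the course file, here is a small sketch that writes a hypothetical three-row CSV to a temporary file and reads it back with the same arguments. The values are invented for illustration; only the column names are borrowed from the dataset.

```r
# Hypothetical miniature dataset written to a temporary CSV file
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("tree_class,altura_cm,dap",
             "1,30,50",
             "2,26,15",
             "3,35,30"), tmp_csv)

# Read it back the same way as the course dataset
mini <- read.csv(tmp_csv, header = TRUE, sep = ",")

# Confirm the dimensions and the column types
print(dim(mini))
str(mini)
```

Note that `tree_class` is read as an integer column, which is why it is converted to a factor later on.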
3. Examine the structure of the data and define the classes
# Examine summary statistics of the numeric data
summary(dat)
tree_class  altura_cm      largura_cm     massa_fresca_g  area_foliar_cm2
Min.   :1   Min.   :18.00  Min.   :10.00  Min.   : 95.0   Min.   :190.0
1st Qu.:1   1st Qu.:26.00  1st Qu.:15.00  1st Qu.:130.0   1st Qu.:214.2
Median :2   Median :30.00  Median :17.00  Median :140.0   Median :307.5
Mean   :2   Mean   :30.56  Mean   :16.68  Mean   :153.4   Mean   :308.4
3rd Qu.:3   3rd Qu.:35.00  3rd Qu.:19.00  3rd Qu.:193.8   3rd Qu.:388.8
Max.   :3   Max.   :43.00  Max.   :23.00  Max.   :215.0   Max.   :480.0

esbeltez       y_coord_geographic  x_coord_geographic  dap
Min.   :1.120  Min.   :-3.092      Min.   :-59.99      Min.   :15.00
1st Qu.:1.820  1st Qu.:-3.092      1st Qu.:-59.99      1st Qu.:15.00
Median :1.870  Median :-3.092      Median :-59.99      Median :30.00
Mean   :1.849  Mean   :-3.092      Mean   :-59.99      Mean   :31.67
3rd Qu.:1.897  3rd Qu.:-3.092      3rd Qu.:-59.99      3rd Qu.:50.00
Max.   :2.700  Max.   :-3.092      Max.   :-59.99      Max.   :50.00
# Convert the "tree_class" column to a factor (classes)
dat$tree_class <- factor(dat$tree_class)
# Examine the structure of the data again
str(dat)
# Remove the two coordinate columns
dat_subset <- subset(dat, select = -c(y_coord_geographic, x_coord_geographic))
# Check the class counts
class_counts_dat <- table(dat_subset$tree_class)
class_counts_dat
1 2 3
30 30 30
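The conversion of `tree_class` to a factor matters because integer codes would otherwise be treated as quantities. A small sketch with made-up labels shows the difference:

```r
# Hypothetical class labels stored as numbers, as read from a CSV
tree_class <- c(1, 2, 3, 1, 2, 3)

# The mean of numeric-coded labels can be computed but is meaningless
print(mean(tree_class))

# As a factor, the values are treated as categories
tree_class_f <- factor(tree_class)
print(levels(tree_class_f))
print(table(tree_class_f))
```

With a factor response, a modelling function such as `train()` treats the task as classification rather than regression.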
Random Forest method
10-fold cross-validation
4. Create the model (rf) using training and test data (10-fold CV)
set.seed(123)
# Define the cross-validation setup with 10 folds
fitcontrol_10fold <- trainControl(method = "cv", number = 10, search = "random", savePredictions = TRUE)
# Train the Random Forest model with 10-fold cross-validation
mod_10cv <- train(factor(tree_class) ~ .,
                  data = dat_subset,
                  method = "rf",
                  trControl = fitcontrol_10fold,
                  tuneLength = 2,
                  ntree = 10,
                  maxnodes = 30)  # Maximum number of terminal nodes per tree
# Check the best parameters
mod_10cv$bestTune
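To make the resampling scheme concrete, the following base R sketch splits 90 observations into k = 10 folds, matching `number = 10` in the code above. It is only an illustration of the idea; caret builds its own folds internally and may, for example, balance the classes across folds.

```r
set.seed(123)
n_obs <- 90   # number of observations, as in the dataset
k <- 10       # number of folds

# Randomly assign each observation to one of k folds of equal size
fold_id <- sample(rep(seq_len(k), length.out = n_obs))

# For each fold, the held-out rows form the test set and the rest train
for (f in seq_len(k)) {
  test_idx <- which(fold_id == f)
  train_idx <- which(fold_id != f)
  cat("Fold", f, ": train =", length(train_idx),
      "test =", length(test_idx), "\n")
}
```

Each observation is used for testing exactly once and for training in the other nine rounds.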
# Combine the rasters
mdt_combined <- terra::merge(mdt_1, mdt_2)
print(mdt_combined)
class : SpatRaster
dimensions : 3601, 7201, 1 (nrow, ncol, nlyr)
resolution : 0.0002777778, 0.0002777778 (x, y)
extent : -61.00014, -58.99986, -4.000139, -2.999861 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source(s) : memory
varname : manaus_s1
name : manaus_s1
min value : -44
max value : 129
# Plot the combined raster
plot(mdt_combined, main = "MDT Combined")
7. Crop the raster to the study area
# Visualize the digital terrain model (MDT) with the viridis color scale
plot(mdt_combined, col = viridis(256), main = "MDT Combined")
# Import the study area vector
study_area <- sf::st_read("./shp/study_local.shp")
Reading layer `study_local' from data source
`C:\R_projects\Biometria_2024\Biometria_2024\shp\study_local.shp'
using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -59.99672 ymin: -3.093296 xmax: -59.99209 ymax: -3.090149
Geodetic CRS: WGS 84
# Create a buffer around the study area
study_area_buffer <- sf::st_buffer(study_area, dist = 200)
# Transform the buffer to the same CRS as mdt_combined
study_area_buffer <- sf::st_transform(study_area_buffer, terra::crs(mdt_combined))
# Crop mdt_combined using the study area buffer
mdt_combined_cropped <- terra::crop(mdt_combined, study_area_buffer)
plot(mdt_combined_cropped, col = viridis(256), main = "Study area")
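Because the study area is in a geographic CRS (WGS 84) and the load log reports that `sf_use_s2()` is TRUE, sf should interpret `dist = 200` as meters rather than degrees. As a rough sanity check, the sketch below converts 200 m to degrees near the study area, assuming a spherical Earth with about 111,320 m per degree of latitude (an approximation, not a value from the source).

```r
buffer_m <- 200          # buffer distance used above, in meters
lat_deg <- -3.092        # approximate latitude of the study area

# Approximate length of one degree, assuming a spherical Earth
m_per_deg_lat <- 111320
m_per_deg_lon <- 111320 * cos(lat_deg * pi / 180)

# Buffer size expressed in degrees of latitude and longitude
buffer_deg_lat <- buffer_m / m_per_deg_lat
buffer_deg_lon <- buffer_m / m_per_deg_lon
print(c(lat = buffer_deg_lat, lon = buffer_deg_lon))
```

The result is on the order of 0.002 degrees, which is consistent with the cropped raster extending only slightly beyond the study area's bounding box.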
8. Enter the tree coordinates (in decimal degrees)
# Select the tree label, the DBH (dap), and the coordinates
coords_geog <- dat %>%
  select(tree_class, x_coord_geographic, y_coord_geographic, dap)
# Keep only one coordinate row per tree
unique_coords <- coords_geog %>%
  distinct(x_coord_geographic, y_coord_geographic, .keep_all = TRUE)
# Remove NAs
unique_coords <- na.omit(unique_coords)
# Check the data frame
head(unique_coords)
# Create a vector (sf) object with the coordinates and the desired attributes
unique_coords_sf <- unique_coords %>%
  st_as_sf(coords = c("x_coord_geographic", "y_coord_geographic"), crs = 4326) %>%
  mutate(dap = unique_coords$dap, tree_class = unique_coords$tree_class)
# Scale the DBH (dap) values
(unique_coords_sf$dap) / 10
[1] 5.0 1.5 3.0
9. Plot the final location map
# Plot the study area (the scale bar and north arrow require the ggspatial package to be installed)
p_1 <- ggplot() +
  geom_spatraster(data = mdt_combined_cropped) +
  scale_fill_viridis() +
  labs(title = "Study area", fill = "Elevation") +
  theme_minimal() +
  geom_sf(data = unique_coords_sf, color = "white", size = 2) +
  ggspatial::annotation_scale(location = "bl", pad_x = unit(1, "cm"), pad_y = unit(1, "cm")) +
  ggspatial::annotation_north_arrow(
    location = "tr",
    height = unit(1.0, "cm"),  # shrink the arrow
    width = unit(1.0, "cm"),   # shrink the arrow
    pad_x = unit(1, "cm"), pad_y = unit(1, "cm")
  )
p_1
# Plot the study area with point size proportional to DBH (dap)
cores_tree_class <- c("1" = "red", "2" = "blue", "3" = "green")
p_2 <- ggplot() +
  geom_spatraster(data = mdt_combined_cropped) +
  scale_fill_gradientn(colors = c("black", "white")) +
  labs(title = "Study area", fill = "Elevation") +
  theme_minimal() +
  geom_sf(data = unique_coords_sf, aes(color = as.factor(tree_class)),
          size = (unique_coords_sf$dap) / 10) +
  scale_color_manual(values = cores_tree_class, name = "Tree Class") +
  ggspatial::annotation_scale(location = "bl", pad_x = unit(1, "cm"), pad_y = unit(1, "cm")) +
  ggspatial::annotation_north_arrow(
    location = "tr",
    height = unit(1.0, "cm"),  # shrink the arrow
    width = unit(1.0, "cm"),   # shrink the arrow
    pad_x = unit(1, "cm"), pad_y = unit(1, "cm")
  )
p_2
10. Exercise to hand in
Write a metadata record for the collected data: define the data types, the units, and all other information the metadata needs.
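One possible starting point is a data dictionary stored as a data frame, with one row per variable. The sketch below uses the column names from the dataset; the units for height, width, mass, and leaf area follow from the column names, while the entries marked "assumed" are guesses to be confirmed against the field protocol.

```r
# Data dictionary sketch: one row per variable in the dataset
metadata <- data.frame(
  variable = c("tree_class", "altura_cm", "largura_cm", "massa_fresca_g",
               "area_foliar_cm2", "esbeltez", "y_coord_geographic",
               "x_coord_geographic", "dap"),
  type = c("factor", rep("numeric", 8)),
  unit = c("none (class label)", "cm", "cm", "g", "cm2",
           "dimensionless ratio (assumed)", "decimal degrees",
           "decimal degrees", "cm (assumed)"),
  stringsAsFactors = FALSE
)
print(metadata)
```

Further columns, such as a plain-language description, the collection method, or the valid range of each variable, can be added in the same way.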