These data were collected as part of a timber survey around
Anchorage. Each row represents measurements from an individual spruce
tree (black spruce or white spruce). ‘Height’ is the height of the tree
(in feet); ‘DBH’ is the diameter of the tree’s trunk (in inches);
‘Species’ is the species of the tree, either ‘P.mariana’ (black spruce)
or ‘P.glauca’ (white spruce). The volume of a spruce tree (in cu. ft)
can be estimated as: Volume = 0.65559 + 0.00191 (DBH^2)(Height)
1.) Bring the data into R as a data frame.
2.) Look at both the top and bottom of the data frame to verify it
was imported correctly.
3.) What is the height of the tallest and shortest spruce trees in our
sample?
4.) What is the DBH of the five thickest spruce trees in our
sample?
5.) Calculate total volume for each tree as a variable in the data
frame.
6.) How many trees were sampled? – 796
7.) How many trees of each species were sampled?
8.) What is the mean volume of black spruce (P. mariana) and white
spruce (P. glauca) in our sample? Which one is greater, and by how
much?
Study Questions
Define what a vector and a data frame are in R and how they are
different.
Briefly explain the purpose of each of the following: 1) R console
2) R script 3) R Markdown
What is the purpose of the head() and tail() functions? Why would
we want to use them?
Grading will be based on the following:
50 pts Data questions
30 pts Study questions
20 pts Well-presented document (complete sentences, well organized,
etc.)
Load the readxl package (it provides read_excel) and set the working directory
library(readxl)
setwd("C:/Users/omega/OneDrive/Documents/Applied Statistics/assignments/Data Files")
Read the Excel file
timber_data <- read_excel("timber_cruising.xlsx")
Check if data was imported correctly
head(timber_data) # View the first few rows
tail(timber_data) # View the last few rows
Find the maximum (tallest tree)
max_height <- max(timber_data$Height, na.rm = TRUE)
Find the minimum (shortest tree)
min_height <- min(timber_data$Height, na.rm = TRUE)
Display the results
cat("The tallest tree is", max_height, "feet tall.\n")
cat("The shortest tree is", min_height, "feet tall.\n")
Find the five trees with the largest DBH
thickest_trees <- timber_data[order(-timber_data$DBH), ] # Sort in descending order
Select the top 5 values
top_5_dbh <- thickest_trees$DBH[1:5]
Display the results
cat("The DBH of the five thickest trees (in inches) are:", top_5_dbh, "\n")
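The same "largest five" selection can also be written without reordering the whole data frame. The sketch below uses a made-up vector of DBH values (not the survey data) to show the idea with sort() and head().

```r
# Made-up DBH values in inches, for illustration only.
dbh <- c(4.2, 9.1, 6.5, 12.3, 7.8, 10.4, 5.0)

# Sort from largest to smallest, then keep the first five values.
top_5 <- head(sort(dbh, decreasing = TRUE), 5)
print(top_5)
```

Sorting only the DBH column is enough when just the diameters are needed; ordering the full data frame, as done above, is the better choice when the other columns of those trees matter too.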
Calculate volume for each tree
timber_data$Volume <- 0.65559 + 0.00191 * (timber_data$DBH^2) * timber_data$Height
View the first few rows to confirm the new column was added
head(timber_data)
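As a quick sanity check on the volume formula, it can help to work one tree by hand. The sketch below wraps the formula in a helper named spruce_volume (a hypothetical name, not part of the assignment) and applies it to a made-up tree with a DBH of 10 inches and a height of 50 feet.

```r
# Hypothetical helper wrapping the volume formula from the assignment.
# dbh is in inches, height is in feet, and the result is in cubic feet.
spruce_volume <- function(dbh, height) {
  0.65559 + 0.00191 * dbh^2 * height
}

# Made-up example tree: 0.65559 + 0.00191 * 100 * 50 = 10.20559 cu ft.
example_volume <- spruce_volume(10, 50)
print(example_volume)
```

Because the arithmetic is vectorized, the same function would also accept whole columns such as timber_data$DBH and timber_data$Height.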
Count the total number of trees
num_trees <- nrow(timber_data)
Display the result
cat("Total number of trees sampled:", num_trees, "\n")
Count how many trees of each species were sampled
species_counts <- table(timber_data$Species)
Display the result
print(species_counts)
Calculate the mean volume for black spruce (P. mariana)
mean_black_spruce <- mean(timber_data$Volume[timber_data$Species == "P.mariana"], na.rm = TRUE)
Calculate the mean volume for white spruce (P. glauca)
mean_white_spruce <- mean(timber_data$Volume[timber_data$Species == "P.glauca"], na.rm = TRUE)
Find the difference between them
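volume_difference <- abs(mean_white_spruce - mean_black_spruce)
Work out which species has the greater mean volume
white_is_larger <- mean_white_spruce > mean_black_spruce
larger_species <- if (white_is_larger) "White Spruce" else "Black Spruce"
smaller_species <- if (white_is_larger) "Black Spruce" else "White Spruce"
Display the results
cat("Mean volume of Black Spruce (P. mariana):", mean_black_spruce, "cu ft\n")
cat("Mean volume of White Spruce (P. glauca):", mean_white_spruce, "cu ft\n")
cat(larger_species, "has", volume_difference, "cu ft more mean volume than", paste0(smaller_species, ".\n"))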
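The per-species means can also be computed in one step with tapply(), which avoids repeating the subsetting for each species. This is a sketch on a small made-up data frame (the volumes are invented, not survey results); the column names Species and Volume are taken from the assignment.

```r
# Toy data with invented volumes, for illustration only.
toy <- data.frame(
  Species = c("P.mariana", "P.mariana", "P.glauca", "P.glauca"),
  Volume = c(2, 4, 10, 14),
  stringsAsFactors = FALSE
)

# Mean volume within each species.
mean_by_species <- tapply(toy$Volume, toy$Species, mean)
print(mean_by_species)

# Name of the species with the greater mean, and the size of the gap.
larger <- names(mean_by_species)[which.max(mean_by_species)]
difference <- max(mean_by_species) - min(mean_by_species)
cat(larger, "has the greater mean volume, by", difference, "cu ft\n")
```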
Study Questions Answered
1.) A vector in R is a one-dimensional data structure that holds
elements of the same type (e.g., numeric, character, logical).
Example:
my_vector <- c(1, 2, 3, 4, 5)
print(my_vector)
A data frame is a two-dimensional table-like structure where each
column can hold different data types. Example:
my_df <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"))
print(my_df)
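The difference between the two structures shows up when types are mixed. In the sketch below (all names and values are made up), a vector coerces its elements to one common type, while a data frame keeps a separate type for each column.

```r
# Mixing types in a vector: everything is coerced to character.
mixed_vector <- c(1, "two", TRUE)
vector_type <- class(mixed_vector)
print(vector_type)

# A data frame keeps one type per column.
toy_df <- data.frame(
  id = c(1, 2),
  label = c("a", "b"),
  stringsAsFactors = FALSE
)
column_types <- sapply(toy_df, class)
print(column_types)
```

Setting stringsAsFactors = FALSE keeps the label column as character on R versions older than 4.0, where text columns were converted to factors by default.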
2.) R Console: The interactive window where you can run individual
commands and see immediate output.
R Script: A file that saves a series of R commands for
reproducibility.
R Markdown: A document format that combines code, text, and results
for reports and presentations.
3.) The head() function shows the first rows of a dataset (six by
default), while tail() shows the last rows. These are useful for
verifying that data was read in completely and correctly after
importing it.
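Both functions take an n argument that controls how many rows are returned. A minimal sketch on a made-up ten-row data frame:

```r
# Toy data frame with ten rows, for illustration only.
toy <- data.frame(row_id = 1:10, value = (1:10)^2)

# First three rows and last two rows.
first_three <- head(toy, n = 3)
last_two <- tail(toy, n = 2)

print(first_three)
print(last_two)
```

Checking the last rows as well as the first helps catch an import that stopped early or picked up stray footer rows from the spreadsheet.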