These data were collected as part of a timber survey around Anchorage. Each row represents measurements from an individual spruce tree (black spruce or white spruce). ‘Height’ is the height of the tree (in feet); ‘DBH’ is the diameter of the tree’s trunk (in inches); ‘Species’ is the species of the tree, either ‘P.mariana’ (black spruce) or ‘P.glauca’ (white spruce). The volume of a spruce tree (in cu. ft) can be estimated as: Volume = 0.65559 + 0.00191 × DBH^2 × Height

1.) Bring the data into R as a data frame.

2.) Look at both the top and bottom of the data frame to verify it was imported correctly.

What is the height of the tallest and shortest spruce trees in our sample?

What is the DBH of the five thickest spruce trees in our sample?

Calculate total volume for each tree as a variable in the data frame.

How many trees were sampled? – 796

How many trees of each species were sampled?

What is the mean volume of black spruce (P. mariana) and white spruce (P. glauca) in our sample? Which one is greater, and by how much?

Study Questions

Define what a vector and a data frame are in R and how they are different.

Briefly explain the purpose of each of the following: 1) R console 2) R script 3) R Markdown

What is the purpose of the head() and tail() functions? Why would we want to use them?

Grading will be based on the following:

50 pts Data questions

30 pts Study questions

20 pts Well-presented document (complete sentences, well organized, etc.)

setwd("C:/Users/omega/OneDrive/Documents/Applied Statistics/assignments/Data Files")

Read the Excel file

library(readxl) # Provides read_excel()
timber_data <- read_excel("timber_cruising.xlsx")

Check if data was imported correctly

head(timber_data) # View the first few rows
tail(timber_data) # View the last few rows
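Beyond head() and tail(), a quick structural check can help confirm that column names, types, and row counts look right after an import. The sketch below uses a small made-up data frame (`toy`, with invented values) in place of the real survey file so that it runs on its own; with the real data, the same calls would be applied to timber_data.

```r
# Hypothetical stand-in for the survey data (values invented for illustration)
toy <- data.frame(
  Height  = c(32, 45, 28, 51),
  DBH     = c(4.1, 6.3, 3.8, 7.0),
  Species = c("P.mariana", "P.glauca", "P.mariana", "P.glauca")
)

dim(toy)                 # Number of rows and columns
str(toy)                 # Column names and types
na_counts <- colSums(is.na(toy))  # Missing values per column
print(na_counts)
```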

Find the maximum (tallest tree)

max_height <- max(timber_data$Height, na.rm = TRUE)

Find the minimum (shortest tree)

min_height <- min(timber_data$Height, na.rm = TRUE)

Display the results

cat("The tallest tree is", max_height, "feet tall.\n")
cat("The shortest tree is", min_height, "feet tall.\n")
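As a possible shortcut, base R's range() returns the minimum and maximum in a single call. The sketch below uses a made-up vector (`heights`, hypothetical values) rather than the survey data, so it runs on its own.

```r
# Hypothetical tree heights in feet (invented for illustration), with one missing value
heights <- c(32, 45, 28, 51, NA)

# range() returns c(minimum, maximum); na.rm = TRUE ignores the NA
height_range <- range(heights, na.rm = TRUE)

shortest <- height_range[1]
tallest  <- height_range[2]
cat("Shortest:", shortest, "ft; tallest:", tallest, "ft\n")
```

With the real data, the same call would presumably be range(timber_data$Height, na.rm = TRUE).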

Find the five trees with the largest DBH

thickest_trees <- timber_data[order(-timber_data$DBH), ] # Sort in descending order

Select the top 5 values

top_5_dbh <- thickest_trees$DBH[1:5]

Display the results

cat("The DBH values of the five thickest trees (in inches) are:", top_5_dbh, "\n")
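If only the DBH values are needed (not the full rows), sorting the column directly is a slightly shorter route. This is a sketch on a made-up vector (`dbh`, hypothetical values), not the survey data.

```r
# Hypothetical DBH values in inches (invented for illustration)
dbh <- c(4.1, 6.3, 3.8, 7.0, 5.5, 9.2, 2.9)

# Sort from largest to smallest, then keep the first five values
top_5 <- head(sort(dbh, decreasing = TRUE), 5)
print(top_5)
```

Note that sort() drops missing values by default, whereas order() places them last, so the two approaches can differ when a column contains NA.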

Calculate volume for each tree

timber_data$Volume <- 0.65559 + 0.00191 * (timber_data$DBH^2) * timber_data$Height

View the first few rows to confirm the new column was added

head(timber_data)
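To sanity-check the volume formula, it can help to wrap it in a small function and evaluate it by hand for one tree. The function name `spruce_volume` and the example tree (DBH of 10 inches, height of 50 feet) are illustrative assumptions, not part of the assignment data.

```r
# Hypothetical helper: estimated volume (cu ft) from DBH (inches) and height (feet)
spruce_volume <- function(dbh, height) {
  0.65559 + 0.00191 * dbh^2 * height
}

# Worked example for an invented tree:
# 0.65559 + 0.00191 * 10^2 * 50 = 0.65559 + 9.55 = 10.20559
example_volume <- spruce_volume(dbh = 10, height = 50)
print(example_volume)

# The function is vectorised, so it also works on whole columns
volumes <- spruce_volume(c(10, 5), c(50, 20))
```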

Count the total number of trees

num_trees <- nrow(timber_data)

Display the result

cat("Total number of trees sampled:", num_trees, "\n")

Count how many trees of each species were sampled

species_counts <- table(timber_data$Species)

Display the result

print(species_counts)

Calculate the mean volume for black spruce (P. mariana)

mean_black_spruce <- mean(timber_data$Volume[timber_data$Species == "P.mariana"], na.rm = TRUE)

Calculate the mean volume for white spruce (P. glauca)

mean_white_spruce <- mean(timber_data$Volume[timber_data$Species == "P.glauca"], na.rm = TRUE)

Find the difference between them

volume_difference <- abs(mean_white_spruce - mean_black_spruce)

Display the results

cat("Mean volume of Black Spruce (P. mariana):", mean_black_spruce, "cu ft\n")
cat("Mean volume of White Spruce (P. glauca):", mean_white_spruce, "cu ft\n")
greater_species <- if (mean_white_spruce > mean_black_spruce) "White Spruce" else "Black Spruce"
cat(greater_species, "has the greater mean volume, by", volume_difference, "cu ft.\n")
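The per-species means can also be computed in one step with tapply(), which applies a function to each group. The sketch below uses a made-up data frame (`toy`, invented volumes) so that the result is easy to verify by hand.

```r
# Hypothetical volumes in cu ft (invented for illustration)
toy <- data.frame(
  Species = c("P.mariana", "P.glauca", "P.mariana", "P.glauca"),
  Volume  = c(2, 10, 4, 14)
)

# Mean volume within each species group
mean_by_species <- tapply(toy$Volume, toy$Species, mean, na.rm = TRUE)
print(mean_by_species)

# Signed difference: positive means white spruce (P.glauca) is larger on average
diff_cuft <- unname(mean_by_species["P.glauca"] - mean_by_species["P.mariana"])
print(diff_cuft)
```

Keeping the difference signed, rather than taking abs(), makes it clear which species has the larger mean.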

Study Questions Answered

1.) A vector in R is a one-dimensional data structure that holds elements of the same type (e.g., numeric, character, logical). Example:

my_vector <- c(1, 2, 3, 4, 5)
print(my_vector)

A data frame is a two-dimensional table-like structure where each column can hold different data types. Example:

my_df <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"))
print(my_df)
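The difference between the two structures shows up when types are mixed: a vector coerces everything to one type, while a data frame keeps a separate type per column. A minimal sketch follows; the object names `mixed_vector` and `mixed_df` are illustrative.

```r
# Mixing types in a vector coerces every element to a common type (here, character)
mixed_vector <- c(1, "two", TRUE)
vector_class <- class(mixed_vector)
print(vector_class)

# A data frame keeps one type per column
mixed_df <- data.frame(num = 1, chr = "two", lgl = TRUE, stringsAsFactors = FALSE)
column_classes <- sapply(mixed_df, class)
print(column_classes)

# Each column of a data frame is itself a vector
is.vector(mixed_df$num)
```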

2.) R Console: The interactive window where you can run individual commands and see immediate output.

R Script: A file that saves a series of R commands for reproducibility.

R Markdown: A document format that combines code, text, and results for reports and presentations.

3.) The head() function shows the first few rows of a dataset, while tail() shows the last few rows. These are useful for verifying data after importing it.
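Both functions show six rows by default and accept an `n` argument to change that. The sketch below uses a made-up ten-row data frame (`x`, hypothetical) to illustrate.

```r
# Hypothetical ten-row data frame (invented for illustration)
x <- data.frame(id = 1:10)

first_three <- head(x, n = 3)  # Rows 1 to 3
last_two    <- tail(x, n = 2)  # Rows 9 and 10

print(first_three)
print(last_two)
```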