For example, to add striped lines (alternating row colors) to your table and highlight the hovered row, you can pass bootstrap_options = c("striped", "hover") to kable_styling().
The condensed option can also be handy when you don’t want your table to take up too much space. It gives the rows a slightly shorter height.
For small tables with only a few columns, a page-wide table looks awful. To avoid this, you can specify in kable_styling() whether the table should have full_width or not. By default, full_width is TRUE for HTML tables (note that for LaTeX the default is FALSE, since I don’t want to change the “common” look unless you specify it).
What is the difference between the tidy() and kable() functions for table styling?
What I am noticing so far is that tidy() works best for matrices and vectors. For example, if you used the function as.matrix() to convert your data into a matrix and then do tidy(
On the other hand, kable() seems to be much more flexible when it comes to its arguments. It can take in matrices, data frames, and vectors.
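# Assumed setup (the original does not show which packages were loaded)
library(tidyverse)   # read_csv(), select(), filter(), %>%
library(knitr)       # kable()
library(kableExtra)  # kable_styling()
library(broom)       # tidy()
morphology_data <- read_csv("mate_trials_summer_2019.csv")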
Parsed with column specification:
cols(
ID_num = col_double(),
TgroupID = col_character(),
GgroupID = col_character(),
sex = col_character(),
beak = col_double(),
thorax = col_double(),
wing = col_double(),
body = col_double(),
w_morph = col_character(),
recorder = col_character(),
computer = col_character(),
date_recorded_by_hand = col_character(),
data_recorded_on_excel = col_character(),
notes = col_character()
)
# Using tidy() to display the table
morph_matrix <- as.matrix(morphology_data)
tidy(head(morph_matrix))
Warning: 'tidy.matrix' is deprecated.
See help("Deprecated")
# Using kable() and kable_styling() to display the table
kable(head(morphology_data)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
ID_num | TgroupID | GgroupID | sex | beak | thorax | wing | body | w_morph | recorder | computer | date_recorded_by_hand | data_recorded_on_excel | notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
475 | NA | T1 | F | 6.29 | 3.69 | 7.81 | 11.22 | S | A | yes | 09.11.19 | 09.11.19 | NA |
268 | NA | T2 | F | 8.44 | 3.73 | 10.20 | 14.00 | L | A | yes | 09.11.19 | 09.11.19 | NA |
261 | NA | T3 | F | 8.44 | 3.68 | 9.57 | 13.21 | S | A | yes | 09.11.19 | 09.11.19 | too big for microscope |
261 | NA | T3 | F | 8.55 | 3.56 | 9.49 | 13.13 | S | A | no | 09.17.19 | 09.17.19 | NA |
284 | NA | T4 | F | 8.42 | 3.83 | 9.54 | 13.15 | L | A | yes | 09.11.19 | 09.11.19 | NA |
327 | NA | T5 | F | 8.82 | 3.79 | 10.20 | 13.90 | L | A | yes | 09.11.19 | 09.11.19 | NA |
The differences between vectors and lists in R.
list1 <- c(1, 2, 3)    # this is a vector NOT a list
list2 <- list(1, 2, 3) # this is a list NOT a vector
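The distinction can be checked directly in the console. The following is a minimal sketch using only base R; the values and variable names are made up for illustration.

```r
# Hypothetical values, chosen only for illustration
v <- c(1, 2, 3)           # atomic vector: every element has the same type
l <- list(1, "a", TRUE)   # list: elements may have different types

class(v)        # "numeric"
class(l)        # "list"
is.list(v)      # FALSE
is.list(l)      # TRUE

# c() coerces mixed inputs to one common type
mixed <- c(1, "a", TRUE)
class(mixed)    # "character"

# Arithmetic is vectorized over atomic vectors
doubled <- v * 2
doubled         # 2 4 6

# Single brackets on a list return a list; double brackets return the element
class(l[1])     # "list"
class(l[[1]])   # "numeric"

# unlist() flattens a list into an atomic vector
flat <- unlist(list(1, 2, 3))
is.list(flat)   # FALSE
```

Because c() forces a single type, a list is the better container whenever the elements are of different kinds.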
When using select(), there is no need to pass the column names you want to pull out of your data frame as a vector of strings. You can simply list the bare column names inside the parentheses.
‘$’ extracts a specific column from a specific data frame by name. For example, female_data$beak returns the beak column of female_data as a vector.
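To see the two notations side by side, here is a small sketch on a toy data frame. The data frame bugs and its values are hypothetical, and the example assumes the dplyr package is installed.

```r
library(dplyr)

# Toy data frame (hypothetical values, not the real measurements)
bugs <- data.frame(
  sex  = c("F", "M", "F"),
  beak = c(8.4, 6.0, 8.8),
  wing = c(9.5, 7.1, 10.2)
)

# select() takes bare column names and returns a data frame
beak_df <- select(bugs, sex, beak)
is.data.frame(beak_df)   # TRUE
names(beak_df)           # "sex" "beak"

# $ extracts a single column as a plain vector
beak_vec <- bugs$beak
is.numeric(beak_vec)     # TRUE
mean(beak_vec)
```

select() is the natural choice when the result feeds into another data frame operation, while $ is convenient for passing one column to a function such as mean().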
female_data <- morphology_data %>%
  select(sex, beak) %>%
  filter(sex == "F")
kable(head(female_data)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, position = "left")
sex | beak |
---|---|
F | 6.29 |
F | 8.44 |
F | 8.44 |
F | 8.55 |
F | 8.42 |
F | 8.82 |
f_mean <- mean(female_data$beak)
f_sd <- sd(female_data$beak)
f_max <- max(female_data$beak)
f_min <- min(female_data$beak)
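# Get rid of rows with NA values.
# is.na() tests for missing values directly; comparing against the string "NA"
# only works by accident, because filter() drops rows where the test is NA.
male_data <- morphology_data %>%
  select(sex, beak) %>%
  filter(sex == "M", !is.na(beak))
kable(head(male_data)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, position = "left")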
sex | beak |
---|---|
M | 6.00 |
M | 6.40 |
M | 5.82 |
M | 6.35 |
M | 5.45 |
M | 5.97 |
m_mean <- mean(male_data$beak)
m_sd <- sd(male_data$beak)
m_max <- max(male_data$beak)
m_min <- min(male_data$beak)
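NA in R is a missing-value marker rather than the string "NA", so is.na() is the direct test for it. The sketch below uses a made-up vector with one missing value to show why removing NAs matters before computing statistics.

```r
# Hypothetical beak lengths with one missing value
beak <- c(6.00, NA, 5.82, 6.35)

is.na(beak)                # FALSE TRUE FALSE FALSE

# Any statistic computed on a vector containing NA is itself NA
mean(beak)                 # NA

# Option 1: ask the function to skip missing values
mean_skip <- mean(beak, na.rm = TRUE)

# Option 2: drop the missing values first
clean <- beak[!is.na(beak)]
length(clean)              # 3
mean_clean <- mean(clean)
```

Both options give the same mean. The second is closer to the filter() step above, since it removes the rows once so that sd(), max(), and min() need no extra argument.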
# Let's build a matrix from scratch summarizing the statistics.
summary_table <- matrix(c(f_mean, f_sd, f_max, f_min,
                          m_mean, m_sd, m_max, m_min),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("female", "male"),
                                        c("mean", "sd", "max", "min")))
kable(summary_table) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, position = "left")
 | mean | sd | max | min |
---|---|---|---|---|
female | 7.641 | 0.9773 | 9.34 | 5.77 |
male | 5.781 | 0.4585 | 6.91 | 4.69 |
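The same kind of summary can also be computed without typing each statistic by hand. This is a minimal base R sketch, not the method used above. The data frame bugs, its values, and the helper beak_stats() are hypothetical stand-ins.

```r
# Hypothetical toy data standing in for morphology_data
bugs <- data.frame(
  sex  = c("F", "F", "F", "M", "M", "M"),
  beak = c(8.4, 8.8, 6.3, 6.0, 5.8, 6.4)
)

# Hypothetical helper: returns all four statistics as a named vector
beak_stats <- function(x) {
  c(mean = mean(x), sd = sd(x), max = max(x), min = min(x))
}

# split() groups the beak values by sex; sapply() applies beak_stats() to
# each group; t() transposes the result so that each sex is one row.
summary_by_sex <- t(sapply(split(bugs$beak, bugs$sex), beak_stats))
summary_by_sex
```

The result is a matrix with one row per sex and one column per statistic, so it can be passed to kable() in the same way as a hand-built matrix.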
To help visualize the data, we can plot histograms and boxplots, which show the distribution of beak lengths. A quick way to do so is with base R functions such as hist() and boxplot():
female_hist <- hist(female_data$beak)
male_hist <- hist(male_data$beak)
female_bp <- boxplot(female_data$beak)
male_bp <- boxplot(male_data$beak)
But the ggformula package has specific functions that can make data visualization more dynamic:
We can then use grid.arrange() from the gridExtra package to control how our graphs are placed when we knit to a PDF. grid.arrange() arranges multiple grobs on a page. What is a grob? A grid graphical object (“grob”) is a description of a graphical item. These basic classes provide default behavior for validating, drawing, and modifying graphical objects.
R Colors! https://status.rstudio.com
Generic argument input into the ggformula functions:
gf_
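# Assumed setup (the original does not show which packages were loaded)
library(ggformula)  # gf_histogram(), gf_boxplot(), gf_point()
library(gridExtra)  # grid.arrange()
# How does beak length differ by sex?
# (bins and binwidth conflict; binwidth takes precedence, so only it is kept)
p0 <- gf_histogram(~ beak, data = morphology_data, binwidth = 1,
                   color = ~ sex, fill = ~ sex,
                   title = "Beak length distribution by sex",
                   xlab = "Beak length (mm)",
                   ylab = "Number of soapberry bugs")
# How does thorax length differ by wing morph?
p5 <- gf_boxplot(~ thorax, data = morphology_data, color = ~ w_morph)
# What is the relationship between beak length and thorax?
p6 <- gf_point(beak ~ thorax, data = morphology_data, color = ~ w_morph)
# Let's arrange the graphs using grid.arrange(); ncol must be a whole number
grid.arrange(p0, p5, p6, ncol = 2)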
Warning: Removed 1 rows containing non-finite values (stat_bin).
Warning: Removed 1 rows containing non-finite values (stat_boxplot).
Warning: Removed 1 rows containing missing values (geom_point).
# More graphs we can make to see different relationships:
p1 <- gf_histogram(~ beak, data = female_data, color = "white", bins = 15,
                   title = "Female Beak Lengths", xlab = "beak lengths (mm)")
p2 <- gf_histogram(~ beak, data = male_data, color = "white", bins = 15,
                   title = "Male Beak Lengths", xlab = "beak lengths (mm)")
p3 <- gf_boxplot(~ beak, data = female_data)
p4 <- gf_boxplot(~ beak, data = male_data)
grid.arrange(grobs = list(p1, p2, p3, p4), ncol = 2)
Using the generic plot() function, we can plot our data and take advantage of the many arguments the function can take in order to produce a presentable graph.
Arguments:
Font size can be modified using the graphical parameter cex. The default value is 1. A cex value less than 1 decreases the text size, and a value greater than 1 increases it. The following arguments can be used to change the font size:
plot(female_data$beak)
plot(male_data$beak, main="male beak lengths",
xlab = "Index",
ylab = "beak lengths (mm)",
sub = "Figure 2. Male beak lengths....",
col= 'blue')
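The cex family of graphical parameters mentioned above can be sketched as follows. The scaling values are arbitrary choices for illustration, and the vector holds the six male beak lengths shown in the table earlier.

```r
# First six male beak lengths from the table above
beak <- c(6.00, 6.40, 5.82, 6.35, 5.45, 5.97)

plot(beak,
     main = "Male beak lengths",
     sub  = "Text sizes set with the cex family",
     xlab = "Index",
     ylab = "Beak length (mm)",
     col  = "blue",
     cex      = 1.5,  # size of the plotting symbols
     cex.main = 1.4,  # size of the main title
     cex.lab  = 1.1,  # size of the axis labels (xlab, ylab)
     cex.axis = 0.8,  # size of the axis tick labels
     cex.sub  = 0.7)  # size of the subtitle
```

Each value is a multiplier relative to the default of 1, so cex.axis = 0.8 shrinks the tick labels to 80% of their usual size.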
Numbers (which include integers and floats, e.g. 1 and 6.88)
Strings (which include anything in quotation marks). A string in R is stored as a character vector.
String Specific Functions:
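As an illustration, here are a few base R functions that operate on strings. The selection and the example string are my own choices, not a complete list.

```r
s <- "soapberry bug"   # hypothetical example string

class(s)                        # "character"
length(s)                       # 1 (one string in the vector)
nchar(s)                        # 13 (number of characters)

upper  <- toupper(s)            # "SOAPBERRY BUG"
first4 <- substr(s, 1, 4)       # "soap"
joined <- paste(s, "beak")      # "soapberry bug beak"
words  <- strsplit(s, " ")[[1]] # "soapberry" "bug"
plural <- gsub("bug", "bugs", s) # "soapberry bugs"
```

Note that length() counts the elements of the character vector, while nchar() counts the characters inside each string.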
Arithmetic operators:
+ or - addition or subtraction
* multiplication
/ division
%% remainder (modulo)
^ exponent (** also works)
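A quick sketch of these operators at work, with arbitrary numbers. Integer division with %/% is included as a companion to %%, although it is not in the list above.

```r
7 + 2     # 9
7 - 2     # 5
7 * 2     # 14
7 / 2     # 3.5
7 %% 2    # 1  (remainder)
7 %/% 2   # 3  (integer division)
2 ^ 3     # 8
2 ** 3    # 8  (same as ^)

# The operators are vectorized: they apply to every element
evens <- c(1, 2, 3, 4, 5) %% 2 == 0
evens     # FALSE TRUE FALSE TRUE FALSE
```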
l <- c(1, 2, 3, 4, 5)
for (i in l) {
  if (i %% 2 == 0) {
    print(i)
  }
}
[1] 2
[1] 4
i <- 0
l <- vector(mode = "list", length = 0)  # empty list to collect matches
for (b in female_data$beak) {
  if (b < 6.00) {
    i <- i + 1    # count the match
    l <- c(l, b)  # append it to the list
    cat("Beak length that's smaller than 6: ", b, "\n")
  }
}
Beak length that's smaller than 6: 5.77
print(i)
[1] 1
print(l)
[[1]]
[1] 5.77
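The loop above can also be written without a loop, since comparisons in R are vectorized. This is a minimal sketch on a made-up vector standing in for female_data$beak.

```r
# Hypothetical values standing in for female_data$beak
beak <- c(6.29, 8.44, 5.77, 8.55)

# The comparison returns one TRUE/FALSE per element
is_small <- beak < 6

# Logical indexing keeps only the elements where the test is TRUE
small <- beak[is_small]
small      # 5.77

# sum() counts the TRUE values, replacing the manual counter
n_small <- sum(is_small)
n_small    # 1
```

The result here is an atomic vector rather than a list, which is usually more convenient for further calculations such as mean() or max().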