| Who | Variables | Threshold | Test % Correct |
|---|---|---|---|
| Alden | job2 + height + diet3 + income2 | 0.5 | 84.1 |
| Amanda | income_level + job_new + body_type_buckets | 0.5 | 70.7 |
| Brenda | income + height | 0.5 | 83.2 |
| James | income + job + orientation + body_type | 0.5 | 71.4 |
| Albert | height | 0.5 | 83.0 |
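One way to read the Threshold and Test % Correct columns, assuming each model outputs a predicted probability and a test case is classified as positive when that probability exceeds the threshold. This is a minimal sketch with made-up numbers: `p_hat` and `truth` are hypothetical and are not the class data.

```r
# Hypothetical predicted probabilities and true 0/1 outcomes for six test cases
p_hat <- c(0.91, 0.40, 0.65, 0.12, 0.55, 0.30)
truth <- c(1, 0, 1, 0, 0, 1)

threshold <- 0.5

# Classify as 1 when the predicted probability exceeds the threshold
predicted <- as.numeric(p_hat > threshold)

# Percent of test cases classified correctly
pct_correct <- 100 * mean(predicted == truth)
pct_correct
```

Changing `threshold` changes which cases are labelled 1, so the same model can score differently at a different cutoff.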
Recall from Lec12.R and quiz
Important Principle of Coding: Don’t Repeat Yourself
# Packages used below
library(Quandl)
library(dplyr)
library(lubridate)
library(ggplot2)

# 1. Get gold & bitcoin data
# 2. Make column structure the same
# 3. Add variable type
gold <- Quandl("BUNDESBANK/BBK01_WT5511") %>%
  select(Date, Value) %>%
  mutate(type = "Gold")
bitcoin <- Quandl("BAVERAGE/USD") %>%
  rename(Value = `24h Average`) %>%
  select(Date, Value) %>%
  mutate(type = "Bitcoin")

# Combine them into single data frame using bind_rows()
combined <- bind_rows(gold, bitcoin) %>%
  # Group by here!
  group_by(type) %>%
  # Then do the following ONLY ONCE:
  filter(year(Date) >= 2011) %>%
  arrange(Date) %>%
  mutate(
    Value_yest = lag(Value),
    rel_diff = 100 * (Value - Value_yest) / Value_yest
  )

# Plot
ggplot(combined, aes(x = Date, y = rel_diff, col = type)) +
  geom_line() +
  labs(y = "% Change")
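To see why the `group_by(type)` step matters, here is a small sketch on toy data that needs no Quandl download. The `toy` data frame and its values are hypothetical. Because the data are grouped, `lag()` looks back only within the same `type`, so the first row of each series gets `NA` instead of borrowing the other series' last price.

```r
library(dplyr)

# Hypothetical toy prices: three days for each of two series
toy <- tibble(
  Date  = as.Date(c("2011-01-01", "2011-01-02", "2011-01-03",
                    "2011-01-01", "2011-01-02", "2011-01-03")),
  Value = c(100, 110, 99, 10, 20, 30),
  type  = rep(c("Gold", "Bitcoin"), each = 3)
)

toy_changes <- toy %>%
  group_by(type) %>%
  # Sort by date within each type
  arrange(Date, .by_group = TRUE) %>%
  mutate(
    Value_yest = lag(Value),  # previous day's value within the same type
    rel_diff   = 100 * (Value - Value_yest) / Value_yest
  ) %>%
  ungroup()

toy_changes
```

The same pipeline handles one series or ten without copying any code, which is the point of Don't Repeat Yourself.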
When parsing the time, what most of you did:
jukebox_hourly <- jukebox %>%
  mutate(
    date_time = parse_date_time(date_time, "%b %d %H%M%S %Y"),
    hour = hour(date_time)
  ) %>%
  group_by(hour) %>%
  summarise(count = n())
What’s wrong with this plot?
ggplot(data = jukebox_hourly, aes(x = hour, y = count)) +
  geom_bar(stat = "identity") +
  xlab("Hour of day") +
  ylab("Number of songs played")