These are all from a successful UMAD run (log1) on the Syllables problem in the data collection for the GECCO 2018 paper introducing UMAD and its variants.
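library(tidyverse)  # readr, dplyr, tidyr, ggplot2
library(gdata)      # assumed source of the combine(..., names=) call below
library(pheatmap)
semantics <- read_csv("~/Downloads/GECCO18/semantics.csv")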
## Parsed with column specification:
## cols(
## Generation = col_integer(),
## Total.error = col_integer(),
## Semantics.UUID = col_character()
## )
min_total_error <- read_csv("~/Downloads/GECCO18/min_total_error.csv")
## Parsed with column specification:
## cols(
## Generation = col_integer(),
## Min_total_error = col_integer()
## )
column_version <- cbind(semantics,
min_total_error$Min_total_error)
names(semantics) <- c("Generation", "Total_error", "Semantics.UUID")
names(min_total_error) <- c("Generation", "Total_error")
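# combine() with a names= argument matches gdata::combine(), which stacks the
# data frames and adds a `source` column; assumed to be the function intended.
row_version <- gdata::combine(semantics[, c(1, 2)], min_total_error,
names=c("Lineage total error", "Min total error"))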
The blue line shows the total error along a successful lineage, from the initial generation to the success in generation 166. The red line shows the lowest total error in the entire population in each generation of the run. We can see that the lineage that led to (the earliest) success frequently passed through individuals that weren’t the “best” based on total error, sometimes by quite a lot. There are times, however, where the two lines overlap, indicating that the individuals in this lineage were the “best” (based on total error) in their generation.
The initial total error for the lineage is over 100K, so I chopped off the top of the graph.
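The plotting code for this graph isn't shown here. As a rough sketch of how such a plot could be drawn with ggplot2, assuming a long-format data frame shaped like row_version (Generation, Total_error, and a source column naming each series): the toy values and the y-axis cutoff of 5000 below are made up for illustration, since the actual cutoff isn't stated.

```r
library(ggplot2)

# Toy stand-in for row_version (made-up values, not data from the run).
toy <- data.frame(
  Generation  = rep(0:4, times = 2),
  Total_error = c(120000, 900, 700, 650, 0,   # lineage
                  3000, 800, 700, 400, 0),    # population minimum
  source      = rep(c("Lineage total error", "Min total error"), each = 5)
)

# coord_cartesian() zooms the y-axis without dropping off-scale points,
# so the lineage line runs off the top of the panel instead of breaking.
# The 5000 cutoff is a hypothetical choice.
p <- ggplot(toy, aes(x = Generation, y = Total_error, color = source)) +
  geom_line() +
  scale_color_manual(values = c("Lineage total error" = "blue",
                                "Min total error" = "red")) +
  coord_cartesian(ylim = c(0, 5000)) +
  theme_bw()

print(p)
```

Using coord_cartesian() rather than ylim() matters here: ylim() would discard the out-of-range first point and leave a gap at the start of the lineage line.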
This zooms in on generations 40-80 of the graph above. The colored dots indicate semantics, although not perfectly. These are just R’s default color mappings, and there are dots with very similar (or maybe identical?) colors that actually have different semantics. We can see, however, that there are horizontal stretches that all have the same color, indicating that the semantics don’t change for several generations.
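Since the colors can't reliably separate different semantics, the unchanged stretches could also be found directly from the UUIDs. A minimal sketch using base R's rle(), assuming one row per generation sorted by Generation as in the semantics table; the toy UUID labels are made up for illustration.

```r
# Toy stand-in for the semantics table (made-up UUID labels, not real data).
toy <- data.frame(
  Generation     = 40:47,
  Semantics.UUID = c("a", "a", "a", "b", "c", "c", "c", "c"),
  stringsAsFactors = FALSE
)

# rle() collapses consecutive repeats into (value, length) pairs.
runs <- rle(toy$Semantics.UUID)

# Recover the first generation of each run from the cumulative lengths.
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

run_table <- data.frame(
  Semantics.UUID   = runs$values,
  First_generation = toy$Generation[starts],
  Length           = runs$lengths
)

# Keep only stretches where the semantics held for more than one generation.
stable <- run_table[run_table$Length > 1, ]
print(stable)
```

This compares UUIDs exactly, so two visually identical colors with different semantics end up in separate runs.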
Shawn and I were thinking of two additions to this that we haven’t gotten to yet:
Classified_instruction_counts <- read.csv("~/Downloads/GECCO18/Classified_instruction_counts.csv", sep="")
Classified_instruction_counts$Kind <-
factor(Classified_instruction_counts$prefix_factor,
levels=c("Constant", "input", "print", "boolean", "integer",
"char", "string", "tags", "exec"))
ordered_counts <-
Classified_instruction_counts %>%
arrange(Kind, Instruction) %>%
mutate(Instruction = factor(Instruction, unique(Instruction)))
ggplot(ordered_counts, aes(x=Generation, y=Instruction, group=Kind)) +
geom_point(aes(size=Count, color=Kind), alpha=0.75) + theme_bw() +
theme(axis.text.y=element_blank(), axis.ticks.y = element_blank()) +
scale_color_brewer(palette="Set1")
errors <- read.csv("~/Downloads/GECCO18/errors.csv", sep="")
wide <- spread(errors, Test_case, Error)
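# 26 colors from green (low error) through yellow to red (high error).
my_palette <- colorRampPalette(c("#1a9641", "#ffffbf", "#d7191c"))(n=26)
# 27 break points (1 + 19 + 7) to delimit the 26 color bins.
# seq(-1, 0, length.out=1) yields the single value -1.
col_breaks <- c(seq(-1, 0, length.out=1),
seq(0.0001, 0.9, length.out=19),
seq(0.9001, 7, length.out=7))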
# Rowv= and xlab= are heatmap()-style arguments that pheatmap() does not
# define, so they are dropped here; col= is spelled out as color=.
pheatmap(t(log(as.matrix(wide[,-1])+1)),
scale="none", color=my_palette,
show_colnames = FALSE, show_rownames = FALSE,
main="Errors (ln) over time",
cluster_cols = FALSE,
breaks = col_breaks, border_color=NA)