#packages
library(readODS)
library(ggplot2)
library(dplyr)
library(tidyr)
library(kableExtra)
library(forcats)
library(purrr)
library(tseries)
library(ggfortify)
library(binom)
library(broom)
library(stringr)
library(ggpmisc)
library(corrplot)
library(DT)
library(shades)
#data
FAB <- read_ods("~\\FABtory\\FABtory.ods", sheet =2)
FAB2 <- FAB  |>  pivot_longer(
  cols = Biretta:WhiteOre,
  names_to = "Item",
  values_to = "AttemptsNeeded"
  )
Snitt <- FAB2  |> 
  group_by(Toon, Who) |>
  summarise(MeanAttemptsNeeded = mean(AttemptsNeeded))
Robe <- read_ods("~\\FABtory\\FABtory.ods", sheet =3)
Design <- read_ods("~\\FABtory\\FABtory.ods", sheet =4)
RobeSummaryTable <- read_ods("~\\FABtory\\FABtory.ods", sheet =6)

1 Summary

New data presented here show that the effect of the Holy Roman Emperor (HRE) policy on alchemy experiments is huge, much larger than the effect of the Adept title. Utilising a plethora of tricks, it was possible over a few days to complete the Memorial Album for alchemy experiments (with FAB as reward) for a large number of low-level characters at base rank 1 alchemy, at a 20%–25% success rate (depending on item) across 3100 alchemy experiments. Having the court alchemist advisor role on top of the HRE effect gave an additional 10% success, as revealed by a further 1962 separate experiments focusing solely on the Adept’s Robe. With only the Adept title, success was around 1%. The proportion of minor successes was unaffected by these factors. There was no measurable effect of alchemy rank, Oxford skills, paymaster aide or EX equipment on Robe success, but there was an effect of alchemy rank on the minor successes.

2 Introduction

In Uncharted Waters Online (UWO), the Alchemy Experiment Result (No.1) Memorial Album (MA) gives the Forbidden Alchemy Book (FAB, Figure 1) as a reward for completion. It is notoriously hard to complete because the success rate of obtaining the entries in it is very low, thought to be 0.1% base or less for some items. Due to this difficulty, and because one character can only obtain one FAB, it is a valuable and sought-after item that sells for around 10 captain’s tickets. It gives rank 15 increased Nanban rates, the highest in the game. (It is also possible to obtain a FAB by spending real money, but the ticket it comes from also gives the choice of Vagrant’s Wear, which is even more valuable, so few would choose to get a FAB that way.)


FIGURE 1: The FAB

Having base rank 13 in Alchemy helps, because the Adept title obtained at r13 increases the success rate of alchemy experiments. But even with the title active the success rate is low: around 1% to get Adept’s Biretta and Adept’s Robe, 5% to get Gold Short Sword and 10% for the rest (unpublished gut feelings based on getting the Adept title and FAB for about 25 characters in the past). With the Adept title, the number of attempts needed to get one item can still easily reach the hundreds (585 have been observed, and more rumoured). So this is hard to do, considering the relative difficulty of obtaining the base materials in quantity and in particular the need for base r13.
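
As a rough sanity check on those gut feelings, consider the 585-attempt streak mentioned above (a sketch assuming independent attempts at a flat 1% success rate, not something taken from the data):

# probability of still not having the item after 584 failed attempts (i.e. needing 585 or more)
0.99^584   # ~0.003: rare, but expected to happen now and then across many characters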

Increased availability of gear boosting alchemy in recent years has made it possible to grind alchemy very efficiently from base r1+11 (making gold at John Dee in London, and using gaseous drugs at Paracelsus in Venice), making it easier to obtain the Adept title. It is nevertheless still a major effort because of the huge amount of materials needed for the skill grind and the low success rate of the experiments. Although it takes less than 30 minutes to get from r1 to r13 base alchemy using best practices, this obviously does not include the considerable time needed to prepare the necessary mats.

However, increased alchemy experiment success rate is a possible policy of some Holy Roman Emperor (HRE) candidates. With enough support given to unlock this effect it may be possible to obtain FAB easily, even without the Adept title and without grinding alchemy at all (provided one can reach the ranks required for the experiments: r10–12, plus secondary skills). This HRE policy is rare, but we have had it at least once early on the Maris server, when I made 125 Adept’s Robes and transmuted them into Alchemist Robes and then Alchemist Robe EX that sold for 1 billion ducats each.

During March 2023 such an opportunity came again in the form of Rudolf II as Emperor. Players that had supported this candidate with at least 49 support permits (i.e. 4.9m ducats effective support needed) got the effect activated for 31 days, along with +2 alchemy (Figure 2).


FIGURE 2: Policy of Support Alchemy and Court Alchemist advisor position for Rudolf II

Less than a week later PapayaPlay launched an event that gave another +1 alchemy for 2 weeks. In addition, Stinker got the court Alchemist advisor position and so could give friends one more +1 alchemy (Figure 2), as well as being Alchemist Meister (base rank 15+ alchemy) and thus able to give yet another +1 to apprentices. As if all this was not enough, Atlantis made its appearance in the midst of it. Although unrelated to the experiment success for the MA, Atlantis is very relevant for alchemy since it gives +20% to transmutation success (such as when making the Alchemist Robe from the Adept’s Robe made for the FAB). There was also a 100% skill proficiency event going on, so it was unavoidable to rank up several times before finishing the MA.

Thus, there was a lot of alchemy skill boost happening at the same time as the experiment success boost. The time was ripe for creating a FABtory.

This report presents analyses of data gathered when making MA entries for FAB under these unprecedented circumstances, together with some additional experiments performed to better understand exactly how the HRE and other factors influence the outcome of alchemy experiments, and how this relates to crafting success in general.

2.1 Crafting success

Alchemy experiments differ from ordinary alchemy/casting/storage/handicrafts/sewing in that there is a “huge success” category (see Figure 3), as well as several tiers of minor success that all give different items, and failures that produce no item. Ordinary crafting does not have the huge success category, but some recipes have a failure category, normal success and several tiers of “great success”. The latter may simply produce a larger quantity of the same item, but for some recipes it results in a different item than normal success, and the goal is often to craft the highest-tier great success item in as few attempts as possible (to preserve valuable materials). My understanding of crafting success can be summarized as follows.

The outcome of ordinary crafting depends on two hidden parameters that can be manipulated independently. The first is the chance of great success. This can be increased by:

  • Paymaster aide
  • Oxford increased production skills 1-4
  • Meister title with apprentice nearby (only when crafting on land)
  • Workshop/galley ship skill (only when crafting at sea)

These factors have no effect at all if using Master’s Secrets, because the great success chance is then instantly made 100%. This is an uncommon consumable item (but obtainable in-game) that cannot be used for alchemy experiments.

The second parameter is the degree of great success, such as tier of cannon penetration or attack/defense stats of clothes. This can be increased by:

  • Skill rank
  • Oxford Item improvement Tech skill
  • Gear with the special goods production rate up EX effect
  • Sailor equipment with the special goods production effect (only at sea with workshop)

One theory is that the probability of producing a huge success in alchemy experiments is governed by entirely different factors than those behind a high-tier great success in normal crafting. Another school of thought argues that a huge success is nothing but a high-tier great success, and so should be influenced at least partly by the same factors.


FIGURE 3: Seeing this for the first time in 2011 was something

3 Methods

3.1 Experimental protocol for FAB

In total I had access to 74 characters that had given enough support for Rudolf II as emperor, and that had not previously obtained the Adept title or any FAB MA entries. Nearly all of these had not yet obtained the alchemy skill at all, and most were low level and of low court rank. The following steps were done to obtain a FAB for each of these characters:

  • In the election period, get support permits and invest enough in Rudolf II to unlock the alchemy support benefit.
  • Invest 10m ducats in London to get court rank 4, which is needed for quarters rank 2, which in turn is required to place the furniture needed to perform alchemy experiments.
  • Sell trade goods to get 20k fame, needed to equip the +1 Nigredo gloves. In hindsight this turned out to be total overkill, but I didn’t know I would get court alchemist or that Papaya would run a +1 production skills event, and I didn’t remember that the HRE gave +2 alchemy (I had assumed +1). So without using +3 paper, aide, job, Boston skill or ship I felt I needed this. Which I didn’t need in any case due to ranking up. But what the heck.
  • Buy quarters and upgrade to rank 2.
  • Make and place kiln (mostly +3 version) and bench (mostly +1 version) in the quarters.
  • Obtain alchemy skill. This was done in one of 4 ways:
    • Do the quest chain for alchemist job and get the skill from the job (the only way to learn it in the past). N = 4
    • Get adventure level 7 and casting rank 2, and then learn alchemy from John Dee in London. N = 27
    • Learn the skill at the scholar from another character at same account bestowing it. N = 24
    • Learn skill at the scholar from another character at same account bestowing it after itself having learned it this way (doubly bestowed). N = 19
  • Get alchemy boosters. I used:
    • +1 from being apprentice
    • +2 accessory
    • +1 hat (there is also +2 hats but I didn’t have any of those)
    • +2 clothes
    • +1 weapon
    • +1 gloves
    • +2 from HRE
    • +1 from being on the court alchemist friend-list. In total they would therefore start out at r12 (r1+11). For the last 1/3 of the sample there was also:
    • +1 from the Papaya event, so r1+12. During the experiments they would also gain enough skill points for a few base ranks.
  • Craft all base mats and ingredients in huge quantities in advance and store on storage alts, aide bazaars, company vaults, quarters. This was done utilising the job-specific Boston skills for alchemy, sewing, handicraft and casting to save on mats when crafting, and using r20 buying skills, Robe EX and merchant mentor Boston skill to buy as much as possible per purchase order. This could have been tweaked even further with r9 EX and 4th fleet member.
  • Each character then made (during the HRE influence period) the 10 items needed for the MA (resulting in 740 items obtained). Experiments for an item terminated once that item was obtained; I then moved on to the next item, and then to the next character. The items (see Appendix for the recipes) were always made in the following order:
    • Adept’s Biretta (r10)
    • Adept’s Robe (r11)
    • Gold-plated Armour (r12)
    • Knife losing its gold (r12)
    • Metal Works (r12)
    • Processed Lumber (r11)
    • Green Ore (r11)
    • Blue Ore (r11)
    • Red Ore (r11)
    • White Ore (r11)

(Yes, I know I’m inconsistent in naming some after the end product and some after the base material, but give me some slack.) The characters were handed 3 base mats and plenty of consumables of the right kind (other materials were handed back after each MA entry was obtained, to avoid using the wrong mats, or the wrong furniture, by mistake). They would have to trade with me again if more than 3 base mats were needed, and thus close and re-enter the experiment dialogue. For the coloured ores they received 250–300, apart from a few that I couldn’t be bothered to get a new ship for. Some made FAB in a starter barca.

I counted the number of times each experiment was performed until they got the item. That is, I counted in effect the number of consumables used, and not the number of base items. The main base item is not consumed on failures, only on success and “huge success” (i.e. the MA item). If different experiments have different tables of possible results (for example if one has many possible success products and another fewer, or they differ in the rate of failure) this could have affected the number of base items consumed (in addition to possibly different probabilities of the MA item), which may in some sense be considered more important than the number of experiments performed. However, I did not collect good data on this.

The raw data for FAB experiments can be found in the Appendix.

3.2 Experimental protocol for Robes

In a separate set of experiments performed shortly after the FAB study I focused on the Adept’s Robe recipe. Unlike in the FAB experiments, I kept track of each individual outcome. The experimental set-up was designed to yield pairwise comparisons, where a treatment group and a control group could be compared while all other factors were held constant. I did 1962 experiments in total, in 1–3 batches for each of the 14 combinations of factors tested (Table 1), with approximately 80 experiments per batch, continuing until I ran out of resources (fine dye) to produce base materials. The conditions tested were:

  • With or without HRE policy (server wide effect for HRE supporters)
  • With or without Advisor position (personal effect for the sole court alchemist title holder)
  • With or without Adept title (obtained at alchemy base rank 13)
  • With or without Oxfords (increased production success 1–4, and item improvement tech)
  • With or without EX (Golden Ark rank 6 Special Goods Production Rate Up)
  • With r20 or r11 alchemy
kbl(Design[,-2], caption = 'Experimental design for Robe experiments. Group 15 is the Robe part of the FAB setup for comparison.', align ="c") |> 
  kable_paper(c("hover", "condensed"), full_width = F)
TABLE 1: Experimental design for Robe experiments. Group 15 is the Robe part of the FAB setup for comparison.
Condition HRE Advisor Adept EXr6 Oxfords AlchRank Batch Sequence Experiments Who Totals
1 Yes Yes Yes No No 20 1–3 1–204 60+61+83 Stinker 204
2 Yes Yes No No No 20 4–6 205–417 70+79+64 Stinker 213
3 Yes Yes Yes Yes No 20 7–9 418–613 62+68+66 Stinker 196
4 Yes Yes No Yes No 20 10–12 614–825 71+66+75 Stinker 212
5 Yes Yes Yes Yes Yes 20 13–15 826–1028 75+62+66 Stinker 203
6 Yes Yes No No No 11 16–17 1029–1186 79+79 Stinker 158
7 Yes No No No No 11 18 1187–1269 83 Bodal 83
8 Yes No No No No 20 19 1270–1352 83 Bodal 83
9 Yes No Yes No No 11 20 1353–1442 90 Bodal 90
10 Yes No Yes No No 20 21 1443–1522 80 Bodal 80
11 No No No No No 11 22 1523–1630 108 Bib2 108
12 No No Yes No No 11 23 1631–1731 101 Bib2 101
13 No No No No No 20 24 1732–1840 109 Bib2 109
14 No No Yes No No 20 25–26 1841–1962 93+29 Bib2 122
15 Yes No No No No 12–14 NA NA 364 74 FAB 364

One batch is here defined as an uninterrupted sequence of experiments done without closing the dialogue screen. The length of a batch was limited by inventory space and the number of failures (more failure = longer batch), but was not limited by vigour food. No disconnects from the server occurred during the experiments. Batch number 26 was merged with batch 25 (identical conditions) in some analyses because the sample size was deemed too small to be treated separately.

It was not possible to keep the sample sizes strictly equal between batches: even though the number of base clothes was the same (+/- a few), the number of experiments differed because failures do not consume the base material, so more failures meant more experiments before the base mats for the batch were used up.
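
To make this concrete, a back-of-the-envelope sketch (not part of the original analysis), using the ~42% overall failure rate reported later in Table 3:

# failures do not consume the base cloth, so each base cloth supports on
# average 1 / (1 - f) experiments at failure rate f
1 / (1 - 0.42)   # ~1.7 experiments per base cloth at the overall failure rate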

A full factorial experimental design of 6 factors with 2 levels each would require \(2^6\) = 64 combinations. It was not practically possible to test every combination of the factors. I chose to ignore any interaction effects, as there is reason to believe all the effects are purely additive, removing the need for a full design. However, certain combinations of factors were physically impossible. For example, the advisor effect could not be obtained without also having the HRE effect. Another difficulty was that without wearing any gear I was already at r20 alchemy. However, adjusting the alchemy rank down to r11 could be achieved with a combination of mentoring and + gear (Figure 4). Thus it was possible to disentangle the skill rank, Adept title and HRE effects, which would otherwise be confounded.


FIGURE 4: Mentoring was used in order to lower skill rank and retain adept title
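
For reference, the full factorial that this design is a fraction of can be enumerated in a couple of lines (illustration only; factor names follow Table 1):

fullDesign <- expand.grid(
  HRE      = c("Yes", "No"),
  Advisor  = c("Yes", "No"),
  Adept    = c("Yes", "No"),
  EXr6     = c("Yes", "No"),
  Oxfords  = c("Yes", "No"),
  AlchRank = c(20, 11)
)
nrow(fullDesign)   # 64 combinations; only 14 (plus the FAB group) were run
# physically impossible cells would also have to be dropped, e.g. Advisor without HRE
sum(fullDesign$Advisor == "Yes" & fullDesign$HRE == "No")   # 16 such combinations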

Oxfords were compared with either all of them on or all of them off. Even though it would have been ideal to have experiments with and without the Oxford item improvement tech while the Oxford production success skills were on or off, it is still possible to separate the effects. The Oxford tech skill is only (possibly) relevant for experiments that did not fail, whereas the production success Oxfords should (possibly) only affect the fail rate. Therefore one can compare the fail rate with and without Oxfords to test the production success Oxfords, and compare the outcomes among experiments that did not fail with and without Oxfords to test the item improvement tech Oxford.
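
A minimal sketch of how that two-step comparison can be coded, reusing the Robe2 data frame and the condition 3 vs 5 contrast analysed in the Results (this was not run as part of the original analysis):

# production success Oxfords: compare failure rates with and without Oxfords
oxfFail <- Robe2 |>
  filter(Condition %in% c(3, 5)) |>
  group_by(Oxfords) |>
  summarise(fails = sum(Result == "Failure"), n = n())
prop.test(oxfFail$fails, oxfFail$n)
# item improvement tech Oxford: among non-failures, compare the Robe proportion
oxfRobe <- Robe2 |>
  filter(Condition %in% c(3, 5), Result != "Failure") |>
  group_by(Oxfords) |>
  summarise(robes = sum(Result == "Robe"), n = n())
prop.test(oxfRobe$robes, oxfRobe$n)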

All experiments in the Robe setup were done with aide in Paymaster 100s job. Since the FAB trials were not, it was to some extent also possible to test the effect of paymaster aide by comparing certain groups from the Robe setup with the FAB results. Unfortunately there was no data to separate failures from minor success in the FAB experiments, so only the effect of aide on huge success can be considered.

Results were mostly analysed by comparing confidence intervals of proportions. Since some of the observed proportions were 0 or close to 0, I did not use the standard asymptotic (Wald) approximation to calculate binomial confidence intervals, but instead the Wilson method implemented in R package binom.
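
As a quick illustration of why (using one of the zero-success samples analysed later): with an observed proportion of 0 the Wald interval collapses to zero width, while Wilson still gives a usable upper bound.

# Wald (asymptotic) vs Wilson for 0 successes in 217 trials
binom::binom.confint(x = 0, n = 217, methods = c("asymptotic", "wilson"))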

The raw data for Robe experiments can be found in the Appendix.

This report was written using R Markdown in RStudio. R code used to generate the results and some output can be seen by clicking on the “code” buttons, as well as found in the Appendix (without output) together with software versions used.

4 Results

4.1 FAB experiments


ggplot(FAB2, aes(AttemptsNeeded)) +
  geom_histogram(fill = "#377eb8", colour = "black", binwidth = 1)

ggplot(FAB2, aes(AttemptsNeeded)) +
  stat_density(aes(colour = AlchemySkill),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  scale_colour_brewer(palette = "RdBu", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1))

FIGURE 5: Needed attempts to get an MA entry overall (histogram), and by how the alchemy skill was obtained (kernel density estimates)

Over all 740 MA items obtained, the mean number of experiments needed to obtain one item was 4.19 attempts, with a median of 3. In total 3100 experiments were performed to get all the MA items for all 74 participants. The distribution of needed attempts is shown in Figure 5. The highest number observed was 25 attempts for one item. It did not seem to matter whether the skill was obtained from the job, learned by fulfilling the requirements, bestowed, or bestowed again (Figure 5). If we compare the results for the different items (Figure 6) there do not seem to be any obvious differences – there is a lot of variation within groups compared to between them.

FAB3 <- FAB2 |>
  mutate(Item = reorder(Item, AttemptsNeeded, FUN = mean, na.rm = TRUE))
ItemLabels <- FAB3 |>
  group_by(Item) |>
  summarise(snitt = mean(AttemptsNeeded), median =median(AttemptsNeeded))
ItemLabels$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels$snitt)
ItemLabels
## # A tibble: 10 × 4
##    Item       snitt median snitt2        
##    <fct>      <dbl>  <dbl> <chr>         
##  1 ProcLumber  3.45    2.5 bar(x) == 3.45
##  2 Knife       3.80    3   bar(x) == 3.80
##  3 GoldPlated  3.82    3   bar(x) == 3.82
##  4 RedOre      3.99    3   bar(x) == 3.99
##  5 GreenOre    4.12    3   bar(x) == 4.12
##  6 WhiteOre    4.28    3   bar(x) == 4.28
##  7 MetalWorks  4.30    3   bar(x) == 4.30
##  8 Biretta     4.54    3.5 bar(x) == 4.54
##  9 BlueOre     4.68    4   bar(x) == 4.68
## 10 Robe        4.92    3.5 bar(x) == 4.92

ggplot(FAB3, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(Item))) + 
  facet_wrap(vars(Item), ncol = 2) +
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide = "none") +
  scale_fill_brewer(palette = "RdBu", guide = "none", direction = -1)+
  geom_label(x = 12.5, y = 17.5, aes(label = snitt2), data = ItemLabels, parse = TRUE, 
             hjust = "center", size = 3, label.size = NA)

FIGURE 6: Needed attempts for the different items

This is easier to compare using box plots, shown in Figure 7. We can think of a box plot as dividing the data into 4 parts with an equal number of observations in each. The median (shown as a vertical line) cuts the data in half, with as many smaller as larger observations, and the box shows the lower (25%) and upper (75%) quartiles. That is, the central 50% of the observations lie within the box, with 25% below and 25% above it. Mean values (shown as white dots) are higher than the medians because of the skewed shape of the distributions. Again there is no large difference discernible in the difficulty of obtaining any of the items, but we can see that Robe has the highest mean, followed by Blue Ore and Biretta.
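
For reference, the quartiles and means behind the boxes can be computed directly from FAB3 (a quick check, not part of the original output):

FAB3 |>
  group_by(Item) |>
  summarise(q25    = quantile(AttemptsNeeded, 0.25),
            median = median(AttemptsNeeded),
            q75    = quantile(AttemptsNeeded, 0.75),
            mean   = mean(AttemptsNeeded))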

ggplot(FAB3, aes(x=AttemptsNeeded, y=Item, fill=Item)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide="none")
  scale_fill_brewer(palette = "RdBu", guide = "none", direction =-1)

FIGURE 7: Box plots of number of attempts needed to get each MA entry

FAB4 <- FAB3 |>
  mutate(ItemGroup = ifelse(Item == "Biretta" | Item == "Robe" , "Robe/Biretta", "Others"))
FAB4$ItemGroup <- as_factor(FAB4$ItemGroup)
ItemLabels2 <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(snitt = mean(AttemptsNeeded), 
            median =median(AttemptsNeeded), 
            n = (length(AttemptsNeeded)))
ItemLabels2$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels2$snitt)
ItemLabels2$n2 <- sprintf("n == %.0f", ItemLabels2$n)
ItemLabels2$snittAndN <- sprintf(
  "bar(x) == %.2f * ',' ~~ n == %.0f",
  ItemLabels2$snitt,
  ItemLabels2$n
)
ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 140, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup), y = stat(density))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 0.235, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  stat_density(aes(colour = ItemGroup),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_colour_viridis_d(option = "turbo", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#95a5a6")) +
  #scale_colour_brewer(palette = "Set1", guide = "none")+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1)) +
  guides(colour = guide_legend(title = NULL))

ggplot(FAB4, aes(x=AttemptsNeeded, y=ItemGroup, fill=ItemGroup)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") 
  #scale_fill_manual(values = c("#7a0403","#95a5a6"), guide="none")
  #scale_fill_brewer(palette = "Set1", guide = "none")

FIGURE 8: Number of attempts needed for two groups of items. Histogram of counts, standardised histogram, kernel density estimates and box plots

table(FAB4$ItemGroup)
## 
## Robe/Biretta       Others 
##          148          592
(WT <- wilcox.test(FAB4$AttemptsNeeded ~ FAB4$ItemGroup, 
            paired = F, 
            alternative = "greater"))
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  FAB4$AttemptsNeeded by FAB4$ItemGroup
## W = 47772, p-value = 0.04238
## alternative hypothesis: true location shift is greater than 0

In order to learn whether there is more than random variability in the data we can perform some statistical analyses. From a priori information we can expect Robe and Biretta to be harder to make than the other 8 items, so to increase sample size and reduce uncertainty I lumped the data into two groups: Robe+Biretta (n = 148, \(\bar{x}\) = 4.73 attempts) vs the other items (n = 592, \(\bar{x}\) = 4.05 attempts). The distributions of needed attempts for these two groups are shown in Figure 8. The number of attempts was significantly larger for the Robe/Biretta group (Mann-Whitney U-test, one-sided p = 0.04).

#restructure data
FABprop <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(success = n(), total = sum(AttemptsNeeded))
#binomial confidence intervals
binom::binom.confint(FABprop$success, FABprop$total) 
##           method   x    n      mean     lower     upper
## 1  agresti-coull 148  700 0.2114286 0.1827560 0.2432511
## 2  agresti-coull 592 2400 0.2466667 0.2298297 0.2643133
## 3     asymptotic 148  700 0.2114286 0.1811802 0.2416769
## 4     asymptotic 592 2400 0.2466667 0.2294206 0.2639128
## 5          bayes 148  700 0.2118402 0.1818799 0.2422619
## 6          bayes 592 2400 0.2467722 0.2295945 0.2640682
## 7        cloglog 148  700 0.2114286 0.1820038 0.2423941
## 8        cloglog 592 2400 0.2466667 0.2296016 0.2640760
## 9          exact 148  700 0.2114286 0.1817347 0.2435774
## 10         exact 592 2400 0.2466667 0.2295297 0.2644240
## 11         logit 148  700 0.2114286 0.1827598 0.2432559
## 12         logit 592 2400 0.2466667 0.2298285 0.2643151
## 13        probit 148  700 0.2114286 0.1824644 0.2429217
## 14        probit 592 2400 0.2466667 0.2297478 0.2642309
## 15       profile 148  700 0.2114286 0.1822669 0.2426986
## 16       profile 592 2400 0.2466667 0.2296966 0.2641772
## 17           lrt 148  700 0.2114286 0.1822720 0.2426960
## 18           lrt 592 2400 0.2466667 0.2296995 0.2641903
## 19     prop.test 148  700 0.2114286 0.1821235 0.2439577
## 20     prop.test 592 2400 0.2466667 0.2296313 0.2645215
## 21        wilson 148  700 0.2114286 0.1827968 0.2432103
## 22        wilson 592 2400 0.2466667 0.2298344 0.2643086
(binomRes <- binom::binom.wilson(FABprop$success, FABprop$total))
##   method   x    n      mean     lower     upper
## 1 wilson 148  700 0.2114286 0.1827968 0.2432103
## 2 wilson 592 2400 0.2466667 0.2298344 0.2643086
binomRes$Group <- as_factor(c("Robe/Biretta", "Others"))
#test if equal proportions
(propRes <- prop.test(x = binomRes$x, binomRes$n, alternative = "less"))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  binomRes$x out of binomRes$n
## X-squared = 3.5116, df = 1, p-value = 0.03047
## alternative hypothesis: less
## 95 percent confidence interval:
##  -1.000000000 -0.005094121
## sample estimates:
##    prop 1    prop 2 
## 0.2114286 0.2466667
kbl(FABprop, label = "FABpropTable", 
    caption = 'Number of successes and attempts when making MA entries for FAB', 
    align ="c",
    col.names = c('Group', 'Successes', 'Total Experiments')) |>
  kable_paper(c("hover", "condensed"), full_width = F) #|>
TABLE 2: Number of successes and attempts when making MA entries for FAB
Group Successes Total Experiments
Robe/Biretta 148 700
Others 592 2400
  #column_spec(1, color = "white", bold = F, 
  #            background = c("#ca0020", "#0571b0"))
binomRes |> 
ggplot(aes(x = Group, y = mean, colour= Group)) +
  ylab("Proportion Success") +
  xlab(NULL) +
  geom_pointrange(aes(ymin = lower, ymax = upper), size =0.75, linewidth = 1)+
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  geom_text(aes(label = round(mean, 3), vjust = -2), size = 3) +
  coord_flip() +
  theme(axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank())

FIGURE 9: Proportion success for the two groups of items, with 95% confidence intervals

We may also arguably treat each experiment as independent and look at whether the proportion of successes is lower for the Robe/Biretta group than for the other group (Table 2). The binomial confidence intervals of the two proportions had some overlap (Figure 9), but as expected the proportion of successes was significantly lower for the Robe/Biretta group (\(\chi^{2}\) = 3.51, df = 1, one-sided p = 0.03).

Overall we can say that with the HRE effect active, the “difficult” items (Robe and Biretta) on average needed about 5 attempts and the other, “easy”, items about 4 attempts, which translates to a 20% and 25% chance per attempt, respectively.
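
The translation from mean attempts to per-attempt chance assumes each experiment is an independent trial with a fixed success probability p, so the number of attempts to first success is geometric with mean 1/p (a model assumption, not something measured directly):

1 / 5                                        # Robe/Biretta: ~0.20 per attempt
1 / 4                                        # other items:  ~0.25 per attempt
1 / mean(FAB2$AttemptsNeeded, na.rm = TRUE)  # overall: 1/4.19 ~ 0.24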


ggplot(Snitt, aes(MeanAttemptsNeeded)) +
  geom_histogram(aes(fill= Who), binwidth = 0.5, colour="black") +
  #scale_fill_brewer(palette = "RdBu", direction = 1) +
  scale_fill_manual(values = c("#B51F2E","#F4A582","#A7CFE4", "#377eb8"))+
  #scale_fill_viridis_d(option = "turbo", direction = -1) +
  ylab("Count") +
  xlab("Average attempts needed") +
  annotate("text", x = 2.5, y = 9.5, label = "Alta-lucky-vista", colour = "#B51F2E") +
  geom_curve(
    aes(x = 2.5, y = 8.8, xend = 2.0, yend = 2.5),
    arrow = arrow(
    length = unit(0.03, "npc"), 
    type="closed" 
    ),
    colour = "#B51F2E",
    linewidth = 1.2,
    angle = 0 
  ) +
  scale_y_continuous(breaks = seq(0,20, by =2))+
  theme(
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
    legend.background = element_rect(fill = "white"),
    legend.key = element_rect(fill = "white", color = NA)
  )

FIGURE 10: Mean number of attempts needed to get each of the 10 MA entries for FAB

The mean number of attempts a player needed to get each of the 10 items for the MA varied from 2.1 to 6.6 between players, with an overall mean of 4.19. Some people are luckier than others (Figure 10).
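
These per-character figures come straight from the Snitt summary built at the top of the report:

range(Snitt$MeanAttemptsNeeded)   # roughly 2.1 to 6.6
mean(Snitt$MeanAttemptsNeeded)    # ~4.19 (each character made the same 10 items)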

As an anecdote, only one character managed to complete the FAB without getting any other MA entries (like gemstones and plates from minor success by-products, Figure 11). And he did not have any other MA records either. What a guy.


FIGURE 11: Another unique individual

4.2 Robe experiments

A summary of the results is presented in Table 3 below. This is visualized as stacked bars ordered by proportion of Robes in Figure 12 (per condition) and Figure 13 (further divided into the separate batches). Figure 14 shows the actual sequence of outcomes for each batch (batches in the same order as in Figure 13, but with the condition labels omitted to leave more space).

It is difficult to compare stacked bar charts that have more than two categories because the middle categories are shifted relative to each other across groups – they don’t have a common baseline. When analysing the effect of the various treatments in the sub-chapters below, the details are therefore presented as proportions (with 95% confidence intervals) of the relevant category.

kbl(RobeSummaryTable, caption = 'Outcome of Robe Experiments for each batch. The percentage Adept Robe is highlighted.', align ="c") |>
  kable_paper(c("hover", "condensed"), full_width = F) |>
  column_spec(9, background = "#92c5de") |> #, color = "white"
  pack_rows("HRE, Advisor, Adept, r20", 1, 3) |>
  pack_rows("HRE, Advisor, r20", 4, 6) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex", 7, 9) |>
  pack_rows("HRE, Advisor, r20, Ex", 10, 12) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex, Oxf", 13, 15) |>
  pack_rows("HRE, Advisor, r11", 16, 17) |>
  pack_rows("HRE, r11", 18, 18) |>
  pack_rows("HRE, r20", 19, 19) |>
  pack_rows("HRE, Adept, r11", 20, 20) |>
  pack_rows("HRE, Adept, r20", 21, 21) |>
  pack_rows("r11", 22, 22) |>
  pack_rows("Adept, r11", 23, 23) |>
  pack_rows("r20", 24, 24) |>
  pack_rows("Adept, r20", 25, 26)
TABLE 3: Outcome of Robe Experiments for each batch. The percentage Adept Robe is highlighted.
Condition Batch Failure FineVelvet Justaucorps Robe Waistcoat Total %Robe %Failure %minor %Justau %Justau/minor
HRE, Advisor, Adept, r20
1 1 20 7 5 20 8 60 33.3 33.3 33.3 8.3 25.0
1 2 17 6 7 24 7 61 39.3 27.9 32.8 11.5 35.0
1 3 36 8 10 17 12 83 20.5 43.4 36.1 12.0 33.3
HRE, Advisor, r20
2 4 24 12 3 23 8 70 32.9 34.3 32.9 4.3 13.0
2 5 34 7 9 20 9 79 25.3 43.0 31.6 11.4 36.0
2 6 21 4 11 22 6 64 34.4 32.8 32.8 17.2 52.4
HRE, Advisor, Adept, r20, Ex
3 7 16 6 4 26 10 62 41.9 25.8 32.3 6.5 20.0
3 8 22 9 7 23 7 68 33.8 32.4 33.8 10.3 30.4
3 9 19 10 7 24 6 66 36.4 28.8 34.8 10.6 30.4
HRE, Advisor, r20, Ex
4 10 24 4 6 27 10 71 38.0 33.8 28.2 8.5 30.0
4 11 19 12 11 16 8 66 24.2 28.8 47.0 16.7 35.5
4 12 29 11 9 19 7 75 25.3 38.7 36.0 12.0 33.3
HRE, Advisor, Adept, r20, Ex, Oxf
5 13 28 8 12 19 8 75 25.3 37.3 37.3 16.0 42.9
5 14 15 11 11 21 4 62 33.9 24.2 41.9 17.7 42.3
5 15 19 4 12 15 16 66 22.7 28.8 48.5 18.2 37.5
HRE, Advisor, r11
6 16 30 15 1 20 13 79 25.3 38.0 36.7 1.3 3.4
6 17 31 9 1 26 12 79 32.9 39.2 27.8 1.3 4.5
HRE, r11
7 18 37 13 3 15 15 83 18.1 44.6 37.3 3.6 9.7
HRE, r20
8 19 37 10 10 15 11 83 18.1 44.6 37.3 12.0 32.3
HRE, Adept, r11
9 20 43 15 2 18 12 90 20.0 47.8 32.2 2.2 6.9
HRE, Adept, r20
10 21 33 10 13 16 8 80 20.0 41.3 38.8 16.3 41.9
r11
11 22 68 24 3 0 13 108 0.0 63.0 37.0 2.8 7.5
Adept, r11
12 23 62 18 5 1 15 101 1.0 61.4 37.6 5.0 13.2
r20
13 24 69 16 11 0 13 109 0.0 63.3 36.7 10.1 27.5
Adept, r20
14 25 53 9 16 0 15 93 0.0 57.0 43.0 17.2 40.0
14 26 19 2 4 1 3 29 3.4 65.5 31.0 13.8 44.4
Total Result NA 825 260 193 428 256 1962 21.8 42.0 36.1 9.8 27.2

# get condition descriptions from table, and combine with sample size for figure labels
# and reorder factor levels by % robes. and the items from "hardest to easiest"
# keep them reordered in the file so don't have to do within ggplot
Robe2 <- Robe |> left_join(Design |> select(Condition, Condition2, Totals), by="Condition")
Robe2 <- Robe2 |>
  mutate(Condition3 = paste0(Condition2, " (n = ", Totals, ")")) |>
  mutate(Condition3 = fct_reorder(.f = Condition3, 
                                         .x = Result,
                                         .fun = function(.x) mean(.x == "Robe"),
                                         .desc = TRUE)) |>
  mutate(Result = fct_relevel(Result, "Robe", "Justaucorps", "Waistcoat", "FineVelvet", "Failure"))
  
ggplot(Robe2, aes(Condition3, fill = fct_rev(Result))) + 
  geom_bar(position = "fill", width = 0.9) +
  #shades::brightness(scale_fill_brewer(palette = "RdBu", direction = -1),   shades::delta(-0.05)) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_discrete(expand = expansion(mult = c(0, 0)))+
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1)),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        legend.justification = c(1, 0),
        panel.background = element_rect(fill = "white")) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()

FIGURE 12: Percentage of outcomes for Robe experiments under different conditions


# include batch, within condition
Robe2$GroupBatch <- ave(Robe2$Batch, Robe2$Condition, 
                        FUN = function(x) as.numeric(factor(x)))
ggplot(Robe2, aes(GroupBatch, fill = fct_rev(Result))) + 
  geom_bar(position = "fill",  width = 1) + #colour = "black",
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(1, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()+
  facet_wrap(~fct_rev(Condition3), strip.position = "left", ncol = 1, scales = "free_y") +
  theme(strip.text.y.left = element_text(angle = 0, hjust = 1), 
        panel.spacing = unit(0.1,"cm"), 
        strip.background = element_rect(fill = NA))

FIGURE 13: Percentage of outcomes split in batches

# time series visualization
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) + #, colour = "black"
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_x_continuous(breaks = seq(0, 120, by = 20), expand = expansion(mult = c(0, 0))) +
  scale_y_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.1,"cm")) +
  xlab("Sequence") +
  ylab(NULL)

FIGURE 14: Sequence of outcomes for Robe experiments under different conditions, for each batch

4.2.1 HRE effect

HRE <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(HRE, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
HRE <- cbind(HRE, 
      binom.confint(HRE$Robe, n = HRE$Total, methods = "wilson"))

HRE <- HRE |> 
  filter(Condition %in% c(7:10) | HRE == "NO") |> #manual subsetting
  mutate(Condition3 = (str_replace(Condition2, c("HRE, "), "")))
  
HRE |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = HRE)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)

FIGURE 15: Effect of Holy Roman Emperor support alchemy policy (HRE). Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals

There was a large and consistent effect of the Holy Roman Emperor policy (Figure 15). Having the HRE effect added around 20% to the Robe proportion, both at r11 and r20 alchemy and with or without the Adept title.
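
For completeness, a pooled test of the HRE contrast can be run the same way as the pooled Advisor test in the next sub-chapter (a sketch reusing the HRE data frame built above, with the advisor groups already excluded by its filter; not part of the original output):

HREwide <- HRE |>
  group_by(HRE) |>
  summarise(SumRobes = sum(Robe), SumTots = sum(Total))
prop.test(x = HREwide$SumRobes, HREwide$SumTots, alternative = "less")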

4.2.2 Advisor effect

ADV <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Advisor, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADV <- cbind(ADV, 
      binom.confint(ADV$Robe, n = ADV$Total, methods = "wilson"))

ADV <- ADV |> 
  filter(Condition %in% c(1,2,6,7,8,10)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Advisor, "), "")))
  
ADV |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Advisor)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)

FIGURE 16: Effect of Court Alchemist Advisor position. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals

ADVwide <- ADV |> 
  group_by(Advisor) |> 
  summarise(SumRobes = sum(Robe), SumTots = sum(Total)) |> 
  mutate(SumProp = SumRobes/SumTots)
(ADVpropTestTotal <- prop.test(x = ADVwide$SumRobes, ADVwide$SumTots, alternative = "less"))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  ADVwide$SumRobes out of ADVwide$SumTots
## X-squared = 10.542, df = 1, p-value = 0.0005836
## alternative hypothesis: less
## 95 percent confidence interval:
##  -1.00000000 -0.05767609
## sample estimates:
##    prop 1    prop 2 
## 0.1869919 0.2991304
(ADVpropTests <- 
  split(ADV, as.factor(ADV$Condition3)) |>  
  map(function (df) prop.test(x = df$Robe, n = df$Total, alternative = "less")))
## $`HRE, Adept, r20`
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  df$Robe out of df$Total
## X-squared = 2.3721, df = 1, p-value = 0.06176
## alternative hypothesis: less
## 95 percent confidence interval:
##  -1.0000000000  0.0001854689
## sample estimates:
##    prop 1    prop 2 
## 0.2000000 0.2990196 
## 
## 
## $`HRE, r11`
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  df$Robe out of df$Total
## X-squared = 2.9495, df = 1, p-value = 0.04295
## alternative hypothesis: less
## 95 percent confidence interval:
##  -1.00000000 -0.00979298
## sample estimates:
##    prop 1    prop 2 
## 0.1807229 0.2911392 
## 
## 
## $`HRE, r20`
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  df$Robe out of df$Total
## X-squared = 4.0799, df = 1, p-value = 0.0217
## alternative hypothesis: less
## 95 percent confidence interval:
##  -1.00000000 -0.02935368
## sample estimates:
##    prop 1    prop 2 
## 0.1807229 0.3051643

The advisor effect was clearly smaller in magnitude than the HRE effect, consistently adding around 10% to the Robe proportion in the three contrasts that could be made (Figure 16). The confidence intervals overlapped, but the effect was significant in the HRE-r11 and HRE-r20 groups, and highly significant when pooling the three groups (46/246 vs 172/575, \(\chi^{2}\) = 10.54, df = 1, one-sided p < 0.001).

4.2.3 Adept effect and base success rate

ADEPT <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Adept, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADEPT <- cbind(ADEPT, 
      binom.confint(ADEPT$Robe, n = ADEPT$Total, methods = "wilson"))

ADEPT <- ADEPT |> 
  filter(!(Condition %in% c(5,6))) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Adept, "), "")))
  
ADEPT |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Adept)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)

FIGURE 17: Effect of Adept title. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals

ADEPTwide <- ADEPT |> 
  pivot_wider(
  names_from  = c(Adept),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = max(NO, na.rm = T), YES = max(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
adeptdiff <-  round(mean(ADEPTwide$difference), digits = 4)

(adeptEst <- binom.confint(x= 2, n = 223, methods = "wilson"))
##   method x   n       mean       lower      upper
## 1 wilson 2 223 0.00896861 0.002462972 0.03210504
(baseEst <- binom.confint(x= 0, n = 217, methods = "wilson"))
##   method x   n mean        lower      upper
## 1 wilson 0 217    0 1.704549e-18 0.01739465

The adept effect was too small to be precisely measured with these data (Figure 17), but it is clearly smaller than the HRE and advisor effects. On average (over the 6 contrasts) the effect was to increase the proportion by 0.022 (i.e. to add 2.2% to the success rate), but the estimates varied from -0.006 to 0.08. Although we do not know the exact base rate, it is most reliable to simply estimate the adept effect from the two samples with the Adept title but without the HRE or advisor effects, because the control group estimates otherwise suffer from uncertainty as well. If we pool the two samples (which only differed in alchemy rank) we get 2 robes out of 223 trials (0.009), with a 95% confidence interval of 0.0025–0.032. We are therefore 95% certain that the true effect of the Adept title is less than 3.2%, with a point estimate of 0.9% (minus the base rate).

The base rate is very low, and below the detection limit of the study. No robes resulted from the 217 trials in the two r11/r20 groups without adept title or other effects (from the two lower panels in Figure 17). If we pool these we get a 95% confidence interval of 0–0.017. We are thus 95% certain that the true base rate is below 1.7%, with a point estimate of 0.0%. It should be noted that the base rate is known not to be zero – it is possible to produce the robe without adept title or any other effects active.
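
A common shortcut gives a similar bound: with 0 successes in n trials, the “rule of three” approximates the 95% upper limit as 3/n.

3 / 217   # ~0.014, comparable to the Wilson upper limit of ~0.017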

4.2.4 Alchemy skill rank, EX, Oxford and aide

There was no effect of alchemy rank on the probability of getting a Robe (Figure 18): in all five comparisons the treatment (r20) and control (r11) groups were very similar, relative to the uncertainty and to the readily evident effects of HRE and Advisor.

ALCH <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Alchemy, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ALCH <- cbind(ALCH, 
      binom.confint(ALCH$Robe, n = ALCH$Total, methods = "wilson"))

ALCH <- ALCH |> 
  filter(Condition %in% c(2,6:14)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", r11"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c(", r20"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r11"), "Nothing"))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r20"), "Nothing")))
  
ALCH |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = as.factor((Alchemy)))) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  labs(colour = "Alchemy rank")+
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)

FIGURE 18: Effect of Alchemy rank (r11 or r20). Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals

EX equipment had no effect either. Two comparisons were available: in one the treatment group estimate was higher than the control group (Figure 19, top), and in the other it was lower (Figure 19, bottom), in both cases with considerable overlap of the confidence intervals. To illuminate the issue further I included the group with the Oxford effects on in addition to EX (Figure 19, middle). If anything this group should be higher than, or at least similar to, the group without Oxfords. In fact it was lower, and lower than the control group in the top panel. Taken together this shows no measurable effect of EX equipment on Robe success.

EX <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(EX, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
EX <- cbind(EX, 
      binom.confint(EX$Robe, n = EX$Total, methods = "wilson"))

EX <- EX |> 
  filter(Condition %in% c(1:5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Ex"), "")))

EXwide <- EX |> 
  pivot_wider(
  names_from  = c(EX),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
EXdiff <-  round(mean(EXwide$difference), digits = 4)  

EX |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = EX)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)

FIGURE 19: Effect of EX equipment. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals. Also includes a group with Oxfords on, but otherwise similar to the top panel.

The same comparison also reveals no increased Robe success from Oxford skills (for clarity shown separately in Figure 20). Unfortunately only this one treatment vs. control contrast was available, but if anything success was reduced with Oxfords on, which points to random noise.

OXF <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Oxfords, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
OXF <- cbind(OXF, 
      binom.confint(OXF$Robe, n = OXF$Total, methods = "wilson"))

OXF <- OXF |> 
  filter(Condition %in% c(3,5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Oxf"), "")))

OXFwide <- OXF |> 
  pivot_wider(
  names_from  = c(Oxfords),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
OXFdiff <-  round(mean(OXFwide$difference), digits = 4) 

OXF |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Oxfords)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)

FIGURE 20: Effect of Oxford skills active. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals

All trials in the Robe set-up were done with aide in paymaster 100s. However, a comparison with the FAB experiments allows an estimate of a paymaster aide effect to be made since only very few had an aide at all in the FAB trials. The relevant condition is HRE, with rank 11 or 20 alchemy. These were pooled to increase sample size (rank had no effect). The FAB experiments were done with HRE, at r12–14 (included in Table 1 for convenience). Figure 21 shows that having aide in paymaster did not improve the chance of producing a Robe. If anything, the probability was lower with paymaster but the difference is well within the uncertainty.

Paymaster <- read_ods("~\\FABtory\\paymaster.ods", sheet =1)
Paymaster$Condition3 <- (as.factor(Paymaster$Condition3))
Paymaster$Condition3 <-fct_relevel(Paymaster$Condition3, "HRE (r11+r20)", "HRE (r12-r14)")

Paymaster |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Paymaster)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = -1, hjust = -0.5, size = 3)

FIGURE 21: Effect of aide in Paymaster. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals. Note that the control group is from the FAB trials.

Note that the scale varies between Figures 18–21.

4.2.5 Effects on failure and minor success

#proportion minorsuccess
RST <- RobeSummaryTable
#remove row for totals
RST <- RST |> 
  filter(Batch < 27) |> 
  mutate(Condition = as.numeric(Condition))
RST <- RST |> 
  mutate(Minor = (FineVelvet+Justaucorps+Waistcoat)) 
RST <- RST |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total) 
RST <- RST |> left_join(Design |> 
                          select(Condition, Condition2), by="Condition")
RST <- cbind(RST, 
      MinorProp=binom.confint(RST$Minor, n = RST$Total, methods = "wilson"))

#combine batches
RST2 <- RST |> 
  group_by(Condition2) |> 
  summarise(
    Condition = mean(Condition),
    Failure = sum(Failure),
    FineVelvet = sum(FineVelvet),
    Justaucorps = sum(Justaucorps),
    Robe = sum(Robe),
    Waistcoat = sum(Waistcoat),
    Total = sum(Total),
    Minor = sum(Minor)
  ) |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total,
         propRobe = Robe/Total,
         propFailure = Failure/Total,
         propJustau = Justaucorps/Total,
         propVelvet = FineVelvet/Total,
         propWaist = Waistcoat/Total,
         propJustau_minor = Justaucorps/Minor,
         propVelvet_minor = FineVelvet/Minor,
         propWaist_minor = Waistcoat/Minor
         ) 
RST2  <- RST2 |> 
  mutate(Condition2 = as.factor(Condition2)) |> 
  mutate(Condition2 = fct_reorder(Condition2, propRobe, .fun = mean)) |> 
  mutate(Condition2 = fct_rev(Condition2))
RST2 <- cbind(RST2, 
              propMinor = binom.confint(RST2$Minor, n = RST2$Total, methods = "wilson"))
#overall proportion minor success:
propMinorOverall <- sum(RST2$Minor)/sum(RST2$Total)
propMinorOverall2 <- mean(RST2$propMinor)

RST2 |> 
  ggplot(aes(y = Condition2, x = propMinor)) +
  ylab(NULL) +
  xlab("Proportion minor success") +
  geom_vline(xintercept = propMinorOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propMinor.lower, xmax = propMinor.upper), 
                  size =0.5, linewidth = 0.75,
                  position = position_dodge(width=1)) +
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

FIGURE 22: Proportion of minor success (Fine Velvet, Justaucorps or Waistcoat) under different experimental conditions

So far we have only considered Robes. We will now look more closely at the other outcomes: failure and minor successes.

The proportion of minor success (Justaucorps, Waistcoat and Fine Velvet together) was remarkably constant under all conditions (overall mean = 0.361), and thus seems fixed at a ratio of roughly 2:3 (i.e. about 35–40% minor success against 60–65% failures plus huge success), unaffected by the factors that influence huge success (Figure 22).
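As a quick check of this constancy, the minor-success counts can be compared across conditions with a test of equal proportions. This is a sketch using the RST2 object built above, not part of the original analysis:

# Sketch: k-sample test of equal minor-success proportions across conditions,
# based on the aggregated counts in RST2 (assumes RST2 as constructed above).
minorHomogeneity <- prop.test(x = RST2$Minor, n = RST2$Total)
minorHomogeneity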

The surprising consequence is that all changing the conditions can achieve is to shift outcomes between the other two categories (failure and huge success) within their fixed combined share.

RST2 <- cbind(RST2, 
              propFailure = binom.confint(RST2$Failure, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propRobe = binom.confint(RST2$Robe, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau = binom.confint(RST2$Justaucorps, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet = binom.confint(RST2$FineVelvet, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist = binom.confint(RST2$Waistcoat, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau_minor = binom.confint(RST2$Justaucorps, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet_minor = binom.confint(RST2$FineVelvet, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist_minor = binom.confint(RST2$Waistcoat, n = RST2$Minor, methods = "wilson"))
rank11 <- c(6,7,9,11,12)
RST2 <- RST2 |> 
  mutate(AlchemyRank = ifelse(Condition %in% rank11, 11, 20))

propFailureOverall <- sum(RST2$Failure)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propFailure.mean)) +
  ylab(NULL) +
  xlab("Proportion Failure") +
  geom_vline(xintercept = propFailureOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propFailure.lower, xmax = propFailure.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propRobeOverall <- sum(RST2$Robe)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propRobe.mean)) +
  ylab(NULL) +
  xlab("Proportion Adept's Robe") +
  geom_vline(xintercept = propRobeOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propRobe.lower, xmax = propRobe.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propJustauOverall <- sum(RST2$Justaucorps)/sum(RST2$Total)

propJustauR11 <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)
propJustauR20 <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)

JustauPropAlchemy <- RST2 |>
  group_by(AlchemyRank) |>
  summarise(Justaucorps = sum(Justaucorps), Total = sum(Total))

(JustauAlchRes <- 
    binom.wilson(JustauPropAlchemy$Justaucorps, JustauPropAlchemy$Total))
##   method   x    n       mean     lower     upper
## 1 wilson  15  540 0.02777778 0.0169047 0.0453220
## 2 wilson 178 1422 0.12517581 0.1089796 0.1433917
JustauAlchRes$Group <- as_factor(c("r11", "r20"))
#test if equal proportions
(JustauAlchResTest <- 
    prop.test(x = JustauAlchRes$x, JustauAlchRes$n, alternative = "two.sided"))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  JustauAlchRes$x out of JustauAlchRes$n
## X-squared = 40.77, df = 1, p-value = 0.0000000001713
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.12076502 -0.07403104
## sample estimates:
##     prop 1     prop 2 
## 0.02777778 0.12517581

RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Admiral Justaucorps") +
  geom_vline(xintercept = propJustauOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau.lower, xmax = propJustau.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")
 

propVelvetOverall <- sum(RST2$FineVelvet)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet.mean)) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet") +
  geom_vline(xintercept = propVelvetOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propVelvet.lower, xmax = propVelvet.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propWaistOverall <- sum(RST2$Waistcoat)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist.mean)) +
  ylab(NULL) +
  xlab("Proportion Waistcoat") +
  geom_vline(xintercept = propWaistOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propWaist.lower, xmax = propWaist.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

FIGURE 23: Proportion of all possible outcomes: Failure; Robe; Justaucorps; Fine Velvet; Waistcoat. Dashed lines show the mean for each item.

Figure 23 shows the proportion of all possible outcomes, including Robes again for ease of comparison.

Since the proportion of minor success was constant, the proportion of failure (Figure 23, upper panel) was in consequence largely a mirror image of the proportion of Robes: conditions that increased the proportion of Robes lowered the proportion of failures, and vice versa. As with Robes, we can readily see three clusters: Advisor has the fewest failures, HRE alone somewhat more, and without HRE failures are considerably more frequent, with the adept effect less clear (Figure 23).

The proportion of the various minor successes appeared to be unaffected by changing the conditions, except for an effect of alchemy rank on Justaucorps, discussed below.

#proportions innen minor
#prop.test
propJustauOverall_Minor <- sum(RST2$Justaucorps)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propJustauR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
propJustauR20_Minor <- RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Justaucorps among minor successes") +
  geom_vline(xintercept = propJustauOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau_minor.lower, xmax = propJustau_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")


propVelvetOverall_Minor <- sum(RST2$FineVelvet)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propVelvetR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)
propVelvetR20_Minor <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)

RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet among minor successes") +
  geom_vline(xintercept = propVelvetOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propVelvetR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propVelvetR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propVelvet_minor.lower, xmax = propVelvet_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")

propWaistOverall_Minor <- 
  sum(RST2$Waistcoat)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propWaistR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
propWaistR20_Minor <- 
  RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Waistcoat among minor successes") +
  geom_vline(xintercept = propWaistOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propWaistR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propWaistR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propWaist_minor.lower, xmax = propWaist_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")

FIGURE 24: Proportion of the three possible outcomes of minor success under different conditions. Dashed lines shows the mean

Figure 24 shows the proportion of outcomes under different conditions when the result was a minor success (expressed as share of the sum of minor success, not total trials).

There are three possible minor success outcomes. Among these, Admiral Justaucorps can be considered a higher-tier outcome than Fine Velvet or Waistcoat. If we look at the outcomes within the minor success group, there are no effects of the experimental conditions that did affect huge success: Justaucorps is not any more likely with adept, the HRE policy or advisor active. However, there is now a noticeable effect of alchemy rank (which had no effect on Robes or failure). Significantly more Justaucorps were produced at r20 than at r11 alchemy, under all conditions where alchemy rank could be contrasted (Figure 24, upper panel). Overall, the proportion of Justaucorps at r11 alchemy was clearly lower than at r20 (15/540 vs 178/1422, \(\chi^{2}\) = 40.77, df = 1, two-sided p < 0.0001).

In other words, if you for some reason want to make Admiral Justaucorps, it does not help to have the adept title, the HRE policy or the advisor position; only a higher alchemy skill rank will help. It is a rare outcome at rank 11 – the base probability can be estimated at 2.8%. At r20 it increased to 12.5%, i.e. a gain of about 1 percentage point per rank. At r20 the three minor success possibilities are roughly equally likely. For the other two items there did not seem to be any effects, apart from both simply decreasing a little in frequency at r20 as a consequence of the increased Justaucorps (Figure 24, two lower panels).
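For reference, the implied per-rank gain can be computed directly from the two pooled estimates above (a rough sketch, assuming an approximately linear effect between r11 and r20):

# Rough sketch: per-rank gain in Justaucorps probability implied by the pooled
# r11 and r20 estimates (assumes a roughly linear effect over skill ranks).
perRankGain <- (propJustauR20 - propJustauR11) / (20 - 11)
round(perRankGain, 4)  # about 0.011, i.e. roughly +1 percentage point per rank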


#per batch:
RST |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")
#per condition (more precise, but fewer data points):
RST2 |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")

cor.test(RST$Failure/RST$Total, RST$Robe/RST$Total)
## 
##  Pearson's product-moment correlation
## 
## data:  RST$Failure/RST$Total and RST$Robe/RST$Total
## t = -11.85, df = 24, p-value = 0.00000000001622
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9657742 -0.8360694
## sample estimates:
##        cor 
## -0.9241368
cor.test(RST2$Failure/RST2$Total, RST2$Robe/RST2$Total)
## 
##  Pearson's product-moment correlation
## 
## data:  RST2$Failure/RST2$Total and RST2$Robe/RST2$Total
## t = -14.783, df = 12, p-value = 0.000000004592
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9918362 -0.9164908
## sample estimates:
##        cor 
## -0.9736247
ppcor::pcor.test(RST$Robe, RST$Failure, RST$Total)
##     estimate           p.value statistic  n gp  Method
## 1 -0.8950273 0.000000001567685 -9.623968 26  1 pearson
ppcor::pcor.test(RST2$Robe, RST2$Failure, RST2$Total)
##     estimate       p.value statistic  n gp  Method
## 1 -0.9100018 0.00001583437 -7.279556 14  1 pearson

FIGURE 25: Scatterplot of proportion robe vs. proportion failures per experimental batch (upper) and condition (lower). Dotted line shows a situation where the sum of robes and failures have a fixed share of 65% of the total (y = 0.65-x). The red line is the actual regression

Failure and robes were very strongly negatively related, in a way consistent with a common share fixed at 60-65% of the outcomes (Figure 25). In the figure I used a model II major axis regression instead of ordinary least squares, because we do not want to predict y from x but rather describe the slope of the relationship, and both variables are subject to random variation.1
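For the curious, the major-axis slope can also be obtained without ggpmisc, as the direction of the first principal axis of the covariance matrix. This is a base-R sketch, not the code used for the figure:

# Base-R sketch of the major-axis (model II) slope: the first eigenvector of
# the covariance matrix of (x, y) defines the principal axis; its ratio is the slope.
ma_slope <- function(x, y) {
  v <- eigen(cov(cbind(x, y)))$vectors[, 1]
  v[2] / v[1]
}
with(RST2, ma_slope(Robe / Total, Failure / Total))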

In fact, such a strong negative relationship is inevitable if the share of outcomes that are minor successes (Figure 22) is fixed, leaving Robe and failure a constant combined percentage to share between them, irrespective of experimental conditions. Changing the conditions will merely slide the outcome proportions up and down along the line in Figure 25.

Moreover, inspection of the correlation matrix between all five outcomes (Figure 26) shows that only the proportion of Robes is strongly related to failure. The proportion of Justaucorps is negatively related to Fine Velvet and Waistcoat, but less strongly, as expected because three outcomes covary there rather than only two, as for Robe and failure.2

RST2matrix <- RST2 |> 
  select(propRobe, propFailure, propMinor, propJustau, propVelvet, propWaist, propJustau_minor, propVelvet_minor, propWaist_minor) |> 
  cor()

corrplot.mixed(RST2matrix,
 upper = "ellipse",
 lower = "number",
 order = 'AOE', 
 tl.col = 'black',
 tl.pos = "lt",
 diag = "n",
 tl.srt = 30,
 bg = "#EBEBEB",
 #mar = c(0,1,1,1)
 )

FIGURE 26: Correlogram for proportion of different outcomes. Lower triangular shows the correlation matrix, upper triangular shows the ellipsis matrix, where the eccentricity of the ellipse is scaled to the correlation value

4.2.6 Non-randomness and biases in the underlying RNG algorithm

# sequence plots robe vs non-robe
#whole sequence:
ggplot(Robe2, aes(x=Sequence, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_brewer(palette = "RdBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)

FIGURE 27: Sequence of outcome for all robe experiments

A fundamental assumption in the analyses presented above is that each trial is statistically independent and that the probability of success is the same for all trials for which a confidence interval is calculated. This need not be the case and should be confirmed.

To test for the presence of biases in the pseudo-random number generator algorithm used, we can look at the sequence of outcomes (Figure 27). Non-randomness would show up either as more clustering of the same result, or as results more evenly spread out than expected by chance. One way of measuring this is by counting the number of “runs” – segments of the sequence with the same value. For example, if there are certain “lucky times” (where one should continue producing) and certain bad ones (where one should immediately cancel), identical outcomes would cluster together and there would be fewer runs in the data than expected by chance.

We only consider a binary outcome of Robe vs. non-Robe, shown in Figure 28 below.3 There was a highly significant deviation from randomness in the complete sequence (Wald-Wolfowitz runs-test, p = 0.003).
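For reference, a runs test of this kind can be computed on the dichotomized sequence with tseries::runs.test. This is a sketch, not necessarily the exact call behind the reported p-value:

# Sketch: Wald-Wolfowitz runs test on the whole dichotomized sequence.
# runs.test() from the tseries package expects a factor with two levels.
runs.test(factor(Robe2$Result == "Robe"))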

ggplot(Robe2, aes(x=Sequence, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)

FIGURE 28: Sequence of outcome for all robe experiments, dichotomized

However, this is entirely as expected because we introduced a potential source of non-randomness, namely changing the conditions. If the conditions indeed affect the outcome of the experiments we thus cannot use the complete sequence to distinguish between the effect of the conditions and the presence of any other non-randomness.

Instead we must look at the batch sequences separately (Figure 29), or the combined sequence of several batches done under the same experimental conditions (Figure 30). These two can differ, for example if closing and starting the production dialogue window somehow matters: Then we should find a deviation when looking at combined sequences from several batches, but not when looking at single batches.4
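A per-batch version can be sketched as follows (roughly how a table like runsTableBatch in the Appendix could be built; the exact code used is not shown here, and batches containing only one outcome have to be skipped):

# Sketch: runs tests per batch, skipping batches with only one outcome level
# (runs.test needs both Robe and non-Robe present).
runsPerBatchSketch <- Robe2 |>
  split(Robe2$Batch) |>
  purrr::keep(\(d) length(unique(d$Result == "Robe")) == 2) |>
  purrr::map(\(d) tseries::runs.test(factor(d$Result == "Robe"))) |>
  purrr::map_dfr(broom::tidy, .id = "Batch")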

# per batch:
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 120, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)

FIGURE 29: Sequence of outcome for robe experiments, per batch

# per condition:
# first make a sequence within condition.

Robe2 <- Robe2 |> 
  group_by(Condition) |> 
  mutate(StartSeq = min(Sequence),
         CondSeq = Sequence-StartSeq+1)
Robe2 <- ungroup(Robe2)
         
ggplot(Robe2, aes(x=CondSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3)), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 220, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)

FIGURE 30: Sequence of outcome for robe experiments, per condition

There were very few significant deviations from randomness of runs in the separate batches, and none in the combined batches under the same conditions (runs-test results in the Appendix), which, considering the number of tests performed, is what is expected under randomness. Hence, the pseudo-random number generator producing the results appears to be unbiased and consistent with a random process. This result (non-randomness in the complete sequence but not within batches or conditions) therefore gives additional statistical confirmation that the experimental conditions affected the outcome.

Another approach to detecting non-randomness is to treat the sequence of outcomes as a time series and look for auto-correlation, i.e. correlation between the series and a lagged version of itself. Auto-correlation could for example be caused by cycles or trends, or arise more generally whenever the outcome of one experiment is related to the outcome of the previous one (or of one a certain number of steps away).

Auto-correlation function (ACF) plots and partial auto-correlation function plots (i.e., controlled for auto-correlation at shorter lags) are presented in the Appendix. Again, any (weak) auto-correlation could only be seen in the complete sequence of experiments and not in the separate batches, as expected if the only source of non-randomness in the outcomes was the varying of the factors studied. In other words, the time series under constant conditions looked like white noise, which is what we expect from a random process.
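A minimal way to produce such plots for the whole sequence is sketched below (the Appendix objects such as ACFwhole presumably wrap something similar, but the exact code is an assumption):

# Minimal sketch: auto-correlation and partial auto-correlation of the
# dichotomized outcome sequence, coded 0/1 for non-Robe/Robe.
robe01 <- as.numeric(Robe2$Result == "Robe")
acf(robe01, lag.max = 30, main = "ACF, whole Robe sequence")
pacf(robe01, lag.max = 30, main = "PACF, whole Robe sequence")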

5 Discussion

“I guess we’ll be traveling in uncharted waters.”
“That’ll be fun”, Phil said.5

Lemony Snicket (A Series of Unfortunate Events)

In the English version of UWO, the in-game descriptions of the adept title, the HRE policy and the advisor role all refer to an increased “chance of success” of alchemy experiments. This is somewhat vague, as it does not explicitly state whether it refers to “huge success”, to any success (as opposed to failure), or to both. The results presented here show that these effects specifically target the chance of “huge success” and have no effect on the other outcomes.

Compared to the normal rate (with adept title) of approximately 1 robe or biretta per 100 experiments, it is clear that the HRE effect is enormous: with HRE I got approximately 1 robe per 5 experiments, or 20% success. With the advisor role on top, success increased by a further 10 percentage points, to about 30%.

It was surprising that the general HRE effect is larger than the court advisor effect; it would arguably have made more sense the other way around. One wider ramification of this finding is that the other HRE effects are perhaps of a similar magnitude, which fits with observations of e.g. the ganador drop rate HRE effect.

Making the FAB album during an HRE boost obviously saves a lot of materials and suffering, and as shown here it is easily done for any character, even low-level ones in a starter barca, provided they have access to rank 2 quarters (and some help).

However, deciding to do further experiments on Robes did not save a lot of materials and suffering.

By comparing experiments where certain conditions were manipulated and all other factors were held constant, it was possible to measure the effect of various factors thought to possibly influence the success of alchemy experiments. There was no discernible effect of having or not having Oxford production skills active (including item improvement technique). If anything, the success was better with Oxfords off. Also, the EX gear for better tier great success did not seem to influence alchemy experiments. Alchemy rank and paymaster aide rank had no effect either. All this is as expected from the “alchemy experiments are different” camp, and supports the view that huge success in alchemy experiments is not analogous to high tier great success of normal crafting.

However, alchemy rank did have an effect on minor success, with higher rank increasing the probability of Admiral Justaucorps. The magnitude of the effect was estimated at +1 percentage point per skill rank above the minimum requirement. This suggests that the minor success outcome may be influenced by the same factors as normal crafting. But why, then, did only alchemy rank affect the outcome of minor success? I suspect that Oxford and EX actually did have an effect, but only on the stats of the items. This was not tested, and the data are now lost, since no record was kept of which experiment each Adept’s Robe originated from. Just like Darwin (who did not label which particular island in the Galapagos each of his finches had been collected on), I can say in hindsight that this should have been done. It must be left to future studies to investigate whether alchemy rank, Oxford and EX affect the stats of minor and huge success alchemy experiments.

An unexpected result was that the rate of minor success was constant under all treatments. Increasing the rate of justaucorps by having higher alchemy rank did not change the rate of failures or robes, but only shuffled the proportions of the other minor success options within their fixed common share of all outcomes.

My interpretation is that the factors influencing the chance of great success in normal crafting play no role in alchemy experiments because, unlike in normal crafting, the rate of failure is not a separate parameter in alchemy experiments: it is simply the complement of the rate of huge success (Figure 25), since the minor success rate is fixed.
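In symbols, with the minor-success share fixed at roughly 0.35, \(P(\mathrm{failure}) = 1 - P(\mathrm{minor}) - P(\mathrm{huge}) \approx 0.65 - P(\mathrm{huge})\), which is exactly the dashed reference line in Figure 25.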

For some alchemy experiments more than one kind of huge success is possible (such as plates from processed lumber and metal works). I do not have any data on what, if anything, might affect the rates of these different huge successes. I also have no data on the different minor successes for recipes other than robe. A reasonable conjecture is that these are also affected by alchemy rank, and that their combined share of the outcomes is constant, as for the robe recipe.

Perhaps surprisingly, there was no measurable effect of having the Adept title, although the effect could be bounded with high confidence to less than 3 percentage points; it may well be around 1%. Larger sample sizes than obtained here would be needed to estimate such small effects precisely. This is illustrated by the replication done for some conditions, which showed considerable inter-batch variation (sometimes of the same magnitude as the variation among conditions). Differences between batches subject to exactly the same conditions could be mistaken for some kind of bias in the RNG6, but are in reality what is expected from small samples of a random process, as shown by the tests here that failed to detect any non-randomness.

The effect of adept is clearly very small compared to the HRE effects, and in practice it does not matter under HRE conditions. Outside of HRE, however, it does matter. In fact, if the adept effect is indeed +1%, my guess is that the base rate without the adept title is around 0.1%. In terms of expected materials consumed, an increase from 0.1% to even 1% success is substantial, whereas an increase from 30% to 31% is negligible, as the quick calculation below illustrates.
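The expected number of attempts per huge success is simply 1/p, so (an illustrative back-of-the-envelope calculation using the rates discussed above):

# Expected attempts per huge success at different success rates (1/p).
# Going from 0.1% to 1% saves ~900 expected attempts; 30% to 31% saves almost none.
p <- c(0.001, 0.01, 0.30, 0.31)
setNames(round(1 / p, 1), paste0(100 * p, "%"))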

Since the adept effect is so small I did not try to test another uncertainty: whether one can leech the adept effect by being apprentice to an adept. It is plausible that this works (as for some other titles), and it could be worth exploiting for obtaining FAB in normal situations when the HRE effect is absent. However, one could not then simultaneously get the +1 alchemy from being apprentice to an alchemy meister. In the FAB experiments all characters were apprentice to a meister for the +1. There is no reason to think this could have affected the outcome, because the success effect goes the other way – to the meister, not the apprentice – and such effects would not have affected the chance of huge success anyway (according to the present report).

The data on robe success from the FAB experiments were collected in a slightly different way than in the Robe experiments. In the FAB study, data gathering for one player was terminated when a robe was obtained, whereas in the Robe experiments it was terminated when no more base clothes were left in the inventory. One could therefore argue that the FAB set-up would bias the results in favour of success compared to the Robe set-up. However, if the observations are independent this does not in fact create a bias, because the data series can be considered as continuing in the next player, in the same way as for the Robe experiments.7
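A small simulation illustrates the point (a sketch under the assumption of independent trials with a fixed success probability; the numbers are illustrative, not from the data):

# Sketch: each character stops at their first success (geometric number of
# attempts); pooling successes over total attempts still recovers the true rate.
set.seed(1)
p <- 0.2; nChars <- 50
est <- replicate(2000, {
  attempts <- rgeom(nChars, p) + 1   # attempts needed per character (>= 1)
  nChars / sum(attempts)             # pooled successes / pooled attempts
})
round(mean(est), 3)                  # close to the true 0.2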

It would have been interesting to see how large the individual effect of the court alchemist role is compared to the global effect for all supporters of the HRE policy. If the effects found in this study are additive it should be +10%. This could not be fully tested here, since the only character with the advisor effect by necessity also had the global effect active; no player had only the HRE advisor position. This must be tested during elections where there is no global alchemy success policy but the court alchemist position is still available. Such elections seem to be rare, however – court alchemist is often not present as an option. (Note that there are skill rank and job requirements for certain advisor roles.)

This is not a pressing concern, though, since the only item worth making is the adept’s robe, as base material for Alchemist Robe EX (see the Appendix for how to make Robe EX). After this event it is fair to say we have an over-abundance of adept’s robes, and the market for Alchemist Robe EX is not endless (although I made and sold 125 of them after the last time we had this HRE effect, years ago). The Robe EX also has a new competitor in the form of the Tianzhu Treasure Sword from grande ganador, which has the same EX rank and sells for more than Robe EX.

Indeed, the market for FAB is also limited, and it will be a challenge to sell off the ones made during this study. But this was done for Science, not money, of course.

The main bottleneck for obtaining mats for FAB turned out, unsurprisingly, to be the knife losing its gold, needed for the Gold Short Sword. To secure the maximum number of knives, I let a character without the HRE effect make the knives, so that fewer gold swords would be produced and thus more knives left. Or so I thought in my ignorance. As it turns out, HRE should have no effect on a minor success such as this knife (but alchemy rank would), if the results from the robe experiments hold in general. Unfortunately I did not keep records of the outcomes, but I can state that far too many unwanted huge successes came out of this.

In the FAB experiments, any small differences between the “easy” recipes were likely due to random variation. An outlier was blue ore, which on average was almost as hard to succeed with as robe, but there is no reason to believe the blue ore recipe differs from the other coloured ores. My best estimate is that they all had a 25% chance of success, which implies a 5% base chance if the HRE effect is the same for all items. An earlier hunch that the knife was harder than the others appears to be wrong; it probably only seemed more difficult because its base material is much harder to obtain than for the others.

There is no reason to believe that the HRE effect (or any other effect) acts differently for different items. Instead, the lower success of Robe/Biretta compared to other items is likely to be due to different base probabilities, to which is added a common fixed increase in the probability of huge success from Adept, HRE and advisor. In other words, I make the assumption that the effects are additive to each other and to the base rate and are the same for all the MA items. An outline of a theory for alchemy experiments, summarising what I have discussed, is presented in Figure 31.


FIGURE 31: Proposed theory of the outcome of alchemy experiments

The theory yields some predictions that can be tested. In particular, that the independent effect of the advisor position is +10%, and that the proportion of failures would be around 55% with only the advisor effect on. The court alchemist advisor position is rare, and I am not certain whether it can actually be obtained from an emperor who does not have the support alchemy policy. But if it can, it would be interesting to do more Robe experiments with only the Advisor effect on. Easier to test is that processed lumber and the other non-Robe/Biretta items should yield 5% MA items in experiments conducted outside HRE/advisor without Adept, and 6% with the Adept title on. It would also be interesting to look at the “minor huge successes” from these experiments, such as non-FAB armour plates, and how those fit into the picture. The Robe and Biretta recipes do not have multiple huge success categories, so I have no data on this. Another unexplored issue is whether the ratio of minor success to failure/huge success is fixed at 2:3 also for other recipes than Robe.
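To make the predictions explicit, the proposed additive model can be written as a tiny helper. This is a sketch using the effect sizes estimated in this report (+1% Adept, +10% Advisor, +20% HRE); the values are estimates, not confirmed constants:

# Sketch of the proposed additive model: P(huge) = base + Adept + Advisor + HRE.
predict_huge <- function(base, adept = FALSE, advisor = FALSE, hre = FALSE) {
  base + 0.01 * adept + 0.10 * advisor + 0.20 * hre
}
predict_huge(0.001, adept = TRUE, advisor = TRUE, hre = TRUE)  # Robe: ~0.31, as observed
predict_huge(0.05,  hre = TRUE)                                # easy items under HRE: ~0.25
predict_huge(0.05,  adept = TRUE)                              # prediction: 6% with Adept only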

6 Conclusions

Despite some shortcomings, the results presented here show several clear patterns. My conclusion (summarised in Figure 31) is that the chance of minor success is fixed at around 35–40% and cannot be altered by any known factor, and that increasing the probability of huge success is therefore the same as lowering the probability of failure. The chance of huge success can be modified as follows:

  • Adept: +1%
  • HRE advisor effect: +10%
  • HRE general effect: +20%
  • Oxford: not applicable to alchemy experiments
  • EX: not applicable to alchemy experiments
  • Paymaster aide: not applicable to alchemy experiments
  • Skill rank: no effect on huge success or failure, but moderate effect within minor success

7 Acknowledgments

I want to thank those who lent me their characters for the advancement of science. A big thank you to Marsali, who gave me some much needed fine dye and canned silk thread, and to Altavista, Guilder, Marsali and TardyMcGormless for discussions about this and that.

8 Appendices

8.1 How to make items for FAB


FIGURE 32: Recipes and best practices for making FAB items

8.2 How to make Robe EX from Adept’s Robe


FIGURE 33: How to make Robe EX

8.3 FAB raw data

DT::datatable(FAB[,-1], options = list(pageLength = 10), caption = 'FAB experiments raw data. Values are number of attempts needed')

8.4 Robe raw data

DT::datatable(Robe, options = list(pageLength = 10), caption = 'Robe experiments raw data')

8.5 Runs-tests

runsTableBatch
runsTableCond

8.6 Auto-correlation

ACFwhole
PACFwhole

FIGURE 34: Auto-correlation (upper panel) and partial auto-correlation (lower panel) function plot for the whole Robe sequence

BatchACF

FIGURE 35: Auto-correlation function plots for Robe batches

ConditionACF

FIGURE 36: Auto-correlation function plots for Robe conditions

BatchPACF

FIGURE 37: Partial auto-correlation function plots for Robe batches

ConditionPACF

FIGURE 38: Partial auto-correlation function plots for Robe conditions

8.7 R code for this report

p.caption {
  color: #777;
}
caption {
  color: #777;
}
knitr::opts_chunk$set(fig.width = 5, fig.align = 'center', 
                      collapse = TRUE, warning = FALSE, message = FALSE)
#packages
library(readODS)
library(ggplot2)
library(dplyr)
library(tidyr)
library(kableExtra)
library(forcats)
library(purrr)
library(tseries)
library(ggfortify)
library(binom)
library(broom)
library(stringr)
library(ggpmisc)
library(corrplot)
library(DT)
library(shades)
#data
FAB <- read_ods("~\\FABtory\\FABtory.ods", sheet =2)
FAB2 <- FAB  |>  pivot_longer(
  cols = Biretta:WhiteOre,
  names_to = "Item",
  values_to = "AttemptsNeeded"
  )
Snitt <- FAB2  |> 
  group_by(Toon, Who) |>
  summarise(MeanAttemptsNeeded = mean(AttemptsNeeded))
Robe <- read_ods("~\\FABtory\\FABtory.ods", sheet =3)
Design <- read_ods("~\\FABtory\\FABtory.ods", sheet =4)
RobeSummaryTable <- read_ods("~\\FABtory\\FABtory.ods", sheet =6)
knitr::include_graphics('FABpic.png')
knitr::include_graphics('HRE.png')
knitr::include_graphics('robeSuccess.png')
kbl(Design[,-2], caption = 'Experimental design for Robe experiments. Group 15 is the Robe part of the FAB setup for comparison.', align ="c") |> 
  kable_paper(c("hover", "condensed"), full_width = F)
knitr::include_graphics('mentored.png')

ggplot(FAB2, aes(AttemptsNeeded)) +
  geom_histogram(fill = "#377eb8", colour = "black", binwidth = 1)

ggplot(FAB2, aes(AttemptsNeeded)) +
  stat_density(aes(colour = AlchemySkill),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  scale_colour_brewer(palette = "RdBu", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1))
FAB3 <- FAB2 |>
  mutate(Item = reorder(Item, AttemptsNeeded, FUN = mean, na.rm = TRUE))
ItemLabels <- FAB3 |>
  group_by(Item) |>
  summarise(snitt = mean(AttemptsNeeded), median =median(AttemptsNeeded))
ItemLabels$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels$snitt)
ItemLabels

ggplot(FAB3, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(Item))) + 
  facet_wrap(vars(Item), ncol = 2) +
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide = "none") +
  scale_fill_brewer(palette = "RdBu", guide = "none", direction = -1)+
  geom_label(x = 12.5, y = 17.5, aes(label = snitt2), data = ItemLabels, parse = TRUE, 
             hjust = "center", size = 3, label.size = NA)
ggplot(FAB3, aes(x=AttemptsNeeded, y=Item, fill=Item)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide="none")
  scale_fill_brewer(palette = "RdBu", guide = "none", direction =-1)

FAB4 <- FAB3 |>
  mutate(ItemGroup = ifelse(Item == "Biretta" | Item == "Robe" , "Robe/Biretta", "Others"))
FAB4$ItemGroup <- as_factor(FAB4$ItemGroup)
ItemLabels2 <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(snitt = mean(AttemptsNeeded), 
            median =median(AttemptsNeeded), 
            n = (length(AttemptsNeeded)))
ItemLabels2$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels2$snitt)
ItemLabels2$n2 <- sprintf("n == %.0f", ItemLabels2$n)
ItemLabels2$snittAndN <- sprintf(
  "bar(x) == %.2f * ',' ~~ n == %.0f",
  ItemLabels2$snitt,
  ItemLabels2$n
)
ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 140, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup), y = stat(density))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 0.235, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  stat_density(aes(colour = ItemGroup),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_colour_viridis_d(option = "turbo", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#95a5a6")) +
  #scale_colour_brewer(palette = "Set1", guide = "none")+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1)) +
  guides(colour = guide_legend(title = NULL))

ggplot(FAB4, aes(x=AttemptsNeeded, y=ItemGroup, fill=ItemGroup)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") 
  #scale_fill_manual(values = c("#7a0403","#95a5a6"), guide="none")
  #scale_fill_brewer(palette = "Set1", guide = "none")
table(FAB4$ItemGroup)
(WT <- wilcox.test(FAB4$AttemptsNeeded ~ FAB4$ItemGroup, 
            paired = F, 
            alternative = "greater"))
#restructure data
FABprop <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(success = sum(n()), total = sum(AttemptsNeeded))
#binomial confidence intervals
binom::binom.confint(FABprop$success, FABprop$total) 
(binomRes <- binom::binom.wilson(FABprop$success, FABprop$total))
binomRes$Group <- as_factor(c("Robe/Biretta", "Others"))
#test if equal proportions
(propRes <- prop.test(x = binomRes$x, binomRes$n, alternative = "less"))
kbl(FABprop, label = "FABpropTable", 
    caption = 'Number of successes and attempts when making MA entries for FAB', 
    align ="c",
    col.names = c('Group', 'Successes', 'Total Experiments')) |>
  kable_paper(c("hover", "condensed"), full_width = F) #|>
  #column_spec(1, color = "white", bold = F, 
  #            background = c("#ca0020", "#0571b0"))
binomRes |> 
ggplot(aes(x = Group, y = mean, colour= Group)) +
  ylab("Proportion Success") +
  xlab(NULL) +
  geom_pointrange(aes(ymin = lower, ymax = upper), size =0.75, linewidth = 1)+
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  geom_text(aes(label = round(mean, 3), vjust = -2), size = 3) +
  coord_flip() +
  theme(axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank())

ggplot(Snitt, aes(MeanAttemptsNeeded)) +
  geom_histogram(aes(fill= Who), binwidth = 0.5, colour="black") +
  #scale_fill_brewer(palette = "RdBu", direction = 1) +
  scale_fill_manual(values = c("#B51F2E","#F4A582","#A7CFE4", "#377eb8"))+
  #scale_fill_viridis_d(option = "turbo", direction = -1) +
  ylab("Count") +
  xlab("Average attempts needed") +
  annotate("text", x = 2.5, y = 9.5, label = "Alta-lucky-vista", colour = "#B51F2E") +
  geom_curve(
    aes(x = 2.5, y = 8.8, xend = 2.0, yend = 2.5),
    arrow = arrow(
    length = unit(0.03, "npc"), 
    type="closed" 
    ),
    colour = "#B51F2E",
    linewidth = 1.2,
    angle = 0 
  ) +
  scale_y_continuous(breaks = seq(0,20, by =2))+
  theme(
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
    legend.background = element_rect(fill = "white"),
    legend.key = element_rect(fill = "white", color = NA)
  )

knitr::include_graphics('Ikea.png')
kbl(RobeSummaryTable, caption = 'Outcome of Robe Experiments for each batch. The percentage Adept Robe is highlighted.', align ="c") |>
  kable_paper(c("hover", "condensed"), full_width = F) |>
  column_spec(9, background = "#92c5de") |> #, color = "white"
  pack_rows("HRE, Advisor, Adept, r20", 1, 3) |>
  pack_rows("HRE, Advisor, r20", 4, 6) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex", 7, 9) |>
  pack_rows("HRE, Advisor, r20, Ex", 10, 12) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex, Oxf", 13, 15) |>
  pack_rows("HRE, Advisor, r11", 16, 17) |>
  pack_rows("HRE, r11", 18, 18) |>
  pack_rows("HRE, r20", 19, 19) |>
  pack_rows("HRE, Adept, r11", 20, 20) |>
  pack_rows("HRE, Adept, r20", 21, 21) |>
  pack_rows("r11", 22, 22) |>
  pack_rows("Adept, r11", 23, 23) |>
  pack_rows("r20", 24, 24) |>
  pack_rows("Adept, r20", 25, 26)

# get condition descriptions from table, and combine with sample size for figure labels
# and reorder factor levels by % robes. and the items from "hardest to easiest"
# keep them reordered in the file so don't have to do within ggplot
Robe2 <- Robe |> left_join(Design |> select(Condition, Condition2, Totals), by="Condition")
Robe2 <- Robe2 |>
  mutate(Condition3 = paste0(Condition2, " (n = ", Totals, ")")) |>
  mutate(Condition3 = fct_reorder(.f = Condition3, 
                                         .x = Result,
                                         .fun = function(.x) mean(.x == "Robe"),
                                         .desc = TRUE)) |>
  mutate(Result = fct_relevel(Result, "Robe", "Justaucorps", "Waistcoat", "FineVelvet", "Failure"))
  
ggplot(Robe2, aes(Condition3, fill = fct_rev(Result))) + 
  geom_bar(position = "fill", width = 0.9) +
  #shades::brightness(scale_fill_brewer(palette = "RdBu", direction = -1),   shades::delta(-0.05)) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_discrete(expand = expansion(mult = c(0, 0)))+
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1)),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        legend.justification = c(1, 0),
        panel.background = element_rect(fill = "white")) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()

# include batch, within condition
Robe2$GroupBatch <- ave(Robe2$Batch, Robe2$Condition, 
                        FUN = function(x) as.numeric(factor(x)))
ggplot(Robe2, aes(GroupBatch, fill = fct_rev(Result))) + 
  geom_bar(position = "fill",  width = 1) + #colour = "black",
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(1, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()+
  facet_wrap(~fct_rev(Condition3), strip.position = "left", ncol = 1, scales = "free_y") +
  theme(strip.text.y.left = element_text(angle = 0, hjust = 1), 
        panel.spacing = unit(0.1,"cm"), 
        strip.background = element_rect(fill = NA))
# time series visualization
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) + #, colour = "black"
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_x_continuous(breaks = seq(0, 120, by = 20), expand = expansion(mult = c(0, 0))) +
  scale_y_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.1,"cm")) +
  xlab("Sequence") +
  ylab(NULL)

HRE <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(HRE, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
HRE <- cbind(HRE, 
      binom.confint(HRE$Robe, n = HRE$Total, methods = "wilson"))

HRE <- HRE |> 
  filter(Condition %in% c(7:10) | HRE == "NO") |> #manual subsetting
  mutate(Condition3 = (str_replace(Condition2, c("HRE, "), "")))
  
HRE |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = HRE)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)

ADV <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Advisor, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADV <- cbind(ADV, 
      binom.confint(ADV$Robe, n = ADV$Total, methods = "wilson"))

ADV <- ADV |> 
  filter(Condition %in% c(1,2,6,7,8,10)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Advisor, "), "")))
  
ADV |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Advisor)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)
ADVwide <- ADV |> 
  group_by(Advisor) |> 
  summarise(SumRobes = sum(Robe), SumTots = sum(Total)) |> 
  mutate(SumProp = SumRobes/SumTots)
(ADVpropTestTotal <- prop.test(x = ADVwide$SumRobes, ADVwide$SumTots, alternative = "less"))
(ADVpropTests <- 
  split(ADV, as.factor(ADV$Condition3)) |>  
  map(function (df) prop.test(x = df$Robe, n = df$Total, alternative = "less")))
ADEPT <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Adept, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADEPT <- cbind(ADEPT, 
      binom.confint(ADEPT$Robe, n = ADEPT$Total, methods = "wilson"))

ADEPT <- ADEPT |> 
  filter(!(Condition %in% c(5,6))) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Adept, "), "")))
  
ADEPT |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Adept)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)
ADEPTwide <- ADEPT |> 
  pivot_wider(
  names_from  = c(Adept),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = max(NO, na.rm = T), YES = max(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
adeptdiff <-  round(mean(ADEPTwide$difference), digits = 4)

(adeptEst <- binom.confint(x= 2, n = 223, methods = "wilson"))
(baseEst <- binom.confint(x= 0, n = 217, methods = "wilson"))

ALCH <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Alchemy, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ALCH <- cbind(ALCH, 
      binom.confint(ALCH$Robe, n = ALCH$Total, methods = "wilson"))

ALCH <- ALCH |> 
  filter(Condition %in% c(2,6:14)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", r11"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c(", r20"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r11"), "Nothing"))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r20"), "Nothing")))
  
ALCH |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = as.factor((Alchemy)))) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  labs(colour = "Alchemy rank")+
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)
EX <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(EX, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
EX <- cbind(EX, 
      binom.confint(EX$Robe, n = EX$Total, methods = "wilson"))

EX <- EX |> 
  filter(Condition %in% c(1:5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Ex"), "")))

EXwide <- EX |> 
  pivot_wider(
  names_from  = c(EX),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
EXdiff <-  round(mean(EXwide$difference), digits = 4)  

EX |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = EX)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)
OXF <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Oxfords, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
OXF <- cbind(OXF, 
      binom.confint(OXF$Robe, n = OXF$Total, methods = "wilson"))

OXF <- OXF |> 
  filter(Condition %in% c(3,5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Oxf"), "")))

OXFwide <- OXF |> 
  pivot_wider(
  names_from  = c(Oxfords),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
OXFdiff <-  round(mean(OXFwide$difference), digits = 4) 

OXF |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Oxfords)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)
Paymaster <- read_ods("~\\FABtory\\paymaster.ods", sheet =1)
Paymaster$Condition3 <- (as.factor(Paymaster$Condition3))
Paymaster$Condition3 <-fct_relevel(Paymaster$Condition3, "HRE (r11+r20)", "HRE (r12-r14)")

Paymaster |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Paymaster)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = -1, hjust = -0.5, size = 3)
#proportion minorsuccess
RST <- RobeSummaryTable
#remove row for totals
RST <- RST |> 
  filter(Batch < 27) |> 
  mutate(Condition = as.numeric(Condition))
RST <- RST |> 
  mutate(Minor = (FineVelvet+Justaucorps+Waistcoat)) 
RST <- RST |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total) 
RST <- RST |> left_join(Design |> 
                          select(Condition, Condition2), by="Condition")
RST <- cbind(RST, 
      MinorProp=binom.confint(RST$Minor, n = RST$Total, methods = "wilson"))

#combine batches
RST2 <- RST |> 
  group_by(Condition2) |> 
  summarise(
    Condition = mean(Condition),
    Failure = sum(Failure),
    FineVelvet = sum(FineVelvet),
    Justaucorps = sum(Justaucorps),
    Robe = sum(Robe),
    Waistcoat = sum(Waistcoat),
    Total = sum(Total),
    Minor = sum(Minor)
  ) |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total,
         propRobe = Robe/Total,
         propFailure = Failure/Total,
         propJustau = Justaucorps/Total,
         propVelvet = FineVelvet/Total,
         propWaist = Waistcoat/Total,
         propJustau_minor = Justaucorps/Minor,
         propVelvet_minor = FineVelvet/Minor,
         propWaist_minor = Waistcoat/Minor
         ) 
RST2  <- RST2 |> 
  mutate(Condition2 = as.factor(Condition2)) |> 
  mutate(Condition2 = fct_reorder(Condition2, propRobe, .fun = mean)) |> 
  mutate(Condition2 = fct_rev(Condition2))
RST2 <- cbind(RST2, 
              propMinor = binom.confint(RST2$Minor, n = RST2$Total, methods = "wilson"))
#overall proportion minor success:
propMinorOverall <- sum(RST2$Minor)/sum(RST2$Total)
propMinorOverall2 <- mean(RST2$propMinor)

RST2 |> 
  ggplot(aes(y = Condition2, x = propMinor)) +
  ylab(NULL) +
  xlab("Proportion minor success") +
  geom_vline(xintercept = propMinorOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propMinor.lower, xmax = propMinor.upper), 
                  size =0.5, linewidth = 0.75,
                  position = position_dodge(width=1)) +
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

RST2 <- cbind(RST2, 
              propFailure = binom.confint(RST2$Failure, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propRobe = binom.confint(RST2$Robe, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau = binom.confint(RST2$Justaucorps, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet = binom.confint(RST2$FineVelvet, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist = binom.confint(RST2$Waistcoat, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau_minor = binom.confint(RST2$Justaucorps, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet_minor = binom.confint(RST2$FineVelvet, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist_minor = binom.confint(RST2$Waistcoat, n = RST2$Minor, methods = "wilson"))
rank11 <- c(6,7,9,11,12)
RST2 <- RST2 |> 
  mutate(AlchemyRank = ifelse(Condition %in% rank11, 11, 20))

propFailureOverall <- sum(RST2$Failure)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propFailure.mean)) +
  ylab(NULL) +
  xlab("Proportion Failure") +
  geom_vline(xintercept = propFailureOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propFailure.lower, xmax = propFailure.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propRobeOverall <- sum(RST2$Robe)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propRobe.mean)) +
  ylab(NULL) +
  xlab("Proportion Adept's Robe") +
  geom_vline(xintercept = propRobeOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propRobe.lower, xmax = propRobe.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propJustauOverall <- sum(RST2$Justaucorps)/sum(RST2$Total)

propJustauR11 <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)
propJustauR20 <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)

JustauPropAlchemy <- RST2 |>
  group_by(AlchemyRank) |>
  summarise(Justaucorps = sum(Justaucorps), Total = sum(Total))

(JustauAlchRes <- 
    binom.wilson(JustauPropAlchemy$Justaucorps, JustauPropAlchemy$Total))
JustauAlchRes$Group <- as_factor(c("r11", "r20"))
#test if equal proportions
(JustauAlchResTest <- 
    prop.test(x = JustauAlchRes$x, JustauAlchRes$n, alternative = "two.sided"))

RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Admiral Justaucorps") +
  geom_vline(xintercept = propJustauOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau.lower, xmax = propJustau.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")
 

propVelvetOverall <- sum(RST2$FineVelvet)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet.mean)) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet") +
  geom_vline(xintercept = propVelvetOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propVelvet.lower, xmax = propVelvet.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propWaistOverall <- sum(RST2$Waistcoat)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist.mean)) +
  ylab(NULL) +
  xlab("Proportion Waistcoat") +
  geom_vline(xintercept = propWaistOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propWaist.lower, xmax = propWaist.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

# proportions within the minor success category
#prop.test
propJustauOverall_Minor <- sum(RST2$Justaucorps)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propJustauR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
propJustauR20_Minor <- RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Justaucorps among minor successes") +
  geom_vline(xintercept = propJustauOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau_minor.lower, xmax = propJustau_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")


propVelvetOverall_Minor <- sum(RST2$FineVelvet)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propVelvetR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)
propVelvetR20_Minor <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)

RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet among minor successes") +
  geom_vline(xintercept = propVelvetOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propVelvetR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propVelvetR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propVelvet_minor.lower, xmax = propVelvet_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")

propWaistOverall_Minor <- 
  sum(RST2$Waistcoat)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propWaistR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
propWaistR20_Minor <- 
  RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Waistcoat among minor successes") +
  geom_vline(xintercept = propWaistOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propWaistR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propWaistR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propWaist_minor.lower, xmax = propWaist_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")


#per batch:
RST |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")
#per condition (more precise, but fewer data points):
RST2 |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")

cor.test(RST$Failure/RST$Total, RST$Robe/RST$Total)
cor.test(RST2$Failure/RST2$Total, RST2$Robe/RST2$Total)
ppcor::pcor.test(RST$Robe, RST$Failure, RST$Total)
ppcor::pcor.test(RST2$Robe, RST2$Failure, RST2$Total)
RST2matrix <- RST2 |> 
  select(propRobe, propFailure, propMinor, propJustau, propVelvet, propWaist, propJustau_minor, propVelvet_minor, propWaist_minor) |> 
  cor()

corrplot.mixed(RST2matrix,
 upper = "ellipse",
 lower = "number",
 order = 'AOE', 
 tl.col = 'black',
 tl.pos = "lt",
 diag = "n",
 tl.srt = 30,
 bg = "#EBEBEB",
 #mar = c(0,1,1,1)
 )
# runs-test for whole data, should be some non-randomness because batches are from different distributions
Robe2$robeSeries <- fct_other(Robe2$Result, keep = "Robe")
tseries::runs.test(Robe2$robeSeries)
snpar::runs.test(as.numeric(Robe2$robeSeries))
sjekk <- randtests::runs.test(as.numeric(Robe2$robeSeries), threshold = 1.5) 
DescTools::RunsTest(as.numeric(Robe2$robeSeries)) 

# runs-tests for each BATCH, should not be any non-randomness
Robe2$Batch2 <-
  paste("Batch ", Robe2$Batch, ", ", Robe2$Condition2, sep = "")
runsOutputBatch <- split(Robe2, Robe2$Batch2) |>
  map(function (df)
    broom::tidy(randtests::runs.test(as.numeric(df$robeSeries), threshold = 1.5)))
runsOutputBatch <- bind_rows(runsOutputBatch, .id = "ID")
runsTableBatch <- datatable(runsOutputBatch,
          options = list(pageLength = 5),
          caption = 'Runs tests for each batch') |>
  formatRound(columns = c('statistic', 'p.value'),
              digits = 3)

# runs-tests for each CONDITION, should not be any non-randomness
Robe2$Condition4 <-
  paste("Condition ", Robe2$Condition, ", ", Robe2$Condition2, sep = "")
runsOutputCond <- split(Robe2, Robe2$Condition4) |>
  map(function (df)
    broom::tidy(randtests::runs.test(as.numeric(df$robeSeries), threshold = 1.5)))
runsOutputCond <- bind_rows(runsOutputCond, .id = "ID")
runsTableCond <- datatable(runsOutputCond,
          options = list(pageLength = 5),
          caption = 'Runs tests for combination of batches under same conditions') |>
  formatRound(columns = c('statistic', 'p.value'),
              digits = 3)

# autocorrelation, just white noise
acf(as.numeric((Robe2$robeSeries)))
acf(Robe2$robeSeries)
pacf(Robe2$robeSeries)
# new variables:
Robe2<- Robe2 |>
  group_by(Batch) |> 
    mutate(robeSeries2 = case_when(
    robeSeries=="Robe" ~ 1, 
    robeSeries=="Other" ~0)) |> 
 summarise(robeInBatch = max(robeSeries2)) |> 
right_join(Robe2)
 
Robe2<- Robe2 |>
  group_by(Condition) |> 
    mutate(robeSeries2 = case_when(
    robeSeries=="Robe" ~ 1, 
    robeSeries=="Other" ~0)) |> 
 summarise(robeInCondition = max(robeSeries2)) |> 
right_join(Robe2) 
Robe2 <- ungroup(Robe2)
# filter on these to avoid empty batches that would make acf throw an error

ACFwhole <- autoplot(acf(Robe2$robeSeries, plot = FALSE), 
                     main = "Whole time series",
                     conf.int.colour = "#B51F2E")
PACFwhole <- autoplot(pacf(Robe2$robeSeries, plot = FALSE), 
                      main = "Whole time series", 
                      ylab = "Partial ACF",
                      conf.int.colour = "#B51F2E")

RobesInBatchSeries <- Robe2 |> 
  filter(robeInBatch == 1)
autoplot(acf(RobesInBatchSeries$robeSeries, plot = FALSE))
BatchACF <- split(RobesInBatchSeries, RobesInBatchSeries$Batch2) |> 
  map(function (df) acf(df$robeSeries, plot = FALSE))
BatchACF <- autoplot(BatchACF, 
                     ncol = 3, 
                     conf.int.colour = "#B51F2E")
#PACF
BatchPACF <- split(RobesInBatchSeries, RobesInBatchSeries$Batch2) |> 
  map(function (df) pacf(df$robeSeries, plot = FALSE))
BatchPACF <- autoplot(BatchPACF, 
                      ncol = 3, 
                      ylab = "Partial ACF", 
                      conf.int.colour = "#B51F2E")

RobesInConditionSeries <- Robe2 |> 
  filter(robeInCondition == 1)
autoplot(acf(RobesInConditionSeries$robeSeries, plot = FALSE))
ConditionACF <- split(RobesInConditionSeries, 
                      fct_drop(RobesInConditionSeries$Condition2)) |> 
  map(function (df) acf(df$robeSeries, plot = FALSE))
ConditionACF <- autoplot(ConditionACF, 
                         ncol = 3, 
                         conf.int.colour = "#B51F2E")
#PACF
ConditionPACF <- split(RobesInConditionSeries, RobesInConditionSeries$Condition2) |> 
  map(function (df) pacf(df$robeSeries, plot = FALSE))
ConditionPACF <- autoplot(ConditionPACF, 
                          ncol = 3, 
                          ylab = "Partial ACF", 
                          conf.int.colour = "#B51F2E")

#plot text Batch/condition number?
# sequence plots robe vs non-robe
#whole sequence:
ggplot(Robe2, aes(x=Sequence, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_brewer(palette = "RdBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)
ggplot(Robe2, aes(x=Sequence, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)
# per batch:
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 120, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)
# per condition:
# first make a sequence within condition.

Robe2 <- Robe2 |> 
  group_by(Condition) |> 
  mutate(StartSeq = min(Sequence),
         CondSeq = Sequence-StartSeq+1)
Robe2 <- ungroup(Robe2)
         
ggplot(Robe2, aes(x=CondSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3)), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 220, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)
knitr::include_graphics('theorydrawing.png')
knitr::include_graphics('FABingredients.jpg')
knitr::include_graphics('RobeEX.jpg')
DT::datatable(FAB[,-1], options = list(pageLength = 10), caption = 'FAB experiments raw data. Values are number of attempts needed')
DT::datatable(Robe, options = list(pageLength = 10), caption = 'Robe experiments raw data')
runsTableBatch
runsTableCond
ACFwhole
PACFwhole
BatchACF
ConditionACF
BatchPACF
ConditionPACF
sessioninfo::session_info()

8.8 Session information

sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.3 (2023-03-15 ucrt)
##  os       Windows 10 x64 (build 19045)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  Norwegian Bokmål_Norway.utf8
##  ctype    Norwegian Bokmål_Norway.utf8
##  tz       Europe/Berlin
##  date     2023-04-13
##  pandoc   2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package      * version  date (UTC) lib source
##  backports      1.4.1    2021-12-13 [1] CRAN (R 4.2.0)
##  binom        * 1.1-1.1  2022-05-02 [1] CRAN (R 4.2.0)
##  bookdown       0.33     2023-03-06 [1] CRAN (R 4.2.2)
##  boot           1.3-28.1 2022-11-22 [2] CRAN (R 4.2.3)
##  broom        * 1.0.4    2023-03-11 [1] CRAN (R 4.2.2)
##  bslib          0.4.2    2022-12-16 [1] CRAN (R 4.2.2)
##  cachem         1.0.7    2023-02-24 [1] CRAN (R 4.2.2)
##  cellranger     1.1.0    2016-07-27 [1] CRAN (R 4.2.0)
##  class          7.3-21   2023-01-23 [1] CRAN (R 4.2.2)
##  cli            3.6.1    2023-03-23 [1] CRAN (R 4.2.3)
##  colorspace     2.1-0    2023-01-23 [1] CRAN (R 4.2.2)
##  corrplot     * 0.92     2021-11-18 [1] CRAN (R 4.2.0)
##  crayon         1.5.2    2022-09-29 [1] CRAN (R 4.2.0)
##  crosstalk      1.2.0    2021-11-04 [1] CRAN (R 4.2.0)
##  curl           5.0.0    2023-01-12 [1] CRAN (R 4.2.2)
##  data.table     1.14.8   2023-02-17 [1] CRAN (R 4.2.2)
##  DescTools      0.99.48  2023-02-19 [1] CRAN (R 4.2.2)
##  digest         0.6.31   2022-12-11 [1] CRAN (R 4.2.2)
##  dplyr        * 1.1.1    2023-03-22 [1] CRAN (R 4.2.3)
##  DT           * 0.27     2023-01-17 [1] CRAN (R 4.2.2)
##  e1071          1.7-13   2023-02-01 [1] CRAN (R 4.2.2)
##  ellipsis       0.3.2    2021-04-29 [1] CRAN (R 4.2.0)
##  evaluate       0.20     2023-01-17 [1] CRAN (R 4.2.2)
##  Exact          3.2      2022-09-25 [1] CRAN (R 4.2.0)
##  expm           0.999-7  2023-01-09 [1] CRAN (R 4.2.2)
##  fansi          1.0.4    2023-01-22 [1] CRAN (R 4.2.2)
##  farver         2.1.1    2022-07-06 [1] CRAN (R 4.2.1)
##  fastmap        1.1.1    2023-02-24 [1] CRAN (R 4.2.2)
##  forcats      * 1.0.0    2023-01-29 [1] CRAN (R 4.2.2)
##  generics       0.1.3    2022-07-05 [1] CRAN (R 4.2.1)
##  ggfortify    * 0.4.16   2023-03-20 [1] CRAN (R 4.2.3)
##  ggplot2      * 3.4.2    2023-04-03 [1] CRAN (R 4.2.3)
##  ggpmisc      * 0.5.2    2022-12-17 [1] CRAN (R 4.2.3)
##  ggpp         * 0.5.2    2023-04-01 [1] CRAN (R 4.2.3)
##  ggrepel        0.9.3    2023-02-03 [1] CRAN (R 4.2.2)
##  gld            2.6.6    2022-10-23 [1] CRAN (R 4.2.2)
##  glue           1.6.2    2022-02-24 [1] CRAN (R 4.2.0)
##  gridExtra      2.3      2017-09-09 [1] CRAN (R 4.2.0)
##  gtable         0.3.3    2023-03-21 [1] CRAN (R 4.2.3)
##  highr          0.10     2022-12-22 [1] CRAN (R 4.2.2)
##  hms            1.1.3    2023-03-21 [1] CRAN (R 4.2.3)
##  htmltools      0.5.5    2023-03-23 [1] CRAN (R 4.2.3)
##  htmlwidgets    1.6.2    2023-03-17 [1] CRAN (R 4.2.2)
##  httr           1.4.5    2023-02-24 [1] CRAN (R 4.2.2)
##  jquerylib      0.1.4    2021-04-26 [1] CRAN (R 4.2.0)
##  jsonlite       1.8.4    2022-12-06 [1] CRAN (R 4.2.2)
##  kableExtra   * 1.3.4    2021-02-20 [1] CRAN (R 4.2.0)
##  knitr          1.42     2023-01-25 [1] CRAN (R 4.2.2)
##  labeling       0.4.2    2020-10-20 [1] CRAN (R 4.2.0)
##  lattice        0.21-8   2023-04-05 [1] CRAN (R 4.2.3)
##  lifecycle      1.0.3    2022-10-07 [1] CRAN (R 4.2.2)
##  lmodel2        1.7-3    2018-02-05 [1] CRAN (R 4.2.0)
##  lmom           2.9      2022-05-29 [1] CRAN (R 4.2.0)
##  magrittr       2.0.3    2022-03-30 [1] CRAN (R 4.2.0)
##  MASS           7.3-58.3 2023-03-07 [1] CRAN (R 4.2.2)
##  Matrix         1.5-4    2023-04-04 [1] CRAN (R 4.2.3)
##  MatrixModels   0.5-1    2022-09-11 [1] CRAN (R 4.2.0)
##  munsell        0.5.0    2018-06-12 [1] CRAN (R 4.2.0)
##  mvtnorm        1.1-3    2021-10-08 [1] CRAN (R 4.2.0)
##  pillar         1.9.0    2023-03-22 [1] CRAN (R 4.2.3)
##  pkgconfig      2.0.3    2019-09-22 [1] CRAN (R 4.2.0)
##  polynom        1.4-1    2022-04-11 [1] CRAN (R 4.2.0)
##  ppcor          1.1      2015-12-03 [1] CRAN (R 4.2.3)
##  proxy          0.4-27   2022-06-09 [1] CRAN (R 4.2.0)
##  purrr        * 1.0.1    2023-01-10 [1] CRAN (R 4.2.2)
##  quadprog       1.5-8    2019-11-20 [1] CRAN (R 4.2.0)
##  quantmod       0.4.22   2023-04-07 [1] CRAN (R 4.2.3)
##  quantreg       5.95     2023-04-08 [1] CRAN (R 4.2.3)
##  R6             2.5.1    2021-08-19 [1] CRAN (R 4.2.0)
##  randtests      1.0.1    2022-06-20 [1] CRAN (R 4.2.0)
##  RColorBrewer   1.1-3    2022-04-03 [1] CRAN (R 4.2.0)
##  Rcpp           1.0.10   2023-01-22 [1] CRAN (R 4.2.2)
##  readODS      * 1.8.0    2023-01-23 [1] CRAN (R 4.2.2)
##  readr          2.1.4    2023-02-10 [1] CRAN (R 4.2.2)
##  readxl         1.4.2    2023-02-09 [1] CRAN (R 4.2.2)
##  rlang          1.1.0    2023-03-14 [1] CRAN (R 4.2.2)
##  rmarkdown      2.21     2023-03-26 [1] CRAN (R 4.2.3)
##  rootSolve      1.8.2.3  2021-09-29 [1] CRAN (R 4.2.0)
##  rstudioapi     0.14     2022-08-22 [1] CRAN (R 4.2.1)
##  rvest          1.0.3    2022-08-19 [1] CRAN (R 4.2.0)
##  sass           0.4.5    2023-01-24 [1] CRAN (R 4.2.2)
##  scales         1.2.1    2022-08-20 [1] CRAN (R 4.2.0)
##  sessioninfo    1.2.2    2021-12-06 [1] CRAN (R 4.2.0)
##  shades       * 1.4.0    2019-08-02 [1] CRAN (R 4.2.0)
##  snpar          1.0      2014-08-16 [1] Github (debinqiu/snpar@578b79c)
##  SparseM        1.81     2021-02-18 [1] CRAN (R 4.2.0)
##  stringi        1.7.12   2023-01-11 [1] CRAN (R 4.2.2)
##  stringr      * 1.5.0    2022-12-02 [1] CRAN (R 4.2.2)
##  survival       3.5-5    2023-03-12 [1] CRAN (R 4.2.3)
##  svglite        2.1.1    2023-01-10 [1] CRAN (R 4.2.2)
##  systemfonts    1.0.4    2022-02-11 [1] CRAN (R 4.2.0)
##  tibble         3.2.1    2023-03-20 [1] CRAN (R 4.2.3)
##  tidyr        * 1.3.0    2023-01-24 [1] CRAN (R 4.2.2)
##  tidyselect     1.2.0    2022-10-10 [1] CRAN (R 4.2.2)
##  tseries      * 0.10-53  2023-01-31 [1] CRAN (R 4.2.2)
##  TTR            0.24.3   2021-12-12 [1] CRAN (R 4.2.0)
##  tufte          0.12     2022-01-27 [1] CRAN (R 4.2.3)
##  tzdb           0.3.0    2022-03-28 [1] CRAN (R 4.2.0)
##  utf8           1.2.3    2023-01-31 [1] CRAN (R 4.2.2)
##  vctrs          0.6.1    2023-03-22 [1] CRAN (R 4.2.3)
##  viridisLite    0.4.1    2022-08-22 [1] CRAN (R 4.2.1)
##  webshot        0.5.4    2022-09-26 [1] CRAN (R 4.2.0)
##  withr          2.5.0    2022-03-03 [1] CRAN (R 4.2.0)
##  xfun           0.38     2023-03-24 [1] CRAN (R 4.2.3)
##  xml2           1.3.3    2021-11-30 [1] CRAN (R 4.2.0)
##  xts            0.13.0   2023-02-20 [1] CRAN (R 4.2.2)
##  yaml           2.3.7    2023-01-23 [1] CRAN (R 4.2.2)
##  zoo            1.8-11   2022-09-17 [1] CRAN (R 4.2.1)
## 
##  [1] C:/R/RLibs
##  [2] C:/Program Files/R/R-4.2.3/library
## 
## ──────────────────────────────────────────────────────────────────────────────

9 Footnotes


  1. Of course, the high negative correlation seen in Figure 25 is meaningless in itself, because it is partly an artefact of the same total number of trials being used to calculate both proportions, and the variables are not free to vary independently: correlation between parts of a whole is inherently spurious. However, this does not account for the whole relationship, since the other correlations between the proportions were much weaker (Figure 26), and the correlation between failures and robes while controlling for totals is still strong: \(r_{partial}\) = -0.91, n = 14, p < 0.0001.↩︎
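     As a purely illustrative aside (not part of the original analysis, and with made-up probabilities), the compositional effect is easy to reproduce: draw group outcomes from a multinomial with fixed probabilities, and the robe and failure proportions still tend to correlate negatively even though the underlying probabilities never change.

     set.seed(1)
     totals <- sample(70:90, 14, replace = TRUE)   # 14 groups of roughly 80 trials (arbitrary)
     draws  <- t(sapply(totals, function(n) rmultinom(1, n, c(0.25, 0.30, 0.45))))
     colnames(draws) <- c("Robe", "Minor", "Failure")
     cor(draws[, "Robe"]/totals, draws[, "Failure"]/totals)   # tends to be negative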

  2. Figure 26 also includes variables for the proportions within the minor success category, denoted _minor. The caveats explained in footnote 1 apply even more strongly to these.↩︎

  3. Multinomial runs-tests and auto-correlation are computationally complex and beyond the scope of this report, but see Weiss (2017).↩︎

  4. I acknowledge, but ignore, the problem that single batches have inherently less statistical power because of lower sample sizes.↩︎

  5. He could have said FABulous.↩︎

  6. Rumours of the flawed (and exploitable) RNG algorithms used in UWO have floated around since the launch of Gama server. For example, one idea is that there are good and bad sessions. So that if you fail, you should close the dialogue screen and start over until you hit a better session. (Or maybe wait a bit until the bad times are over. Who knows.)↩︎

  7. Only for the last individual would a slight bias occur, which is negligible.↩︎

---
title: "Report from a FABtory"
author: "Stinker, Anima (London), Dept. of Experimental Alchemy"
date: "Last compiled: `r Sys.setlocale('LC_TIME', 'English'); format(Sys.time(), '%e %B, %Y')`"
output:
  bookdown::html_document2:
    code_folding: hide
    code_download: true
    toc: yes
    fig_caption: yes
    global_numbering: yes
    toc_float:
      collapsed: yes
    theme: flatly
    highlight: haddock
editor_options: 
  chunk_output_type: console
---

```{css, echo=FALSE}
p.caption {
  color: #777;
}
caption {
  color: #777;
}
```

```{r, setup, results='hide', echo = FALSE}
knitr::opts_chunk$set(fig.width = 5, fig.align = 'center', 
                      collapse = TRUE, warning = FALSE, message = FALSE)
```

```{r, echo = TRUE, results='hide'}
#packages
library(readODS)
library(ggplot2)
library(dplyr)
library(tidyr)
library(kableExtra)
library(forcats)
library(purrr)
library(tseries)
library(ggfortify)
library(binom)
library(broom)
library(stringr)
library(ggpmisc)
library(corrplot)
library(DT)
library(shades)
#data
FAB <- read_ods("~\\FABtory\\FABtory.ods", sheet =2)
FAB2 <- FAB  |>  pivot_longer(
  cols = Biretta:WhiteOre,
  names_to = "Item",
  values_to = "AttemptsNeeded"
  )
Snitt <- FAB2  |> 
  group_by(Toon, Who) |>
  summarise(MeanAttemptsNeeded = mean(AttemptsNeeded))
Robe <- read_ods("~\\FABtory\\FABtory.ods", sheet =3)
Design <- read_ods("~\\FABtory\\FABtory.ods", sheet =4)
RobeSummaryTable <- read_ods("~\\FABtory\\FABtory.ods", sheet =6)
```

# Summary

New data presented here show that the Holy Roman Emperor (HRE) policy effect on alchemy experiments is huge, much larger than the effect of the Adept title. Utilising a plethora of tricks it was possible over a few days to complete the Memorial Album for alchemy experiments (with FAB as reward) for a large number of low level characters at base rank 1 alchemy, at 20%--25% success rate (depending on item) across 3100 alchemy experiments. Also having the court alchemist advisor role on top of the HRE effect gave 10% additional success, as revealed by a further 1962 separate experiments focusing solely on Adept's robe. With only adept title the success was around 1%. The proportion of minor success was unaffected by these factors. There was no measurable effect of alchemy rank, Oxford skills, paymaster aide or EX equipment on robe success, but an effect of alchemy rank on the minor successes.

# Introduction

In Uncharted Waters Online [(UWO)](https://uwo.papayaplay.com/uwo.do){target="_blank"}, the Alchemy Experiment Result [(No.1)](http://uwodbmirror.ivyro.net/eg/main.php?id=88882124){target="_blank"} Memorial Album (MA) give Forbidden Alchemy Book (FAB, Figure \@ref(fig:FABpic)) as a reward for completing. It is notoriously hard to complete because the success rate of obtaining the entries in it is very low, thought to be 0.1% base or less for some items. Due to this difficulty, and because one character can only obtain one FAB, it is a valuable and sought after item that sells for around 10 captain's tickets. It gives rank 15 increased Nanban rates, the highest in the game. (It is also possible to obtain FAB from spending real money, but the [ticket](http://uwodbmirror.ivyro.net/eg/main.php?id=01561007){target="_blank"} it comes from also gives the choice of Vagrant's Wear which is even more valuable so few would choose to get a FAB that way).

```{r FABpic, out.width='50%', fig.align='center', fig.cap="The FAB", echo = FALSE}
knitr::include_graphics('FABpic.png')
```

Having base rank 13 in Alchemy helps, because the [Adept title](http://uwodbmirror.ivyro.net/eg/main.php?id=88880518){target="_blank"} obtained at r13 increases the success rate of alchemy experiments. But even with the title active the success rate is low: Around 1% to get Adept's Biretta and Adept's Robe, 5% to get Gold Short Sword and 10% for the rest (unpublished gut feelings based on getting adept title and FAB for about 25 characters in the past). With adept title, the number of attempts needed to get one item can still easily reach the hundreds (585 have been observed, and more rumoured). So this is hard to do, considering the relative difficulty of obtaining the base materials in quantity and in particular the need for base r13.

Increased availability of gear boosting alchemy in recent years have made it possible to grind alchemy very efficiently from base r1+11 ([making gold](http://uwodbmirror.ivyro.net/eg/main.php?id=01800242){target="_blank"} at John Dee in London, and [using gaseous drugs](http://uwodbmirror.ivyro.net/eg/main.php?id=01500488){target="_blank"} at Paracelsus in Venice), making it easier to obtain the adept title. It is nevertheless still a major effort because of the huge amount of materials needed for the skill grind and the low success of the experiments. Although it takes less than 30 minutes to get from r1 to r13 base alchemy using best practices, this obviously does not include the considerable time needed to prepare the necessary mats.

However, an increased alchemy experiment success rate is a [possible policy](http://gvo.gamedb.info/wiki/?Election#naa85c5c){target="_blank"} of some Holy Roman Emperor (HRE) candidates. With enough support given to unlock this effect, it may be possible to obtain FAB easily, even without the Adept title and without grinding alchemy at all (provided one can reach the required ranks for the experiments: r10--12, plus secondary skills). This HRE policy is rare, but we had it at least once early on Maris server, when I made 125 Adept's Robes and transmuted them into Alchemist Robes and then [Alchemist Robe EX](http://uwodbmirror.ivyro.net/eg/main.php?id=00055301){target="_blank"} that sold for 1 billion ducats each.

During March 2023 such an opportunity came again in the form of Rudolf II as Emperor. Players that had supported this candidate with at least 49 support permits (i.e. 4.9m ducats effective support needed) got the effect activated for 31 days, along with +2 alchemy (Figure \@ref(fig:rudolfPicture)).

```{r rudolfPicture, out.width='50%', fig.align='center', fig.cap="Policy of Support Alchemy and Court Alchemist advisor position for Rudolf II", echo = FALSE}
knitr::include_graphics('HRE.png')
```

Less than a week later PapayaPlay launched an [event](https://uwo.papayaplay.com/uwo.do?tp=news.view&postid=4388){target="_blank"} that gave another +1 alchemy for 2 weeks. In addition, Stinker got the Alchemist advisor court position and could therefore give friends one more +1 alchemy (Figure \@ref(fig:rudolfPicture)), as well as being Alchemist Meister (base rank 15+ alchemy) and thus able to give yet another +1 to apprentices. As if all this was not enough, Atlantis made its appearance in the midst of it. Although unrelated to the experiment success for the MA, Atlantis is very relevant for alchemy since it gives +20% to transmutation success (such as when making Alchemist Robe from the Adept's Robe made for the FAB). There was also a 100% skill proficiency event going on, so it was unavoidable to rank up several times before finishing the MA.

Thus, there were a lot of alchemy skill boosts active at the same time as the experiment success boost. The time was ripe for creating a **FABtory**.

This report presents analyses of data gathered while making MA entries for FAB under these unprecedented circumstances, together with additional experiments performed to better understand exactly how the HRE and other factors influence the outcome of alchemy experiments, and how this relates to crafting success in general.

## Crafting success

Alchemy experiments differ from ordinary alchemy/casting/storage/handicrafts/sewing in that there is a "huge success" category (see Figure \@ref(fig:robePicture)), as well as several tiers of minor success that all give different items, and failures that produce no item. Ordinary crafting does not have the huge success category, but some recipes have a failure category, a normal success, and several tiers of "great success". The latter may simply produce a larger quantity of the same item, but for some recipes it results in a different item than the normal success, and the goal is often to craft the highest-tier great success item in as few attempts as possible (to preserve valuable materials). My understanding of crafting success can be summarized as follows.

The outcome of ordinary crafting depends on two hidden parameters that can be manipulated independently. The first is the chance of great success. This can be increased by:

-   Paymaster aide
-   Oxford increased production skills 1-4
-   Meister title with apprentice nearby (only when crafting on land)
-   Workshop/galley ship skill (only when crafting at sea)

These factors have no effect at all if using Master's Secrets, because the great success chance is then instantly made 100%. This is an uncommon consumable item (but obtainable in-game) that cannot be used for alchemy experiments.

The second parameter is the *degree* of great success, such as tier of cannon penetration or attack/defense stats of clothes. This can be increased by:

-   Skill rank
-   Oxford Item improvement Tech skill
-   Gear with the special goods production rate up EX effect
-   Sailor equipment with the special goods production effect (only at sea with workshop)

One theory is that the probability of producing a huge success in alchemy experiments is governed entirely by factors other than those that produce a high-tier great success in normal crafting. Another school of thought holds that a huge success is nothing but a high-tier great success, and so should be influenced at least partly by the same factors.
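
To make the two-parameter view concrete, here is a toy sketch (not evaluated). Everything in it is invented for illustration: the `craft()` helper, `p_great` and `tier_boost` are hypothetical stand-ins for the two kinds of factors, not measured game mechanics.

```{r, eval = FALSE}
# Toy sketch of the two-parameter view of crafting success (all numbers hypothetical).
set.seed(42)
craft <- function(n, p_great = 0.10, tier_boost = 0) {
  great <- rbinom(n, 1, p_great)                            # parameter 1: does a great success occur?
  tier  <- ifelse(great == 1,
                  pmin(3, 1 + rpois(n, 0.5 + tier_boost)),  # parameter 2: how good is it when it does?
                  0)
  table(factor(tier, levels = 0:3,
               labels = c("normal", "tier 1", "tier 2", "tier 3")))
}
craft(1000)                          # baseline
craft(1000, p_great = 0.20)          # more frequent great successes, same tier distribution
craft(1000, tier_boost = 1)          # same frequency, but higher tiers when they happen
```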

```{r robePicture, out.width='50%', fig.align='center', fig.cap="Seeing this for the first time in 2011 was something", echo = FALSE}
knitr::include_graphics('robeSuccess.png')
```

# Methods

## Experimental protocol for FAB

In total I had access to 74 characters that had given enough support to Rudolf II as emperor and had not previously obtained the Adept title or any FAB MA entries. Nearly all of them had not yet obtained the alchemy skill at all, and most were low level with low court rank. The following steps were taken to obtain a FAB for each of these characters:

-   In the election period, get support permits and invest enough in Rudolf II to unlock the alchemy support benefit.
-   Invest 10m ducats in London to get court rank 4, which is needed for quarters rank 2, which in turn is needed to place the furniture required for alchemy experiments.
-   Sell trade goods to get 20k fame, needed to equip +1 Nigredo gloves. In hindsight, this turned out to be a total overkill but I didn't know I would get court alchemist or that papaya would have +1 production skills event. Also, I didn't remember that the HRE gave +2 alchemy. I had assumed +1. So without using +3 paper, aide, job, Boston skill or ship I felt I needed this. Which I didn't need in *any* case due to ranking up. But what the heck.
-   Buy quarters and upgrade to rank 2.
-   Make and place kiln (mostly +3 version) and bench (mostly +1 version) in the quarters.
-   Obtain alchemy skill. This was done in one of 4 ways:
    -   Do the quest chain for alchemist job and get the skill from the job (the only way to learn it in the past). N = `r nrow(filter(FAB, AlchemySkill == "Job"))`
    -   Get adventure level 7 and casting rank 2, and then learn alchemy from John Dee in London. N = `r nrow(filter(FAB, AlchemySkill == "Learned"))`
    -   Learn the skill at the scholar from another character at same account bestowing it. N = `r nrow(filter(FAB, AlchemySkill == "Bestowed"))`
    -   Learn skill at the scholar from another character at same account bestowing it after itself having learned it this way (doubly bestowed). N = `r nrow(filter(FAB, AlchemySkill == "DoubleBestowed"))`
-   Get alchemy boosters. I used:
    -   +1 from being apprentice
    -   +2 [accessory](http://uwodbmirror.ivyro.net/eg/main.php?id=00515101){target="_blank"}
    -   +1 [hat](http://uwodbmirror.ivyro.net/eg/main.php?id=00143404){target="_blank"} (there is also +2 hats but I didn't have any of those)
    -   +2 [clothes](http://uwodbmirror.ivyro.net/eg/main.php?id=00061300){target="_blank"}
    -   +1 [weapon](http://uwodbmirror.ivyro.net/eg/main.php?id=00430500){target="_blank"}
    -   +1 [gloves](http://uwodbmirror.ivyro.net/eg/main.php?id=00300212){target="_blank"}
    -   +2 from HRE
    -   +1 from being on court alchemist friend-list. In total they would therefore start out at r12 (r1+11). For the last 1/3 of the sample there was also:
    -   +1 from Papaya event, so r1+12. During the experiments they would also get enough skills points for a few base ranks.
-   Craft all base mats and ingredients in huge quantities in advance and store on storage alts, aide bazaars, company vaults, quarters. This was done utilising the job-specific Boston skills for alchemy, sewing, handicraft and casting to save on mats when crafting, and using r20 buying skills, Robe EX and merchant mentor Boston skill to buy as much as possible per purchase order. This could have been tweaked even further with [r9 EX](http://uwodbmirror.ivyro.net/eg/main.php?id=00052501){target="_blank"} and 4th fleet member.
-   Each character then made (during the HRE influence period) the 10 items needed for the MA (resulting in 740 items obtained). Experiments for an item stopped as soon as that item was obtained; I then moved on to the next item, and finally to the next character. The items (see [Appendix](#FAB-ingredients) for the recipes) were always made in the following order:
    -   Adept's Biretta (r10)
    -   Adept's Robe (r11)
    -   Gold-plated Armour (r12)
    -   Knife losing it's gold (r12)
    -   Metal Works (r12)
    -   Processed Lumber (r11)
    -   Green Ore (r11)
    -   Blue Ore (r11)
    -   Red Ore (r11)
    -   White Ore (r11)

(Yes, I know I'm inconsistent in naming some after the end product and some after the base material, but give me some slack). The characters were handed 3 base mats and plenty of consumables of the right kind (other materials were handed back after each MA entry was obtained, to avoid using the wrong mats, or the wrong furniture, by mistake). They would have to trade with me again if more than 3 base mats were needed, and thus close and re-enter the experiment dialogue. For the coloured ores they received 250-300, apart from a few that I couldn't be bothered to get a new ship for. Some made their FAB in a starter barca.

I counted the number of times each experiment was performed until they got the item. That is, in effect I counted the number of consumables used, not the number of base items. The main base item is not consumed on failures, only on success and "huge success" (i.e. the MA item). If different experiments have different tables of possible results (for example, if one has many possible success products and another fewer, or if they differ in failure rate), this could have affected the number of base items consumed (in addition to possibly different probabilities of the MA item), which may in some sense be considered more important than the number of experiments performed. However, I did not collect good data on this.
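
As a rough guide to how the two counts relate, here is a back-of-the-envelope sketch (not evaluated; the probabilities are placeholders for illustration, not estimates from this report). Consumables are spent on every attempt, while the base item is spent only on a minor or huge success.

```{r, eval = FALSE}
# Expected cost per MA entry under assumed per-attempt probabilities (illustrative only).
expected_cost <- function(p_huge, p_minor) {
  c(consumables = 1 / p_huge,                    # every attempt uses a consumable
    base_items  = (p_huge + p_minor) / p_huge)   # base item used only on (minor or huge) success
}
expected_cost(p_huge = 0.01, p_minor = 0.30)     # placeholder "low success" scenario
expected_cost(p_huge = 0.25, p_minor = 0.30)     # placeholder "boosted" scenario
```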

The raw data for FAB experiments can be found in the [Appendix](#fab-raw-data).

## Experimental protocol for Robes

In a separate set of experiments performed shortly after the FAB study I focused on the Adept's Robe recipe. Unlike in the FAB experiments, I kept track of every outcome. The experimental set-up was designed to yield pairwise comparisons, where a treatment group and a control group could be compared while all other factors were held constant. I did 1962 experiments in total: 1--3 batches for each of the 14 combinations of factors tested (Table \@ref(tab:robeDesignTable)), with approximately 80 experiments per batch, until I ran out of resources (fine dye) to produce base materials. The conditions tested were:

-   With or without HRE policy (server wide effect for HRE supporters)
-   With or without Advisor position (personal effect for the sole court alchemist title holder)
-   With or without Adept title (obtained at alchemy base rank 13)
-   With or without Oxfords (increased production success 1--4, and item improvement tech)
-   With or without EX ([Golden Ark](http://uwodbmirror.ivyro.net/eg/main.php?id=00536905){target="_blank"} rank 6 Special Goods Production Rate Up)
-   With r20 or r11 alchemy

```{r robeDesignTable}
kbl(Design[,-2], caption = 'Experimental design for Robe experiments. Group 15 is the Robe part of the FAB setup for comparison.', align ="c") |> 
  kable_paper(c("hover", "condensed"), full_width = F)
```

One batch is here defined as an uninterrupted sequence of experiments done without closing the dialogue screen. The length of a batch was limited by inventory space and the number of failures (more failure = longer batch), but was not limited by vigour food. No disconnects from the server occurred during the experiments. Batch number 26 was merged with batch 25 (identical conditions) in some analyses because the sample size was deemed too small to be treated separately.

It was not possible to keep the sample sizes strictly equal between batches: even though the number of base clothes was the same (+/- a few), the number of experiments differed because failures do not consume the base material, so more failures allowed more experiments before the base mats for the batch were used up.

A full factorial experimental design with 6 factors of 2 levels each would require $2^6 = 64$ combinations. It was not practically possible to test every combination. I chose to ignore interaction effects, as there is reason to believe the effects are purely additive, which removes the need for a full design. In addition, certain combinations of factors were physically impossible; for example, the advisor effect could not be obtained without also having the HRE effect. Another difficulty was that without wearing any gear I was already at r20 alchemy. However, the alchemy rank could be adjusted down to r11 with a combination of mentoring and + gear (Figure \@ref(fig:mentoringPicture)). Thus it was possible to disentangle the skill rank, Adept title and HRE effects, which would otherwise be confounded.
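
For reference, the full grid is easy to enumerate, and the 14 tested combinations cover only a small subset of it. The factor names below mirror the columns used later in the analysis (HRE, Advisor, Adept, Oxfords, EX, Alchemy); the object itself is purely illustrative:

```{r, eval=FALSE}
# The full 2^6 factorial grid, for comparison with the 14 combinations tested
fullDesign <- expand.grid(
  HRE     = c("NO", "YES"),
  Advisor = c("NO", "YES"),
  Adept   = c("NO", "YES"),
  Oxfords = c("NO", "YES"),
  EX      = c("NO", "YES"),
  Alchemy = c(11, 20)
)
nrow(fullDesign)   # 64
```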

```{r mentoringPicture, out.width='50%', fig.align='center', fig.cap="Mentoring was used in order to lower skill rank and retain adept title", echo = FALSE}
knitr::include_graphics('mentored.png')
```

Oxfords were compared all on vs all off. Although it would have been ideal to toggle the Oxford item-improvement techniques separately from the Oxford production-success skills, the two effects can still be separated: the improvement tech can only (possibly) matter for experiments that did not fail, whereas the production-success skills should (possibly) only affect the failure rate. One can therefore compare failure rates with and without Oxfords to test the production-success skills, and compare outcomes among non-failed experiments with and without Oxfords to test the item-improvement tech.
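
Expressed as code, the two contrasts amount to something like the sketch below (it assumes the `Robe2` data frame and condition numbering constructed in the Results section, and is not the code behind the figures):

```{r, eval=FALSE}
# 1) Production-success Oxfords: compare failure rates (conditions 3 vs 5)
Robe2 |>
  filter(Condition %in% c(3, 5)) |>
  group_by(Oxfords) |>
  summarise(failRate = mean(Result == "Failure"), n = n())

# 2) Item-improvement tech: compare outcomes among experiments that did not fail
Robe2 |>
  filter(Condition %in% c(3, 5), Result != "Failure") |>
  group_by(Oxfords) |>
  summarise(propRobe = mean(Result == "Robe"), n = n())
```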

All experiments in the Robe setup were done with the aide in the Paymaster 100s job. Since the FAB trials were not, it was to some extent also possible to test the effect of the paymaster aide by comparing certain groups from the Robe setup with the FAB results. Unfortunately there were no data separating failures from minor successes in the FAB experiments, so only the effect of the aide on huge success can be considered.

Results were mostly analysed by comparing confidence intervals of proportions. Since some of the observed proportions were 0 or close to 0, I did not use the standard asymptotic (Wald) approximation to calculate binomial confidence intervals, but instead the Wilson method implemented in R package [binom](https://CRAN.R-project.org/package=binom){target="_blank"}.
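
To illustrate why (using the 0-out-of-217 base-rate group analysed later as the example), the Wald interval degenerates when no successes are observed, while the Wilson interval still yields a sensible upper bound:

```{r, eval=FALSE}
# Why Wilson rather than the asymptotic (Wald) interval: with zero successes
# the Wald interval collapses to zero width, whereas Wilson gives a usable bound
binom::binom.confint(x = 0, n = 217, methods = c("asymptotic", "wilson"))
```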

The raw data for Robe experiments can be found in the [Appendix](#robe-raw-data).

This report was written using [R Markdown](https://bookdown.org/yihui/rmarkdown/){target="_blank"} in [RStudio](https://posit.co/products/open-source/rstudio/){target="_blank"}. R code used to generate the results and some output can be seen by clicking on the "code" buttons, as well as found in the [Appendix](#r-code) (without output) together with software versions used.

# Results

## FAB experiments

```{r FABdistribution, fig.cap = "Needed attempts to get an MA entry overall (histogram), and by how the alchemy skill was obtained (kernel density estimates)", fig.show = "hold", out.width= "50%"}

ggplot(FAB2, aes(AttemptsNeeded)) +
  geom_histogram(fill = "#377eb8", colour = "black", binwidth = 1)

ggplot(FAB2, aes(AttemptsNeeded)) +
  stat_density(aes(colour = AlchemySkill),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  scale_colour_brewer(palette = "RdBu", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#FABA39","#18bc9c", "#2c3e50"))+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1))
```

Over all `r nrow(FAB2)` MA items obtained, the mean number of experiments needed to obtain one item was `r round(mean(FAB2$AttemptsNeeded), 2)` attempts, with a median of `r round(median(FAB2$AttemptsNeeded),2)`. In total `r sum(FAB2$AttemptsNeeded)` experiments were performed to get all the MA items for all `r nrow(FAB)` participants. The distribution of needed attempts is shown in Figure \@ref(fig:FABdistribution). The highest number observed was 25 attempts for one item. It did not seem to matter whether the skill was obtained from the job, learned by fulfilling requirements, bestowed or bestowed again (Figure \@ref(fig:FABdistribution)). If we compare the results for the different items (Figure \@ref(fig:FABhistoItems)) there do not seem to be any obvious differences -- there is a lot of variation within groups compared to between them.

```{r FABhistoItems, fig.height= 8, fig.cap = "Needed attempts for the different items"}
FAB3 <- FAB2 |>
  mutate(Item = reorder(Item, AttemptsNeeded, FUN = mean, na.rm = TRUE))
ItemLabels <- FAB3 |>
  group_by(Item) |>
  summarise(snitt = mean(AttemptsNeeded), median =median(AttemptsNeeded))
ItemLabels$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels$snitt)
ItemLabels

ggplot(FAB3, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(Item))) + 
  facet_wrap(vars(Item), ncol = 2) +
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide = "none") +
  scale_fill_brewer(palette = "RdBu", guide = "none", direction = -1)+
  geom_label(x = 12.5, y = 17.5, aes(label = snitt2), data = ItemLabels, parse = TRUE, 
             hjust = "center", size = 3, label.size = NA)
```

This is easier to compare using box plots, shown in Figure \@ref(fig:FABboxItems). We can think of a box plot as dividing the data into 4 parts with equal numbers of observations. The median (shown as a vertical line) cuts the data in half, with as many observations below it as above, and the box spans the lower (25%) and upper (75%) quartiles. That is, the central 50% of the observations lie within the box, with 25% below it and 25% above it. Mean values (shown as white dots) are higher than medians because of the skewed shape of the distributions. Again there is no large difference discernible in the difficulty of obtaining any of the items, but we can see that robes have the highest mean, followed by biretta and blue ore.

```{r FABboxItems, fig.height = 5, fig.cap = "Box plots of number of attempts needed to get each MA entry"}
ggplot(FAB3, aes(x=AttemptsNeeded, y=Item, fill=Item)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  #scale_fill_viridis_d(option = "turbo", direction = 1, guide="none")
  scale_fill_brewer(palette = "RdBu", guide = "none", direction =-1)

```

```{r FABgroupFigs, out.width= "50%", fig.height= 3, fig.show = "hold", fig.cap= "Number of attempts needed for two groups of items. Histogram of counts, standardised histogram, kernel density estimates and box plots"}
FAB4 <- FAB3 |>
  mutate(ItemGroup = ifelse(Item == "Biretta" | Item == "Robe" , "Robe/Biretta", "Others"))
FAB4$ItemGroup <- as_factor(FAB4$ItemGroup)
ItemLabels2 <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(snitt = mean(AttemptsNeeded), 
            median =median(AttemptsNeeded), 
            n = (length(AttemptsNeeded)))
ItemLabels2$snitt2 <- sprintf("bar(x) == %.2f", ItemLabels2$snitt)
ItemLabels2$n2 <- sprintf("n == %.0f", ItemLabels2$n)
ItemLabels2$snittAndN <- sprintf(
  "bar(x) == %.2f * ',' ~~ n == %.0f",
  ItemLabels2$snitt,
  ItemLabels2$n
)
ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 140, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  geom_histogram(binwidth = 1, colour = "black", 
                 aes(fill=(ItemGroup), y = after_stat(density))) + 
  facet_wrap(vars(ItemGroup), ncol = 2) +
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_fill_manual(values = c("#7a0403", "#95a5a6"), guide = "none") +
  #scale_fill_brewer(palette = "Set1", guide = "none")+
  geom_label(x = 12.5, y = 0.235, aes(label = snittAndN), data = ItemLabels2, parse = TRUE, 
             hjust = "center", vjust = "center", size = 3, label.size = NA)

ggplot(FAB4, aes(AttemptsNeeded)) +
  stat_density(aes(colour = ItemGroup),  linewidth = 2,
               bounds = c(1, Inf), geom="line", position="identity") +
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  #scale_colour_viridis_d(option = "turbo", direction = -1) +
  #scale_colour_manual(values = c("#7a0403","#95a5a6")) +
  #scale_colour_brewer(palette = "Set1", guide = "none")+
  theme(legend.position = c(0.95,0.95),
        legend.justification = c(1,1)) +
  guides(colour = guide_legend(title = NULL))

ggplot(FAB4, aes(x=AttemptsNeeded, y=ItemGroup, fill=ItemGroup)) +
  geom_boxplot() +
  stat_summary(fun=mean, geom="point", shape=20, size=3, color="white", fill="white") +
  ylab(NULL) +
  scale_x_continuous(breaks = seq(0,24, by =2))+
  scale_fill_manual(values = c("#B51F2E", "#377eb8"), guide = "none") 
  #scale_fill_manual(values = c("#7a0403","#95a5a6"), guide="none")
  #scale_fill_brewer(palette = "Set1", guide = "none")
```

```{r FABstats}
table(FAB4$ItemGroup)
(WT <- wilcox.test(FAB4$AttemptsNeeded ~ FAB4$ItemGroup, 
            paired = F, 
            alternative = "greater"))
```

In order to learn whether there is more than random variability in the data we can perform some statistical analyses. From a priori information we expect Robe and Biretta to be harder to make than the other 8 items, so to increase sample size and reduce uncertainty I lumped the data into two groups: Robe+Biretta (n = 148, $\bar{x}$ = `r round(ItemLabels2[1,2], 2)` attempts) vs the other items (n = 592, $\bar{x}$ = `r round(ItemLabels2[2,2], 2)` attempts). The distributions of needed attempts for these two groups are shown in Figure \@ref(fig:FABgroupFigs). The number of attempts was significantly larger for the Robe/Biretta group (Mann-Whitney U-test, one-sided p = `r round((WT$p.value), 2)`).

```{r FABprop, fig.height= 1.75, fig.cap= "Proportion success for the two groups of items, with 95% confidence intervals"}
#restructure data
FABprop <- FAB4 |>
  group_by(ItemGroup) |>
  summarise(success = sum(n()), total = sum(AttemptsNeeded))
#binomial confidence intervals
binom::binom.confint(FABprop$success, FABprop$total) 
(binomRes <- binom::binom.wilson(FABprop$success, FABprop$total))
binomRes$Group <- as_factor(c("Robe/Biretta", "Others"))
#test if equal proportions
(propRes <- prop.test(x = binomRes$x, binomRes$n, alternative = "less"))
kbl(FABprop, label = "FABpropTable", 
    caption = 'Number of successes and attempts when making MA entries for FAB', 
    align ="c",
    col.names = c('Group', 'Successes', 'Total Experiments')) |>
  kable_paper(c("hover", "condensed"), full_width = F) #|>
  #column_spec(1, color = "white", bold = F, 
  #            background = c("#ca0020", "#0571b0"))
binomRes |> 
ggplot(aes(x = Group, y = mean, colour= Group)) +
  ylab("Proportion Success") +
  xlab(NULL) +
  geom_pointrange(aes(ymin = lower, ymax = upper), size =0.75, linewidth = 1)+
  scale_colour_manual(values = c("#B51F2E", "#377eb8"), guide = "none") +
  geom_text(aes(label = round(mean, 3), vjust = -2), size = 3) +
  coord_flip() +
  theme(axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank())
```

We may also, arguably, treat each experiment as independent and ask whether the proportion of success is lower for the Robe/Biretta group than for the other group (Table \@ref(tab:FABpropTable)). The binomial confidence intervals of the two proportions overlapped somewhat (Figure \@ref(fig:FABprop)), but as expected the proportion of success was significantly lower for the Robe/Biretta group ($\chi^{2}$ = `r round(propRes$statistic, 2)`, df = 1, one-sided p = `r round(propRes$p.value, 3)`).

Overall we can say that with the HRE effect active the "difficult" items (Robe and Biretta) needed on average about 5 attempts and the other, "easy", items about 4 attempts, which translates to roughly 20% and 25% chance per attempt, respectively.
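
The translation from mean attempts to a per-attempt chance assumes each attempt succeeds independently with the same probability $p$, so that the number of attempts until the first success is geometric with mean $1/p$:

$$
p \approx \frac{1}{\bar{x}}, \qquad \tfrac{1}{5} = 0.20 \quad \text{and} \quad \tfrac{1}{4} = 0.25.
$$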

```{r FABmeans, fig.cap = "Mean number of attempts needed to get each of the 10 MA entries for FAB"}

ggplot(Snitt, aes(MeanAttemptsNeeded)) +
  geom_histogram(aes(fill= Who), binwidth = 0.5, colour="black") +
  #scale_fill_brewer(palette = "RdBu", direction = 1) +
  scale_fill_manual(values = c("#B51F2E","#F4A582","#A7CFE4", "#377eb8"))+
  #scale_fill_viridis_d(option = "turbo", direction = -1) +
  ylab("Count") +
  xlab("Average attempts needed") +
  annotate("text", x = 2.5, y = 9.5, label = "Alta-lucky-vista", colour = "#B51F2E") +
  geom_curve(
    aes(x = 2.5, y = 8.8, xend = 2.0, yend = 2.5),
    arrow = arrow(
    length = unit(0.03, "npc"), 
    type="closed" 
    ),
    colour = "#B51F2E",
    linewidth = 1.2,
    angle = 0 
  ) +
  scale_y_continuous(breaks = seq(0,20, by =2))+
  theme(
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
    legend.background = element_rect(fill = "white"),
    legend.key = element_rect(fill = "white", color = NA)
  )

```

The mean number of attempts a player needed to obtain each of the 10 items for the MA varied from `r min(Snitt$MeanAttemptsNeeded)` to `r max(Snitt$MeanAttemptsNeeded)`, with a mean of `r round(mean(Snitt$MeanAttemptsNeeded), digits = 2)`. Some people are luckier than others (Figure \@ref(fig:FABmeans)).

As an anecdote, only one character managed to complete the FAB without getting any other MA entries (like gemstones and plates from minor success by-products, Figure \@ref(fig:FABcomplete)). And he did not have any other MA records either. What a guy.

```{r FABcomplete, out.width='40%', fig.align='center', fig.cap="Another unique individual", echo = FALSE}
knitr::include_graphics('Ikea.png')
```

## Robe experiments

A summary of the results is presented in Table \@ref(tab:RobeSummaryTable) below. This is visualized as stacked bars ordered by proportion Robes in Figure \@ref(fig:robeSummaryFigures-1) (per condition) and Figure \@ref(fig:robeSummaryFigures-2) (further divided into the separate batches). Figure \@ref(fig:robeSummaryFigures-3) shows the actual sequence of outcomes for each batch (batches in the same order as in Figure \@ref(fig:robeSummaryFigures-2), but with condition labels omitted to leave more space).

It is difficult to compare stacked bar charts with more than two categories because the middle categories are shifted relative to each other across groups -- they don't share a common baseline. When analysing the effect of the various treatments in the sub-chapters below, the details are therefore presented as the proportion (with 95% confidence intervals) of the relevant category.

<button class="btn btn-primary" data-toggle="collapse" data-target="#RobeTable">

Show/Hide Table 3
</button>

::: {#RobeTable .collapse.show}
```{r RobeSummaryTable}
kbl(RobeSummaryTable, caption = 'Outcome of Robe Experiments for each batch. The percentage Adept Robe is highlighted.', align ="c") |>
  kable_paper(c("hover", "condensed"), full_width = F) |>
  column_spec(9, background = "#92c5de") |> #, color = "white"
  pack_rows("HRE, Advisor, Adept, r20", 1, 3) |>
  pack_rows("HRE, Advisor, r20", 4, 6) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex", 7, 9) |>
  pack_rows("HRE, Advisor, r20, Ex", 10, 12) |>
  pack_rows("HRE, Advisor, Adept, r20, Ex, Oxf", 13, 15) |>
  pack_rows("HRE, Advisor, r11", 16, 17) |>
  pack_rows("HRE, r11", 18, 18) |>
  pack_rows("HRE, r20", 19, 19) |>
  pack_rows("HRE, Adept, r11", 20, 20) |>
  pack_rows("HRE, Adept, r20", 21, 21) |>
  pack_rows("r11", 22, 22) |>
  pack_rows("Adept, r11", 23, 23) |>
  pack_rows("r20", 24, 24) |>
  pack_rows("Adept, r20", 25, 26)
```
:::

```{r robeSummaryFigures, fig.width=6, fig.cap = c("Percentage of outcomes for Robe experiments under different conditions", "Percentage of outcomes split in batches", "Sequence of outcomes for Robe experiments under different conditions, for each batch")}

# get condition descriptions from table, and combine with sample size for figure labels
# and reorder factor levels by % robes. and the items from "hardest to easiest"
# keep them reordered in the file so don't have to do within ggplot
Robe2 <- Robe |> left_join(Design |> select(Condition, Condition2, Totals), by="Condition")
Robe2 <- Robe2 |>
  mutate(Condition3 = paste0(Condition2, " (n = ", Totals, ")")) |>
  mutate(Condition3 = fct_reorder(.f = Condition3, 
                                         .x = Result,
                                         .fun = function(.x) mean(.x == "Robe"),
                                         .desc = TRUE)) |>
  mutate(Result = fct_relevel(Result, "Robe", "Justaucorps", "Waistcoat", "FineVelvet", "Failure"))
  
ggplot(Robe2, aes(Condition3, fill = fct_rev(Result))) + 
  geom_bar(position = "fill", width = 0.9) +
  #shades::brightness(scale_fill_brewer(palette = "RdBu", direction = -1),   shades::delta(-0.05)) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_discrete(expand = expansion(mult = c(0, 0)))+
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1)),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        legend.justification = c(1, 0),
        panel.background = element_rect(fill = "white")) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()

# include batch, within condition
Robe2$GroupBatch <- ave(Robe2$Batch, Robe2$Condition, 
                        FUN = function(x) as.numeric(factor(x)))
ggplot(Robe2, aes(GroupBatch, fill = fct_rev(Result))) + 
  geom_bar(position = "fill",  width = 1) + #colour = "black",
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 1, by = 0.2), expand = expansion(mult = c(0, 0.02))) +
  scale_x_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  ylab("Percentage") +
  xlab(NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(1, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL))+
  coord_flip()+
  facet_wrap(~fct_rev(Condition3), strip.position = "left", ncol = 1, scales = "free_y") +
  theme(strip.text.y.left = element_text(angle = 0, hjust = 1), 
        panel.spacing = unit(0.1,"cm"), 
        strip.background = element_rect(fill = NA))
# time series visualization
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) + #, colour = "black"
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  scale_fill_brewer(palette = "RdYlBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_x_continuous(breaks = seq(0, 120, by = 20), expand = expansion(mult = c(0, 0))) +
  scale_y_continuous(breaks = NULL, expand = expansion(mult = c(0, 0))) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.1,"cm")) +
  xlab("Sequence") +
  ylab(NULL)

```

### HRE effect

```{r HREeffect,  fig.height=4, fig.cap = "Effect of Holy Roman Emperor support alchemy policy (HRE). Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals"}
HRE <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(HRE, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
HRE <- cbind(HRE, 
      binom.confint(HRE$Robe, n = HRE$Total, methods = "wilson"))

HRE <- HRE |> 
  filter(Condition %in% c(7:10) | HRE == "NO") |> #manual subsetting
  mutate(Condition3 = (str_replace(Condition2, c("HRE, "), "")))
  
HRE |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = HRE)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)

```

There was a large and consistent effect of the Holy Roman Emperor policy (Figure \@ref(fig:HREeffect)). Having the HRE effect added around 20 percentage points to the Robe proportion, both at r11 and at r20 alchemy and with or without the adept title.

### Advisor effect

```{r advisorEffect, fig.height=3.5, fig.cap = "Effect of Court Alchemist Advisor position. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals"}
ADV <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Advisor, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADV <- cbind(ADV, 
      binom.confint(ADV$Robe, n = ADV$Total, methods = "wilson"))

ADV <- ADV |> 
  filter(Condition %in% c(1,2,6,7,8,10)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Advisor, "), "")))
  
ADV |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Advisor)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = 0, hjust = -0.5, size = 3)
ADVwide <- ADV |> 
  group_by(Advisor) |> 
  summarise(SumRobes = sum(Robe), SumTots = sum(Total)) |> 
  mutate(SumProp = SumRobes/SumTots)
(ADVpropTestTotal <- prop.test(x = ADVwide$SumRobes, ADVwide$SumTots, alternative = "less"))
(ADVpropTests <- 
  split(ADV, as.factor(ADV$Condition3)) |>  
  map(function (df) prop.test(x = df$Robe, n = df$Total, alternative = "less")))
```

The advisor effect was clearly smaller in magnitude than the HRE effect, consistently adding around 10 percentage points to the Robe proportion in the three contrasts that could be made (Figure \@ref(fig:advisorEffect)). The confidence intervals overlapped, but the effect was significant in the HRE-r11 and HRE-r20 groups, and highly significant when pooling the three groups (46/246 vs 172/574, $\chi^{2}$ = `r round(ADVpropTestTotal$statistic, 2)`, df = 1, one-sided p \< `r round(ADVpropTestTotal$p.value, 3)`).
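
In absolute terms the pooled advisor contrast corresponds to a difference of roughly 11 percentage points (simple arithmetic on the counts quoted above):

```{r, eval=FALSE}
# Pooled Advisor effect as a difference in proportions (counts from the text)
172 / 574 - 46 / 246   # ~0.11, i.e. roughly 10 percentage points
```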

### Adept effect and base success rate

```{r adeptEffect, fig.height=5.5, fig.cap = "Effect of Adept title. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals"}
ADEPT <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Adept, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ADEPT <- cbind(ADEPT, 
      binom.confint(ADEPT$Robe, n = ADEPT$Total, methods = "wilson"))

ADEPT <- ADEPT |> 
  filter(!(Condition %in% c(5,6))) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c("Adept, "), "")))
  
ADEPT |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Adept)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)
ADEPTwide <- ADEPT |> 
  pivot_wider(
  names_from  = c(Adept),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = max(NO, na.rm = T), YES = max(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
adeptdiff <-  round(mean(ADEPTwide$difference), digits = 4)

(adeptEst <- binom.confint(x= 2, n = 223, methods = "wilson"))
(baseEst <- binom.confint(x= 0, n = 217, methods = "wilson"))

```

The adept effect was too small to be precisely measured with these data (Figure \@ref(fig:adeptEffect)), but it is clearly smaller than the HRE and advisor effects. On average (across the 6 contrasts) the effect was to increase the proportion by 0.022 (i.e. to add 2.2 percentage points to the success rate), but the estimates varied from -0.006 to 0.08. Although we do not know the exact base rate, it is most reliable to estimate the adept effect directly from the two samples with adept title but without HRE or advisor effects, because the control-group estimates otherwise suffer from uncertainty as well. If we pool these two samples (which differed only in alchemy rank) we get 2 robes out of 223 trials (0.009), with a 95% confidence interval of 0.0025--0.032. We are therefore 95% certain that the true effect of the adept title is less than 3.2%, with a point estimate of 0.9% (minus the base rate).

The base rate is very low, and below the detection limit of the study. No robes resulted from the 217 trials in the two r11/r20 groups without adept title or other effects (from the two lower panels in Figure \@ref(fig:adeptEffect)). If we pool these we get a 95% confidence interval of 0--0.017. We are thus 95% certain that the true base rate is below 1.7%, with a point estimate of 0.0%. It should be noted that the base rate is known not to be zero -- it is possible to produce the robe without adept title or any other effects active.
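
The 1.7% figure follows from the closed form of the Wilson upper bound when zero successes are observed (with $z = 1.96$ for 95% confidence):

$$
\text{upper} = \frac{z^2}{n + z^2} = \frac{1.96^2}{217 + 1.96^2} \approx 0.017.
$$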

### Alchemy skill rank, EX, Oxford and aide

There was no effect of alchemy rank on the probability of getting a Robe (Figure \@ref(fig:AlchemyR)): in all five comparisons the treatment group (r20) and the control group (r11) were very similar, relative to the uncertainty and to the readily evident effects of HRE and Advisor.

```{r AlchemyR, fig.height=4.5, fig.cap="Effect of Alchemy rank (r11 or r20). Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals"}
ALCH <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Alchemy, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
ALCH <- cbind(ALCH, 
      binom.confint(ALCH$Robe, n = ALCH$Total, methods = "wilson"))

ALCH <- ALCH |> 
  filter(Condition %in% c(2,6:14)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", r11"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c(", r20"), ""))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r11"), "Nothing"))) |> 
  mutate(Condition3 = (str_replace(Condition3, c("r20"), "Nothing")))
  
ALCH |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = as.factor((Alchemy)))) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  labs(colour = "Alchemy rank")+
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)), 
                           size = 3, 
                           direction = "x", 
                           min.segment.length = 5, 
                           seed =666)
```

EX equipment had no effect either. Two comparisons were available: in one the treatment group estimate was higher than the control group (Figure \@ref(fig:EX), top) and in the other it was lower (Figure \@ref(fig:EX), bottom), in both cases with considerable overlap of the confidence intervals. To illuminate the issue further I included the group with Oxford effects on in addition to EX (Figure \@ref(fig:EX), middle). If anything this group should be higher than, or at least similar to, the group without Oxfords. In fact it was lower, and lower than the control group in the top panel. Taken together this shows no measurable effect of EX equipment on Robe success.

```{r EX, fig.height=3.5, fig.cap="Effect of EX equipment. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals. Also includes a group with Oxfords on, but otherwise similar to the top panel."}
EX <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(EX, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
EX <- cbind(EX, 
      binom.confint(EX$Robe, n = EX$Total, methods = "wilson"))

EX <- EX |> 
  filter(Condition %in% c(1:5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Ex"), "")))

EXwide <- EX |> 
  pivot_wider(
  names_from  = c(EX),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
EXdiff <-  round(mean(EXwide$difference), digits = 4)  

EX |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = EX)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)
```

The same comparison also reveals no increase in Robe success from the Oxford skills (for clarity shown separately in Figure \@ref(fig:Oxford)). Unfortunately only this one treatment vs. control contrast was available, but if anything success was reduced with Oxfords on, which suggests random noise.

```{r Oxford, fig.height= 2, fig.cap= "Effect of Oxford skills active. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals"}
OXF <- Robe2 |> 
  mutate(robeSeries = fct_other(Result, keep = "Robe")) |> 
  group_by(Oxfords, Condition2, robeSeries) |>
  summarise(number = n(), Condition = mean(Condition)) |> 
  ungroup() |> 
  pivot_wider(
  names_from  = "robeSeries",
  values_from = "number",
  values_fill = 0) |> 
  mutate(Total = Robe+Other,
         PropRobe = Robe/Total)
OXF <- cbind(OXF, 
      binom.confint(OXF$Robe, n = OXF$Total, methods = "wilson"))

OXF <- OXF |> 
  filter(Condition %in% c(3,5)) |> #dont like manual subsetting of these rows...
  mutate(Condition3 = (str_replace(Condition2, c(", Oxf"), "")))

OXFwide <- OXF |> 
  pivot_wider(
  names_from  = c(Oxfords),
  values_from = "PropRobe") |>
  group_by(Condition3) |> 
  summarise(NO = mean(NO, na.rm = T), YES = mean(YES, na.rm = T)) |> 
  mutate(difference = YES-NO)
OXFdiff <-  round(mean(OXFwide$difference), digits = 4) 

OXF |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Oxfords)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), 
        strip.text.x = element_blank(),
        panel.spacing = unit(0.5,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  ggrepel::geom_text_repel(aes(label = paste("n =", Total)),
  size = 3,
  min.segment.length = 5,
  seed =555)
```

All trials in the Robe set-up were done with the aide in the Paymaster 100s job. However, a comparison with the FAB experiments allows the effect of a paymaster aide to be estimated, since very few characters in the FAB trials had an aide at all. The relevant condition is HRE, with rank 11 or 20 alchemy; these were pooled to increase sample size (rank had no effect). The FAB experiments were done with HRE, at r12--14 (included in Table \@ref(tab:robeDesignTable) for convenience). Figure \@ref(fig:paymaster) shows that having an aide in paymaster did not improve the chance of producing a Robe. If anything, the probability was lower with the paymaster aide, but the difference is well within the uncertainty.

```{r paymaster, fig.height= 2, fig.cap= "Effect of aide in Paymaster. Proportion of experiments that yielded Robe under otherwise similar conditions, with 95% confidence intervals. Note that the control group is from the FAB trials."}
Paymaster <- read_ods("~\\FABtory\\paymaster.ods", sheet =1)
Paymaster$Condition3 <- (as.factor(Paymaster$Condition3))
Paymaster$Condition3 <-fct_relevel(Paymaster$Condition3, "HRE (r11+r20)", "HRE (r12-r14)")

Paymaster |> 
ggplot(aes(y = Condition3, x = PropRobe, colour = Paymaster)) +
  ylab(NULL) +
  xlab("Proportion Robe") +
  geom_pointrange(aes(xmin = lower, xmax = upper), 
                  size =0.75, linewidth = 1,
                  position = position_dodge(width=1)) +
  scale_colour_manual(values = c("#95a5a6", "#2c3e50")) +
  theme(legend.position = "top") +
  facet_wrap(vars(Condition3), strip.position = "left", ncol = 1, scales = "free_y")+
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0,"cm"),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank()) +
  geom_text(aes(label = paste("n =", Total)), vjust = -1, hjust = -0.5, size = 3)
```

Note that the x-axis scale varies between Figures \@ref(fig:AlchemyR)--\@ref(fig:paymaster).

### Effects on failure and minor success

```{r propMinorSucs, fig.height = 3, fig.width = 7, fig.cap="Proportion of minor success (Fine Velvet, Justaucorps or Waistcoat) under different experimental conditions"}
#proportion minorsuccess
RST <- RobeSummaryTable
#remove row for totals
RST <- RST |> 
  filter(Batch < 27) |> 
  mutate(Condition = as.numeric(Condition))
RST <- RST |> 
  mutate(Minor = (FineVelvet+Justaucorps+Waistcoat)) 
RST <- RST |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total) 
RST <- RST |> left_join(Design |> 
                          select(Condition, Condition2), by="Condition")
RST <- cbind(RST, 
      MinorProp=binom.confint(RST$Minor, n = RST$Total, methods = "wilson"))

#combine batches
RST2 <- RST |> 
  group_by(Condition2) |> 
  summarise(
    Condition = mean(Condition),
    Failure = sum(Failure),
    FineVelvet = sum(FineVelvet),
    Justaucorps = sum(Justaucorps),
    Robe = sum(Robe),
    Waistcoat = sum(Waistcoat),
    Total = sum(Total),
    Minor = sum(Minor)
  ) |> 
  mutate(propMinor = (FineVelvet+Justaucorps+Waistcoat)/Total,
         propRobe = Robe/Total,
         propFailure = Failure/Total,
         propJustau = Justaucorps/Total,
         propVelvet = FineVelvet/Total,
         propWaist = Waistcoat/Total,
         propJustau_minor = Justaucorps/Minor,
         propVelvet_minor = FineVelvet/Minor,
         propWaist_minor = Waistcoat/Minor
         ) 
RST2  <- RST2 |> 
  mutate(Condition2 = as.factor(Condition2)) |> 
  mutate(Condition2 = fct_reorder(Condition2, propRobe, .fun = mean)) |> 
  mutate(Condition2 = fct_rev(Condition2))
RST2 <- cbind(RST2, 
              propMinor = binom.confint(RST2$Minor, n = RST2$Total, methods = "wilson"))
#overall proportion minor success:
propMinorOverall <- sum(RST2$Minor)/sum(RST2$Total)
propMinorOverall2 <- mean(RST2$propMinor)

RST2 |> 
  ggplot(aes(y = Condition2, x = propMinor)) +
  ylab(NULL) +
  xlab("Proportion minor success") +
  geom_vline(xintercept = propMinorOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propMinor.lower, xmax = propMinor.upper), 
                  size =0.5, linewidth = 0.75,
                  position = position_dodge(width=1)) +
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

```

So far we have only considered Robes. We will now look more closely at the other outcomes: failure and minor successes.

The proportion of minor success (justaucorps, waistcoat and fine velvet together) was remarkably constant under all conditions (overall mean = 0.361), and thus seems fixed at roughly 35--40% (i.e. a split of about 35:65, or perhaps 2:3, between minor successes on one side and failures plus huge successes on the other), unaffected by the factors that influence huge success (Figure \@ref(fig:propMinorSucs)).
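
One could check this constancy formally with an overall test of equal proportions across the 14 conditions (a sketch using the `RST2` summary built in the chunk above; it treats all experiments as independent):

```{r, eval=FALSE}
# Homogeneity of the minor-success proportion across conditions
prop.test(x = RST2$Minor, n = RST2$Total)
```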

The surprising consequence is therefore that all that can be attained by changing the conditions is to shift the share of outcomes *within* each of these two blocks: the minor successes, and the combined failure/huge-success remainder.

```{r otherProp, fig.height = 3, fig.width = 7, fig.show = "hold", fig.cap= "Proportion of all possible outcomes: Failure; Robe; Justaucorps; Fine Velvet; Waistcoat. Dashed lines show the mean for each item."}
RST2 <- cbind(RST2, 
              propFailure = binom.confint(RST2$Failure, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propRobe = binom.confint(RST2$Robe, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau = binom.confint(RST2$Justaucorps, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet = binom.confint(RST2$FineVelvet, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist = binom.confint(RST2$Waistcoat, n = RST2$Total, methods = "wilson"))
RST2 <- cbind(RST2, 
              propJustau_minor = binom.confint(RST2$Justaucorps, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propVelvet_minor = binom.confint(RST2$FineVelvet, n = RST2$Minor, methods = "wilson"))
RST2 <- cbind(RST2, 
              propWaist_minor = binom.confint(RST2$Waistcoat, n = RST2$Minor, methods = "wilson"))
rank11 <- c(6,7,9,11,12)
RST2 <- RST2 |> 
  mutate(AlchemyRank = ifelse(Condition %in% rank11, 11, 20))

propFailureOverall <- sum(RST2$Failure)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propFailure.mean)) +
  ylab(NULL) +
  xlab("Proportion Failure") +
  geom_vline(xintercept = propFailureOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propFailure.lower, xmax = propFailure.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propRobeOverall <- sum(RST2$Robe)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propRobe.mean)) +
  ylab(NULL) +
  xlab("Proportion Adept's Robe") +
  geom_vline(xintercept = propRobeOverall, linetype = "dashed", colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propRobe.lower, xmax = propRobe.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propJustauOverall <- sum(RST2$Justaucorps)/sum(RST2$Total)

propJustauR11 <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)
propJustauR20 <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Total)) |> 
  pull(out)

JustauPropAlchemy <- RST2 |>
  group_by(AlchemyRank) |>
  summarise(Justaucorps = sum(Justaucorps), Total = sum(Total))

(JustauAlchRes <- 
    binom.wilson(JustauPropAlchemy$Justaucorps, JustauPropAlchemy$Total))
JustauAlchRes$Group <- as_factor(c("r11", "r20"))
#test if equal proportions
(JustauAlchResTest <- 
    prop.test(x = JustauAlchRes$x, JustauAlchRes$n, alternative = "two.sided"))

RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Admiral Justaucorps") +
  geom_vline(xintercept = propJustauOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau.lower, xmax = propJustau.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")
 

propVelvetOverall <- sum(RST2$FineVelvet)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet.mean)) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet") +
  geom_vline(xintercept = propVelvetOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propVelvet.lower, xmax = propVelvet.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

propWaistOverall <- sum(RST2$Waistcoat)/sum(RST2$Total)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist.mean)) +
  ylab(NULL) +
  xlab("Proportion Waistcoat") +
  geom_vline(xintercept = propWaistOverall, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_pointrange(aes(xmin = propWaist.lower, xmax = propWaist.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))

```

Figure \@ref(fig:otherProp) shows the proportion of all possible outcomes, including Robes again for ease of comparison.

Since the proportion of minor success was constant, the proportion of failure (Figure \@ref(fig:otherProp), upper panel) was in consequence largely an inverse mirror of the proportion of Robes: conditions that increased the proportion of Robes lowered the proportion of failures and vice versa. As with Robes, we can readily see three clusters: Advisor conditions had the lowest proportion of failures, HRE somewhat more, and conditions without HRE considerably more, with the adept effect less clear (Figure \@ref(fig:otherProp)).

The proportions of the various minor successes appeared to be unaffected by changing the conditions, except for an effect of alchemy rank on Justaucorps, discussed below.

```{r withinMinor, fig.height = 3, fig.width= 7, fig.show = "hold", fig.cap="Proportion of the three possible outcomes of minor success under different conditions. Dashed lines shows the mean"}
#proportions within the minor-success category
#prop.test
propJustauOverall_Minor <- sum(RST2$Justaucorps)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propJustauR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
propJustauR20_Minor <- RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Justaucorps)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propJustau_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Justaucorps among minor successes") +
  geom_vline(xintercept = propJustauOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propJustauR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propJustauR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propJustau_minor.lower, xmax = propJustau_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")


propVelvetOverall_Minor <- sum(RST2$FineVelvet)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propVelvetR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)
propVelvetR20_Minor <- RST2 |>  
  filter(AlchemyRank == 20) |> 
  summarise(out = sum(FineVelvet)/sum(Minor)) |> 
  pull(out)

RST2 |> 
  ggplot(aes(y = Condition2, x = propVelvet_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Fine Velvet among minor successes") +
  geom_vline(xintercept = propVelvetOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propVelvetR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propVelvetR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propVelvet_minor.lower, xmax = propVelvet_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")

propWaistOverall_Minor <- 
  sum(RST2$Waistcoat)/(sum(RST2$Justaucorps)+sum(RST2$Waistcoat)+sum(RST2$FineVelvet))
propWaistR11_Minor <- RST2 |>  
  filter(AlchemyRank == 11) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
propWaistR20_Minor <- 
  RST2 |>  filter(AlchemyRank == 20) |> 
  summarise(out = sum(Waistcoat)/sum(Minor)) |> 
  pull(out)
RST2 |> 
  ggplot(aes(y = Condition2, x = propWaist_minor.mean, 
             colour = as.factor(AlchemyRank))) +
  ylab(NULL) +
  xlab("Proportion Waistcoat among minor successes") +
  geom_vline(xintercept = propWaistOverall_Minor, linetype = "dashed", 
             colour = "#95a5a6")+
  geom_vline(xintercept = propWaistR11_Minor, linetype = "dashed", 
             colour = "#377eb8")+
  geom_vline(xintercept = propWaistR20_Minor, linetype = "dashed", 
             colour = "#B51F2E")+
  geom_pointrange(aes(xmin = propWaist_minor.lower, xmax = propWaist_minor.upper), 
                  size =0.5, linewidth = 0.75)+
  scale_x_continuous(breaks = seq(0, 1, by = 0.2), limits = c(0,1))+
  scale_colour_manual(values = c("#377eb8", "#B51F2E"))+
  theme( 
    legend.position = c(0.95,0.95),
    legend.justification = c(1,1),
  )+
  labs(colour = "Alchemy rank")

```

Figure \@ref(fig:withinMinor) shows the proportion of outcomes under different conditions when the result was a minor success (expressed as share of the sum of minor success, not total trials).

There are 3 possible minor-success outcomes. Among these, Admiral Justaucorps can be considered a higher-tier outcome than Fine Velvet or Waistcoat. If we look at the outcomes within the minor-success group, none of the experimental conditions that affected huge success have any effect here: Justaucorps is no more likely with the adept title, HRE policy or advisor position active. However, there is now a noticeable effect of alchemy rank (which had no effect on Robes or failures). Significantly more Justaucorps were produced at r20 than at r11 alchemy, in every condition where alchemy rank could be contrasted (Figure \@ref(fig:withinMinor), upper panel). Overall, the proportion of Justaucorps at r11 alchemy was clearly lower than at r20 (15/540 vs 178/1422, $\chi^{2}$ = `r round(JustauAlchResTest$statistic, 2)`, df = 1, two-sided p \< 0.0001).

In other words, if you for some reason want to make Admiral Justaucorps, it doesn't help to have the adept title, HRE policy or advisor position. Only higher alchemy skill rank will help. It is a rare outcome at rank 11 -- the base probability can be estimated at 2.8%. At r20 it increased to 12.5%, i.e. a gain of about 1 percentage point per rank. At r20 the three minor-success outcomes are roughly equally likely. For the other two items there did not seem to be any effects, apart from both decreasing slightly in frequency at r20 as a consequence of the increased Justaucorps (Figure \@ref(fig:withinMinor), two lower panels).
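
The "about 1 percentage point per rank" figure is just a linear interpolation between the two measured ranks, using the pooled counts quoted above; nothing in the data guarantees that the increase is actually linear in rank:

```{r, eval=FALSE}
# Rough per-rank gain in Justaucorps probability between r11 and r20,
# assuming (without evidence) a linear increase with alchemy rank
p_r11 <- 15 / 540      # ~0.028
p_r20 <- 178 / 1422    # ~0.125
(p_r20 - p_r11) / (20 - 11)   # ~0.011 per rank
```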

```{r fixedscatter,  fig.show = "hold", fig.cap= "Scatterplot of proportion robe vs. proportion failures per experimental batch (upper) and condition (lower). Dotted line shows a situation where the sum of robes and failures have a fixed share of 65% of the total (y = 0.65-x). The red line is the actual regression"}

#per batch:
RST |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")
#per condition (more precise, but fewer data points):
RST2 |> 
  ggplot(aes(y = Failure/Total, x = Robe/Total)) +
  geom_segment(aes(x = 0, y = 0.65, xend = 0.65, yend = 0),
               colour = "#95a5a6", 
               linetype ="dashed", 
               linewidth = 1)+
  annotate("text", x = 0.57, y = 0.21, 
           label ="Fixed 65% \ny = 0.65 − x",
           colour = "#95a5a6") +
  ylim(0,0.7) +
  xlim(0,0.7) +
  ggpmisc::stat_ma_line(se = FALSE, colour = "#B51F2E") +
  ggpmisc::stat_ma_eq(mapping = use_label(c("eq", "R2")), 
                      colour = "#B51F2E", label.x = 0.15, label.y = 0.9) +
  geom_point(size= 3, shape = 21, fill = "black", colour = "white")

cor.test(RST$Failure/RST$Total, RST$Robe/RST$Total)
cor.test(RST2$Failure/RST2$Total, RST2$Robe/RST2$Total)
ppcor::pcor.test(RST$Robe, RST$Failure, RST$Total)
ppcor::pcor.test(RST2$Robe, RST2$Failure, RST2$Total)
```

Failure and robes were very strongly negatively related, in a way consistent with a common share fixed at 60--65% of the outcomes (Figure \@ref(fig:fixedscatter)). In the figure I used a model II major axis regression instead of ordinary least squares, because we do not want to predict y from x but rather describe the slope of the relationship, and both variables are subject to random variation.[^1]

[^1]: Of course, the high negative correlation seen in Figure \@ref(fig:fixedscatter) is meaningless in itself, because it is partly due to the same total number of trials being used to calculate both proportions and the variables are not free to vary independently -- correlation between parts of a whole is inherently spurious. However, this does not account for the whole relationship, since the other correlations between the proportions were much weaker (Figure \@ref(fig:corrmatrix)), and the correlation between failures and robes while controlling for totals is still strong: $r_{partial}$ = -0.91, n = 14, p \< 0.0001.

In fact, such a strong negative relationship is inevitable if the share of outcomes that are minor successes (Figure \@ref(fig:propMinorSucs)) is fixed, leaving Robe and failure a constant combined percentage to share between them, irrespective of experimental conditions. Changing the conditions merely slides the outcome proportions up and down along the line in Figure \@ref(fig:fixedscatter).
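
A small simulation with entirely synthetic numbers (not the study data) illustrates the point: if the minor-success share is pinned at 0.35 and only the Robe probability varies between "conditions", the simulated Robe and failure proportions necessarily scatter around the line $y = 0.65 - x$:

```{r, eval=FALSE}
# Synthetic illustration of the compositional constraint (made-up parameters)
set.seed(1)
p_robe <- runif(14, 0, 0.35)   # hypothetical per-condition Robe probabilities
sims <- sapply(p_robe, function(p)
  rmultinom(1, size = 150, prob = c(Robe = p, Minor = 0.35, Failure = 0.65 - p)) / 150)
sims <- as.data.frame(t(sims))
names(sims) <- c("Robe", "Minor", "Failure")
cor(sims$Robe, sims$Failure)   # strongly negative purely by construction
```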

Moreover, inspection of a correlation matrix between all five outcomes (Figure \@ref(fig:corrmatrix)) shows that only the proportion of robes is strongly related to failure. The proportion of justaucorps is negatively related to fine velvet and waistcoat, but less strongly, as can be expected when three outcomes covary rather than only two, as for robe and failure.[^2]

[^2]: Figure \@ref(fig:corrmatrix) also includes variables for proportions within the minor-success category, denoted \_minor. The caveats explained in footnote 1 apply even more strongly to these.

```{r corrmatrix, fig.height = 7, fig.width = 8, out.width= "75%", fig.cap= "Correlogram for proportion of different outcomes. Lower triangular shows the correlation matrix, upper triangular shows the ellipsis matrix, where the eccentricity of the ellipse is scaled to the correlation value"}
RST2matrix <- RST2 |> 
  select(propRobe, propFailure, propMinor, propJustau, propVelvet, propWaist, propJustau_minor, propVelvet_minor, propWaist_minor) |> 
  cor()

corrplot.mixed(RST2matrix,
 upper = "ellipse",
 lower = "number",
 order = 'AOE', 
 tl.col = 'black',
 tl.pos = "lt",
 diag = "n",
 tl.srt = 30,
 bg = "#EBEBEB",
 #mar = c(0,1,1,1)
 )
```

### Non-randomness and biases in the underlying RNG algorithm

```{r  include=FALSE}
# runs-test for whole data, should be some non-randomness because batches are from different distributions
Robe2$robeSeries <- fct_other(Robe2$Result, keep = "Robe")
tseries::runs.test(Robe2$robeSeries)
snpar::runs.test(as.numeric(Robe2$robeSeries))
sjekk <- randtests::runs.test(as.numeric(Robe2$robeSeries), threshold = 1.5) 
DescTools::RunsTest(as.numeric(Robe2$robeSeries)) 

# runs-tests for each BATCH, should not be any non-randomness
Robe2$Batch2 <-
  paste("Batch ", Robe2$Batch, ", ", Robe2$Condition2, sep = "")
runsOutputBatch <- split(Robe2, Robe2$Batch2) |>
  map(function (df)
    broom::tidy(randtests::runs.test(as.numeric(df$robeSeries), threshold = 1.5)))
runsOutputBatch <- bind_rows(runsOutputBatch, .id = "ID")
runsTableBatch <- datatable(runsOutputBatch,
          options = list(pageLength = 5),
          caption = 'Runs tests for each batch') |>
  formatRound(columns = c('statistic', 'p.value'),
              digits = 3)

# runs-tests for each CONDITION, should not be any non-randomness
Robe2$Condition4 <-
  paste("Condition ", Robe2$Condition, ", ", Robe2$Condition2, sep = "")
runsOutputCond <- split(Robe2, Robe2$Condition4) |>
  map(function (df)
    broom::tidy(randtests::runs.test(as.numeric(df$robeSeries), threshold = 1.5)))
runsOutputCond <- bind_rows(runsOutputCond, .id = "ID")
runsTableCond <- datatable(runsOutputCond,
          options = list(pageLength = 5),
          caption = 'Runs tests for combination of batches under same conditions') |>
  formatRound(columns = c('statistic', 'p.value'),
              digits = 3)

# autocorrelation, just white noise
acf(as.numeric((Robe2$robeSeries)))
acf(Robe2$robeSeries)
pacf(Robe2$robeSeries)
# new variables: flag whether each batch / condition contains at least one robe
Robe2 <- Robe2 |>
  group_by(Batch) |>
  mutate(robeSeries2 = case_when(
    robeSeries == "Robe" ~ 1,
    robeSeries == "Other" ~ 0)) |>
  summarise(robeInBatch = max(robeSeries2)) |>
  right_join(Robe2)

Robe2 <- Robe2 |>
  group_by(Condition) |>
  mutate(robeSeries2 = case_when(
    robeSeries == "Robe" ~ 1,
    robeSeries == "Other" ~ 0)) |>
  summarise(robeInCondition = max(robeSeries2)) |>
  right_join(Robe2)
Robe2 <- ungroup(Robe2)
# filter on these to drop batches/conditions without any robes,
# which would otherwise throw an error in acf

ACFwhole <- autoplot(acf(Robe2$robeSeries, plot = FALSE), 
                     main = "Whole time series",
                     conf.int.colour = "#B51F2E")
PACFwhole <- autoplot(pacf(Robe2$robeSeries, plot = FALSE), 
                      main = "Whole time series", 
                      ylab = "Partial ACF",
                      conf.int.colour = "#B51F2E")

RobesInBatchSeries <- Robe2 |> 
  filter(robeInBatch == 1)
autoplot(acf(RobesInBatchSeries$robeSeries, plot = FALSE))
BatchACF <- split(RobesInBatchSeries, RobesInBatchSeries$Batch2) |> 
  map(function (df) acf(df$robeSeries, plot = FALSE))
BatchACF <- autoplot(BatchACF, 
                     ncol = 3, 
                     conf.int.colour = "#B51F2E")
#PACF
BatchPACF <- split(RobesInBatchSeries, RobesInBatchSeries$Batch2) |> 
  map(function (df) pacf(df$robeSeries, plot = FALSE))
BatchPACF <- autoplot(BatchPACF, 
                      ncol = 3, 
                      ylab = "Partial ACF", 
                      conf.int.colour = "#B51F2E")

RobesInConditionSeries <- Robe2 |> 
  filter(robeInCondition == 1)
autoplot(acf(RobesInConditionSeries$robeSeries, plot = FALSE))
ConditionACF <- split(RobesInConditionSeries, 
                      fct_drop(RobesInConditionSeries$Condition2)) |> 
  map(function (df) acf(df$robeSeries, plot = FALSE))
ConditionACF <- autoplot(ConditionACF, 
                         ncol = 3, 
                         conf.int.colour = "#B51F2E")
#PACF
ConditionPACF <- split(RobesInConditionSeries, RobesInConditionSeries$Condition2) |> 
  map(function (df) pacf(df$robeSeries, plot = FALSE))
ConditionPACF <- autoplot(ConditionPACF, 
                          ncol = 3, 
                          ylab = "Partial ACF", 
                          conf.int.colour = "#B51F2E")

#plot text Batch/condition number?
```

```{r wholeSequence, fig.height=1.5, fig.width= 9, fig.cap = "Sequence of outcome for all robe experiments"}
# sequence plots robe vs non-robe
#whole sequence:
ggplot(Robe2, aes(x=Sequence, fill = fct_rev(Result))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_brewer(palette = "RdBu", direction = -1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)
```

A fundamental assumption in the analyses presented above is that each trial is statistically independent and that the probability of success is the same for all trials over which a confidence interval is calculated. Neither need be the case, so both assumptions should be checked.

To test for biases in the pseudo-random number generator used, we can look at the sequence of outcomes (Figure \@ref(fig:wholeSequence)). Non-randomness would show up either as more clustering of the same result, or as results more evenly spread out than expected by chance. One way of measuring this is to count the number of "runs" -- maximal segments of the sequence with the same value. For example, if there are certain "lucky times" (when one should keep producing) and certain bad ones (when one should immediately cancel), the outcomes would cluster and there would be fewer, and longer, runs in the data than expected by chance.
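
As a toy illustration of the runs logic (simulated data with an arbitrary 25% success rate and 500 trials, not part of the analysis), the number of runs can be counted with run-length encoding and compared to its expectation under randomness:

```{r, eval=FALSE}
# Count runs in a simulated i.i.d. Bernoulli sequence and compare with
# the expected number of runs under randomness (Wald-Wolfowitz).
set.seed(42)
x <- rbinom(500, size = 1, prob = 0.25)   # 1 = Robe, 0 = non-Robe (hypothetical)
n1 <- sum(x == 1)
n0 <- sum(x == 0)
observed_runs <- length(rle(x)$lengths)   # maximal segments of identical values
expected_runs <- 1 + 2 * n1 * n0 / (n1 + n0)
c(observed = observed_runs, expected = round(expected_runs, 1))
# The runs-test formalises this comparison:
randtests::runs.test(x, threshold = 0.5)
```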

We only consider a binary outcome of Robe vs. non-Robe, shown in Figure \@ref(fig:wholeSequenceDico) below.[^3] There was a highly significant deviation from randomness in the complete sequence (Wald-Wolfowitz runs-test, p = 0.003).

[^3]: Runs-tests and auto-correlation for multinomial outcomes are computationally more involved and beyond the scope of this report, but see [Weiss (2017)](https://www.wiley.com/en-nl/An+Introduction+to+Discrete+Valued+Time+Series-p-9781119096993){target="_blank"}.

```{r wholeSequenceDico, fig.height=1.5, fig.width= 9, fig.cap = "Sequence of outcome for all robe experiments, dichotomized"}
ggplot(Robe2, aes(x=Sequence, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  #scale_x_continuous(breaks = seq(0, 2000, by = 100)) +
  scale_x_continuous(expand = expansion(mult = .025)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  xlab("Sequence") +
  ylab(NULL)
```

However, this is entirely as expected, because a potential source of non-randomness was deliberately introduced: the changing conditions. If the conditions do affect the outcome of the experiments, the complete sequence cannot be used to distinguish the effect of the conditions from any other non-randomness.

Instead we must look at the batch sequences separately (Figure \@ref(fig:sequenceBatch)), or at the combined sequence of several batches done under the same experimental conditions (Figure \@ref(fig:sequenceCondition)). The two can differ: if, for example, closing and re-opening the production dialogue window somehow matters, a deviation should appear in the combined sequences from several batches but not within single batches.[^4]

[^4]: I acknowledge, but ignore, the problem that single batches inherently have less statistical power because of their smaller sample sizes.

```{r sequenceBatch, fig.height= 4, fig.cap="Sequence of outcome for robe experiments, per batch"}
# per batch:
ggplot(Robe2, aes(x=BatSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3), GroupBatch), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 120, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)
```

```{r sequenceCondition, fig.height= 4, fig.cap="Sequence of outcome for robe experiments, per condition"}
# per condition:
# first make a sequence within condition.

Robe2 <- Robe2 |> 
  group_by(Condition) |> 
  mutate(StartSeq = min(Sequence),
         CondSeq = Sequence-StartSeq+1)
Robe2 <- ungroup(Robe2)
         
ggplot(Robe2, aes(x=CondSeq, fill = fct_rev(robeSeries))) +
  geom_bar(position = "fill", width = 1) +
  facet_wrap(vars(fct_rev(Condition3)), strip.position = "left", ncol = 1) +
  #scale_fill_viridis_d(option = "turbo", direction = 1)+
  scale_fill_manual(values = c("#95a5a6","#B51F2E")) +
  scale_x_continuous(breaks = seq(0, 220, by = 20)) +
  scale_y_continuous(breaks = NULL) +
  theme(legend.background = element_rect(fill = "#EBEBEB"),
        legend.key = element_rect(fill = NA, color = NA),
        legend.position = "top",
        axis.text.y = element_text(size = rel(1.2)),
        legend.justification = c(0.5, 0)) +
  guides(fill = guide_legend(reverse = TRUE, title = NULL)) +
  theme(strip.text.y.left = element_blank(), #element_text(angle = 0, hjust = 0),
        strip.text.x = element_blank(),
        panel.spacing = unit(0.05,"cm")) +
  xlab("Sequence") +
  ylab(NULL)
```

There were very few significant deviations from randomness in the runs of the separate batches, and none in the batches combined under the same conditions (runs-test results in the [Appendix](#runs-tests)); given the number of tests performed, this is what randomness would produce. Hence, the pseudo-random number generator behind the results appears unbiased and consistent with a random process. This result (non-randomness in the complete sequence but not within batches or conditions) therefore provides additional statistical confirmation that the experimental conditions affected the outcome.
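
As a rough check on this multiple-testing argument, the sketch below (using the per-batch runs-test table assembled above) compares the number of nominally significant results with the number expected by chance alone at the 5% level:

```{r, eval=FALSE}
# Number of nominally significant per-batch runs-tests vs. the number
# expected by chance alone if every batch were truly random.
alpha <- 0.05
c(significant = sum(runsOutputBatch$p.value < alpha),
  expected_by_chance = alpha * nrow(runsOutputBatch))
```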

Another approach to detecting non-randomness is to treat the sequence of outcomes as a time series and look for auto-correlation, that is, correlation between the series and a lagged version of itself. Auto-correlation could be caused by cycles or trends, or more generally arise whenever the outcome of one experiment is related to the outcome of the previous one (or of one a certain number of steps back).

Auto-correlation function (ACF) plots and partial auto-correlation function plots (i.e., controlled for auto-correlation at shorter lags) are presented in the [Appendix](#auto-correlation). Again, any (weak) auto-correlation could only be seen in the complete sequence of experiments and not in the separate batches, as expected if the only source of non-randomness in the outcomes was the variation in the factors studied. In other words, the time series under constant conditions looked like white noise, which is what a random process should produce.
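
For reference, the sketch below (simulated data of arbitrary length and rate, not the experimental series) shows what the ACF of a genuinely memoryless sequence looks like: apart from lag 0, essentially all lags stay within the confidence band.

```{r, eval=FALSE}
# ACF of a simulated i.i.d. Bernoulli sequence: white noise, so no lag
# should stand out beyond the confidence band (apart from lag 0).
set.seed(1)
noise <- rbinom(2000, size = 1, prob = 0.25)
autoplot(acf(noise, plot = FALSE), conf.int.colour = "#B51F2E")
```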

# Discussion

> "I guess we'll be traveling in uncharted waters."\
> "That'll be fun", Phil said.[^5]
>
> ::: {style="text-align: right;"}
> `r tufte::quote_footer("Lemony Snicket (A Series of Unfortunate Events)")`
> :::

[^5]: He could have said FABulous

In the English version of UWO, the in-game descriptions of the adept title, the HRE position and the advisor role all refer to an increased "chance of success" of alchemy experiments. This is somewhat vague, as it does not state explicitly whether it refers to "huge success", to any success (as opposed to failure), or to both. The results presented here show that these effects specifically target the chance of "huge success" and have no effect on the other outcomes.

Compared to the normal rate (with the adept title) of approximately 1 robe or biretta per 100 experiments, the HRE effect is clearly enormous: with HRE active I got approximately 1 robe per 5 experiments, or 20% success. With the advisor role on top, success increased by a further 10 percentage points, to 30%.

It was surprising that the general HRE effect is larger than the court advisor effect; it would arguably have made more sense the other way around. One wider ramification of this finding is that the other HRE effects may be of the same magnitude, which fits with observations of, for example, the ganador drop rate HRE effect.

Making the FAB album during an HRE boost obviously saves a lot of materials and suffering, and as shown here it can be done by any character, even a low level one in a starter barca, provided they have access to rank 2 quarters (and some help).

However, deciding to do further experiments on Robes did not save a lot of materials and suffering.

By comparing experiments in which certain conditions were manipulated while all other factors were held constant, it was possible to measure the effect of various factors thought to influence the success of alchemy experiments. There was no discernible effect of having Oxford production skills active or not (including item improvement technique); if anything, success was better with Oxfords off. The EX gear for better-tier great success did not seem to influence alchemy experiments either. Alchemy rank and paymaster aide rank had no effect either. All this is as expected from the "alchemy experiments are different" camp, and supports the view that huge success in alchemy experiments is not analogous to high-tier great success in normal crafting.

However, alchemy rank *did* have an effect on minor success, with higher rank increasing the probability of admiral justaucorps. The magnitude of the effect was estimated at +1% per skill rank above the minimum requirement. This suggests that the minor success outcome may be influenced by the same factors as normal crafting. But why, then, did only alchemy rank affect the outcome within minor success? I suspect that Oxford and EX did in fact have an effect, but only on the *stats* of the items. This was not tested, and the data are now lost, since I did not keep track of which experiment each Adept's Robe came from. Just like Darwin (who did not label which Galapagos island each of his finches had been collected on), I can say in hindsight that this should have been done. It is left to future studies to investigate whether alchemy rank, Oxford and EX affect the stats of minor and huge success alchemy experiments.

An unexpected result was that the rate of minor success was constant under all treatments. Increasing the rate of justaucorps by having higher alchemy rank did not change the rate of failures or robes, but only shuffled the proportions of the other minor success options within their fixed common share of all outcomes.

My interpretation is this: the factors influencing the chance of great success in normal crafting play no role in alchemy experiments because, unlike in normal crafting, the rate of failure is not a separate parameter in alchemy experiments. With the minor success rate fixed, the failure rate is simply whatever share of the outcomes huge success leaves behind (Figure \@ref(fig:fixedscatter)).

For some alchemy experiments more than one kind of great success is also possible (such as plates from processed lumber and metal works). I have no data on what, if anything, might affect the rates of the different great successes. I also have no data on the different normal successes for recipes other than robe. A reasonable conjecture is that these too are affected by alchemy rank, and that their common share of the outcomes is constant, as for the robe recipe.

Perhaps surprisingly, there was no measurable effect of the Adept title, though with a high degree of confidence its effect could be bounded below 3%; it may well be around 1%. Larger sample sizes than obtained here would be needed to estimate such small effects precisely. This is illustrated by the replicates run under some conditions, which showed considerable inter-batch variation (sometimes of the same magnitude as the variation among conditions). Differences between batches subject to exactly the same conditions could be mistaken for some kind of bias in the RNG[^6], but in reality they are what small samples from a random process are expected to produce, as shown by the tests here that failed to detect any non-randomness.

[^6]: Rumours of flawed (and exploitable) RNG algorithms in UWO have floated around since the launch of the Gama server. One idea, for example, is that there are good and bad sessions, so that if you fail you should close the dialogue screen and start over until you hit a better session (or maybe wait a bit until the bad times are over; who knows).

The effect of adept is clearly very small compared to the HRE effects, and in practice it does not matter under HRE conditions. Outside of HRE, however, it does matter. In fact, if the adept effect is indeed +1%, my guess is that the base rate without the adept title is 0.1%. In terms of expected materials consumed, an increase from 0.1% to even 1% success is substantial, whereas an increase from 30% to 31% is negligible, as the sketch below illustrates.
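
In expectation, the number of experiments per robe is $1/p$, so the point estimates discussed above translate roughly as follows (a back-of-the-envelope sketch):

```{r, eval=FALSE}
# Expected number of experiments per robe at the stated success rates.
p <- c("0.1%" = 0.001, "1%" = 0.01, "30%" = 0.30, "31%" = 0.31)
round(1 / p, 1)
# 0.1%: 1000   1%: 100   30%: 3.3   31%: 3.2
```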

Since the adept effect is so small, I did not try to resolve another uncertainty: whether one can leech the adept effect by being apprenticed to an adept. This is plausible (as for some other titles) and could be worth exploiting for obtaining FAB in normal circumstances when the HRE effect is absent. However, one could not then simultaneously get the +1 alchemy from being apprenticed to an alchemy meister. In the FAB experiments all characters were apprenticed to a meister for the +1. There is no reason to think this affected the outcome, because the corresponding success effect runs the other way (it benefits the meister, not the apprentice), and such effects would not have influenced the chance of huge success anyway, according to the present report.

The data on robe success from the FAB experiments were collected in a slightly different way from the Robe experiments: in the FAB study, data collection for a player stopped when a robe was obtained, whereas in the Robe experiments it stopped when no base clothes were left in the inventory. One could therefore argue that the FAB setup is biased in favour of success compared to the Robe setup. However, if the observations are independent this does not in fact create a bias, because each data series can be considered as continuing in the next player, just as in the Robe experiments.[^7]

[^7]: Only for the last individual would a slight bias occur, which is negligible.
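
A small simulation illustrates why the stopping rule does not bias the pooled estimate (a sketch with an assumed success probability of 0.25 and 10000 simulated players; the specific numbers are arbitrary): each player stops at their first robe, yet the total number of robes divided by the total number of trials still recovers the underlying rate.

```{r, eval=FALSE}
# Pooled success rate when each simulated player's series stops at the
# first robe: the ratio of total robes to total trials stays close to p.
set.seed(7)
p <- 0.25
n_players <- 10000
trials_per_player <- rgeom(n_players, prob = p) + 1  # failures before the first robe, plus the robe itself
n_players / sum(trials_per_player)                   # one robe per player; close to 0.25
```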

It would have been interesting to compare the individual effect of court alchemist with the global effect that applies to all supporters of the HRE policy. If the effects found in this study are additive, it should be +10%. This could not be fully tested here, since the only character with the advisor effect necessarily also had the global effect active, so no player had the HRE advisor position alone. It must be tested during an election where there is no global alchemy success policy but the court alchemist position is still available. Such elections seem to be rare, however; court alchemist is often not offered as an option. (Note that there are skill rank and job requirements for certain advisor roles.)

This is not a pressing concern, though, since the only item worth making is the adept's robe, as base material for Alchemist robe EX (see the [Appendix](#robe-EX) for how to make Robe EX). After this event it is fair to say we have an over-abundance of adept's robes, and the market for Alchemy Robe EX is not endless (although I made and sold 125 of them after the last time we had this HRE effect, years ago). The robe EX also has a new competitor in the form of the [Tianzhu Treasure Sword](http://uwodbmirror.ivyro.net/eg/main.php?id=88882167) from grande ganador, which has the same EX rank and sells for more than robe EX.

Indeed, the market for FAB is also limited, and it will be a challenge to sell off the ones made during this study. But this was done for Science, not money, of course.

The main bottleneck for obtaining mats for FAB turned out, unsurprisingly, to be the *knife losing its gold* needed for gold short sword. To secure the maximum number of knives, I let a character *without* the HRE effect make them, reasoning that fewer gold swords would be produced and thus more knives would remain. Or so I thought in my ignorance: as it turns out, the HRE effect should have no influence on a minor success such as this knife (though alchemy rank would), if the results from the robe experiments hold in general. Unfortunately I did not keep records of the outcomes, but I can state that far too many unwanted huge successes came out of it.

In the FAB experiments, any small differences between the "easy" recipes were likely due to random variation. An outlier was blue ore, which on average was almost as hard to succeed with as robe, but there is no reason to believe the blue ore recipe differs from the other coloured ores. My best estimate is that they all had a 25% chance of success, which would imply a 5% base chance if the HRE effect is the same for all items. An earlier hunch that knife was harder than the others appears to be wrong; it probably only seemed more difficult because its base material is much harder to obtain than for the others.

There is no reason to believe that the HRE effect (or any other effect) acts differently for different items. Instead, the lower success of Robe/Biretta compared to other items is likely to be due to different base probabilities, to which is added a common fixed increase in the probability of huge success from Adept, HRE and advisor. In other words, I make the assumption that the effects are additive to each other and to the base rate and are the same for all the MA items. An outline of a theory for alchemy experiments, summarising what I have discussed, is presented in Figure \@ref(fig:summarypicture).

```{r summarypicture, out.width='50%', fig.align='center', fig.cap="Proposed theory of the outcome of alchemy experiments", echo = FALSE}
knitr::include_graphics('theorydrawing.png')
```

The theory yields testable predictions: in particular, that the independent effect of the advisor position is +10%, and that the proportion of failures would be around 55% with only the advisor effect active. The court alchemist advisor position is rare, and I am not certain whether it can be obtained from an emperor who does not have the support alchemy policy; but if it can, it would be interesting to run more Robe experiments with only the advisor effect on. Easier to test is the prediction that processed lumber and the other non-Robe/Biretta recipes should yield 5% MA items in experiments conducted outside HRE/advisor without Adept, and 6% with the Adept title on. It would also be interesting to look at the "minor huge successes" from these experiments, such as non-FAB armour plates, and how they fit into the picture; the Robe and Biretta recipes do not have multiple huge success categories, so I have no data on this. Another unexplored issue is whether the ratio of minor success to failure/huge success is fixed at 2:3 for recipes other than Robe as well.
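
To make the theory concrete, the sketch below encodes it as a small function. All numbers are this report's point estimates (robe base rate ~0.1%, Adept +1%, HRE +20%, advisor +10%, minor-success share ~40%), so they are working assumptions rather than established constants; the function simply returns the predicted outcome distribution for the robe recipe under a given set of conditions.

```{r, eval=FALSE}
# Predicted outcome distribution for the robe recipe under the proposed
# additive theory. All effect sizes are this report's point estimates.
predict_robe <- function(adept = FALSE, hre = FALSE, advisor = FALSE,
                         base = 0.001, minor_share = 0.40) {
  huge <- base + 0.01 * adept + 0.20 * hre + 0.10 * advisor
  c(huge_success = huge,
    minor_success = minor_share,
    failure = 1 - minor_share - huge)
}
predict_robe(adept = TRUE)                # roughly the classic ~1% robe rate
predict_robe(hre = TRUE)                  # ~20% robes, as observed
predict_robe(hre = TRUE, advisor = TRUE)  # ~30% robes, as observed
predict_robe(advisor = TRUE)              # the untested prediction discussed above
```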

# Conclusions

Despite some shortcomings, the results presented here show several clear patterns. My conclusion (summarised in Figure \@ref(fig:summarypicture)) is that the chance of minor success is fixed at 35% or 40% and cannot be altered by any known factor, and that increasing the probability of *huge success* therefore is the same as lowering the probability of failure. The chance of huge success can be modified as follows:

-   Adept: +1%
-   HRE advisor effect: +10%
-   HRE general effect: +20%
-   Oxford: not applicable to alchemy experiments
-   EX: not applicable to alchemy experiments
-   Paymaster aide: not applicable to alchemy experiments
-   Skill rank: no effect on huge success or failure, but moderate effect *within* minor success

# Acknowledgments

I want to thank those who lent me their characters for the advancement of science. A big thank you to Marsali, who gave me some much needed fine dye and canned silk thread, and to Altavista, Guilder, Marsali and TardyMcGormless for discussions about this and that.

# Appendices {#appendices}

## How to make items for FAB {#FAB-ingredients}
<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName">

Show/Hide
</button>

::: {#BlockName .collapse}
```{r, out.width='75%', fig.align='center', fig.cap="Recipes and best practices for making FAB items", echo = FALSE}
knitr::include_graphics('FABingredients.jpg')
```
:::


## How to make Robe EX from Adept's Robe {#robe-EX}
<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName2">

Show/Hide
</button>

::: {#BlockName2 .collapse}
```{r, out.width='75%', fig.align='center', fig.cap="How to make Robe EX", echo = FALSE}
knitr::include_graphics('RobeEX.jpg')
```
:::

## FAB raw data {#fab-raw-data}

<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName3">

Show/Hide
</button>

::: {#BlockName3 .collapse}
```{r}
DT::datatable(FAB[,-1], options = list(pageLength = 10), caption = 'FAB experiments raw data. Values are number of attempts needed')
```
:::

## Robe raw data {#robe-raw-data}

<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName4">

Show/Hide
</button>

::: {#BlockName4 .collapse}
```{r FABappendix}
DT::datatable(Robe, options = list(pageLength = 10), caption = 'Robe experiments raw data')
```
:::

## Runs-tests {#runs-tests}

<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName5">

Show/Hide
</button>

::: {#BlockName5 .collapse}
```{r}
runsTableBatch
```

```{r}
runsTableCond
```
:::

## Auto-correlation {#auto-correlation}

<button class="btn btn-primary" data-toggle="collapse" data-target="#BlockName6">

Show/Hide
</button>

::: {#BlockName6 .collapse}
```{r ACFwhole, fig.height= 4, fig.width= 5, fig.show='hold', fig.cap= "Auto-correlation (upper panel) and partial auto-correlation (lower panel) function plot for the whole Robe sequence"}
ACFwhole
PACFwhole
```

```{r ACFbatch, fig.height= 16, fig.width= 8, fig.cap= "Auto-correlation function plots for Robe batches"}
BatchACF
```

```{r ACFcondition, fig.height= 8, fig.width= 8, fig.cap= "Auto-correlation function plots for Robe conditions"}
ConditionACF
```

```{r PACFbatch, fig.height= 16, fig.width= 8, fig.cap= "Partial auto-correlation function plots for Robe batches"}
BatchPACF
```

```{r PACFcondition, fig.height= 8, fig.width= 8, fig.cap= "Partial auto-correlation function plots for Robe conditions"}
ConditionPACF
```
:::

## R code for this report {#r-code}

```{r ref.label=knitr::all_labels(), echo=TRUE, eval=FALSE}
```

## Session information {#session-info}

```{r}
sessioninfo::session_info()
```

# Footnotes
