Forging success in UWO

Experiments with Fur Boots

Authors
Affiliation

Anima, London

Anima, London

PUBLISHED

December 23, 2024

LAST MODIFIED

January 2, 2025

Summary

We forged a large number of Fur Boots to test experimentally how five factors affect the rate and degree of success in forging: the Item Improvement Techniques Oxford skill, the Special Production EX gear, the Special Production Sailor Equipment, the skill rank of handicrafts, and whether the handicrafts skill was refined. Other factors known or suspected to influence crafting were held constant. In total we made 3,630 forging attempts and obtained 21 boots that reached 100 defence.

None of the five studied factors influenced the rate of failure or the rate of great success. Overall, 12.6% of the attempts failed, 38.6% gave normal success and 48.8% gave great success. Contrary to popular belief, we found no indication that failures were clustered.

As expected, the degree of success (how much the defence improved) was increased by having the Oxford skill active, and by getting great success instead of normal success. Surprisingly, none of the other skills affected the degree of success. Without Oxford, only 1.5% of base boots reached 100 defence while this increased to 5.5% with Oxford.

This study shows that Special Production EX/Sailor Equipment does not affect forging, and we know from a previous study that the EX is not involved in ordinary crafting either. Thus, the hunt continues to discover what “Special Production” is.

A comparison was made with some newer forging recipes. These yield a lower degree of success per forging attempt than Fur Boots do (rate of success was not evaluated): on average 4.4 improvement, compared to 5.2 for great success Fur Boots with Oxford on. This is because these recipes improve the secondary trait (e.g. defence on a weapon), and they show a similarly low degree of improvement as the old Horus’ Staff recipe.

This is part 3 in a series of investigations about crafting in UWO. Part 1 concerned alchemy experiments, and part 2 making weapons with casting:

Stinker (2023) Report from a FABtory https://rpubs.com/kryssogtvers/FAB

Stinker (2024) Crafting success in UWO — experiments with Gorz https://rpubs.com/kryssogtvers/Gorz

Introduction

In Uncharted Waters Online (UWO), character equipment can be improved from the base values by increasing its attack and/or defence in various ways.

FIGURE 1: Possible outcomes when forging boots

One way is to obtain a superior grade item, either from finding such an item with the Search skill in the field, or from gaining a great success production. The attack/defence value obtained from a great success can vary widely and was covered in a previous report (Stinker 2024).

Another way is to use forging tools, but these are limited to a) only the main attribute of the equipment (e.g. attack on a weapon, not defence) and b) only up to +30 (+25 from normal tools, and a further +5 from special tools).

Instead of using forging tools, one can also use transmutation alchemy to get +25 (but not on top of forging tools). This has the added benefit of being able to also increase the secondary attribute to +25 (e.g. defence on a weapon), provided the equipment has at least some to begin with.

Yet another way is available for certain equipment: recipes for forging.1 A limited selection of equipment has such recipes, either fixed at certain NPCs or in crafting books. One such book is the Craftsman’s Boot Enhancement, which has the recipes for forging Leather Boots, Riding Boots, and Fur Boots. Base Fur Boots have a defence value of 7, and this can be increased up to 100 by repeating the forging recipe enough times. The base boots can be bought cheaply from the item shop in Oslo.

1 This was usually the only way in the old days, when forging tools were rare or non-existent and transmutation alchemy had not yet been invented.

When forging boots, 3 distinct outcomes are possible: normal success, great success and failure. The boots are lost when a failure occurs (Fig. 1, Fig. 2).

FIGURE 2: The three different outcomes are also explicitly stated in the text messages, with slightly different wording than in Fig. 1. Note the additional statement about refined skill when the outcome is a great success.

Thus, to make 100 defence boots it is necessary to get a long enough unbroken chain of normal and/or great successes. The chance of this happening can be increased by a lower chance of failure and/or by a shorter chain being needed, i.e. more improvement per success.

It is therefore of interest to understand what influences both the chance of success and the degree of success. The degree of success in this context is a function of both how much improvement a normal and a great success gives, as well as the probability of a normal vs a great success.

Several items and skills are known, claimed, or suspected to influence the chance and degree of success in crafting. Some of these may apply to forging as well, others not, and some only to forging.

FIGURE 3: The Oxford Item Improvement Techniques skill

The Item Improvement Techniques skill from Oxford is widely believed to specifically help with forging. It doesn’t affect normal crafting of gear (Stinker 2024). The description in-game says:

Increases max improvement value when performing production to raise attack and defense powers.

It can also be expected that refining the relevant production skill will help with forging, because refining has a huge effect on the degree of great success in normal crafting of gear (Stinker 2024), and the in-game description of refining says:

… Also, a specific effect will be added when production is a great success.

FIGURE 4: Refined Handicrafts skill

Furthermore, after obtaining a great success when forging (or ordinary crafting) with refined skill a message comes in the chat that says:

You were able to produce higher-grade [Fur Boots] due to the effect of the Skill Refined!

It is also believed that the skill rank of the relevant production skill influences the degree of success (and some say also the rate of success), but there is no in-game message or description that says so.

Less clear is the purpose of EX gear with the Special Goods Production Rate Up equipment effect. What the description says suggests that the rate of getting “special goods” will increase:

Increases the chance of Producing special goods.

Effect increases depending on Rank.

FIGURE 5: The Golden Ark r6 special production EX accessory

However, it is unclear what is meant by “special goods”. It could refer to producing a superior grade item, i.e. a great success, so that the effect would simply mean an increased chance of great success. But EX gear does not increase the chance of great success in normal production (Stinker 2024), so this explanation has no support. It could be that it specifically refers to forging, either any success (as opposed to failure) or great success. Or it could be restricted to some other category of production, such as Florence recipes.

To make things clearer, or alternatively to confuse things even more, there is also Sailor Equipment with a Special Goods Production effect. Despite the very similar wording of the effect itself, the description is different and suggests that this is about the degree of success (and thus could be a different effect that will stack with the EX):

Special Goods made through Sewing, Casting or Handicrafts on the sea will be more likely to be high-performing.

After Special Goods Production, the Durability of the Sailor Equipment will decrease.

FIGURE 6: The Secrets of Crafts r5 special production Sailor Equipment (requires workshop skill)

Here we report on forging of Fur Boots done under specific conditions to experimentally test how these aforementioned skills and equipment may influence:

  • the rate of failure
  • the rate of normal success
  • the degree of improvement from normal success
  • the rate of great success
  • the degree of improvement from great success

We did this by conducting controlled experiments in which the presence or absence of each factor was manipulated systematically. Other factors known or believed to influence the rate and/or degree of success were held constant, such as the paymaster aide job and trait value, the Oxford production success skills, the Meister/apprentice relationship, and using a ship with the workshop ship skill when producing at sea.

Methods

We forged boots using the r11 handicrafts recipe in the Craftsman’s Boot Enhancement book (Fig. 7). To save on materials consumed we did this in the Artisan job with the Boston skill active.

FIGURE 7: The recipe for forging Fur Boots

We continued forging the same pair of boots (without closing the production window) until they were either lost due to failure or reached 100 defence. Then we closed the window after double-checking the recorded values (the results from previous attempts are still visible on the right-hand side), and started over again with new boots. For each forging attempt we recorded the outcome (normal success, great success or failure) as well as the defence and durability values. The improvement gained in each attempt was calculated from these values at a later stage.
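The step of deriving per-attempt improvement from the recorded values can be sketched as follows. This is a minimal illustration, not the R code from the Appendix; the function name and the example log are made up, and only the base defence of 7 comes from the text.

```python
def improvements(recorded_defence, base_defence=7):
    """Return the defence gained in each successful attempt.

    recorded_defence: defence values noted after each successful attempt,
    in order, for one pair of boots (hypothetical input format).
    base_defence: defence of unforged Fur Boots (7 according to the text).
    """
    gains = []
    previous = base_defence
    for value in recorded_defence:
        gains.append(value - previous)  # gain is the difference to the previous value
        previous = value
    return gains


example_log = [11, 14, 20, 25]  # made-up values, for illustration only
print(improvements(example_log))
```

The same differencing would apply to the durability values.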

Forging was done under the conditions specified in Tab. 1 in December 2024. We did 250 forging attempts for each treatment group, and then continued until the current pair of boots failed or reached 100 defence. A few boots were not completed in this way; the data for these were excluded from some analyses, and additional data were collected separately to reach at least 250 forging attempts for each treatment group.

We tried to keep constant other factors not under study. Thus all forging was done with Meister title on and apprentice nearby (except when done at sea, see below), aide in paymaster job with 120SS traits, and 4 Oxford production success skills on.

The forging was done either at skill rank 21 (maximum) or at r11 (required) so as to maximise any rank effects. We used mentoring of a low-level character in order to be only r11 despite having the refined skill (+2) and being in a favoured job (base r15 is needed to use the Meister title). Mentoring gives a penalty to the rank (just as +gear gives a boost) depending on the level difference between mentor and mentee, and we could thus adjust the effective rank to r11 (Fig. 8).

FIGURE 8: Mentoring was used to lower the handicraft rank to r11 (required for forging Fur Boots), and yet retain the handicraft Meister title.

Refinement is less easy to nullify, so we used another character with unrefined handicrafts that was otherwise identical (Meister title, paymaster 120SS aide, 4 Oxford production success skills active), but of lower level.

For special production sailor equipment we used the Secrets of Crafts r5 (Fig. 6) in a ship with the workshop skill, and for special production EX we used the Golden Ark r6 (Fig. 5). The Oxford skill was either turned on or off.

TABLE 1: Treatment groups
                  Factors
Treatment  Where  Sailor Equip  Refined  Handicrafts  Oxford  EX  N Attempts  N Boots¹
A          land   -             yes      r21          yes     r6  255         43
B          land   -             yes      r21          yes     no  258         35
C          land   -             yes      r21          no      r6  268         37
D          land   -             yes      r21          no      no  259         34
E          land   -             yes      r11          yes     r6  267         39
F          land   -             yes      r11          yes     no  266         37
G          land   -             yes      r11          no      r6  251         29
H          land   -             yes      r11          no      no  254         34
I          sea    r5            yes      r21          yes     r6  261         28
J          sea    r5            yes      r21          yes     no  253         32
K          sea    no            yes      r21          yes     r6  256         35
L          sea    no            yes      r21          yes     no  264         31
M          land   -             no       r21          yes     r6  265         28
N          land   -             no       r21          yes     no  253         39
Total                                                             3630        481
¹ Including incomplete boots (one each in treatments A to D)

We did not explicitly compare rates of failure or great success between producing on land and at sea, but note that this could be done to compare the effect of workshop (only at sea) vs. the effect of apprentice (only on land), which we expect to be very small. Instead we were interested in measuring the effect of the sailor equipment, thus all production at sea was done with workshop, either with or without sailor equipment.

Durability can also be increased by forging, but it is of less importance than defence (especially since it is only a temporary improvement). It is nevertheless a by-product of the forging process and we therefore also briefly present data on durability improvement.

For some analyses we excluded improvement gain values that could have been capped, that is, when the boots were already higher than 100 defence minus the maximum improvement gain, i.e. higher than 92 for a great success or higher than 94 for a normal success. The same was done for durability (values higher than 45 excluded). This was done so as not to bias the distribution of values.
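The exclusion rule can be written as a small filter. This is a sketch under the stated thresholds only; the function names are illustrative, and the maximum gains (6 for normal success, 8 for great success, 4 for durability) and caps (100 defence, 49 durability) are the values reported elsewhere in this report.

```python
DEFENCE_CAP = 100
DURABILITY_CAP = 49
MAX_DEFENCE_GAIN = {"normal": 6, "great": 8}  # largest gains observed per outcome
MAX_DURABILITY_GAIN = 4                       # largest durability gain observed


def defence_may_be_capped(defence_before, outcome):
    """True if the cap could have truncated the gain of this attempt."""
    return defence_before > DEFENCE_CAP - MAX_DEFENCE_GAIN[outcome]


def durability_may_be_capped(durability_before):
    """True if the durability cap could have truncated the gain."""
    return durability_before > DURABILITY_CAP - MAX_DURABILITY_GAIN


print(defence_may_be_capped(93, "great"))   # above 92, so excluded
print(defence_may_be_capped(94, "normal"))  # not above 94, so kept
```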

In addition to graphical analyses of the data, we performed some statistical analyses. This was done with general(ized) linear modelling, either as Analysis of variance (for degree of improvement) or logistic regression (for rate of success, dichotomized either as failure/non-failure or great-success/non-great-success). We used type III sum of squares, and effects coding for contrasts (not dummy coding). That is, coefficients are not comparing to a reference group but rather to the mean of the groups.
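To illustrate the difference between the two coding schemes, the sketch below builds both contrast matrices for a factor with a few levels. It is a generic illustration of sum-to-zero (effects) coding versus reference (dummy) coding, not the code used for the analyses; the function names are hypothetical.

```python
def effects_coding(levels):
    """Sum-to-zero coding: each coefficient compares a level to the mean of all levels.

    The last level gets -1 in every column, so it has no coefficient of its own.
    """
    k = len(levels)
    rows = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            rows[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            rows[level] = [-1] * (k - 1)
    return rows


def dummy_coding(levels):
    """Reference coding: each coefficient compares a level to the first level."""
    k = len(levels)
    return {
        level: [1 if j == i - 1 else 0 for j in range(k - 1)]
        for i, level in enumerate(levels)
    }


for name, row in effects_coding(["A", "B", "C"]).items():
    print(name, row)
```

Under effects coding the last level is implied by the others (its effect is minus their sum), which would be consistent with the last treatment lacking its own odds ratio in the logistic regression tables.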

This report was written using Quarto in RStudio. R code used to generate the results can be found in the Appendix together with software versions used. The raw data can be downloaded from the Appendix.

Results

Rate of normal success, great success and failure

FIGURE 9: Proportion failure, normal success and great success in each treatment group, with confidence intervals (binomial, for great success and for non-failure separately). Dotted lines show the overall proportions irrespective of treatment.
FIGURE 10: Proportion failure, normal success and great success for each treatment. Treatments in a different order than in Fig. 9.

Overall, we did 3630 forging attempts on 481 boots, and obtained 1400 normal success (38.6 %), 1772 great success (48.8 %) and 458 failures (12.6 %).

Fig. 9 gives an overview of the proportion of the three outcomes under the different experimental conditions. This is also shown in Fig. 10, but with an attempt to make the different factors more explicit.

In the figures, we show the rate of failure and the rate of great success, both expressed as binomial outcomes (i.e. treating normal success either as “non-failure” or as “non-great success”) with 95% binomial confidence intervals.
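As an illustration of such an interval, the sketch below computes a 95% interval for the overall failure rate (458 failures in 3630 attempts). The report does not state which binomial interval was used for the figures; the Wilson score interval here is an assumption and just one common choice.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


low, high = wilson_interval(458, 3630)  # overall failures / attempts
print(f"failure rate {458 / 3630:.3f}, 95% interval {low:.3f} to {high:.3f}")
```

With roughly 250 attempts per treatment group the intervals are considerably wider than for the pooled data, which is worth keeping in mind when comparing groups.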

Statistical modelling of outcomes: logistic regression

Failure Rate

A crude analysis simply comparing the treatment groups, without considering what factors were manipulated, is presented in Tab. 2. In this table each treatment level is compared to the average failure rate. Treatments I and M had a somewhat lower failure rate and treatment A a higher one than the others, but not significantly so (see also Fig. 9). Since multiple comparisons were made and the omnibus (overall) p-value of the model was not significant, there is no evidence of any systematic difference in failure rate between the groups.

More importantly, variation in failure rate cannot be attributed to any of the factors studied (Tab. 3). We found no effect on the failure rate of the experimentally altered conditions of special production EX, sailor equipment, Oxford item skill, refined skill or skill rank.

TABLE 2: Logistic regression comparing failure among all groups.
Simple Logistic regression
Variable      Failure Rate        OR¹    95% CI¹     p-value
treatment²                                           0.6
    A         40 / 255 (16%)      1.30   0.93–1.79   0.12
    B         32 / 258 (12%)      0.99   0.68–1.39   >0.9
    C         35 / 268 (13%)      1.05   0.73–1.46   0.8
    D         33 / 259 (13%)      1.02   0.71–1.43   >0.9
    E         38 / 267 (14%)      1.16   0.82–1.60   0.4
    F         34 / 266 (13%)      1.02   0.71–1.43   0.9
    G         28 / 251 (11%)      0.88   0.59–1.26   0.5
    H         34 / 254 (13%)      1.08   0.75–1.51   0.7
    I         24 / 261 (9.2%)     0.71   0.46–1.04   0.091
    J         34 / 253 (13%)      1.09   0.75–1.52   0.6
    K         34 / 256 (13%)      1.07   0.74–1.50   0.7
    L         30 / 264 (11%)      0.90   0.61–1.27   0.6
    M         25 / 265 (9.4%)     0.73   0.48–1.06   0.11
    N         37 / 253 (15%)
¹ OR = Odds Ratio, CI = Confidence Interval
² Contrasts: effects coding (i.e. comparing to overall mean)
TABLE 3: Logistic regression models of the effect of various factors on the Rate of Failure.

Factorial model without Refine
Variable                                 Failure Rate          OR¹    95% CI¹     p-value
oxford_item                                                                       0.6
    no                                   130 / 1,032 (13%)     0.97   0.86–1.09
    yes                                  206 / 1,564 (13%)
crafting_rank                                                                     >0.9
    11                                   134 / 1,038 (13%)     1.00   0.88–1.12
    21                                   202 / 1,558 (13%)
Ex_rank                                                                           0.8
    0                                    170 / 1,290 (13%)     1.02   0.90–1.15
    6                                    166 / 1,306 (13%)
oxford_item * crafting_rank              336 / 2,596 (13%)                        0.7
    no * 11                              196 / 1,536 (13%)     0.97   0.86–1.10
oxford_item * Ex_rank                    336 / 2,596 (13%)                        0.7
    no * 0                               170 / 1,300 (13%)     1.03   0.91–1.16
crafting_rank * Ex_rank                  336 / 2,596 (13%)                        >0.9
    11 * 0                               168 / 1,308 (13%)     1.00   0.89–1.13
oxford_item * crafting_rank * Ex_rank    336 / 2,596 (13%)                        0.4
    no * 11 * 0                          176 / 1,300 (14%)     1.06   0.94–1.19
¹ OR = Odds Ratio, CI = Confidence Interval

Additive model with Refine
Variable                                 Failure Rate          OR¹    95% CI¹     p-value
oxford_item                                                                       0.4
    no                                   130 / 1,032 (13%)     0.95   0.84–1.08
    yes                                  206 / 1,564 (13%)
crafting_rank                                                                     0.7
    11                                   134 / 1,038 (13%)     0.98   0.86–1.11
    21                                   202 / 1,558 (13%)
Ex_rank                                                                           0.7
    0                                    170 / 1,290 (13%)     1.02   0.91–1.14
    6                                    166 / 1,306 (13%)
refined                                                                           0.3
    no                                   62 / 518 (12%)        0.91   0.77–1.08
    yes                                  274 / 2,078 (13%)
¹ OR = Odds Ratio, CI = Confidence Interval

Factorial model with Sailor Equipment
Variable                                 Failure Rate          OR¹    95% CI¹     p-value
sailor_equip                                                                      0.6
    no                                   64 / 520 (12%)        1.06   0.87–1.28
    yes                                  58 / 514 (11%)
Ex_rank                                                                           0.5
    0                                    64 / 517 (12%)        1.06   0.88–1.29
    6                                    58 / 517 (11%)
sailor_equip * Ex_rank                   122 / 1,034 (12%)                        0.12
    no * 0                               54 / 525 (10%)        0.86   0.71–1.04
¹ OR = Odds Ratio, CI = Confidence Interval

Additive model with Sailor Equipment
Variable                                 Failure Rate          OR¹    95% CI¹     p-value
sailor_equip                                                                      0.6
    no                                   64 / 520 (12%)        1.05   0.87–1.27
    yes                                  58 / 514 (11%)
Ex_rank                                                                           0.6
    0                                    64 / 517 (12%)        1.06   0.87–1.28
    6                                    58 / 517 (11%)
¹ OR = Odds Ratio, CI = Confidence Interval

Rate of Great Success

Looking at the rate of great success, a few treatment groups differ somewhat from the others (Tab. 4), but no effects are apparent for forging done on land (Tab. 5). At sea, EX at first glance seems to play a role (p = 0.084), but on closer inspection the rate of great success with EX is, if anything, lower than without EX. Overall, there is no evidence for any effects of any of these factors on the rate of great success.

TABLE 4: Logistic regression comparing great success among all groups.
Simple Logistic regression
Variable      Great Success Rate   OR¹    95% CI¹     p-value
treatment²                                            0.11
    A         117 / 255 (46%)      0.89   0.70–1.13   0.3
    B         131 / 258 (51%)      1.08   0.85–1.37   0.5
    C         125 / 268 (47%)      0.92   0.73–1.15   0.5
    D         130 / 259 (50%)      1.06   0.83–1.34   0.6
    E         113 / 267 (42%)      0.77   0.61–0.97   0.028
    F         121 / 266 (45%)      0.87   0.69–1.10   0.3
    G         131 / 251 (52%)      1.14   0.90–1.45   0.3
    H         116 / 254 (46%)      0.88   0.69–1.12   0.3
    I         126 / 261 (48%)      0.98   0.77–1.24   0.9
    J         133 / 253 (53%)      1.16   0.92–1.47   0.2
    K         130 / 256 (51%)      1.08   0.85–1.37   0.5
    L         151 / 264 (57%)      1.40   1.11–1.77   0.005
    M         130 / 265 (49%)      1.01   0.80–1.27   >0.9
    N         118 / 253 (47%)
¹ OR = Odds Ratio, CI = Confidence Interval
² Contrasts: effects coding (i.e. comparing to overall mean)
TABLE 5: Logistic regression models of the effect of various factors on the Rate of Great Success.

Factorial model w/o Refine
Variable                                 Great Success Rate    OR¹    95% CI¹     p-value
oxford_item                                                                       0.2
    no                                   502 / 1,032 (49%)     1.06   0.97–1.14
    yes                                  730 / 1,564 (47%)
crafting_rank                                                                     0.4
    11                                   481 / 1,038 (46%)     0.96   0.89–1.04
    21                                   751 / 1,558 (48%)
Ex_rank                                                                           0.9
    0                                    616 / 1,290 (48%)     1.01   0.93–1.09
    6                                    616 / 1,306 (47%)
oxford_item * crafting_rank              1,232 / 2,596 (47%)                      0.2
    no * 11                              743 / 1,536 (48%)     1.05   0.97–1.14
oxford_item * Ex_rank                    1,232 / 2,596 (47%)                      0.4
    no * 0                               606 / 1,300 (47%)     0.96   0.89–1.04
crafting_rank * Ex_rank                  1,232 / 2,596 (47%)                      0.3
    11 * 0                               609 / 1,308 (47%)     0.96   0.89–1.04
oxford_item * crafting_rank * Ex_rank    1,232 / 2,596 (47%)                      0.14
    no * 11 * 0                          603 / 1,300 (46%)     0.94   0.87–1.02
¹ OR = Odds Ratio, CI = Confidence Interval

Additive model with Refine
Variable                                 Great Success Rate    OR¹    95% CI¹     p-value
oxford_item                                                                       0.2
    no                                   502 / 1,032 (49%)     1.05   0.97–1.15
    yes                                  730 / 1,564 (47%)
crafting_rank                                                                     0.4
    11                                   481 / 1,038 (46%)     0.96   0.88–1.05
    21                                   751 / 1,558 (48%)
Ex_rank                                                                           0.8
    0                                    616 / 1,290 (48%)     1.01   0.94–1.09
    6                                    616 / 1,306 (47%)
refined                                                                           0.8
    no                                   248 / 518 (48%)       1.02   0.91–1.14
    yes                                  984 / 2,078 (47%)
¹ OR = Odds Ratio, CI = Confidence Interval

Factorial model with Sailor Equipment
Variable                                 Great Success Rate    OR¹    95% CI¹     p-value
sailor_equip                                                                      0.2
    no                                   281 / 520 (54%)       1.07   0.95–1.21
    yes                                  259 / 514 (50%)
Ex_rank                                                                           0.084
    0                                    284 / 517 (55%)       1.11   0.99–1.26
    6                                    256 / 517 (50%)
sailor_equip * Ex_rank                   540 / 1,034 (52%)                        0.7
    no * 0                               277 / 525 (53%)       1.02   0.90–1.15
¹ OR = Odds Ratio, CI = Confidence Interval

Additive model with Sailor Equipment
Variable                                 Great Success Rate    OR¹    95% CI¹     p-value
sailor_equip                                                                      0.3
    no                                   281 / 520 (54%)       1.07   0.95–1.21
    yes                                  259 / 514 (50%)
Ex_rank                                                                           0.084
    0                                    284 / 517 (55%)       1.11   0.99–1.26
    6                                    256 / 517 (50%)
¹ OR = Odds Ratio, CI = Confidence Interval

Degree of defence improvement

FIGURE 11: Box-plots of defence value in each treatment group, overlaid by the individual data points, scaled by number of observations. Treatments without oxford item tech skill active are highlighted. See Tab. 1 for the conditions in each treatment.
FIGURE 12: The distribution of defence value in each treatment group (histograms). Treatments without oxford item tech skill active are highlighted. See Tab. 1 for the conditions in each treatment.

In total we obtained 3,143 measures of defence improvement2, with a mean of 4.38. The degree of improvement per forging attempt was very similar in all treatments, except that it was lower in those without the Oxford item skill active. Fig. 11 shows this as box-plots, while Fig. 12 displays the same data as histograms. Both figures include normal as well as great success.

2 Excluded are failures and values that could have been capped because close to the maximum defence

With Oxford item active, the improvement varied between 3 and 8, whereas without Oxford the range was 2–6. The mean improvement was 4.74 (n = 2243) and 3.50 (n = 900), respectively.

If we compare the improvement values from normal and great success (Fig. 13) we see that both have a minimum value of 3 with Oxford, but only 2 without. The maximum value of a normal success increases from 4 without Oxford to 6 with Oxford. Likewise, the maximum great success increases from 6 to 8.

Apart from the large effect of Oxford skill (and normal/great success), the treatment groups were remarkably similar in degree of improvement.

If we pool the data across treatments, the average normal success gives 2.99 defence without Oxford and 4.16 with, and for great success 3.91 and 5.19, respectively (Tab. 6).

In fact, a great success without Oxford is on average worse than a normal success with Oxford.
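The pooled means can be recomputed directly from the frequency counts in Tab. 6. The sketch below does this for the valid (uncapped) observations; the dictionary layout and names are illustrative, only the counts are taken from the table.

```python
# Counts of each defence gain from Tab. 6 (valid observations only).
counts = {
    ("no", "normal"): {2: 138, 3: 128, 4: 134},
    ("no", "great"): {2: 67, 3: 117, 4: 163, 5: 99, 6: 54},
    ("yes", "normal"): {3: 405, 4: 206, 5: 195, 6: 186},
    ("yes", "great"): {3: 158, 4: 271, 5: 341, 6: 226, 7: 163, 8: 92},
}


def mean_gain(freq):
    """Weighted mean of a {gain: count} frequency table."""
    n = sum(freq.values())
    return sum(gain * count for gain, count in freq.items()) / n


means = {key: round(mean_gain(freq), 2) for key, freq in counts.items()}
for (oxford, outcome), value in means.items():
    print(f"Oxford {oxford:3s} {outcome:6s} mean gain {value}")
```

The comparison made in the text is then simply `means[("no", "great")] < means[("yes", "normal")]`.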

FIGURE 13: Box-plots of defence value for normal and great success (gs) in each treatment group, overlaid by the individual data points (scaled by number of observations). See Tab. 1 for the conditions in each treatment.


TABLE 6: Degree of improvement of defence from normal and great success (gs), without and with Item Improvement Techniques (Oxford) active.
               Defence increase
               2     3     4     5     6     7     8     Mean   SD     n      Valid n
Oxford: no
  failure                                                              130
  normal       138   128   134   0     0     0     0     2.99   0.83   400    400
  gs           67    117   163   99    54    0     0     3.91   1.18   502    500
  Total                                                  3.50
  Sum          205   245   297   99    54    0     0                   1032   900
Oxford: yes
  failure                                                              328
  normal       0     405   206   195   186   0     0     4.16   1.15   1000   992
  gs           0     158   271   341   226   163   92    5.19   1.43   1270   1251
  Total                                                  4.74
  Sum          0     563   477   536   412   163   92                  2598   2243
Grand Total                                              4.38
Grand Sum      205   808   774   635   466   163   92                  3630   3143

Statistical modelling of degree of improvement

A two-way Analysis of Variance (Tab. 7) confirms what graphical inspection of the data indicated: there is a large effect of both outcome (normal/great success) and Oxford on the degree of improvement. Further, there is no interaction between outcome and Oxford, meaning that the effect of Oxford is the same for each outcome, and vice versa.

TABLE 7: Anova Table (Type III tests) of the effect of Oxford item and outcome (normal/great success) on the degree of defence improvement, ignoring other factors.
Effect                 df        F        ges¹    p.value
oxford_item            1, 3139   618.15   .149    <.001
outcome                1, 3139   390.86   .111    <.001
oxford_item:outcome    1, 3139   1.18     <.001   .277
¹ generalized eta-squared (effect size)

However, this model ignores the other factors studied. A more complete analysis is presented below, where we look simultaneously at all the factors studied. First we focus on forging done on land.

Forging on land

Since we had only two treatments without refined skill (and were thus unable to estimate many coefficients of a full model), we first did a full factorial model ignoring refinement (Tab. 8 a). This revealed (again) strong effects of Oxford item and normal/great success, and no effect of EX or crafting rank on the degree of improvement. None of the first-order or higher-order interactions were significant.

Then we did a reduced model of all the factors including refinement, but without any interactions. This showed that refinement also had no effect (Tab. 8 b).

However, previous analyses of ordinary crafting (Stinker 2024) found strong effects of refinement on the degree of great success. To test whether refinement has an effect only for great success, we made a separate full model with outcome and refinement (Tab. 8 c)3, but found no such interaction (and again no main effect apart from normal/great success).

3 In this model we could also include the effect of EX, but were restricted to treatments with r21 handicrafts and with the Oxford skill. Thus, this analysis consists of treatments A, B, M, and N (see Tab. 1 for treatment groups).

TABLE 8: Anova Tables (Type III tests) of the effect of various factors on the degree of defence improvement.

(a) Full factorial model w/o Refine
Effect                                        df        F        ges¹    p.value
oxford_item                                   1, 2223   534.67   .173    <.001
outcome                                       1, 2223   320.28   .126    <.001
Ex_rank                                       1, 2223   0.55     <.001   .459
crafting_rank                                 1, 2223   0.08     <.001   .777
oxford_item:outcome                           1, 2223   0.38     <.001   .535
oxford_item:Ex_rank                           1, 2223   2.78     .001    .096
outcome:Ex_rank                               1, 2223   2.75     .001    .097
oxford_item:crafting_rank                     1, 2223   0.22     <.001   .641
outcome:crafting_rank                         1, 2223   0.01     <.001   .920
Ex_rank:crafting_rank                         1, 2223   1.51     <.001   .220
oxford_item:outcome:Ex_rank                   1, 2223   0.00     <.001   .965
oxford_item:outcome:crafting_rank             1, 2223   0.48     <.001   .489
oxford_item:Ex_rank:crafting_rank             1, 2223   0.41     <.001   .522
outcome:Ex_rank:crafting_rank                 1, 2223   1.94     <.001   .164
oxford_item:outcome:Ex_rank:crafting_rank     1, 2223   0.83     <.001   .361
¹ generalized eta-squared (effect size)

(b) Additive model with Refine
Effect                                        df        F        ges¹    p.value
oxford_item                                   1, 2233   480.75   .156    <.001
outcome                                       1, 2233   360.11   .139    <.001
Ex_rank                                       1, 2233   0.75     <.001   .387
crafting_rank                                 1, 2233   0.17     <.001   .676
refined                                       1, 2233   0.01     <.001   .909
¹ generalized eta-squared (effect size)

(c) Full factorial model with Refine
Effect                                        df        F        ges¹    p.value
refined                                       1, 875    0.08     <.001   .782
outcome                                       1, 875    136.43   .135    <.001
Ex_rank                                       1, 875    0.11     <.001   .740
refined:outcome                               1, 875    0.00     <.001   .948
refined:Ex_rank                               1, 875    0.41     <.001   .524
outcome:Ex_rank                               1, 875    0.45     <.001   .504
refined:outcome:Ex_rank                       1, 875    0.07     <.001   .787
¹ generalized eta-squared (effect size)

(d) Full factorial model with Sailor Equipment
Effect                                        df        F        ges¹    p.value
sailor_equip                                  1, 896    0.08     <.001   .780
outcome                                       1, 896    139.14   .134    <.001
Ex_rank                                       1, 896    0.07     <.001   .798
sailor_equip:outcome                          1, 896    1.06     .001    .305
sailor_equip:Ex_rank                          1, 896    0.15     <.001   .697
outcome:Ex_rank                               1, 896    0.09     <.001   .761
sailor_equip:outcome:Ex_rank                  1, 896    0.17     <.001   .684
¹ generalized eta-squared (effect size)

Forging at sea

A separate analysis was done for the forging done at sea (Tab. 8 d). We found no effect of sailor equipment or EX (other factors were held constant at refined r21 with Oxford active; treatments I, J, K and L).

Durability improvement and relationship to defence improvement

Unlike defence, the durability improvement was not affected by Oxford (Fig. 14). Similar to defence — and in contrast to ordinary crafting — there was no effect of refined skill.

(a) Histograms
(b) Boxplots
FIGURE 14: The distribution of durability improvement values. These are all from great success (normal success never increased durability). See Tab. 1 for the conditions in each treatment.

The only thing that had an effect on durability was the success outcome: normal success never improved durability whereas great success invariably did so (up to the cap at 49 dura).

Durability improvement per forging attempt ranged from 1 to 4 (when great success), on average 2.2.

Some treatment groups did not see any durability increase of 4, but this value was rare and unrelated to the factors under study.

There was no correlation between durability improvement and defence improvement4 (r = -0.009, df = 1306, p = 0.75, Fig. 15), meaning that the degree of improvement was set independently for defence and durability.

4 Observations where either value could have been capped were excluded.
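The reported p-value can be roughly cross-checked from r and the degrees of freedom alone. The sketch below uses the usual t statistic for a Pearson correlation and, as a simplifying assumption, a normal approximation to the t distribution (reasonable with more than a thousand degrees of freedom). Since r is only given to three decimals, the result is approximate.

```python
import math


def correlation_p_value(r, df):
    """Two-sided p-value for a Pearson correlation, normal approximation.

    t = r * sqrt(df / (1 - r^2)); for large df, t is close to standard normal.
    """
    t = r * math.sqrt(df / (1 - r**2))
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided tail probability
    return t, p


t, p = correlation_p_value(-0.009, 1306)
print(f"t = {t:.3f}, p = {p:.2f}")
```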

FIGURE 15: Scatterplot showing the lack of relationship between durability improvement and defence improvement.

Putting it all together: 100 defence boots

Ultimately, what matters when forging boots is to get to 100 defence (Fig. 16).

FIGURE 16: Base and 100 defence Fur Boots.

Fig. 17 shows the distribution of the defence value at which the boots failed (or reached 100 defence). This measure of final defence is arguably the most relevant, as it is a function of both the rate of failure and great success, as well as the degree of improvement from normal and great success.5

5 Only boots that were forged until breaking or reaching 100 defence were included.
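How the rates and the degree of improvement combine into the chance of reaching 100 defence can be sketched with a small dynamic programme. This is an illustrative model, not part of the original analysis: it assumes that every attempt is independent, that the overall outcome rates (38.6% normal, 48.8% great success) apply at every defence level, and that gains follow the pooled frequencies in Tab. 6. The function and variable names are hypothetical.

```python
P_NORMAL, P_GREAT = 0.386, 0.488  # overall rates; the remainder is failure

# Gain frequencies from Tab. 6 (valid observations), without and with Oxford.
GAIN_COUNTS = {
    "no": {
        "normal": {2: 138, 3: 128, 4: 134},
        "great": {2: 67, 3: 117, 4: 163, 5: 99, 6: 54},
    },
    "yes": {
        "normal": {3: 405, 4: 206, 5: 195, 6: 186},
        "great": {3: 158, 4: 271, 5: 341, 6: 226, 7: 163, 8: 92},
    },
}


def normalise(freq):
    """Turn {gain: count} into {gain: probability}."""
    total = sum(freq.values())
    return {gain: count / total for gain, count in freq.items()}


def chance_of_max(gain_counts, start=7, target=100):
    """Probability of forging from `start` to `target` defence without a failure."""
    normal = normalise(gain_counts["normal"])
    great = normalise(gain_counts["great"])
    reach = [0.0] * (target + 1)
    reach[target] = 1.0
    # Work backwards: the chance from d depends only on chances from higher values.
    for d in range(target - 1, start - 1, -1):
        total = 0.0
        for p_outcome, dist in ((P_NORMAL, normal), (P_GREAT, great)):
            for gain, q in dist.items():
                total += p_outcome * q * reach[min(d + gain, target)]
        reach[d] = total
    return reach[start]


for oxford in ("no", "yes"):
    print(f"Oxford {oxford}: {chance_of_max(GAIN_COUNTS[oxford]):.3f}")
```

Such a model can only be expected to match the observed proportions roughly, given the small number of boots that actually reached 100 defence.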

FIGURE 17: Histogram of the distribution of defence value when boots failed (or reached 100 defence). A value of 7 represents cases where the boots failed at the first attempt.

In Fig. 18 we show how this final defence measure varies among the treatment groups.

There was some, but not much, evidence that final defence varied between the treatment groups (one-way Anova, F(13, 463) = 1.70, η² = 0.045, p = 0.06).

Since we found above that Oxford item tech was the only factor that had any effect on degree of success (and no factor had any effect on rate of failure or rate of great success), we lumped the treatments into Oxford vs. no Oxford (Fig. 19).

There was a highly significant difference between the two groups (one-way Anova, F(1, 475) = 7.73, η² = 0.016, p = 0.006), but the effect is small: most of the variation is random.
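For a one-way Anova the effect size follows directly from the F statistic and its degrees of freedom, which makes the reported values easy to cross-check. The sketch below uses the standard identity η² = F·df1 / (F·df1 + df2); the function name is illustrative, and small deviations are expected because the reported F values are rounded.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta-squared of a one-way Anova from F and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


print(round(eta_squared_from_f(1.70, 13, 463), 3))  # all 14 treatment groups
print(round(eta_squared_from_f(7.73, 1, 475), 3))   # Oxford vs. no Oxford
```

In both cases only a few percent of the variation in final defence is explained by group membership.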

FIGURE 18: The fate of each pair of boots forged in each treatment: final defence before breaking (or reaching 100 defence). Included are also the ones that failed at first attempt and never got above the base of 7 defence, as well as those that reached 100 defence (highlighted in green).
FIGURE 19: The fate of each pair of boots forged in treatments with and without Oxford item improvement tech active: final defence before breaking (or reaching 100 defence). Included are also the ones that failed at first attempt and never got above the base of 7 defence, as well as those that reached 100 defence (highlighted in green).

Nevertheless, 5.5 % of boots reached 100 defence with the Oxford skill active, but only 1.5 % without it (Tab. 9).

TABLE 9: Cross tabulation of the end result of forging boots, by using Oxford skill or not
                 End result
Variable         100 defence   destroyed      Total           p-value¹
oxford_item                                                   0.057
    no           2 (1.5%)      130 (98.5%)    132 (100.0%)
    yes          19 (5.5%)     326 (94.5%)    345 (100.0%)
Total            21 (4.4%)     456 (95.6%)    477 (100.0%)
¹ Pearson’s Chi-squared test
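The test in Tab. 9 can be reproduced from the four cell counts. The sketch below computes Pearson's chi-squared statistic without continuity correction (an assumption, chosen because it matches the reported p-value) and obtains the p-value for one degree of freedom from the complementary error function.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test for a 2x2 table, no continuity correction.

    table: ((a, b), (c, d)) with rows = groups, columns = outcomes.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: Oxford no / yes; columns: reached 100 defence / destroyed (Tab. 9).
statistic, p_value = chi_squared_2x2(((2, 130), (19, 326)))
print(f"chi-squared = {statistic:.2f}, p = {p_value:.3f}")
```

Note that the expected count of 100 defence boots without Oxford is below 6, so the chi-squared approximation is somewhat rough for this table.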

Comparison with other forging recipes

Boots are not the only equipment that can be forged from recipes. Other old recipes include those from plundered books for various weapons, and recipes for weapons fixed at NPCs in East Asia.

Certain other equipment from newer recipes is even more sought after for max-forging than Fur Boots, and it is of interest to see whether the conclusions reached from forging boots apply to these as well. These recipes are for forging the “other” parameter, i.e. defence on weapons and attack on clothes and helmets.

We present the results from forging some of these in Tab. 10. The items were:

FIGURE 20: The Sagaris
FIGURE 21: Apotolos Amina EX
FIGURE 22: Black Landsknecht
FIGURE 23: Sea Dragon’s Circlet

The obtained values were all from great success (Master’s secret invariably used), with Oxford item and r6 EX used in all cases, at r20 refined skill (except alchemy, which is not refinable).

Since the materials required are much harder to obtain than for old recipes such as forging boots, we do not have extensive data. However, there are some striking differences from forging Fur Boots.

FIGURE 24: Horus’ Staff. Base and forged to max defence. Base staffs can have improved attack and durability from great success when initially crafted (failures result in loss also for the base recipe), but never have any defence.
FIGURE 25: The Horus’ Staff forging recipe.
TABLE 10: Improvement values of some other forged gear. These were all items where the secondary attribute is forgeable from a recipe, i.e. defence on a weapon, or attack on clothes or a helmet. No failures occurred because Master’s Secrets were used in each forging attempt, and thus each improvement value is from great success (these recipes use very valuable materials). Note that the minimum improvement value was 2 and the maximum was 7, unlike for the boots.
Sagaris weapon
Sagaris
defence increase
0   0 
1   5 
2   9 
3  13 
4  18 
5  23 
6  28 
7  32 
8  35 
9  37 
10  41 
11  43 
12  46 
13  49 
14  54 
15  58 
16  64 
17  71 
18  76 
19  80 
20  85 
21  91 
22  97 
23 100 
Mean1 4.41
Min 2
Max 7
1 Excluding the last, potentially capped, attempt
Apotolos Amina EX weapon
Apotolos
defence increase
0  30 
1  36 
2  41 
3  46 
4  52 
5  57 
6  59 
7  62 
8  67 
9  73 
10  79 
11  85 
12  91 
13  97 
14 100 
Mean1 5.15
Min 2
Max 6
1 Excluding the last, potentially capped, attempt
Another Apotolos Amina EX
Apotolos
defence increase
0  30 
1  34 
2  36 
3  41 
4  46 
5  51 
6  55 
7  57 
8  61 
9  67 
10  71 
11  74 
12  77 
13  82 
14  85 
15  92 
16  98 
17 100 
Mean1 4.25
Min 2
Max 7
1 Excluding the last, potentially capped, attempt
Black Landsknecht Clothes
Landsknecht
attack increase
0  15 
1  20 
2  24 
3  26 
4  31 
5  34 
6  37 
7  43 
8  50 
9  57 
10  62 
11  67 
12  71 
13  74 
14  76 
15  78 
16  84 
17  87 
18  92 
19  96 
20 100 
Mean1 4.26
Min 2
Max 7
1 Excluding the last, potentially capped, attempt
Sea Dragon's Circlet (headgear)
Circlet
attack increase
0   7 
1  12 
2  16 
3  20 
4  24 
5  26 
6  28 
7  32 
8  36 
9  40 
10  46 
11  52 
12  57 
13  62 
14  67 
15  69 
16  74 
17  79 
18  84 
19  88 
20  94 
21  96 
22 100 
Mean1 4.24
Min 2
Max 6
1 Excluding the last, potentially capped, attempt
Another Sea Dragon's Circlet
Circlet
attack increase
0  7 
1 10 
2 12 
3 16 
4 19 
5 23 
6 27 
7 31 
8 36 
9 38 
10 40 
11 42 
12 46 
13 53 
14 59 
15 64 
16 69 
17 73 
18 79 
19 83 
20
21
22
Mean 4.00
Min 2
Max 7

In contrast to the boots, the minimum improvement value was only 2 and the maximum was only 7 for the equipment in Tab. 10. Even normal-success forging of Fur Boots starts at 3, and great success reaches 8. Only when the Oxford item skill is not used can the boots’ improvement be as low as 2.
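The summary rows in Tab. 10 follow from the differences between consecutive values. As a small worked example in R, using the Sagaris column from Tab. 10 (the variable names are ours):

```r
# Sagaris defence after each forging attempt (attempt 0 = base item), from Tab. 10
sagaris <- c(0, 5, 9, 13, 18, 23, 28, 32, 35, 37, 41, 43, 46, 49,
             54, 58, 64, 71, 76, 80, 85, 91, 97, 100)

gain <- diff(sagaris)        # improvement per attempt
uncapped <- head(gain, -1)   # drop the last attempt, which may be capped at 100

c(mean = round(mean(uncapped), 2), min = min(uncapped), max = max(uncapped))
```

The last attempt is excluded because an item at 97 can gain at most 3 before hitting the cap, which would bias the mean downwards.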

TABLE 11: Summary of the outcome of forging defence on Horus’ Staffs.
                       Oxford
               no              yes             Total           p-value1
outcome                                                        0.093
    failure    18 (17.1%)      32 (12.3%)      50 (13.7%)
    normal     47 (44.8%)      97 (37.3%)      144 (39.5%)
    gs         40 (38.1%)      131 (50.4%)     171 (46.8%)
Total          105 (100.0%)    260 (100.0%)    365 (100.0%)
1 Pearson’s Chi-squared test

On average, a great success with these recipes yielded 4.4 improvement, whereas the average great success of Fur Boots (when done with Oxford item skill) gave 5.2 improvement.

Despite being performed under similar conditions, Tab. 10 shows that both the minimum and the maximum improvement of these forging recipes are less than for the Fur Boots, which is reflected in a lower average improvement per attempt.

The reason for this discrepancy was unclear. It could be that these newer recipes are simply implemented differently, or it could be related to these recipes being for forging the secondary attribute of the equipment. To distinguish between these possibilities we forged some Horus’ Staffs (Fig. 25, from the ultimate staff-training recipe book). This is an ideal item for this purpose, because it is an old recipe for forging defence on a weapon.6

6 The staff was never very popular because it is restricted to trade jobs and has low attack, but it can be useful in certain situations.


We did 260 forging attempts on 33 Horus’ Staffs with Oxford on, and 105 attempts without Oxford (18 staffs). All production was done on land, with refined r20 casting, Meister title on with apprentice nearby, 4 production success Oxfords on, aide in paymaster 120SS, and EX r6 equipped. The success rates were similar to Fur Boots, with 14% failure and 47% great success (Tab. 11).

TABLE 12: Degree of improvement of defence on Horus’ Staff from normal and great success (gs), without and with Item Improvement Techniques (Oxford) active.
                         Defence increase
                 Mean    SD      n      Valid n
Oxford: no
    failure                      18     0
    normal       2.04    0.81    47     47
    gs           3.00    1.18    40     40
    Total        2.48
    Sum                          105    87
Oxford: yes
    failure                      32     0
    normal       3.06    1.13    97     96
    gs           4.38    1.44    131    131
    Total        3.82
    Sum                          260    227
Grand Total      3.45
Grand Sum                        365    314
FIGURE 26: Box-plots of improvement value for Horus’ Staff (blue) compared to Fur Boots (green; treatment A and C from Fig. 13) and the other recipes (red; pooled from Tab. 10), split in normal and great success (gs) with and without Oxford Improvement Tech, overlaid by the individual data points (scaled by number of observations). Note that staffs without Oxford had low sample size, and for “others” only gs with Oxford was available.

The results for improvement values were clear: forging Horus’ Staff behaved like the new recipes for secondary stats, and not like Fur Boots (Fig. 26).

With Oxford on, the defence increased on average 4.4 for a great success staff (range 2–7), and 3.1 (range 2–5) for a normal success (Tab. 12, compare to the boots in Tab. 6). Without Oxford, both normal and great success defence increase on a staff could be as low as 1.
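The “Total” rows in Tab. 12 are means weighted by the number of valid observations, not simple averages of the normal and great success means. A short R sketch, with the values typed in from Tab. 12 (the data frame `staff` is constructed here only for illustration):

```r
# Means and valid n per outcome, copied from Tab. 12 (failures have no improvement value)
staff <- data.frame(
  oxford  = c("no", "no", "yes", "yes"),
  outcome = c("normal", "gs", "normal", "gs"),
  mean    = c(2.04, 3.00, 3.06, 4.38),
  valid_n = c(47, 40, 96, 131)
)

# Weighted mean within each Oxford group, then over all successful attempts
by_oxford <- sapply(split(staff, staff$oxford),
                    function(d) weighted.mean(d$mean, d$valid_n))
round(by_oxford, 2)
round(weighted.mean(staff$mean, staff$valid_n), 2)
```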

Hence, we conclude that recipes for forging a secondary attribute yield less improvement per attempt than recipes for forging the primary attribute.

RNG issues

Often when forging we get the feeling that failures are clustered. Several people have commented independently on this, and that it might be due to some biases in the random number generator algorithms used in UWO causing lucky and bad streaks. If this really is so it should be possible to detect certain patterns in the data.

Within a treatment group, the boots were made sequentially and to a large degree consecutively, and we kept track of the sequence of events.7 One way to test for non-randomness is thus by runs-tests, i.e. use statistical methods to see if segments of the sequence with the same value occur more often than expected by chance.
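For readers unfamiliar with runs tests, here is a minimal sketch of the Wald-Wolfowitz runs test (normal approximation) in base R. The function `runs_test` is written for this illustration and is not part of the analysis code in the Appendix; the sequence it is applied to is simulated, not our forging data.

```r
# Wald-Wolfowitz runs test for a binary sequence (normal approximation).
# Fewer runs than expected (negative z) indicates clustering of like outcomes.
runs_test <- function(x) {
  x <- as.logical(x)
  n1 <- sum(x)
  n2 <- sum(!x)
  stopifnot(n1 > 0, n2 > 0)            # both outcomes must occur
  n <- n1 + n2
  runs <- length(rle(x)$lengths)       # number of uninterrupted segments
  expected <- 2 * n1 * n2 / n + 1
  variance <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  z <- (runs - expected) / sqrt(variance)
  list(runs = runs, expected = expected, z = z,
       p_value = 2 * pnorm(-abs(z)))
}

# SIMULATED independent failures at roughly the overall rate (not the study data)
set.seed(1)
failure <- rbinom(3630, size = 1, prob = 0.126) == 1
res <- runs_test(failure)
res$z
res$p_value
```

If failures really came in streaks, the observed number of runs would be clearly lower than `expected` and `z` would be strongly negative.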

7 However, since a failure meant that we closed the window in our set-up, the RNG may have been reset each time. It can therefore be argued that any biased streaks must be sought in a longer, unclosed sequence, and that we did not collect the relevant data. Maybe.

8 Although a first-attempt failure may also have been preceded by a pair of boots reaching 100 defence.

In our case, however, we can test a more straightforward and explicit prediction about failures: clustered failures would mean that failures should occur disproportionately often at the first attempt. This is because a failure can only be followed directly by another failure if that second failure is a first attempt.8 Whenever two failures occur together, at least one of them must have been a first attempt. A casual look at Fig. 17 from earlier might suggest that there was a suspiciously large number of failures at the first forging attempt.

FIGURE 27: (a) Proportion of failures along the sequence of forging attempts for each pair of Fur Boots. Also shown are (b) proportion of great success and (c) proportion of normal success, as well as (d) the raw sequence of failures over all pairs of boots.

Although the proportion failure was a little higher for the 1st attempt of a pair of boots (13.93%) than the overall mean from all attempts (12.62%), it was not more so than can be expected from random variation.9 We can put this in perspective by looking at the whole sequence of forging attempts per pair of boots (Fig. 27). Then it becomes clear that first attempts are not special, and ergo failures cannot be overly clustered.10

9 67 failures out of 481 first attempts = 0.1393 with 95% confidence interval [0.1111–0.1731].
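The interval in footnote 9 is a Wilson score interval, which can be obtained in base R with `prop.test()` when the continuity correction is turned off. A minimal sketch (the variable names are ours; the overall failure rate is the rounded value quoted in the text):

```r
# Wilson score interval for the first-attempt failure rate (67 of 481)
first_attempt <- prop.test(x = 67, n = 481, correct = FALSE)
ci <- as.numeric(first_attempt$conf.int)
round(ci, 4)

# Does the overall failure rate fall inside the interval?
overall <- 0.1262
overall >= ci[1] && overall <= ci[2]
```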

10 Once again we have been fooled by the human psyche, which is a sponge that sucks in random noise and spits out fictitious patterns.

Discussion

This study was motivated partly by confusion about what the “special production” EX and sailor equipment actually do. The EX does not play a part in normal production, at least not for the stats of crafted gear (Stinker 2024). The present study shows that it does not affect forging either. At least not to any degree we could detect with our reasonably large sample sizes. If it has any effect it must be much smaller than the effect of the Oxford skill, and practically insignificant. The same can be said about the sailor equipment. A further clue that it had no effect is the fact that durability did not decrease on the sailor equipment (beyond what happens over time with any installed sailor equipment), which should happen after special goods production (Fig. 6).

Clearly, we do not understand what these items do. What is “special goods production”? It is not forging, not great success, not alchemy experiments. Perhaps it applies to Florence recipes, but these cannot be done at sea so that is contradicted by the existence of the sailor equipment. Maybe it is restricted to recipes where great success results in a qualitatively different item with another name, such as master’s cannons or armour plates. We suggest that to narrow down and simplify the search, one could test various production at sea and get a hint from the durability of the sailor equipment.

Given that refining has a huge effect on the degree of improvement from great success in normal crafting (Stinker 2024), we had not expected it to have no effect whatsoever on the degree of improvement in forging. The game is lying about refining: the statement it gives, “You were able to produce higher-grade Fur Boots due to the effect of the skill refined” (Fig. 2), is false. It could have been that this only applies in certain situations, but we found neither a main effect nor any interaction with the other factors we studied. Specifically, we had expected an interaction effect between refinement and outcome (success/great success). This is in stark contrast to crafting gear from raw materials, where refining is the factor that affects the stats the most, and only on great success.

We had also expected that skill rank would show some effect on the degree of improvement, but we could not detect any difference between producing at r11 and r21. Earlier we have shown, for example, that skill rank does affect the outcome of alchemy experiments (not the rate of huge success, but the outcome of normal success; Stinker 2023).

Less surprisingly, we could confirm that the Oxford item skill does indeed increase the improvement values when forging. The maximum improvement increases from 6 to 8, resulting in a little over 1 extra defence per attempt on average.

Entirely unsurprising was the fact that great success yields more improvement on average than normal success, but perhaps not by as much as one might think: a normal success would often give more improvement than some great successes do (Fig. 13), and a normal success with Oxford is on average slightly better than a great success without (Tab. 6).

However, these findings did not transfer directly to some newer forging recipes, currently in more demand, that we analysed for comparison (Tab. 10). These recipes gave less improvement per great success than the boots, when done under similar conditions. The reason for this discrepancy is that these recipes are for forging the secondary attribute, and not because they are newer recipes. This was confirmed by forging defence with the old Horus’ Staff recipe (Fig. 26), which behaved like the newer recipes and not like Fur Boots.

Acknowledgements

Thanks to Nijntx for comments.

Appendix

R code used in this report

library(ggplot2)
library(dplyr)
library(tidyr)
library(ggh4x)
library(legendry)
library(ggtext)
library(gt)
library(afex)
library(gtsummary)

guide_axis_label_trans <- function(label_trans = identity, ...) {
  axis_guide <- guide_axis(...)
  axis_guide$label_trans <- rlang::as_function(label_trans)
  class(axis_guide) <- c("guide_axis_trans", class(axis_guide))
  axis_guide
}

guide_train.guide_axis_trans <- function(x, ...) {
  trained <- NextMethod()
  trained$key$.label <- x$label_trans(trained$key$.label)
  trained
}

#function to check for and place legend in empty facet if available
shift_legend4 <- function(p) {
  pnls <- cowplot::plot_to_gtable(p) |> gtable::gtable_filter("panel") |>
    with(setNames(grobs, layout$name)) |> purrr::keep(~identical(.x, zeroGrob()))
  
  if(length(pnls) == 0) stop("No empty facets in the plot")
  
  lemon::reposition_legend(p, "center", panel=names(pnls)) |>
    ggplotify::as.ggplot()
}

named_group_split <- function(.tbl, ...) {
  grouped <- group_by(.tbl, ...)
  names <- rlang::inject(paste(!!!group_keys(grouped), sep = " / "))

  grouped %>% 
    group_split() %>% 
    rlang::set_names(names)
}

update_geom_defaults("bar", list(fill = I("#377eb8"), colour =("grey30")))
theme_update(plot.background = element_rect(fill = "grey92", 
                                            colour = NA), 
             #linewidth = 0.5),
             legend.background = element_rect(fill = "grey92"),
             legend.title = element_text(face = "bold"),
             legend.key = element_rect(color = "white", fill = "grey97"),
             legend.margin = NULL
)

stinker <- readODS::read_ods("Boots.ods", sheet =1)
mars <- readODS::read_ods("BootsMarsaliResults.ods", sheet =1)
boots <- rbind(stinker, mars)
boots$outcome <- factor(boots$outcome, levels = c("failure", "normal", "gs"))
boots$sailor_equip <- factor(boots$sailor_equip)
boots$refined <- factor(boots$refined)
boots$crafting_rank <- factor(boots$crafting_rank)
boots$oxford_item <- factor(boots$oxford_item )
boots$Ex_rank <- factor(boots$Ex_rank)
summary(boots)

boots |>
write.csv2("bootsRawData.csv")

overalProp <- boots |>
  summarise(
    normal = sum(outcome == "normal"),
    gs = sum(outcome == "gs"),
    failure = sum(outcome == "failure"),
    total = n()
  ) |>
  mutate(
    propGs = gs / total,
    propFailure = failure / total,
    propNormal = normal / total,
    propNonfailure = (normal + gs) / total
  )

bootsProp <- boots |>
  group_by(treatment) |>
  summarise(normal = sum(outcome == "normal"),
            gs = sum(outcome == "gs"),
            failure = sum(outcome == "failure"),
            total = n(),
            sailor_equip = unique(sailor_equip),
            refined = unique(refined),
            crafting_rank = unique(crafting_rank),
            oxford_item = unique(oxford_item),
            Ex_rank = unique(Ex_rank)) |>
  mutate(nonfailure = normal+gs)

#binomial confidence intervals
binomCIgs <- binom::binom.wilson(bootsProp$gs, bootsProp$total)
binomCIfailure <- binom::binom.wilson(bootsProp$failure, bootsProp$total)
binomCInonfailure <- binom::binom.wilson(bootsProp$nonfailure, bootsProp$total)

CIgs <- cbind(bootsProp, binomCIgs)
CIfailure <- cbind(bootsProp, binomCIfailure)
CInonfailure <- cbind(bootsProp, binomCInonfailure)
CI <- bind_cols(bootsProp, binomCIgs, binomCInonfailure, .name_repair = c("unique")) # name repair adds position suffixes: ...15 to ...17 are great success, ...21 to ...23 are non-failure

meanGsOx <- boots |>
  filter(outcome == "gs" & oxford_item == "yes") |>
  summarise(avrg = round(mean(valid_def_increase, na.rm = TRUE), 1))
treatmentGroups <- boots |>
  group_by(treatment) |>  
  summarise(
    treatment = factor(unique(treatment)),
    Where = factor(unique(land_sea)),
    SE = recode(factor(unique(sailor_equip)), "NA" = NA_character_,
                "yes" = "r5"),
    Refined = unique(refined),
    Handicrafts = recode(unique(crafting_rank),
                         "11" = "r11",
                         "21" = "r21"),
    Oxford = unique(oxford_item),
    EX = recode(unique(Ex_rank),
                "0" = "no",
                "6" = "r6"),
    Attempts = n(),
    n_boots = length(unique(boots_id)),
    n_valid_boots = length(unique(boots_id[!is.na(treatment_id)])),
    incomplete_boots = length(unique(boots_id[is.na(treatment_id)])),
    n_normal = sum(outcome == "normal"),
    n_gs = sum(outcome == "gs"),
    n_failure = sum(outcome == "failure")
    ) 

treatmentGroups |>
  select(treatment:n_boots) |>
gt(rowname_col = "treatment") |>
  tab_spanner(label = md("**Factors**"), 
              columns = c(3:7)
              ) |>
  tab_stubhead(label = "Treatment") |>
  sub_missing() |>
  cols_label(SE = "Sailor Equip",
             Attempts = "N Attempts",
             n_boots = "N Boots") |>
  tab_style(style = cell_text(color = "#777"), locations = cells_title ()) |>
  opt_stylize(style = 3, color = "blue") |>
  grand_summary_rows(
    columns = c(8:9),
    fns = list(Total ~ sum(.))
    ) |>
  tab_footnote(
    footnote = "Including incomplete boots (one each in treatment A to D)",
    locations = cells_column_labels(columns = n_boots)
    ) |>
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4),
    quarto.use_bootstrap = TRUE
  ) |>
  tab_style(
    style = cell_text(size = px(16)),
    locations = cells_body()
    ) |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_stub_grand_summary()
  )

boots |>
ggplot(aes(x = treatment, fill = outcome)) +
  geom_bar(position = "fill") + 
  ylab("proportion")+
  scale_fill_brewer(palette = "Set1") +
  geom_hline(yintercept = overalProp$propGs, linetype =2) +
  geom_hline(yintercept = 1-overalProp$propFailure, linetype =2)
ggplot() +
  geom_bar(data = boots, aes(x = treatment, fill = outcome), position = "fill") + 
  scale_fill_brewer(palette = "Set1") +
  geom_hline(aes(yintercept = overalProp$propGs, linetype = "great success"), linetype = 2) +
  geom_hline(aes(yintercept = 1-overalProp$propFailure, linetype = "non-failure"), linetype = 2) +
  geom_pointrange(data = CIgs, aes(y = mean, x = treatment, ymin = lower, ymax = upper)) +
  geom_pointrange(data = CInonfailure, aes(y = mean, x = treatment, ymin = lower, ymax = upper)) +
  xlab("experimental treatment") +
  ylab("proportion +/- 95% C.I.") +
  # only one y scale is kept: a second scale_y_continuous() call replaces any earlier one
  scale_y_continuous(expand = expansion(mult = c(0, 0.05)))
CIgs |> 
  ggplot() +
  geom_pointrange(data = CIgs, aes(y = mean, x = treatment, ymin = lower, ymax = upper))+ 
 #                 size =1, linewidth = 1) +
  scale_y_continuous(breaks = seq(0, 1, 0.1), limits = c(0, 1)) +
  geom_hline(yintercept = overalProp$propGs, linetype =2)+
    xlab("experimental treatment") +
  ylab("proportion great success +/- 95% binomial C.I.")

CIfailure |> 
  ggplot(aes(y = mean, x = treatment)) +
  geom_pointrange(aes(ymin = lower, ymax = upper))+
  scale_y_continuous(breaks = seq(0, 1, 0.1), limits = c(0, 1)) +
    geom_hline(yintercept = overalProp$propFailure, linetype =2)+
  xlab("experimental treatment") +
  ylab("proportion failure +/- 95% binomial C.I.")

CInonfailure |> 
  ggplot(aes(y = mean, x = treatment)) +
  geom_pointrange(aes(ymin = lower, ymax = upper))+
  scale_y_continuous(breaks = seq(0, 1, 0.1), limits = c(0, 1)) +
    geom_hline(yintercept = overalProp$propNonfailure, linetype =2)+
  xlab("experimental treatment") +
  ylab("proportion non-failure +/- 95% binomial C.I.")


CI$sailor_equip <- recode_factor(CI$sailor_equip, "NA" = "—")
CI |>
  ggplot(aes(
    x = weave_factors(sailor_equip, refined, crafting_rank, oxford_item, Ex_rank)
  )) +
  geom_hline(aes(yintercept = 1-overalProp$propFailure, colour = "non-failure"), linetype = 3, linewidth =1) +
  geom_hline(aes(yintercept = overalProp$propGs, colour = "great success"), linetype = 3, linewidth =1) +
  geom_pointrange(aes(y = mean...15, ymin = lower...16, ymax = upper...17)) +
  geom_pointrange(aes(y = mean...21, ymin = lower...22, ymax = upper...23)) +
  #scale_y_continuous(breaks = seq(0, 1, 0.1), limits = c(0, 1)) +
  #xlab("experimental treatment") +
  ylab("proportion +/- 95% C.I.") +
  scale_x_discrete(guide = legendry::guide_axis_nested(title = NULL, type = "box")) +
  scale_colour_brewer(palette = "Set1", direction = -1) +
  guides(colour = guide_legend(title = "Outcome"),
         #for now, paste manual treatment order
    x.sec = guide_axis_base(title = "treatment group",
                            key = key_manual(aesthetic =c(1:14),
                                             label = c("N", "M", "H", "G", "F", "E", "D", "C", "B", "A", "L", "K", "J", "I")))
    )+
  #      labels = c("N", "M", "H", "G", "F", "E", "D", "C", "B", "A", "L", "K", "J", "I"),
  theme(axis.ticks.x = element_blank()) +
  annotate(
    geom = "richtext",
    size = 3.1,
    label = paste(
      "**Factors**",
      "<br>sailor equip",
      "<br>refined",
      "<br>handi rank",
      "<br>Oxford item",
      "<br>EX rank"
    ),
    fill = NA,
    label.color = NA,
    x = 14.3,
    y = -0.10,
    hjust = 0
  ) +
  coord_cartesian(expand = FALSE,
                  clip = 'off',
                  ylim = c(0, 1))
  

boots <- boots |> mutate(
  failure = recode_factor(
    outcome,
    "normal" = "no",
    "gs" = "no",
    "failure" = "yes"
  ),
  great_success = recode_factor(
    outcome,
    "normal" = "no",
    "failure" = "no",
    "gs" = "yes"
  )) 

boots_land <- boots |>
  filter(land_sea == "land")
#boots_land$oxford_item  <- relevel(boots_land$oxford_item, ref = "yes")

boots_sea <- boots |>
  filter(land_sea == "sea") |>
  droplevels()

#make a function to give tables
#do final adjustments outside this function, using gt functions instead, such as 
#label event rate to failure or gs rate. 

# A glm that ignores land/sea will indicate some effect of crafting rank, but that is only because all the crafting at sea was at r21 and because treatment L happened to have lots of great success. It makes no sense, since L should not be better than treatment K, which is WITH EX and otherwise the same.

gt_glmTable <- function(df, event, title, keep = FALSE) {
  # arguments: model, failure/great success, title, keep (p values for each level)
  tbl_regression(df,
                 exponentiate = TRUE,
                 add_estimate_to_reference_rows = FALSE) |>
    add_n(location = c('level')) |>
    add_nevent(location = c('level')) |>
    add_global_p(keep = keep, type = "III") |>
    # this will update the em-dash in the CI row to Ref.
    # modify_table_styling(
    #   columns = conf.low,
    #   rows = reference_row %in% TRUE,
    #   missing_symbol = "Reference"
    # ) |>
    # adding event rate
    modify_table_body( ~ .x |>
                         dplyr::mutate(
                           stat_nevent_rate =
                             ifelse(!is.na(stat_nevent), paste0(
                               style_sigfig(stat_nevent / stat_n, scale = 100), "%"
                             ), NA),
                           .after = stat_nevent
                         )) |>
    # merge the colums into a single column
    modify_column_merge(pattern = "{stat_nevent} / {stat_n} ({stat_nevent_rate})", 
                        rows = !is.na(stat_nevent)) |>
    modify_table_styling(
    column = conf.low,
    rows = !is.na(estimate),
    cols_merge_pattern = "{conf.low}–{conf.high}"
  ) |>
    modify_header(stat_nevent = glue::glue("**{event} Rate**")) |>
    bold_labels() |>
    modify_header(label = "**Variable**") |>
    modify_column_alignment(columns = stat_nevent, align = "right") |>
    as_gt() |>
    tab_header(title = glue::glue("{title}")) |>
    opt_stylize(style = 3, color = "blue") |>
    tab_options(
      data_row.padding = px(2),
      summary_row.padding = px(3),
      grand_summary_row.padding = px(3),
      row_group.padding = px(4),
      footnotes.padding = px(2)
    ) |>
    tab_style(locations = cells_title (), style = cell_borders(style = 'hidden')) |>
    cols_align_decimal(columns = p.value,
                       dec_mark = ".",
                       locale = NULL) |>
    tab_style(style = cell_text(align = "center"),
              locations = cells_column_labels(columns = stat_nevent)) |>
  tab_style(
    style = cell_text(size = px(15)),
    locations = cells_body()
    )
}

  # # merge OR and CI into single column
  # modify_table_styling(
  #   column = estimate,
  #   rows = !is.na(estimate),
  #   cols_merge_pattern = "{estimate} ({conf.low} to {conf.high})"
  # ) %>%
  # modify_header(estimate ~ "**OR (95% CI)**") %>%
  # modify_column_hide(c(ci, p.value))

#set global contrast default (sum-to-zero) before fitting, so that the Type III tests below are meaningful
options(contrasts=c("contr.sum","contr.poly"))
glm1 <- glm(failure ~ Ex_rank*crafting_rank*oxford_item, 
            family = binomial(link='logit'), data = boots_land)

# contrasts = list(oxford_item = contr.sum,
#                  crafting_rank = contr.sum,
#                  Ex_rank = contr.sum,
#                  refined = contr.sum))

# Contrast coding seems to matter in this case because gtsummary is used rather than afex:
# gtsummary uses the default car::Anova settings and only asks for type = 3,
# whereas afex forces contr.sum.
# TODO: report this issue with tbl_regression() and global p-values to the gtsummary maintainer.

summary(glm1)
car::Anova(glm1, type = 3)
anova(glm1)
glm1 |> broom.helpers::tidy_plus_plus(
    exponentiate = TRUE,
    add_reference_rows = FALSE,
    categorical_terms_pattern = "{level} / {reference_level}",
    add_n = TRUE
  ) |> gt()

gt_glmTable(glm1, "Failure", "Full factorial model w/o Refine")

glmFailureGroups <- glm(failure ~ treatment, 
             family = binomial(link='logit'), 
             data = boots,
             contrasts = list(treatment = contr.sum))
gt_glmTable(glmFailureGroups, "Failure", "Simple Logistic regression", TRUE) |>
    tab_footnote(
    footnote = "Contrasts: effects coding (i.e. comparing to overall mean)",
    locations = cells_body(columns = label,
                           rows = label == "treatment")
    )
#at land, full w/o refine
glm1 <- glm(failure ~ oxford_item*crafting_rank*Ex_rank, 
             family = binomial(link='logit'), data = boots_land)
summary(glm1)
car::Anova(glm1, type = 3)

#at land, additive with refine
glm2 <- glm(failure ~ oxford_item + crafting_rank + Ex_rank + refined,
             family = binomial(link='logit'), data = boots_land)
summary(glm2)
car::Anova(glm2, type = 3)

#at sea, full factorial with sailor equipment (but no data on oxford or refined)
glm3 <- glm(failure ~ sailor_equip*Ex_rank, 
             family = binomial(link='logit'), data = boots_sea)
summary(glm3)
car::Anova(glm3, type = 3)

#at sea, additive with sailor equipment (but no data on oxford or refined)
glm4 <- glm(failure ~ sailor_equip + Ex_rank, 
             family = binomial(link='logit'), data = boots_sea)
summary(glm4)
car::Anova(glm4, type = 3)
gt_glmTable(glm1, "Failure", "Factorial model without Refine")
gt_glmTable(glm2, "Failure", "Additive model with Refine")
gt_glmTable(glm3, "Failure", "Factorial model with Sailor Equipment")
gt_glmTable(glm4, "Failure", "Additive model with Sailor Equipment")

glmGsGroups <- glm(great_success ~ treatment, 
             family = binomial(link='logit'), 
             data = boots,
             contrasts = list(treatment = contr.sum)
             )
gt_glmTable(glmGsGroups, "Great Success", "Simple Logistic regression", TRUE) |>
    tab_footnote(
    footnote = "Contrasts: effects coding (i.e. comparing to overall mean)",
    locations = cells_body(columns = label,
                           rows = label == "treatment")
    )
#at land, full w/o refine
glm5 <- glm(great_success ~ oxford_item*crafting_rank*Ex_rank, 
             family = binomial(link='logit'), data = boots_land)
summary(glm5)
car::Anova(glm5, type =3)

#at land, additive with refine
glm6 <- glm(great_success ~ oxford_item + crafting_rank + Ex_rank + refined,
             family = binomial(link='logit'), data = boots_land)
summary(glm6)
car::Anova(glm6, type =3)

#at sea, full factorial with sailor equipment (but no data on oxford or refined)
glm7 <- glm(great_success ~ sailor_equip*Ex_rank, 
             family = binomial(link='logit'), data = boots_sea)
summary(glm7)
car::Anova(glm7, type =3)

#at sea, additive with sailor equipment (but no data on oxford or refined)
glm8 <- glm(great_success ~ sailor_equip + Ex_rank, 
             family = binomial(link='logit'), data = boots_sea)
summary(glm8)
car::Anova(glm8, type =3)
gt_glmTable(glm5, "Great Success", "Factorial model w/o Refine")
gt_glmTable(glm6, "Great Success", "Additive model with Refine")
gt_glmTable(glm7, "Great Success", "Factorial model with Sailor Equipment")
gt_glmTable(glm8, "Great Success", "Additive model with Sailor Equipment")
boots |>
ggplot(aes(y=valid_def_increase, x= treatment)) +
  geom_boxplot(aes(fill = oxford_item), outliers = F, staplewidth = 0.7) +
  geom_count(alpha=0.6) +
  ylab("Defence increase") +
  scale_fill_brewer(palette = "Set1") +
  theme(axis.ticks.x = element_blank())

histoDef1 <- boots |>
 ggplot(aes(valid_def_increase, fill = oxford_item)) +
  geom_histogram(binwidth = 1) +
  scale_fill_brewer(palette = "Set1") +
  xlab("Defence increase") +
  facet_wrap(vars(treatment), ncol=5)
histoDef2 <-  shift_legend4(histoDef1)

meanDef <- mean(boots$valid_def_increase, na.rm = T)
summarisedOx <- boots |>
  group_by(oxford_item) |>
  summarise(n = n(),
            meanDef = mean(valid_def_increase, na.rm = T),
            n_validDef = sum(valid_def_increase >= 1, na.rm = T)
            )

summarisedOxOutcome <- boots |>
  group_by(oxford_item, outcome) |>
  summarise(meanDef = mean(valid_def_increase, na.rm = T),
            SDDef = sd(valid_def_increase, na.rm = T),
            n = n(),
            n_validDef = sum(valid_def_increase >= 1, na.rm = T)
            )
histoDef2 

boots |>
  filter(outcome != "failure") |>
ggplot(aes(y=valid_def_increase, x= outcome)) +
  geom_boxplot(aes(fill = oxford_item), outliers = F, staplewidth = 0.7) +
  geom_count(alpha=0.6) +
  scale_fill_brewer(palette = "Set1") +
  scale_x_discrete(drop = TRUE) +
  scale_y_continuous(expand = c(0.1, 0.05)) +
  facet_wrap(vars(treatment), nrow = 2) +
  theme(axis.ticks.x = element_blank())
summarisedDef <- summarisedOxOutcome |> 
    mutate(
    oxford_item = paste0("Oxford: ", oxford_item)) |>
   gt(groupname_col = "oxford_item", rowname_col = "outcome") |>
    cols_label(
    outcome = "Outcome",
    meanDef = "Mean",
    SDDef = "SD",
    n_validDef = "Valid n") |>
    tab_spanner(
    label = md('**Defence increase**'),
    columns = 3:6) |>
  sub_missing() |>
   fmt_number(
    columns = c("meanDef", "SDDef"),
    decimals = 2) |>
   summary_rows(
    fns = list("Total" = ~weighted.mean(., n_validDef)
    ), 
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef) |>
    summary_rows(
    fns = list(
      "Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(n, n_validDef)
   ) |>
  grand_summary_rows(
    fns = list("Grand Total" = ~weighted.mean(., n_validDef)
    ),
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef
  ) |>
   grand_summary_rows(
    fns = list(
      "Grand Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(n, n_validDef)
   ) |>
    tab_stubhead(label = html(local_image(
    file = "images/00201500.png", 
    height = 40))) |>
  opt_stylize(style = 3, color = "blue") |>
  tab_style(
    locations = list(cells_stub_grand_summary(),cells_row_groups()),
    style = cell_text(weight = "bold")) |>
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4))

#summarisedDef

summarisedOxOutcome2 <- boots |>
  group_by(oxford_item, outcome) |>
  summarise(
            "2" = sum(valid_def_increase == 2, na.rm = T),
            "3" = sum(valid_def_increase == 3, na.rm = T),
            "4" = sum(valid_def_increase == 4, na.rm = T),
            "5" = sum(valid_def_increase == 5, na.rm = T),
            "6" = sum(valid_def_increase == 6, na.rm = T),
            "7" = sum(valid_def_increase == 7, na.rm = T),
            "8" = sum(valid_def_increase == 8, na.rm = T),
    meanDef = mean(valid_def_increase, na.rm = T),
    SDDef = sd(valid_def_increase, na.rm = T),
    n = n(),
    n_validDef = sum(valid_def_increase >= 1, na.rm = T)
            )

summarisedDef2 <- summarisedOxOutcome2 |> 
    mutate(
    oxford_item = paste0("Oxford: ", oxford_item)) |>
   gt(groupname_col = "oxford_item", rowname_col = "outcome") |>
    cols_label(
    outcome = "Outcome",
    meanDef = "Mean",
    SDDef = "SD",
    n_validDef = "Valid n") |>
    tab_spanner(
    label = md('**Defence increase**'),
    columns = 3:13) |>
   fmt_number(
    columns = c("meanDef", "SDDef"),
    decimals = 2) |>
   summary_rows(
    fns = list("Total" = ~weighted.mean(., n_validDef)
    ), 
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef) |>
    summary_rows(
    fns = list(
      "Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(3:9, 12:13) #  2,3,4,5,6,7,8,n, n_validDef)
   ) |>
  grand_summary_rows(
    fns = list("Grand Total" = ~weighted.mean(., n_validDef)
    ),
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef
  ) |>
   grand_summary_rows(
    fns = list(
      "Grand Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(3:9, 12:13)
   ) |>
    tab_stubhead(label = html(local_image(
    file = "images/00201500.png", 
    height = 40))) |>
    sub_missing() |>
  sub_zero(rows = c(1, 4), zero_text = "—") |>
  opt_stylize(style = 3, color = "blue") |>
  tab_style(
    locations = list(cells_stub_grand_summary(),cells_row_groups()),
    style = cell_text(weight = "bold")) |>
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4))

summarisedDef2

# calculate average improvements in 4 groups, and 2-way ANOVA for 2x2 groups with interaction. + for all treatments?

#first a simple model, based on what we can see from the raw data it should be this simple
anovaTable <- car::Anova(lm(valid_def_increase ~ oxford_item*outcome, 
              data=boots,  
              contrasts = list(oxford_item = contr.sum, outcome = contr.sum)), 
           type = 3)

anovaTable2 <- aov_car(valid_def_increase ~ oxford_item * outcome + Error(id), data = boots, observed = "outcome")
anovaTable3 <- nice(anovaTable2, MSE = FALSE, sig_symbols = rep("", 4))
gt_anovaTable <- function(df, title){
gt(df, rowname_col = "Effect") |>
  tab_stubhead(label = "Effect") |>
  tab_footnote(footnote = "generalized eta-squared (effect size)", 
               locations = cells_column_labels(columns = ges)) |>
  tab_header(title = glue::glue("{title}")) |>
  tab_style(style = cell_text(color = "#777"), locations = cells_title ()) |>
    cols_align(align = 'right', columns = 2:5) |> 
  opt_stylize(style = 3, color = "blue") |>
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4)
  ) |>
  tab_style(locations = cells_title (), style = cell_borders(style = 'hidden')) |>
  tab_style(
    style = cell_text(size = px(15)),
    locations = cells_body()
    )
}
gt_anovaTable(anovaTable3, "")

#gt(anovaTable, rownames_to_stub = TRUE)

#full-factorial  
anovaTable4 <- aov_car(valid_def_increase ~ oxford_item * outcome * Ex_rank * crafting_rank + Error(id), 
                       data = boots_land, observed = "outcome")
anovaTable5 <- nice(anovaTable4, MSE = FALSE, sig_symbols = rep("", 4))

# no interactions, but with refined
anovaTable6 <- aov_car(valid_def_increase ~ oxford_item + outcome + Ex_rank + crafting_rank + refined + Error(id), data = boots_land, observed = "outcome")
anovaTable7 <- nice(anovaTable6, MSE = FALSE, sig_symbols = rep("", 4))

#outcome and refined + ex (A B M N)
boots_ABMN <- boots_land |>
  filter(crafting_rank == "21" & oxford_item == "yes") 
anovaTable8 <- aov_car(valid_def_increase ~ refined*outcome*Ex_rank + Error(id), 
                       data = boots_ABMN, observed = "outcome")
anovaTable9 <- nice(anovaTable8, MSE = FALSE, sig_symbols = rep("", 4))

#forging at sea

anovaTable10 <- aov_car(valid_def_increase ~ sailor_equip*outcome*Ex_rank + Error(id), 
                       data = boots_sea, observed = "outcome")
anovaTable11 <- nice(anovaTable10, MSE = FALSE, sig_symbols = rep("", 4))

gt_anovaTable(anovaTable5, "Full factorial model w/o Refine")
gt_anovaTable(anovaTable7, "Additive model with Refine")
gt_anovaTable(anovaTable9, "Full factorial model with Refine")
gt_anovaTable(anovaTable11, "Full factorial model with Sailor Equipment")
duraHist1 <- boots |>
ggplot(aes(valid_dura_increase, fill = oxford_item)) +
  geom_histogram(binwidth = 1) +
    scale_fill_brewer(palette = "Set1") +
  facet_wrap(vars(treatment), ncol=5) +
  xlab("Durability increase")
duraHist2 <- shift_legend4(duraHist1)

duraBox <- boots |>
ggplot(aes(y=valid_dura_increase, x = treatment)) +
  geom_boxplot(aes(fill = oxford_item), outliers = F, staplewidth = 0.7) +
  geom_count(alpha=0.6) +
  scale_fill_brewer(palette = "Set1") +
  theme(axis.ticks.x = element_blank()) +
  ylab("Durability increase")

duraScatter <- boots |>
ggplot(aes(y=valid_dura_increase, valid_def_increase))+
  geom_count(alpha = 1) +
  scale_size_area() +
  ylab("Durability increase") +
  xlab("Defence increase")

# Pearson correlation between durability and defence increase; rows with NA are dropped
corRes <- cor.test(~ valid_dura_increase + valid_def_increase, data = boots,
         na.action = na.omit, method = "pearson",
         alternative = "two.sided")
duraHist2
duraBox
duraScatter
endDef <- boots |>
  filter(treatment_id >= 1) |>
  group_by(boots_id) |>
  summarise(
    result = max(defence, na.rm = T),
    attempts = n(),
    treatment = unique(treatment),
    oxford_item = unique(oxford_item),
    refined = unique(refined),
    Ex_rank = unique(Ex_rank),
    crafting_rank = unique(crafting_rank),
    sailor_equip = unique(sailor_equip),
    land_sea = unique(land_sea)
  ) |>
  # max() returns -Inf when a boot has no recorded defence value; such boots are set to 7
  mutate(final = if_else(result < 0, 7, result))

endDef |>
  ggplot(aes(x = final)) +
  geom_histogram(binwidth = 1) +
  xlab("final defence") +
  scale_x_continuous(breaks = seq(0, 100, 10))+
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.05)))

endDef |>
  ggplot(aes(x = treatment, y = final)) +
  geom_boxplot(aes(fill = ifelse(treatment %in% c("C", "D", "G", "H"), 
                                    "white", "#377eb8")),
                   outliers = F, width = 0.5, staplewidth = 0.7,  alpha =1) +
  geom_count(aes(colour = ifelse(result == 100, "#4daf4a", "#e41a1c")), alpha = 0.6) +
  scale_colour_identity("end result", labels = c("hooray", "poop"),
                      guide = "legend") +
  scale_fill_identity("Oxford item", labels = c("yes", "no"),
                      guide = "legend") + 
  guides(size = guide_legend(title = "n boots", order = 2)) +
  scale_size_area(n.breaks = 6) +
  scale_y_continuous(breaks = seq(0, 100, 10), limits = c(0, 100), 
                     expand = expansion(mult = c(0, 0.05))) +
  ylab("final defence")
  #geom_hline(yintercept = 7, linetype = 2)
# C, D, G, H are the treatments without the Oxford item; maybe show them in another colour
endDef |>
  ggplot(aes(x = oxford_item, y = final)) +
  geom_boxplot(aes(fill = ifelse(treatment %in% c("C", "D", "G", "H"), 
                                    "white", "#377eb8")),
                   outliers = F, width = 0.5, staplewidth = 0.7,  alpha =1) +
  geom_count(aes(colour = ifelse(result == 100, "#4daf4a", "#e41a1c")), alpha = 0.6) +
  scale_colour_identity("end result", labels = c("hooray", "poop"),
                      guide = "legend") +
  scale_fill_identity("Oxford item", labels = c("yes", "no"),
                      guide = "legend") + 
  scale_size_area(n.breaks = 6) +  
  guides(size = guide_legend(title = "n boots", order = 3)) +
  scale_y_continuous(breaks = seq(0, 100, 10), limits = c(0, 100), 
                     expand = expansion(mult = c(0, 0.05))) +
  ylab("final defence")+
  xlab("Oxford")

#one-way just to see if groups differ
anovaTable12 <- aov_car(final ~ treatment + Error(boots_id), data = endDef)
anovaTable13 <- nice(anovaTable12, MSE = FALSE, sig_symbols = rep("", 4))

anovaTable14 <- car::Anova(lm(final ~ oxford_item, 
              data=endDef,  
              contrasts = list(oxford_item = contr.sum)), 
           type = 3)

anovaTable15 <- aov_car(final ~ oxford_item + Error(boots_id), data = endDef)
anovaTable16 <- nice(anovaTable15, MSE = FALSE, sig_symbols = rep("", 4))

endDefLand <- endDef |>
  filter(land_sea == "land") 

#full-factorial  
anovaTable17 <- aov_car(final ~ oxford_item * Ex_rank * crafting_rank + Error(boots_id), 
                       data = endDefLand)
anovaTable18 <- nice(anovaTable17, MSE = FALSE, sig_symbols = rep("", 4))

# no interactions, but with refined
anovaTable19 <- aov_car(final ~ oxford_item + Ex_rank + crafting_rank + refined + Error(boots_id), data = endDefLand)
anovaTable20 <- nice(anovaTable19, MSE = FALSE, sig_symbols = rep("", 4))

#outcome and refined + ex (A B M N)
endDefABMN <- endDefLand |>
  filter(crafting_rank == "21" & oxford_item == "yes") 
anovaTable21 <- aov_car(final ~ refined*Ex_rank + Error(boots_id), 
                       data = endDefABMN)
anovaTable22 <- nice(anovaTable21, MSE = FALSE, sig_symbols = rep("", 4))

#forging at sea
endDefSea <- endDef |>
  filter(land_sea == "sea") 

anovaTable23 <- aov_car(final ~ sailor_equip*Ex_rank + Error(boots_id), 
                       data = endDefSea)
anovaTable24 <- nice(anovaTable23, MSE = FALSE, sig_symbols = rep("", 4))

anovaTable25 <- aov_car(final ~ sailor_equip+Ex_rank + Error(boots_id), 
                       data = endDefSea)
anovaTable26 <- nice(anovaTable25, MSE = FALSE, sig_symbols = rep("", 4))
endDef |>
  mutate(end = ifelse(final == 100, "100 defence", "destroyed")) |>
  tbl_cross(col = end, 
            row = oxford_item, 
            percent = "row", 
            digits = c(0, 1),
            label = list(end = "End result")) |>
  add_p() |>
  bold_labels() |>
as_gt() |>
    cols_label(label = html(local_image(
    file = "images/00201500.png", 
    height = 40))) |>
    opt_stylize(style = 3, color = "blue") |>
    tab_options(
      data_row.padding = px(2),
      summary_row.padding = px(3),
      grand_summary_row.padding = px(3),
      row_group.padding = px(4),
      footnotes.padding = px(2)
    ) |>
    tab_style(locations = cells_title (), 
              style = cell_borders(style = 'hidden')) |>
  tab_style(
    style = cell_text(size = px(16)),
    locations = cells_body()
    )

sagaris <- data.frame(
  attempt = c(0L,1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,
              11L,12L,13L,14L,15L,16L,17L,18L,19L,20L,21L,22L,
              23L),
  defence = c(0L,5L,9L,13L,18L,23L,28L,32L,35L,
              37L,41L,43L,46L,49L,54L,58L,64L,71L,76L,80L,85L,91L,
              97L,100L),
  increase = c(NA,5L,4L,4L,5L,5L,5L,4L,3L,2L,4L,
               2L,3L,3L,5L,4L,6L,7L,5L,4L,5L,6L,6L,3L)
)

apotolosAminaEX1 <- data.frame(
  attempt = c(0L,
              1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,
              11L,12L,13L,14L),
  defence = c(30L,
              36L,41L,46L,52L,57L,59L,62L,67L,
              73L,79L,85L,91L,97L,100L),
  increase = c(NA,
               6L,5L,5L,6L,5L,2L,3L,5L,6L,6L,6L,
               6L,6L,3L)
)

apotolosAminaEX2 <- data.frame(
  attempt = c(0L,
              1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,
              11L,12L,13L,14L,15L,16L,17L),
  defence = c(30L,
              34L,36L,41L,46L,51L,55L,57L,61L,
              67L,71L,74L,77L,82L,85L,92L,98L,100L),
  increase = c(NA,
               4L,2L,5L,5L,5L,4L,2L,4L,6L,4L,3L,
               3L,5L,3L,7L,6L,2L)
)

blackLandsknecht <- data.frame(
  attempt = c(0L,
              1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,
              11L,12L,13L,14L,15L,16L,17L,18L,19L,
              20L),
  attack = c(15L,
             20L,24L,26L,31L,34L,37L,43L,50L,
             57L,62L,67L,71L,74L,76L,78L,84L,87L,
             92L,96L,100L),
  increase = c(NA,
               5L,4L,2L,5L,3L,3L,6L,7L,7L,5L,5L,
               4L,3L,2L,2L,6L,3L,5L,4L,4L)
)

circlet1 <- data.frame(
  attempt = c(0L,1L,2L,3L,4L,
              5L,6L,7L,8L,9L,10L,11L,12L,13L,14L,15L,
              16L,17L,18L,19L,20L,21L,22L),
  attack = c(7L,12L,16L,20L,
             24L,26L,28L,32L,36L,40L,46L,52L,57L,62L,
             67L,69L,74L,79L,84L,88L,94L,96L,100L),
  increase = c(NA,5L,4L,4L,4L,
               2L,2L,4L,4L,4L,6L,6L,5L,5L,5L,2L,5L,5L,
               5L,4L,6L,2L,4L)
)

circlet2 <- data.frame(
  attempt = c(0L,1L,2L,3L,4L,
              5L,6L,7L,8L,9L,10L,11L,12L,13L,14L,15L,
              16L,17L,18L,19L,20L,21L,22L),
  attack = c(7L,10L,12L,16L,
             19L,23L,27L,31L,36L,38L,40L,42L,46L,53L,
             59L,64L,69L,73L,79L,83L,NA,NA,NA),
  increase = c(NA,3L,2L,4L,3L,
               4L,4L,4L,5L,2L,2L,2L,4L,7L,6L,5L,5L,4L,
               6L,4L,NA,NA,NA)
)

forging_table <- function(df, gear, title, picture){
  gt(df, rowname_col = "attempt") |>
  sub_missing() |>
  grand_summary_rows(
    columns = "increase",
    fns = list(Mean ~ mean(head(., -1), na.rm = T)),
    fmt = ~ fmt_number(., decimals = 2)) |>
  grand_summary_rows(
    columns = "increase",
    fns = list(Min ~ min(., na.rm = T), 
               Max ~ max(., na.rm = T)),
    fmt = ~ fmt_number(., decimals = 0)) |> 
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4)
  ) |>
  tab_stubhead(label = html(local_image(
    file = glue::glue("{picture}"), 
    height = 40))) |>
  tab_footnote(footnote = "Excluding the last, potentially capped, attempt", 
               locations = cells_stub_grand_summary(rows = "Mean")) |>
  tab_spanner(label = glue::glue("{gear}"), columns = c(2:3)) |>
  tab_header(title = glue::glue("{title}")) |>
  tab_options(heading.title.font.size = px(16)) |>
  tab_style(style = cell_text(color = "#777"), locations = cells_title ()) |>
  opt_stylize(style = 3, color = 'blue') |>
  tab_style(locations = cells_title (), style = cell_borders(style = 'hidden'))  |>
  tab_style(
    style = cell_text(size = px(15)),
    locations = cells_body()
    ) |>
      cols_align_decimal()
}

forging_table(sagaris, "Sagaris", "Sagaris weapon", "images/sagaris.png")
forging_table(apotolosAminaEX1, "Apotolos", "Apotolos Amina EX weapon","images/apotolos.png")
forging_table(apotolosAminaEX2, "Apotolos", "Another Apotolos Amina EX", "images/apotolos.png")

forging_table(blackLandsknecht, "Landsknecht", "Black Landsknecht Clothes", "images/landsknecht.png")
forging_table(circlet1, "Circlet", "Sea Dragon's Circlet (headgear)", "images/circlet.png")
forging_table(circlet2, "Circlet", "Another Sea Dragon's Circlet", "images/circlet.png") |> 
  rm_footnotes()
staffs <- readODS::read_ods("staff.ods", sheet =1)
staffs$outcome <- factor(staffs$outcome, levels = c("failure", "normal", "gs"))
staffs$oxford_item <- factor(staffs$oxford_item )
summary(staffs)

# make a data frame with the other recipes, and Oxford groups A and C (which are similar to the staff groups).
combined_df <- staffs |>
  select(valid_def_increase, outcome, oxford_item) |>
  filter(valid_def_increase >=1) |>
  rename(increase = valid_def_increase,
         Oxford = oxford_item) |>
  mutate(equipment = "staff")

combined_df <- head(sagaris, -1) |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- head(apotolosAminaEX1, -1) |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- head(apotolosAminaEX2, -1) |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- head(blackLandsknecht, -1) |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- head(circlet1, -1) |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- circlet2 |>
  filter(!is.na(increase)) |>
  select(increase) |>
  mutate(outcome = "gs",
         Oxford = "yes",
         equipment = "other") |>
  rbind(combined_df)

combined_df <- boots |>  
  filter(valid_def_increase >=1 & (treatment == "A" | treatment == "C")) |>
  select(valid_def_increase, outcome, oxford_item) |>
  rename(increase = valid_def_increase,
         Oxford = oxford_item) |>
  mutate(equipment = "boots") |>
  rbind(combined_df)

combined_df$equipment <- factor(combined_df$equipment, levels =c("other", "staff", "boots"))
tableStaffOutcome <- staffs |>
tbl_cross(row = outcome, 
          col = oxford_item,
          percent = "column", 
          digits = c(0, 1),
          label = list(oxford_item = "Oxford")) |>
  add_p() |>
  bold_labels() |>
#  show_header_names()
as_gt() |>
  cols_label(label = html(local_image(
    file = "images/HorusStaff.png", 
    height = 40))) |>
  tab_style(locations = cells_title (), 
              style = cell_borders(style = 'hidden')) |>
  tab_style(
    style = cell_text(size = px(16)),
    locations = cells_body()
    ) |>
    opt_stylize(style = 3, color = "blue") |>
    tab_options(
      data_row.padding = px(2),
      summary_row.padding = px(3),
      grand_summary_row.padding = px(3),
      row_group.padding = px(4),
      footnotes.padding = px(2)
    )


tableStaffOutcome
summarisedOxOutcomeStaff <- staffs |>
  group_by(oxford_item, outcome) |>
  summarise(meanDef = mean(valid_def_increase, na.rm = T),
            SDDef = sd(valid_def_increase, na.rm = T),
            n = n(),
            n_validDef = sum(valid_def_increase >= 1, na.rm = T)
            )

summarisedDefStaff <- summarisedOxOutcomeStaff |> 
    mutate(
    oxford_item = paste0("Oxford: ", oxford_item)) |>
   gt(groupname_col = "oxford_item", rowname_col = "outcome") |>
    cols_label(
    outcome = "Outcome",
    meanDef = "Mean",
    SDDef = "SD",
    n_validDef = "Valid n") |>
    tab_spanner(
    label = md('**Defence increase**'),
    columns = 3:6) |>
  sub_missing() |>
   fmt_number(
    columns = c("meanDef", "SDDef"),
    decimals = 2) |>
   summary_rows(
    fns = list("Total" = ~weighted.mean(., n_validDef)
    ), 
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef) |>
    summary_rows(
    fns = list(
      "Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(n, n_validDef)
   ) |>
  grand_summary_rows(
    fns = list("Grand Total" = ~weighted.mean(., n_validDef)
    ),
    fmt = ~ fmt_number(., decimals = 2),
    columns = meanDef
  ) |>
   grand_summary_rows(
    fns = list(
      "Grand Sum" = ~sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0, use_seps = FALSE),
    columns = c(n, n_validDef)
   ) |>
  opt_stylize(style = 3, color = "blue") |>
  # tab_stubhead(label = "Horus' Staffs") |>
  # tab_style(
  #   style = cell_fill(color = "lightblue"),
  #   locations = cells_stubhead()
  # ) |>
  tab_stubhead(label = html(local_image(
    file = "images/HorusStaff.png", 
    height = 40))) |>
  tab_style(
    locations = list(cells_stub_grand_summary(),cells_row_groups()),
    style = cell_text(weight = "bold")) |>
  tab_options(
    data_row.padding = px(2),
    summary_row.padding = px(3),
    grand_summary_row.padding = px(3),
    row_group.padding = px(4))

summarisedDefStaff
combinedBox <- combined_df |>
ggplot(aes(y=increase, x = outcome)) +
  geom_boxplot(aes(fill = equipment), 
               outliers = F, 
               staplewidth = 0.7, 
               position=position_dodge(0.8)) +
  geom_count(aes(fill = equipment), 
             alpha=0.6, 
             position = position_jitterdodge(0,0,0.81)) +
  scale_fill_brewer(palette = "Set1") +
  scale_x_discrete(drop = TRUE) +
  facet_wrap(vars(Oxford), nrow = 1, labeller =labeller(Oxford = label_both)) +
  theme(strip.text.x = element_text(
    size = 10, face = "bold"))
 # theme(axis.ticks.x = element_blank())

combinedBox
bootsSeq <- boots |>
  group_by(boots_id) |> #NA will be excluded since have no boots_id
  mutate(Sequence = 1:n()) |>
  ungroup() |>
    group_by(Sequence) |>
    summarise(
    normal = sum(outcome == "normal"),
    gs = sum(outcome == "gs"),
    failure = sum(outcome == "failure"),
    total = n()
  ) |>
  mutate(
    propGs = gs / total,
    propFailure = failure / total,
    propNormal = normal / total,
    propNonfailure = (normal + gs) / total
  )

sequence_fail <- ggplot(bootsSeq, aes(x = Sequence, y = propFailure))+
    geom_hline(aes(yintercept = overalProp$propFailure, 
                   colour = "mean proportion"),
              linetype = 1,
              linewidth =1) +
  scale_colour_manual(name = "Expected",
                      values = c("#377eb8")) +
  geom_line(colour = "grey30", linetype = 2)+
  geom_point(aes(fill = ifelse(Sequence == 1, "first", "later"), size = total), shape = 21) +
  scale_size_area() +
  scale_fill_manual(name = "When failed?",
                      values = c("#377eb8", "black")) +
  ylab("proportion failure") +
  xlab("position in the forging sequence") +
  coord_cartesian(xlim = c(1, 28)) +
  scale_x_continuous(breaks = c(1, 10, 20, 30),
                     minor_breaks = c(5, 15, 25)) +
  guides(size = guide_legend(title = "n boots", order = 0)) +
  theme(legend.position = "inside", legend.position.inside = c(0.29, 0.79),
        legend.box = "horizontal") +
  ylim(0,1)

sequence_gs <- ggplot(bootsSeq, aes(x = Sequence, y = propGs))+
    geom_hline(aes(yintercept = overalProp$propGs, colour = "mean proportion"),
              linetype = 1,
              linewidth =1) +
  scale_colour_manual(name = "Expected",
                      values = c("#377eb8")) +
  geom_line(colour = "grey30", linetype = 2) +
  geom_point(aes(size = total), alpha = 1) +
  scale_size_area() +
  ylab("proportion great success") +
  xlab("position in the forging sequence") +
  coord_cartesian(xlim = c(1, 28)) +
  scale_x_continuous(breaks = c(1, 10, 20, 30),
                     minor_breaks = c(5, 15, 25)) +
  guides(size = guide_legend(title = "n boots", order = 0)) +
  theme(legend.position = "inside", legend.position.inside = c(0.2, 0.79),
        legend.box = "horizontal") +
  ylim(0,1)

sequence_norm <- ggplot(bootsSeq, aes(x = Sequence, y = propNormal))+
  geom_hline(aes(yintercept = overalProp$propNormal, colour = "mean proportion"),
              linetype = 1,
              linewidth =1) +
  scale_colour_manual(name = "Expected",
                      values = c("#377eb8")) +
  geom_line(colour = "grey30", linetype = 2) +
  geom_point(aes(size = total), alpha = 1) +
  scale_size_area() +
  ylab("proportion normal success") +
  xlab("position in the forging sequence") +
  coord_cartesian(xlim = c(1, 28)) +
  scale_x_continuous(breaks = c(1, 10, 20, 30),
                     minor_breaks = c(5, 15, 25)) +
  guides(size = guide_legend(title = "n boots", order = 0)) +
  theme(legend.position = "inside", legend.position.inside = c(0.2, 0.79),
        legend.box = "horizontal") +
  ylim(0,1)

round(bootsSeq$propFailure[[1]]*100, 2) 
round(overalProp$propFailure*100, 2)
binom::binom.wilson(67, 481)

# runs-tests
boots2 <- boots |>
  filter(treatment_id >= 1) |> 
  arrange(treatment, treatment_id)

lagAdd <- boots2 |>
  group_by(treatment) |>
  summarise(n_treat = max(treatment_id)) |>
  # offset of each treatment = cumulative number of attempts in the preceding treatments
  # (NA for the first treatment; replaced by 0 after the join)
  mutate(previous = lag(cumsum(n_treat)))

boots3 <- left_join(boots2, lagAdd, by=join_by(treatment)) |>
  mutate(previous = case_match(previous, NA ~ 0, .default = previous)) |>
  mutate(overall_seq = treatment_id + previous)
boots3 |>
  ggplot(aes(x= overall_seq, fill=failure)) +
  geom_bar(linewidth = 0, colour = NA, width = 1)


broom::tidy(DescTools::RunsTest(boots$failure, exact = F))

split(boots2$failure, boots2$treatment) |>
  purrr::map(function (df)
  broom::tidy(DescTools::RunsTest(df, exact = T))
  )
#   broom::tidy(randtests::runs.test(as.numeric(df$failure), threshold = 1.5)))

sequence_raw <- boots2 |>
  ggplot(aes(x = treatment_id, fill = failure, colour = failure)) +
  geom_bar(linewidth = 0, colour = NA, width = 1) +
  facet_wrap(vars(treatment), ncol = 1, strip.position = "left") +
  theme(strip.text.y.left = element_text(angle = 0)) +
  scale_y_continuous(breaks = NULL) +
  scale_fill_manual(values = c("white", "#377eb8")) +
  ylab("treatment group") +
  xlab("sequence") +
  scale_x_continuous(breaks = seq(0, 300, 50),
                     expand = expansion(mult = .01))

sequence_fail
sequence_gs
sequence_norm
sequence_raw
pkg_sesh <- sessioninfo::session_info()
quarto_version <- system("quarto --version", intern = TRUE)
pkg_sesh$platform$quarto <- quarto_version
pkg_sesh
library(downloadthis)

boots |>
  download_this(
    output_name = "bootsData",
    output_extension = ".csv",
    button_label = "Download boots data as csv",
    button_type = "primary",
    has_icon = TRUE,
    icon = "fa fa-save",
    class = "hvr-sweep-to-left"
  )
staffs |>
  download_this(
    output_name = "staffData",
    output_extension = ".csv",
    button_label = "Download staff data as csv",
    button_type = "primary",
    has_icon = TRUE,
    icon = "fa fa-save",
    class = "hvr-sweep-to-left"
  )

Session information

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  nb_NO.utf8
 ctype    nb_NO.utf8
 tz       Europe/Oslo
 date     2025-01-02
 pandoc   3.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
 quarto   1.6.39

─ Packages ───────────────────────────────────────────────────────────────────
 package       * version    date (UTC) lib source
 abind           1.4-8      2024-09-12 [1] CRAN (R 4.4.1)
 afex          * 1.4-1      2024-09-01 [1] CRAN (R 4.4.1)
 backports       1.5.0      2024-05-23 [1] CRAN (R 4.3.3)
 base64enc       0.1-3      2015-07-28 [1] CRAN (R 4.3.0)
 bayestestR      0.15.0     2024-10-17 [1] CRAN (R 4.4.1)
 binom           1.1-1.1    2022-05-02 [1] CRAN (R 4.3.0)
 boot            1.3-31     2024-08-28 [1] CRAN (R 4.4.2)
 broom           1.0.7      2024-09-26 [1] CRAN (R 4.4.1)
 broom.helpers   1.17.0     2024-08-28 [1] CRAN (R 4.4.1)
 car             3.1-3      2024-09-27 [1] CRAN (R 4.4.1)
 carData         3.0-5      2022-01-06 [1] CRAN (R 4.3.0)
 cards           0.4.0      2024-11-27 [1] CRAN (R 4.4.1)
 cardx           0.2.2      2024-11-27 [1] CRAN (R 4.4.2)
 cellranger      1.1.0      2016-07-27 [1] CRAN (R 4.3.0)
 class           7.3-22     2023-05-03 [1] CRAN (R 4.4.2)
 cli             3.6.3      2024-06-21 [1] CRAN (R 4.4.1)
 coda            0.19-4.1   2024-01-31 [1] CRAN (R 4.3.2)
 codetools       0.2-20     2024-03-31 [2] CRAN (R 4.4.1)
 colorspace      2.1-1      2024-07-26 [1] CRAN (R 4.4.1)
 commonmark      1.9.2      2024-10-04 [1] CRAN (R 4.4.1)
 cowplot         1.1.3      2024-01-22 [1] CRAN (R 4.3.2)
 data.table      1.16.4     2024-12-06 [1] CRAN (R 4.4.2)
 datawizard      0.13.0     2024-10-05 [1] CRAN (R 4.4.1)
 DescTools       0.99.58    2024-11-08 [1] CRAN (R 4.4.2)
 digest          0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
 dplyr         * 1.1.4      2023-11-17 [1] CRAN (R 4.3.2)
 e1071           1.7-16     2024-09-16 [1] CRAN (R 4.4.1)
 effectsize      1.0.0      2024-12-10 [1] CRAN (R 4.4.1)
 emmeans         1.10.6     2024-12-12 [1] CRAN (R 4.4.1)
 estimability    1.5.1      2024-05-12 [1] CRAN (R 4.3.3)
 evaluate        1.0.1      2024-10-10 [1] CRAN (R 4.4.1)
 Exact           3.3        2024-07-21 [1] CRAN (R 4.3.3)
 expm            1.0-0      2024-08-19 [1] CRAN (R 4.4.1)
 farver          2.1.2      2024-05-13 [1] CRAN (R 4.3.3)
 fastmap         1.2.0      2024-05-15 [1] CRAN (R 4.3.3)
 forcats         1.0.0      2023-01-29 [1] CRAN (R 4.3.0)
 Formula         1.2-5      2023-02-24 [1] CRAN (R 4.3.0)
 fs              1.6.5      2024-10-30 [1] CRAN (R 4.4.1)
 generics        0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
 ggh4x         * 0.3.0      2024-12-15 [1] CRAN (R 4.4.1)
 ggplot2       * 3.5.1      2024-04-23 [1] CRAN (R 4.3.3)
 ggplotify       0.1.2      2023-08-09 [1] CRAN (R 4.4.2)
 ggtext        * 0.1.2      2022-09-16 [1] CRAN (R 4.3.0)
 gld             2.6.6      2022-10-23 [1] CRAN (R 4.3.0)
 glue            1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
 gridExtra       2.3        2017-09-09 [1] CRAN (R 4.3.0)
 gridGraphics    0.5-1      2020-12-13 [1] CRAN (R 4.3.0)
 gridtext        0.1.5      2022-09-16 [1] CRAN (R 4.3.0)
 gt            * 0.11.1     2024-10-04 [1] CRAN (R 4.4.1)
 gtable          0.3.6      2024-10-25 [1] CRAN (R 4.4.1)
 gtsummary     * 2.0.4      2024-11-30 [1] CRAN (R 4.4.2)
 haven           2.5.4      2023-11-30 [1] CRAN (R 4.3.2)
 hms             1.1.3      2023-03-21 [1] CRAN (R 4.3.0)
 htmltools       0.5.8.1    2024-04-04 [1] CRAN (R 4.3.3)
 htmlwidgets     1.6.4      2023-12-06 [1] CRAN (R 4.3.2)
 httr            1.4.7      2023-08-15 [1] CRAN (R 4.3.1)
 insight         1.0.0      2024-11-26 [1] CRAN (R 4.4.1)
 jsonlite        1.8.9      2024-09-20 [1] CRAN (R 4.4.1)
 knitr           1.49       2024-11-08 [1] CRAN (R 4.4.1)
 labeling        0.4.3      2023-08-29 [1] CRAN (R 4.3.1)
 labelled        2.13.0     2024-04-23 [1] CRAN (R 4.3.3)
 lattice         0.22-6     2024-03-20 [1] CRAN (R 4.3.3)
 legendry      * 0.2.0      2024-12-14 [1] CRAN (R 4.4.2)
 lemon           0.5.0      2024-11-10 [1] CRAN (R 4.4.1)
 lifecycle       1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
 lme4          * 1.1-35.5   2024-07-03 [1] CRAN (R 4.4.1)
 lmerTest        3.1-3      2020-10-23 [1] CRAN (R 4.3.0)
 lmom            3.2        2024-09-30 [1] CRAN (R 4.4.1)
 magrittr        2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
 markdown        1.13       2024-06-04 [1] CRAN (R 4.3.3)
 MASS            7.3-61     2024-06-13 [1] CRAN (R 4.4.1)
 Matrix        * 1.7-1      2024-10-18 [1] CRAN (R 4.4.1)
 minqa           1.2.8      2024-08-17 [1] CRAN (R 4.4.1)
 minty           0.0.4      2024-11-08 [1] CRAN (R 4.4.2)
 multcomp        1.4-26     2024-07-18 [1] CRAN (R 4.3.3)
 munsell         0.5.1      2024-04-01 [1] CRAN (R 4.3.3)
 mvtnorm         1.3-2      2024-11-04 [1] CRAN (R 4.4.2)
 nlme            3.1-166    2024-08-14 [1] CRAN (R 4.4.1)
 nloptr          2.1.1      2024-06-25 [1] CRAN (R 4.3.3)
 numDeriv        2016.8-1.1 2019-06-06 [1] CRAN (R 4.3.0)
 parameters      0.24.0     2024-11-27 [1] CRAN (R 4.4.1)
 pillar          1.10.0     2024-12-17 [1] CRAN (R 4.4.1)
 pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
 plyr            1.8.9      2023-10-02 [1] CRAN (R 4.3.1)
 proxy           0.4-27     2022-06-09 [1] CRAN (R 4.3.0)
 purrr           1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
 R6              2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
 RColorBrewer    1.1-3      2022-04-03 [1] CRAN (R 4.3.0)
 Rcpp            1.0.13-1   2024-11-02 [1] CRAN (R 4.4.1)
 readODS         2.3.1      2024-11-05 [1] CRAN (R 4.4.2)
 readxl          1.4.3      2023-07-06 [1] CRAN (R 4.3.1)
 reshape2        1.4.4      2020-04-09 [1] CRAN (R 4.3.0)
 rlang           1.1.4      2024-06-04 [1] CRAN (R 4.3.3)
 rmarkdown       2.29       2024-11-04 [1] CRAN (R 4.4.1)
 rootSolve       1.8.2.4    2023-09-21 [1] CRAN (R 4.3.1)
 rstudioapi      0.17.1     2024-10-22 [1] CRAN (R 4.4.1)
 sandwich        3.1-1      2024-09-15 [1] CRAN (R 4.4.1)
 sass            0.4.9      2024-03-15 [1] CRAN (R 4.3.3)
 scales          1.3.0      2023-11-28 [1] CRAN (R 4.3.2)
 sessioninfo     1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 stringi         1.8.4      2024-05-06 [1] CRAN (R 4.3.3)
 stringr         1.5.1      2023-11-14 [1] CRAN (R 4.3.1)
 survival        3.8-3      2024-12-17 [1] CRAN (R 4.4.2)
 TH.data         1.1-2      2023-04-17 [1] CRAN (R 4.3.0)
 tibble          3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidyr         * 1.3.1      2024-01-24 [1] CRAN (R 4.3.2)
 tidyselect      1.2.1      2024-03-11 [1] CRAN (R 4.3.3)
 tzdb            0.4.0      2023-05-12 [1] CRAN (R 4.3.0)
 utf8            1.2.4      2023-10-22 [1] CRAN (R 4.3.2)
 vctrs           0.6.5      2023-12-01 [1] CRAN (R 4.3.2)
 withr           3.0.2      2024-10-28 [1] CRAN (R 4.4.1)
 xfun            0.49       2024-10-31 [1] CRAN (R 4.4.1)
 xml2            1.3.6      2023-12-04 [1] CRAN (R 4.3.2)
 xtable          1.8-4      2019-04-21 [1] CRAN (R 4.3.0)
 yaml            2.3.10     2024-07-26 [1] CRAN (R 4.4.1)
 yulab.utils     0.1.8      2024-11-07 [1] CRAN (R 4.4.1)
 zip             2.3.1      2024-01-27 [1] CRAN (R 4.3.2)
 zoo             1.8-12     2023-04-13 [1] CRAN (R 4.3.0)

 [1] C:/R/RLibs
 [2] C:/Program Files/R/R-4.4.1/library

──────────────────────────────────────────────────────────────────────────────

Raw data