The score distribution of real images is summarized below:
Score | Proportion |
---|---|
1 | 0.030 |
2 | 0.273 |
3 | 0.515 |
4 | 0.182 |
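
As a quick sanity check on the table above, the following minimal sketch confirms that the proportions sum to one and computes the mean and standard deviation of the score. The names `proportions` and `summarize` are illustrative and do not come from the original analysis; they are not necessarily the inputs the sample-size calculation used.

```python
# Score distribution of real images, copied from the table above.
proportions = {1: 0.030, 2: 0.273, 3: 0.515, 4: 0.182}


def summarize(dist):
    """Return (total probability, mean score, standard deviation)."""
    total = sum(dist.values())
    mean = sum(score * p for score, p in dist.items()) / total
    var = sum(p * (score - mean) ** 2 for score, p in dist.items()) / total
    return total, mean, var ** 0.5


total, mean, sd = summarize(proportions)
print(f"total={total:.3f} mean={mean:.3f} sd={sd:.3f}")
```

The mean of roughly 2.85 on the 1-to-4 scale gives a reference point for reading the tolerance levels.
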
Based on this distribution, we calculate the total number of images (real and synthetic in a 1:1 ratio) needed for a two-sided test to reach significance when synthetic images are worse than real ones by a given proportion (i.e., the tolerance level). The calculated sample sizes, as a function of the tolerance level at 80%, 70%, and 60% power, are plotted below.
The exact sample sizes for selected tolerance levels are tabulated below.
Tolerance level | 60% power | 70% power | 80% power |
---|---|---|---|
0.08 | 291 | 367 | 466 |
0.1 | 158 | 199 | 253 |
0.12 | 90 | 113 | 143 |
0.14 | 50 | 63 | 80 |
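
One way to read the table across columns is sketched below. Under a normal approximation, the required sample size at a fixed effect size scales with (z_{1-α/2} + z_{power})², so the 60% and 70% columns can be predicted from the 80% column. The significance level `alpha = 0.05` is an assumption (it is not stated in the text), and `power_factor` is a hypothetical helper. This checks only the scaling between power levels; it is not the method used to produce the table.

```python
from statistics import NormalDist

# Total sample sizes from the table above: tolerance -> (60%, 70%, 80% power).
table = {
    0.08: (291, 367, 466),
    0.10: (158, 199, 253),
    0.12: (90, 113, 143),
    0.14: (50, 63, 80),
}
powers = (0.60, 0.70, 0.80)


def power_factor(power, alpha=0.05):
    """(z_{1-alpha/2} + z_{power})^2 for a two-sided test; alpha is assumed."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) ** 2


base = power_factor(0.80)
predicted = {}
for tol, sizes in table.items():
    n80 = sizes[-1]
    # Rescale the 80%-power sample size to each of the listed power levels.
    predicted[tol] = tuple(round(n80 * power_factor(p) / base) for p in powers)
    print(tol, "tabulated:", sizes, "rescaled from 80%:", predicted[tol])
```

With the assumed 5% level, the rescaled values agree with the tabulated ones to within one image, which is consistent with, but does not confirm, a normal-approximation calculation.
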