1. Purpose of this analysis

The original PMA data file contains one record for each satisfying individual selection. Each PMA record has the structure:

pma(CTID, Type, NInd, ConjID, Prefix, Conds, Alts, In, Result)

The goal of this analysis is to summarize the large PMA output into interpretable patterns. I am especially interested in which axiom components produce many satisfying cases, which individuals and variable patterns are common, which condition and alternative predicates dominate, and whether any support or assumption patterns look unusual.

2. Basic PMA record volume

total_satisfying_records rows_in_pma_summary
3926056 563

3. Question 1: Which rule types produce the most satisfying records?

Type total_count percent
cona 3842801 97.8794240
unca 36147 0.9206950
horn 31124 0.7927549
cscj 15984 0.4071261

Interpretation to write: This table shows whether the satisfying records are spread across rule types or concentrated in one type. If one type dominates, then the later analysis should pay special attention to that rule type.

4. Question 2: Which CTIDs dominate the satisfying records?

CTID Type NInd total_count percent
62 cona 38 1052096 26.797784
227 cona 98 211944 5.398395
230 cona 98 211944 5.398395
200 cona 97 117744 2.999040
204 cona 97 117744 2.999040
199 cona 97 106912 2.723140
203 cona 97 106912 2.723140
224 cona 98 105972 2.699197
233 cona 98 105972 2.699197
201 cona 97 86488 2.202923
202 cona 97 86488 2.202923
205 cona 97 86488 2.202923
206 cona 97 86488 2.202923
223 cona 98 60102 1.530849
232 cona 98 60102 1.530849
225 cona 98 54648 1.391931
228 cona 98 54648 1.391931
447 cona 172 46656 1.188368
450 cona 172 46656 1.188368
453 cona 172 46656 1.188368

Interpretation to write: CTIDs with very high counts are important because they account for a large part of the satisfying model-checking output. These are good candidates for closer inspection.

5. Question 3: Are satisfying records concentrated in a few CTIDs?

CTID Type NInd total_count percent rank cumulative_count cumulative_percent
62 cona 38 1052096 26.7977838 1 1052096 26.79778
227 cona 98 211944 5.3983947 2 1264040 32.19618
230 cona 98 211944 5.3983947 3 1475984 37.59457
200 cona 97 117744 2.9990403 4 1593728 40.59361
204 cona 97 117744 2.9990403 5 1711472 43.59265
199 cona 97 106912 2.7231400 6 1818384 46.31579
203 cona 97 106912 2.7231400 7 1925296 49.03893
224 cona 98 105972 2.6991974 8 2031268 51.73813
233 cona 98 105972 2.6991974 9 2137240 54.43733
201 cona 97 86488 2.2029232 10 2223728 56.64025
202 cona 97 86488 2.2029232 11 2310216 58.84317
205 cona 97 86488 2.2029232 12 2396704 61.04610
206 cona 97 86488 2.2029232 13 2483192 63.24902
223 cona 98 60102 1.5308493 14 2543294 64.77987
232 cona 98 60102 1.5308493 15 2603396 66.31072
225 cona 98 54648 1.3919312 16 2658044 67.70265
228 cona 98 54648 1.3919312 17 2712692 69.09458
447 cona 172 46656 1.1883682 18 2759348 70.28295
450 cona 172 46656 1.1883682 19 2806004 71.47132
453 cona 172 46656 1.1883682 20 2852660 72.65969
226 cona 98 40068 1.0205662 21 2892728 73.68025
229 cona 98 40068 1.0205662 22 2932796 74.70082
183 cona 95 37926 0.9660076 23 2970722 75.66683
65 cona 38 37380 0.9521005 24 3008102 76.61893
372 cona 146 33924 0.8640733 25 3042026 77.48300
373 cona 146 33924 0.8640733 26 3075950 78.34707
374 cona 146 33924 0.8640733 27 3109874 79.21115
474 cona 172 30492 0.7766573 28 3140366 79.98780
211 cona 98 27528 0.7011617 29 3167894 80.68897
213 cona 98 27528 0.7011617 30 3195422 81.39013

Interpretation to write: If the curve rises very quickly, then a small number of axiom components explain most of the satisfying records. This helps prioritize manual review.

6. Question 4: Which original axiom IDs, NInd, dominate?

NInd total_count number_of_ctids percent
98 1209712 23 30.8123980
38 1116356 4 28.4345409
97 968280 23 24.6629187
172 257425 30 6.5568346
146 198784 30 5.0631983
95 43464 3 1.1070652
171 23728 30 0.6043724
188 12638 9 0.3219007
140 7296 1 0.1858354
47 6610 12 0.1683623
94 6325 7 0.1611032
236 4792 4 0.1220563
55 4672 1 0.1189998
71 4590 5 0.1169112
91 3711 4 0.0945223
165 3312 1 0.0843595
45 2757 1 0.0702231
113 2561 19 0.0652309
13 2416 2 0.0615376
8 1912 2 0.0487003

Interpretation to write: NInd summarizes the original internal axiom ID. Large NInd counts suggest that a small number of original axioms may be responsible for many satisfying examples.

7. Question 5: Which individuals appear most often in variable positions?

individual total_count percent
1dt 5018562 15.1611436
A 2335072 7.0542840
1dt-2 1651124 4.9880679
0dt3 1539687 4.6514152
0dt2 1538597 4.6481223
0dt 1244410 3.7593794
B 1220676 3.6876787
01dt 1162148 3.5108648
space 1159249 3.5021069
1dte 1110342 3.3543582
1dts 1110342 3.3543582
st1 1061452 3.2066608
2dsr 919714 2.7784684
st2dsr 855818 2.5854377
1dsr 787796 2.3799424
0dsr 787780 2.3798940
st0dsr 784264 2.3692721
st1dsr 784264 2.3692721
st3-2 751838 2.2713128
st3-3 669549 2.0227166
space2 637187 1.9249505
3dsr-2 623838 1.8846230
3dsr-3 472180 1.4264622
me 468540 1.4154657
bound 375241 1.1336081
1db 305374 0.9225390
0db 304997 0.9214001
ag 290912 0.8788491
C 280447 0.8472342
3dsr 255703 0.7724822
variable_position individual total_count
1 st0dsr 401958
1 st1dsr 401958
1 st2dsr 341757
1 st3-2 341718
1 st1 295506
1 st3-3 279091
1 space2 220549
1 me 211166
1 space 189743
1 ag 137818
2 st1 646518
2 st2dsr 423942
2 st3-2 304235
2 st3-3 303125
2 st0dsr 290367
2 st1dsr 290367
2 space2 279793
2 space 234022
2 me 153758
2 2dsr 132400
3 0dt3 603580
3 0dt2 602534
3 1dt-2 567825
3 0dt 462814
3 1dt 433740
3 01dt 416188
3 1dts 386276
3 1dte 386268
3 A 12952
3 sdc2 5129
4 0dt3 349036
4 0dt2 348932
4 1dt-2 317494
4 0dt 304413
4 01dt 291724
4 1dte 284374
4 1dts 284189
4 1dt 270779
4 0dsr 152913
4 1dsr 152896
5 1dt 1714014
5 A 252470
5 1dt-2 134044
5 0dsr 81128
5 1dsr 81128
5 2dsr 78997
5 space 76099
5 3dsr-2 74831
5 3dsr 74524
5 space2 69955
6 1dt 1754638
6 A 224553
6 1dt-2 84428
6 hist2 78612
6 hist 72402
6 hist3 72258
6 st3-2 71076
6 st0dsr 68640
6 st1dsr 68640
6 st2dsr 67728
7 A 927632
7 1dt 381094
7 1dt-2 167026
7 B 109296
7 0dt2 78486
7 0dt3 78462
7 0dt 75572
7 01dt 74964
7 1dte 74546
7 1dts 74546
8 0dsr 290347
8 1dsr 290347
8 B 286880
8 3dsr-2 222482
8 2dsr 220086
8 space 175637
8 3dsr-3 151745
8 1dt 103176
8 1dt-2 58712
8 0dt2 38360
9 space 315580
9 1dt 189552
9 2dsr 185592
9 B 140238
9 1dt-2 91960
9 0dsr 64960
9 1dsr 64960
9 3dsr-2 64240
9 3dsr-3 63016
9 0dt2 56696
10 0dt2 205290
10 0dt3 205290
10 B 172976
10 1dt-2 135276
10 0dt 132200
10 01dt 130696
10 1dte 129740
10 1dts 129740
10 C 80136
10 1dt 64512
11 A 268880
11 C 172976
11 0dt2 81144
11 0dt3 81144
11 1dt-2 80784
11 0dt 78888
11 01dt 78648
11 1dte 78444
11 1dts 78444
11 1dt 78336
12 A 529860
12 space 72328
12 3dsr-2 61888
12 3dsr-3 61888
12 0dsr 60928
12 1dsr 60928
12 2dsr 60928
12 space2 7488
12 3dsr 424
13 B 423888

Interpretation to write: This tells us whether a few domain individuals are responsible for many variable assignments. If a small number of individuals dominate, then the satisfying cases may be concentrated around specific model objects.

8. Question 6: Are variable assignments mostly repeated or distinct?

n_vars n_unique_individuals equality_pattern total_count percent
6 6 X1,X2,X3,X4,X5,X6 260964 6.6469777
6 5 X1,X1,X2,X3,X4,X5 200176 5.0986550
6 5 X1,X2,X3,X2,X4,X5 196082 4.9943773
6 5 X1,X2,X3,X1,X4,X5 188622 4.8043647
6 4 X1,X1,X2,X1,X3,X4 125644 3.2002608
7 7 X1,X2,X3,X4,X5,X6,X7 123276 3.1399458
8 7 X1,X2,X3,X4,X5,X5,X6,X7 117508 2.9930299
5 5 X1,X2,X3,X4,X5 112235 2.8587221
8 8 X1,X2,X3,X4,X5,X6,X7,X8 71280 1.8155630
4 4 X1,X2,X3,X4 41073 1.0461647
7 6 X1,X2,X3,X4,X5,X1,X6 34164 0.8701865
8 6 X1,X1,X2,X3,X4,X4,X5,X6 28668 0.7301986
7 6 X1,X2,X3,X4,X2,X5,X6 27216 0.6932149
8 7 X1,X2,X3,X3,X4,X5,X6,X7 27208 0.6930112
8 7 X1,X2,X3,X4,X4,X5,X6,X7 26564 0.6766079
10 9 X1,X2,X3,X4,X5,X5,X6,X7,X8,X9 26160 0.6663177
11 9 X1,X2,X3,X4,X5,X5,X6,X5,X7,X8,X9 25200 0.6418657
11 9 X1,X2,X3,X4,X5,X5,X6,X7,X7,X8,X9 25200 0.6418657
12 10 X1,X2,X3,X4,X5,X5,X6,X7,X5,X8,X9,X10 25200 0.6418657
12 10 X1,X2,X3,X4,X5,X5,X6,X7,X8,X8,X9,X10 25200 0.6418657
3 3 X1,X2,X3 24099 0.6138223
8 6 X1,X2,X3,X4,X4,X4,X5,X6 23996 0.6111988
9 8 X1,X2,X3,X4,X5,X5,X6,X7,X8 23956 0.6101799
6 5 X1,X2,X3,X4,X1,X5 22986 0.5854732
6 5 X1,X2,X3,X4,X2,X5 22100 0.5629060
8 6 X1,X2,X3,X3,X4,X4,X5,X6 20356 0.5184848
8 6 X1,X2,X3,X4,X3,X3,X5,X6 20356 0.5184848
12 8 X1,X1,X2,X3,X4,X4,X5,X6,X4,X7,X8,X6 20160 0.5134926
12 8 X1,X1,X2,X3,X4,X4,X5,X6,X7,X7,X8,X6 20160 0.5134926
13 9 X1,X1,X2,X3,X4,X4,X4,X5,X5,X6,X7,X8,X9 20160 0.5134926
CTID Type NInd total_count weighted_avg_unique_individuals weighted_avg_vars weighted_avg_uniqueness_ratio
569 cona 230 12 1.000000 3 0.3333333
572 cona 231 12 1.000000 3 0.3333333
241 cona 105 8 1.000000 3 0.3333333
249 cona 109 24 1.500000 4 0.3750000
250 cona 109 24 1.500000 4 0.3750000
290 unca 113 355 2.000000 4 0.5000000
293 unca 113 355 2.000000 4 0.5000000
152 horn 79 41 1.000000 2 0.5000000
537 horn 200 41 1.000000 2 0.5000000
571 cona 230 24 1.500000 3 0.5000000
574 cona 231 24 1.500000 3 0.5000000
246 cona 108 12 1.000000 2 0.5000000
252 cona 110 12 1.000000 2 0.5000000
255 cona 111 12 1.000000 2 0.5000000
539 horn 202 8 1.500000 3 0.5000000
540 horn 203 8 1.500000 3 0.5000000
561 cona 223 8 1.000000 2 0.5000000
642 horn 309 4 1.000000 2 0.5000000
643 horn 310 4 1.000000 2 0.5000000
738 horn 205 4 1.000000 2 0.5000000
739 horn 221 4 1.000000 2 0.5000000
155 unca 80 1 1.000000 2 0.5000000
596 cona 236 1048 2.687023 5 0.5374046
175 cona 93 20 2.200000 4 0.5500000
734 horn 90 10 2.200000 4 0.5500000
600 cona 237 96 3.500000 6 0.5833333
251 cona 109 18 2.333333 4 0.5833333
243 cona 106 12 2.333333 4 0.5833333
199 cona 97 106912 7.020316 12 0.5850263
203 cona 97 106912 7.020316 12 0.5850263

Interpretation to write: A low uniqueness ratio means several variable slots are filled by the same individual. That can reveal self-relations or variable collapse. These cases may be important because some axioms may be satisfied mainly when variables are not distinct.

9. Question 7: Which condition and alternative predicates are most common?

field predicate sign arity total_count percent_within_field
Alts temporally-projects-onto 0 2 3384319 17.4799157
Alts spatially-projects-onto 0 3 2797416 14.4485777
Alts occurrent-part-of 0 2 2046609 10.5706799
Alts temporal-part-of 0 2 1972741 10.1891537
Alts spatially-projects-onto -1 3 1617272 8.3531660
Alts instance-of 0 3 1216375 6.2825438
Alts continuant-part-of 0 3 1191818 6.1557076
Alts instance-of -1 3 1165519 6.0198740
Alts exists-at 0 2 1003450 5.1827920
Alts occurrent-part-of -1 2 789436 4.0774155
Alts temporal-part-of -1 2 478455 2.4712071
Alts continuant-part-of -1 3 321693 1.6615356
Alts exists-at -1 2 257144 1.3281418
Alts temporally-projects-onto -1 2 235754 1.2176630
Alts occupies-spatiotemporal-region 0 2 227113 1.1730325
Alts occupies-spatial-region 0 3 198999 1.0278244
Alts identical 0 2 150426 0.7769462
Alts occupies-spatial-region -1 3 89042 0.4598995
Alts occupies-spatiotemporal-region -1 2 77143 0.3984415
Alts identical -1 2 58792 0.3036591
Conds instance-of -1 3 9697337 80.0834038
Conds continuant-part-of -1 3 2259297 18.6579258
Conds occurrent-part-of -1 2 97955 0.8089406
Conds temporal-part-of -1 2 17507 0.1445779
Conds exists-at -1 2 5642 0.0465933
Conds inheres-in -1 2 5251 0.0433643
Conds precedes -1 2 5192 0.0428770
Conds concretizes -1 3 4716 0.0389461
Conds located-in -1 3 2188 0.0180691
Conds has-first-instant -1 2 2044 0.0168799
Conds has-last-instant -1 2 2044 0.0168799
Conds occupies-spatial-region -1 3 1432 0.0118259
Conds occupies-temporal-region -1 2 1057 0.0087290
Conds proper-continuant-part-of -1 3 1050 0.0086712
Conds universal -1 1 951 0.0078536
Conds proper-temporal-part-of -1 2 878 0.0072508
Conds proper-occurrent-part-of -1 2 804 0.0066397
Conds occupies-spatiotemporal-region -1 2 782 0.0064580
Conds specifically-depends-on -1 2 722 0.0059625
Conds has-material-basis -1 3 584 0.0048228
CTID Type NInd field predicate total_count
62 cona 38 Conds instance-of 4208384
62 cona 38 Alts instance-of 2104192
62 cona 38 Conds continuant-part-of 2104192
227 cona 98 Alts spatially-projects-onto 635832
230 cona 98 Alts spatially-projects-onto 635832
199 cona 97 Alts spatially-projects-onto 427648
203 cona 97 Alts spatially-projects-onto 427648
227 cona 98 Alts temporal-part-of 423888
227 cona 98 Alts temporally-projects-onto 423888
227 cona 98 Conds instance-of 423888
230 cona 98 Alts temporal-part-of 423888
230 cona 98 Alts temporally-projects-onto 423888
230 cona 98 Conds instance-of 423888
224 cona 98 Alts temporal-part-of 317916
201 cona 97 Alts spatially-projects-onto 259464
202 cona 97 Alts occurrent-part-of 259464
205 cona 97 Alts spatially-projects-onto 259464
206 cona 97 Alts occurrent-part-of 259464
200 cona 97 Alts spatially-projects-onto 235488
200 cona 97 Conds instance-of 235488
204 cona 97 Alts spatially-projects-onto 235488
204 cona 97 Conds instance-of 235488
199 cona 97 Alts occurrent-part-of 213824
199 cona 97 Alts temporally-projects-onto 213824
199 cona 97 Conds instance-of 213824
203 cona 97 Alts occurrent-part-of 213824
203 cona 97 Alts temporally-projects-onto 213824
203 cona 97 Conds instance-of 213824
224 cona 98 Alts spatially-projects-onto 211944
224 cona 98 Alts temporally-projects-onto 211944
224 cona 98 Conds instance-of 211944
227 cona 98 Alts continuant-part-of 211944
227 cona 98 Alts exists-at 211944
230 cona 98 Alts continuant-part-of 211944
230 cona 98 Alts exists-at 211944
233 cona 98 Alts spatially-projects-onto 211944
233 cona 98 Alts temporal-part-of 211944
233 cona 98 Alts temporally-projects-onto 211944
233 cona 98 Conds instance-of 211944
447 cona 172 Alts occurrent-part-of 186624

Interpretation to write: The Conds rows describe the if-side of the rule, while the Alts rows describe the then-side that was satisfied. This helps identify what predicates are driving the satisfying PMA records.

10. Question 8: Which support types are most common?

support_kind total_count percent
inst 15248853 78.7599117
mca 3903115 20.1594830
ident 150426 0.7769462
distinct 58792 0.3036591
CTID Type NInd support_kind total_count
62 cona 38 inst 2104192
227 cona 98 inst 1695552
230 cona 98 inst 1695552
224 cona 98 inst 847776
233 cona 98 inst 847776
199 cona 97 inst 748384
203 cona 97 inst 748384
201 cona 97 inst 432440
202 cona 97 inst 432440
205 cona 97 inst 432440
206 cona 97 inst 432440
223 cona 98 inst 240408
232 cona 98 inst 240408
200 cona 97 inst 235488
200 cona 97 mca 235488
204 cona 97 inst 235488
204 cona 97 mca 235488
211 cona 98 inst 220224
227 cona 98 mca 211944
230 cona 98 mca 211944
213 cona 98 inst 192696
216 cona 98 inst 192696
219 cona 98 inst 192696
201 cona 97 mca 172976
202 cona 97 mca 172976
205 cona 97 mca 172976
206 cona 97 mca 172976
226 cona 98 inst 160272
229 cona 98 inst 160272
65 cona 38 inst 149520
447 cona 172 inst 139968
450 cona 172 inst 139968
453 cona 172 inst 139968
372 cona 146 inst 135696
373 cona 146 inst 135696
374 cona 146 inst 135696
223 cona 98 mca 120204
183 cona 95 inst 113778
225 cona 98 mca 109296
228 cona 98 mca 109296

Interpretation to write: The support field tells us what kind of fact or inference supported the satisfied alternative. If a few support kinds dominate, this may explain why certain alternatives are repeatedly satisfied.

11. Question 9: Is Result usually empty, and when is it not empty?

result_empty total_count percent
yes 3926055 100
CTID Type NInd ConjID result_empty result_raw count

Interpretation to write: If Result is almost always empty, then assumptions are probably not a major driver of the satisfying records. If there are non-empty Result cases, those CTIDs should be reviewed.

12. Candidate CTIDs for closer inspection

This section combines several signals:

  1. high total satisfying count,
  2. low variable uniqueness ratio,
  3. repeated equality patterns,
  4. frequent condition or alternative predicates.
CTID Type NInd total_count percent weighted_avg_uniqueness_ratio
62 cona 38 1052096 26.7977838 0.8246472
227 cona 98 211944 5.3983947 0.6929327
230 cona 98 211944 5.3983947 0.6929327
200 cona 97 117744 2.9990403 0.8639506
204 cona 97 117744 2.9990403 0.8639506
199 cona 97 106912 2.7231400 0.5850263
203 cona 97 106912 2.7231400 0.5850263
233 cona 98 105972 2.6991974 0.6371023
224 cona 98 105972 2.6991974 0.6673437
202 cona 97 86488 2.2029232 0.7533615
206 cona 97 86488 2.2029232 0.7533615
201 cona 97 86488 2.2029232 0.7739147
205 cona 97 86488 2.2029232 0.7739147
232 cona 98 60102 1.5308493 0.7682565
223 cona 98 60102 1.5308493 0.7940057
225 cona 98 54648 1.3919312 0.9277244
228 cona 98 54648 1.3919312 0.9277244
453 cona 172 46656 1.1883682 0.7489712
447 cona 172 46656 1.1883682 0.8918283
450 cona 172 46656 1.1883682 0.8918283
226 cona 98 40068 1.0205662 0.8146052
229 cona 98 40068 1.0205662 0.8146052
183 cona 95 37926 0.9660076 0.6683014
65 cona 38 37380 0.9521005 0.6871589
374 cona 146 33924 0.8640733 0.6905288
372 cona 146 33924 0.8640733 0.8155288
373 cona 146 33924 0.8640733 0.8155288
474 cona 172 30492 0.7766573 0.9551882
211 cona 98 27528 0.7011617 0.6426257
219 cona 98 27528 0.7011617 0.6426257

Interpretation to write: These CTIDs are not automatically wrong. They are priority cases for inspection because they are high-volume and may have repeated-variable patterns.

13. Initial conclusions

Write conclusions in this form:

  1. The satisfying PMA records are [concentrated / not concentrated] by rule type.
  2. The largest CTIDs are [list top CTIDs].
  3. The most common individuals are [list top individuals].
  4. The most common variable equality patterns are [list top patterns].
  5. The most common condition predicates are [list top Conds predicates].
  6. The most common alternative predicates are [list top Alts predicates].
  7. Result is [mostly empty / sometimes non-empty], which suggests [interpretation].
  8. The first CTIDs recommended for manual review are [list].

14. Next step

The next step is to extract a small number of original PMA records for the highest-priority CTIDs. Summary tables show where the patterns are, but the original PMA records are needed to interpret whether the satisfying combinations are intended or unintended.