Background

The purpose of this script is to analyze the text of the pilot data of the American’s Dream project.

For context, this is what we did:

Participants listed at least five guiding values for each of the following three perspectives: US on paper (constitution); US in practice (government); and personal ideal (if participants could design a country from scratch).

Then, they defined each of the values.

And then, they rank-ordered them in terms of importance for each of the perspectives.

Ok, before we start, let’s look at some demographics and political ideology:

Demographics & political ideology

Race

race N Perc
asian 10 10
black 12 12
hispanic 7 7
multiracial 3 3
white 67 67
NA 1 1

Gender

gender N Perc
man 61 61
woman 39 39

Education

edu N Perc
GED 21 21
2yearColl 16 16
4yearColl 41 41
MA 19 19
PHD 3 3

Income

Political Ideology

We asked for political ideology in the following way: Participants selected all that apply from a list of six ideologies or wrote in their own ideology. Each selected ideology was then rated on a scale from 0 (do not subscribe to it at all) to 100 (subscribe to it to a great extent).
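As a rough illustration of how a select-all-that-apply item with follow-up ratings can be summarised, here is a minimal base R sketch. It assumes the Score column in the table below is the mean rating among participants who selected that ideology (the report does not say so explicitly). The column names mirror `df_amd_ideo` (`PID`, `ideo`, `ideo_score`), but `toy_ideo` and its numbers are invented.

```r
# Hypothetical long-format ratings: one row per participant x selected ideology.
# Column names mirror df_amd_ideo; the numbers are invented for illustration.
toy_ideo <- data.frame(
  PID        = c(1, 1, 2, 3, 3, 4),
  ideo       = c("Liberalism", "Progressivism", "Conservatism",
                 "Liberalism", "Libertarianism", "Conservatism"),
  ideo_score = c(80, 60, 90, 70, 50, 70)
)

# N = how many participants selected each ideology
n_per_ideo <- table(toy_ideo$ideo)

# Score = mean 0-100 rating among those who selected it (assumed definition)
mean_per_ideo <- tapply(toy_ideo$ideo_score, toy_ideo$ideo, mean)

ideo_summary <- data.frame(
  ideo  = names(n_per_ideo),
  N     = as.integer(n_per_ideo),
  Score = round(as.numeric(mean_per_ideo[names(n_per_ideo)]), 2)
)

# Most frequently selected ideologies first
ideo_summary <- ideo_summary[order(-ideo_summary$N), ]
print(ideo_summary)
```

Because participants could pick several ideologies, N can sum to more than the number of participants.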

Count per ideology

ideo N Score
Conservatism 40 70.88
Liberalism 33 69.48
Democratic Socialism 17 74.00
Progressivism 15 69.00
Libertarianism 8 73.25
Right-Wing Nationalism 2 92.50
Centrist 1 100.00
Christ follower 1 92.00
I’m very moderate 1 100.00
Independent 1 88.00
Independent. 1 90.00
None. They all have their faults or sell an unrealistic dream of some sort. 1 100.00
none 1 3.00
pragmatic libertarianism 1 61.00

Party ID

party_id N Perc
Democrat 41 41
Independent 26 26
Republican 33 33

Values

Before we get into word embedding and clustering, let’s just take a brief look at the values people wrote in for each of the three perspectives. Does it pass the smell test?

US On Paper

The prompt:

First, we want you to think of the United States. Since its independence and onwards, the formation of the US as a sovereign country was based on a number of values, all of which were inscribed in the constitution. This document, importantly, has evolved since its inception.

ON PAPER, what are the values that the US stands for?

List at least FIVE values.


Top 30 words:

value N
freedom 56
equality 45
liberty 37
justice 29
democracy 22
independence 16
diversity 14
individualism 14
freedom of speech 13
unity 12
free speech 8
freedom of religion 8
opportunity 8
equality 7
democracy 5
life 5
patriotism 5
fairness 4
hard work 4
liberty 4
progress 4
pursuit of happiness 4
diversity 3
education 3
equal opportunity 3
integrity 3
religion 3
representative government 3
self-government 3
strength 3

Alright, this makes sense. All pretty consistent with what we’d imagine, I think.
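Some values show up twice in the list above (for example, equality at 45 and again at 7). One guess, based on the trailing-space clean-up later in this report, is that these are variants that differ only in whitespace. The sketch below uses invented responses to show how trimming before counting merges such variants; it is not the code that produced the table.

```r
# Invented responses; some differ only by a trailing space.
raw_values <- c("freedom", "freedom ", "equality ", "equality", "liberty", "freedom")

# Counting as typed: whitespace variants are tallied separately
counts_raw <- sort(table(raw_values), decreasing = TRUE)

# Trim surrounding whitespace first, then count again
clean_values <- trimws(raw_values)
counts_clean <- sort(table(clean_values), decreasing = TRUE)

print(counts_raw)
print(counts_clean)
```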

US In Practice

Now, let’s see if people write different values for the US in practice. The prompt:

Now, we want you to think of the values that the United States stands for in reality. Regardless of what is written in the constitution, the US (across party lines) stands for certain values and does not stand for others.

IN PRACTICE, what are the values that the US stands for?

List at least FIVE values.


Top 30 words:

value N
freedom 23
equality 18
democracy 15
individualism 15
diversity 12
liberty 12
greed 9
freedom of speech 8
justice 8
power 8
unity 7
capitalism 6
nationalism 6
opportunity 6
free speech 5
independence 5
money 5
success 5
achievement 4
competition 4
individuality 4
patriotism 4
progress 4
right to bear arms 4
democracy 3
division 3
dominance 3
education 3
freedom of religion 3
hard work 3

Ok, some similarity, but a little more variance in this one. And we get some new words up here (greed, power). It’ll be interesting to see who wrote similar words here and in the constitution question.

Personal ideal

What did people write for values in their own ideal dream country? Prompt:

And now, we want you to imagine your ideal state. Importantly, imagine this ideal state as if you are randomly born into its population. You can end up in any level of its citizenry.

So, if you could design a state completely from scratch, what would be its guiding values?

List at least FIVE values.


Top 30 words:

value N
freedom 38
equality 35
justice 19
freedom of speech 12
liberty 12
unity 12
democracy 11
individualism 11
diversity 10
opportunity 7
education 6
equality 6
compassion 5
hard work 5
honesty 5
respect 5
democracy 4
empathy 4
freedom of religion 4
kindness 4
nationalism 4
patriotism 4
equal 3
free speech 3
individualism 3
integrity 3
love 3
morality 3
peace 3
privacy 3

Freedom still rules. Americans… But hey, equality and justice are much stronger here. And there are some nice ones in the middle third of this list.

Definitions

Let’s get a sense of the definitions people wrote for these values. Instead of taking all the values, though, we’ll just look at the top ten most mentioned values across perspectives.

This was their prompt:

Thank you for listing the values guiding the US on paper, the US in practice, and your ideal state.

Now, we ask you to define these values for us.

For each value, please write 1-2 sentences about what you meant when you listed that value. If you listed the same value in two or three different perspectives, there is no need to define it more than once. Simply write “See above” for the second or third time it appears.

value mentions def_word def_mentions
freedom 117 freedom 35
freedom 117 free 33
freedom 117 ability 21
freedom 117 live 17
freedom 117 act 16
freedom 117 life 14
freedom 117 speak 13
freedom 117 government 11
equality 98 equal 59
equality 98 opportunities 22
equality 98 people 22
equality 98 rights 19
equality 98 treated 19
equality 98 equally 11
liberty 61 freedom 27
liberty 61 free 13
justice 56 justice 14
justice 56 people 14
justice 56 law 12
democracy 48 government 23
democracy 48 people 18
democracy 48 citizens 11
diversity 36 people 13
freedom of speech 33 speech 12
freedom of speech 33 ability 11

Cool. All of this is coming together pretty nicely. Alright, are we ready to start with the real stuff?
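For readers wondering how a table like the one above can be built, here is a minimal base R sketch: split each definition into words, drop stop words, and count what remains. The definitions and the tiny stop-word list are invented for illustration, and the actual pipeline behind the table may differ.

```r
# Invented definitions for one value; not taken from the pilot data.
defs <- c("The ability to live freely and speak your mind",
          "Being free to act without government interference",
          "Freedom means the ability to live your life")

# A tiny hand-picked stop-word list (an assumption; real lists are much longer)
stop_words <- c("the", "to", "and", "your", "being", "without", "means")

# Lower-case, split on anything that is not a letter or apostrophe, drop empties
tokens <- unlist(strsplit(tolower(defs), "[^a-z']+"))
tokens <- tokens[tokens != "" & !(tokens %in% stop_words)]

# Count how often each remaining word appears across definitions
def_counts <- sort(table(tokens), decreasing = TRUE)
print(head(def_counts, 5))
```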

Word Embedding

Introduce GloVe

We’ll start by using GloVe word embedding (https://cran.r-project.org/web/packages/text2vec/vignettes/glove.html). I’ll keep the code visible for this part.

library(text2vec)
text8_file = "~/text8"
if (!file.exists(text8_file)) {
  download.file("http://mattmahoney.net/dc/text8.zip", "~/text8.zip")
  unzip("~/text8.zip", files = "text8", exdir = "~/")
}
wiki = readLines(text8_file, n = 1, warn = FALSE)

# Create iterator over tokens
tokens <- space_tokenizer(wiki)

# Create vocabulary. Terms will be ngrams (1 to 4 tokens).
it = itoken(tokens, progressbar = FALSE)
vocab <- create_vocabulary(it,ngram = c(ngram_min = 1,ngram_max = 4))

vocab <- prune_vocabulary(vocab, term_count_min = 3L)

# Use our filtered vocabulary
vectorizer <- vocab_vectorizer(vocab)
# use window of 5 for context words
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5L)

glove = GlobalVectors$new(rank = 100, x_max = 10)
wv_main = glove$fit_transform(tcm, n_iter = 10, convergence_tol = 0.01, n_threads = 8)
## INFO  [12:53:48.368] epoch 1, loss 0.4197
## INFO  [12:55:46.425] epoch 2, loss 0.2677
## INFO  [12:57:19.234] epoch 3, loss 0.1594
## INFO  [12:58:44.773] epoch 4, loss 0.1102
## INFO  [12:59:53.151] epoch 5, loss 0.0835
## INFO  [13:01:18.132] epoch 6, loss 0.0657
## INFO  [13:02:20.827] epoch 7, loss 0.0536
## INFO  [13:03:17.902] epoch 8, loss 0.0451
## INFO  [13:04:19.144] epoch 9, loss 0.0388
## INFO  [13:05:16.104] epoch 10, loss 0.0341
wv_context = glove$components
word_vectors = wv_main + t(wv_context)
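A quick sanity check on a trained embedding is to look at a word's nearest neighbours by cosine similarity. The sketch below does this in base R on a toy matrix with invented numbers; `toy_vectors` and `cosine_to_all` are hypothetical names, and the real `word_vectors` has 100 columns.

```r
# Toy 3-dimensional "embeddings" with invented numbers.
toy_vectors <- rbind(
  freedom = c(0.9, 0.1, 0.0),
  liberty = c(0.8, 0.2, 0.1),
  greed   = c(-0.1, 0.9, 0.3)
)

# Cosine similarity between one word and every row of the matrix,
# sorted from most to least similar
cosine_to_all <- function(mat, word) {
  target <- mat[word, ]
  sims <- (mat %*% target) / (sqrt(rowSums(mat^2)) * sqrt(sum(target^2)))
  sort(sims[, 1], decreasing = TRUE)
}

print(round(cosine_to_all(toy_vectors, "freedom"), 3))
```

With text2vec loaded, `sim2()` with `method = "cosine"` should give the same kind of result on the full `word_vectors` matrix.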

Our words

Now, with our words. These are the vectors for our top ten values:

word X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 X32 X33 X34 X35 X36 X37 X38 X39 X40 X41 X42 X43 X44 X45 X46 X47 X48 X49 X50 X51 X52 X53 X54 X55 X56 X57 X58 X59 X60 X61 X62 X63 X64 X65 X66 X67 X68 X69 X70 X71 X72 X73 X74 X75 X76 X77 X78 X79 X80 X81 X82 X83 X84 X85 X86 X87 X88 X89 X90 X91 X92 X93 X94 X95 X96 X97 X98 X99 X100
individualism -0.5706531 -0.1126608 0.4573229 -0.1748016 -0.3033205 0.4709786 -0.1742736 -0.0765885 0.8068037 0.0654578 0.6177974 0.6801190 -0.2108597 -0.3455212 -0.3215271 -0.0975455 -0.1981827 -0.5241472 0.3229958 -1.1306757 0.6358075 -0.3059242 0.0492382 0.6905779 0.6204865 -0.6705427 -0.1807378 0.0214706 -0.3845205 0.1555910 0.2793875 -0.7739798 0.2864166 -0.1966459 -0.4110313 0.3090826 0.1772605 0.4547164 0.2122230 0.0212808 0.2628981 -0.7249787 0.0818724 0.0433294 0.1339398 -0.4809950 -0.4479637 -0.5902564 0.2526257 0.5358725 -0.1896026 -0.0222402 -0.0566561 0.1649339 -0.6993240 -0.2175410 0.0429257 -0.2348120 -0.0806193 -0.2840777 -0.2957826 0.0590548 0.3887997 -0.3307897 0.0764245 0.2954556 -1.0379694 -0.3518625 0.3666290 0.6449180 0.1350715 0.2532982 -0.0734219 -0.5956234 -0.4529497 0.0242262 0.2632688 0.2654312 0.1684302 -0.2778670 -0.0713707 -0.2567578 -0.1512891 0.4958537 0.0183076 0.2986921 0.0035480 0.0096460 -0.6274090 0.0661499 0.0582183 0.3157753 -0.7879794 -0.0059821 -0.7711160 0.2735928 -0.4636370 -0.2566196 0.2017884 0.0350333
equality 0.0017765 -0.2757384 0.1380378 0.1115941 -0.3846866 -0.1867720 -0.3296255 0.0164803 -0.0105116 0.3699454 0.2803337 -0.2414358 -0.0757693 0.6476464 -0.0250542 -0.5222559 -0.1953305 -0.5506221 0.2688169 -0.0147637 0.2946952 -0.4280716 0.3224988 0.7129194 0.5991113 -0.0151125 -0.0589920 0.2227196 -0.8394155 0.2518007 -0.0072719 0.0724599 -0.2895042 -0.5537905 -0.4293071 -0.3396828 0.3617906 0.9901985 -0.4436743 0.5156335 0.0420188 -0.3018648 0.0897391 -0.1571725 0.0423092 -0.4390616 0.4547808 -0.1080651 -0.6224648 -0.4486903 0.5340221 -0.0290245 0.3772279 0.6542494 -0.3789293 0.4393395 -0.0694559 -0.5179666 -0.5633874 0.0518472 -0.9817229 0.0698676 0.6482765 0.4054193 0.5840553 0.3225159 0.1916285 -0.0710628 -0.3673116 0.1853093 0.1857984 0.2164594 -0.1051258 -0.6007488 0.3100362 0.4470880 0.2910041 -0.1982457 0.1036974 -0.1021365 -0.0555308 -0.2373071 -0.3498558 -0.1867557 -0.2871992 -0.0571508 0.3586378 -0.1056368 0.1503214 -0.2101359 0.2535754 0.8551450 0.0139293 -0.1587807 -0.2438933 0.1077239 -0.3602018 -0.6220697 0.2590905 -0.1899552
diversity -0.0543874 0.0907506 -0.2158128 -0.0894759 0.3463087 -0.2947234 -0.4554147 -0.3980008 -0.1632898 0.1637886 -0.8599857 0.0258172 -0.4292283 0.6537609 -0.0373631 -0.8016178 0.2868961 0.0973240 0.2242796 0.3415994 -0.2060550 -0.1768636 -0.6758251 0.1643870 -0.1685744 -0.0459732 0.4936555 -0.1099546 0.5283486 -0.2210025 -1.2558102 -0.0302971 0.7401356 -0.6730625 -0.1484026 -0.2459032 0.0347580 0.6086943 -0.2112537 0.2689932 -0.1995739 0.5846537 0.1961937 0.0547890 0.0610349 -0.2533994 0.2208386 0.0369523 0.0746591 0.5525951 -0.9102708 -0.0792612 -0.6983373 -0.4610428 -0.0456132 -0.2445136 0.2684186 0.2038335 0.0035407 -0.1248608 0.0986256 0.3431945 0.4859838 -0.4890740 -0.0278704 -0.6386770 -0.0500980 0.0719799 0.4942881 -0.4049571 0.2833160 0.7122544 -0.2760131 0.3350321 -0.8227359 0.0168181 0.4498476 0.2278223 0.4521201 0.3622040 0.6045631 0.5357237 0.3514683 0.0383356 0.1127135 -0.2619211 0.3481093 0.6108972 -0.3352669 0.4594275 0.0975350 0.2207475 0.3213755 0.3712006 0.4007671 0.7085598 -0.4457898 -0.2428139 -0.1033081 -0.5335293
unity -0.0834816 0.4491344 -0.8389599 -0.8980731 0.2041093 -0.3043530 0.0556579 -0.5018289 -0.6602815 -0.3047201 0.4859250 -0.4721624 0.0422945 0.4558312 -0.0560778 -0.2663017 -0.6386721 0.3278354 0.4835667 0.5883605 -0.4797593 -0.0668836 0.5838603 -0.0456437 -0.4627429 -0.9044603 0.6313694 -0.9404632 0.6859468 0.0544157 0.2886156 0.4134014 0.2526044 -0.0334707 -0.6004569 -0.3950155 0.2671329 0.5151973 -0.4986752 0.7023148 0.1238084 0.5210217 0.3507144 -1.1982318 0.5916083 -0.0495996 -0.1134710 -0.0297949 -0.1936190 0.1567637 0.0451554 0.1622532 0.4393766 0.3385275 -0.2438365 0.2643870 0.2968137 -0.4262645 0.2901089 -0.0333452 -0.2823046 -0.2145625 -0.1664019 0.1075338 -0.1084785 -0.0598009 0.2181054 0.2158593 -0.5503917 -0.0866416 0.2876534 0.1570109 0.1902234 0.4909739 -0.1261626 0.8535546 0.3338455 -0.1590653 0.6324685 -0.1156309 1.1517201 0.2981082 0.7840473 0.9985799 -0.1398852 -0.1487450 -0.2559384 -0.3719143 -0.2603666 -0.2424770 0.4323443 -0.3279959 0.6170279 -0.4661679 0.7269990 0.1858877 -0.2301561 0.7706122 -0.0994340 0.3192974
liberty -0.3594864 -0.5026147 -0.2081281 -0.3685180 0.5255886 -0.4280984 -0.2733975 0.8367721 0.1696940 -0.0902922 -0.3381813 0.5422031 -0.7208381 0.0672403 -0.9753591 -0.1349160 -0.2284926 0.1977168 0.5585061 0.4658009 0.1284381 0.5046493 -0.2164724 0.6892621 -0.2698011 -0.2174263 0.0185442 -0.4715038 0.1055669 -0.0078694 0.0333601 0.3191605 0.4104726 -0.4290830 -0.0344310 0.2331574 0.2489767 0.6369151 0.0890752 0.5674957 0.5221212 0.4046228 -0.4912863 -0.0194108 0.0493918 -0.0608819 0.1264356 -0.3659190 -0.3325358 -0.1000579 -0.1496046 0.6260188 0.2415619 -0.0841586 0.1649768 0.8104288 -0.2432779 0.3300447 -0.2008761 0.5125108 -0.4867673 -1.0143752 0.0288914 -0.0623315 0.6270692 0.4268869 -0.3429166 0.1172073 -0.7184248 0.5238178 -0.5019073 -0.1210488 0.2108400 0.0683919 0.3839224 -0.6006378 0.2774419 -0.1497170 0.4100104 -0.3908681 -0.2047120 0.4311659 0.9726577 -0.1953387 -0.5186177 0.0126568 0.0664331 -1.0553703 -0.1063223 0.0540204 1.3651621 0.3348835 -0.5150288 -0.0673262 0.3323468 0.6750739 -0.2216213 0.5112707 0.8171864 -0.4797633
democracy -0.1139310 -0.3491421 -0.0302168 0.0581126 -0.5130326 -0.3618891 0.2366838 0.0612760 -0.0664646 -0.6326874 0.4105462 0.5310218 0.2480130 0.7982512 0.5223246 -0.2394029 -0.4604727 0.1140813 -0.1784561 0.2173720 0.7119586 0.4361755 -0.0084432 -1.1038426 -0.5087239 0.0981672 0.6296581 0.1140319 0.1820886 0.1653642 -0.0158409 -0.5698087 -0.6532357 -0.1247795 0.3182479 -0.1431068 0.2692345 0.4700592 -0.1118507 -0.0588129 -0.2848716 -0.4341126 -0.1630735 -0.3011917 0.2812844 0.0344683 -0.0609784 0.8127626 0.1111385 -0.1403709 0.1411283 -0.2212206 -0.0926750 0.0001322 1.1118620 -0.0890517 0.2174105 0.5390585 -0.2850934 0.0146192 0.3395877 -0.6936893 0.1120022 -0.8893632 0.1875660 0.1727779 0.9256687 0.1470129 0.1020919 0.1850827 0.3310835 0.6416769 -0.2618445 -0.9010756 -0.3920506 -0.6481385 0.9278553 0.2837365 -0.0238192 0.1856862 0.4384185 -0.1426451 -0.1403193 -0.0621569 -0.0959625 -0.4237133 0.3650434 0.0758955 -0.5443707 0.7385178 -0.0981297 0.4504162 0.1562996 -0.5648318 0.3658088 -0.1896669 -0.2539289 0.0587047 0.0925498 0.8850234
justice 1.3205603 0.4782971 -0.2945293 -0.1582906 -0.3027340 0.3250320 0.4134305 0.0563676 -0.1706448 0.0847796 -0.6025830 -0.1120429 0.8825506 -0.3314126 0.3060865 0.2588186 0.1090525 0.2746653 0.1148561 -0.3173074 0.2398353 0.2472418 0.1980879 -0.1754475 -0.3045942 -0.2779845 -0.1550678 -0.0462797 0.2977303 0.0119498 -0.0404475 0.2181525 -0.5735036 0.2930813 -0.2164784 -0.0002422 -0.7442827 -1.1164809 0.4076308 -0.6360653 0.2850729 1.4206514 -0.6165990 -0.2972571 1.0985000 -0.2259105 -0.2153638 0.6105820 -0.4218787 0.5137426 -0.5127449 -0.3253621 0.2545246 0.6647953 -0.8559284 0.2234756 0.2255488 0.2058525 -0.2573327 0.7802072 -0.3983821 0.1206354 0.4078982 -0.0435164 1.0833050 0.9459490 -0.7741399 0.0506848 -0.3715400 1.5446490 0.2420095 -0.6337935 -0.1363183 0.0474556 0.2131962 0.0072489 0.2557982 -0.4845407 -1.1207439 0.1718878 0.9994394 0.3859515 -0.3079891 -0.6769921 -0.7631164 0.4015137 0.2904308 -0.3051606 -0.5609063 0.3212380 0.5495811 -0.3923484 -0.9684743 -0.2597839 0.4456711 -0.0017392 0.0577903 0.3166950 -0.0890987 0.3606351
freedom -0.9758872 -0.0483266 0.5723640 -0.2906176 0.0309650 0.0210083 -0.0531846 -0.1829408 0.1280572 -0.4474409 -0.2491306 0.0101692 0.2632806 -0.4737616 0.4049551 0.6044812 0.0447891 -0.4044132 0.3174958 0.9714692 0.1844219 0.1304695 0.2516911 -0.2192965 -0.7567427 -0.1621473 0.7687256 0.1118473 0.4346112 0.3893186 -0.7614122 -0.3057235 -0.6198806 0.2974233 -0.7395588 -1.3266038 0.2295608 0.3590204 -0.0293640 0.3850306 0.1476187 -0.0248723 -0.4552377 -0.1549678 0.4255162 -0.0361901 -0.6686361 0.3994660 0.1989528 -0.2814545 -0.0974256 -0.2685177 -0.4628228 -1.4965588 0.5921433 0.2735399 0.8204858 -0.7702816 0.0700059 0.1117551 0.1903634 0.8904612 -0.6708447 -0.1990541 0.1025740 -0.4836443 0.3604871 0.6905748 -0.2610917 0.7692295 0.7514672 -0.7845639 -0.7135853 0.2549577 0.3096264 0.1981537 -0.2940688 -0.3303469 0.3190798 0.6323002 0.4649938 -0.3265512 0.7423137 0.1736426 0.1679425 0.5389774 0.2993193 0.7665718 -0.6256936 0.4890951 -0.6993064 0.4530404 -0.4507098 0.0981348 0.0674675 -0.2983423 0.2501080 -0.2745000 0.3139165 -0.2723636
independence 0.2256531 1.0150124 -0.3070521 -0.1539596 0.4921132 0.0099673 0.6999894 0.5137481 0.7720840 -0.3073462 -0.4588363 -0.0492247 -0.0881307 1.0326173 0.1713335 -0.4216821 0.5931731 -0.1176083 0.4697768 0.1935990 0.3649809 -0.0529658 0.3100830 0.0429115 -0.3644657 0.8660627 -0.7539011 0.0777666 -0.1445115 0.6596922 -0.2167226 -0.4489493 0.1793150 0.6353878 0.4680334 -0.2557805 0.2829012 -0.9582570 0.1420597 -0.4017733 0.5045108 0.3136754 -0.2834175 -0.2978825 0.1924060 -0.4647861 -0.5146586 -0.8607032 -0.2949883 -0.1089756 -0.5974052 0.3850957 -0.2536941 0.6269122 0.0388140 -0.2391959 0.5349864 0.9856966 0.3284889 -0.1479923 0.2271516 -1.0836656 -0.6377266 -0.2889705 0.6887094 -0.1041094 0.6100079 -0.3813571 -0.2832932 -0.6331540 0.8097350 0.1776102 1.3003847 -0.0204378 0.0611318 -0.8648971 -0.4942458 -0.1591714 -0.8316659 -0.1391127 -0.2808507 0.2986309 -0.4739119 0.3802865 -0.9212319 0.4912192 -0.0578424 0.7464743 -0.2355124 0.3479696 0.2826111 0.1943578 -0.1462496 0.5105316 -0.1203944 -0.4463011 -0.2325612 1.0852533 0.4308586 -0.5890007
freedom of speech NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
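The last row shows that the multi-word value freedom of speech has no vector. One possible fallback, which is not what this report does, is to average the vectors of the words in the phrase that do exist in the vocabulary. The sketch below shows the idea on a toy matrix; `phrase_vector` is a hypothetical helper and the numbers are invented.

```r
# Toy 2-dimensional vocabulary with invented numbers.
toy_vectors <- rbind(
  freedom = c(1.0, 0.0),
  of      = c(0.0, 0.2),
  speech  = c(0.5, 0.5)
)

# Average the vectors of the words in a phrase that exist in the vocabulary.
# Returns a vector of NA when none of the words are found.
phrase_vector <- function(phrase, mat) {
  words <- strsplit(phrase, " +")[[1]]
  found <- words[words %in% rownames(mat)]
  if (length(found) == 0) return(rep(NA_real_, ncol(mat)))
  colMeans(mat[found, , drop = FALSE])
}

print(phrase_vector("freedom of speech", toy_vectors))
print(phrase_vector("rule by law", toy_vectors))
```

Averaging treats function words like "of" the same as content words, so dropping stop words first may be worth considering.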

Create value list

# Build the list of unique values participants wrote.
df_valuelist <- df_amd %>%
  select(word = value) %>%
  # Strip a single trailing space (e.g. "equality " becomes "equality")
  mutate(word = ifelse(stri_sub(word,-1,-1) == " ",
                       stri_replace_last_fixed(word," ",""),
                       word)) %>% 
  # De-duplicate after trimming so the trimmed variants collapse into one row
  distinct()

Add vectors

Ok, we have a list of values. It’s not perfect, but with enough data, I think the noise will wash away. Now, let’s add vectors to the words on the list.

df_valuevectors <- data.frame(word_vectors) %>% 
  mutate(word = rownames(word_vectors),
         word = str_replace_all(word,"_"," ")) %>% 
  group_by(word) %>% 
  slice(1) %>% 
  ungroup() %>% 
  select(word,everything()) %>% 
  right_join(df_valuelist,by = "word") %>% 
  mutate(word = ifelse(str_length(word) < 3,NA,word)) %>% 
  distinct() %>% 
  filter(!is.na(word))

Cool. Now, let’s add the vectors to our data.

df_amd <- df_amd %>% 
  left_join(df_valuevectors %>% 
              rename(value = word),by = "value")

Cosine similarity between perspectives per participant

# cosine() is assumed to come from the lsa package.
# Note: the dist_* columns store cosine similarities (higher = more similar), not distances.
df_distances = tibble(PID = -999,dist_paper_practice = 0,dist_paper_ideal = 0,dist_prac_ideal = 0)
PIDs = unique(df_amd$PID)

# Mean vector per participant and perspective, computed once outside the loop
df_meanvectors <- df_amd %>% 
  select(PID,type,value,X1:X100) %>% 
  group_by(PID,type) %>% 
  summarise_at(vars(X1:X100),function(x){mean(x,na.rm = T)}) %>% 
  ungroup()

for(i in PIDs){
  
  # 100 x 3 matrix (one column per perspective); cosine() compares the columns pairwise
  mt_cosine <- df_meanvectors %>% 
    filter(PID == i) %>% 
    select(type,X1:X100) %>% 
    pivot_longer(X1:X100,
                 names_to = "names",
                 values_to = "values") %>% 
    pivot_wider(names_from = "type",
                values_from = "values") %>% 
    select(-names) %>% 
    as.matrix() %>% 
    cosine() 
  
  cosine_ideal_paper = mt_cosine["ideal","paper"]
  cosine_ideal_prac = mt_cosine["ideal","prac"]
  cosine_paper_prac = mt_cosine["paper","prac"]
  
  current_scores = tibble(PID = i,
                          dist_paper_practice = cosine_paper_prac,
                          dist_paper_ideal = cosine_ideal_paper,
                          dist_prac_ideal = cosine_ideal_prac)
  
  # Append this participant's scores (the PID == -999 placeholder row is dropped later)
  df_distances <- df_distances %>% 
    bind_rows(current_scores)
}
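To make the per-participant measure concrete, here is a small worked example in base R: average the word vectors within each perspective, then take the cosine of the angle between the two mean vectors. The 3-dimensional vectors are invented, and `cosine_sim` is a hypothetical helper standing in for the `cosine()` call above.

```r
# Cosine similarity: dot product divided by the product of the vector lengths
cosine_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Invented 3-dimensional word vectors for one participant
paper_words <- rbind(freedom = c(1, 0, 0), liberty = c(1, 1, 0))
prac_words  <- rbind(greed = c(0, 0, 1), power = c(0, 1, 1))

# One mean vector per perspective
paper_mean <- colMeans(paper_words)  # 1.0 0.5 0.0
prac_mean  <- colMeans(prac_words)   # 0.0 0.5 1.0

cosine_sim(paper_mean, prac_mean)    # 0.25 / 1.25 = 0.2
cosine_sim(paper_mean, paper_mean)   # identical vectors give 1
```

Values near 1 mean the two perspectives point in nearly the same direction in the embedding space; values near 0 or below mean they have little in common.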

combine back in

df_amd_inddiff <- df_amd_inddiff %>% 
  left_join(df_distances %>% 
  filter(PID != -999),by = "PID")

Let’s take a couple of extreme PIDs and see if the measure passes the smell test. I’ll look at paper vs. practice: top 5 cosine similarity vs. bottom 5 cosine similarity. Let’s see.

Low Cosine Similarity

Example 1

PID dist_paper_practice cosine_similarity type value
17 -0.1968922 low paper freedom
17 -0.1968922 low paper independence
17 -0.1968922 low paper democracy
17 -0.1968922 low paper liberty
17 -0.1968922 low paper life
17 -0.1968922 low paper persuit of happiness
17 -0.1968922 low prac obedience to authority
17 -0.1968922 low prac accumulation of money
17 -0.1968922 low prac gathering of power to oneself
17 -0.1968922 low prac screw the other guy
17 -0.1968922 low prac elections and votes

Example 2

PID dist_paper_practice cosine_similarity type value
86 -0.1885002 low paper all men are equal
86 -0.1885002 low paper the right to bear arms
86 -0.1885002 low paper no taxaation without representation
86 -0.1885002 low paper the right to vote
86 -0.1885002 low paper trial by a jury of his peers
86 -0.1885002 low prac right to bear arms
86 -0.1885002 low prac trial by jury of one’s peers
86 -0.1885002 low prac rule by law
86 -0.1885002 low prac free trade
86 -0.1885002 low prac foreign aid
86 -0.1885002 low prac free speech

Example 3

PID dist_paper_practice cosine_similarity type value
13 -0.1083258 low paper freedom
13 -0.1083258 low paper liberty
13 -0.1083258 low paper democracy
13 -0.1083258 low paper vote
13 -0.1083258 low paper bear arms
13 -0.1083258 low paper speech
13 -0.1083258 low prac money
13 -0.1083258 low prac power
13 -0.1083258 low prac greed
13 -0.1083258 low prac entitlement
13 -0.1083258 low prac division

High Cosine Similarity

Example 1

PID dist_paper_practice cosine_similarity type value
53 1 high paper freedom of speech
53 1 high paper freedom to practice religion or not
53 1 high paper no search and seizure without warrant
53 1 high paper right to bear arms in a well-regulated militia
53 1 high paper right to be represented in legislative body
53 1 high prac freedom of speech
53 1 high prac taxation without representation in some places
53 1 high prac right to be taken advantage of by corporations
53 1 high prac wealth is 9/10ths of the law
53 1 high prac right to lobby legislators with re-election funds

Oh, looks like this one comes out as similar just because they’re all NAs (none of the values have vectors). We’re gonna have to fix that.
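One possible fix, sketched here under assumptions rather than taken from this analysis, is to return NA whenever either mean vector has missing entries, so that participants like PID 53 drop out instead of showing up with a similarity of 1. `safe_cosine` is a hypothetical helper.

```r
# Cosine similarity that returns NA when either vector has missing entries
safe_cosine <- function(a, b) {
  if (anyNA(a) || anyNA(b)) return(NA_real_)
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# mean(..., na.rm = TRUE) over a column that is all NA yields NaN,
# which is what a participant with no matched words ends up with
no_vectors  <- c(mean(c(NA, NA), na.rm = TRUE), mean(c(NA, NA), na.rm = TRUE))
has_vectors <- c(0.3, -0.2)

safe_cosine(no_vectors, has_vectors)   # NA
safe_cosine(has_vectors, has_vectors)  # 1
```

The resulting NA rows are then removed by the existing `filter(!is.na(values))` step before plotting.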

Example 2

PID dist_paper_practice cosine_similarity type value
68 0.9535225 high paper greed
68 0.9535225 high paper capitalism
68 0.9535225 high paper war
68 0.9535225 high paper drama
68 0.9535225 high paper entertainment
68 0.9535225 high paper freedom
68 0.9535225 high prac greed
68 0.9535225 high prac war
68 0.9535225 high prac capitalism
68 0.9535225 high prac drama
68 0.9535225 high prac freedom

Example 3

PID dist_paper_practice cosine_similarity type value
44 0.9266478 high paper freedom
44 0.9266478 high paper democracy
44 0.9266478 high paper justice
44 0.9266478 high paper individualism
44 0.9266478 high paper equality
44 0.9266478 high paper self-government
44 0.9266478 high prac individualism
44 0.9266478 high prac self-government
44 0.9266478 high prac democracy
44 0.9266478 high prac freedom
44 0.9266478 high prac justice

Ok, these look a little better.

How about we try to plot these distributions and slice by ideology?

df_amd_inddiff %>% 
  select(PID,ideo,dist_paper_practice:dist_prac_ideal) %>% 
  pivot_longer(-c(PID,ideo),
               names_to = "names",
               values_to = "values") %>%  
  filter(!is.na(values)) %>% 
  ggplot(aes(x = values,fill = names)) +
  geom_histogram(bins = 30) + 
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks = element_blank(),
        legend.position = "none") +
  facet_wrap(~names,nrow = 3) 

df_amd_inddiff %>% 
  select(PID,dist_paper_practice:dist_prac_ideal) %>% 
  left_join(df_amd_ideo %>% 
              filter(ideo == "Democratic Socialism" |
                       ideo == "Conservatism" |
                       ideo == "Liberalism" |
                       ideo == "Progressivism" |
                       ideo == "Libertarianism" |
                       ideo == "Right-Wing Nationalism"),by = "PID") %>%
  select(-ideo_score) %>% 
  pivot_longer(-c(PID,ideo),
               names_to = "names",
               values_to = "distance") %>% 
  filter(!is.na(distance)) %>% 
  filter(!is.na(ideo)) %>% 
  ggplot(aes(x = distance,fill = names)) +
  geom_histogram(bins = 30) + 
  theme(panel.grid.major = element_line(color = "grey66"),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks = element_blank(),
        axis.line = element_line(color = "grey66"),
        legend.position = "none") +
  facet_grid(ideo~names)
## Warning in left_join(., df_amd_ideo %>% filter(ideo == "Democratic Socialism" | : Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 5 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
##   warning.
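The warning appears because participants could select more than one ideology, so a single row of scores can match several ideology rows. The base R sketch below reproduces the effect on invented data and shows one optional alternative: keeping only each participant's highest-rated ideology. That alternative is an assumption about what might be wanted, and it changes what the histograms show.

```r
# Invented data: participant 5 selected two ideologies
toy_scores <- data.frame(PID = c(4, 5), dist_paper_practice = c(0.61, 0.35))
toy_ideo   <- data.frame(PID  = c(4, 5, 5),
                         ideo = c("Conservatism", "Liberalism", "Progressivism"),
                         ideo_score = c(70, 80, 65))

# Joining on PID repeats participant 5 once per selected ideology
joined <- merge(toy_scores, toy_ideo, by = "PID", all.x = TRUE)
print(joined)

# Optional alternative: keep only each participant's highest-rated ideology
top_ideo <- toy_ideo[order(toy_ideo$PID, -toy_ideo$ideo_score), ]
top_ideo <- top_ideo[!duplicated(top_ideo$PID), ]
joined_top <- merge(toy_scores, top_ideo, by = "PID", all.x = TRUE)
print(joined_top)
```

If the duplication is intended, so that a participant counts once in every ideology they selected, the warning message itself suggests setting `multiple = "all"` in `left_join()`.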

Archive

Alright, now we’ll take an average of each dimension per participant per perspective and then get difference scores between perspectives for each participant.

df_amd_diffscores <- df_amd %>% 
  group_by(PID,type) %>% 
  summarise_at(vars(X1:X100),function(x){mean(x,na.rm = T)}) %>% 
  ungroup() %>% 
  pivot_longer(-c(PID,type),
               names_to = "dim",
               values_to = "values") %>% 
  filter(!is.na(values)) %>% 
  #mutate(dim_type = paste0(type,"_",names)) %>% 
  pivot_wider(names_from = type,
              values_from = values) %>% 
  mutate(prac_minus_paper = prac - paper,
         ideal_minus_paper = ideal - paper,
         ideal_minus_prac = ideal - prac) %>% 
  group_by(PID) %>% 
  summarise(prac_minus_paper = mean(prac_minus_paper,na.rm = T),
            ideal_minus_paper = mean(ideal_minus_paper,na.rm = T),
            ideal_minus_prac = mean(ideal_minus_prac,na.rm = T)) %>% 
  ungroup()

Hmm, let’s take a look.

df_amd_diffscores %>% 
  pivot_longer(-PID,
               names_to = "names",
               values_to = "values") %>%  
  filter(!is.na(values)) %>% 
  ggplot(aes(x = values,fill = names)) +
  geom_histogram(bins = 30) + 
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks = element_blank(),
        legend.position = "none") +
  facet_wrap(~names,nrow = 3) 

Umm, not sure what this tells us tbh. But basically, we can take the difference scores to predict different things. Let’s see if things like ideology predict difference scores:

df_amd_diffscores %>% 
  left_join(df_amd_ideo %>% 
              filter(ideo == "Democratic Socialism" |
                       ideo == "Conservatism" |
                       ideo == "Liberalism" |
                       ideo == "Progressivism" |
                       ideo == "Libertarianism" |
                       ideo == "Right-Wing Nationalism"),by = "PID") %>%
  select(-ideo_score) %>% 
  pivot_longer(-c(PID,ideo),
               names_to = "names",
               values_to = "values") %>% 
  filter(!is.na(values)) %>% 
  filter(!is.na(ideo)) %>% 
  ggplot(aes(x = values,fill = names)) +
  geom_histogram(bins = 30) + 
  theme(panel.grid.major = element_line(color = "grey66"),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks = element_blank(),
        axis.line = element_line(color = "grey66"),
        legend.position = "none") +
  facet_grid(ideo~names) 
## Warning in left_join(., df_amd_ideo %>% filter(ideo == "Democratic Socialism" | : Each row in `x` is expected to match at most 1 row in `y`.
## ℹ Row 5 of `x` matches multiple rows.
## ℹ If multiple matches are expected, set `multiple = "all"` to silence this
##   warning.