Summary of LDA Analysis

Summary

I did the following:

1. Loaded the file job descriptors.csv.zip, which contains 4,832,204 rows, including the text field job_title.
2. Removed duplicate mem_id descriptions, so that each document was a single freelancer. job_title was always a single value for a unique mem_id, meaning that no freelancer apparently changed their job_title for different jobs. This resulted in 215,980 unique job titles, although some of these could not be analyzed because, for instance, they contained only “.”.
3. Processed the text in the following way:
   a. Removed punctuation, symbols, and English stopwords.
   b. Stemmed the words.
   c. Detected bigram collocations, such as “graphic design”, and treated each of these as a single token.
4. Formed a document-feature matrix from the tokens, and trimmed this to remove any terms that occurred fewer than 10 times, or in fewer than 0.001 (0.1%) of all documents.
5. Ran the structural topic model for K = 6, 7, 8, 9, 10, 12, 15, and 20 topics. Chose K = 12 as the best tradeoff of fit.
6. Fitted the K = 12 topic model more precisely.
7. Output the thetas (estimated topic proportions per document) to results/LDA_k12_job_title.csv.
8. Used this fitted topic model to estimate topic proportions when the texts are job_title but aggregated by proj_id.
   a. I concatenated all of the job_title text after grouping by proj_id. This resulted in 644,132 unique documents.
   b. I followed steps 3 and 4 above to process the tokens and form a document-feature matrix.
   c. I used the fitted K = 12 model to predict topic proportions for these new documents. I did this because the texts are the same, just grouped in a different way. (I also fitted a new model to the regrouped documents for comparison, and the results were largely the same.)
   d. I wrote the topic proportions for each proj_id to results/LDA_k12_proj_id.csv.
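
The de-duplication by mem_id and the concatenation by project described above can be sketched in base R. This is only an illustration on a made-up five-row table (the values and job titles are invented); the code actually used is in Fit_LDA.Rmd and may differ.

```r
# Toy stand-in for the bid-level data: one row per bid (illustrative values only).
bids <- data.frame(
  bid_id    = 1:5,
  proj_id   = c(10, 10, 11, 11, 12),
  mem_id    = c(1, 2, 1, 3, 2),
  job_title = c("Web Developer", "Graphic Designer", "Web Developer",
                "Translator", "Graphic Designer"),
  stringsAsFactors = FALSE
)

# One document per freelancer: drop repeated (mem_id, job_title) pairs.
freelancers <- unique(bids[, c("mem_id", "job_title")])
# Confirm that job_title is constant within mem_id (no mem_id appears twice).
stopifnot(anyDuplicated(freelancers$mem_id) == 0)

# One document per project: concatenate the job titles of all bidders.
projects <- aggregate(job_title ~ proj_id, data = bids,
                      FUN = paste, collapse = " ")

nrow(freelancers)  # 3 freelancer documents
nrow(projects)     # 3 project documents
projects$job_title[projects$proj_id == 10]
# "Web Developer Graphic Designer"
```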

Replication files

See Fit_LDA.Rmd, which has all of the code required to load the file job descriptors.csv.zip, process the text, fit the topic models, and output the results.

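Because the output files are so large, I wrote them to a temporary folder and then zipped them before copying them to the Dropbox shared folder in results/. The files there are therefore .zip archives (for example, results/LDA_k12_job_title.zip) and must be unzipped before loading.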

Details on the topic calibration

library("stm")

stm v1.3.5 successfully loaded. See ?stm for help. 
 Papers, resources, and other materials at structuraltopicmodel.com

load("fitted/k_search.rda")
plot(k_search)

Details on the K = 12 topics:

load("fitted/tmod12.rda")
plot(tmod12)

summary(tmod12)

A topic model with 12 topics, 209294 documents and a 466 word dictionary.
Topic 1 Top Words:
     Highest Prob: expert, seo, sale, social media, admin, excel, work 
     FREX: seo, social media, work, retouch, digit market, compani, internet market 
     Lift: ad, advertis, adword, analyt, email market, googl adword, link build 
     Score: expert, seo, sale, social media, quantiti, admin, ppc 
Topic 2 Top Words:
     Highest Prob: writer, editor, copywrit, freelanc writer, content, journalist, experienc 
     FREX: writer, editor, copywrit, freelanc writer, content, journalist, experienc 
     Lift: cameraman, en, film, filmmak, law, writer, editor 
     Score: writer, editor, interpret, copywrit, freelanc writer, journalist, content 
Topic 3 Top Words:
     Highest Prob: consult, manag, engin, analyst, softwar, experi, databas 
     FREX: consult, manag, engin, analyst, softwar, databas, system 
     Lift: consult, internet, scienc, secur, system, analysi, analyst 
     Score: consult, manag, engin, analyst, civil, softwar, system 
Topic 4 Top Words:
     Highest Prob: translat, research, proofread, english, content writer, blogger, teacher 
     FREX: translat, research, proofread, english, content writer, teacher, articl writer 
     Lift: research, french, onlin, academ, arab, articl, articl writer 
     Score: translat, english, proofread, research, blogger, french, content writer 
Topic 5 Top Words:
     Highest Prob: design, artist, graphic, anim, 3d, video editor, print 
     FREX: design, artist, graphic, video editor, print, voic, game 
     Lift: actor, voic, 2d, 2d anim, 3d artist, 3d model, artist 
     Score: design, artist, graphic, 3d, anim, concept artist, video editor 
Topic 6 Top Words:
     Highest Prob: graphic design, develop, web design, websit, websit design, mobil, seo specialist 
     FREX: graphic design, web design, websit, seo specialist, founder, websit design, head 
     Lift: cutter, founder, graphic design, head, seo specialist, web design, websit 
     Score: develop, graphic design, web design, cutter, websit, websit design, mobil 
Topic 7 Top Words:
     Highest Prob: illustr, programm, assist, architect, execut, student, brand 
     FREX: illustr, programm, assist, architect, execut, student, brand 
     Lift: creativ director, illustr, master, technologist, architect, art, brand 
     Score: illustr, programm, architect, assist, architectur, visual, student 
Topic 8 Top Words:
     Highest Prob: php, wordpress, creativ, logo design, softwar develop, html, joomla 
     FREX: php, wordpress, softwar develop, html, joomla, magento, android 
     Lift: php, 5, access, android, app develop, applic develop, brochur 
     Score: php, wordpress, html, joomla, front end, css, mysql 
Topic 9 Top Words:
     Highest Prob: web develop, freelanc, senior, fashion, communic, ui, owner 
     FREX: freelanc, senior, owner, ux, sound, photographi, photo editor 
     Lift: art director, freelanc, industri, music compos, music produc, owner, photographi 
     Score: web develop, freelanc, textil, senior, fashion, ui, ux 
Topic 10 Top Words:
     Highest Prob: profession, account, busi, servic, director, support, project manag 
     FREX: profession, account, busi, servic, director, project manag, bookkeep 
     Lift: advisor, busi analyst, coach, coordin, corpor, event, event manag 
     Score: account, profession, busi, director, support, sap, servic 
Topic 11 Top Words:
     Highest Prob: web, market, photograph, specialist, video, audio, maker 
     FREX: web, market, photograph, specialist, video, video edit, audio 
     Lift: photograph, web, creation, generalist, make, market, mechan engin 
     Score: web, market, photograph, specialist, generalist, video, audio 
Topic 12 Top Words:
     Highest Prob: data entri, virtual assist, administr, write, data, offic, custom servic 
     FREX: virtual assist, administr, write, data, offic, custom servic, pa 
     Lift: offic, admin assist, admin support, administr assist, appoint, articl write, call 
     Score: virtual assist, data entri, administr, write, custom servic, clerk, offic

How to use the output files

Fitted by `mem_id` (by freelancer)

Load LDA_k12_job_title.csv. This file contains one set of topic proportions for each unique mem_id (a total of 209,294).

unzip("results/LDA_k12_job_title.zip", exdir = "~/tmp/")
LDA_k12_job_title <- data.table::fread("~/tmp/LDA_k12_job_title.csv", data.table = FALSE)
tibble::glimpse(LDA_k12_job_title, width = 90)

Rows: 209,294
Columns: 14
$ mem_id    <int> 128342, 503693, 177138, 411986, 219778, 14128, 86906, 539230, 58975, …
$ theta_1   <dbl> 0.07308893, 0.10721996, 0.08948607, 0.20415807, 0.02732215, 0.1124007…
$ theta_2   <dbl> 0.075702142, 0.019463663, 0.080716024, 0.023394417, 0.003830533, 0.08…
$ theta_3   <dbl> 0.09336628, 0.03983987, 0.10311823, 0.04846664, 0.01433578, 0.0757593…
$ theta_4   <dbl> 0.061339687, 0.017868666, 0.072610576, 0.026829924, 0.003139403, 0.10…
$ theta_5   <dbl> 0.09302686, 0.04843737, 0.06836757, 0.03183601, 0.02363463, 0.0434636…
$ theta_6   <dbl> 0.09764756, 0.22147844, 0.07880496, 0.12536559, 0.13920830, 0.0597120…
$ theta_7   <dbl> 0.05943682, 0.03162716, 0.05138677, 0.02492144, 0.01204788, 0.0397462…
$ theta_8   <dbl> 0.06818606, 0.34183952, 0.05665602, 0.31244171, 0.71002458, 0.0478790…
$ theta_9   <dbl> 0.09897834, 0.05867481, 0.06594379, 0.04536958, 0.03635920, 0.0480426…
$ theta_10  <dbl> 0.15528114, 0.04166124, 0.18338070, 0.05595774, 0.01058258, 0.1266794…
$ theta_11  <dbl> 0.06931444, 0.04835895, 0.07024998, 0.05327263, 0.01529206, 0.0633435…
$ theta_12  <dbl> 0.054631748, 0.023530343, 0.079279321, 0.047986233, 0.004222914, 0.19…
$ max_topic <int> 10, 8, 10, 8, 8, 12, 8, 8, 6, 3, 12, 8, 8, 5, 8, 10, 8, 1, 8, 8, 8, 6…
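
As a quick sanity check on this file, the twelve theta_* values in each row should sum to approximately 1. The sketch below rebuilds the first two rows from the printed output above. It also checks that max_topic is the index of the largest theta in the row; that reading of max_topic is an assumption inferred from the printed values, not something confirmed by the replication code.

```r
# First two rows of LDA_k12_job_title, copied from the glimpse() output above.
theta <- rbind(
  c(0.07308893, 0.075702142, 0.09336628, 0.061339687, 0.09302686, 0.09764756,
    0.05943682, 0.06818606, 0.09897834, 0.15528114, 0.06931444, 0.054631748),
  c(0.10721996, 0.019463663, 0.03983987, 0.017868666, 0.04843737, 0.22147844,
    0.03162716, 0.34183952, 0.05867481, 0.04166124, 0.04835895, 0.023530343)
)
colnames(theta) <- paste0("theta_", 1:12)
max_topic <- c(10, 8)  # as printed above for the same two rows

# Topic proportions sum to 1 within each document (up to rounding).
stopifnot(all(abs(rowSums(theta) - 1) < 1e-6))

# Assumption: max_topic is the column index of the largest theta in each row.
stopifnot(all(max.col(theta, ties.method = "first") == max_topic))
```

On the full file, the same checks can be run after setting theta <- as.matrix(LDA_k12_job_title[, paste0("theta_", 1:12)]) and max_topic <- LDA_k12_job_title$max_topic.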

To merge this with the original data, just do a “left join” operation:

# Archive name as given in the Summary; adjust if your copy is named differently.
unzip("job descriptors.csv.zip", exdir = "~/tmp/")

all_data <- data.table::fread("~/tmp/job descriptors.csv", data.table = FALSE)
all_data_LDA_job_title <- dplyr::left_join(all_data, LDA_k12_job_title, by = "mem_id")
dplyr::glimpse(all_data_LDA_job_title, width = 90)

Rows: 4,832,204
Columns: 17
$ bid_id    <int> 404205, 416055, 699962, 823066, 824483, 1146456, 200210, 193618, 1812…
$ proj_id   <int> 43215, 44698, 73522, 85889, 85967, 120260, 20982, 20168, 18487, 19720…
$ mem_id    <int> 114272, 114272, 114272, 140195, 140195, 140195, 60468, 60468, 60468, …
$ job_title <chr> "Web/Graphic Design Adobe Photoshop cs3 ; cs4 ; cs5 ; Illustrator ; C…
$ theta_1   <dbl> 0.06715131, 0.06715131, 0.06715131, 0.04421541, 0.04421541, 0.0442154…
$ theta_2   <dbl> 0.04058958, 0.04058958, 0.04058958, 0.05566576, 0.05566576, 0.0556657…
$ theta_3   <dbl> 0.06431540, 0.06431540, 0.06431540, 0.05289841, 0.05289841, 0.0528984…
$ theta_4   <dbl> 0.03015566, 0.03015566, 0.03015566, 0.03772991, 0.03772991, 0.0377299…
$ theta_5   <dbl> 0.10882433, 0.10882433, 0.10882433, 0.22895669, 0.22895669, 0.2289566…
$ theta_6   <dbl> 0.16089062, 0.16089062, 0.16089062, 0.11142330, 0.11142330, 0.1114233…
$ theta_7   <dbl> 0.08201423, 0.08201423, 0.08201423, 0.13881545, 0.13881545, 0.1388154…
$ theta_8   <dbl> 0.14438100, 0.14438100, 0.14438100, 0.06390998, 0.06390998, 0.0639099…
$ theta_9   <dbl> 0.09925113, 0.09925113, 0.09925113, 0.09508420, 0.09508420, 0.0950842…
$ theta_10  <dbl> 0.09503692, 0.09503692, 0.09503692, 0.07527572, 0.07527572, 0.0752757…
$ theta_11  <dbl> 0.07636458, 0.07636458, 0.07636458, 0.06785218, 0.06785218, 0.0678521…
$ theta_12  <dbl> 0.03102524, 0.03102524, 0.03102524, 0.02817298, 0.02817298, 0.0281729…
$ max_topic <int> 6, 6, 6, 5, 5, 5, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,…
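
Because this is a left join, every row of the original data is kept, and any row whose mem_id has no fitted topic proportions receives NA in the theta columns. The Summary reports 215,980 unique job titles while the fitted model contains 209,294 documents, which suggests that some mem_id values will be unmatched. The sketch below shows the behaviour on made-up values; it uses base R's match() in place of dplyr::left_join() only so that it runs without extra packages.

```r
# Illustrative stand-ins for all_data and LDA_k12_job_title (values made up).
all_data_toy <- data.frame(bid_id = 1:4, mem_id = c(1, 1, 2, 3))
lda_toy <- data.frame(mem_id = c(1, 2), theta_1 = c(0.7, 0.2),
                      max_topic = c(1, 2))

# Left join on mem_id: look up each bid's freelancer in the topic table.
idx <- match(all_data_toy$mem_id, lda_toy$mem_id)
joined <- cbind(all_data_toy, lda_toy[idx, c("theta_1", "max_topic")])
rownames(joined) <- NULL

# Row count is unchanged; unmatched freelancers have NA topic proportions.
stopifnot(nrow(joined) == nrow(all_data_toy))
sum(is.na(joined$theta_1))  # 1 bid (mem_id 3) has no topic proportions
```

On the real data, sum(is.na(all_data_LDA_job_title$theta_1)) gives the number of bids without topic proportions.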

Fitted by `proj_id` (by project)

Load LDA_k12_proj_id.csv. This file contains one set of topic proportions for each unique proj_id (a total of 639,678).

unzip("results/LDA_k12_proj_id.zip", exdir = "~/tmp/")
LDA_k12_proj_id <- data.table::fread("~/tmp/LDA_k12_proj_id.csv", data.table = FALSE)
tibble::glimpse(LDA_k12_proj_id, width = 90)

Rows: 639,678
Columns: 14
$ proj_id   <int> 1004, 1402, 1419, 1888, 2973, 3103, 3250, 3275, 5837, 6141, 6254, 649…
$ theta_1   <dbl> 0.02484302, 0.07454527, 0.10842039, 0.13170684, 0.15179779, 0.1022415…
$ theta_2   <dbl> 0.008886185, 0.066314688, 0.045267903, 0.042876092, 0.057985343, 0.05…
$ theta_3   <dbl> 0.03158616, 0.11964086, 0.19163514, 0.08205396, 0.12051238, 0.1104882…
$ theta_4   <dbl> 0.004995802, 0.058052048, 0.046520735, 0.044954254, 0.041141920, 0.10…
$ theta_5   <dbl> 0.11799575, 0.04937282, 0.02479040, 0.02938004, 0.03525608, 0.0224015…
$ theta_6   <dbl> 0.28167562, 0.05024897, 0.03793470, 0.03823611, 0.05463732, 0.0390909…
$ theta_7   <dbl> 0.04418074, 0.04041016, 0.02578611, 0.02733854, 0.02713814, 0.0223500…
$ theta_8   <dbl> 0.34311672, 0.03252080, 0.03162171, 0.02895976, 0.05523331, 0.0260378…
$ theta_9   <dbl> 0.07605664, 0.04853417, 0.03221491, 0.03755607, 0.04535646, 0.0414873…
$ theta_10  <dbl> 0.02739549, 0.31848332, 0.27610439, 0.32073269, 0.22182093, 0.1652494…
$ theta_11  <dbl> 0.03424946, 0.05759115, 0.04787108, 0.06026831, 0.10964360, 0.0563533…
$ theta_12  <dbl> 0.005018427, 0.084285755, 0.131832544, 0.155937332, 0.079476741, 0.25…
$ max_topic <int> 8, 10, 10, 10, 10, 12, 10, 12, 12, 10, 10, 10, 6, 8, 10, 10, 12, 8, 2…

To merge this with the original data, just do a “left join” operation:

all_data_LDA_proj_id <- dplyr::left_join(all_data, LDA_k12_proj_id, by = "proj_id")
dplyr::glimpse(all_data_LDA_proj_id, width = 90)

Rows: 4,832,204
Columns: 17
$ bid_id    <int> 404205, 416055, 699962, 823066, 824483, 1146456, 200210, 193618, 1812…
$ proj_id   <int> 43215, 44698, 73522, 85889, 85967, 120260, 20982, 20168, 18487, 19720…
$ mem_id    <int> 114272, 114272, 114272, 140195, 140195, 140195, 60468, 60468, 60468, …
$ job_title <chr> "Web/Graphic Design Adobe Photoshop cs3 ; cs4 ; cs5 ; Illustrator ; C…
$ theta_1   <dbl> 0.031542269, 0.048840017, 0.091749654, 0.008246565, 0.029839193, 0.05…
$ theta_2   <dbl> 0.032127402, 0.007837322, 0.013990019, 0.046447252, 0.019700680, 0.01…
$ theta_3   <dbl> 0.023764586, 0.034962227, 0.032299515, 0.009772934, 0.028723194, 0.01…
$ theta_4   <dbl> 0.016029310, 0.005322921, 0.007722979, 0.017385866, 0.010632236, 0.00…
$ theta_5   <dbl> 0.245843668, 0.071529731, 0.067098941, 0.375921441, 0.363578252, 0.28…
$ theta_6   <dbl> 0.228765226, 0.245321525, 0.154724394, 0.079028017, 0.167235956, 0.11…
$ theta_7   <dbl> 0.078976426, 0.051136990, 0.029294372, 0.310453583, 0.140258542, 0.26…
$ theta_8   <dbl> 0.159050136, 0.374920646, 0.419285425, 0.018590715, 0.052087925, 0.04…
$ theta_9   <dbl> 0.056934308, 0.055500799, 0.089933186, 0.077227010, 0.069596126, 0.09…
$ theta_10  <dbl> 0.04420721, 0.05470092, 0.04604291, 0.02630976, 0.06065594, 0.0401919…
$ theta_11  <dbl> 0.07606562, 0.04019165, 0.03523336, 0.02612268, 0.04691210, 0.0437800…
$ theta_12  <dbl> 0.006693847, 0.009735248, 0.012625244, 0.004494180, 0.010779856, 0.01…
$ max_topic <int> 5, 8, 8, 5, 5, 5, 12, 12, 12, 12, 12, 2, 12, 2, 12, 12, 12, 12, 12, 1…

Kenneth Benoit
