LDA Model of Supreme Court Opinions on the Free Exercise Clause of the First Amendment

I retrieved the majority opinions for the 32 Supreme Court cases that ruled on the Free Exercise Clause of the First Amendment, according to Oyez.org, a website that summarizes opinions and groups them by topic. I then used Westlaw to retrieve the full text of each majority opinion. I cleaned the corpus by removing stopwords, converting the text to lowercase, and removing numbers, and then ran an LDA model with 4 topics.

Justin Burnworth
2022-04-06
[1] "~/Documents/Text Project/Judicial"

“Tokens”

<itoken>
  Inherits from: <CallbackIterator>
  Public:
    callback: function (x) 
    clone: function (deep = FALSE) 
    initialize: function (x, callback = identity) 
    is_complete: active binding
    length: active binding
    move_cursor: function () 
    nextElem: function () 
    x: GenericIterator, iterator, R6
Number of docs: 32 
0 stopwords:  ... 
ngram_min = 1; ngram_max = 1 
Vocabulary: 
            term term_count doc_count
   1:   abdicate          1         1
   2: abdication          1         1
   3:      abdul          1         1
   4:   abdullah          1         1
   5: abhorrence          1         1
  ---                                
8635:      state        626        32
8636:     church        627        27
8637:      court        813        32
8638:         ct       1047        32
8639:  religious       1272        32
[1] 8639    3
[1] 448   3

“Iterated over the tokens, built the vocabulary, pruned the vocabulary, and created the DTM.”
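
This step corresponds to the standard text2vec pipeline, sketched below using the clean_tokens object from the earlier sketch. The pruning threshold is an assumption; the output above only shows that the vocabulary shrank from 8639 terms to 448.

# Iterate over the cleaned token lists
it <- itoken(clean_tokens, ids = names(opinions), progressbar = FALSE)

# Build and prune the vocabulary (exact pruning thresholds assumed)
vocab <- create_vocabulary(it)
dim(vocab)
pruned_vocab <- prune_vocabulary(vocab, doc_count_min = 5)
dim(pruned_vocab)

# Create the document-term matrix
vectorizer <- vocab_vectorizer(pruned_vocab)
dtm <- create_dtm(it, vectorizer)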

<WarpLDA>
  Inherits from: <LDA>
  Public:
    clone: function (deep = FALSE) 
    components: active binding
    fit_transform: function (x, n_iter = 1000, convergence_tol = 0.001, n_check_convergence = 10, 
    get_top_words: function (n = 10, topic_number = 1L:private$n_topics, lambda = 1) 
    initialize: function (n_topics = 10L, doc_topic_prior = 50/n_topics, topic_word_prior = 1/n_topics, 
    plot: function (lambda.step = 0.1, reorder.topics = FALSE, doc_len = private$doc_len, 
    topic_word_distribution: active binding
    transform: function (x, n_iter = 1000, convergence_tol = 0.001, n_check_convergence = 10, 
  Private:
    calc_pseudo_loglikelihood: function (ptr = private$ptr) 
    check_convert_input: function (x) 
    components_: NULL
    doc_len: NULL
    doc_topic_distribution: function () 
    doc_topic_distribution_with_prior: function () 
    doc_topic_matrix: NULL
    doc_topic_prior: 0.1
    fit_transform_internal: function (model_ptr, n_iter, convergence_tol, n_check_convergence, 
    get_c_all: function () 
    get_c_all_local: function () 
    get_doc_topic_matrix: function (prt, nr) 
    get_topic_word_count: function () 
    init_model_dtm: function (x, ptr = private$ptr) 
    internal_matrix_formats: list
    is_initialized: FALSE
    n_iter_inference: 10
    n_topics: 4
    ptr: NULL
    reset_c_local: function () 
    run_iter_doc: function (update_topics = TRUE, ptr = private$ptr) 
    run_iter_word: function (update_topics = TRUE, ptr = private$ptr) 
    seeds: 1956722818.67766 1422772583.17494
    set_c_all: function (x) 
    set_internal_matrix_formats: function (sparse = NULL, dense = NULL) 
    topic_word_distribution_with_prior: function () 
    topic_word_prior: 0.01
    transform_internal: function (x, n_iter = 1000, convergence_tol = 0.001, n_check_convergence = 10, 
    vocabulary: NULL

“LDA model”
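
The WarpLDA object printed above can be created as sketched below; n_topics = 4 matches the model described in the introduction, and the priors (doc_topic_prior = 0.1, topic_word_prior = 0.01) are taken from the printed object.

lda_model <- LDA$new(n_topics = 4,
                     doc_topic_prior = 0.1,
                     topic_word_prior = 0.01)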

INFO  [17:09:11.183] early stopping at 100 iteration 
INFO  [17:09:11.248] early stopping at 50 iteration 

“Fitted the model and used bar graphs to display the topic distribution for the first document and then for the entire dataset.”
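
A minimal sketch of the fitting and plotting step, assuming the dtm and lda_model objects from the earlier sketches; the iteration and convergence settings shown are the text2vec defaults and are assumptions here, and the plots will not reproduce exactly without the stored seeds.

# fit_transform returns the document-topic distribution matrix (32 x 4)
doc_topic_distr <- lda_model$fit_transform(x = dtm,
                                           n_iter = 1000,
                                           convergence_tol = 0.001,
                                           n_check_convergence = 10)

# Topic distribution for the first opinion, then averaged over all 32 opinions
barplot(doc_topic_distr[1, ], names.arg = paste("Topic", 1:4),
        ylab = "Proportion", main = "Topic distribution: first document")
barplot(colMeans(doc_topic_distr), names.arg = paste("Topic", 1:4),
        ylab = "Mean proportion", main = "Topic distribution: full corpus")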

      [,1]           [,2]             [,3]        [,4]         
 [1,] "hhs"          "holy"           "ordinance" "perich"     
 [2,] "rfra"         "diocese"        "prison"    "hastings"   
 [3,] "corporations" "ecclesiastical" "animal"    "cls"        
 [4,] "phillips"     "dionisije"      "sacrifice" "minister"   
 [5,] "roy"          "foster"         "santeria"  "student"    
 [6,] "trinity"      "css"            "beard"     "hosanna"    
 [7,] "coverage"     "sales"          "inmates"   "tabor"      
 [8,] "profit"       "illinois"       "rluipa"    "substances" 
 [9,] "insurance"    "bishop"         "animals"   "forum"      
[10,] "montana"      "disputes"       "prisoners" "ministerial"

“I retrieved the top 10 words for each of the 4 topics.”
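
The table above can be produced with get_top_words; lambda = 1 is the package default and ranks words purely by their within-topic probability. A sketch, assuming the fitted lda_model from the previous step:

lda_model$get_top_words(n = 10, topic_number = 1:4, lambda = 1)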