Model Performance Evaluation

Published: July 28, 2024

Load Libraries and Data: The first chunk loads the necessary libraries and the data from CSV files.
Preprocessing and Similarity Functions: The second chunk defines the preprocessing and similarity calculation functions.
Process and Summarize predicted_vs_manual: The third chunk preprocesses the text, calculates similarity, and summarizes the numerical columns for predicted_vs_manual.
Process and Summarize predicted_no_score: The fourth chunk preprocesses the text, calculates similarity, and summarizes the numerical columns for predicted_no_score.
Visualizations for predicted_vs_manual: The fifth chunk generates histograms and box plots for the accuracy and alignment ratings in predicted_vs_manual.
Visualizations for predicted_no_score: The sixth chunk generates histograms and box plots for the similarity scores in predicted_no_score.
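
The report does not show the code for the second chunk, so the following is only a minimal base-R sketch of what a preprocessing step and a similarity function could look like. The names preprocess_text and cosine_similarity are hypothetical, and cosine similarity over word counts is an assumed stand-in for whatever the report computes with tm, SnowballC, and lsa; stop-word removal and stemming are left out.

```r
# Hypothetical helper (not from the report): lower-case the text,
# drop everything except letters and spaces, and split into words.
preprocess_text <- function(text) {
  if (is.na(text)) return(character(0))
  text <- tolower(text)
  text <- gsub("[^a-z ]+", " ", text)
  words <- strsplit(trimws(text), " +")[[1]]
  words[nzchar(words)]
}

# Hypothetical helper: cosine similarity between the word-count
# vectors of two texts. Returns 0 when either text has no words.
cosine_similarity <- function(a, b) {
  wa <- preprocess_text(a)
  wb <- preprocess_text(b)
  if (length(wa) == 0 || length(wb) == 0) return(0)
  vocab <- unique(c(wa, wb))
  va <- as.numeric(table(factor(wa, levels = vocab)))
  vb <- as.numeric(table(factor(wb, levels = vocab)))
  sum(va * vb) / (sqrt(sum(va^2)) * sqrt(sum(vb^2)))
}
```

Under this sketch, two narratives with no words in common score exactly 0, which is one way a median similarity of 0 can arise when predicted and manual labels use different vocabulary.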

Packages loaded: ggplot2, dplyr, readr, tm, NLP, SnowballC, tidytext, lsa, stringr.

Warnings: ggplot2, dplyr, tm, tidytext, and lsa were built under R version 4.4.1.

Masked objects: filter and lag from stats, and intersect, setdiff, setequal, and union from base (masked by dplyr); annotate from ggplot2 (masked by NLP).

New names:
Rows: 111 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): manual_narrative_name, predicted_narrative, back_office_url, origin...
dbl (3): ...1, How accurately do the predicted narratives represent the cont...
lgl (1): Topics (Exploring 10, Matching 0)

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
New names:
Rows: 727 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): manual_narrative_name, predicted_narrative, back_office_url, origin...
dbl (1): ...1

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
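
The messages above come from reading CSV files whose first column has no header. As the readr output itself suggests, they can be quieted with show_col_types = FALSE. The sketch below shows one possible way to do that; the function name load_narratives and the replacement column name row_id are hypothetical and not taken from the report.

```r
library(readr)
library(dplyr)

# Hypothetical loader (not from the report): read a CSV quietly and
# give the unnamed first column, which readr calls `...1`, a real name.
load_narratives <- function(path) {
  # suppressMessages() hides the "New names" notice for the blank header;
  # show_col_types = FALSE hides the column specification summary.
  df <- suppressMessages(read_csv(path, show_col_types = FALSE))
  rename(df, row_id = `...1`)
}
```

Renaming the index column also keeps `...1` out of later numeric summaries, where it would otherwise be treated like any other dbl column.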

[1] "...1"                                                                                          
[2] "manual_narrative_name"                                                                         
[3] "predicted_narrative"                                                                           
[4] "back_office_url"                                                                               
[5] "original_url"                                                                                  
[6] "How accurately do the predicted narratives represent the content of the articles? (1 to 10)"   
[7] "How closely do the predicted narratives align with the manually assigned narratives? (1 to 10)"
[8] "Other feedback"                                                                                
[9] "Topics (Exploring 10, Matching 0)"

[1] "...1"                  "manual_narrative_name" "predicted_narrative"  
[4] "back_office_url"       "original_url"

# A tibble: 1 × 6
  mean_accuracy median_accuracy mean_alignment median_alignment mean_similarity
          <dbl>           <dbl>          <dbl>            <dbl>           <dbl>
1          3.19               1           3.13                1               0
# ℹ 1 more variable: median_similarity <dbl>

# A tibble: 1 × 2
  mean_similarity median_similarity
            <dbl>             <dbl>
1           0.102                 0
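
Summary tables like the two above can be produced with dplyr's summarise(). The following is a sketch only: the short column names accuracy, alignment, and similarity are assumptions (the source columns carry the full question text as their names), and whether the report drops missing values with na.rm = TRUE is not shown.

```r
library(dplyr)

# Hypothetical summary helper (not from the report). Assumes the rating
# and similarity columns have been renamed to short names beforehand.
summarise_scores <- function(df) {
  df %>%
    summarise(
      mean_accuracy     = mean(accuracy, na.rm = TRUE),
      median_accuracy   = median(accuracy, na.rm = TRUE),
      mean_alignment    = mean(alignment, na.rm = TRUE),
      median_alignment  = median(alignment, na.rm = TRUE),
      mean_similarity   = mean(similarity, na.rm = TRUE),
      median_similarity = median(similarity, na.rm = TRUE)
    )
}
```

Reporting the median next to the mean matters here: a few high ratings can lift the mean well above a median that sits at the bottom of the scale.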

Data Visualizations

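The summary for predicted_vs_manual (111 rows) shows a mean accuracy rating of 3.19 and a mean alignment rating of 3.13, each with a median of 1. Both ratings use a 1-to-10 scale, so a median of 1 means that at least half of the rated predictions received the lowest score: reviewers judged most predicted narratives to be neither accurate representations of the articles nor close matches to the manually assigned narratives. The same summary reports a mean similarity of 0 for this dataset. For predicted_no_score (727 rows), the mean similarity is 0.102 and the median is 0, so for a non-negative score at least half of the predicted narratives show no similarity to their manual counterparts.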

Alignment Ratings (Sheet 1)

Similarity Scores (Sheet 2)
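
The plotting code for the figures is not included in the report. As a rough sketch, histograms and box plots like those described for the visualization chunks could be built with ggplot2 as follows. The function names plot_histogram and plot_boxplot, the column name similarity, and the bin width of 0.05 are all assumptions.

```r
library(ggplot2)

# Hypothetical plotting helpers (not from the report). `df` is assumed to
# be a data frame with a numeric column called `similarity`.
plot_histogram <- function(df) {
  ggplot(df, aes(x = similarity)) +
    geom_histogram(binwidth = 0.05) +
    labs(title = "Similarity Scores (Sheet 2)",
         x = "Similarity", y = "Count")
}

plot_boxplot <- function(df) {
  ggplot(df, aes(y = similarity)) +
    geom_boxplot() +
    labs(title = "Similarity Scores (Sheet 2)", y = "Similarity")
}
```

The same two helpers would work for the rating columns by swapping the column name and using a bin width of 1 to match the integer 1-to-10 scale.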