In this file, EA cross-checks the latest scaling of the final acts' summaries.
setwd("C:/Users/nasta/Dropbox/____Nordface_POst_doc/data/EurLex/Scaled summaries/Scaling _Proposal_and_Final_acts_092021/")
data <- read.csv("lp_fa_scaling.csv")
# note: '2011/0298(COD)' is listed twice; this does not affect %in% matching
overestimated_examples <- c('2013/0015(COD)', '2011/0298(COD)', '2016/0208(COD)',
                            '2017/0220(COD)', '2016/0275(COD)', '2010/0253(COD)',
                            '2010/0101(COD)', '2010/0242(COD)', '2011/0298(COD)',
                            '2011/0353(COD)', '2011/0356(COD)', '2013/0080(COD)',
                            '2013/0139(COD)', '2014/0257(COD)', '2014/0280(COD)')
library(readxl)  # provides read_excel()
examples_df <- read_excel("examples_overest_and_background_in_content.xlsx")
Checking whether anything changed for the cases where both EA and AK agreed that overestimation was evident in the prior scaling attempts.
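A minimal sketch of that check, assuming the scaling file identifies procedures in a column named cod and stores the two scaling rounds in columns prob_may and prob_sept; none of these column names are confirmed here, so adjust them to the actual layout of lp_fa_scaling.csv.

# Subset the flagged procedures and compare the two scaling rounds side by side
# (cod, prob_may, prob_sept are assumed column names, see note above)
overest <- data[data$cod %in% overestimated_examples, ]
overest[, c("cod", "prob_may", "prob_sept")]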
The results suggest that the scaling of the final acts is now more conservative.
Plotting a histogram that captures the difference between the final-act scalings carried out in May and September.
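A sketch of how the difference and the histogram could be produced, reusing the assumed column names from above; the table() call on the decrease indicator is our best guess at what generated the FALSE/TRUE counts printed below.

# Change in the final-act scaling between the two rounds
data$diff_fa <- data$prob_sept - data$prob_may

hist(data$diff_fa,
     breaks = 40,
     main = "Change in final-act scaling (September minus May)",
     xlab = "Difference in scaled probability")

# Count how many summaries were scaled lower in September than in May
table(data$diff_fa < 0)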
## 
## FALSE  TRUE 
##   275   589
Let’s take a look at the cases whose probability changed by more than 0.35 points along the probability scale. Overall, there is a substantial decrease in the probabilities scaled for the summaries of the final acts.
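The selection of these large movers could be sketched as follows, again under the assumed column names:

# Cases whose scaled probability moved by more than 0.35 points between rounds
big_moves <- data[abs(data$diff_fa) > 0.35, ]
nrow(big_moves)
big_moves[order(big_moves$diff_fa), c("cod", "prob_may", "prob_sept", "diff_fa")]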
No particular pattern emerges when we check whether the change is driven by policy area.
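One way to eyeball this is a cross-tabulation of large changes against policy area; policy_area is a hypothetical column name standing in for however the policy field is coded in the data.

# Are changes of more than 0.35 points concentrated in particular policy areas?
table(data$policy_area, abs(data$diff_fa) > 0.35)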
In this step, we scrutinize the cases in which Coder 1's validation results diverge from the model predictions (relying on a dichotomous indicator).
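A sketch of the agreement check for the final acts: coder1_fa and pred_fa are hypothetical column names for the manual dichotomous code and the dichotomised model prediction, while the frequency table itself can be produced by janitor::tabyl(), whose n / percent / valid_percent columns match the output below.

library(janitor)

# agree is TRUE when Coder 1's dichotomous code matches the model's
# dichotomised prediction, and NA where no manual validation exists
coder_model_FA <- data
coder_model_FA$agree <- coder_model_FA$coder1_fa == coder_model_FA$pred_fa
tabyl(coder_model_FA$agree)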
##  coder_model_FA$agree   n    percent valid_percent
##                 FALSE   9 0.01040462         0.072
##                  TRUE 116 0.13410405         0.928
##                  <NA> 740 0.85549133            NA
We also check how many disagreements there are for the predictions for the proposals, essentially comparing the dichotomous classification from the manual validation with the model's prediction.
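The same check for the proposals, with coder1_lp and pred_lp again standing in as hypothetical column names:

coder_model_LP <- data
coder_model_LP$agree <- coder_model_LP$coder1_lp == coder_model_LP$pred_lp
tabyl(coder_model_LP$agree)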
##  coder_model_LP$agree   n    percent valid_percent
##                 FALSE  14 0.01618497     0.1521739
##                  TRUE  78 0.09017341     0.8478261
##                  <NA> 773 0.89364162            NA
First, we draw up the list of mismatched cases for the final acts, then the list of mismatched cases for the proposals, as sketched below.
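A sketch of both lists, using the agreement flags built above; which() drops the NA rows, so only genuinely validated mismatches are returned.

# Procedures where Coder 1 and the model disagree, final acts first
mismatch_fa <- coder_model_FA[which(!coder_model_FA$agree), "cod"]
mismatch_lp <- coder_model_LP[which(!coder_model_LP$agree), "cod"]
mismatch_fa
mismatch_lp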