Section 1 of Exam 1: Regression Model

Red font is directions.

Green font is questions.

Blue font is answers.

  1. Using the formatting learned in the first class, type the following header in R Markdown:

See above header

Read the csv file titled Training Data into R. All rows of this dataset should be used as your training set in the development of a regression model. This dataset contains information about some of Pfizer’s patent applications. Each patent application has a unique ID given to it, and this ID is listed in the Patent_Number column. The column titled Cites_Patent_Count lists the number of patent citations made within the corresponding patent application, and the column titled Cited_by_Patent_Count lists the number of patents which cite the document identified by the Patent_Number. The Cited_by_Patent_Count column can be seen as a measure of the influence and strength of the innovation, and as such, it can be useful to try to predict this column to help firms understand the potential influence of their innovations.

training <- read.csv("C:/Users/justt/Desktop/School/621/Exams/Exam 1/Training Data.csv")
str(training)
## 'data.frame':    626 obs. of  3 variables:
##  $ Patent_Number        : chr  "PL 3341367 T3" "HR P20210871 T1" "CR 20210284 A" "US 2021/0205309 A1" ...
##  $ Cites_Patent_Count   : int  0 0 0 0 3 0 0 1 0 0 ...
##  $ Cited_by_Patent_Count: int  0 0 0 0 0 0 0 0 0 0 ...
  1. Use regression to try to predict the Cited_by_Patent_Count, with Cites_Patent_Count used as a covariate. Based on your results, write an equation using the following format: Cited_by_Patent_Count ≈ B1(Cites_Patent_Count) + B2, where B1 and B2 are real numbers
# Drop the Patent_Number identifier column; it is not used as a predictor
training <- training[ , -1]
colnames(training)
## [1] "Cites_Patent_Count"    "Cited_by_Patent_Count"
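# Rename the columns: B1 = Cites_Patent_Count (predictor), B2 = Cited_by_Patent_Count (response)
colnames(training) <- c("B1", "B2")
# Fit a simple linear regression of the response on the predictor
model1 <- lm(B2 ~ B1, data = training)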
model1
## 
## Call:
## lm(formula = B2 ~ B1, data = training)
## 
## Coefficients:
## (Intercept)           B1  
##     0.05226      0.00267
summary(model1)
## 
## Call:
## lm(formula = B2 ~ B1, data = training)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.9469 -0.0523 -0.0523 -0.0523  3.7661 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.0522603  0.0130896   3.993 7.32e-05 ***
## B1          0.0026705  0.0005322   5.018 6.81e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3223 on 624 degrees of freedom
## Multiple R-squared:  0.03879,    Adjusted R-squared:  0.03725 
## F-statistic: 25.18 on 1 and 624 DF,  p-value: 6.811e-07
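
Reading the coefficients off the output above, the fitted equation in the requested format can be written as below. In the exam's notation B1 is the slope and B2 is the intercept. In the code, however, B1 and B2 are the renamed predictor and response columns, so the row R labels `B1` holds the slope and the row labelled `(Intercept)` holds the exam's B2. The values are rounded as printed by `model1`.

```latex
\[
\text{Cited\_by\_Patent\_Count} \approx 0.00267 \cdot \text{Cites\_Patent\_Count} + 0.05226
\]
```

That is, B1 ≈ 0.00267 and B2 ≈ 0.05226.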

Fill in the blanks for the following statements:

  1. Your regression equation provides an “estimate” of the actual values for the numbers B1 and B2. There is a 90% probability that the actual value of B1 lies between _________ and ________.
confint(model1, level = 0.90)
##                     5 %        95 %
## (Intercept) 0.030697775 0.073822761
## B1          0.001793859 0.003547102

There is a 90% probability that the actual value of B1 lies between 0.001793859 and 0.003547102.

  1. There is a 90% probability that the actual value of B2 lies between ________ and ________.
confint(model1, level = 0.90)
##                     5 %        95 %
## (Intercept) 0.030697775 0.073822761
## B1          0.001793859 0.003547102

There is a 90% probability that the actual value of B2 lies between 0.030697775 and 0.073822761.
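
As a cross-check, the intervals reported by `confint()` can be reproduced by hand as estimate ± t × standard error. The sketch below is a minimal illustration, assuming the estimates, standard errors, and residual degrees of freedom printed by `summary(model1)`. The labels `B1_slope` and `B2_intercept` are illustrative names introduced here, not objects from the original code.

```r
# Estimates and standard errors copied from the summary(model1) output
est <- c(B2_intercept = 0.0522603, B1_slope = 0.0026705)
se  <- c(B2_intercept = 0.0130896, B1_slope = 0.0005322)
df  <- 624       # residual degrees of freedom
level <- 0.90    # confidence level

# Two-sided critical value of the t distribution for a 90% interval
t_crit <- qt(1 - (1 - level) / 2, df)

# Lower and upper bounds: estimate minus/plus t_crit * standard error
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci, 6))
```

Because the inputs are rounded, the bounds should agree with the `confint()` output only to several decimal places.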

  1. Based on Cook’s Distance, are there outliers that may possibly need to be removed and/or treated differently in the dataset? Create a plot using R to support your answer.
cooks.distance(model1)
##            1            2            3            4            5            6 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.810625e-05 2.174933e-05 
##            7            8            9           10           11           12 
## 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           13           14           15           16           17           18 
## 5.071376e-05 2.174933e-05 2.174933e-05 8.574799e-05 2.174933e-05 2.174933e-05 
##           19           20           21           22           23           24 
## 2.372145e-05 2.174933e-05 2.174933e-05 2.174933e-05 3.021095e-02 2.372145e-05 
##           25           26           27           28           29           30 
## 2.174933e-05 2.174933e-05 2.583357e-05 2.372145e-05 7.152876e-03 2.174933e-05 
##           31           32           33           34           35           36 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           37           38           39           40           41           42 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           43           44           45           46           47           48 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           49           50           51           52           53           54 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.583357e-05 2.174933e-05 2.174933e-05 
##           55           56           57           58           59           60 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           61           62           63           64           65           66 
## 2.174933e-05 2.174933e-05 2.174933e-05 3.612723e-05 2.174933e-05 2.174933e-05 
##           67           68           69           70           71           72 
## 2.174933e-05 6.914970e-03 2.174933e-05 2.174933e-05 3.612723e-05 2.174933e-05 
##           73           74           75           76           77           78 
## 6.580877e-05 6.111181e-02 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           79           80           81           82           83           84 
## 2.810625e-05 5.071376e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##           85           86           87           88           89           90 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 3.322685e-05 
##           91           92           93           94           95           96 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 7.152876e-03 6.580877e-05 
##           97           98           99          100          101          102 
## 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 4.284078e-04 2.174933e-05 
##          103          104          105          106          107          108 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          109          110          111          112          113          114 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          115          116          117          118          119          120 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          121          122          123          124          125          126 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.810625e-05 2.174933e-05 4.275628e-05 
##          127          128          129          130          131          132 
## 9.239691e-02 5.071376e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.372145e-05 
##          133          134          135          136          137          138 
## 2.174933e-05 1.273912e-02 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          139          140          141          142          143          144 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          145          146          147          148          149          150 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          151          152          153          154          155          156 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          157          158          159          160          161          162 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          163          164          165          166          167          168 
## 2.810625e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          169          170          171          172          173          174 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          175          176          177          178          179          180 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 7.381892e-02 2.174933e-05 
##          181          182          183          184          185          186 
## 2.838903e-02 2.174933e-05 2.174933e-05 2.174933e-05 1.570643e-03 2.174933e-05 
##          187          188          189          190          191          192 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          193          194          195          196          197          198 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          199          200          201          202          203          204 
## 2.174933e-05 1.335152e-04 7.152876e-03 2.174933e-05 3.612723e-05 2.174933e-05 
##          205          206          207          208          209          210 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          211          212          213          214          215          216 
## 2.372145e-05 2.174933e-05 2.810625e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          217          218          219          220          221          222 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 
##          223          224          225          226          227          228 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.372145e-05 
##          229          230          231          232          233          234 
## 2.174933e-05 2.174933e-05 8.849406e-01 2.174933e-05 2.174933e-05 2.174933e-05 
##          235          236          237          238          239          240 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          241          242          243          244          245          246 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          247          248          249          250          251          252 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          253          254          255          256          257          258 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          259          260          261          262          263          264 
## 3.021095e-02 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 3.612723e-05 
##          265          266          267          268          269          270 
## 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          271          272          273          274          275          276 
## 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          277          278          279          280          281          282 
## 1.222411e-04 2.174933e-05 2.174933e-05 2.174933e-05 2.810625e-05 2.174933e-05 
##          283          284          285          286          287          288 
## 2.174933e-05 2.583357e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          289          290          291          292          293          294 
## 2.174933e-05 2.174933e-05 2.174933e-05 3.929307e-05 2.174933e-05 2.174933e-05 
##          295          296          297          298          299          300 
## 2.174933e-05 9.727993e-02 9.369997e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          301          302          303          304          305          306 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          307          308          309          310          311          312 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          313          314          315          316          317          318 
## 2.174933e-05 2.174933e-05 2.174933e-05 3.963990e-04 2.174933e-05 2.174933e-05 
##          319          320          321          322          323          324 
## 7.152876e-03 2.174933e-05 1.796726e-02 2.174933e-05 2.174933e-05 2.810625e-05 
##          325          326          327          328          329          330 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          331          332          333          334          335          336 
## 2.174933e-05 2.372145e-05 2.174933e-05 1.913967e-02 2.174933e-05 2.174933e-05 
##          337          338          339          340          341          342 
## 2.174933e-05 7.185379e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          343          344          345          346          347          348 
## 2.174933e-05 2.485777e-03 2.174933e-05 3.322685e-05 2.174933e-05 1.118882e-04 
##          349          350          351          352          353          354 
## 2.174933e-05 2.174933e-05 7.702738e-04 2.372145e-05 2.174933e-05 2.174933e-05 
##          355          356          357          358          359          360 
## 6.996241e-03 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 
##          361          362          363          364          365          366 
## 2.174933e-05 2.174933e-05 3.322685e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          367          368          369          370          371          372 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 7.152876e-03 6.030046e-05 
##          373          374          375          376          377          378 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          379          380          381          382          383          384 
## 2.174933e-05 3.056231e-05 7.152876e-03 2.174933e-05 2.174933e-05 2.174933e-05 
##          385          386          387          388          389          390 
## 2.174933e-05 2.174933e-05 1.335152e-04 2.174933e-05 2.174933e-05 2.174933e-05 
##          391          392          393          394          395          396 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 
##          397          398          399          400          401          402 
## 2.174933e-05 2.174933e-05 5.528323e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          403          404          405          406          407          408 
## 2.174933e-05 2.174933e-05 3.021095e-02 2.174933e-05 2.372145e-05 2.174933e-05 
##          409          410          411          412          413          414 
## 3.056231e-05 7.152876e-03 2.174933e-05 2.174933e-05 3.929307e-05 2.174933e-05 
##          415          416          417          418          419          420 
## 2.174933e-05 2.174933e-05 7.111225e-02 2.174933e-05 5.528323e-05 7.152876e-03 
##          421          422          423          424          425          426 
## 2.174933e-05 4.655102e-05 8.574799e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          427          428          429          430          431          432 
## 2.174933e-05 2.174933e-05 2.174933e-05 3.056231e-05 2.174933e-05 2.583357e-05 
##          433          434          435          436          437          438 
## 2.174933e-05 2.583357e-05 2.174933e-05 2.810625e-05 2.174933e-05 2.174933e-05 
##          439          440          441          442          443          444 
## 2.174933e-05 2.174933e-05 1.222411e-04 2.174933e-05 5.071376e-05 2.174933e-05 
##          445          446          447          448          449          450 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          451          452          453          454          455          456 
## 3.056231e-05 2.174933e-05 2.174933e-05 2.174933e-05 6.030046e-05 2.174933e-05 
##          457          458          459          460          461          462 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 9.369997e-05 
##          463          464          465          466          467          468 
## 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          469          470          471          472          473          474 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          475          476          477          478          479          480 
## 2.174933e-05 2.810625e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          481          482          483          484          485          486 
## 3.322685e-05 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 2.174933e-05 
##          487          488          489          490          491          492 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.372145e-05 2.174933e-05 
##          493          494          495          496          497          498 
## 2.174933e-05 2.174933e-05 2.174933e-05 6.773844e-03 2.583357e-05 2.583357e-05 
##          499          500          501          502          503          504 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          505          506          507          508          509          510 
## 6.775879e-01 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          511          512          513          514          515          516 
## 2.174933e-05 2.174933e-05 2.583357e-05 2.583357e-05 2.810625e-05 2.372145e-05 
##          517          518          519          520          521          522 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          523          524          525          526          527          528 
## 2.174933e-05 2.174933e-05 3.612723e-05 2.174933e-05 2.810625e-05 2.583357e-05 
##          529          530          531          532          533          534 
## 2.174933e-05 2.174933e-05 7.848345e-05 2.174933e-05 1.150560e-03 2.174933e-05 
##          535          536          537          538          539          540 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          541          542          543          544          545          546 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          547          548          549          550          551          552 
## 2.174933e-05 2.583357e-05 2.174933e-05 2.174933e-05 2.810625e-05 2.174933e-05 
##          553          554          555          556          557          558 
## 2.174933e-05 7.152876e-03 2.174933e-05 2.174933e-05 3.021095e-02 2.174933e-05 
##          559          560          561          562          563          564 
## 2.174933e-05 7.185379e-05 2.174933e-05 2.174933e-05 3.612723e-05 2.174933e-05 
##          565          566          567          568          569          570 
## 2.174933e-05 2.174933e-05 2.635251e+00 1.523294e-02 2.174933e-05 2.174933e-05 
##          571          572          573          574          575          576 
## 2.174933e-05 8.756098e-03 2.174933e-05 2.174933e-05 6.914970e-03 2.174933e-05 
##          577          578          579          580          581          582 
## 2.174933e-05 3.021095e-02 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          583          584          585          586          587          588 
## 5.282779e-03 2.174933e-05 2.174933e-05 2.174933e-05 3.664304e-04 2.174933e-05 
##          589          590          591          592          593          594 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          595          596          597          598          599          600 
## 6.914970e-03 2.810625e-05 2.174933e-05 2.174933e-05 2.174933e-05 3.056231e-05 
##          601          602          603          604          605          606 
## 2.174933e-05 2.463827e+00 3.056231e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          607          608          609          610          611          612 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          613          614          615          616          617          618 
## 2.174933e-05 2.174933e-05 4.275628e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          619          620          621          622          623          624 
## 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 2.174933e-05 
##          625          626 
## 2.174933e-05 2.583357e-05
max(cooks.distance(model1))
## [1] 2.635251

Yes, there are outliers that may need to be removed or treated differently. The maximum Cook’s Distance is 2.635251, which exceeds the commonly used threshold of 1.0.

plot(model1)

The diagnostic plots show two points with a Cook’s Distance above 1.0: observations 567 (2.635251) and 602 (2.463827). These points should be considered outliers and removed. They could be the result of bad data capture or data anomalies.
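
Rather than scanning all 626 printed distances, the observations above the cutoff can be pulled out directly. The sketch below is a minimal illustration that uses only the five largest Cook's distances copied from the output above. The vector `cd` and the variable `threshold` are illustrative names, not part of the original code.

```r
# Five largest Cook's distances copied from the cooks.distance(model1) output,
# named by observation index; every other observation is smaller than these
cd <- c("231" = 8.849406e-01, "296" = 9.727993e-02, "505" = 6.775879e-01,
        "567" = 2.635251e+00, "602" = 2.463827e+00)

threshold <- 1   # cutoff used in the answer above

# Names (observation indices) of the points whose distance exceeds the cutoff
flagged <- names(cd)[cd > threshold]
print(flagged)

# With the fitted model in memory, the equivalent calls would be:
# which(cooks.distance(model1) > threshold)
# plot(model1, which = 4)   # Cook's distance plot only
```

This flags observations 567 and 602, matching the two points named in the answer.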

  1. Check the normality of residuals by creating a QQ-plot. Based on the plot, are the residuals normally distributed?
qqnorm(model1$residuals, main = "model1")
qqline(model1$residuals)

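No, the residuals are not normally distributed. The residual summary reported by summary(model1) points the same way: the first quartile, median, and third quartile are all -0.0523, while the maximum is 3.7661, which indicates a strongly right-skewed distribution. Normal Unit Scaling is recommended.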
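
A QQ-plot is a visual check; a numeric test can complement it. The sketch below is a minimal illustration on hypothetical stand-in data (mostly-zero counts generated with `rpois`), not the exam's training set. It applies the Shapiro-Wilk test from base R's `stats` package to the residuals of a fitted model. The names `x`, `y`, `fit`, and `sw` are assumptions made for the example.

```r
set.seed(1)

# Hypothetical stand-in data: mostly-zero counts, loosely mimicking citation counts
x <- rpois(626, lambda = 2)
y <- rpois(626, lambda = 0.05)
fit <- lm(y ~ x)

# Shapiro-Wilk test: a small p-value is evidence against normally distributed residuals
sw <- shapiro.test(residuals(fit))
print(sw$p.value)

# With the exam's model in memory, the equivalent call would be:
# shapiro.test(residuals(model1))
```

A p-value below the chosen significance level (for example 0.05) would support the conclusion drawn from the QQ-plot.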

Use the regression model that you created in #2 to predict the Cited_by_Patent_Count in each of the following scenarios:

  1. Cites_Patent_Count = 340
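
The prediction for this scenario follows from plugging 340 into the fitted equation. The sketch below is a minimal illustration that uses the coefficients printed by `summary(model1)`. The variable names `intercept`, `slope`, `cites_patent_count`, and `predicted` are introduced here for the example.

```r
# Coefficients copied from the summary(model1) output
intercept <- 0.0522603   # B2 in the exam's equation
slope     <- 0.0026705   # B1 in the exam's equation

# Scenario: Cites_Patent_Count = 340
cites_patent_count <- 340
predicted <- slope * cites_patent_count + intercept
print(predicted)

# With the fitted model in memory, the equivalent call would be
# (the predictor column was renamed to B1 earlier):
# predict(model1, newdata = data.frame(B1 = cites_patent_count))
```

With these rounded coefficients the predicted Cited_by_Patent_Count is approximately 0.96.
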
# Refit the model from #2
model1 <- lm(B2 ~ B1, data = training)
# Leverage (hat value) of each training observation
hatvalues(model1)
##           1           2           3           4           5           6 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001602470 0.001649210 
##           7           8           9          10          11          12 
## 0.001649210 0.001628178 0.001649210 0.001649210 0.001649210 0.001649210 
##          13          14          15          16          17          18 
## 0.001684219 0.001649210 0.001649210 0.001966906 0.001649210 0.001649210 
##          19          20          21          22          23          24 
## 0.001628178 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 
##          25          26          27          28          29          30 
## 0.001649210 0.001649210 0.001612598 0.001628178 0.001649210 0.001649210 
##          31          32          33          34          35          36 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##          37          38          39          40          41          42 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##          43          44          45          46          47          48 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##          49          50          51          52          53          54 
## 0.001649210 0.001649210 0.001649210 0.001612598 0.001649210 0.001649210 
##          55          56          57          58          59          60 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##          61          62          63          64          65          66 
## 0.001649210 0.001649210 0.001649210 0.001604795 0.001649210 0.001649210 
##          67          68          69          70          71          72 
## 0.001649210 0.001612598 0.001649210 0.001649210 0.001604795 0.001649210 
##          73          74          75          76          77          78 
## 0.001801030 0.057059770 0.001649210 0.001649210 0.001649210 0.001649210 
##          79          80          81          82          83          84 
## 0.001602470 0.001684219 0.001649210 0.001649210 0.001649210 0.001649210 
##          85          86          87          88          89          90 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001598568 
##          91          92          93          94          95          96 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001801030 
##          97          98          99         100         101         102 
## 0.001649210 0.001649210 0.001628178 0.001649210 0.004156863 0.001649210 
##         103         104         105         106         107         108 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         109         110         111         112         113         114 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         115         116         117         118         119         120 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         121         122         123         124         125         126 
## 0.001649210 0.001649210 0.001649210 0.001602470 0.001649210 0.001633604 
##         127         128         129         130         131         132 
## 0.053239702 0.001684219 0.001649210 0.001649210 0.001649210 0.001628178 
##         133         134         135         136         137         138 
## 0.001649210 0.025500044 0.001649210 0.001649210 0.001649210 0.001649210 
##         139         140         141         142         143         144 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         145         146         147         148         149         150 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         151         152         153         154         155         156 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         157         158         159         160         161         162 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         163         164         165         166         167         168 
## 0.001602470 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         169         170         171         172         173         174 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         175         176         177         178         179         180 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001801030 0.001649210 
##         181         182         183         184         185         186 
## 0.038683734 0.001649210 0.001649210 0.001649210 0.008314883 0.001649210 
##         187         188         189         190         191         192 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         193         194         195         196         197         198 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         199         200         201         202         203         204 
## 0.001649210 0.002352401 0.001649210 0.001649210 0.001604795 0.001649210 
##         205         206         207         208         209         210 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         211         212         213         214         215         216 
## 0.001628178 0.001649210 0.001602470 0.001649210 0.001649210 0.001649210 
##         217         218         219         220         221         222 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 0.001649210 
##         223         224         225         226         227         228 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 
##         229         230         231         232         233         234 
## 0.001649210 0.001649210 0.012638032 0.001649210 0.001649210 0.001649210 
##         235         236         237         238         239         240 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         241         242         243         244         245         246 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         247         248         249         250         251         252 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         253         254         255         256         257         258 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         259         260         261         262         263         264 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001604795 
##         265         266         267         268         269         270 
## 0.001649210 0.001628178 0.001649210 0.001649210 0.001649210 0.001649210 
##         271         272         273         274         275         276 
## 0.001649210 0.001649210 0.001628178 0.001649210 0.001649210 0.001649210 
##         277         278         279         280         281         282 
## 0.002264399 0.001649210 0.001649210 0.001649210 0.001602470 0.001649210 
##         283         284         285         286         287         288 
## 0.001649210 0.001612598 0.001649210 0.001649210 0.001649210 0.001649210 
##         289         290         291         292         293         294 
## 0.001649210 0.001649210 0.001649210 0.001616473 0.001649210 0.001649210 
##         295         296         297         298         299         300 
## 0.001649210 0.060213954 0.002033102 0.001649210 0.001649210 0.001649210 
##         301         302         303         304         305         306 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         307         308         309         310         311         312 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         313         314         315         316         317         318 
## 0.001649210 0.001649210 0.001649210 0.003992536 0.001649210 0.001649210 
##         319         320         321         322         323         324 
## 0.001649210 0.001649210 0.005257305 0.001649210 0.001649210 0.001602470 
##         325         326         327         328         329         330 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         331         332         333         334         335         336 
## 0.001649210 0.001628178 0.001649210 0.005667733 0.001649210 0.001649210 
##         337         338         339         340         341         342 
## 0.001649210 0.001850871 0.001649210 0.001649210 0.001649210 0.001649210 
##         343         344         345         346         347         348 
## 0.001649210 0.010654413 0.001649210 0.001598568 0.001649210 0.002181848 
##         349         350         351         352         353         354 
## 0.001649210 0.001649210 0.005667733 0.001628178 0.001649210 0.001649210 
##         355         356         357         358         359         360 
## 0.001717704 0.001649210 0.001649210 0.001628178 0.001649210 0.001649210 
##         361         362         363         364         365         366 
## 0.001649210 0.001649210 0.001598568 0.001649210 0.001649210 0.001649210 
##         367         368         369         370         371         372 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001756641 
##         373         374         375         376         377         378 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         379         380         381         382         383         384 
## 0.001649210 0.001597793 0.001649210 0.001649210 0.001649210 0.001649210 
##         385         386         387         388         389         390 
## 0.001649210 0.001649210 0.002352401 0.001649210 0.001649210 0.001649210 
##         391         392         393         394         395         396 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 0.001649210 
##         397         398         399         400         401         402 
## 0.001649210 0.001649210 0.001717704 0.001649210 0.001649210 0.001649210 
##         403         404         405         406         407         408 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 0.001649210 
##         409         410         411         412         413         414 
## 0.001597793 0.001649210 0.001649210 0.001649210 0.001616473 0.001649210 
##         415         416         417         418         419         420 
## 0.001649210 0.001649210 0.032597334 0.001649210 0.001717704 0.001649210 
##         421         422         423         424         425         426 
## 0.001649210 0.001656186 0.001966906 0.001649210 0.001649210 0.001649210 
##         427         428         429         430         431         432 
## 0.001649210 0.001649210 0.001649210 0.001597793 0.001649210 0.001612598 
##         433         434         435         436         437         438 
## 0.001649210 0.001612598 0.001649210 0.001602470 0.001649210 0.001649210 
##         439         440         441         442         443         444 
## 0.001649210 0.001649210 0.002264399 0.001649210 0.001684219 0.001649210 
##         445         446         447         448         449         450 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         451         452         453         454         455         456 
## 0.001597793 0.001649210 0.001649210 0.001649210 0.001756641 0.001649210 
##         457         458         459         460         461         462 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.002033102 
##         463         464         465         466         467         468 
## 0.001649210 0.001628178 0.001649210 0.001649210 0.001649210 0.001649210 
##         469         470         471         472         473         474 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         475         476         477         478         479         480 
## 0.001649210 0.001602470 0.001649210 0.001649210 0.001649210 0.001649210 
##         481         482         483         484         485         486 
## 0.001598568 0.001649210 0.001649210 0.001628178 0.001649210 0.001649210 
##         487         488         489         490         491         492 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001628178 0.001649210 
##         493         494         495         496         497         498 
## 0.001649210 0.001649210 0.001649210 0.001597793 0.001612598 0.001612598 
##         499         500         501         502         503         504 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         505         506         507         508         509         510 
## 0.051013082 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         511         512         513         514         515         516 
## 0.001649210 0.001649210 0.001612598 0.001612598 0.001602470 0.001628178 
##         517         518         519         520         521         522 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         523         524         525         526         527         528 
## 0.001649210 0.001649210 0.001604795 0.001649210 0.001602470 0.001612598 
##         529         530         531         532         533         534 
## 0.001649210 0.001649210 0.001906163 0.001649210 0.007029858 0.001649210 
##         535         536         537         538         539         540 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         541         542         543         544         545         546 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         547         548         549         550         551         552 
## 0.001649210 0.001612598 0.001649210 0.001649210 0.001602470 0.001649210 
##         553         554         555         556         557         558 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         559         560         561         562         563         564 
## 0.001649210 0.001850871 0.001649210 0.001649210 0.001604795 0.001649210 
##         565         566         567         568         569         570 
## 0.001649210 0.001649210 0.299599101 0.004326640 0.001649210 0.001649210 
##         571         572         573         574         575         576 
## 0.001649210 0.002264399 0.001649210 0.001649210 0.001612598 0.001649210 
##         577         578         579         580         581         582 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         583         584         585         586         587         588 
## 0.015981449 0.001649210 0.001649210 0.001649210 0.003833662 0.001649210 
##         589         590         591         592         593         594 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         595         596         597         598         599         600 
## 0.001612598 0.001602470 0.001649210 0.001649210 0.001649210 0.001597793 
##         601         602         603         604         605         606 
## 0.001649210 0.292432463 0.001597793 0.001649210 0.001649210 0.001649210 
##         607         608         609         610         611         612 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         613         614         615         616         617         618 
## 0.001649210 0.001649210 0.001633604 0.001649210 0.001649210 0.001649210 
##         619         620         621         622         623         624 
## 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 0.001649210 
##         625         626 
## 0.001649210 0.001612598
max(hatvalues(model1))
## [1] 0.2995991
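x_new <- c(1, 340)  # intercept term and the new Cites_Patent_Count value
X <- model.matrix(model1)
# Leverage of the new point: x_new' (X'X)^-1 x_new
t(x_new) %*% solve(t(X) %*% X) %*% x_new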
##           [,1]
## [1,] 0.3086801
  1. Cites_Patent_Count = 300
model1 <- lm(B2 ~ B1, data = training)
max(hatvalues(model1))
## [1] 0.2995991
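x_new <- c(1, 300)  # intercept term and the new Cites_Patent_Count value
X <- model.matrix(model1)
# Leverage of the new point: x_new' (X'X)^-1 x_new
t(x_new) %*% solve(t(X) %*% X) %*% x_new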
##           [,1]
## [1,] 0.2398486
  1. Was the prediction made in part a of #3 considered to be extrapolation?

Yes, this is extrapolation because the leverage of the new point, 0.3086801, is greater than the maximum leverage among the training observations, 0.2995991.

  1. Was the prediction made in part b of #3 considered to be extrapolation?

No, this is not extrapolation because the leverage of the new point, 0.2398486, is less than the maximum leverage among the training observations, 0.2995991.
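
The leverage comparison used in the two answers above can be wrapped in a small helper. The following is a minimal sketch on made-up toy data, not the exam's training set; the names `toy`, `fit`, and `new_leverage` are hypothetical. It also checks the matrix formula against the closed form for simple linear regression, 1/n + (x - mean(x))^2 / Sxx.

```r
# Toy data standing in for the training set (values are made up).
toy <- data.frame(
  B1 = c(0, 0, 1, 3, 5, 8, 20, 120, 330),
  B2 = c(0, 0, 0, 1, 0, 0, 1, 0, 2)
)
fit <- lm(B2 ~ B1, data = toy)

# Hypothetical helper: leverage of a new covariate value under a fitted
# simple linear regression, x_new' (X'X)^-1 x_new.
new_leverage <- function(model, x) {
  X <- model.matrix(model)
  x_new <- c(1, x)  # intercept term plus covariate value
  drop(t(x_new) %*% solve(crossprod(X)) %*% x_new)
}

h_max <- max(hatvalues(fit))
h_340 <- new_leverage(fit, 340)
h_300 <- new_leverage(fit, 300)

# Closed form for simple linear regression.
n   <- nrow(toy)
sxx <- sum((toy$B1 - mean(toy$B1))^2)
h_340_closed <- 1 / n + (340 - mean(toy$B1))^2 / sxx

c(h_340 = h_340, h_300 = h_300, h_max = h_max)
h_340 > h_max  # TRUE here: the point is flagged as extrapolation
h_300 > h_max  # FALSE here: the point is not flagged
```

In the toy data the largest covariate value is 330, so a new value of 340 lies farther from the mean than any training point and is flagged, while 300 is not. This mirrors the pattern of the exam answers, though the numbers differ.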

Fill in the blanks based on the regression model that you created in #2, and the corresponding prediction that you made in part b of #3:

  1. When Cites_Patent_Count is 300 (as in part b of #3), there is a 95% chance that Cited_by_Patent_Count will be between _________ and __________.
B2_pred = data.frame(B1 = c(300))
B2_pred
##    B1
## 1 300
predict(model1, B2_pred, type = "response")
##         1 
## 0.8534046
predict(model1, B2_pred, interval = "prediction", level = .95, type = "response")
##         fit       lwr      upr
## 1 0.8534046 0.1486074 1.558202

When Cites_Patent_Count is 300 (as in part b of #3), there is a 95% chance that Cited_by_Patent_Count will be between 0.1486074 and 1.558202.

  1. When Cites_Patent_Count is 300, there is a 95% chance that the mean response will fall between _________ and _________.
predict(model1, B2_pred, interval = "confidence", level = 0.95, type = "response")
##         fit       lwr      upr
## 1 0.8534046 0.5434141 1.163395

When Cites_Patent_Count is 300, there is a 95% chance that the mean response will fall between 0.5434141 and 1.163395.
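
The two intervals above differ only in one term of the standard error, and both depend on the leverage of the new point. The sketch below rebuilds them by hand on made-up toy data (the names `toy`, `fit`, `conf_int`, and `pred_int` are hypothetical) and compares the results with `predict()`.

```r
# Toy data standing in for the training set (values are made up).
toy <- data.frame(
  B1 = c(0, 0, 1, 3, 5, 8, 20, 120, 330),
  B2 = c(0, 0, 0, 1, 0, 0, 1, 0, 2)
)
fit <- lm(B2 ~ B1, data = toy)
new <- data.frame(B1 = 300)

X     <- model.matrix(fit)
x_new <- c(1, 300)
h_new <- drop(t(x_new) %*% solve(crossprod(X)) %*% x_new)  # leverage of new point

y_hat  <- sum(coef(fit) * x_new)            # fitted value at the new point
s      <- sigma(fit)                        # residual standard error
t_crit <- qt(0.975, df = df.residual(fit))  # two-sided 95% critical value

# Interval for the mean response: uncertainty in the fitted line only.
conf_int <- y_hat + c(-1, 1) * t_crit * s * sqrt(h_new)
# Interval for a single new observation: adds the residual variance ("1 +").
pred_int <- y_hat + c(-1, 1) * t_crit * s * sqrt(1 + h_new)

# Reference values from predict() for comparison.
ci_ref <- predict(fit, new, interval = "confidence", level = 0.95)
pi_ref <- predict(fit, new, interval = "prediction", level = 0.95)
```

Under these formulas the ratio of the two half-widths is sqrt(h / (1 + h)). With the leverage 0.2398486 reported for Cites_Patent_Count = 300, that ratio is about 0.44, which agrees with the intervals printed above (half-widths of roughly 0.31 and 0.70).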

Section 2 of Exam 1: Joining Data

  1. Type the following header in R Markdown, using formatting learned in the first class:

See above header

  1. Type the following in R Markdown, being sure to make the phrase “mutating joins” appear in bold in your HTML document:

The four types of mutating joins learned in class are:

  1. Open the Training Data and Sequence Counts csv files, and notice that each file has a column listing the patent application number. For full credit, perform the following joins in R without changing any column titles in either dataset:
train <- read.csv("C:/Users/justt/Desktop/School/621/Exams/Exam 1/Training Data.csv")
sc <- read.csv("C:/Users/justt/Desktop/School/621/Exams/Exam 1/Sequence Counts.csv")
str(train)
## 'data.frame':    626 obs. of  3 variables:
##  $ Patent_Number        : chr  "PL 3341367 T3" "HR P20210871 T1" "CR 20210284 A" "US 2021/0205309 A1" ...
##  $ Cites_Patent_Count   : int  0 0 0 0 3 0 0 1 0 0 ...
##  $ Cited_by_Patent_Count: int  0 0 0 0 0 0 0 0 0 0 ...
str(sc)
## 'data.frame':    222 obs. of  2 variables:
##  $ Patent_No.    : chr  "CA 189065 S" "NI 202000072 A" "KR 20210032013 A" "PH 12020550461 A1" ...
##  $ Sequence_Count: int  0 0 80 0 0 0 0 0 0 0 ...
  1. Perform an inner join
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.1
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(dplyr)
joined_data <- train %>% inner_join(sc, by = c("Patent_Number" = "Patent_No."))
head(joined_data)
##       Patent_Number Cites_Patent_Count Cited_by_Patent_Count Sequence_Count
## 1       CA 189065 S                  0                     0              0
## 2    NI 202000072 A                  0                     0              0
## 3  KR 20210032013 A                  0                     0             80
## 4 PH 12020550461 A1                  0                     0              0
## 5      TW I722568 B                  0                     0              0
## 6    CN 112533674 A                  0                     2              0
  1. Perform a full join
joined_data1 <- train %>% full_join(sc, by = c("Patent_Number" = "Patent_No."))
head(joined_data1)
##        Patent_Number Cites_Patent_Count Cited_by_Patent_Count Sequence_Count
## 1      PL 3341367 T3                  0                     0             NA
## 2    HR P20210871 T1                  0                     0             NA
## 3      CR 20210284 A                  0                     0             NA
## 4 US 2021/0205309 A1                  0                     0             NA
## 5    JP 2021100972 A                  3                     0             NA
## 6  AU 2021/203768 A1                  0                     0             NA
  1. Join the two datasets so that only the patents in the Sequence Counts dataset are included
joined_data2 <- train %>% right_join(sc, by = c("Patent_Number" = "Patent_No."))
head(joined_data2)
##       Patent_Number Cites_Patent_Count Cited_by_Patent_Count Sequence_Count
## 1       CA 189065 S                  0                     0              0
## 2    NI 202000072 A                  0                     0              0
## 3  KR 20210032013 A                  0                     0             80
## 4 PH 12020550461 A1                  0                     0              0
## 5      TW I722568 B                  0                     0              0
## 6    CN 112533674 A                  0                     2              0
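Because `head()` shows only the first six rows, the difference between the joins is easier to see in the row counts. The sketch below uses two small made-up tables (`train_toy` and `sc_toy` are hypothetical names and values) whose key columns are named differently, as in the exam files.

```r
library(dplyr)

# Toy tables with differently named key columns (values are made up).
train_toy <- data.frame(
  Patent_Number      = c("A1", "B2", "C3", "D4"),
  Cites_Patent_Count = c(0, 3, 1, 0)
)
sc_toy <- data.frame(
  Patent_No.     = c("B2", "D4", "E5"),
  Sequence_Count = c(80, 0, 12)
)
key <- c("Patent_Number" = "Patent_No.")

inner <- inner_join(train_toy, sc_toy, by = key)  # keys present in both tables
left  <- left_join(train_toy, sc_toy, by = key)   # every row of train_toy
right <- right_join(train_toy, sc_toy, by = key)  # every row of sc_toy
full  <- full_join(train_toy, sc_toy, by = key)   # every key from either table

# Row counts per join type.
sapply(list(inner = inner, left = left, right = right, full = full), nrow)
```

In the right join, the patent that appears only in `sc_toy` is kept with `NA` in the columns that come from `train_toy`, and the key column keeps the name `Patent_Number`.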
  1. Read the Training and Testing Data csv file into R. This file contains a binary column called Partition which is 0 when the patent was part of the training set used to develop the regression model in #2, and is equal to 1 otherwise. Use R to perform the following on this dataset:
tandt <- read.csv("C:/Users/justt/Desktop/School/621/Exams/Exam 1/Training and Testing Data.csv")
str(tandt)
## 'data.frame':    725 obs. of  4 variables:
##  $ Patent_Number        : chr  "PL 3341367 T3" "HR P20210871 T1" "CR 20210284 A" "US 2021/0205309 A1" ...
##  $ Cites_Patent_Count   : int  0 0 0 0 3 0 0 1 0 0 ...
##  $ Cited_by_Patent_Count: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Partition            : int  0 0 0 0 0 0 0 0 0 0 ...
  1. Group the dataset by Partition. Summarize each grouping by calculating the standard deviation of Cites_Patent_Count for each group.
by_part <- group_by(tandt, Partition)
Stan_dev <- summarise(by_part, cpc = sd(Cites_Patent_Count, na.rm = TRUE))
Stan_dev
## # A tibble: 2 × 2
##   Partition   cpc
##       <int> <dbl>
## 1         0  24.2
## 2         1  16.7
arrange(Stan_dev, desc(cpc))
## # A tibble: 2 × 2
##   Partition   cpc
##       <int> <dbl>
## 1         0  24.2
## 2         1  16.7
  1. Identify the top 3 patents with the highest number of citations in the Cited_by_Patent_Count column.
  1. Filter the dataset to display only those rows for which the Cites_Patent_Count is greater than zero.
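No answer is shown for the last two items. One possible approach with dplyr is sketched below on made-up data. `tandt_toy` is a hypothetical stand-in for the Training and Testing Data file, and this is not necessarily the method expected in the exam.

```r
library(dplyr)

# Toy stand-in for the Training and Testing Data file (values are made up).
tandt_toy <- data.frame(
  Patent_Number         = c("A1", "B2", "C3", "D4", "E5"),
  Cites_Patent_Count    = c(0, 3, 0, 12, 1),
  Cited_by_Patent_Count = c(2, 0, 7, 1, 5),
  Partition             = c(0, 0, 0, 1, 1)
)

# Top 3 patents by Cited_by_Patent_Count.
# with_ties = FALSE caps the result at exactly 3 rows even when counts tie.
top3 <- tandt_toy %>%
  slice_max(Cited_by_Patent_Count, n = 3, with_ties = FALSE)

# Rows for which Cites_Patent_Count is greater than zero.
citing <- tandt_toy %>%
  filter(Cites_Patent_Count > 0)

top3
citing
```

On the real data, where most counts are zero, ties are likely. Leaving `with_ties` at its default of `TRUE` would return every row tied for third place instead of cutting the result off at three.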