Install and load the pdftools package
library("pdftools")
## Using poppler version 22.04.0
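The step above only loads the package. A minimal sketch of installing it first when it is missing is shown here; `ensure_package` is a hypothetical helper name, not part of pdftools, and it assumes a CRAN mirror is reachable.

```r
# Hypothetical helper: install a package only if it is not already available.
# The `install` argument defaults to utils::install.packages.
ensure_package <- function(pkg, install = utils::install.packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install(pkg)
  }
  # TRUE if the package can be loaded after the check
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (left commented out so that nothing is installed by accident):
# ensure_package("pdftools")
# library("pdftools")
```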
Download the PDF file from its link, or point to a local copy
#pdf.file <- "https://dohs.gov.np/wp-content/uploads/2022/07/DoHS-Annual-Report-FY-2077-78-date-5-July-2022-2022_FINAL.pdf"
#download.file(pdf.file, destfile = "sample.pdf", mode = "wb")
#pdf.text <- pdftools::pdf_text("sample.pdf")
# or use a copy that is already saved on disk:
pdf.file <- "C:/Users/Dell/Desktop/DoHS-Annual-Report-FY-2077-78-date-5-July-2022-2022_FINAL.pdf"
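The two options above (download or local copy) can be combined so the file is fetched only once. This is a rough sketch under stated assumptions: `download_if_missing` is a hypothetical helper, and the `downloader` argument exists only so the function can be tried without network access.

```r
# Hypothetical helper: download `url` to `destfile` unless the file already exists.
download_if_missing <- function(url, destfile, downloader = utils::download.file) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the PDF intact on Windows
    downloader(url, destfile = destfile, mode = "wb")
  }
  destfile
}

# Usage (commented out because it needs an internet connection):
# pdf.file <- download_if_missing(
#   "https://dohs.gov.np/wp-content/uploads/2022/07/DoHS-Annual-Report-FY-2077-78-date-5-July-2022-2022_FINAL.pdf",
#   "sample.pdf"
# )
```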
Extract the text from the PDF (one string per page)
pdf.text <- pdftools::pdf_text(pdf.file)
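Clean the text: collapse repeated spaces and split it into lines
# collapse runs of two or more spaces into one
# (a single-space pattern with replacement "" would strip every space)
pdf.text <- gsub(" {2,}", " ", pdf.text)
# join all pages into one string, then split it at line breaks
pdf.text <- paste(pdf.text, collapse = " ")
pdf.text <- strsplit(pdf.text, "\n")[[1]]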
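To see what the cleaning step does without opening a PDF, here is a small sketch on made-up page strings. `pages_to_lines` and the toy input are assumptions for illustration; unlike the code above, this variant also trims each line and drops empty ones.

```r
# Hypothetical helper: turn a character vector of pages into clean lines.
pages_to_lines <- function(pages) {
  lines <- unlist(strsplit(pages, "\n", fixed = TRUE))  # one element per line
  lines <- gsub(" {2,}", " ", lines)                    # collapse runs of spaces
  lines <- trimws(lines)                                # drop leading/trailing spaces
  lines[nzchar(lines)]                                  # drop empty lines
}

# Toy input standing in for the output of pdftools::pdf_text()
toy_pages <- c("Title  page\n\nPNC   visits",
               "Table 5\n  PNC as per protocol ")
toy_lines <- pages_to_lines(toy_pages)
print(toy_lines)
```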
Enter the word you want to extract (e.g., "Province 1"); this example searches for "PNC"
PNC <- grep("PNC", pdf.text, value = TRUE, ignore.case = TRUE)
Checking the results
PNC
## [1] "MDT Multi-drug therapy PNC Postnatal care"
## [2] " had three PNC16 19 25 22152220 29 40 4650 90"
## [3] "The proportion of mothers attending three PNC visits as per the protocol increased remarkably from 18.8"
## [4] " •promotion of antenatal care (ANC), institutional delivery and postnatal care (PNC) (iron, tetanus"
## [5] "v.Formulated the Guideline on PNC Home Visit Micro-planning at the local level 2021."
## [6] "introduced the monitoring of three PNC visits according to a protocol since 2071/72."
## [7] " Figure 8: % PNC as per protcol"
## [8] "The proportion of mothers attending three PNC visits as per the protocol increased from 19 percent in"
## [9] "PNC as per protocol in 2077/78. It is important to note that proportion of women attending three PNC has"
## [10] " PNC home visit (micro-planning for PNC)"
## [11] " deaths occur during post-natal period. As reported above in PNC section, women who received PNC"
## [12] " gradually. Till FY 2077/78, It has been expanded in to 396 Municipals from 50 districts to strengthen PNC"
## [13] " services by mobilizing MNH service providers from health facilities to provide PNC at women’s home. Out"
## [14] " (where majority of the local levels were able to implement the program) the PNC as per protocol is 35% of"
## [15] "Table 4: Institutional Delivery and PNC visit within 24 hours"
## [16] " Institutional Delivery PNC visit within 24 hours"
## [17] "Table 5: PNC visits as per protocol"
## [18] " PNC visits as per protocol"
## [19] "4 ANCs as per protocol and only around a quarter of women receiving 3 PNC as per protocol in 2077/78."
## [20] " Percentage of women who had 3 PNC check-ups as per protocol"
## [21] "low PNC coverage"
## [22] " • Continuation of PNC home visit throughout"
## [23] "PNC 43,572 37,707 39,330 22,51023,928"
## [24] "TUTH, Maharajgunj, Kathmandu SBA,NICU, ICU, OTTM, PNC, Medico-"
## [25] "Kanti Children Hospital, Kathmandu Pediatric Nursing care(PNC)"
## [26] " AMDA Hospital, Butwal OTTM, PNC, SBA"
## [27] "88 Capacity Enhance Program for PNC service ExpansionTimes 2"
## [28] "2 Pediatric Nursing Training (PNC) Person 20"
## [29] " % of PNC"
## [30] "who receivedattended by a % of normal(Vaccum who had 3 PNC"
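The results above show only the matching text. If you also want to know where each match sits in the document, a sketch like the following keeps the line index next to the text. `find_keyword` and the toy lines are hypothetical; the keyword is treated as a regular expression, as in `grep()`.

```r
# Hypothetical helper: return matching lines together with their positions.
find_keyword <- function(lines, keyword, ignore_case = TRUE) {
  hits <- grep(keyword, lines, ignore.case = ignore_case)  # indices of matches
  data.frame(line = hits,
             text = trimws(lines[hits]),
             stringsAsFactors = FALSE)
}

# Toy lines standing in for pdf.text
toy_text <- c("MDT Multi-drug therapy PNC Postnatal care",
              "Institutional delivery",
              " low pnc coverage")
res <- find_keyword(toy_text, "PNC")
print(res)
```

Because the search ignores case and matches anywhere in a line, unrelated uses of the same abbreviation are returned as well, so the hits still need a manual check.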
Saving the extracted results
write.csv(PNC, file = "pnc.csv")    # to CSV
write.table(PNC, file = "pnc.txt")  # to text
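As an alternative way of saving, the sketch below writes the text file with `writeLines()`, which avoids the row numbers and quotes that `write.table()` adds by default, and writes the CSV without row names. The file names and toy vector are assumptions; files go to a temporary folder so nothing is overwritten.

```r
# Toy result standing in for the PNC vector
toy_hits <- c("PNC visits as per protocol", "low PNC coverage")

out_dir  <- tempdir()
csv_path <- file.path(out_dir, "pnc_example.csv")
txt_path <- file.path(out_dir, "pnc_example.txt")

# CSV with a named column and no row-number column
write.csv(data.frame(text = toy_hits), file = csv_path, row.names = FALSE)

# Plain text, one match per line
writeLines(toy_hits, con = txt_path)

# Read both files back to confirm what was written
csv_back <- read.csv(csv_path, stringsAsFactors = FALSE)
txt_back <- readLines(txt_path)
```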