Theory

Optical character recognition (OCR) is a technology used to convert different types of documents, such as images, scanned printed documents, photographs of text, PDF files, or images captured with a camera, into editable and searchable data.

Install packages and load libraries

#install.packages("tesseract") # OCR
library(tesseract)
#install.packages("magick") # PNG
library(magick)

## Warning: package 'magick' was built under R version 4.3.3

## Linking to ImageMagick 6.9.12.93
## Enabled features: cairo, fontconfig, freetype, heic, lcms, pango, raw, rsvg, webp
## Disabled features: fftw, ghostscript, x11

#install.packages("officer") # Word
library(officer)

## Warning: package 'officer' was built under R version 4.3.2

#install.packages("pdftools") # pdf
library(pdftools)

## Using poppler version 23.04.0
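The commented-out `install.packages()` calls above have to be enabled by hand the first time the document is run. As a minimal sketch, assuming the same four packages, the check can be done with base R only. The names `pkgs`, `installed`, and `missing_pkgs` are illustrative and not part of the original document.

```r
# Packages loaded in this document
pkgs <- c("tesseract", "magick", "officer", "pdftools")

# requireNamespace() returns FALSE (instead of raising an error)
# when a package is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
}
```

The install line is left commented out, as in the original chunk, so knitting the document never triggers a download on its own.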

Extract text from a PNG image

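# file.choose()  # uncomment to pick the file interactively and copy its path
# Read the screenshot with magick
imagen1 <- image_read("/Users/yessicaacosta/Desktop/Captura de Pantalla 2024-08-12 a la(s) 15.33.13.png")
# Run Tesseract OCR on the image (uses the English model by default)
texto1 <- ocr(imagen1)
texto1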

## [1] "¢ In supervised learning the agent observes input-output pairs and learns a function that Supervised learning\nmaps from input to output. For example, the inputs could be camera images, each\none accompanied by an output saying “bus” or “pedestrian,” etc. An output like this\nis called a label. The agent learns a function that, when given a new image, predicts Label\nthe appropriate label. In the case of braking actions (component 1 above), an input is\nthe current state (speed and direction of the car, road condition), and an output is the\ndistance it took to stop. In this case a set of output values can be obtained by the agent\nfrom its own percepts (after the fact); the environment is the teacher, and the agent\nlearns a function that maps states to stopping distance.\n¢ In unsupervised learning the agent learns patterns in the input without any explicit leone sed\nfeedback. The most common unsupervised learning task is clustering: detecting poten-\ntially useful clusters of input examples. For example, when shown millions of images\ntaken from the Internet, a computer vision system can identify a large cluster of similar\nimages which an English speaker would call “cats.”\n¢ In reinforcement learning the agent learns from a series of reinforcements: rewards Reinforcement\nand punishments. For example, at the end of a chess game the agent is told that it has\nwon (a reward) or lost (a punishment). It is up to the agent to decide which of the\nactions prior to the reinforcement were most responsible for it, and to alter its actions\nto aim towards more rewards in the future.\n"
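The result is a single string in which line breaks are shown as `\n` and each bullet from the page was read as a cent sign (`¢`). The following is a minimal base R sketch of one way to tidy it up. It uses the last bullet of the output above as a stand-in for `texto1`, and the names `ocr_text` and `text_lines` are illustrative.

```r
# Excerpt of the OCR output shown above (last bullet), standing in for texto1.
# "\u00a2" is the cent sign that appears where the page had a bullet.
ocr_text <- paste0(
  "\u00a2 In reinforcement learning the agent learns from a series of reinforcements: rewards Reinforcement\n",
  "and punishments. For example, at the end of a chess game the agent is told that it has\n",
  "won (a reward) or lost (a punishment). It is up to the agent to decide which of the\n",
  "actions prior to the reinforcement were most responsible for it, and to alter its actions\n",
  "to aim towards more rewards in the future.\n"
)

# Split the single string into one element per printed line
text_lines <- strsplit(ocr_text, "\n", fixed = TRUE)[[1]]

# Replace a leading cent sign with a plain list marker
text_lines <- sub("^\u00a2 ", "- ", text_lines)

# Print the text with real line breaks instead of escaped "\n"
cat(text_lines, sep = "\n")
```

Words such as "Reinforcement" at the end of the first line appear to be margin notes from the page that were merged into the body text. This sketch leaves them in place, so they would still need manual review.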


OCR

Yessica Acosta

2024-08-14
