Prediction of fake news by machine learning

Pedro Carvalho Brom
2016-12-09

What do we have here?

In this repository you will find the code for the Shiny application that classifies news as false vs. true. There are only two files, ui.R and server.R, which show how the page was built. The links to the predictive model have been removed, since the project is limited to running the Shiny application.

About the prediction model used

The technique used is logistic regression, applied as a machine learning classifier. The intent is to estimate the probability that a news item is true or false from its character and word metrics, using only the title and the news text. The database was scraped from www.e-farsas.com and contains a total of 891 posts, of which approximately 74% are false. That is, among the viral stories covered by the site, false ones outnumber true ones.
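To make the idea concrete, here is a minimal sketch of a logistic regression on character and word counts. It is not the model used in the application, whose code is not part of this repository. The helper `text_metrics`, the column names, and the simulated training data are all assumptions made up for illustration; only `glm` and `predict` from base R are real API.

```r
# Hypothetical helper: character and word counts for a title and a news text.
text_metrics <- function(title, body) {
  count_words <- function(x) lengths(strsplit(trimws(x), "\\s+"))
  data.frame(
    title_chars = nchar(title),
    title_words = count_words(title),
    body_chars  = nchar(body),
    body_words  = count_words(body)
  )
}

# Simulated training data (NOT the e-farsas posts): word counts plus a
# 0/1 label, generated so that roughly three quarters of the posts are false.
set.seed(1)
n <- 200
toy <- data.frame(
  title_words = rpois(n, 8),
  body_words  = rpois(n, 300)
)
toy$is_false <- rbinom(
  n, 1,
  plogis(1 + 0.05 * (toy$title_words - 8) - 0.002 * (toy$body_words - 300))
)

# Logistic regression: probability that a post is false given its metrics.
fit <- glm(is_false ~ title_words + body_words, data = toy, family = binomial)

# Score a new post; the body text here is a placeholder.
new_post <- text_metrics("China opens high bus!", "placeholder news text")
p_false  <- predict(fit, newdata = new_post, type = "response")
```

`p_false` is the estimated probability that the post is false under this toy model. With real data, the training frame would hold one row of metrics per scraped post and the label assigned by the site.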

Example usage

Let's use the post “China opens high bus!” as an example. According to the website, the original news is false. Paste the title and the news text into the respective input fields, click 'Consultar' (Portuguese for 'Query'), and wait approximately 15 seconds.
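The workflow described above (two text fields, a 'Consultar' button, and a result) can be sketched as a small Shiny app. This is an illustrative skeleton, not the contents of the repository's ui.R and server.R: the input IDs, the output ID, and the `describe_post` placeholder are assumptions, and the real prediction step is replaced by a simple word count.

```r
library(shiny)

# Hypothetical stand-in for the prediction step: it only reports word counts.
describe_post <- function(title, news) {
  count_words <- function(x) lengths(strsplit(trimws(x), "\\s+"))
  sprintf("Title words: %d, news words: %d",
          count_words(title), count_words(news))
}

# Page layout: title field, news field, button, and result area.
ui <- fluidPage(
  titlePanel("False vs. true"),
  textInput("title", "Title"),
  textAreaInput("news", "News", rows = 8),
  actionButton("consultar", "Consultar"),
  verbatimTextOutput("result")
)

# The result is recomputed only when the button is clicked.
server <- function(input, output, session) {
  result <- eventReactive(input$consultar, {
    describe_post(input$title, input$news)
  })
  output$result <- renderText(result())
}

app <- shinyApp(ui = ui, server = server)

# Launch the app only in an interactive session.
if (interactive()) runApp(app)
```

Using `eventReactive` tied to the button means nothing is computed while the user is still typing, which suits a scoring step that takes several seconds.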

A screenshot of the Shiny app

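library(imager)  # provides load.image()
# The URL must be a single string with no line break inside it
url.img <- "https://www.dropbox.com/s/lht79f56a016hpm/print_efarsas.jpg?dl=1"
img <- load.image(url.img)  # load the screenshot from the URL
plot(img, axes = FALSE, xlab = "", ylab = "")  # show the image without axes or labels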
