The capstone of the Coursera R Programming course walked students through developing prediction models based on data provided by the instructors.
While writing code to clean and process the provided data, I noticed its sources: blogs, news, and Twitter. Drawing on my own experience with these sources, I wanted to give the user a tailored experience, since someone writing for a news site might use different words than an individual writing a personal blog or an influencer on Twitter.
-To do this, in the final R Shiny application, I developed a tabbed tool that includes the data summary and plots from the milestone project, giving the user an idea of each source’s characteristics.
-For the prediction model, I included an option for the user to choose whether predictions are based on the first, second, or third source, or on a complete sample of all the sources.
-While the initial data was provided by the Coursera staff, this model was developed so that a user can supply any three data sets they have at their disposal. Because RPubs restricts the amount of data a user can upload, code for extracting sample data is included in the first tab. I recommend using the sample sources provided by the course instructors to test the tool.
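
The sampling and source-selection steps described above could be sketched roughly as follows. This is only an illustration in base R, not the code from the application itself: the function names `sample_lines` and `select_corpus`, the default sampling fraction, and the seed are all hypothetical choices made for the example.

```r
# Hypothetical helper (not from the app): draw a reproducible random
# sample of lines from a plain-text source file.
sample_lines <- function(path, fraction = 0.01, seed = 42) {
  stopifnot(fraction > 0, fraction <= 1)
  lines <- readLines(path, encoding = "UTF-8", warn = FALSE, skipNul = TRUE)
  if (length(lines) == 0) return(character(0))
  set.seed(seed)  # note: this resets the global RNG state
  n <- min(length(lines), max(1, ceiling(length(lines) * fraction)))
  # Sort the sampled indices so the lines keep their original order.
  lines[sort(sample.int(length(lines), n))]
}

# Hypothetical helper (not from the app): pick the text the predictions
# are based on, either one of three sources or all of them combined.
select_corpus <- function(samples,
                          choice = c("first", "second", "third", "all")) {
  stopifnot(length(samples) == 3)
  choice <- match.arg(choice)
  switch(choice,
         first  = samples[[1]],
         second = samples[[2]],
         third  = samples[[3]],
         all    = unlist(samples, use.names = FALSE))
}

# Usage with three small temporary files standing in for the user's sources.
paths <- replicate(3, tempfile(fileext = ".txt"))
for (p in paths) writeLines(paste("line", 1:20), p)
samples <- lapply(paths, sample_lines, fraction = 0.5)
corpus <- select_corpus(samples, "all")
```

Sampling each source separately before combining keeps the per-source option and the combined option drawing from the same reduced data, which is one way to stay under an upload limit.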