#Introduction

For the second dataset in Project 2, I chose a very simple dataset found by Wilson Ng. I wanted to see how much data manipulation I could do to get some analytical work out of it. I also want to focus on transforming the data and adding value, possibly by changing the type of some of the collected fields.

#Step 1 Changing Data type

Once I read the data and did an overview, I saw that the column of

library(knitr)
library(stringr)
library(tidyr)
library(dplyr)
library(ggplot2)
fruits <- read.csv("https://raw.githubusercontent.com/Wilchau/Data607Project2/main/Data_2.csv")
head(fruits)
##           item price calories
## 1     "banana"  1.00      105
## 2      "apple"  0.75       95
## 3      "apple"  0.75       95
## 4      "peach"  3.00       55
## 5      "peach"  4.00       55
## 6 "clementine"  2.50       35

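We can see that the item values still carry literal quotation marks (““), which I want to remove. The price column prints as a number in this output, and the division in the next step works on it directly, so it was not read in as character here; if it had come in as text with a $ sign, I would strip the $ and convert it to numeric rather than integer, because prices have decimals. I also see price versus calories as a calculated field, so I will add another column with price / calories * 100 to show how many cents each calorie costs.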
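The cleanup described above could look roughly like the sketch below. It uses base R only and runs on a small hand-typed sample, `fruits_sample`, that copies the six rows printed by `head(fruits)`; the sample and the `price_text` vector are assumptions made for illustration, not the full dataset.

```r
# Hypothetical sample mirroring the six rows printed by head(fruits);
# the quotation marks are part of the item strings, as in the output above.
fruits_sample <- data.frame(
  item     = c('"banana"', '"apple"', '"apple"', '"peach"', '"peach"', '"clementine"'),
  price    = c(1.00, 0.75, 0.75, 3.00, 4.00, 2.50),
  calories = c(105, 95, 95, 55, 55, 35),
  stringsAsFactors = FALSE
)

# Check how each column was read before converting anything.
sapply(fruits_sample, class)

# Strip the literal double quotes and any surrounding whitespace from item.
fruits_sample$item <- trimws(gsub('"', "", fruits_sample$item, fixed = TRUE))

# Assumed case: a price column stored as text such as "$1.00".
# Remove the dollar sign, then convert to numeric (integer would drop the decimals).
price_text <- c("$1.00", "$0.75", "$4.00")  # hypothetical values
price_num  <- as.numeric(gsub("$", "", price_text, fixed = TRUE))
```

Using `fixed = TRUE` treats `$` and `"` as plain characters, which avoids having to escape `$` as a regular-expression anchor.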

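# Keep only the columns needed for the analysis
fruits_analysis <- fruits[, c("item", "price", "calories")]
# ratio = cents per calorie (price in dollars * 100, divided by calories); lower is cheaper
fruits_analysis$ratio <- fruits_analysis$price / fruits_analysis$calories * 100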
head(fruits_analysis)
##           item price calories     ratio
## 1     "banana"  1.00      105 0.9523810
## 2      "apple"  0.75       95 0.7894737
## 3      "apple"  0.75       95 0.7894737
## 4      "peach"  3.00       55 5.4545455
## 5      "peach"  4.00       55 7.2727273
## 6 "clementine"  2.50       35 7.1428571
ggplot(data=fruits_analysis, aes(x=item, y=ratio, group=1)) +
  geom_line()+
  geom_point()

#Ratio is key to good price

When we visit Costco or any grocery store, we usually compare items by how much we get per dollar. Here I added a ratio column that divides price by calories and multiplies by 100, which gives the cost in cents per calorie, so a lower ratio is a better deal. It seems like you get your money’s worth on apple, which has the lowest ratio, and the line graph shows the same thing. Apple (about 0.79) is lower than banana (about 0.95) by about 0.16. That means each calorie from an apple costs about 0.79 cents, or put the other way, every 1 cent you pay for an apple buys about 1.27 calories.
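To make the ratio easier to read in both directions, here is a minimal sketch in base R. It again uses a hand-typed sample, `fruits_sample`, copied from the rows printed by `head(fruits_analysis)`; the column names `cents_per_calorie` and `calories_per_cent` and the per-item averaging are my own assumptions for illustration, not part of the original analysis.

```r
# Hypothetical sample mirroring the rows printed by head(fruits_analysis).
fruits_sample <- data.frame(
  item     = c("banana", "apple", "apple", "peach", "peach", "clementine"),
  price    = c(1.00, 0.75, 0.75, 3.00, 4.00, 2.50),  # dollars
  calories = c(105, 95, 95, 55, 55, 35)
)

# Same quantity as the ratio column: dollars * 100 = cents, divided by calories.
fruits_sample$cents_per_calorie <- fruits_sample$price * 100 / fruits_sample$calories

# The inverse answers "how many calories does one cent buy?"
fruits_sample$calories_per_cent <- fruits_sample$calories / (fruits_sample$price * 100)

# Average per fruit, since some items appear more than once at different prices.
by_item <- aggregate(cbind(cents_per_calorie, calories_per_cent) ~ item,
                     data = fruits_sample, FUN = mean)

# Sort so the cheapest calories come first.
by_item <- by_item[order(by_item$cents_per_calorie), ]
by_item
```

Averaging per item first also avoids plotting the same fruit twice when it appears at two different prices, as peach does in the sample.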

If you want the most calories for your money, I would recommend going for an apple.