install.packages("tidyverse")
## Case 1 (20 points)
A survey of 417 individuals asks how often they exercise, their marital status, and their annual income.
Use the Fitness dataset and write a script in R to calculate the 25th, 50th, and 75th percentiles of income. What does this tell you about the income distribution? (10 points)
##      25%      50%      75%
##  64218.0  80705.0 101133.8
answer: The percentiles show that 25% of individuals have an income below 64,218, 50% have an income below 80,705 (the median), and 25% have an income above 101,133.8. The gap between the median and the 75th percentile (20,428.8) is larger than the gap between the 25th percentile and the median (16,487), so incomes are more spread out in the upper half of the distribution. This is consistent with a right-skewed distribution, in which a portion of individuals earn considerably more than the rest.
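The skewness argument above rests on simple arithmetic with the three reported quartiles. The sketch below reproduces it using only the values printed in the output. The commented `quantile()` call shows how the values would typically be obtained, and the column name `Income` in that comment is an assumption, not something confirmed by the source.

```r
# Quartiles copied from the output above (25th, 50th, 75th percentiles of income).
q <- c("25%" = 64218.0, "50%" = 80705.0, "75%" = 101133.8)

# With the data loaded, the same values would typically come from a call such as
#   quantile(Fitness$Income, probs = c(0.25, 0.50, 0.75))
# where the column name `Income` is an assumption.

# Distance from the 25th percentile to the median, and from the median to the 75th.
lower_gap <- q[["50%"]] - q[["25%"]]
upper_gap <- q[["75%"]] - q[["50%"]]

# Interquartile range: the width of the middle 50% of incomes.
iqr_income <- q[["75%"]] - q[["25%"]]

print(c(lower_gap = lower_gap, upper_gap = upper_gap, iqr = iqr_income))

# TRUE when the upper half of the middle 50% is wider than the lower half.
print(upper_gap > lower_gap)
```

A wider upper gap only hints at right skew. A histogram of the income column would be a more direct check.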
## No Yes
## 94913.00 81310.06 81829.97
answer: The gap between “Yes” and “No” is Yes - No = 81829.97 - 81310.06 = 519.91, so married people earn on average $519.91 more than unmarried people.
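A group-wise mean like the one in the output above can be computed with `tapply()`. The sketch below uses a small made-up data frame, because the Fitness data is not reproduced here. The names `fitness_demo`, `Married` and `Income` are hypothetical stand-ins. The last lines redo the subtraction from the answer using the two reported means.

```r
# Hypothetical stand-in for the Fitness data; values are invented for illustration.
fitness_demo <- data.frame(
  Married = c("Yes", "No", "Yes", "No", "Yes"),
  Income  = c(80000, 70000, 90000, 74000, 85000)
)

# Mean income within each marital-status group.
avg_income <- tapply(fitness_demo$Income, fitness_demo$Married, mean)
print(avg_income)

# Difference between the two group means in the toy data.
demo_gap <- avg_income[["Yes"]] - avg_income[["No"]]
print(demo_gap)

# The same subtraction applied to the means reported in the output above.
reported_gap <- 81829.97 - 81310.06
print(round(reported_gap, 2))
```

The reported output has three values but only two labels, which suggests that some rows have a blank marital status. That reading is a guess and would need to be checked against the data.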
The Country data file shows the annual returns
(in %) for a mutual fund focusing on investments in Latin America and a
mutual fund focusing on investments in Canada over the past 20
years.
##       Year     Latin_America        Canada
##  Min.   : 1   Min.   :-54.64   Min.   :-42.640
##  1st Qu.: 7   1st Qu.:-17.23   1st Qu.: -9.610
##  Median :13   Median :  4.11   Median : 12.280
##  Mean   :13   Mean   : 10.58   Mean   :  9.433
##  3rd Qu.:19   3rd Qu.: 41.11   3rd Qu.: 21.840
##  Max.   :25   Max.   : 91.60   Max.   : 51.910
## [1] 10.5768
## [1] 9.4328
answer: Latin_America had the higher average return over this period (10.58% versus 9.43% for Canada).
Latin_America and Canada. (10 points) Compare and explain the results in your own words. (10 points) (20 points in total)
## [1] 146.24
IQR <- as.numeric(quantile(Country$Latin_America, 0.75) - quantile(Country$Latin_America, 0.25))
IQR
## [1] 58.34
## [1] 31.98867
## [1] 1375.87
## [1] 37.09272
## [1] 94.55
## [1] 31.45
## [1] 17.27686
## [1] 484.9471
## [1] 22.02151
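answer: The standard deviation of Latin_America is 37.09272 and that of Canada is 22.02151. The standard deviation of Latin_America is much larger, so Canada's returns are more tightly concentrated around their mean than Latin_America's. The range (146.24 versus 94.55) and the IQR (58.34 versus 31.45) point the same way.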
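The dispersion measures compared above can be computed with base R functions. The sketch below applies them to a short made-up vector of returns (the name `returns` and its values are hypothetical), then checks that the reported variances and standard deviations agree with each other, since the variance is the square of the standard deviation.

```r
# Hypothetical annual returns in percent; not taken from the Country data.
returns <- c(-10, 0, 5, 15, 40)

spread <- c(
  range    = diff(range(returns)),  # max minus min
  iqr      = IQR(returns),          # 75th minus 25th percentile
  variance = var(returns),          # sample variance (divides by n - 1)
  sd       = sd(returns)            # sample standard deviation
)
print(spread)

# Consistency check on the figures reported above.
reported_sd  <- c(Latin_America = 37.09272, Canada = 22.02151)
reported_var <- c(Latin_America = 1375.87,  Canada = 484.9471)
print(reported_sd^2)  # should be close to reported_var

# How many times larger the Latin_America spread is, measured by sd.
sd_ratio <- reported_sd[["Latin_America"]] / reported_sd[["Canada"]]
print(round(sd_ratio, 2))
```

Comparing several measures at once guards against one of them being distorted by a single extreme year, which affects the range far more than the IQR.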
You have just obtained a dataset, stockprice, for analysis. It includes the date, AMZN, GOOG, and X Index.
Create a scatterplot of AMZN against X Index, set the title as “Amazon Stock Price against X Index”, name the X-axis as “Amazon Stock Price” and the Y-axis as “X Index”. (10 points) Comment on the relationship between AMZN and X Index. (10 points) (20 points in total)
plot(stockprice$X.Index ~ stockprice$AMZN, main = "Amazon Stock Price against X Index", xlab = "Amazon Stock Price", ylab = "X Index", pch = 16)
answer: There is no clear relationship between X Index and the AMZN stock price. There is a large gap in the data for AMZN prices between about 1,200 and 1,500, and the points form two clusters, one on each side of that gap. Overall, the points are widely scattered across the chart.
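A visual reading of a scatterplot can be backed up with a correlation coefficient. This is a minimal sketch on made-up numbers, not the method used in the answer above. The vectors `amzn` and `x_index` are hypothetical and do not come from the stockprice data.

```r
# Hypothetical paired observations; invented for illustration only.
amzn    <- c(700, 850, 1000, 1600, 1750, 1900)
x_index <- c(210, 180, 230, 190, 220, 200)

# Pearson correlation: values near 0 suggest no linear relationship,
# values near -1 or 1 suggest a strong one.
r <- cor(amzn, x_index)
print(r)

# On the real data the call would look like
#   cor(stockprice$AMZN, stockprice$X.Index, use = "complete.obs")
# where use = "complete.obs" drops rows with missing values.
```

A correlation close to zero only rules out a linear pattern, so the plot is still needed to spot clusters or gaps such as the ones described above.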
Modify the plot so that the color of the points changes when X Index is equal to or greater than 200, and add a legend with the labels “Above or equal 200” and “Below 200”. (10 points)
plot(stockprice$X.Index ~ stockprice$AMZN,
     main = "Amazon Stock Price against X Index",
     xlab = "Amazon Stock Price", ylab = "X Index",
     col = ifelse(stockprice$X.Index >= 200, "red", "blue"),
     pch = 16)
legend("topleft", legend = c("Above or equal 200", "Below 200"), pch = 16, col = c("red", "blue"))
Plot AMZN (blue) and GOOG (green) on the same plot, set the title as “Stock Prices for Amazon and Google”, name the X-axis as “Date” and the Y-axis as “Stock Price”, set the y-axis limits from 0 to 3000, and add an appropriate legend. (10 points) Comment on the trends over time in the plot. (10 points) (20 points in total)
stockprice$Date <- as.Date(stockprice$Date, format = "%Y-%m-%d")
plot(stockprice$`AMZN` ~ stockprice$`Date`, main="Stock Prices for Amazon and Google", xlab="Date", ylab="Stock Price", col = "blue", type="n", xlim=c(as.Date("2016-01-01"),as.Date("2020-12-31")),ylim=c(0,3000))
lines(stockprice$`AMZN` ~ stockprice$`Date`, col="blue", lty = 1)
lines(stockprice$`GOOG` ~ stockprice$`Date`, col="green", lty = 1)
legend("topleft", legend = c("AMZN", "GOOG"), col = c("blue", "green"), lty = 1)
answer: In general, both series trend upward, with more fluctuation from around 2018 onward. AMZN and GOOG are very close from 2016 until late 2017. At the end of 2017, AMZN overtakes GOOG and rises sharply to a peak of about 2,000, ending at about 1,800 in 2020. GOOG continues to rise slowly, with some variation, and ends at about 1,500 in 2020.
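The time-series plot above depends on the date column being parsed correctly, because `as.Date()` returns `NA` without raising an error when the format string does not match the text. The sketch below illustrates this with two made-up date strings and checks that the parsed dates fall inside the x-axis limits used in the plot. The names `raw_dates`, `parsed`, `mismatched` and `x_limits` are hypothetical.

```r
# Hypothetical date strings in year-month-day form.
raw_dates <- c("2016-01-04", "2020-12-31")

# A matching format string yields proper Date values.
parsed <- as.Date(raw_dates, format = "%Y-%m-%d")
print(class(parsed))

# A format string that does not match the text yields NA without raising an error.
mismatched <- as.Date(raw_dates, format = "%d/%m/%Y")
print(anyNA(mismatched))

# The x-axis limits from the plot call; every parsed date should fall inside them.
x_limits <- as.Date(c("2016-01-01", "2020-12-31"))
print(all(parsed >= x_limits[1] & parsed <= x_limits[2]))
```

Running `anyNA(stockprice$Date)` right after the conversion is a cheap way to catch a format mismatch before plotting.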