Activity 2 correlation

ETCV/INFV 302

Jose Romero

Dr. Ryan Straight

September 6, 2018

Assignment

This activity goes well beyond simply displaying frequencies and descriptives for common concepts like means and medians. For this activity, create and submit a document that includes and answers the following:

plot(df2$comm, df2$peers, main = "Correlation between Comm and peers", xlab = "comm", ylab = "peers")

plot(df2$comm, df2$peers, main = "Regression of Comm and peers", xlab = "comm", ylab = "peers")
# Regress the y-axis variable (peers) on the x-axis variable (comm) so the line matches the plot
abline(lm(peers ~ comm, data = df2), col = "red")

library(ggplot2)  # qplot() comes from ggplot2
qplot(comm, peers, data = df2, color = comm)

A scatterplot that displays the relationship between the peers and comm variables.

Answer the following:

1. What does this chart tell you about the relationship between the two variables?

In this relationship I see a positive association between both variables. The point

2. What direction is this association?

The direction of the association is positive: as comm increases, peers tends to increase as well.

3. How did you determine this?

I determined this from the scatterplot: the points trend upward from left to right, so higher values on the x axis (comm) tend to go with higher values on the y axis (peers).

4. If you had to identify the association, would you label it small, moderate, or strong? Why?

I would identify this association as moderate. The reason is that r = 0.519 is in the middle range: the association is clearly positive, but the points are still scattered.
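One way to double-check the direction read off the scatterplot is to look at the sign of the fitted slope. The sketch below is only an illustration: df_demo and the numbers that generate it are made-up stand-ins, since the real df2 is not reproduced here.

```r
# Hypothetical stand-in data (an assumption; not the real df2).
set.seed(302)
n <- 200
comm <- rnorm(n, mean = 50, sd = 10)
peers <- 20 + 0.5 * comm + rnorm(n, sd = 8)
df_demo <- data.frame(comm = comm, peers = peers)

# Regress the y-axis variable (peers) on the x-axis variable (comm).
fit <- lm(peers ~ comm, data = df_demo)
slope <- unname(coef(fit)["comm"])

# A positive slope means the fitted line rises from left to right.
direction <- if (slope > 0) "positive" else if (slope < 0) "negative" else "flat"
print(direction)

# The slope and the Pearson correlation always share the same sign.
r_demo <- cor(df_demo$comm, df_demo$peers)
print(sign(slope) == sign(r_demo))
```

With the real data, the same check would use df2 in place of df_demo.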

Using summarytools, determine the Pearson correlation for the peers and comm variables.

r = 0.519

cor(df2$comm, df2$peers, method = "pearson")
## [1] 0.5194169

1. Interpret this number.

The number tells me that the correlation is positive and falls in the moderate range.

2. What is the strength of the association?

The strength of the association is moderate, since r = 0.5194169 is a positive number in the middle of the range from 0 to 1.

3. What is the direction?

The direction is positive: r = 0.5194169 is greater than zero, so the points trend upward from left to right and peers tends to increase as comm increases.

4. Like the scatterplot above, do you think it is small, moderate, or strong?

I believe that this is a moderate correlation due to the r value being in the middle range. On a scale from 0 (no linear association) to 1 (a perfect positive association), 0.52 is near the middle.

5. How does your interpretation of the Pearson correlation coefficient compare to that of the scatterplot?

Interpreting both the scatterplot and Pearson’s correlation, I can see that the association is positive and moderate. The data points trend upward but are somewhat scattered, and the correlation is a positive value that falls in the moderate range.
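To see where a Pearson coefficient comes from, it can be rebuilt from its definition and compared with cor(). This is a minimal sketch on made-up vectors (x and y are assumptions, not the comm and peers data).

```r
# Hypothetical vectors standing in for two numeric survey variables.
set.seed(302)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)

# Pearson r: sum of cross-products of deviations from the means,
# divided by the square root of the product of the sums of squared deviations.
dx <- x - mean(x)
dy <- y - mean(y)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in calculation for comparison.
r_builtin <- cor(x, y, method = "pearson")

# TRUE when the two agree up to floating-point tolerance.
print(isTRUE(all.equal(r_manual, r_builtin)))
```

Because the numerator can be negative while the denominator cannot, the sign of r carries the direction and its size carries the strength.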

Determine R2

1. Compute the square of the correlation coefficient you previously calculated.

cor(df2$comm, df2$peers, method = "pearson")^2
## [1] 0.269794

2. Interpret this value. What does it indicate about the association?

Expressed as a percentage, the value means that about 27 percent of the variation in one variable is accounted for by its linear relationship with the other. The remaining 73 percent or so is not explained by that relationship.

3. Write a statement about the meaning of the R-squared (R2) value in terms of the variables.

R-squared (R2) is the proportion of the variability in one variable that is accounted for by its linear relationship with the other. Here, about 27 percent of the variation in peers is accounted for by comm.

4. How does R2 compare to what you saw in the scatterplot and the Pearson correlation coefficient?

Comparing R-squared to the scatterplot and the Pearson correlation, I can see why the value is 0.269794. In the scatterplot the points are not tightly clustered around the regression line, so a large share of the variation is left unexplained. The Pearson correlation shows a moderate but positive linear relationship, and squaring a moderate r of about 0.52 gives a modest value of about 0.27. Both of these factors explain the value of R-squared (R2).

5. Do you think this is a more valuable statistic? Why?

I do believe that R-squared (R2) is a valuable statistic. R-squared tells you how much of the variation in one variable is accounted for by the other, which is easy to state as a percentage. Whether that share is small or large, this information can help you when examining your data. It does not show the direction of the association, though, so it works best alongside r.
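The reading of R-squared as a share of variation can be checked numerically. The sketch below assumes made-up data in place of df2 and compares three routes to the same number: squaring r, the value reported by summary() on a simple linear model, and one minus the ratio of residual to total sum of squares.

```r
# Hypothetical stand-in data (an assumption; not the real comm and peers values).
set.seed(302)
comm <- rnorm(150, mean = 50, sd = 10)
peers <- 20 + 0.5 * comm + rnorm(150, sd = 8)

# Route 1: square the Pearson correlation.
r2_from_r <- cor(comm, peers)^2

# Route 2: R-squared reported for the simple linear regression.
fit <- lm(peers ~ comm)
r2_from_lm <- summary(fit)$r.squared

# Route 3: 1 minus (variation left over after the fit) / (total variation in peers).
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((peers - mean(peers))^2)
r2_from_variance <- 1 - ss_res / ss_tot

# All three agree up to floating-point tolerance.
print(c(r2_from_r, r2_from_lm, r2_from_variance))
```

The three values match only for a regression with a single predictor and an intercept; with more predictors, R-squared is no longer the square of one pairwise correlation.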