Introduction to R/RStudio in Posit Cloud and Review of Correlation
2025-02-05
HW 3 was due 2/3/2025
Sign up for a FREE Posit Cloud Account
Introduction to using R and RStudio
Review of correlation, \(R_{xy}\)
Review of Simple Linear Regression
Function vs. Model
Examining Real Data
Creating a Model
Interpreting a Regression Model
In-class Polling (Session ID: bua345s25)
Recall the Lecture 6 ‘Weather’ worksheet, which is the ‘Lecture 7 Review Worksheet’.
Consider the first and second inputs for the VLOOKUP command in cell H4.
Which choice below contains the correct first and second inputs?
Use =FORMULATEXT(H4) to check your answer.
=VLOOKUP(H2, A2:E91,…
=VLOOKUP(H3, A1:E91,…
=VLOOKUP(H4, A2:E91,…
=VLOOKUP(H2, B1:E91,…
=VLOOKUP(H3, B2:E91,…
=VLOOKUP(,H4, B2:E9,…
In this course we will use R and RStudio for the predictive analytics lectures.
You will access R and RStudio through Posit Cloud.
I will post R/RStudio files on Posit Cloud that you can access through provided links.
I will also provide demo videos that show how to access files and complete exercises.
NOTE: The free Posit Cloud account is limited to 25 hours per month.
We will also use Posit Cloud for quiz questions on predictive analytics skills.
For those who want to download R and RStudio (not required):
Always click ‘Save a Permanent Copy’ so you don’t lose your work.
Click Tools > Global Options.
The next few slides are a helpful reference but are not required.
On the Basic tab, next to Show in document outline, select Sections and All Chunks.
On the Visual tab:
Check the box next to Show line numbers in code blocks.
Next to Editor content width (px), change the value to 900.
When you’re done selecting all options, click OK at the bottom.
.qmd File Open
Provided .qmd files appear in the upper left panel above the Console.
Setup Code Chunk
Whenever you begin working with a provided code file, click the green triangle in the Setup chunk to set up options and to install and load packages.
Often, when we have two quantitative variables, we want to understand the extent to which they are associated.
The first step is often to plot the data using a scatterplot.
We can also use quantitative measures of association to understand these relationships.
In addition to determining whether there is a positive or negative relationship, we want to know how strong that relationship is.
To quantify the strength of a linear relationship, we calculate Pearson’s correlation coefficient, \(R_{xy}\).
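For reference, the standard textbook definition of Pearson's correlation coefficient is written out below. The notation is an assumption, since the slides do not spell it out: \(x_i\) and \(y_i\) are the \(n\) paired observations, and \(\bar{x}\) and \(\bar{y}\) are their sample means.

```latex
R_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when X and Y tend to be above or below their means together, and negative when one tends to be above its mean while the other is below. The denominator rescales the result so that it always falls between -1 and 1.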
\(R_{xy} = 0.85\)
How do we interpret this value?
\(R_{xy}\) ranges from -1 to 1.
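A minimal sketch of what the endpoints of that range look like, using R's built-in cor function. All of the data below are hypothetical values made up for illustration; they do not come from the course files.

```r
# Hypothetical data, made up for illustration only.
x <- c(1, 2, 3, 4, 5)

y_up   <- 2 * x + 1                   # perfectly linear, increasing
y_down <- 10 - 3 * x                  # perfectly linear, decreasing
y_real <- c(2.3, 3.1, 5.2, 5.8, 8.4)  # increasing, with some scatter

r_up   <- cor(x, y_up)    # 1 (up to floating-point rounding)
r_down <- cor(x, y_down)  # -1 (up to floating-point rounding)
r_real <- cor(x, y_real)  # strong and positive, but less than 1

r_up
r_down
r_real
```

The sign of the result gives the direction of the relationship, and the distance from 0 gives its strength.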
A correlation of exactly \(R_{xy} = 1\) or \(R_{xy} = -1\) is unrealistic for real data. These correlations are both strong and realistic:
What is the correlation between height and mass in the starwars data?
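One way to answer this question is sketched below. It assumes the dplyr package, which ships the starwars data, is already installed; the object name r_hw is made up for this example. Some characters have a missing height or mass, so the sketch asks cor to use only the rows where both values are present.

```r
# Assumes the dplyr package is installed; it provides the starwars data.
starwars <- dplyr::starwars

# Some characters are missing height or mass, so use only complete pairs.
# Without the use argument, cor() would return NA.
r_hw <- cor(starwars$height, starwars$mass, use = "complete.obs")
r_hw
```

A scatterplot of the same two variables is a useful companion to this number, since a single unusual point can pull the correlation up or down.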
\(R_{xy}\) is only valid when examining linear relationships.
If the data have a curvilinear relationship, there are other tools that will be covered in other courses.
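A small sketch of why this matters, using hypothetical data made up for illustration. Here Y is completely determined by X, but the relationship is a curve rather than a line, so the correlation misses it entirely.

```r
# Hypothetical data: y depends perfectly on x, but not linearly.
x <- -3:3
y <- x^2

r_curve <- cor(x, y)
r_curve  # 0: no linear association, despite a perfect curved relationship
```

This is one reason to plot the data first: a correlation near 0 means there is no linear relationship, not that there is no relationship at all.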
An introduction to R and RStudio in Posit Cloud.
A review of linear associations between variables.
We will continue this discussion in Lecture 8 on Thursday.
For now, you are expected to understand the cor command in R.
HW 3 was due 2/3/2025
HW 4 is due 2/12/2025
To submit an Engagement Question or Comment about material from Lecture 7: Submit it by midnight today (day of lecture).