BUA 345 - Lecture 28
Course Wrap Up and Some Review
Housekeeping
Upcoming Dates
HW 10 is due on Monday, 4/28.
- Grace Period ends Tuesday (4/29) at midnight.
Additional Practice Questions are posted and videos will be posted this week.
Lecture 27 (4/22) videos are posted. If you have additional questions let me know.
- More about this.
Is there interest in Zoom Review Q&A sessions before the final exam(s)?
Course Evaluations
Evaluations are VERY Important:
I will end class a little early again today to give you time to complete evaluations in class.
Please complete evaluations for ALL courses.
Today’s plan
Resources for Expanding and Explaining your Skills
Overview of Course
Study Suggestions
A few Questions
Evaluations
Time for your Questions
In-class Polling (Session ID: bua345s25)
Lecture 28 In-class Exercises - Q1 - Review
If you suspect your time series data have a seasonal component, the third set of practice questions demonstrates that you should develop TWO versions of the Auto ARIMA (auto.arima in R) model and compare them.
Specify the option, written correctly with no spaces, that states that the model should assume there IS a seasonal component.
Finance and Forecasting Coursework at Whitman
After Tuesday’s Demo lecture I received a lot of questions about going further with forecasting.
Here are some great classes in Whitman that will expand your financial forecasting skillset:
FIN 454 - Financial Analytics (Uses R and RStudio)
FIN 461 - Financial Modeling (Uses R and RStudio)
FIN 400 - Algorithmic Trading with Kivanc Avrenli (uses Python)
There are other great Finance classes but these are the ones I have direct knowledge of.
Continuing in Business Analytics
BUA 345 provides a solid foundation in data literacy.
If you find this material interesting and want to take the lead on analyzing data, consider the Business Analytics (BA) major.
Skills that the BA major focuses on:
Advanced Analytics focuses on dealing with complex and large data sets.
Data Management shows how to go from raw data on the internet to informative data visualization dashboards.
Predictive Analytics expands on the modeling skills in this course to show students how to develop models and make sound business predictions.
Data Mining and Network Modeling
Visual Analytics builds on the visualization skills from the Data Management course to show how to present data effectively.
Financial Analytics and Marketing Analytics are electives for the BA major.
More courses are being developed.
Expand your Analytics and Data Science Skillset.
As SU students, you also have free access to LinkedIn Learning
Great tutorials in R, Python, SQL
R is versatile and powerful, but employers may prefer Python, SQL, or another language/environment because that is what they know.
R, Python, Observable and Julia can all be used with Quarto
DataCamp - Not Free, but Excellent.
Whitman Wire Initiative will subsidize DataCamp courses.
Provides certificates of completion
Talking about Your Skillset
Explaining your analytics skillset is challenging, but it’s getting easier.
As data science and analytics grows as a field, more people understand what this skillset can offer.
You should not assume that interviewers, colleagues, or supervisors understand your skills.
This White Paper from DataCamp (also posted on Blackboard) is helpful.
Starting on page 9, it lays out the different roles people take on when working with data.
Comparing these descriptions to skills you learn in BUA 345 (and future courses) will help you communicate your skillset with confidence.
Final Exam Information
Timed test - 90 minutes
2:00 PM Section: Tuesday 5/6 at 10:15 AM in Room 11
3:30 PM Section: Friday 5/2 at 8:00 AM in Room 009
There is no asynchronous option for this test, and students must attend the exam time for their section unless they have already spoken with me.
Seats are limited.
Overview of Course
Excel Skills
Lectures 1 - 6
HW Assignments 2, 3 and 4.1
Correlation, SLR, MLR, Logistic Regression
Lectures 7 - 18
HW Assignments 4.2, 5, 6, 7, 8
Non-linear Models and Optimization
Lectures 22 - 24
HW 9 and Q1 of HW 10
Forecasting
Lectures 25 - 27
HW 10
Additional Important Course Material
Practice Questions
For Quiz 1
For Quiz 2
Additional Practice Questions
Quizzes
Quiz 1
Quiz 2
Final Exam Questions will mostly be adapted from previous quiz questions and practice questions.
Recommended Studying Strategy
Go through previous quizzes and all three sets of practice questions.
Take notes for yourself on skills and terminology you are unsure of.
Go back to those skills and terms in HW assignments, lectures, and videos and take notes.
Redo questions.
Excel Skills - Lectures 1 - 6
Relative and Absolute ($) cell references
Excel Tables and Table Options
- Sorting, filtering, finding duplicates
Excel Pivot Tables
Summarizing complex data
Many options
VLOOKUP
Including an embedded MATCH command to increase functionality
Both Range and Exact match lookups
Lecture 28 In-class Exercises - Q2
What percent of the females that survived the Titanic disaster were in Second Class?
There are MANY ways to approach this question.
Hint: This may be easier to do if you keep the data as counts instead of converting values to percentages.
Round your answer to the nearest whole percent and don’t include the percent sign in your answer.
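The hint about working with counts can be sketched in R. The counts below are made up for illustration (they are NOT the Titanic data), and the object names are hypothetical. The idea is to divide one group's count by that group's own total, rather than combining percentages that were computed on different bases.

```r
# Hypothetical survivor counts (NOT the Titanic data), by sex and class
counts <- matrix(
  c(40, 30, 30,    # Female survivors: First, Second, Third class
    20, 10, 20),   # Male survivors:   First, Second, Third class
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex   = c("Female", "Male"),
                  Class = c("First", "Second", "Third"))
)

# Percent of surviving females in each class:
# each female count divided by the total number of surviving females
female_counts <- counts["Female", ]
pct_by_class  <- round(100 * female_counts / sum(female_counts))
pct_by_class["Second"]

# The same row percentages for every group at once
round(100 * prop.table(counts, margin = 1))
```

In Excel, the equivalent is a Pivot Table that shows counts, followed by one division of the cell of interest by its row total.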
Correlation, SLR, MLR, and Logistic Regression
This material comprises the largest part of the course.
Correlation and Simple Linear Regression included in Quiz 1
Calculating and interpreting correlations using the cor command
Creating a Simple Linear Regression (SLR) model and verifying it is valid.
Knowing when the log (natural log) transformation is useful.
All MLR Regression topics included in Quiz 2
Model Selection Methods
Measures of Goodness of Fit, e.g., Adjusted \(R^2\) and AIC
Basic commands for creating a model, e.g. lm, ols_regress
Logistic Regression (glm)
When is it used?
How do we use model results to find probabilities?
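The commands listed above can be sketched together in base R. This is a minimal illustration, not course data: it assumes the built-in mtcars dataset, the variable choices and object names are hypothetical, and the log transformation is applied to a predictor so that the fit statistics stay comparable across models with the same response.

```r
# Built-in mtcars data, used only for illustration (not course data)
data(mtcars)

# Correlation between car weight and fuel economy
r <- cor(mtcars$wt, mtcars$mpg)

# Linear models with the same response, so Adjusted R^2 and AIC are comparable
slr_raw <- lm(mpg ~ hp, data = mtcars)        # simple linear regression
slr_log <- lm(mpg ~ log(hp), data = mtcars)   # natural log of the predictor
mlr     <- lm(mpg ~ wt + hp, data = mtcars)   # multiple linear regression

fit_stats <- data.frame(
  model  = c("mpg ~ hp", "mpg ~ log(hp)", "mpg ~ wt + hp"),
  adj_r2 = c(summary(slr_raw)$adj.r.squared,
             summary(slr_log)$adj.r.squared,
             summary(mlr)$adj.r.squared),
  aic    = c(AIC(slr_raw), AIC(slr_log), AIC(mlr))
)
fit_stats   # higher Adjusted R^2 and lower AIC indicate a better fit

# Logistic regression: used when the response is binary
# (am: 0 = automatic, 1 = manual transmission)
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probability for a hypothetical car with wt = 3
new_car <- data.frame(wt = 3)
p_hat   <- predict(logit_fit, newdata = new_car, type = "response")

# Same probability by hand: p = 1 / (1 + exp(-(b0 + b1 * wt)))
b        <- coef(logit_fit)
p_manual <- 1 / (1 + exp(-(b[["(Intercept)"]] + b[["wt"]] * 3)))
```

The last two steps show the link between the model output and a probability: predict with type = "response" does the conversion, and the manual formula applies the logistic function to the fitted coefficients.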
Lecture 28 In-class Exercises - Q3-Q4
Question 3. What is the slope for the Premium cut diamond category in these data?
Question 4. In this diamonds dataset, Ideal cut is the baseline category and there are a total of three cut categories: Ideal, Premium, and Very Good.
Which other category, Very Good or Premium, is not significantly different from Ideal?
Model Summary
--------------------------------------------------------------------
R                      0.880      RMSE               401.048
R-Squared              0.774      MSE             160839.634
Adj. R-Squared         0.773      Coef. Var           16.202
Pred R-Squared         0.769      AIC              14840.040
MAE                  310.166      SBC              14874.394
--------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
AIC: Akaike Information Criteria
SBC: Schwarz Bayesian Criteria
ANOVA
-----------------------------------------------------------------------------
                      Sum of
                     Squares     DF       Mean Square           F      Sig.
-----------------------------------------------------------------------------
Regression     550650373.945      5     110130074.789     680.611    0.0000
Residual       160839633.955    994        161810.497
Total          711490007.900    999
-----------------------------------------------------------------------------
Parameter Estimates
-----------------------------------------------------------------------------------------------------
model                      Beta   Std. Error   Std. Beta         t      Sig       lower       upper
-----------------------------------------------------------------------------------------------------
(Intercept)            -231.059       83.222                -2.776    0.006    -394.369     -67.749
carat                  4112.089      120.651       0.907    34.083    0.000    3875.330    4348.848
cutPremium              147.418      115.427      -0.079     1.277    0.202     -79.091     373.927
cutVery Good           -152.745      120.764      -0.038    -1.265    0.206    -389.726      84.237
carat:cutPremium       -425.098      163.797      -0.058    -2.595    0.010    -746.525    -103.670
carat:cutVery Good      117.709      174.465       0.014     0.675    0.500    -224.652     460.071
-----------------------------------------------------------------------------------------------------
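The parameter estimates above follow the usual structure of a linear model in which a categorical predictor is interacted with a numeric one. As a sketch, write \(\hat{y}\) for the predicted response (assumed here to be price, which the output does not name) and \(D_{\text{Premium}}\), \(D_{\text{Very Good}}\) for 0/1 indicator variables; this notation is introduced for illustration only.

\[
\hat{y} = \beta_0 + \beta_1\,\text{carat} + \beta_2 D_{\text{Premium}} + \beta_3 D_{\text{Very Good}} + \beta_4\,(\text{carat} \times D_{\text{Premium}}) + \beta_5\,(\text{carat} \times D_{\text{Very Good}})
\]

Setting the indicators to 0 or 1 gives the slope on carat for each cut category:

\[
\text{Ideal (baseline)}: \ \beta_1, \qquad \text{Premium}: \ \beta_1 + \beta_4, \qquad \text{Very Good}: \ \beta_1 + \beta_5
\]

Here \(\beta_1\) is the carat row of the table, and \(\beta_4\) and \(\beta_5\) are the carat:cutPremium and carat:cutVery Good rows.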
NL Models and Optimization - Lectures 22 - 24
Example Questions in Lectures, HW 9, HW 10, and Additional Practice Questions
Non-linear (NL) models
Excel is a great tool to efficiently compare different model choices
Model coefficients and \(R^2\) can be requested for each model.
Adjusted \(R^2\) values will be provided.
Optimization
Non-linear model optimization using the GRG Nonlinear method in Excel Solver.
Optimizing systems of linear equations using the Simplex LP method in Excel Solver.
Forecasting - Lectures 25 - 27
Cross-sectional vs. Time-Series Data
Forecasting Terminology
Using the ts command to correctly specify a time series
Implementing and Interpreting Auto-ARIMA models (auto.arima in R)
Determining if data have a seasonal component
Reporting requested prediction bounds or the prediction interval (Hi - Lo)
Calculating model percent accuracy: (100 - MAPE)%
Examining and comparing model residuals (HW 10)
Determining if data show a seasonal pattern (seasonal=T)
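The forecasting steps listed above can be sketched as follows. This is a minimal illustration that assumes the forecast package is installed. The built-in AirPassengers series stands in for course data, and the object names are hypothetical. Note that seasonal=T is shorthand for seasonal = TRUE.

```r
library(forecast)  # provides auto.arima(), forecast(), accuracy()

# Built-in monthly series, used only for illustration (not course data).
# ts() needs the start period and the number of observations per year.
sales <- ts(as.numeric(AirPassengers), start = c(1949, 1), frequency = 12)

# Two versions of the Auto ARIMA model: with and without a seasonal component
fit_seasonal    <- auto.arima(sales, seasonal = TRUE)
fit_nonseasonal <- auto.arima(sales, seasonal = FALSE)

# Percent accuracy = (100 - MAPE)%, based on the training-set MAPE
pct_acc_seasonal    <- 100 - accuracy(fit_seasonal)[, "MAPE"]
pct_acc_nonseasonal <- 100 - accuracy(fit_nonseasonal)[, "MAPE"]

# 12-month forecast with 95% prediction bounds
fc <- forecast(fit_seasonal, h = 12, level = 95)
bounds <- data.frame(
  lo    = as.numeric(fc$lower),
  hi    = as.numeric(fc$upper),
  width = as.numeric(fc$upper - fc$lower)   # prediction interval (Hi - Lo)
)
head(bounds, 3)

# Compare residual spread for the two models (smaller is better)
resid_sd <- c(seasonal    = sd(residuals(fit_seasonal)),
              nonseasonal = sd(residuals(fit_nonseasonal)))
resid_sd
```

Comparing the two percent accuracy values and the residual spread is one way to judge whether the seasonal version fits better; a plot of the series is still the first check for a seasonal pattern.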
Key Points from Today
Evaluations are VERY Important: coursefeedback.syr.edu
The rest of today’s lecture will be a Q&A session.
Reminder of Recommended Study Strategy:
Go through previous quizzes and all three sets of practice questions.
Take notes for yourself on skills and terminology you are unsure of.
Go back to those skills and terms in HW assignments, lectures, and videos and take notes.
Redo questions.
To submit an Engagement Question or Comment about material from Lecture 28: Submit it by midnight today (day of lecture).