BUA 345 - Lecture 28

Course Wrap Up and Some Review

Author

Penelope Pooler Eisenbies

Published

April 23, 2025

Housekeeping

Upcoming Dates

Course Evaluations

  • Evaluations are VERY Important:

    • coursefeedback.syr.edu

    • I will end class a little early again today to give you time to complete evaluations in class.

    • Please complete evaluations for ALL courses.

Today’s Plan

  • Resources for Expanding and Explaining Your Skills

  • Overview of Course

  • Study Suggestions

  • A Few Questions

  • Evaluations

  • Time for your Questions

In-class Polling (Session ID: bua345s25)

Lecture 28 In-class Exercises - Q1 - Review

If you suspect your time series data have a seasonal component, the third set of practice questions demonstrates that you should develop TWO versions of the Auto ARIMA (auto.arima in R) model and compare them.


Specify the option, written correctly with no spaces, that states that the model should assume there IS a seasonal component.

Finance and Forecasting Coursework at Whitman

  • After Tuesday’s Demo lecture, I received a lot of questions about going further with forecasting.

  • Here are some great classes in Whitman that will expand your financial forecasting skillset:

    • FIN 454 - Financial Analytics (Uses R and RStudio)

    • FIN 461 - Financial Modeling (Uses R and RStudio)

    • FIN 400 - Algorithmic Trading with Kivanc Avrenli (Uses Python)

  • There are other great Finance classes, but these are the ones I have direct knowledge of.

Continuing in Business Analytics

  • BUA 345 provides a solid foundation in data literacy.

  • If you find this material interesting and want to take the lead on analyzing data, consider the Business Analytics (BA) major.

  • Skills that the BA major focuses on:

    • Advanced Analytics focuses on dealing with complex and large data sets.

    • Data Management shows how to go from raw data on the internet to informative data visualization dashboards.

    • Predictive Analytics expands on the modeling skills in this course to show students how to develop models and make sound business predictions.

    • Data Mining and Network Modeling

    • Visual Analytics builds on the visualization skills from the Data Management course to show how to present data effectively.

  • Financial Analytics and Marketing Analytics are electives for the BA major.

  • More courses are being developed.

Expand Your Analytics and Data Science Skillset

  • As SU students, you also have free access to LinkedIn Learning

    • Great tutorials in R, Python, SQL

    • R is versatile and powerful, but employers may prefer Python, SQL, or another language/environment because that is what they know.

    • R, Python, Observable and Julia can all be used with Quarto

  • DataCamp - Not Free, but Excellent.

Talking about Your Skillset

  • Explaining your analytics skillset is challenging, but it’s getting easier.

  • As data science and analytics grow as a field, more people understand what this skillset can offer.

  • You should not assume that interviewers, colleagues, or supervisors understand your skills.

  • This white paper from DataCamp (also posted on Blackboard) is helpful.

  • Starting on page 9, it lays out the different roles people take on when working with data.

  • Comparing these descriptions to skills you learn in BUA 345 (and future courses) will help you communicate your skillset with confidence.

Final Exam Information

  • Timed test - 90 minutes

    • 2:00 PM Section: Tuesday 5/6 at 10:15 AM in Room 11

    • 3:30 PM Section: Friday 5/2 at 8:00 AM in Room 009

  • There is no asynchronous option for this test, and students must attend at the time scheduled for their section unless they have already spoken with me.

  • Seats are limited.

Overview of Course

  • Excel Skills

    • Lectures 1 - 6

    • HW Assignments 2, 3 and 4.1

  • Correlation, SLR, MLR, Logistic Regression

    • Lectures 7 - 18

    • HW Assignments 4.2, 5, 6, 7, 8

  • Non-linear Models and Optimization

    • Lectures 22 - 24

    • HW 9 and Q1 of HW 10

  • Forecasting

    • Lectures 25 - 27

    • HW 10

Additional Important Course Material

  • Practice Questions

    • For Quiz 1

    • For Quiz 2

    • Additional Practice Questions

  • Quizzes

    • Quiz 1

    • Quiz 2

  • Final Exam Questions will mostly be adapted from previous quiz questions and practice questions.

Excel Skills - Lectures 1 - 6

  • Relative and Absolute ($) cell references

  • Excel Tables and Table Options

    • Sorting, filtering, finding duplicates

  • Excel Pivot Tables

    • Summarizing complex data

    • Many options

  • VLOOKUP

    • Including an embedded MATCH function to increase functionality

    • Both Range and Exact match lookups

Lecture 28 In-class Exercises - Q2

What percent of the females that survived the Titanic disaster were in Second Class?


  • There are MANY ways to approach this question.

  • Hint: This may be easier to do if you keep the data as counts instead of converting values to percentages.

  • Round your answer to the nearest whole percent and don’t include the percent sign in your answer.

Correlation, SLR, MLR, and Logistic Regression

  • This material comprises the largest part of the course.

  • Correlation and Simple Linear Regression were included in Quiz 1

    • Calculating and interpreting correlations using the cor command

    • Creating a Simple Linear Regression (SLR) model and verifying it is valid.

    • Knowing when the log (natural log) transformation is useful.

  • All Multiple Linear Regression (MLR) topics were included in Quiz 2

    • Model Selection Methods

    • Measures of Goodness of Fit, e.g., Adjusted \(R^2\) and AIC

    • Basic commands for creating a model, e.g., lm, ols_regress

  • Logistic Regression (glm)

    • When is it used?

    • How do we use model results to find probabilities?
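As a reminder of how a logistic regression result becomes a probability, here is a minimal sketch. It assumes the built-in mtcars data as a stand-in (not course data), with the 0/1 variable am as the response; the hypothetical new observation is made up for illustration.

```r
# Stand-in data: mtcars ships with R; am is a 0/1 outcome (assumption, not course data)
model <- glm(am ~ wt, data = mtcars, family = binomial)

# Hypothetical new observation: a car weighing 3,000 lbs (wt is in 1000s of lbs)
new_car <- data.frame(wt = 3)

# The model's linear prediction is on the log-odds scale
log_odds <- predict(model, newdata = new_car, type = "link")

# Convert log-odds to a probability: p = exp(x) / (1 + exp(x))
prob_manual <- exp(log_odds) / (1 + exp(log_odds))

# predict() can also return the probability directly
prob_direct <- predict(model, newdata = new_car, type = "response")

# The two approaches agree
all.equal(unname(prob_manual), unname(prob_direct))
```

The manual conversion is the step to remember when only the model coefficients are given: plug the values into the linear equation, then apply exp(x) / (1 + exp(x)).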

Lecture 28 In-class Exercises - Q3-Q4

Question 3. What is the slope for the Premium cut diamond category in these data?

Question 4. In this diamonds dataset, Ideal cut is the baseline category and there are a total of three cut categories, Ideal, Premium, and Very Good.

Which other category, Very Good or Premium, is not significantly different from Ideal?

                           Model Summary                             
--------------------------------------------------------------------
R                         0.880       RMSE                  401.048 
R-Squared                 0.774       MSE                160839.634 
Adj. R-Squared            0.773       Coef. Var              16.202 
Pred R-Squared            0.769       AIC                 14840.040 
MAE                     310.166       SBC                 14874.394 
--------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                    ANOVA                                     
-----------------------------------------------------------------------------
                     Sum of                                                  
                    Squares         DF      Mean Square       F         Sig. 
-----------------------------------------------------------------------------
Regression    550650373.945          5    110130074.789    680.611    0.0000 
Residual      160839633.955        994       161810.497                      
Total         711490007.900        999                                       
-----------------------------------------------------------------------------

                                         Parameter Estimates                                          
-----------------------------------------------------------------------------------------------------
             model        Beta    Std. Error    Std. Beta      t        Sig        lower       upper 
-----------------------------------------------------------------------------------------------------
       (Intercept)    -231.059        83.222                 -2.776    0.006    -394.369     -67.749 
             carat    4112.089       120.651        0.907    34.083    0.000    3875.330    4348.848 
        cutPremium     147.418       115.427       -0.079     1.277    0.202     -79.091     373.927 
      cutVery Good    -152.745       120.764       -0.038    -1.265    0.206    -389.726      84.237 
  carat:cutPremium    -425.098       163.797       -0.058    -2.595    0.010    -746.525    -103.670 
carat:cutVery Good     117.709       174.465        0.014     0.675    0.500    -224.652     460.071 
-----------------------------------------------------------------------------------------------------
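For output like the table above, where a quantitative variable interacts with a categorical one, the slope for a non-baseline category is the baseline slope plus that category's interaction term. The sketch below shows the idea on the built-in iris data, which is a stand-in chosen for illustration and not the diamonds sample from the slide.

```r
# Stand-in data: iris ships with R (assumption; the slide uses a diamonds sample)
# setosa is the baseline Species because it comes first alphabetically
model <- lm(Petal.Width ~ Petal.Length * Species, data = iris)
b <- coef(model)

# Baseline slope is the coefficient on the quantitative variable
slope_setosa <- b["Petal.Length"]

# Slope for another category = baseline slope + that category's interaction term
slope_versicolor <- b["Petal.Length"] + b["Petal.Length:Speciesversicolor"]

# Check: a model fit to versicolor alone has the same slope
versicolor_only <- lm(Petal.Width ~ Petal.Length,
                      data = subset(iris, Species == "versicolor"))
all.equal(unname(slope_versicolor),
          unname(coef(versicolor_only)["Petal.Length"]))
```

The same logic applies to intercepts: a category's intercept is the baseline intercept plus that category's own (non-interaction) coefficient.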

NL Models and Optimization - Lectures 22 - 24

  • Example Questions in Lectures, HW 9, HW 10, and Additional Practice Questions

  • Non-linear (NL) models

    • Excel is a great tool to efficiently compare different model choices

    • Model coefficients and \(R^2\) can be requested for each model.

    • Adjusted \(R^2\) values will be provided.

  • Optimization

    • Non-linear model optimization using the GRG Nonlinear method in Excel Solver.

    • Optimizing systems of linear equations using Simplex LP method in Excel Solver.

Forecasting - Lectures 25 - 27

  • Cross-sectional vs. Time-Series Data

  • Forecasting Terminology

  • Using ts command to correctly specify a time series

  • Implementing and Interpreting Auto-ARIMA models (auto.arima in R)

    • Determining if data have a seasonal component

    • Reporting requested prediction bounds or the width of the prediction interval (Hi - Lo)

    • Calculating model percent accuracy: (100 - MAPE)%

    • Examining and comparing model residuals (HW 10)

    • Specifying a seasonal model (seasonal=T) and comparing it to a non-seasonal one
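The forecasting steps listed above can be sketched as follows. This is a rough outline, not the exact course workflow: it assumes the forecast package is installed and uses the built-in AirPassengers data (monthly, starting January 1949) as a stand-in for course data.

```r
library(forecast)  # assumes the forecast package is installed

# Stand-in data: AirPassengers ships with R; rebuild it with ts() to show the call
y <- ts(as.numeric(AirPassengers), start = c(1949, 1), frequency = 12)

# Two versions of the model: one allowed to be seasonal, one not
fit_seasonal    <- auto.arima(y, seasonal = TRUE)
fit_nonseasonal <- auto.arima(y, seasonal = FALSE)

# Percent accuracy = 100 - MAPE (MAPE comes from the training-set accuracy table)
acc_seasonal    <- 100 - accuracy(fit_seasonal)[, "MAPE"]
acc_nonseasonal <- 100 - accuracy(fit_nonseasonal)[, "MAPE"]

# 12-month forecast with 95% prediction bounds
fc <- forecast(fit_seasonal, h = 12, level = 95)

# Width of each prediction interval = Hi - Lo
width <- fc$upper - fc$lower
```

Comparing acc_seasonal with acc_nonseasonal, along with the residuals of each model, is one way to judge whether the seasonal version is the better choice.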

Key Points from Today

To submit an Engagement Question or Comment about material from Lecture 28: Submit it by midnight today (day of lecture).