set.seed(42)
library(readr)
library(lessR)

library(tidyr)

Learning Objective 1

What is employee training?
- Employee training consists of programs that help employees learn the knowledge, skills, abilities, and other characteristics (KSAOs) they need to perform their jobs effectively.
What is the employee training process?
- The employee training process is the set of steps organizations use to plan, deliver, and assess training. The steps include conducting a needs assessment, creating a good learning environment, choosing training methods, and evaluating whether the training worked.
What is a “training needs assessment” and why do we need to use it before employees are sent through training programs?
- This critical step helps organizations target who needs training, what they need to be trained on, and what resources the organization has to support the training initiative. It should be used before sending employees through training so the organization can make sure the training is needed.
What analyses are involved in a training needs assessment?
- Organizational analysis, job analysis, and person analysis
How do these analyses inform the design of training goals and objectives?
- They help the organization figure out what problems need to be addressed, what employees need to learn, and which employees need training.
What trainee characteristics increase the odds that trained behaviors transfer to the workplace?
- Personality & cognitive ability, self-efficacy, motivation to learn, and metacognitive skills.
What organizational context characteristics impact training transfer?
- Organizational support, a work environment that is similar to the training environment, and training that teaches underlying principles by explaining why trainees should do what they are doing.
What are some common training methods?
- On-the-job training, lectures, simulations, programmed instruction, eLearning, and behavior modeling training.
What are four common measures of training effectiveness (e.g., results)?
- Reactions: Extent to which the training was liked and relevant
- Learning: Demonstration of KSAOs or behaviors from training
- Behavior: Changes to on-the-job behaviors learned from training
- Results: Changes to performance/organizational indicators
What are common research designs to use to determine whether training is effective and when would it be appropriate to use each one?
- Post-test only without control group: When the concern is whether trainees scored high enough on a measure of training effectiveness
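- Pre-test, post-test without control group: When a baseline is needed to show whether scores changed from before to after training; more rigorous than post-test only without a control group
- Post-test only with control group: When a comparison group is available to show whether trained employees score differently from untrained employees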
- Pre-test, post-test with control group: Contains a baseline measurement and control group which can strengthen inferences about training effectiveness
What is random assignment and why might we use it in a training context?
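- This is when people are placed into groups (e.g., a training group and a control group) by chance. We use it so the groups are comparable before training begins, which lets us attribute differences in outcomes to the training rather than to pre-existing differences between the groups.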
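
Random assignment can be sketched in a few lines of base R. This is a minimal illustration only: the 50-employee roster and the `EmpID` values are made up, and the condition labels "New" and "Old" are borrowed from the post-test data used later in these notes.

```r
# Hypothetical roster of 50 employees (IDs are invented for illustration).
set.seed(42)
employees <- data.frame(EmpID = 1:50)

# Build 25 "New" and 25 "Old" labels, then shuffle them so that
# chance alone decides which condition each employee receives.
labels <- rep(c("New", "Old"), each = 25)
employees$Condition <- sample(labels)

# Confirm the groups are the same size.
table(employees$Condition)
```

Shuffling a fixed set of labels, rather than flipping a coin per person, keeps the two groups equal in size.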

Learning Objective 2

What is a paired-sample t-test?
- It is an inferential statistical test used to compare 2 related sets of scores from the same people.
What are some assumptions underlying this t-test?
- The difference scores based on 2 outcome measures are independent of each other
- The difference scores have a univariate normal distribution in the underlying population
How do we infer statistical significance?
- By comparing the t-value to a table of critical values of a t-distribution or, equivalently, by checking whether the p-value is below the chosen alpha level (e.g., .05)
How do we examine practical significance? Note: you’ll have to convert the t-test to Cohen’s d. See the examples for assistance.
- By converting the t-test results into Cohen’s d.
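
The sketch below shows one way to run the test and do the conversion in base R. The scores are hypothetical (they are not the tutorial data), and it assumes Cohen's d for paired data is defined as the mean difference divided by the standard deviation of the difference scores, which is equivalent to t divided by the square root of n.

```r
# Hypothetical pre/post scores for 8 trainees (not the tutorial data).
pre  <- c(50, 55, 47, 62, 58, 45, 53, 60)
post <- c(68, 70, 66, 75, 77, 59, 72, 74)

diffs <- post - pre
n <- length(diffs)

# Paired-samples t statistic: mean difference divided by its standard error.
t_manual <- mean(diffs) / (sd(diffs) / sqrt(n))

# The same test using base R's t.test().
res <- t.test(post, pre, paired = TRUE)

# Statistical significance: compare |t| to the two-tailed critical value
# at alpha = .05 with n - 1 degrees of freedom.
t_crit <- qt(0.975, df = n - 1)
significant <- abs(t_manual) > t_crit

# Practical significance: Cohen's d computed two equivalent ways.
d_from_diffs <- mean(diffs) / sd(diffs)
d_from_t <- unname(res$statistic) / sqrt(n)
```

Both routes to d give the same value, so the conversion from t only requires knowing the sample size.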

Learning Objective 3

Watch this tutorial “Paired Samples t-test in R” (~45 min.) and reproduce his work. Post a screenshot below.

td <- read_csv("TrainingEvaluation_PrePostOnly.csv")

## Rows: 25 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): EmpID, PreTest, PostTest
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# paired samples t-test
ttest(PreTest, PostTest, data = td, paired = TRUE)

## 
## 
## ------ Describe ------
## 
## Difference:  n.miss = 0,  n = 25,   mean = 19.640,  sd = 11.597
## 
## 
## ------ Normality Assumption ------
## 
## Null hypothesis is a normal distribution of Difference.
## Shapiro-Wilk normality test:  W = 0.9687,  p-value = 0.613
## 
## 
## ------ Infer ------
## 
## t-cutoff for 95% range of variation: tcut =  2.064 
## Standard Error of Mean: SE =  2.319 
## 
## Hypothesized Value H0: mu = 0 
## Hypothesis Test of Mean:  t-value = 8.468,  df = 24,  p-value = 0.000
## 
## Margin of Error for 95% Confidence Level:  4.787
## 95% Confidence Interval for Mean:  14.853 to 24.427
## 
## 
## ------ Effect Size ------
## 
## Distance of sample mean from hypothesized:  19.640
## Standardized Distance, Cohen's d:  1.694
## 
## 
## ------ Graphics Smoothing Parameter ------
## 
## Density bandwidth for 6.941
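
As a check on how the sections of this output fit together, the inferential values can be recomputed from the three numbers in the Describe section. This is a sketch using the rounded values printed above, so the results should agree with the output only up to rounding.

```r
# Values copied from the Describe section of the output above.
n <- 25
mean_diff <- 19.640
sd_diff <- 11.597

# Standard error of the mean difference and the t statistic (H0: mu = 0).
se <- sd_diff / sqrt(n)
t_value <- mean_diff / se
df <- n - 1

# Two-tailed critical value, p-value, margin of error, and 95% CI.
t_cut <- qt(0.975, df)
p_value <- 2 * pt(-abs(t_value), df)
moe <- t_cut * se
ci <- mean_diff + c(-1, 1) * moe

# Cohen's d: mean difference in standard deviation units.
d <- mean_diff / sd_diff

round(c(SE = se, t = t_value, tcut = t_cut, MoE = moe,
        lower = ci[1], upper = ci[2], d = d), 3)
```

The p-value is far below .001, which is why the output rounds it to 0.000.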

# data visualization: bar chart
prepost_means <- c(mean(td$PreTest, na.rm=TRUE), mean(td$PostTest, na.rm=TRUE))

meannames <- c("Pre-Test", "Post-Test")

meansdf <- data.frame(meannames, prepost_means)

remove(prepost_means)
remove(meannames)

BarChart(x=meannames, y=prepost_means, data=meansdf, xlab="Time of Test", ylab="Average Score")

## lessR visualizations are now unified over just three core functions:
##   - Chart() for pivot tables, such as bar charts. More info: ?Chart
##   - X() for a single variable x, such as histograms. More info: ?X
##   - XY() for scatterplots of two variables, x and y. More info: ?XY
## 
## BarChart() is deprecated, though still working for now.
## Please use Chart(..., type = "bar") going forward.

## [Interactive chart from the Plotly R package (Sievert, 2020)] 
## 
## >>> Suggestions  or  enter: style(suggest=FALSE)
## Chart(meannames type="radar")  # Plotly radar chart
## Chart(meannames type="treemap")  # Plotly treemap chart
## Chart(meannames type="pie")  # Plotly pie/sunburst chart
## Chart(meannames type="icicle")  # Plotly icicle chart
## Chart(meannames type="bubble")  # Plotly bubble chart
##

##  Plotted Values 
##  -------------- 
##   Pre-Test  Post-Test 
##      52.72      72.36

Learning Objective 4

What is an independent samples t-test?
- An inferential statistical analysis that can be used to compare the means of 2 independent groups.
What are some assumptions underlying this t-test?
- The outcome variable has a univariate normal distribution in each of the 2 underlying populations
- The variances of the outcome variable are equal across the 2 populations
How do we infer statistical significance?
- Compare the t-value to a table of critical values of a t-distribution or, equivalently, check whether the p-value is below the chosen alpha level (e.g., .05).
How do we examine practical significance? Note: you’ll have to convert the t-test to Cohen’s d. See the examples for assistance.
- By converting to Cohen’s d, the standardized mean difference score.
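
The conversion from t to Cohen's d for two independent groups can be sketched as follows. The scores are hypothetical (not the tutorial data), and the sketch assumes the equal-variances version of the test, where d is the mean difference divided by the pooled standard deviation.

```r
# Hypothetical post-test scores for two independent groups (not the tutorial data).
new_grp <- c(72, 68, 75, 80, 66, 71, 77, 69)
old_grp <- c(60, 65, 58, 70, 62, 55, 67, 61)
n1 <- length(new_grp)
n2 <- length(old_grp)

# Independent-samples t-test assuming equal population variances.
res <- t.test(new_grp, old_grp, var.equal = TRUE)
t_value <- unname(res$statistic)

# Convert t to Cohen's d using only t and the two group sizes.
d_from_t <- t_value * sqrt(1 / n1 + 1 / n2)

# The same d computed directly: mean difference over the pooled SD.
sd_pooled <- sqrt(((n1 - 1) * var(new_grp) + (n2 - 1) * var(old_grp)) /
                    (n1 + n2 - 2))
d_direct <- (mean(new_grp) - mean(old_grp)) / sd_pooled
```

The two values of d match because the pooled-variance t statistic is the standardized mean difference scaled by the group sizes.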

Learning Objective 5

Watch this tutorial “Independent-Samples t-test in R” (~45 min.) and reproduce his work. Post a screenshot below.

# independent-samples t-test
id <- read.csv("TrainingEvaluation_PostControl.csv")

# estimate independent samples t-test
ttest(PostTest ~ Condition, data=id, paired=FALSE)

## 
## Compare PostTest across Condition with levels New and Old 
## Grouping Variable:  Condition
## Response Variable:  PostTest
## 
## 
## ------ Describe ------
## 
## PostTest for Condition New:  n.miss = 0,  n = 25,  mean = 72.360,  sd = 6.975
## PostTest for Condition Old:  n.miss = 0,  n = 25,  mean = 61.320,  sd = 9.150
## 
## Mean Difference of PostTest:  11.040
## 
## Weighted Average Standard Deviation:   8.136 
## 
## 
## ------ Assumptions ------
## 
## Note: These hypothesis tests can perform poorly, and the 
##       t-test is typically robust to violations of assumptions. 
##       Use as heuristic guides instead of interpreting literally. 
## 
## Null hypothesis, for each group, is a normal distribution of PostTest.
## Group New  Shapiro-Wilk normality test:  W = 0.950,  p-value = 0.253
## Group Old  Shapiro-Wilk normality test:  W = 0.969,  p-value = 0.621
## 
## Null hypothesis is equal variances of PostTest, homogeneous.
## Variance Ratio test:  F = 83.727/48.657 = 1.721,  df = 24;24,  p-value = 0.191
## Levene's test, Brown-Forsythe:  t = -0.955,  df = 48,  p-value = 0.344
## 
## 
## ------ Infer ------
## 
## --- Assume equal population variances of PostTest for each Condition 
## 
## t-cutoff for 95% range of variation: tcut =  2.011 
## Standard Error of Mean Difference: SE =  2.301 
## 
## Hypothesis Test of 0 Mean Diff:  t-value = 4.798,  df = 48,  p-value = 0.000
## 
## Margin of Error for 95% Confidence Level:  4.627
## 95% Confidence Interval for Mean Difference:  6.413 to 15.667
## 
## 
## --- Do not assume equal population variances of PostTest for each Condition 
## 
## t-cutoff: tcut =  2.014 
## Standard Error of Mean Difference: SE =  2.301 
## 
## Hypothesis Test of 0 Mean Diff:  t = 4.798,  df = 44.852, p-value = 0.000
## 
## Margin of Error for 95% Confidence Level:  4.635
## 95% Confidence Interval for Mean Difference:  6.405 to 15.675
## 
## 
## ------ Effect Size ------
## 
## --- Assume equal population variances of PostTest for each Condition 
## 
## Standardized Mean Difference of PostTest, Cohen's d:  1.357
## 
## 
## ------ Practical Importance ------
## 
## Minimum Mean Difference of practical importance: mmd
## Minimum Standardized Mean Difference of practical importance: msmd
## Neither value specified, so no analysis
## 
## 
## ------ Graphics Smoothing Parameter ------
## 
## Density bandwidth for Condition New: 4.175
## Density bandwidth for Condition Old: 5.464
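
The key values in this output can be recomputed from the group sizes, means, and standard deviations in the Describe section. This sketch uses the rounded values printed above, so agreement is only up to rounding. It covers both the pooled (equal variances) test and the Welch degrees of freedom used when equal variances are not assumed.

```r
# Values copied from the Describe section of the output above.
n1 <- 25; m1 <- 72.360; s1 <- 6.975   # Condition New
n2 <- 25; m2 <- 61.320; s2 <- 9.150   # Condition Old
mean_diff <- m1 - m2

# Pooled ("weighted average") standard deviation and the pooled t-test.
sd_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
se_pooled <- sd_pooled * sqrt(1 / n1 + 1 / n2)
t_pooled <- mean_diff / se_pooled
df_pooled <- n1 + n2 - 2
moe_pooled <- qt(0.975, df_pooled) * se_pooled

# Welch version: separate variances and Welch-Satterthwaite df.
v1 <- s1^2 / n1
v2 <- s2^2 / n2
se_welch <- sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Cohen's d: mean difference in pooled standard deviation units.
d <- mean_diff / sd_pooled

round(c(sd_pooled = sd_pooled, SE = se_pooled, t = t_pooled,
        MoE = moe_pooled, df_welch = df_welch, d = d), 3)
```

Because the two groups are the same size, the pooled and Welch standard errors are identical here; only the degrees of freedom, and therefore the critical value and margin of error, differ.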

# data visualization: bar chart
BarChart(x=Condition, y=PostTest,
         data=id, stat = "mean",
         xlab="Training Condition",
         ylab="Mean of Post-Test Scores")

## [Interactive chart from the Plotly R package (Sievert, 2020)] 
## 
## >>> Suggestions  or  enter: style(suggest=FALSE)
## Chart(Condition type="radar")  # Plotly radar chart
## Chart(Condition type="treemap")  # Plotly treemap chart
## Chart(Condition type="pie")  # Plotly pie/sunburst chart
## Chart(Condition type="icicle")  # Plotly icicle chart
## Chart(Condition type="bubble")  # Plotly bubble chart
##  
## PostTest 
##   - by levels of - 
## Condition 
##  
##        n   miss  mean    sd   min   mdn   max 
## New   25      0    72     7    60    73    84 
## Old   25      0    61     9    42    61    79 
##

##  Plotted Values 
##  -------------- 
##     New    Old 
##      72     61

HR Analytics PPA 8
