M03-2: Data Visualization-Application Assignment

Author

Autum Waller

Published

February 23, 2026

0.1 Data

  attend termGPA priGPA ACT final atndrte hwrte frosh soph missed     stndfnl
1     27    3.19   2.64  23    28  84.375 100.0     0    1      5  0.47268906
2     22    2.73   3.52  25    26  68.750  87.5     0    0     10  0.05252101
3     30    3.00   2.46  24    30  93.750  87.5     0    0      2  0.89285713
4     31    2.04   2.61  20    27  96.875 100.0     0    1      1  0.26260504
5     32    3.68   3.32  23    34 100.000 100.0     0    1      0  1.73319328
6     29    3.23   2.93  26    25  90.625 100.0     0    1      3 -0.15756303
[1] "data.frame"
# A tibble: 680 × 11
   attend termGPA priGPA   ACT final atndrte hwrte frosh  soph missed stndfnl
    <int>   <dbl>  <dbl> <int> <int>   <dbl> <dbl> <int> <int>  <int>   <dbl>
 1     27    3.19   2.64    23    28    84.4 100       0     1      5  0.473 
 2     22    2.73   3.52    25    26    68.8  87.5     0     0     10  0.0525
 3     30    3      2.46    24    30    93.8  87.5     0     0      2  0.893 
 4     31    2.04   2.61    20    27    96.9 100       0     1      1  0.263 
 5     32    3.68   3.32    23    34   100   100       0     1      0  1.73  
 6     29    3.23   2.93    26    25    90.6 100       0     1      3 -0.158 
 7     30    1.54   1.94    21    10    93.8  75       1     0      2 -3.31  
 8     26    2      2.12    22    34    81.2 100       0     1      6  1.73  
 9     24    2.25   2.06    24    26    75   100       1     0      8  0.0525
10     29    3      2.73    21    26    90.6 100       0     1      3  0.0525
# ℹ 670 more rows

1 Ex 1: GT Table

1.1 Create a table using GT

attend  termGPA  priGPA  ACT  final  atndrte  hwrte  frosh  soph  missed      stndfnl
    27     3.19    2.64   23     28   84.375  100.0      0     1       5   0.47268906
    22     2.73    3.52   25     26   68.750   87.5      0     0      10   0.05252101
    30     3.00    2.46   24     30   93.750   87.5      0     0       2   0.89285713
    31     2.04    2.61   20     27   96.875  100.0      0     1       1   0.26260504
    32     3.68    3.32   23     34  100.000  100.0      0     1       0   1.73319328
    29     3.23    2.93   26     25   90.625  100.0      0     1       3  -0.15756303
    30     1.54    1.94   21     10   93.750   75.0      1     0       2  -3.30882359
    26     2.00    2.12   22     34   81.250  100.0      0     1       6   1.73319328
    24     2.25    2.06   24     26   75.000  100.0      1     0       8   0.05252101
    29     3.00    2.73   21     26   90.625  100.0      0     1       3   0.05252101

1.2 Decorate the Table

Preview of Attend Dataset
First 10 Observations
Attendance  Term GPA  Prior GPA  ACT  final  atndrte  hwrte  frosh  soph  missed      stndfnl
        27      3.19       2.64   23     28   84.375  100.0      0     1       5   0.47268906
        22      2.73       3.52   25     26   68.750   87.5      0     0      10   0.05252101
        30      3.00       2.46   24     30   93.750   87.5      0     0       2   0.89285713
        31      2.04       2.61   20     27   96.875  100.0      0     1       1   0.26260504
        32      3.68       3.32   23     34  100.000  100.0      0     1       0   1.73319328
        29      3.23       2.93   26     25   90.625  100.0      0     1       3  -0.15756303
        30      1.54       1.94   21     10   93.750   75.0      1     0       2  -3.30882359
        26      2.00       2.12   22     34   81.250  100.0      0     1       6   1.73319328
        24      2.25       2.06   24     26   75.000  100.0      1     0       8   0.05252101
        29      3.00       2.73   21     26   90.625  100.0      0     1       3   0.05252101
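
A decorated table like the one above could be built roughly as follows. This is only a sketch: the report does not show its code, so the object names (`attend_preview`, `preview_tbl`) are hypothetical, and only four of the eleven columns are typed in to keep it short. The title, subtitle, and column labels are taken from the rendered table.

```r
library(gt)

# The ten rows printed above, limited to four columns to keep the sketch short.
attend_preview <- data.frame(
  attend  = c(27, 22, 30, 31, 32, 29, 30, 26, 24, 29),
  termGPA = c(3.19, 2.73, 3.00, 2.04, 3.68, 3.23, 1.54, 2.00, 2.25, 3.00),
  priGPA  = c(2.64, 3.52, 2.46, 2.61, 3.32, 2.93, 1.94, 2.12, 2.06, 2.73),
  ACT     = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21)
)

# Build the table, add the title and subtitle, then relabel three columns.
preview_tbl <- attend_preview |>
  gt() |>
  tab_header(
    title    = "Preview of Attend Dataset",
    subtitle = "First 10 Observations"
  ) |>
  cols_label(
    attend  = "Attendance",
    termGPA = "Term GPA",
    priGPA  = "Prior GPA"
  )

preview_tbl
```

Columns left out of `cols_label()` keep their variable names, which matches the mix of relabeled and raw headers in the rendered table.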

2 Ex 2: Scatter plot with ggplot2

2.1 Basic Scatter Plot

2.2 Modify + Regression + Facet by soph

First relabel soph variable.

# A tibble: 680 × 11
   attend termGPA priGPA   ACT final atndrte hwrte frosh soph     missed stndfnl
    <int>   <dbl>  <dbl> <int> <int>   <dbl> <dbl> <int> <fct>     <int>   <dbl>
 1     27    3.19   2.64    23    28    84.4 100       0 Sophomo…      5  0.473 
 2     22    2.73   3.52    25    26    68.8  87.5     0 Non-Sop…     10  0.0525
 3     30    3      2.46    24    30    93.8  87.5     0 Non-Sop…      2  0.893 
 4     31    2.04   2.61    20    27    96.9 100       0 Sophomo…      1  0.263 
 5     32    3.68   3.32    23    34   100   100       0 Sophomo…      0  1.73  
 6     29    3.23   2.93    26    25    90.6 100       0 Sophomo…      3 -0.158 
 7     30    1.54   1.94    21    10    93.8  75       1 Non-Sop…      2 -3.31  
 8     26    2      2.12    22    34    81.2 100       0 Sophomo…      6  1.73  
 9     24    2.25   2.06    24    26    75   100       1 Non-Sop…      8  0.0525
10     29    3      2.73    21    26    90.6 100       0 Sophomo…      3  0.0525
# ℹ 670 more rows
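
The relabeling step shown in the output above can be sketched as below. The mapping is read off the printed rows (soph = 1 becomes the "Sophomo…" level, soph = 0 the "Non-Sop…" level). The full label "Sophomore" and the object names are assumptions, since the report's own code is not shown.

```r
library(dplyr)

# soph and missed for the ten rows printed above.
attend_small <- tibble(
  soph   = c(1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L),
  missed = c(5L, 10L, 2L, 1L, 0L, 3L, 2L, 6L, 8L, 3L)
)

# Turn the 0/1 indicator into a labelled factor. Listing 0 first makes
# "Non-Sophomore" the first level, and therefore the first facet panel.
# The label "Sophomore" is assumed from the truncated console output.
attend_labelled <- attend_small |>
  mutate(soph = factor(soph, levels = c(0, 1),
                       labels = c("Non-Sophomore", "Sophomore")))

table(attend_labelled$soph)
```

Setting `levels` explicitly avoids relying on the sort order of the raw values, so the labels cannot be silently swapped.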

2.3 Add theme and labels

2.4 Go Beyond the Minimum Visualization

2.5 Insights

The plot suggests a negative relationship between prior GPA and missed classes. Students with higher prior GPA tend to miss fewer classes. The pattern appears similar across sophomore and non-sophomore groups, although slope strength may differ slightly.
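
A minimal sketch of the kind of plot this insight refers to, using only the ten rows printed in the Data section. The plot title, axis labels, and object names are hypothetical. The per-group slopes at the end are one way to put a number on "slope strength may differ"; on ten rows they are illustrative only.

```r
library(ggplot2)

# Prior GPA, missed classes, and sophomore status for the ten printed rows.
attend_small <- data.frame(
  priGPA = c(2.64, 3.52, 2.46, 2.61, 3.32, 2.93, 1.94, 2.12, 2.06, 2.73),
  missed = c(5, 10, 2, 1, 0, 3, 2, 6, 8, 3),
  soph   = factor(c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1), levels = c(0, 1),
                  labels = c("Non-Sophomore", "Sophomore"))
)

# Scatter plot with a linear fit in each panel, one panel per group.
p_missed <- ggplot(attend_small, aes(x = priGPA, y = missed)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ soph) +
  labs(title = "Missed classes by prior GPA",   # hypothetical title
       x = "Prior GPA", y = "Classes missed") +
  theme_minimal()

# Fit the same line separately per group and keep only the slope.
slopes <- sapply(
  split(attend_small, attend_small$soph),
  function(g) coef(lm(missed ~ priGPA, data = g))[["priGPA"]]
)

p_missed
slopes
```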

3 Ex 3: Understand barplots (dodge vs. stack vs. faceted)

3.1 ACT Count Distribution

3.2 Dodged, Stacked, and Faceted Barplots

3.2.1 Dodged

3.2.2 Stacked

3.2.3 Faceted

3.3 Evaluation

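Stacked bars distort interpretation because final scores are summed: the height of each bar reflects how many students fall at that ACT score as much as how well they scored. Faceted charts make the comparison between groups clearer.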
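
The three layouts can be sketched from the same base plot, assuming the bars were drawn with `geom_col()` on the raw rows (the report does not show its code, so this is a guess at the setup). The last two lines check the summing effect directly on the ten printed rows: two students have ACT = 23, so a stacked bar there has the height of both scores added together.

```r
library(ggplot2)

# ACT, final score, and sophomore status for the ten printed rows.
attend_small <- data.frame(
  ACT   = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21),
  final = c(28, 26, 30, 27, 34, 25, 10, 34, 26, 26),
  soph  = factor(c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1), levels = c(0, 1),
                 labels = c("Non-Sophomore", "Sophomore"))
)

base_plot <- ggplot(attend_small, aes(x = factor(ACT), y = final, fill = soph))

p_dodge <- base_plot + geom_col(position = "dodge")   # groups side by side
p_stack <- base_plot + geom_col(position = "stack")   # groups piled up
p_facet <- base_plot + geom_col(show.legend = FALSE) +
  facet_wrap(~ soph)                                  # one panel per group

# Total bar height vs. the average score at each ACT value.
stack_height <- tapply(attend_small$final, attend_small$ACT, sum)
mean_final   <- tapply(attend_small$final, attend_small$ACT, mean)
```

Comparing `stack_height` with `mean_final` shows how far the bar heights drift from a typical score wherever more than one student shares an ACT value.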

4 Ex 4: Barplots with ACT averages (Understand Bar plots)

4.1 Summarize Data

# A tibble: 38 × 4
   soph            ACT final_avg     n
   <fct>         <int>     <dbl> <int>
 1 Non-Sophomore    14      20       1
 2 Non-Sophomore    15      24.3     6
 3 Non-Sophomore    16      24.2    11
 4 Non-Sophomore    17      25.2    13
 5 Non-Sophomore    18      25      14
 6 Non-Sophomore    19      23.3    23
 7 Non-Sophomore    20      25.6    30
 8 Non-Sophomore    21      23.2    29
 9 Non-Sophomore    22      25.5    28
10 Non-Sophomore    23      26.0    38
# ℹ 28 more rows
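
A summary with the same columns as the table above (`soph`, `ACT`, `final_avg`, `n`) could be produced as sketched below. It runs on the ten rows printed in the Data section, so it yields 8 groups rather than 38. Object names are hypothetical, and the dodged plot at the end is one possible way to chart the averages.

```r
library(dplyr)
library(ggplot2)

# ACT, final score, and sophomore status for the ten printed rows.
attend_small <- tibble(
  ACT   = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21),
  final = c(28, 26, 30, 27, 34, 25, 10, 34, 26, 26),
  soph  = factor(c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1), levels = c(0, 1),
                 labels = c("Non-Sophomore", "Sophomore"))
)

# One row per (soph, ACT) cell: the mean final score and the cell size.
final_by_act <- attend_small |>
  group_by(soph, ACT) |>
  summarise(final_avg = mean(final), n = n(), .groups = "drop")

# Bars now show averages, so their heights no longer depend on group size.
p_avg_dodge <- ggplot(final_by_act,
                      aes(x = factor(ACT), y = final_avg, fill = soph)) +
  geom_col(position = "dodge") +
  labs(x = "ACT", y = "Average final score")

final_by_act
```

Keeping `n` in the summary is useful when reading the chart: an average based on one student (as in the first row of the table above) deserves less weight than one based on thirty.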

4.1.1 Dodged Average Barplot

4.1.2 Stacked Average Barplot

4.1.3 Faceted Average Barplot

4.2 Carefully evaluate the three charts – dodged bar chart, stacked bar chart, or faceted bar chart. Pay attention to the value of the y variable too. Which one(s), if any, do you think accurately describe(s) the data? Why do you think so?

The faceted bar chart most accurately describes the data. The stacked chart is misleading because it combines the two groups, making comparisons difficult. The dodged chart is better, but can look crowded. The faceted chart clearly separates sophomore and non-sophomore groups, making it easier to compare final exam scores across ACT levels.

4.3 Compare the dodged barplot from Q3.2 with the dodged bar chart from Q4.2. Which chart would you trust and why?

I would trust the dodged bar chart from Q4.2 more because it uses the average final exam scores, not raw totals. The chart in Q3.2 reflects total scores, which can be misleading if group sizes differ. Using averages gives a more accurate comparison of performance across ACT scores and sophomore status.

4.4 What would you say about the differences in the patterns of relationships between ACT and final scores shown in the sophomore and non-sophomore groups?

Both groups show a positive relationship between ACT and final scores. Higher ACT scores are generally associated with higher final exam averages. However, the pattern may vary slightly between sophomores and non-sophomores, suggesting that class standing could somewhat influence the strength of the relationship.

5 Ex 5: Boxplot vs. correct barplot

5.1 Boxplot with jitter

5.2 Describe the differences between the box plot above and the barplot you chose from Q4. Which one do you prefer and why?

The boxplot shows the distribution of final scores, including the median, spread, and outliers, while the barplot only shows the average. Because the boxplot provides more detailed information about variability and distribution, I prefer it. It gives a clearer picture of how final scores differ across ACT levels and between groups.
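
A boxplot with jittered points, of the kind compared here, can be sketched as follows on the ten printed rows. The choice of `factor(ACT)` on the x-axis and the facet by `soph` are assumptions about how the original figure was laid out.

```r
library(ggplot2)

# ACT, final score, and sophomore status for the ten printed rows.
attend_small <- data.frame(
  ACT   = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21),
  final = c(28, 26, 30, 27, 34, 25, 10, 34, 26, 26),
  soph  = factor(c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1), levels = c(0, 1),
                 labels = c("Non-Sophomore", "Sophomore"))
)

p_box <- ggplot(attend_small, aes(x = factor(ACT), y = final)) +
  # Hide the boxplot's own outlier dots; the jitter layer draws every point.
  geom_boxplot(outlier.shape = NA) +
  # Jitter horizontally only, so the plotted scores stay exact.
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  facet_wrap(~ soph) +
  labs(x = "ACT", y = "Final score")

p_box
```

Setting `height = 0` matters here: vertical jitter would move the points off their true final scores.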

5.3 What would you say about the differences in the patterns of relationships between ACT and final scores shown in the sophomore and non-sophomore groups?

Both groups show a generally positive relationship between ACT and final scores. However, the spread and consistency of scores may differ slightly between sophomores and non-sophomores, suggesting that class standing may influence how strongly ACT relates to final performance.

6 Ex 6: Impact of ACT on attendance, moderated by freshman status: Boxplot

6.1 Boxplot: ACT and attendance, moderated by freshman status

6.2 Relationship between ACT and attendance by freshman status

The relationship is not exactly the same across the two groups. While both freshmen and non-freshmen may show some association between ACT scores and attendance, the strength and spread of attendance levels appear to differ. This suggests that freshman status may slightly influence how ACT relates to class attendance.

7 Ex 7: Attendance by ACT, moderated by freshman status: Scatter Plot

7.1 Scatterplot

7.1.1 Do the group differences in the pattern of the relationships shown through the barplot become clear or obscure?

Scatter plots with regression lines show the relationship between ACT and attendance more clearly than barplots. Faceting by freshman status makes it easier to see differences between the two groups.

7.2 In sum, describe what you found in the relationship between x and y and how the relationship pattern is different across the two freshman status groups.

Both groups show a positive relationship. Higher ACT scores generally mean higher attendance. The pattern differs slightly, suggesting freshman status may affect the strength of this relationship.
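
A faceted scatter plot of this kind, plus one way to quantify how the slope differs by group, is sketched below. Several things are assumed: that attendance is measured by `atndrte` (the report does not say whether `attend` or `atndrte` was plotted), and the labels "Non-Freshman" and "Freshman". Only two of the ten printed rows are freshmen, so the numbers from this sample are purely illustrative.

```r
library(ggplot2)

# ACT, attendance rate, and freshman status for the ten printed rows.
attend_small <- data.frame(
  ACT     = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21),
  atndrte = c(84.375, 68.75, 93.75, 96.875, 100, 90.625, 93.75, 81.25, 75, 90.625),
  frosh   = factor(c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0), levels = c(0, 1),
                   labels = c("Non-Freshman", "Freshman"))  # assumed labels
)

# One panel per group, each with its own linear fit.
p_attend <- ggplot(attend_small, aes(x = ACT, y = atndrte)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ frosh) +
  labs(x = "ACT", y = "Attendance rate (%)")

# Interaction model: the ACT:frosh coefficient estimates how much the
# ACT slope for freshmen differs from the slope for non-freshmen.
fit <- lm(atndrte ~ ACT * frosh, data = attend_small)
coef(fit)
```

On the full dataset, the sign and size of the `ACT:froshFreshman` coefficient would indicate whether the two panels really have different slopes or only look different.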

8 Ex 8: Correlations

8.1 Correlation Plot

8.1.1 ggpairs

8.1.2 ggcorr

8.1.3 Which one do you like and why?

I prefer ggcorr because it looks cleaner and neater.

8.2 Which pair of variables has the highest positive correlation? What is the correlation statistic? Does it make sense? Which set of variables has the next highest correlation?

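The highest positive correlation among distinct measures is between termGPA and priGPA, which makes sense because students with higher prior GPAs tend to earn higher term GPAs. The attend and atndrte pair (attendance count vs. attendance rate) also correlates very strongly, but that is expected by construction: in the rows printed in the Data section, atndrte equals attend / 32 × 100, so the two variables carry the same information.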

8.3 Between attendance rate (atndrte) and homework rate (hwrte), which is more effective to improve GPA?

Looking at the correlations, attendance rate (atndrte) has a slightly stronger positive association with GPA than homework rate (hwrte). This suggests that attending class may matter more for GPA than turning in homework, although a correlation alone cannot show that either one causes GPA to improve.
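
The correlation views discussed in this exercise can be sketched as below, using six columns of the ten rows printed in the Data section. Which variables the original plots included is not shown, so this selection is an assumption. The last line checks the attend and atndrte pair directly.

```r
library(GGally)

# Six numeric columns for the ten printed rows.
attend_small <- data.frame(
  attend  = c(27, 22, 30, 31, 32, 29, 30, 26, 24, 29),
  termGPA = c(3.19, 2.73, 3.00, 2.04, 3.68, 3.23, 1.54, 2.00, 2.25, 3.00),
  priGPA  = c(2.64, 3.52, 2.46, 2.61, 3.32, 2.93, 1.94, 2.12, 2.06, 2.73),
  ACT     = c(23, 25, 24, 20, 23, 26, 21, 22, 24, 21),
  atndrte = c(84.375, 68.75, 93.75, 96.875, 100, 90.625, 93.75, 81.25, 75, 90.625),
  hwrte   = c(100, 87.5, 87.5, 100, 100, 100, 75, 100, 100, 100)
)

# Plain Pearson correlation matrix, rounded for reading.
cor_mat <- round(cor(attend_small), 2)

# Scatter plots, densities, and correlations in one grid.
p_pairs <- ggpairs(attend_small)

# Compact heat map with the coefficient printed in each cell.
p_corr <- ggcorr(attend_small, label = TRUE, label_round = 2)

# atndrte is attend rescaled to a percentage, so this pair correlates
# perfectly (up to floating-point error).
cor_attend <- cor(attend_small$attend, attend_small$atndrte)
```

Reading the exact coefficients from `cor_mat` is a useful cross-check on either plot, since the printed labels are rounded.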

9 Ex 9: Scatter plot with regression line for the Impact of priGPA on termGPA

9.1 Scatter Plot priGPA → termGPA

9.2 What can you tell about the strengths of the relationships between priGPA and termGPA?

The relationship between priGPA and termGPA is strong and positive. Students with higher prior GPA tend to have higher term GPA. The slopes may differ slightly between sophomores and non-sophomores, but the overall pattern shows a consistent, positive association in both groups.

10 Ex 10: Interactive Plot & Save the plot from Ex 9.

10.1 Interactive
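
One common way to make a ggplot2 figure interactive is `plotly::ggplotly()`. The sketch below assumes that approach and rebuilds a priGPA vs. termGPA plot from the ten rows printed in the Data section. The object names and the facet by `soph` are assumptions.

```r
library(ggplot2)
library(plotly)

# Prior GPA, term GPA, and sophomore status for the ten printed rows.
attend_small <- data.frame(
  priGPA  = c(2.64, 3.52, 2.46, 2.61, 3.32, 2.93, 1.94, 2.12, 2.06, 2.73),
  termGPA = c(3.19, 2.73, 3.00, 2.04, 3.68, 3.23, 1.54, 2.00, 2.25, 3.00),
  soph    = factor(c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1), levels = c(0, 1),
                   labels = c("Non-Sophomore", "Sophomore"))
)

# Static plot: scatter with a linear fit per group.
p_gpa <- ggplot(attend_small, aes(x = priGPA, y = termGPA)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ soph) +
  labs(x = "Prior GPA", y = "Term GPA")

# Convert to an interactive widget with hover tooltips and zoom.
p_interactive <- ggplotly(p_gpa)
p_interactive
```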

10.2 Save Plot
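
Saving a ggplot2 figure is typically done with `ggsave()`. In the sketch below the file name, size, and resolution are hypothetical choices, and the plot is rebuilt from the ten rows printed in the Data section so that the example runs on its own.

```r
library(ggplot2)

# Prior GPA and term GPA for the ten printed rows.
attend_small <- data.frame(
  priGPA  = c(2.64, 3.52, 2.46, 2.61, 3.32, 2.93, 1.94, 2.12, 2.06, 2.73),
  termGPA = c(3.19, 2.73, 3.00, 2.04, 3.68, 3.23, 1.54, 2.00, 2.25, 3.00)
)

p_gpa <- ggplot(attend_small, aes(x = priGPA, y = termGPA)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Prior GPA", y = "Term GPA")

# Hypothetical output path; a temporary directory keeps the sketch tidy.
out_file <- file.path(tempdir(), "prigpa_termgpa.png")

# Pass the plot explicitly rather than relying on "the last plot drawn".
ggsave(out_file, plot = p_gpa, width = 7, height = 5, dpi = 300)
```

Passing `plot = p_gpa` explicitly avoids saving the wrong figure when several plots have been drawn in the same session.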