Load in Useful Packages

library(haven)
library(psych)

Attaching package: ‘psych’

The following objects are masked from ‘package:ggplot2’:

    %+%, alpha
library(tidyverse)

Load in the Dataset and Glimpse

hsb2 <- read_dta("hsb2.dta")
glimpse(hsb2)
Rows: 200
Columns: 11
$ id      <dbl> 70, 121, 86, 141, 172, 113, 50, 11, 84, 48, 75, 60, 95, 104, 38, 115…
$ female  <dbl+lbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ race    <dbl+lbl> 4, 4, 4, 4, 4, 4, 3, 1, 4, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4…
$ ses     <dbl+lbl> 1, 2, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 1, 1, 3, 2, 3, 2, 2, 2…
$ schtyp  <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1…
$ prog    <dbl+lbl> 1, 3, 1, 3, 2, 2, 1, 2, 1, 2, 3, 2, 2, 2, 2, 1, 2, 1, 2, 1, 1, 3…
$ read    <dbl> 57, 68, 44, 63, 47, 44, 50, 34, 63, 57, 60, 57, 73, 54, 45, 42, 47, …
$ write   <dbl> 52, 59, 33, 44, 52, 52, 59, 46, 57, 55, 46, 65, 60, 63, 57, 49, 52, …
$ math    <dbl> 41, 53, 54, 47, 57, 51, 42, 45, 54, 52, 51, 51, 71, 57, 50, 43, 51, …
$ science <dbl> 47, 63, 58, 53, 53, 63, 53, 39, 58, 50, 53, 63, 61, 55, 31, 50, 50, …
$ socst   <dbl> 57, 61, 31, 56, 61, 61, 61, 36, 51, 51, 61, 61, 71, 46, 56, 56, 56, …

Codebook from Phil Ender

(http://www.philender.com/courses/intro/hsbcode.html)

Here’s a handy trick: you can insert screenshots, JPEG files, and other images directly into an R Markdown file using Markdown image syntax, e.g. `![](HSB_Codebook.png)`. See the .Rmd file for the specific code.

Question 1

Using the dataset hsb2.dta, produce a detailed summary of the variable progtype (curricular program type; it is named prog in this copy of the data). In R, use the describe function from the psych package. Then tabulate progtype using the tab command (or the table command in R), both with and without the value labels. Paste the summary and the tabulations below. Which type of descriptive measure is most appropriate for this kind of data?

Statistical Summary of All Variables

describe(hsb2)

Table Showing Breakdown for prog

table(hsb2$prog)

  1   2   3 
 45 105  50 

The most appropriate descriptive measure for program type is a frequency count (or percentage) for each category, which the table function in R provides. prog is a categorical (nominal) variable with three school program types (1 = general, 2 = academic prep, 3 = vocational/technical), so the mean and standard deviation that describe reports for it are not meaningful.
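The question also asks for the tabulation with value labels, which the table above leaves out. Here is a minimal base-R sketch of one way to get both versions. The vector `prog` is a stand-in built from the counts shown above (it is not read from hsb2.dta), and the label text is taken from the codebook description in the answer. On the real labelled column, `as_factor()` from haven should do the same conversion using the labels stored in the file.

```r
# Stand-in for hsb2$prog: codes repeated to match the counts in the table above
prog <- rep(c(1, 2, 3), times = c(45, 105, 50))

# Tabulation WITHOUT value labels
table(prog)

# Tabulation WITH value labels (label text assumed from the codebook description)
prog_f <- factor(prog,
                 levels = c(1, 2, 3),
                 labels = c("general", "academic prep", "vocational/technical"))
counts <- table(prog_f)
counts

# Percentages and the modal (most frequent) category
pct <- round(100 * prop.table(counts), 1)
pct
modal_category <- names(counts)[which.max(counts)]
modal_category
```

Counts, percentages, and the mode are the summaries that make sense for a nominal variable like this one.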

Question 2

Construct a graph showing three histograms, one for each level of ses (low, medium, and high) for the write variable (standardized writing scores). In R, use filter() to create three separate datasets, one for each SES group, and then use the hist() function to make a histogram for each. Paste a picture of the graph(s) below. Is it useful for comparing the writing scores by SES? Why or why not?

Use filter to Create Three Separate Datasets

ses1 <- filter(hsb2, ses == 1)
ses2 <- filter(hsb2, ses == 2)
ses3 <- filter(hsb2, ses == 3)

Histogram for SES == 1 (Low)

hist(ses1$write)

Histogram for SES == 2 (Medium)

hist(ses2$write)

Histogram for SES == 3 (High)

hist(ses3$write)

Yes, it is helpful to compare the writing scores by SES. When you filter on students' socioeconomic status, the shape of the histogram changes from group to group. The data suggest that students with high socioeconomic status tend to have higher standardized writing scores.
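One caution when comparing separate hist() plots: by default each call picks its own bins and axis range, so the three pictures may not line up. Here is a minimal base-R sketch of one way around that, using the same bin edges for every group and stacking the panels. The data frame `toy` holds made-up scores so the block runs on its own (only the group sizes come from this document); swap in hsb2 to use the real data.

```r
# Stand-in data: arbitrary random scores, NOT the hsb2 values.
# Group sizes follow the Ns reported for the three SES levels (47, 95, 58).
set.seed(1)
toy <- data.frame(
  ses   = rep(1:3, times = c(47, 95, 58)),
  write = sample(31:67, size = 200, replace = TRUE)
)

# One set of bin edges shared by all groups, wide enough to cover every score
breaks <- seq(30, 70, by = 5)

# Stack the three panels vertically so the x-axes line up
old_par <- par(mfrow = c(3, 1))
h <- lapply(1:3, function(level) {
  hist(toy$write[toy$ses == level],
       breaks = breaks,
       main = paste("SES ==", level),
       xlab = "write")
})
par(old_par)  # restore the previous plotting layout
```

With shared bins, a shift in where the bars pile up reflects the scores rather than the plotting defaults.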

A Little Fun Bonus:

Create a Combined Plot of all Three Distributions, Highlighted by SES

This Uses the ggplot2 Package, Which is Great for More Customized Plotting

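library(ggplot2)  # already attached by tidyverse; loading it again is harmless

# Base plot: writing score on the x-axis
p <- ggplot(data = hsb2, mapping = aes(x = write))

# One histogram panel per SES level; facet by the column name, not hsb2$ses
p + geom_histogram(fill = "blue", alpha = 0.5, binwidth = 10) +
  facet_wrap(~ ses) +
  labs(title = "Distribution of Writing Scores, by Student SES Level",
       caption = "1 = Low SES (N = 47), 2 = Medium SES (N = 95), 3 = High SES (N = 58).")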
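The group sizes in the caption are typed in by hand, so they can drift if the data change. Here is a small sketch of building the same caption from a tabulation instead. The vector `ses` is a stand-in that reproduces the Ns above rather than reading hsb2.dta, and `ses_names` is an assumed lookup based on the low/medium/high wording used in this document.

```r
# Stand-in for hsb2$ses: codes 1-3 repeated to match the Ns in the caption above
ses <- rep(1:3, times = c(47, 95, 58))

# Assumed names for the three codes (from the Low / Medium / High headings)
ses_names <- c("Low", "Medium", "High")

# Count students per SES level, then assemble one caption string
n_by_ses <- table(ses)
caption_text <- paste0(
  names(n_by_ses), " = ", ses_names, " SES (N = ", as.vector(n_by_ses), ")",
  collapse = ", "
)
caption_text <- paste0(caption_text, ".")
caption_text
```

The result can then be passed to the plot as `labs(caption = caption_text)`.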
