
The basics
Distribution shapes: symmetrical, left skewed, and right skewed
- Population size = N
- Central tendency is the value the data cluster around
- mean is the average
- median is the middle value
- mode is the most frequently occurring data item
- range is the largest data item minus the smallest
- variance is the average squared distance from the mean, \(\sigma^2\)
- standard deviation is the square root of the variance, \(\sigma\)
- proportion (p) is the number of data items of interest divided by the total
- InterQuartile Range (IQR) is the 3rd quartile minus the 1st quartile
- sample variance is \(s^2\)
- sample mean is \(\bar{x}\) (x with a bar on top)
simple stats
sample.pop = 1:20
mean(sample.pop)
[1] 10.5
var(sample.pop)
[1] 35
sd(sample.pop)
[1] 5.91608
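Note that R's var() and sd() divide by n - 1, so the values above are the sample variance and sample standard deviation. A minimal sketch of the other definitions on the same vector follows; `pop.var`, `samp.var` and `mode.val` are illustrative names, and the mode is computed on a made-up vector because 1:20 has no repeated value.

```r
sample.pop <- 1:20
n <- length(sample.pop)

median(sample.pop)                  # middle value: 10.5
diff(range(sample.pop))             # largest minus smallest: 19
IQR(sample.pop)                     # Q3 - Q1 with R's default quantile method: 9.5

# Population variance divides by N; var() divides by n - 1
pop.var  <- sum((sample.pop - mean(sample.pop))^2) / n   # 33.25
samp.var <- var(sample.pop)                              # 35

# Base R has no statistical mode function, so tabulate and take the most frequent value
x <- c(2, 3, 3, 5, 3, 7)            # hypothetical data with a repeated value
mode.val <- as.numeric(names(which.max(table(x))))       # 3
```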
Central Limit Theorem = take samples of size n from a population with mean \(\mu\) and variance \(\sigma^2\). If n > 30, the sampling distribution of the sample mean is approximately Normal with mean \(\mu\) and variance \(\sigma^2 / n\)
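The theorem can be checked with a small simulation. This is only a sketch: the exponential population, the seed, n = 40 and the number of repetitions are all arbitrary choices made for illustration.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 40                          # sample size (> 30)
reps <- 10000                    # number of repeated samples

# Population: exponential with rate 1, so mu = 1 and sigma^2 = 1 (right skewed)
sample.means <- replicate(reps, mean(rexp(n, rate = 1)))

mean(sample.means)               # should be close to mu = 1
var(sample.means)                # should be close to sigma^2 / n = 0.025
```

Calling hist(sample.means) afterwards shows a roughly bell-shaped histogram even though the population itself is skewed.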
Confidence Interval
- confidence level (90%, 95%, 98%)
- define the alpha \(\alpha\) value: for a 95% level, \(\alpha\) = 1 - 0.95 = 0.05
- find the z value for \(\alpha\) / 2
IF n > 30 use z-table
- if the population is normal and the standard deviation is known, use the z-table
- if the population is normal and the standard deviation is not known, use the t-table
IF n < 30, use the t-table with Degrees of Freedom (df = n - 1)
Ex. weights of Canadian adults in kg, normally distributed. Sample size is 25, sample mean is 80.4, sample variance is 20.25. Find the 98% confidence interval for the population mean.
sample size < 30, so use DoF: df = n - 1 = 24, \(\alpha\)/2 = 0.02/2 = 0.010, t-table value = 2.492
sample standard deviation s = sqrt(20.25) = 4.5, standard error = s / sqrt(n) = 4.5 / 5 = 0.9
the confidence interval for the mean: (80.4 - 2.492 * 0.9) < \(\mu\) < (80.4 + 2.492 * 0.9), i.e. 78.157 < \(\mu\) < 82.643
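The same interval can be computed in R instead of reading a t-table. Note that the margin of error is the t value multiplied by the standard error s / sqrt(n), not the t value alone. A minimal sketch, where the variable names are illustrative:

```r
n <- 25            # sample size
xbar <- 80.4       # sample mean (kg)
s <- sqrt(20.25)   # sample standard deviation = 4.5
conf <- 0.98       # confidence level

alpha <- 1 - conf
t.crit <- qt(1 - alpha / 2, df = n - 1)   # about 2.492
margin <- t.crit * s / sqrt(n)            # about 2.243
ci <- c(lower = xbar - margin, upper = xbar + margin)
round(ci, 2)                              # 78.16 82.64
```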
Ex. a sample of 4 bags of chips labelled 220g has sample mean 217.25 and sample variance 6.25. Normal distribution. Alpha level is 0.05.
- Null: \(\mu\) = 220. Alternative: \(\mu\) < 220.
- Significance level: \(\alpha\) = 0.05
- sample size: 4 < 30, use t-table
- df = 4 - 1 = 3
- t-table value (df = 3, one-tailed \(\alpha\) = 0.05): 2.353
- interested in 1 tail, left side (fewer chips), so the critical value is t = -2.353
- reject the null if t falls in the rejection region (t < -2.353)
- t statistic = (217.25 - 220) / (2.5 / sqrt(4)) = -2.2
- -2.2 > -2.353, so fail to reject the null
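The steps above can be sketched in R from the summary statistics alone (t.test() needs the raw data, which the example does not give). Variable names are illustrative.

```r
n <- 4                                   # sample size
xbar <- 217.25                           # sample mean (g)
mu0 <- 220                               # value under the null
s <- sqrt(6.25)                          # sample standard deviation = 2.5
alpha <- 0.05

t.stat <- (xbar - mu0) / (s / sqrt(n))   # -2.2
t.crit <- qt(alpha, df = n - 1)          # about -2.353 (left tail)
p.value <- pt(t.stat, df = n - 1)        # left-tail p-value, about 0.058

reject <- t.stat < t.crit                # FALSE: fail to reject the null
```

The p-value route gives the same decision: 0.058 is greater than 0.05.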
Descriptive Stats
Descriptive stats summarize the data with measures of
- central tendency (mean, median, mode)
- variability (range, std)
Inferential Stats
There are 2 main uses:
- significance testing: determine whether a difference or relationship between groups is statistically significant (unlikely to be due to pure chance)
- estimating population parameters from sample stats
Between Groups: select a sample and assign each person to exactly 1 level of the independent variable
Within Groups: each person receives every level of the independent variable (the same group gets the different levels), or the groups are matched on a variable related to the dependent variable
Hypothesis Testing
The p-value is the probability of getting a result at least as extreme as the one observed by pure randomness alone, i.e. assuming the null hypothesis is true. The alpha \(\alpha\) = 0.05 (95% confidence level, 5% chance of making a type 1 error)
IF p-value <= \(\alpha\) THEN reject the null, results are significant
IF p-value > \(\alpha\) THEN fail to reject the null
Errors
- type 1 is rejecting the null when it is actually true
- type 2 is failing to reject the null when it is actually false
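The link between \(\alpha\) and the type 1 error rate can be illustrated by simulation. This is a sketch under made-up settings (normal data, n = 20, 5000 repetitions, arbitrary seed): every sample is drawn with the null true, so every rejection is a type 1 error.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
alpha <- 0.05
reps <- 5000

# Draw samples for which the null (mu = 0) is true and test each one
p.values <- replicate(reps, t.test(rnorm(20, mean = 0), mu = 0)$p.value)

type1.rate <- mean(p.values <= alpha)   # proportion of false rejections
type1.rate                              # should be near alpha = 0.05
```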
Effect size
Effect size indicates how strong the relationship between variables is, or how large the differences between groups are
Interpretation of strength of a relationship:
- for r and Phi
- very strong >= .70
- large = .50
- medium = .36
- small = .10
- for eta
- very strong >= .45
- large = .37
- medium = .24
- small = .10
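The r and Phi cutoffs can be turned into a small lookup helper. This is a sketch with two assumptions that are not in the notes: each listed value is treated as the lower bound of its band, and values below .10 get the label "negligible". `label.r` is a hypothetical helper name.

```r
# Label |r| or phi using the cutoffs listed above.
# Assumption: each cutoff is the lower bound of its band;
# "negligible" (below .10) is an added label, not from the notes.
label.r <- function(r) {
  cut(abs(r),
      breaks = c(0, 0.10, 0.36, 0.50, 0.70, 1),
      labels = c("negligible", "small", "medium", "large", "very strong"),
      right = FALSE, include.lowest = TRUE)
}

labels <- as.character(label.r(c(0.05, -0.20, 0.40, 0.55, -0.85)))
labels   # "negligible" "small" "medium" "large" "very strong"
```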
Correlation tests of coefficients
Pearson r and bivariate regression are parametric tests.
Correlation coefficients range from -1 to +1; the closer to -1 or +1, the stronger the relationship.
-1 strong <-------- weak --------> strong +1
-0.7 -0.5 -0.3 -0.1 0 0.1 0.3 0.5 0.7
Chi square test
This non-parametric test can be used for difference questions with nominal or dichotomous variables, e.g. class (1 or 2) by condition (on/off)
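A minimal sketch of the test in R with chisq.test(). The counts are hypothetical, chosen only to match the class by condition example.

```r
# Hypothetical counts: two classes (rows) by condition on/off (columns)
counts <- matrix(c(30, 10,
                   15, 25),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(class = c("1", "2"),
                                 condition = c("on", "off")))

res <- chisq.test(counts, correct = FALSE)   # Pearson chi-square, no Yates correction
res$statistic                                # chi-square value, about 11.43
res$parameter                                # degrees of freedom: 1
res$p.value                                  # compare with alpha
```

By default chisq.test() applies the Yates continuity correction to 2 x 2 tables; correct = FALSE is used here so the statistic matches the textbook formula.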
Phi and Cramer’s V
These are measures of association between 2 nominal or dichotomous variables.
- Phi is used when both variables are dichotomous
- Cramer’s V is used when at least one variable has more than two levels/groupings
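Both measures can be derived from the chi-square statistic: V = sqrt(chi-square / (n * (k - 1))), where k is the smaller of the number of rows and columns; for a 2 x 2 table this reduces to the absolute value of Phi. A sketch with hypothetical counts; `cramers.v` is an illustrative helper, not a base R function.

```r
# Cramer's V from a contingency table; for a 2 x 2 table it equals |phi|
cramers.v <- function(tab) {
  chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)             # total number of observations
  k <- min(dim(tab))        # smaller of rows and columns
  sqrt(chi2 / (n * (k - 1)))
}

# Hypothetical counts
tab2x2 <- matrix(c(30, 10, 15, 25), nrow = 2, byrow = TRUE)
tab2x3 <- matrix(c(20, 15, 5, 10, 15, 15), nrow = 2, byrow = TRUE)

phi <- cramers.v(tab2x2)    # about 0.378
v   <- cramers.v(tab2x3)    # about 0.323
```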
Pearson r
This test measures the strength of association between 2 scale variables. Ex. GPA vs hours of study
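A minimal sketch using cor() and cor.test(). The hours and GPA values are invented for illustration.

```r
# Hypothetical data: weekly hours of study and GPA for 8 students
hours <- c(2, 4, 5, 7, 8, 10, 12, 15)
gpa   <- c(2.1, 2.4, 2.9, 2.8, 3.2, 3.4, 3.5, 3.9)

r <- cor(hours, gpa)            # Pearson r is the default method
test <- cor.test(hours, gpa)    # tests the null that the true correlation is 0
r
test$p.value
```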
Wilcoxon and McNemar to compare groups
ANOVA 1-way, F-score
Analysis of variance is used to compare 3+ unrelated groups (the levels of an independent variable) on a dependent variable; it determines whether there are significant differences between the group means. The test statistic is the F-value
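A minimal one-way ANOVA sketch with aov(). The three groups and their scores are hypothetical.

```r
# Hypothetical scores for three unrelated groups
scores <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 5)),
  score = c(10, 12, 11, 13, 12,
            14, 15, 13, 16, 15,
            20, 19, 21, 22, 20)
)

fit <- aov(score ~ group, data = scores)
tab <- summary(fit)[[1]]          # ANOVA table as a data frame
f.value <- tab[["F value"]][1]    # F statistic for the group effect
p.value <- tab[["Pr(>F)"]][1]     # compare with alpha
```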
Post-hoc tests
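The ANOVA just tells you that a significant difference exists, not which groups differ. Choose the post-hoc test from the test for equal variances (e.g. Levene’s test):
- do a Tukey’s HSD test if that p-value is not stat significant (equal variances can be assumed)
- do a Games-Howell test if that p-value is stat significant (variances are unequal)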
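Tukey's HSD is available in base R as TukeyHSD(), applied to a fitted aov model. A sketch on hypothetical data follows; Games-Howell is not part of base R, so it is not shown.

```r
# Hypothetical scores for three unrelated groups
scores <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 5)),
  score = c(10, 12, 11, 13, 12,
            14, 15, 13, 16, 15,
            20, 19, 21, 22, 20)
)

fit <- aov(score ~ group, data = scores)
posthoc <- TukeyHSD(fit)     # all pairwise comparisons, assumes equal variances
posthoc$group                # one row per pair: diff, lwr, upr, p adj
```

Each row names a pair of groups (for example B-A) and gives the difference in means, its confidence interval, and the adjusted p-value.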
ANOVA 2-way
There are 2 independent variables, each with 2+ levels. Samples must be independent, and each participant must receive only 1 level of each independent variable. Variances must be equal. There are 3 null hypotheses: there is no difference between the levels of the 1st independent variable (no main effect), there is no difference between the levels of the 2nd independent variable, and there is no interaction between the independent variables
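A minimal two-way ANOVA sketch with aov(). The balanced 2 x 2 design and the y values are hypothetical; A and B stand in for the two independent variables.

```r
# Hypothetical balanced design: 2 levels of A x 2 levels of B, 3 participants per cell
d <- expand.grid(rep = 1:3, A = c("a1", "a2"), B = c("b1", "b2"))
d$y <- c(10, 11, 12,   # a1, b1
         14, 15, 16,   # a2, b1
         11, 12, 13,   # a1, b2
         21, 22, 23)   # a2, b2

fit2 <- aov(y ~ A * B, data = d)   # A * B expands to A + B + A:B
tab2 <- summary(fit2)[[1]]
tab2                               # rows A, B, A:B: one F test per null hypothesis
```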