- A factor is a vector used to store categorical data
- Factors have two attributes: a class of "factor" and levels, which define the set of allowed values
- They are useful for recording the set of possible values, even when some aren't present in the data
stem_majors <- factor(c("Biology", "Chemistry", "Biology", "Computer Science"),
                      levels = c("Biology", "Chemistry", "Physics", "Engineering",
                                 "Mathematics", "Computer Science", "Statistics"))
stem_majors
## [1] Biology Chemistry Biology Computer Science
## 7 Levels: Biology Chemistry Physics Engineering ... Statistics
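
To make the points above concrete, here is a minimal sketch that inspects the two attributes and shows why keeping unused levels can help. It rebuilds the same stem_majors factor so it runs on its own; major_counts and not_allowed are names made up for this illustration.

```r
# Rebuild the factor from the example so this sketch is self-contained
stem_majors <- factor(
  c("Biology", "Chemistry", "Biology", "Computer Science"),
  levels = c("Biology", "Chemistry", "Physics", "Engineering",
             "Mathematics", "Computer Science", "Statistics")
)

# The two attributes that make this vector a factor
class(stem_majors)    # "factor"
levels(stem_majors)   # all seven allowed values, in the order given
nlevels(stem_majors)  # 7

# table() counts every level, so majors absent from the data show up as 0
major_counts <- table(stem_majors)
major_counts[["Biology"]]  # 2
major_counts[["Physics"]]  # 0

# A value outside the levels is not allowed: factor() turns it into NA
not_allowed <- factor("History", levels = levels(stem_majors))
is.na(not_allowed)         # TRUE
```

Because the levels are stored with the factor, summaries such as table() can report a zero for Physics, which a plain character vector would silently omit.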