A factor is similar to a vector, but it has a fixed set of allowed values called levels.
x <- factor(c("a","b","c", "d", "e"))
x
## [1] a b c d e
## Levels: a b c d e
As you can see, the output also lists the levels, unlike an ordinary vector, which has none. Now see what happens to the levels if we change the element a to something else:
x[1] <- "c"
x
## [1] c b c d e
## Levels: a b c d e
Notice that the element has changed from a to c, but the levels have not. What you cannot do is assign a value that is not among the levels initially listed; R stores NA instead and issues a warning. To allow additional values, specify them with the levels argument when creating the factor:
y <- factor(c("a", "b", "c", "d", "e"), levels = c("a", "b", "c", "d", "e", "f"))
y
## [1] a b c d e
## Levels: a b c d e f
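The difference between the two factors can be sketched as follows. This is a minimal illustration rather than part of the original example: the names x_bad and x_ok are hypothetical, and suppressWarnings() is used only to keep the output quiet.

```r
# x has levels a-e only; y additionally declares "f" as a level.
x <- factor(c("a", "b", "c", "d", "e"))
y <- factor(c("a", "b", "c", "d", "e"),
            levels = c("a", "b", "c", "d", "e", "f"))

# "f" is a declared level of y, so the assignment works.
y[1] <- "f"
y

# "f" is not a level of x: R warns and stores NA instead.
x_bad <- x
suppressWarnings(x_bad[1] <- "f")
is.na(x_bad[1])

# A level can also be added after creation, which makes the assignment valid.
x_ok <- x
levels(x_ok) <- c(levels(x_ok), "f")
x_ok[1] <- "f"
x_ok
```

Declaring the levels up front is usually the tidier choice when the full set of allowed values is known in advance.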
factor1 <- factor("first")
factor2 <- factor("second")
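Factors do not accept substitutions outside the allowed levels, and combining factors with different levels needs care (in R versions before 4.1.0, c() on factors returns the underlying integer codes rather than a factor). This strictness can be useful as a form of data validation.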
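If two factors do need to be combined, one approach that should behave the same across R versions is to go through character vectors and rebuild the factor. This is a sketch under that assumption; the name combined is hypothetical.

```r
factor1 <- factor("first")
factor2 <- factor("second")

# Convert both factors to character, concatenate, then rebuild the factor
# so that the levels are the union of both inputs.
combined <- factor(c(as.character(factor1), as.character(factor2)))
combined
levels(combined)
```

The rebuilt factor takes its levels from the combined values, so any deliberate level ordering in the inputs would have to be supplied again through the levels argument.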
worktype_char <- c("employed", "unemployed", "employed")
worktype_factor <- factor(x = worktype_char, levels = c("unemployed", "employed", "retired"))
worktype_factor
## [1] employed unemployed employed
## Levels: unemployed employed retired
table(worktype_factor)
## worktype_factor
## unemployed   employed    retired
##          1          2          0
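The validation idea can be sketched like this. The entry "student" and the names allowed, raw, checked and invalid are hypothetical and chosen only for illustration; they are not part of the original data.

```r
allowed <- c("unemployed", "employed", "retired")

# "student" is a made-up unexpected entry used to trigger the check.
raw <- c("employed", "student", "retired")

# Values that are not among the allowed levels silently become NA.
checked <- factor(raw, levels = allowed)
checked

# List the raw entries that were rejected (NA in the factor, not NA in the input).
invalid <- raw[is.na(checked) & !is.na(raw)]
invalid
```

Note that factor() gives no warning when it drops such values at creation time, so an explicit check like the one above is worth adding when the data come from an external source.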
In this example, an imported txt file contains 4 observations of 1 variable, but one value uses a comma in place of a full stop as the decimal mark. Because R cannot read the value 10,0 as numeric, it does not treat the column as numeric: it is read as character in R 4.0.0 and later (where stringsAsFactors defaults to FALSE, so levels() returns NULL) or as a factor in earlier versions. Converting the column to numeric then produces NA for that value, and where the column is a factor, inspecting its levels shows what is occurring.
txt_data <- read.table("C:/Users/acliv1/Dropbox/~ADAM/PhD/Professional development hours and training/Coding/R/Udemy/Rultimate_Project/Rultimate_Project/data/factorimport.txt", header = TRUE)
txt_data
##   value
## 1     7
## 2     8
## 3     9
## 4  10,0
as.numeric(txt_data$value) # not working: "10,0" cannot be parsed as a number
## Warning: NAs introduced by coercion
## [1]  7  8  9 NA
levels(txt_data$value)
## NULL
# Partial solution: convert to character before converting to numeric.
# This avoids getting factor level codes, but "10,0" still becomes NA.
as.numeric(as.character(txt_data$value))
## Warning: NAs introduced by coercion
## [1]  7  8  9 NA
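To recover the value 10,0 as a number rather than NA, the comma has to be treated as the decimal mark. The sketch below shows two ways this might be done. It assumes the file contains exactly the four values shown above and uses an inline copy (raw_text) so that it runs without the file; all object names are hypothetical.

```r
# Inline copy of the file contents shown above.
raw_text <- "value\n7\n8\n9\n10,0"

# Option 1: tell read.table() that the decimal mark is a comma.
fixed_on_read <- read.table(text = raw_text, header = TRUE, dec = ",")
fixed_on_read$value

# Option 2: read the column as text, swap the comma for a period, then convert.
as_text <- read.table(text = raw_text, header = TRUE, stringsAsFactors = FALSE)
fixed_after <- as.numeric(sub(",", ".", as_text$value, fixed = TRUE))
fixed_after

# Why the as.character() step matters when the column is a factor:
# as.numeric() on a factor returns the internal level codes, not the values.
as_factor <- factor(as_text$value)
as.numeric(as_factor)
as.numeric(as.character(as_factor)) # values again, with NA for "10,0"
```

Option 1 is the simpler choice when every decimal in the file uses a comma. Option 2 is more forgiving when only some entries are affected.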