# 1 Goal

The goal of this tutorial is to help you avoid a common mistake when working with factors. When a factor whose levels are numbers is converted directly to a numerical vector, the result is the position of each level rather than the values stored in the variable. We will see how to detect this problem and how to check that a conversion worked as intended.

# 2 Data preparation

``````# In this exercise we will use a character vector containing numbers
# We will use the iris dataset to perform this exercise
data("iris")
str(iris)``````
``````## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...``````

# 3 Turning a character vector into a numerical vector

``````# We create a character vector from the Sepal.Length variable
char_vector <- as.character(iris$Sepal.Length)
str(char_vector)``````
``##  chr [1:150] "5.1" "4.9" "4.7" "4.6" "5" "5.4" "4.6" ...``
``````# We create a numerical vector from the character vector
num_vector <- as.numeric(char_vector)
str(num_vector)``````
``##  num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...``
``````# We plot the difference, which should be zero everywhere if the values were converted correctly
plot(num_vector - iris$Sepal.Length)``````

``# A plot consisting only of zeroes confirms that the transformation was made correctly``
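
Reading a plot by eye can miss small differences, so the same check can also be written as a numerical test. The following is a minimal sketch, not part of the original exercise; the names `conversion_ok` and `max_diff` are only illustrative.

```r
# Rebuild the vectors from section 3 so that this block runs on its own
data("iris")
num_vector <- as.numeric(as.character(iris$Sepal.Length))

# TRUE when both vectors are equal up to numerical tolerance
conversion_ok <- isTRUE(all.equal(num_vector, iris$Sepal.Length))
conversion_ok

# Largest absolute difference between the converted and original values
max_diff <- max(abs(num_vector - iris$Sepal.Length))
max_diff
```

If the conversion worked, `conversion_ok` should be `TRUE` and `max_diff` should be zero or negligibly small.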

# 4 Turning a factor into a numerical vector

``````# We create a factor type variable
my_factor <- factor(iris$Sepal.Length)
str(my_factor)``````
``##  Factor w/ 35 levels "4.3","4.4","4.5",..: 9 7 5 4 8 12 4 8 2 7 ...``
``````# Now we try to save the numerical values inside the factor in a new variable
num_vector <- as.numeric(my_factor)
str(num_vector)``````
``##  num [1:150] 9 7 5 4 8 12 4 8 2 7 ...``
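
The values 9, 7, 5, ... are not sepal lengths. A factor stores an integer code for each observation plus a table of level labels, and `as.numeric()` returns the codes. The sketch below, which assumes the same `my_factor` as above, shows how the two parts relate; `codes` and `first_label` are illustrative names.

```r
# Rebuild the factor so that this block runs on its own
data("iris")
my_factor <- factor(iris$Sepal.Length)

# Integer codes stored inside the factor (what as.numeric() returns)
codes <- as.integer(my_factor)
head(codes)

# Level labels, sorted in increasing order of the original values
head(levels(my_factor), 10)

# The first observation has code 9, and the 9th level is its real value
first_label <- levels(my_factor)[codes[1]]
first_label
```

Looking up the code in the level table gives back the label `"5.1"`, which matches the first value of `iris$Sepal.Length`.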
``````# We plot the difference, which would be zero everywhere if the values had been converted correctly
plot(num_vector - iris$Sepal.Length)``````
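
This time the plotted differences are not zero, because `num_vector` holds level positions rather than sepal lengths. Two common ways to recover the real values are sketched below: converting through character, or converting the levels and indexing them with the factor. This is a hedged example and not necessarily the approach the original author had in mind; `num_from_chr`, `num_from_lvl` and `naive_ok` are illustrative names.

```r
# Rebuild the factor so that this block runs on its own
data("iris")
my_factor <- factor(iris$Sepal.Length)

# Option 1: convert the labels to character first, then to numeric
num_from_chr <- as.numeric(as.character(my_factor))

# Option 2: convert only the levels, then index them by the factor's codes
num_from_lvl <- as.numeric(levels(my_factor))[my_factor]

# Both checks should be TRUE if the original values were recovered
all.equal(num_from_chr, iris$Sepal.Length)
all.equal(num_from_lvl, iris$Sepal.Length)

# The direct conversion fails the same check
naive_ok <- isTRUE(all.equal(as.numeric(my_factor), iris$Sepal.Length))
naive_ok
```

Option 2 converts each distinct level only once, which can matter for long vectors with few levels. For a vector of this size, either option should be fine.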