Before we start this lesson, we need to load the Python modules that we will be using.
import pandas as pd
import math
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
We use the array function from the numpy module to create vectors in Python. Similar to R, we list the numbers that we want in the vector; in Python they go inside square brackets. Look at the vector x below to see how to do this.
x = np.array([1, 2, 3, 4])
x
## array([1, 2, 3, 4])
Now that we have defined the vector x, we can take the mean and sum of the vector, like we did in R. Remember, when we want to call a method on an object in Python, we write the object name, then a period, then the method name followed by parentheses.
x.mean()
## 2.5
x.sum()
## 10
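The same results are also available through numpy functions that take the vector as an argument. The short sketch below, which assumes numpy is installed and imported as np, compares the two styles; the names mean_method, mean_function, sum_method, and sum_function are chosen only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Method style: object name, a period, then the method name
mean_method = x.mean()
sum_method = x.sum()

# Function style: pass the vector to a numpy function
mean_function = np.mean(x)
sum_function = np.sum(x)

print(mean_method, mean_function)  # both are 2.5
print(sum_method, sum_function)    # both are 10
```

Either style works here; the method style is the one used in the rest of this lesson.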
Similar to R, we can add a number to a vector, subtract a number from it, or multiply or divide it by a number. The operation is applied to every element of the vector. The next few code chunks show how we do this.
x + 3
## array([4, 5, 6, 7])
x - 1
## array([0, 1, 2, 3])
x * 4
## array([ 4,  8, 12, 16])
x / 2
## array([0.5, 1. , 1.5, 2. ])
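Notice that x / 2 returned decimal numbers even though x holds whole numbers. As a small sketch (assuming numpy is imported as np), we can check what kind of numbers a vector holds with its dtype attribute; the name halved is chosen only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
halved = x / 2  # division gives floating-point (decimal) results

print(x.dtype)       # an integer type; the exact name depends on the platform
print(halved.dtype)  # float64
```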
Another thing we can do in Python is add, subtract, multiply and divide vectors that have the same length.
Let’s look at our vector x again.
x = np.array([1, 2, 3, 4])
x
## array([1, 2, 3, 4])
Now, let’s create a vector y with the same length (the same number of elements).
y = np.array([2, 3, 4, 5])
y
## array([2, 3, 4, 5])
Both x and y have 4 elements, so we can add, subtract, multiply, and divide them. Each operation pairs up the elements by position. The next few code chunks show this.
x + y
## array([3, 5, 7, 9])
x - y
## array([-1, -1, -1, -1])
x * y
## array([ 2,  6, 12, 20])
x / y
## array([0.5       , 0.66666667, 0.75      , 0.8       ])
Say we have a vector p that has 5 elements. Run the code chunk below to see this vector.
p = np.array([1, 2, 3, 4, 5])
p
## array([1, 2, 3, 4, 5])
Let’s see what happens if we try to add x and p.
#x + p
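The line above is commented out because running it raises an error (ValueError: operands could not be broadcast together with shapes (4,) (5,)). The error occurs because vector x has 4 elements and vector p has 5. When you want to do an operation between two vectors, make sure they have the same length, as we did in R.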
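To see the error without stopping the rest of the script, one option is to wrap the operation in try and except. This is only a sketch, assuming numpy is imported as np; the variable message is a name chosen for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
p = np.array([1, 2, 3, 4, 5])

message = None
try:
    x + p  # lengths 4 and 5 do not match
except ValueError as err:
    message = str(err)  # keep the error text so we can look at it

print(len(x), len(p))  # 4 5
print(message)
```

Checking len(x) and len(p) before the operation is a quick way to spot the mismatch.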
Just like in R, we can create lists in Python. First, you need to create the elements of the list. In the code chunk below we set food to eggs and butter, quantity to the numbers 2 through 5, and logical to True. We then put all of the object names in square brackets and assign the result to list2. When we run this code chunk, we can see the list.
food = ["eggs", "butter"]
quantity = [2, 3, 4, 5]
logical = [True]
list2 = [food, quantity, logical]
list2
## [['eggs', 'butter'], [2, 3, 4, 5], [True]]
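Once the list exists, we can pull items back out by position using square brackets. Keep in mind that Python starts counting at 0, whereas R starts at 1. The sketch below shows this; the names first_item and third_quantity are chosen only for illustration.

```python
food = ["eggs", "butter"]
quantity = [2, 3, 4, 5]
logical = [True]
list2 = [food, quantity, logical]

first_item = list2[0]         # the food list
third_quantity = list2[1][2]  # third number in the quantity list

print(first_item)      # ['eggs', 'butter']
print(third_quantity)  # 4
print(len(list2))      # 3
```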
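We can also create data frames in Python. As with lists, we first assign words or numbers to objects. Notice that for lists we put square brackets around the words or numbers, while here we leave them off. Python stores comma-separated values without brackets as a tuple, which DataFrame also accepts.
To create a data frame, we use the DataFrame function in pandas. Its first argument is the data and its second argument is the index (the row labels). When we run the code chunk below, nums fills a single column named 0 and words labels the rows.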
nums = 1, 2, 3
words = "goats", "cows", "ducks"
pd.DataFrame(nums, words)
##        0
## goats  1
## cows   2
## ducks  3
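To check how pandas read the two arguments, we can look at the index and the columns of the result. This is a short sketch, assuming pandas is imported as pd; the name df is chosen only for illustration.

```python
import pandas as pd

nums = 1, 2, 3
words = "goats", "cows", "ducks"
df = pd.DataFrame(nums, words)

print(list(df.index))    # ['goats', 'cows', 'ducks'] -> words became the row labels
print(list(df.columns))  # [0] -> nums fills one column named 0
print(df.shape)          # (3, 1): 3 rows, 1 column
```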
This is another data frame we can create.
numss = 1, 2, 3, 4, 5
oper = "add", "subtract", "multiply", "divide", "fraction"
logic = True, False, True, True, False
# A third positional argument (such as numss) is read as the column labels, so adding it raises "Shape of passed values is (5, 1), indices imply (5, 5)"
pd.DataFrame(oper, logic)
##               0
## True        add
## False  subtract
## True   multiply
## True     divide
## False  fraction
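If the goal is to have all three sets of values as columns, one common approach is to pass DataFrame a dictionary that maps column names to values. This is a sketch rather than the only way to do it; the column names and the name df3 are assumptions chosen for illustration.

```python
import pandas as pd

numss = 1, 2, 3, 4, 5
oper = "add", "subtract", "multiply", "divide", "fraction"
logic = True, False, True, True, False

# Each dictionary key becomes a column name; every column needs 5 values
df3 = pd.DataFrame({"numss": numss, "oper": oper, "logic": logic})

print(df3.shape)          # (5, 3): 5 rows, 3 columns
print(list(df3.columns))  # ['numss', 'oper', 'logic']
```

With this form the rows get the default labels 0 through 4, and none of the values are used up as row labels.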
Write a code chunk to create a vector with elements 1, 5, 6, 7, 10. Name this vector z.
z = np.array([1, 5, 6, 7, 10])
z
## array([ 1,  5,  6,  7, 10])
Add 5 to the vector z.
z + 5
## array([ 6, 10, 11, 12, 15])
Multiply z by 4.
z * 4
## array([ 4, 20, 24, 28, 40])
Find the sum of vector z.
z.sum()
## 29
Find the mean of vector z.
z.mean()
## 5.8
Create a list containing the animals dog, cat, bird and goat and the numbers 2, 3, 4.
anim = ["dog", "cat", "bird", "goat"]
num = [2, 3, 4]
list3 = [anim, num]
list3
## [['dog', 'cat', 'bird', 'goat'], [2, 3, 4]]
Create a data frame containing the colors yellow, blue, and red and the words happy, sad and mad.
col = "yellow", "blue", "red"
word = "happy", "sad", "mad"
pd.DataFrame(col, word)
##             0
## happy  yellow
## sad      blue
## mad       red
1.) Write a code chunk to create a vector with elements 2, 3, 6, 9. Name this vector w.
2.) Subtract 6 from w.
3.) Divide w by 2.
4.) Find the sum of w.
5.) Find the mean of w.
6.) Create a list containing the colors blue, purple, and red and the numbers 5, 6, 7.
7.) Create a data frame containing the foods eggs, chicken, and ham and the numbers 2, 5, 9.