Numpy

Harold Nelson

10/25/2020

Setup

Make the module numpy available with the alias np.

Answer

import numpy as np

Now create and print a list, l_list, of the integers from 1 to 5.

Answer

l_list = [1,2,3,4,5]
print(l_list)
## [1, 2, 3, 4, 5]

Now use the function np.array() to create a numpy array, a_array, with the same contents. Print the array.

Answer

a_array = np.array(l_list)
print(a_array)
## [1 2 3 4 5]

Obtain and print the first and last elements of both of these.

Answer

print("First")
## First
print(l_list[0])
## 1
print(a_array[0])
## 1
print("Last")
## Last
print(l_list[-1])
## 5
print(a_array[-1])
## 5

Try slicing to print the second through fourth elements of both of these.

Answer

print(l_list[1:4])
## [2, 3, 4]
print(a_array[1:4])
## [2 3 4]

Note that l_list and a_array have behaved very similarly so far. One visible difference is that the printed form of a_array has no comma separators between its elements.

What happens if you “add” two copies of a list?

Answer

print(l_list + l_list)
## [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

With lists, “+” means concatenation.

What happens if you “add” two copies of a numpy array?

Answer

print(a_array + a_array)
## [ 2  4  6  8 10]

The result is elementwise addition.
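
If list-style concatenation is what you actually want for arrays, np.concatenate() provides it. The following is a minimal sketch; the names added and joined are introduced here only for illustration, and a_array is redefined so the block runs on its own.

```python
import numpy as np

a_array = np.array([1, 2, 3, 4, 5])

# "+" adds matching elements of the two arrays
added = a_array + a_array
print(added)

# np.concatenate() joins the arrays end to end, as "+" does for lists
joined = np.concatenate([a_array, a_array])
print(joined)
```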

What happens if you multiply a numpy array containing numbers by a constant?

Answer

print(3 * a_array)
## [ 3  6  9 12 15]

What happens if you multiply a list containing numbers by a constant?

Answer

print(3 * l_list)
## [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

Note that this is consistent with the application of “+”.
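
A quick check of that consistency, as a small sketch: multiplying a list by 3 should give the same result as concatenating three copies with “+”. The names repeated and concatenated are assumptions made for this example.

```python
l_list = [1, 2, 3, 4, 5]

# Repetition with "*"
repeated = 3 * l_list

# The same thing written out with "+"
concatenated = l_list + l_list + l_list

print(repeated == concatenated)  # True
```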

The general rule is that operations on numpy arrays take place elementwise. This is only possible if the same operation can be applied to each element. What happens if you replace the second element of l_list with a string instead of a number? Can you still create a_array from it?

Answer

l_list = [1,"2",3,4,5]
print(l_list)
## [1, '2', 3, 4, 5]

Note that the second element now has a different type from the others.

for i in l_list:
    print(i, type(i))
## 1 <class 'int'>
## 2 <class 'str'>
## 3 <class 'int'>
## 4 <class 'int'>
## 5 <class 'int'>

Let’s produce the numpy array and look at the types of its elements.

a_array = np.array(l_list)
print(a_array)
## ['1' '2' '3' '4' '5']
for i in a_array:
    print(i, type(i))
## 1 <class 'numpy.str_'>
## 2 <class 'numpy.str_'>
## 3 <class 'numpy.str_'>
## 4 <class 'numpy.str_'>
## 5 <class 'numpy.str_'>

To make elementwise operations possible, every element of a numpy array must be of the same type. Lists are ultra-flexible, able to contain arbitrary mixes of types, even other lists or dicts.

Note that the elements of a_array are not of type str. The class numpy.str_ is different, but similar. We won’t explore this topic further.
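
One way to see the common type without looping is the array's dtype attribute, and astype() can convert the strings back to numbers when every element looks like an integer. This is a sketch under that assumption; the names mixed and numbers are made up for the example.

```python
import numpy as np

# A list mixing ints and a string becomes an array of strings
mixed = np.array([1, "2", 3, 4, 5])
print(mixed.dtype.kind)  # "U" marks a Unicode string dtype

# astype(int) parses each string as an integer
# (it would raise an error if some element were not a valid integer)
numbers = mixed.astype(int)
print(numbers)

# Arithmetic is elementwise again once the elements are numbers
print(numbers * 3)
```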

Vectorized Arithmetic

Recall the basic computation of area for a rectangular figure.

\[area = length \times width\]

Compute the area for a rectangle with a length of 5 and a width of 7 using python.

Answer

length = 5
width = 7
area = length * width
print(area)
## 35

If we have a number of rectangular figures with their lengths in one list and their widths in another list, we could compute their areas by looping through the two lists with indices and putting the areas in a third list.

width = [2,3,4,5]
length = [10,15,20,25]
area = []
for i in range(len(width)):
    area.append(width[i] * length[i])
print(area)
## [20, 45, 80, 125]

Show how the for loop can be avoided using numpy arrays instead of lists.

Answer

area = np.array(length) * np.array(width)
print(area)
## [ 20  45  80 125]

This is not just a briefer way to write the code. Doing this with numpy arrays instead of lists is much faster in execution. The reason is that the actual work with numpy arrays is done by compiled, highly optimized code rather than by the python interpreter handling one element at a time. This doesn’t matter much with trivial examples, but the difference is important in many real-world examples.
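
To get a rough feel for the difference, the two approaches can be timed with the standard timeit module. This is only a sketch: the array size, the repeat count, and the helper names loop_area and numpy_area are arbitrary choices for illustration, and the timings will vary from machine to machine.

```python
import timeit
import numpy as np

# 10,000 rectangles; values are kept small enough to avoid integer overflow
n = 10_000
width = list(range(n))
length = list(range(n))
width_arr = np.array(width)
length_arr = np.array(length)

def loop_area():
    # Explicit python loop over list indices
    area = []
    for i in range(len(width)):
        area.append(width[i] * length[i])
    return area

def numpy_area():
    # One vectorized multiplication
    return width_arr * length_arr

# Total time in seconds for 100 runs of each version
loop_time = timeit.timeit(loop_area, number=100)
numpy_time = timeit.timeit(numpy_area, number=100)
print("loop: ", loop_time)
print("numpy:", numpy_time)
```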

Multidimensional Example

The following code creates a 1D numpy array of 36 items, which we will then reshape into a 3D array. The numbers in the array might represent things to be reported on a monthly basis over the life of a three-year project.

items = np.arange(36)
print(items)
## [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
##  24 25 26 27 28 29 30 31 32 33 34 35]

Let’s create a 3D array to identify the items by year, quarter, and month.

items3D = items.reshape(3,4,3)
print(items3D)
## [[[ 0  1  2]
##   [ 3  4  5]
##   [ 6  7  8]
##   [ 9 10 11]]
## 
##  [[12 13 14]
##   [15 16 17]
##   [18 19 20]
##   [21 22 23]]
## 
##  [[24 25 26]
##   [27 28 29]
##   [30 31 32]
##   [33 34 35]]]

Each year is represented by a 2D array. Within a year, a quarter is represented by a 1D array.
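
The shape attribute makes this nesting visible. A small sketch, with items3D rebuilt so the block is self-contained:

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

# The whole project: 3 years x 4 quarters x 3 months
print(items3D.shape)        # (3, 4, 3)

# One year is a 2D array: 4 quarters x 3 months
print(items3D[0].shape)     # (4, 3)

# One quarter within a year is a 1D array of 3 months
print(items3D[0, 0].shape)  # (3,)
```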

Use a proper index to obtain the value stored in the second month of the third quarter of the third year. Remember that python starts indexing at 0 instead of 1. Also remember that the dimensions are Year, Quarter, Month.

Answer

print(items3D[2,2,1])
## 31
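
The value 31 can be checked by hand: with 12 months per year and 3 months per quarter, the position in the original 1D array is year * 12 + quarter * 3 + month. The sketch below assumes the default row-by-row ordering used by reshape(); the names year, quarter, month, and flat_position are introduced only for illustration.

```python
import numpy as np

items = np.arange(36)
items3D = items.reshape(3, 4, 3)

# Zero-based indices: third year, third quarter, second month
year, quarter, month = 2, 2, 1

# Position of the same element in the original 1D array
flat_position = year * 12 + quarter * 3 + month
print(flat_position)                  # 31
print(items[flat_position])           # 31
print(items3D[year, quarter, month])  # 31
```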

Use slicing to print out the entire second year.

Answer

print(items3D[1,:,:])
## [[12 13 14]
##  [15 16 17]
##  [18 19 20]
##  [21 22 23]]
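
Trailing full slices can be left out, so items3D[1] selects the same year as items3D[1,:,:]. A minimal sketch that checks this with np.array_equal(); the variable names are assumptions for the example.

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

# Explicit slices for quarter and month
explicit = items3D[1, :, :]

# Shorthand: omitted trailing dimensions are taken in full
shorthand = items3D[1]

print(np.array_equal(explicit, shorthand))  # True
```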

Use slicing to print out all of the December values.

Answer

print(items3D[:,3,2])
## [11 23 35]

Use slicing to print out all second quarter values.

Answer

print(items3D[:,1,:])
## [[ 3  4  5]
##  [15 16 17]
##  [27 28 29]]

Aggregates

There is a second aspect of vectorized arithmetic. Some functions applied to a numpy array return a single numerical value used to describe some aspect of the set of numbers in the array. The most common examples are np.sum() and np.mean().

Compute the sum and the mean value of the numbers in the numpy array items.

Answer

print(np.sum(items))
## 630
print(np.mean(items))
## 17.5
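
The same functions accept an axis argument, which aggregates along chosen dimensions instead of over the whole array. As a sketch on the 3D array (with the Year, Quarter, Month dimensions assumed as above, and the names yearly_totals and quarterly_means made up for the example):

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

# Sum over quarters (axis 1) and months (axis 2): one total per year
yearly_totals = np.sum(items3D, axis=(1, 2))
print(yearly_totals)  # totals are 66, 210, and 354

# Mean over months (axis 2): one value per year and quarter
quarterly_means = np.mean(items3D, axis=2)
print(quarterly_means.shape)  # (3, 4)
```

The yearly totals add up to 630, matching the overall sum computed above.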