10/25/2020

# Setup

Make the module numpy available with the alias np.

import numpy as np

Now create and print a list, l_list, of the integers from 1 to 5.

l_list = [1,2,3,4,5]
print(l_list)
## [1, 2, 3, 4, 5]

Now use the function np.array() to create a numpy array, a_array, with the same contents. Print the array.

a_array = np.array(l_list)
print(a_array)
## [1 2 3 4 5]

Obtain and print the first and last elements of both of these.

print("First")
## First
print(l_list[0])
## 1
print(a_array[0])
## 1
print("Last")
## Last
print(l_list[-1])
## 5
print(a_array[-1])
## 5

Try slicing to print the second through fourth elements of both of these.

print(l_list[1:4])
## [2, 3, 4]
print(a_array[1:4])
## [2 3 4]

Note that l_list and a_array behave very similarly under indexing and slicing. However, the printed form of a_array has no comma separators.

What happens if you “add” two copies of a list?

print(l_list + l_list)
## [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

With lists, “+” means concatenation.

What happens if you “add” two copies of a numpy array?

print(a_array + a_array)
## [ 2  4  6  8 10]

What happens if you multiply a numpy array containing numbers by a constant?

print(3 * a_array)
## [ 3  6  9 12 15]

What happens if you multiply a list containing numbers by a constant?

print(3 * l_list)
## [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

Note that this is consistent with the application of “+”.
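To make the contrast concrete, here is a minimal sketch of how elementwise addition and scaling could be done with plain lists, next to the numpy version. The names doubled_list, tripled_list, doubled_array, and tripled_array are illustrative and not part of the exercise.

```python
import numpy as np

l_list = [1, 2, 3, 4, 5]
a_array = np.array(l_list)

# With lists, elementwise arithmetic needs an explicit loop or comprehension.
doubled_list = [x + y for x, y in zip(l_list, l_list)]
tripled_list = [3 * x for x in l_list]

# With numpy arrays, the operators already act elementwise.
doubled_array = a_array + a_array
tripled_array = 3 * a_array

print(doubled_list)
## [2, 4, 6, 8, 10]
print(doubled_array)
## [ 2  4  6  8 10]
```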

The general rule is that operations on numpy arrays take place elementwise. This is only possible if the same operation can be applied to each element. What happens if you replace the second element of l_list with a string instead of a number? Can you still create a_array from it?

l_list = [1,"2",3,4,5]
print(l_list)
## [1, '2', 3, 4, 5]

Note that the second element now has a different type from the others.

for i in l_list:
    print(i, type(i))
## 1 <class 'int'>
## 2 <class 'str'>
## 3 <class 'int'>
## 4 <class 'int'>
## 5 <class 'int'>

Let’s produce the numpy array and look at the types of its elements.

a_array = np.array(l_list)
print(a_array)
## ['1' '2' '3' '4' '5']
for i in a_array:
    print(i, type(i))
## 1 <class 'numpy.str_'>
## 2 <class 'numpy.str_'>
## 3 <class 'numpy.str_'>
## 4 <class 'numpy.str_'>
## 5 <class 'numpy.str_'>

To make elementwise operations possible, every element of a numpy array must be of the same type, so np.array() converted the integers to strings. Lists are far more flexible: they can contain arbitrary mixes of types, even other lists or dicts.

Note that the elements of a_array are not of type str. The class numpy.str_ is different, but similar. We won’t explore this topic further.
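One way to check the single element type of an array, without looping over it, is the dtype attribute. The sketch below also uses astype() to convert the strings back to integers. The names mixed_array and int_array are illustrative assumptions, not part of the exercise.

```python
import numpy as np

mixed_array = np.array([1, "2", 3, 4, 5])

# dtype describes the one element type shared by the whole array.
# Its "kind" code is 'U' for Unicode strings.
print(mixed_array.dtype.kind)
## U

# astype() returns a new array with the requested element type,
# after which elementwise arithmetic works on numbers again.
int_array = mixed_array.astype(int)
print(int_array + int_array)
## [ 2  4  6  8 10]
```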

# Vectorized Arithmetic

Recall the basic computation of area for a rectangular figure.

$area = length \times width$

Compute the area for a rectangle with a length of 5 and a width of 7 using Python.

length = 5
width = 7
area = length * width
print(area)
## 35

If we have a number of rectangular figures with their lengths in one list and their widths in another list, we could compute their areas by looping through the two lists with indices and putting the areas in a third list.

width = [2,3,4,5]
length = [10,15,20,25]
area = []
for i in range(len(width)):
    area.append(width[i] * length[i])
print(area)
## [20, 45, 80, 125]

Show how the for loop can be avoided using numpy arrays instead of lists.

area = np.array(length) * np.array(width)
print(area)
## [ 20  45  80 125]

This is not just a briefer way to write the code. With large arrays, the numpy version also typically runs much faster, because the elementwise loop is carried out by optimized compiled code rather than by the Python interpreter. This doesn’t matter much with trivial examples, but the difference is important in many real-world applications.
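A rough way to see the speed difference is to time both approaches with the standard timeit module. This is only a sketch: the array size n, the repeat count, and the function names area_with_loop and area_with_numpy are arbitrary choices, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

n = 10_000
width = list(range(n))
length = list(range(n))
width_arr = np.array(width)
length_arr = np.array(length)

def area_with_loop():
    # Pure-Python loop over two lists.
    area = []
    for i in range(len(width)):
        area.append(width[i] * length[i])
    return area

def area_with_numpy():
    # Elementwise product of two numpy arrays.
    return length_arr * width_arr

# Run each version 100 times and report the total elapsed time.
loop_seconds = timeit.timeit(area_with_loop, number=100)
numpy_seconds = timeit.timeit(area_with_numpy, number=100)
print(f"loop:  {loop_seconds:.4f} s")
print(f"numpy: {numpy_seconds:.4f} s")
```

The arrays are converted once, outside the timed functions, so the comparison measures only the multiplication itself.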

# Multidimensional Example

We will build a 3D numpy array in two steps. The following code first creates a 1D array of 36 numbers. The numbers in the array might represent things to be reported on a monthly basis over the life of a three-year project.

items = np.arange(36)
print(items)
## [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
##  24 25 26 27 28 29 30 31 32 33 34 35]

Let’s reshape the items into a 3D array (3 years, 4 quarters per year, 3 months per quarter) to identify them by year, quarter, and month.

items3D = items.reshape(3,4,3)
print(items3D)
## [[[ 0  1  2]
##   [ 3  4  5]
##   [ 6  7  8]
##   [ 9 10 11]]
##
##  [[12 13 14]
##   [15 16 17]
##   [18 19 20]
##   [21 22 23]]
##
##  [[24 25 26]
##   [27 28 29]
##   [30 31 32]
##   [33 34 35]]]

Each year is represented by a 2D array. Within a year, a quarter is represented by a 1D array.
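The ndim and shape attributes offer a quick check of this structure. The sketch below rebuilds items3D so that it runs on its own.

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

# The whole array: 3 years x 4 quarters x 3 months.
print(items3D.ndim)
## 3
print(items3D.shape)
## (3, 4, 3)

# One year is a 2D array of 4 quarters x 3 months.
print(items3D[0].shape)
## (4, 3)

# One quarter is a 1D array of 3 months.
print(items3D[0, 0].shape)
## (3,)
```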

Use a proper index to obtain the value stored in the second month of the third quarter of the third year. Remember that Python starts indexing at 0 instead of 1. Also remember that the dimensions are Year, Quarter, Month.

print(items3D[2,2,1])
## 31

Use slicing to print out the entire second year.

print(items3D[1,:,:])
## [[12 13 14]
##  [15 16 17]
##  [18 19 20]
##  [21 22 23]]

Use slicing to print out all of the December values. December is the third month of the fourth quarter of each year.

print(items3D[:,3,2])
## [11 23 35]

Use slicing to print out all second quarter values.

print(items3D[:,1,:])
## [[ 3  4  5]
##  [15 16 17]
##  [27 28 29]]
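Translating a calendar position into a 0-based (year, quarter, month) index is easy to get wrong by one. The sketch below uses a hypothetical helper, month_to_index, which is not part of numpy or of this exercise, and which assumes that project years and calendar months are both counted from 1.

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

def month_to_index(year, month):
    """Map a 1-based project year and calendar month (1-12)
    to a 0-based (year, quarter, month-in-quarter) index."""
    quarter, month_in_quarter = divmod(month - 1, 3)
    return (year - 1, quarter, month_in_quarter)

# August of year 3 is the second month of the third quarter.
print(month_to_index(3, 8))
## (2, 2, 1)
print(items3D[month_to_index(3, 8)])
## 31

# December of year 1 is the third month of the fourth quarter.
print(items3D[month_to_index(1, 12)])
## 11
```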

# Aggregates

There is a second aspect of vectorized arithmetic. Some functions applied to a numpy array return a single numerical value used to describe some aspect of the set of numbers in the array. The most common examples are np.sum() and np.mean().

Compute the sum and the mean value of the numbers in the numpy array items.

print(np.sum(items))
## 630
print(np.mean(items))
## 17.5
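Aggregates can also be taken along selected dimensions of a multidimensional array by passing the axis argument. As a sketch, the following computes yearly and quarterly totals from the 3D array, which is rebuilt here so the example runs on its own.

```python
import numpy as np

items3D = np.arange(36).reshape(3, 4, 3)

# Sum over quarters and months (axes 1 and 2): one total per year.
print(np.sum(items3D, axis=(1, 2)))
## [ 66 210 354]

# Sum over months only (axis 2): one total per quarter of each year.
print(np.sum(items3D, axis=2))
## [[  3  12  21  30]
##  [ 39  48  57  66]
##  [ 75  84  93 102]]
```

The yearly totals add up to 630, the same value np.sum() returns for the whole array.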