Getting Set Up

Before anything else, you need to set Python up on your computer. The version of Python we will use is Anaconda Python - see https://www.anaconda.com/download/. Using instructions on this page, install Python (it’s free). You will want the Python 3.7 version (not 2.7). If you are offered a choice between 64-bit and 32-bit versions, unless your laptop is very old you’ll want 64-bit. The download is large (~614Mb) so it may take a while. It may also ask you if you want to install something called VS Code - you won’t need this, so I would suggest saying ‘no’.

There are a number of ways to actually run Python. Here, we will use Navigator to launch the Jupyter app. This is done by first starting Navigator, and from there launching Jupyter. To start up Navigator, follow the instructions here: https://docs.anaconda.com/anaconda/navigator/getting-started/#navigator-starting-navigator. When it has started, you will see a window similar to this:

Anaconda Navigator Window


If the JupyterLab window has a button that says Install click on this - it installs some software on to your machine. Follow any instructions you get. When installed the button will say Launch instead of Install. When it says Launch then click on it, to start up the JupyterLab app. This actually runs Python on a web page. First up, you may see a number of options - click on the Python 3 option in the Notebook section.

Next, you should see something like this:

JupyterLab Entry Screen


Although it is capable of quite a lot more than this, the console can function as a REPL interface (Read Evaluate Print Loop) - so you type a line into the box at the top and press shift+enter, Python reads in the line and evaluates it as a Python expression, and then prints the result (or an error message) - it then loops back to read in the next entry - in this case by creating a new box to type Python code into. This is a good way to find out about Python expressions, as in the next few sections.

However, before you begin, you may want to re-start Python in your own folder (the default one is not in a particularly helpful location). Firstly, close the Python tab. Then click on the folder icon to the left, and navigate to a folder you would like to work in. The ‘folder +’ icon at the top allows you to create a new folder if you wish. This will be called Untitled Folder but if you right click (or control click for Macs) a new menu comes up allowing you to rename it. If you make a new folder, navigate into it, then click on the ‘+’ option. This starts a new Python session. Click on ‘Python 3’ in Notebook and finally use File / Rename to rename the tab. Finally got there. I think this was designed by the same person who thought up the interfaces for 1980s video recorders and microwaves.

Python as A Calculator

First off, Python can do simple arithmetic.

print(3 + 7)
## 10
print(4.5 * 6.1)
## 27.45
# 
# A hashtag makes a comment - try some division
# 
print(4 / 2)
## 2.0
print(5.7 / 8.8)
## 0.6477272727272727

Powers are possible, using **

print(2 ** 6)
## 64
print(2 ** 0.5)
## 1.4142135623730951

Self-test: what do you think % does here?

print(8 % 3)
## 2
print(9 % 3)
## 0
print(17 % 8)
## 1
print(17 % 6)
## 5
print(17 % 4)
## 1

More than one line of code

Up until now, most attention has been on representing information in Python, rather than doing very much with it. In this section Python programming will be introduced. From one viewpoint, Python programs can simply be lists of instructions such as those introduced above, stored in a text file and executed in sequence - although some more sophisticated ideas will need to be considered. However, it is certainly the case that rather than just typing things into Jupyter line by line, it is a good idea to create your code in a single chunk, and store the programs. You can do this in Jupyter. Recall that you need to press shift+enter to send the Python code to be run; just pressing enter on its own creates a new line of code in the box, without running it. As an example of multiple line entry, note that you can also store the results of calculations in variables:

x = 6
print(x)
## 6
print(x * 6)
## 36
y = 9 / 4
print(y)
## 2.25

Can you explain the following? What do you think // does?

z = 9 // 4
print(z)
## 2

There are different data types in Python - the two you saw there were float and int types. To write an int in Python, simply write an integer without any decimal - eg 25 or -6. To write a float, enter a decimal (even if the value being written is a whole number) - eg 3.0 or -9.54. If you mix an int and a float in a calculation, the result is a float.

print(8 + 1.0)
## 9.0

If both numbers are floats, then unsurprisingly the result here is a float.

print(8.0 - 7.0)
## 1.0

If both numbers are ints then the result is an int.

print(8 - 12)
## -4

Note that this is not the case for division: in Python 3 the / operator always gives a float, even when dividing one int by another (and even when the answer is a whole number, as with 4 / 2 earlier):

print(23 / 8)
## 2.875

The answer to the simple ‘how many times does 8 go into 23’ version of division is provided by the // operator:

print(23 // 8)
## 2

This time the result is an int.
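The rules above can be checked directly. The sketch below uses the built-in function type (not introduced above) to report the data type of each result, and also shows how // and % fit together: multiplying the whole-number quotient by the divisor and adding the remainder recovers the original number. The variable names are chosen only for illustration.

```python
# The built-in type() function reports the data type of a value
print(type(8 - 12))     # int - int gives an int
print(type(8 + 1.0))    # int + float gives a float
print(type(23 / 8))     # '/' gives a float
print(type(23 // 8))    # '//' on two ints gives an int

# Quotient and remainder together recover the original number
quotient = 23 // 8
remainder = 23 % 8
print(quotient, remainder)
print(8 * quotient + remainder)
```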

Python also has functions - these are similar to those in R - for example

a_number = -7
another_number = 67
print(abs(a_number))
## 7
print(abs(another_number))
## 67

As well as int and float types, Python handles quite a few other data types. The string (called str in Python) is another commonly used one.

my_name = "Chris Brunsdon"
print(my_name)
## Chris Brunsdon

The operator + also works for strings - it joins them together.

first_name = "Chris"
surname = "Brunsdon"
print(first_name + surname)
## ChrisBrunsdon
# Note that '+' doesn't insert spaces between strings
# unless you explicitly tell it to
print(first_name + " " + surname)
## Chris Brunsdon

The * operator works when it has one term as a string and the other as an int, where it repeats the string argument n times, if n is the int:

print("Ha " * 3)
## Ha Ha Ha
# Order of int and string doesn't matter
underline = 20 * "-"
print(underline)
## --------------------

Note that the numerical term here must be an int - floats give an error. This makes sense, as you can’t repeat a string 3.4 times, for example. However, note that you get an error even for a float value that is a whole number, such as 4.0.
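As a small sketch of this, the code below tries the repeat with a whole-number float and then with the same value converted by the built-in int function. The try and except lines are a Python construct for catching errors; they are not covered above and are used here only so that the snippet runs to the end. The variable names are made up for illustration.

```python
count = 4.0  # a float, even though it is a whole number

# Repeating a string by a float fails with a TypeError
try:
    print("Ha " * count)
except TypeError as err:
    print("Error:", err)

# Converting to an int first makes the repeat work
laugh = "Ha " * int(count)
print(laugh)
```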

The % operator also has a use if the first argument is a string. Here it represents a printing format. For example

print("%6.4f" % (1/3))
## 0.3333

This reformats the result of 1/3 as a floating point number (hence the f) taking up 6 characters in total, with 4 digits after the decimal point. To find out more about possible formats, see this link: https://pyformat.info.
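A few more formats, as a sketch with made-up values: the number before the decimal point in a format is the total width, and letters other than f select other kinds of formatting.

```python
# Two decimal places, padded with spaces to a total width of 8 characters
pi_text = '%8.2f' % 3.14159
print(pi_text)

# %d formats an int; a width pads it in the same way
count_text = '%5d' % 42
print(count_text)

# %e uses scientific notation, here with 3 digits after the decimal point
big_text = '%.3e' % 1234567.0
print(big_text)
```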

Python Packages

In most languages, other functions - such as log, sin and so on - are provided. They are available in Python as well, but they are part of a library rather than in the core Python made available when you start up the REPL interpreter - as you did earlier on. Individual items in the library are called packages and you can access them via the import statement. Functions called sin, cos and so on are available in a package called math - a number of constants are also provided, such as pi. Here is an example of their use:

import math
theta = math.pi / 3.0
print(math.cos(theta))
## 0.5000000000000001
print(math.sin(theta))
## 0.8660254037844386

Note that although the functions for sin, cos and so on are quite accurate, they aren’t perfect, hence the slight error in cos(theta) here. It’s probably a good idea to round the number of decimals when printing out results like this - for example:

print('%7.4f' % math.cos(theta))
##  0.5000

From the example, you can see that functions imported from math are named math.<fn> where <fn> is the function name. Using functions from packages generally works this way. However, sometimes if a lot of functions from a package are used this gets cumbersome - particularly if the package has a long name. One way to get round this is to use the import ... as variation:

import math as m
theta = m.pi / 3.0
print('%7.4f' % m.cos(theta))
##  0.5000
print('%7.4f' % m.sin(theta))
##  0.8660

Here we tell Python to import the math package, but to refer to it as m once it is imported. Another approach is to use the from ... import variation. Here the functions to be used from the package are directly stated, and afterwards they may be referred to directly.

from math import sin, cos, pi
theta = pi / 3.0
print('%7.4f' % cos(theta))
##  0.5000
print('%7.4f' % sin(theta))
##  0.8660

However, the complication with the above approach is that there is nothing to stop several packages having functions with the same names - it then becomes hard to distinguish between which one is to be used. Thus the last approach, although the most convenient in some ways, is best confined to short snippets of Python.
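As a sketch of the problem, the standard library has two packages, math and cmath (the latter works with complex numbers), that both provide a function called sqrt. Importing both names directly means the second import silently replaces the first, whereas keeping the package names makes it clear which one is meant.

```python
from math import sqrt
print(sqrt(4))          # the math version, which gives a float

from cmath import sqrt  # same name, so this replaces math's sqrt
print(sqrt(4))          # now the cmath version, which gives a complex number

# Keeping the package names avoids the ambiguity
import math
import cmath
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)
print(real_root, complex_root)
```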

Python Lists

Until now, you have seen Python variables containing a single value - either float, int or string. Python also has variables that contain lists of values. Items in a list are separated by commas, and enclosed in square brackets:

ages = [52,21,43,23,19]
print(ages)
## [52, 21, 43, 23, 19]

A function that applies to lists is len - returning the length of the list (ie the number of items):

print(len(ages))
## 5

You can also pick out individual items in the list like an array in R:

print(ages[1], ages[0])
## 21 52

However, unlike R (but like C++ and Java) the first element in the list is indexed at zero, not one. Thus, the last element in ages is ages[4] not ages[5] - forgetting this is a common source of errors in Python, particularly if you are used to coding in R or FORTRAN.

You can also use negative numbers to select items relative to the end of the list: ages[-1] refers to the last item in ages, ages[-2] the one before that, and so on:

print(ages[-1], ages[-3])
## 19 43

You can also pick out sub-lists using slicing - specifying a sequential list of indices:

print(ages[0:2])
## [52, 21]

Note that this picks out elements 0 and 1 - but not 2 - slicing operators do not include the final element. What this does mean is that, for example, ages[0:len(ages)] selects the entire list ages -

print(ages[0:len(ages)])
## [52, 21, 43, 23, 19]

However, a classic Python ‘gotcha’ is that ages[len(ages)] without slicing gives an error, as it refers to ages[5]. It is also possible to omit the expression before or after the :. If the left hand expression is omitted, the slice starts at element zero; if the right hand expression is omitted, the slice runs to the end of the list, including the last element.

print(ages[:2])
## [52, 21]
print(ages[1:])
## [21, 43, 23, 19]
print(ages[-2:])
## [23, 19]
print(ages[:-1])
## [52, 21, 43, 23]

Note the last example returns all of the elements in ages except the last one - the term to the right of the : isn’t included, as with the ages[:2] example. You can also mix positive and negative indexes in a slice:

# Print all elements except the first and last
print(ages[1:-1])
## [21, 43, 23]
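The difference between indexing and slicing at the end of a list can be seen in the sketch below. Indexing past the end is an error, but a slice whose right hand value runs past the end simply stops at the last element. The try and except lines are a way of catching errors that is not covered above; they are used here only so the snippet runs to completion.

```python
ages = [52, 21, 43, 23, 19]

# Indexing past the end of the list raises an IndexError
try:
    print(ages[len(ages)])
except IndexError as err:
    print("Error:", err)

# Slicing past the end does not fail; it stops at the end of the list
long_slice = ages[2:100]
print(long_slice)
```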

The expression [] refers to an empty list. Sometimes you can get this as a result of a slice in which the left hand term exceeds the right.

dead_list = ages[2:1]
print(dead_list)
## []
print(len(dead_list))
## 0

Empty lists are useful in other situations - as will be seen later.

Another function for lists is sorted. This sorts the items in the list.

print(sorted(ages))
## [19, 21, 23, 43, 52]

You can apply slicing to results of functions, provided they are also lists. The following prints all values of ages except the largest and smallest.

print(sorted(ages)[1:-1])
## [21, 23, 43]

Lists of Lists

It is possible for the individual elements of lists to be of different types.

mixed_up = ['Chris','Brunsdon',1,'Jan',2014]
print(mixed_up)
## ['Chris', 'Brunsdon', 1, 'Jan', 2014]
print(mixed_up[0:2])
## ['Chris', 'Brunsdon']

Interestingly, it is also possible to have other lists as elements of lists.

my_details = [['Chris','Brunsdon'],[1,'Jan',2014]]
print(my_details[0])
## ['Chris', 'Brunsdon']
# Use successive indexing to access elements
# of lists inside other lists
print(my_details[1][2])
## 2014

You can use this to represent contiguity between geographical zones - if the provinces of Ireland are indexed by the numbers 0 to 3 for Ulster, Connaught, Leinster, and Munster respectively then the province contiguities (ie the information as to which pairs of provinces share a boundary) can be represented by a list of lists:

province_nbrs = [[1,2],[0,2,3],[0,1,3],[1,2]]
print(province_nbrs[0])
## [1, 2]

The list of neighbours of province 0 (Ulster) is the first element of the list province_nbrs and is itself a list - [1,2] - meaning that the neighbouring provinces are Connaught and Leinster.
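To turn the index numbers back into names, one option is to keep the province names in a second list in the same order. This is only a sketch: the list province_names and the other variable names are introduced here for illustration.

```python
province_names = ['Ulster', 'Connaught', 'Leinster', 'Munster']
province_nbrs = [[1, 2], [0, 2, 3], [0, 1, 3], [1, 2]]

# Neighbours of province 0 (Ulster), as index numbers
ulster_nbrs = province_nbrs[0]
print(ulster_nbrs)

# Use each index number to look up the corresponding name
ulster_nbr_names = [province_names[ulster_nbrs[0]],
                    province_names[ulster_nbrs[1]]]
print(ulster_nbr_names)
```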

Slicing and Strings

Python treats strings as lists of a kind - it is possible to access individual characters in a string via list item indexing.

name = 'Chris Brunsdon'
print(name[0])
## C
print(name[0:5])
## Chris
print(name[-1])
## n

Functions that work on lists often work on strings by treating them as a list of characters:

print(len(name))
## 14
print(sorted(name))
## [' ', 'B', 'C', 'd', 'h', 'i', 'n', 'n', 'o', 'r', 'r', 's', 's', 'u']

Note that in the sorted example, the result is literally a list of characters - they are sorted, but not reconstructed into a string. As before, the slicing operator could be applied to the right of any expression resulting in a string - for example to pick out the initial for my first name:

print(my_details[0][0][0])
## C

and to create a list with both my initials:

my_inits = [my_details[0][0][0],my_details[0][1][0]]
print(my_inits)
## ['C', 'B']

List Methods

As well as functions that return values as lists, there are also methods. Methods differ from functions in a number of ways - but in some ways are similar. If x is some kind of Python object, then a method is called by x.<meth>() where <meth> is the method name. For example, the append method adds a new item to the end of a list.

print(ages)
## [52, 21, 43, 23, 19]
ages.append(34)
print(ages)
## [52, 21, 43, 23, 19, 34]

A key point here is that append modified the actual list. Whereas functions such as sorted left the variable ages unaltered, append actually changed it. This is not always the case with methods, but quite often it is. There is also a sort method which sorts the items in a list, but actually changes the list, rather than providing a new list.

# Use the 'sort' method on 'ages'
ages.sort()
# Check 'ages' has actually been permanently altered
print(ages)
## [19, 21, 23, 34, 43, 52]
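One consequence is worth checking, sketched below with a separate made-up list so as not to disturb ages. Because sort changes the list in place, it does not hand back a sorted list; it returns the special value None. Writing something like new_list = some_list.sort() is therefore a common slip.

```python
heights = [171, 160, 176, 157]

# sorted() returns a new list and leaves the original alone
new_list = sorted(heights)
print(new_list)
print(heights)

# sort() changes the list itself and returns None
returned = heights.sort()
print(returned)
print(heights)
```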

Some methods also return values - pop returns the value at a particular index in a list, but then removes that value.

# Pop the last value from 'ages'
oldest = ages.pop(-1)
print(oldest)
## 52
# Show that the last value has been removed
print(ages)
## [19, 21, 23, 34, 43]

Another useful method is insert - this places a new value inside an existing list before a specified position:

# Put the oldest value back in the 'ages' list, but just before position 2
ages.insert(2,oldest)
print(ages)
## [19, 21, 52, 23, 34, 43]
# Now add a new age at the beginning
ages.insert(0,63)
print(ages)
## [63, 19, 21, 52, 23, 34, 43]
# Sort it to keep it in order
ages.sort()
print(ages)
## [19, 21, 23, 34, 43, 52, 63]

Joining Lists

Recall that + was used for joining strings together. It can also be used for joining lists:

list1 = ['Chris','Brunsdon']
list2 = ['NUIM',2014]
print(list1 + list2)
## ['Chris', 'Brunsdon', 'NUIM', 2014]

Python Dictionaries

A dictionary is similar to a list, but a key difference is that items are referred to by a name, rather than by location:

new_car = {'colour':'blue','cylinders':4, 'capacity':1200}

The variable new_car is a kind of list, but each of the three items is referred to by a name. The items in the dictionary are accessed in a similar way to those in lists, except that a string containing the name is used, rather than an int.

print(new_car['colour'])
## blue
print(new_car['cylinders'])
## 4

These are useful data types when you wish to associate items of information with a list of people, places and so on. For example, they can associate geographical data with the names of locations:

population = {'Ulster':294803,'Connaught':542547,'Leinster':2504814,
    'Munster':1246088}
print(population['Leinster'])
## 2504814

The two entries in each dictionary element (ie the look-up name and the associated value) are called the key and value respectively. As with lists, the value can take any form - including a list or another dictionary. For example, the contiguity information for provinces stored earlier as a list could also be stored as a dictionary:

neighbours = {'Ulster':['Connaught','Leinster'],
    'Connaught':['Ulster','Leinster','Munster'],
    'Leinster':['Ulster','Connaught','Munster'],
    'Munster':['Connaught','Leinster']}
print(neighbours['Munster'])
## ['Connaught', 'Leinster']

As there is the idea of an empty list, there is also an empty dictionary, represented by {}. This is useful, as new items in a dictionary can be created by statements of the form dict[key]=value. If key already exists in the dictionary the item will be overwritten, but if it doesn’t, a new key/value pair is added. Thus, another way to enter the neighbours of the provinces in Ireland is

# Start with an empty dictionary
neighbours = {}
# Add entries one by one
neighbours['Ulster'] = ['Connaught','Leinster']
neighbours['Connaught'] = ['Ulster','Leinster','Munster']
neighbours['Leinster'] = ['Ulster','Connaught','Munster']
neighbours['Munster'] = ['Connaught','Leinster']
# Prove the dictionary works as before
print(neighbours['Munster'])
## ['Connaught', 'Leinster']
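The overwrite-or-add behaviour of dict[key]=value can be checked with a small sketch, reusing the new_car dictionary from earlier. The 'doors' key and its value are made up for illustration.

```python
new_car = {'colour': 'blue', 'cylinders': 4, 'capacity': 1200}

# 'colour' already exists, so its value is overwritten
new_car['colour'] = 'red'

# 'doors' does not exist yet, so a new key/value pair is added
new_car['doors'] = 5

print(new_car['colour'])
print(new_car['doors'])
print(len(new_car))
```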

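Two methods that are sometimes helpful for dictionaries are keys and values. The keys method extracts all of the keys of a dictionary, and the values method extracts all of the values. In Python 3 the results are list-like objects (hence the dict_keys and dict_values labels in the output below) that can be looped through like lists.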

print(population.keys())
## dict_keys(['Ulster', 'Connaught', 'Leinster', 'Munster'])
print(population.values())
## dict_values([294803, 542547, 2504814, 1246088])

Note that the order of the values corresponds to the order of the keys. In Python 3.7 and later, both come out in the order in which the items were added to the dictionary; earlier versions of Python made no such guarantee. Even so, items in a dictionary are looked up by key rather than by position.
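If ordinary lists are needed, for example to pick out items by position or to slice, one approach is to wrap the results in the built-in list function. This is a sketch; the names province_list and count_list are made up for illustration.

```python
population = {'Ulster': 294803, 'Connaught': 542547,
              'Leinster': 2504814, 'Munster': 1246088}

# Wrap the results in list() to get ordinary lists
province_list = list(population.keys())
count_list = list(population.values())
print(province_list)
print(count_list)

# Ordinary lists can be indexed and sliced as usual
print(province_list[0], count_list[0])
print(province_list[1:3])
```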

Python Programs

Program Loops

You will have already encountered loops in other languages. In Python a basic for loop looks like this:

for n in [2,4,6,9]:
    print(n)
## 2
## 4
## 6
## 9

The main ingredients are the looping variable n and the list to loop through - here [2,4,6,9]. Note that the ‘body’ of the loop is indented by a tab. The indent is essential - it’s actually part of Python’s syntax. In this case, the loop takes every value in this list and prints it out. Also note that this is two lines of Python, and that each line on its own is insufficient to create the loop. Both lines must be entered in a box in Jupyter (lines separated by ‘enter’) and then shift+enter to run the loop.

For each cycle of the loop, n refers to the corresponding item in the list. Looping through lists is a useful tool if you want to add up their values. Consider the following code:

age_total = 0.0
for age in ages:
    age_total = age_total + age
print(age_total)
## 255.0

Note that the last line in the code has no indent (ie the first character is not tab) - this tells Python that the line is executed after the loop is completed - if it had been indented, Python would execute it on every cycle of the loop. This would result in all of the running totals being printed as well as the final result. To see this, add a tab to the last line in the Jupyter box and run the code again.

age_total = 0.0
for age in ages:
    age_total = age_total + age
    print(age_total)
## 19.0
## 40.0
## 63.0
## 97.0
## 140.0
## 192.0
## 255.0

The output now shows the running total as predicted. If you had wanted an average age rather than a total age, the code is relatively easy to modify - again do this by editing the code in the Jupyter box, and running it.

age_total = 0.0
for age in ages:
    age_total = age_total + age
age_average = age_total / len(ages)
print(age_average)
## 36.42857142857143

Finally on this code snippet, a useful tool is the use of += - this operator adds the right hand value to the left hand variable, and stores it in that variable. For example, x = x + 1 can be replaced by x += 1. The code now becomes

age_total = 0.0
for age in ages:
    age_total += age
age_average = age_total / len(ages)
print(age_average)
## 36.42857142857143
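For comparison, Python also has a built-in function sum that adds up the items in a list, so the same average can be sketched without an explicit loop. The loop version is still worth understanding, as the same pattern applies to calculations that have no ready-made function. The list below repeats the current contents of ages so that the snippet stands on its own.

```python
ages = [19, 21, 23, 34, 43, 52, 63]

# sum() adds up the items of a list, replacing the explicit loop
age_total = sum(ages)
age_average = age_total / len(ages)
print(age_total)
print(age_average)
```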

Now you may see why the keys method for dictionaries is useful - it provides a list of keys in a dictionary to loop through.

for province in neighbours.keys():
    print(province,"has", len(neighbours[province]), "bordering provinces")
## Ulster has 2 bordering provinces
## Connaught has 3 bordering provinces
## Leinster has 3 bordering provinces
## Munster has 2 bordering provinces

However, the output is rather messy. The % operator can help here, as a formatting tool. The expression fmt % x creates a string in which the variable x is formatted according to a specification in the string fmt - the range of possible formats is large, but for now note that the format '%10s' takes a string and pads it out with spaces to have a length of 10, if it is shorter than this beforehand. The code below adds a statement in the loop to do this.

for province in neighbours.keys():
    fmt_prov = '%10s' % province
    print(fmt_prov,"has", len(neighbours[province]), "bordering provinces")
##     Ulster has 2 bordering provinces
##  Connaught has 3 bordering provinces
##   Leinster has 3 bordering provinces
##    Munster has 2 bordering provinces

Also, it is possible to left-justify the province names, by replacing 10 with -10 in the format string.

for province in neighbours.keys():
    fmt_prov = '%-10s' % province
    print(fmt_prov,"has", len(neighbours[province]), "bordering provinces")
## Ulster     has 2 bordering provinces
## Connaught  has 3 bordering provinces
## Leinster   has 3 bordering provinces
## Munster    has 2 bordering provinces
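A format string can also contain more than one % field, in which case the values are supplied together in round brackets, separated by commas. The sketch below builds each output line in a single step this way; the names n_nbrs and line are made up for illustration.

```python
neighbours = {'Ulster': ['Connaught', 'Leinster'],
              'Connaught': ['Ulster', 'Leinster', 'Munster'],
              'Leinster': ['Ulster', 'Connaught', 'Munster'],
              'Munster': ['Connaught', 'Leinster']}

for province in neighbours.keys():
    n_nbrs = len(neighbours[province])
    # %-10s left-justifies the name; %d formats the int count
    line = '%-10s has %d bordering provinces' % (province, n_nbrs)
    print(line)
```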

Defining Functions

As well as the functions that are built in to Python (such as len) it is possible to define your own functions. For example, to define a function to compute the average of a list of numbers, the following can be used:

def average(data_set):
    total = 0.0
    for item in data_set:
        total += item
    return total / len(data_set)

As with for loops the indents (via tabbing) are actually part of the syntax. The indented code under the def statement is part of the function - and when the indenting stops, the function definition is complete. Note also the loop inside the function. The loop body is doubly indented, because

  1. It is in a loop; and
  2. The loop is inside the body of a function.

If you enter this into a Jupyter box and send it to Python by hitting shift+enter, you have then added a new function to Python, called average. At this stage you will see no print-out because you have only defined the function - not used it. In the next box enter the following to test it out:

height = [160.0,163.0,157.0,171.0,168.0,176.0]
print(average(height))
## 165.83333333333334

You can see that average now works like any other Python function. This could be made to look neater by formatting the result:

print('%8.2f' % average(height))
##   165.83

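Note that functions can return lists and dictionaries as well as single values. A related example is the built-in function range, which returns a sequence of numbers from 0 to n - 1. In Python 3 this is a special range object rather than a list, which is why printing it shows range(0, 5) rather than the numbers themselves; list(range(5)) gives [0, 1, 2, 3, 4].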

print(range(5))
## range(0, 5)

This is useful in basic loops that count through the value of some index:

for i in range(10):
    print(i, i*i)
## 0 0
## 1 1
## 2 4
## 3 9
## 4 16
## 5 25
## 6 36
## 7 49
## 8 64
## 9 81

Again, watch out for the zero indexing - people often expect the index to run from 1 to 10, not 0 to 9.
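If an index running from 1 to 10 is what you want, range can be given a starting value as well as a stopping value. As with slicing, the stopping value itself is not included. A minimal sketch, with one_to_ten as a made-up variable name:

```python
# range(start, stop) begins at start and stops just before stop
one_to_ten = list(range(1, 11))
print(one_to_ten)

# The same range used to count through a loop
for i in range(1, 11):
    print(i, i * i)
```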

range is also useful for loops that have to be run a fixed number of times, but without actually referring to the index variable. For example, to create a list with 10 entries, all equal to zero, use

l = []
for i in range(10):
    l += [0]
print(l)
## [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

The += here works in list mode, ie the statement is equivalent to l = l + [0], which appends a new element 0 to the existing list l. Because the list is initially empty, doing this 10 times gives a list of 10 zeroes. The value of i is not used in the loop, but because it loops over 10 values, the desired effect is achieved.

Putting some of these ideas together, the function below returns a list of length n where each element in the list is twice the value of its predecessor. Enter the following in a Jupyter box and run it.

def doubling_up(n):
    latest = 1
    result = []
    for i in range(n):
        result += [latest]
        latest *= 2
    return result
    
print(doubling_up(4))
## [1, 2, 4, 8]

You can now combine the two functions you have written:

print(average(doubling_up(12)))
## 341.25

Local Variables

When you create a function via def you often use variables inside the function. These are known as local variables. An interesting characteristic of these is that they only exist inside the function definition - so in the function doubling_up the variables latest and result do not exist once the function has been run. The other characteristic of local variables is that if in the main program there are variables with the same names, they won’t be altered when you call the function.

Hence:

latest = 'Hello'
print(doubling_up(6))
## [1, 2, 4, 8, 16, 32]
print(latest)
## Hello

You can also verify that print(result) leads to an error, as result is undefined outside of the function.
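One way to carry out that check without stopping the rest of a box from running is sketched below. The try and except lines catch the error, which Python calls a NameError; they are not covered above, and the flag variable caught is made up for illustration.

```python
def doubling_up(n):
    latest = 1
    result = []
    for i in range(n):
        result += [latest]
        latest *= 2
    return result

print(doubling_up(3))

# 'result' only exists inside the function, so using it here raises a NameError
caught = False
try:
    print(result)
except NameError as err:
    caught = True
    print("Error:", err)
```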

If - Then - Else

Python also supports conditional statements - these are sections of code that are only run if some condition is true or false. To begin with, note that Python can also evaluate logical expressions:

x = 6
y = 9
print(x < 8)
## True
print(x == 6)
## True
print(x >= y)
## False

These are expressions that have the value True or False depending on the truth of the statement. The general comparison operators are:

Operator   Meaning
==         Equal to
!=         Not equal to
<          Less than
>          Greater than
<=         Less than or equal to
>=         Greater than or equal to

Note the difference between = and ==. x == 6 is an expression having the value True or False depending on whether x has the value 6 or not, but x = 6 assigns the value 6 to x, overwriting any previous value.

In addition, these can be combined using not, and and or. For example

print(not x < 8)
## False
print(x == 6 or y == 5)
## True
print(x > 5 or y > 15)
## True
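The examples above use not and or. For completeness, here is a sketch with and as well, using the same x and y; the names both and either are made up for illustration.

```python
x = 6
y = 9

# 'and' is True only when both sides are True
both = x < 8 and y < 8
print(both)

# 'or' is True when at least one side is True
either = x < 8 or y < 8
print(either)

# 'not' reverses the result of the bracketed expression
print(not (x < 8 and y < 8))
```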

Another useful operator is in:

print(x in [5,7,9])
## False
print(x in range(12))
## True

These can be used in conjunction with the if statement - this works using the tabbing approach, in the same way as def and for:

z = 4
if z < 8 :
    print('z is less than 8')
    print('So it must be pretty small...')
## z is less than 8
## So it must be pretty small...

The lines that are indented with tabs after the if statement are only executed if the logical expression is true. Once the tabbing stops, the lines are executed regardless of the test condition:

if z < 8 :
    print('z is less than 8')
    print('So it must be pretty small...')
## z is less than 8
## So it must be pretty small...
print('This gets printed anyway')
## This gets printed anyway
z = 12
if z < 8 :
    print('z is less than 8')
    print('So it must be pretty small...')
print('This gets printed anyway')
## This gets printed anyway

There is also an else statement - this specifies code to be executed if the test in the if statement isn’t true. Its use is demonstrated here:

z = 12

if z < 8 :
    print('z is less than 8')
    print('So it must be pretty small...')
else:
    print('z is at least 8')
    print('So it is fairly big')
## z is at least 8
## So it is fairly big
print('This gets printed anyway')
## This gets printed anyway

Once again, the code associated with the else statement is indented with a tab. Now set z to some value less than 8 and try running the code above again.

As before, you can incorporate all of the ideas together. For example, define a function to compute factorials. The factorial of a number n is defined as n * (n - 1) * (n - 2) * ... * 2 * 1, so factorial(4) is 4 * 3 * 2 * 1 = 24. An exception is that the factorial of zero is one. A function to compute factorials is

def factorial(n):
    if n == 0:
        return 1
    else:
        result = 1
        for i in range(n):
            result *= i + 1
        return result

If you define this in a Jupyter box, you can then try it out.

print(factorial(5))
## 120
print(factorial(0))
## 1

There are a few things to note about the function definition. Firstly, note the multiple tab nesting - there is a for loop inside an else statement inside a def of a new function. Another thing to note is the result *= i + 1 statement - this keeps a running result of the multiplications, but because range gives a sequence going from 0 to n - 1, it is necessary to use i + 1 as the multiplier. As a self test question, what would happen if you used result *= i instead?

Next, an interesting aside - factorials can get very large quite rapidly - for example the factorial of 20 is 2,432,902,008,176,640,000. Try to compute the factorial of 50:

print(factorial(50))
## 30414093201713378043612608166064768844377641568960512000000000000

In Python 3, the int type can hold integers of arbitrary length, so there is no need for a separate type for very large values (Python 2 had one, called long, to which results were converted when they grew too large for standard machine-sized integers). Here is a more extreme example:

print(factorial(1000))
## 40238726007709377354370243392300398571937486421071463254379991042993851239862902059204420848696940480047998861019719605863166687299480855890132382966994459099742450408707375991882362772718873251977950595099527612087497546249704360141827809464649629105639388743788648733711918104582578364784997701247663288983595573543251318532395846307555740911426241747434934755342864657661166779739666882029120737914385371958824980812686783837455973174613608537953452422158659320192809087829730843139284440328123155861103697680135730421616874760967587134831202547858932076716913244842623613141250878020800026168315102734182797770478463586817016436502415369139828126481021309276124489635992870511496497541990934222156683257208082133318611681155361583654698404670897560290095053761647584772842188967964624494516076535340819890138544248798495995331910172335555660213945039973628075013783761530712776192684903435262520001588853514733161170210396817592151090778801939317811419454525722386554146106289218796022383897147608850627686296714667469756291123408243920816015378088989396451826324367161676217916890977991190375403127462228998800519544441428201218736174599264295658174662830295557029902432415318161721046583203678690611726015878352075151628422554026517048330422614397428693306169089796848259012545832716822645806652676995865268227280707578139185817888965220816434834482599326604336766017699961283186078838615027946595513115655203609398818061213855860030143569452722420634463179746059468257310379008402443243846565724501440282188525247093519062092902313649327349756551395872055965422874977401141334696271542284586237738753823048386568897646192738381490014076731044664025989949022222176590433990188601856652648506179970235619389701786004081188972991831102117122984590164192106888438712185564612496079872290851929681937238864261483965738229112312502418664935314397013742853192664987533721894069428143411852015801412334482801505139969429015348307764456909907315243327828826986460278986432113908350621709500259738986
3554277196742822248757586765752344220207573630569498825087968928162753848863396909959826280956121450994871701244516461260379029309120889086942028510640182154399457156805941872748998094254742173582401063677404595741785160829230135358081840096996372524230560855903700624271243416909004153690105933983835777939410970027753472000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
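A result this long is hard to check by eye. As a sketch, the code below confirms that the result is an ordinary int, counts its digits by converting it to a string with the built-in str function, and compares it with the factorial function provided by the math package. The variable name big is made up for illustration.

```python
import math

def factorial(n):
    if n == 0:
        return 1
    else:
        result = 1
        for i in range(n):
            result *= i + 1
        return result

big = factorial(1000)
print(type(big))        # the type of the result
print(len(str(big)))    # number of digits in the result

# The math package's own factorial function gives the same answer
print(big == math.factorial(1000))
```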

Factorials are not defined for negative numbers. However, the current function does not check for this -

print(factorial(-4))
## 1

so the answer does not make sense. It might be better to modify the function so that, for a negative number, it returns the string 'Undefined' instead of a numerical result. One way to do this is to test whether the number is negative, return 'Undefined' if that is true, and then put the existing code in an else clause.

def factorial(n):
    if n < 0 :
        return 'Undefined'
    else: 
        if n == 0:
            return 1
        else:
            result = 1
            for i in range(n):
                result *= i + 1
            return result

It is possible to click on a Jupyter box you have already entered, and edit the code. When you have done that, pressing shift+enter re-submits the function to Python. Doing this to the factorial function, so the edited version is the one above, results in new behaviour:

print(factorial(-4))
## Undefined
print(factorial(4))
## 24

Note that the code above checks for three possible cases of n: n < 0, n == 0 or n > 0. The above approach deals with this, but requires that you nest several if statements. A shorthand version uses elif - the template here is

  1. if first condition to test
  2. ‘tabbed in’ code to execute if the above is true
  3. elif next condition (tested only if the first condition is not true)
  4. ‘tabbed in’ code to execute if the above condition is true
  5. repeat steps 3 and 4 for any further conditions (each is tested only if no previous condition is true)
  6. else do this if none of the above conditions are true - this is the catch-all
  7. ‘tabbed in’ code to execute

Steps 6 and 7 can be omitted if no catch-all code is required.

def factorial(n):
    if n < 0 :
        return 'Undefined'
    elif n == 0:
        return 1
    else:
        result = 1
        for i in range(n):
            result *= i + 1
        return result

The factorial function now checks for negative numbers, but an alternative to returning a value when one is found is to raise an error. Python has its own built-in errors, but it is also possible to create new ones via the raise statement - as in this code:

def factorial(n):
    if n < 0 :
        raise Exception('Factorial not defined for negative integers')
    elif n == 0:
        return 1
    else:
        result = 1
        for i in range(n):
            result *= i + 1
        return result

If you edit the function again then you can test it out:

print(factorial(-6))

You will see an error of the form

Exception: Factorial not defined for negative integers

displayed. This functions in the same way as a Python built-in error, but is more helpful in identifying the problem, since it relates directly to the function you are defining.
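As a minimal sketch of how calling code might deal with such an error, the block below repeats the function but raises Python's built-in ValueError, a more specific error type than Exception, and then catches it with try / except. Both the switch to ValueError and the helper name safe_factorial are assumptions made for illustration, not part of the version above.

```python
def factorial(n):
    # Raise a specific built-in error type when the input is invalid
    if n < 0:
        raise ValueError('Factorial not defined for negative integers')
    result = 1
    for i in range(n):
        result *= i + 1
    return result

def safe_factorial(n):
    # Hypothetical helper: report the problem and return None
    # instead of stopping the whole program with an error
    try:
        return factorial(n)
    except ValueError as err:
        print('Could not compute factorial:', err)
        return None

print(safe_factorial(4))   # prints 24
print(safe_factorial(-6))  # prints the message, then None
```

Catching the error like this is optional. Letting it stop the program is often the right choice while you are still working out where a negative number came from.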

The While Loop

The while loop is another kind of loop, making use of a logical expression. The number of times a for loop cycles is determined when the loop is started - it is just the length of the list in the expression for i in list :. A while loop begins with a logical expression and loops as long as the expression is true. In this case, the number of times the loop cycles is not known in advance. Below, a while loop is used to construct a ‘doubling up’ sequence, as before, but this time instead of going for a fixed length it carries on until the value exceeds 1000 -

latest = 1
result = []
while latest <= 1000:
    result += [latest]
    latest *= 2
print(result)
## [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

This can also be used in functions - again note the use of tabbing. In this code block, the function double_up_until is defined, and then also run.

def double_up_until(n_max):
    latest = 1
    result = []
    while latest <= n_max:
        result += [latest]
        latest *= 2
    return result
print(double_up_until(10000))
## [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192]
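To illustrate that the number of cycles of a while loop is only discovered by running it, here is a small sketch that counts the cycles instead of storing the values. The function name count_doublings is hypothetical, chosen just for this example.

```python
def count_doublings(n_max):
    # Count how many times the loop body runs before latest exceeds n_max
    latest = 1
    cycles = 0
    while latest <= n_max:
        latest *= 2
        cycles += 1
    return cycles

print(count_doublings(1000))   # prints 10
print(count_doublings(10000))  # prints 14
```

The two counts match the lengths of the two lists printed earlier.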

Saving your Python Session, and Coming Back Later

When you have finished the exercise (or at any stage during the exercise) it is possible to save your status. You do this by clicking on File and then Save Notebook as… from the menu. Choose a suitable name for your notebook (it should have the file ending .ipynb) and click Save. I suggest something like pythonweek1.ipynb as a good name. When you come back to this, click on the file icon, and select the file you saved. This loads all of the commands you entered, but at this stage they won’t have been run. To run all of the Jupyter boxes in the same order you put them in, click on Run and then Run all cells. You can now see everything you already entered, and any printouts they produced. Also, any variables you created will exist for use in future code you put in.

Exercises

If you have got this far you will have a reasonable grasp of the key ideas of programming in Python. To finish, here are a few exercises that you can use to practice your programming skills for next week. Also, to learn more Python ideas, try visiting .

Exercise 1

Write a Python function to add up all of the numbers from 1 to n.

Exercise 2

Euclid’s algorithm to find the greatest common divisor (GCD) of two integers is one of the oldest documented algorithms in the world, dating back to around 300BC. The GCD is the largest number that divides exactly into the two numbers supplied - so for example the GCD of 18 and 15 is 3. The algorithm can be described as follows:

  1. take two numbers, a and b - the aim is to find their GCD
  2. while b is greater than zero, repeat the following two steps:
    1. replace b with the remainder when a is divided by b
    2. replace a with the old value of b
  3. When b is zero, a is the GCD

Write a Python function, gcd(a,b) that returns the GCD of a and b.
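As a hint rather than a solution, the remainder in step 2 can be computed with Python's % operator. The short sketch below applies it to the numbers from the 18 and 15 example; the variable names first and second are just for illustration.

```python
# Remainder when 18 is divided by 15
first = 18 % 15
print(first)   # prints 3

# Remainder when 15 is divided by 3
second = 15 % 3
print(second)  # prints 0
```

The second remainder is zero, so the algorithm stops there and the GCD is 3, as stated above.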


  1. actually plus one, see above.