Data Types

Numbers

Arithmetic is relatively standard.

> 1 + 1

> 1 * 3

> 1/2

0.5

> #Exponent
+ 2 ** 4

The modulo operator in Python is %, which corresponds to %% in R. Integer division uses // in Python and %/% in R.

> 5 % 2

> 5 // 2
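As a small sketch of how the two operators fit together, the quotient and remainder can be recombined into the original number, and the built-in divmod() returns both at once. The variable names below are illustrative only.

```python
# Integer division and modulo are two halves of the same operation.
a, b = 5, 2
quotient = a // b     # 2
remainder = a % b     # 1

# Recombining them gives back the original number.
recombined = quotient * b + remainder
print(recombined)     # 5

# divmod() returns (quotient, remainder) as a tuple.
pair = divmod(a, b)
print(pair)           # (2, 1)
```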

> (2 + 3) * (5 + 5)

Variable Assignment

> # Names cannot start with a number or contain special characters (underscores are allowed)
+ name_of_var = 2
+ name_of_var

> x = 2
+ y = 3
+ z = x + y
+ z

> z = z+z
+ z

Strings

You can use single or double quotes, but any quotes inside the string need to differ from the outer quotes.

> 'single quotes'

'single quotes'

> "double quotes"

'double quotes'

> print(" wrap lot's of other quotes")

 wrap lot's of other quotes

> print('He said, "Hello."')

He said, "Hello."

Printing

> x = 'hello'
+ x

'hello'

> print(x)

hello

> num = 12
+ name = 'Sam'
+ print('My number is: {one}, and my name is: {two}'.
+       format(one=num,two=name))

My number is: 12, and my name is: Sam

> print('My number is: {}, and my name is: {}'.
+       format(num,name))

My number is: 12, and my name is: Sam
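The same output can also be produced with an f-string, a different technique from .format() that is available in Python 3.6 and later. This is a minimal sketch reusing the num and name variables from above.

```python
num = 12
name = 'Sam'

# An f-string evaluates the expressions inside the braces directly.
message = f'My number is: {num}, and my name is: {name}'
print(message)  # My number is: 12, and my name is: Sam
```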

Unlike R, Python uses zero-based indexing. In R, selecting the first three elements would be 1:3. In Python it is 0:3, which selects indices 0, 1, and 2 (up to but not including 3).

> s = 'hello'
+ s[4]

'o'

> s[-1]

'o'

> s='abcdefghijk'
+ s[0:]

'abcdefghijk'

> s[:3]

'abc'

> # start at index 3 (4th letter)
+ # up to but not including index 6 (7th letter)
+ s[3:6]

'def'
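Slices also accept an optional third value, the step, and negative indices count from the end. A short sketch using the same string; the variable names are illustrative only.

```python
s = 'abcdefghijk'

# start:stop:step, here every second character
every_other = s[::2]
print(every_other)   # acegik

# negative start counts from the end: the last three characters
last_three = s[-3:]
print(last_three)    # ijk

# a step of -1 walks the string backwards
reversed_s = s[::-1]
print(reversed_s)    # kjihgfedcba
```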

Lists

A sequence of elements in square brackets, separated by commas.

> [1,2,3]

[1, 2, 3]

> ['hi',1,[1,2]]

['hi', 1, [1, 2]]

> my_list = ['a','b','c']
+ my_list.append('d')
+ my_list

['a', 'b', 'c', 'd']

> my_list[0]

'a'

> my_list[1:3]

['b', 'c']

> my_list[1:]

['b', 'c', 'd']

> my_list[:1]

['a']

You can also assign a new value to an element by index.

> my_list[0] = 'NEW'
+ my_list

['NEW', 'b', 'c', 'd']

> nest = [1,2,3,[4,5,['target']]]
+ nest[3]

[4, 5, ['target']]

> nest[3][2]

['target']

> #grab only the list element
+ nest[3][2][0]

'target'

> # at what index is 3?
+ nest.index(3)

> # at what index is 5?
+ nest[3].index(5)

Dictionaries

Dictionaries map keys to values as key:value pairs.

> d = {'key1':'item1','key2':'item2'}
+ d

{'key1': 'item1', 'key2': 'item2'}

> d['key1']

'item1'

> d = {'k1':[1,2,3]}
+ d['k1'][1]

> d = {'k1':{'innerkey':[1,2,3]}}
+ d['k1']['innerkey'][1]
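Beyond lookups, a dictionary can be extended by assigning to a new key, and .get() offers a lookup that returns a fallback instead of raising an error for a missing key. A minimal sketch; the key names are made up for illustration.

```python
d = {'key1': 'item1', 'key2': 'item2'}

# Assigning to a new key adds a key:value pair.
d['key3'] = 'item3'
print(d)

# .get() returns the second argument when the key is missing.
missing = d.get('key4', 'not found')
print(missing)    # not found

# 'in' checks whether a key exists.
has_key1 = 'key1' in d
print(has_key1)   # True
```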

Booleans

Booleans have a value of True or False (capitalized, unlike R's TRUE and FALSE).
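A brief sketch of where Boolean values come from in practice: comparisons return them, and their type is bool. The variable names are illustrative only.

```python
# Comparisons evaluate to Boolean values.
is_larger = 3 > 2
is_equal = 'hi' == 'bye'
print(is_larger, is_equal)   # True False

# Both values are instances of the built-in bool type.
print(type(is_larger))       # <class 'bool'>
```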

Tuples

Tuples are similar to lists, but they are written in parentheses and are immutable.

> t = (1,2,3)
+ t[0]

> #immutable.  Can't assign
+ t[0] = 'NEW'

Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: 'tuple' object does not support item assignment

Detailed traceback: 
  File "<string>", line 1, in <module>
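The error above can be caught with try/except, and a changed copy can be built by converting to a list and back. This is one possible workaround, sketched under the assumption that a new tuple is acceptable; the original tuple is left untouched.

```python
t = (1, 2, 3)

# Item assignment on a tuple raises TypeError, which can be caught.
try:
    t[0] = 'NEW'
except TypeError as err:
    error_message = str(err)
    print(error_message)

# To get a modified version, build a new tuple from a list copy.
as_list = list(t)
as_list[0] = 'NEW'
new_t = tuple(as_list)
print(new_t)   # ('NEW', 2, 3)
print(t)       # (1, 2, 3)
```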

Sets

Sets store only unique elements; duplicates are dropped.

> {1,2,3}

{1, 2, 3}

> {1,2,3,1,2,1,2,3,3,3,3,2,2,2,1,1,2}

{1, 2, 3}

> set([1,1,1,3,3,3,3,2,2,2,5,5,5,6,6])

{1, 2, 3, 5, 6}

> s = {1,2,3}
+ s.add(5)
+ s

{1, 2, 3, 5}
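Because a set keeps only unique elements, adding a value that is already present changes nothing, and converting a list to a set is a common way to drop duplicates. A small sketch with illustrative names:

```python
s = {1, 2, 3}

# Adding an existing element leaves the set unchanged.
s.add(3)
size_after_duplicate = len(s)
print(size_after_duplicate)   # 3

# Deduplicate a list; sorted() returns the unique values as an ordered list.
unique_sorted = sorted(set([3, 1, 3, 2, 1]))
print(unique_sorted)          # [1, 2, 3]
```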

Comparison Operators

> 1 > 2

False

> 1 < 2

True

> 1 >= 1

True

> 1 <= 4

True

> 1 == 1

True

> 1 != 3

True

> 'hi' == 'bye'

False

Logic Operators

> (1 > 2) and (2 < 3)

False

> (1 > 2) or (2 < 3)

True

> (1 == 2) or (2 == 3) or (4 == 4)

True

If, Elif, Else Statements

Unlike R, Python doesn’t use curly braces to delimit blocks of code; it uses indentation (whitespace).

> if 1 < 2:
+     print('Yep!')

Yep!

> if 1 < 2:
+     print('yep!')

yep!

> if 1 < 2:
+     print('first')
+ else:
+     print('last')

first

Python evaluates the conditions in order and runs only the block for the first one that is true; the remaining branches are skipped.

> if 1 == 2:
+     print('first')
+ elif 3 == 3:
+     print('second')
+ elif 4 == 4:
+     print('middle')
+ else:
+     print('Last')

second
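To make the ordering visible, here is a sketch with overlapping conditions. The function name first_match is hypothetical; for an input of 7, both the second and third conditions are true, but only the first matching branch runs.

```python
def first_match(n):
    # Conditions are checked top to bottom; the first true one wins.
    if n > 10:
        return 'greater than 10'
    elif n > 5:
        return 'greater than 5'
    elif n > 0:
        return 'greater than 0'
    else:
        return 'not positive'

print(first_match(7))    # greater than 5
print(first_match(-1))   # not positive
```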

Loops

> seq = [1,2,3,4,5]

Here, item is a temporary variable name that takes each value of seq in turn.

> for item in seq:
+     print(item)

> for item in seq:
+     print('Yep')

Yep
Yep
Yep
Yep
Yep

> for jelly in seq:
+     print(jelly+jelly)

While Loops

> i = 1
+ while i < 5:
+     print('i is: {}'.format(i))
+     i = i+1

i is: 1
i is: 2
i is: 3
i is: 4

Range()

> seq = [*range(1,6)]
+ seq

[1, 2, 3, 4, 5]

> range(5)

range(0, 5)

> for i in range(5):
+     print(i)

> list(range(5))

[0, 1, 2, 3, 4]

List comprehension

A list comprehension is a concise way to perform a looping operation over an iterable such as a list. It can be written in one line.

> x = [1,2,3,4]

> out = [] #empty list
+ for item in x:
+     out.append(item**2)
+ print(out)

[1, 4, 9, 16]

Instead of a loop you can use this format: new_list = [expression for member in iterable]

> [item**2 for item in x]

[1, 4, 9, 16]

> out = [num**2 for num in x]
+ out

[1, 4, 9, 16]
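A comprehension can also take an optional condition at the end, in the form [expression for member in iterable if condition]. A minimal sketch using the same list:

```python
x = [1, 2, 3, 4]

# Keep only the even items, then square them.
even_squares = [item**2 for item in x if item % 2 == 0]
print(even_squares)   # [4, 16]
```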

Functions

> def my_func(param1):
+     print(param1)
+ my_func('hello')

hello

> def my_func(name='Default Name'):
+     print('Hello '+name)
+ my_func()

Hello Default Name

> my_func(name='Paul')

Hello Paul

> def my_func(param1='default'):
+     """
+     Docstring goes here.
+     Can go multiple lines
+     """
+     print(param1)

> my_func()

default

> my_func('new param')

new param

You can add a docstring as the first statement inside a function. In editors that support it (for example, Jupyter notebooks), pressing <shift><tab> after typing the function name displays the docstring.

> def square(x):
+     """
+     This function squares
+     a number
+     
+     """
+     return x**2

> out = square(2)

> print(out)
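Outside an editor, the docstring can also be read from code. The sketch below uses the __doc__ attribute and inspect.getdoc() from the standard library, which strips the indentation; the built-in help(square) shows similar information interactively.

```python
import inspect

def square(x):
    """
    This function squares
    a number
    """
    return x**2

# The raw docstring, including its original indentation and newlines.
raw_doc = square.__doc__

# inspect.getdoc() removes the common indentation and surrounding blank lines.
clean_doc = inspect.getdoc(square)
print(clean_doc)
```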

Lambda expressions

You can create an anonymous function with lambda.

> def times2(var):
+     return var*2
+ 
+ times2(2)

> lambda var: var*2

<function <lambda> at 0x000000002BE69678>
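On its own, a lambda just creates a function object, as the output above shows. To get a result it has to be called. The sketch below binds one to a name and calls another inline; binding a lambda to a name is done here only for illustration, since a def is usually preferred for named functions.

```python
# Bind the lambda to a name, then call it like any function.
times2 = lambda var: var * 2
result = times2(2)
print(result)          # 4

# Or call the lambda directly where it is defined.
inline_result = (lambda var: var * 2)(5)
print(inline_result)   # 10
```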

Map and filter

You can apply a function to each element of a list using map. It is similar to the map function from R’s purrr package. In Python, map returns a lazy map object, so wrap it in list() to see the results.

> seq = [1,2,3,4,5]
+ map(times2,seq)

<map object at 0x000000002BD27188>

> list(map(times2,seq))

[2, 4, 6, 8, 10]
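The map object is an iterator, so it can only be consumed once. A short sketch of that behavior, with illustrative variable names:

```python
seq = [1, 2, 3, 4, 5]
mapped = map(lambda var: var * 2, seq)

# The first pass consumes the iterator.
first_pass = list(mapped)
print(first_pass)    # [2, 4, 6, 8, 10]

# A second pass finds nothing left.
second_pass = list(mapped)
print(second_pass)   # []
```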

Mapping an anonymous function using lambda.

> list(map(lambda var: var*2,seq))

[2, 4, 6, 8, 10]

You can also filter a list, keeping only the elements for which a function returns True.

> filter(lambda item: item%2 == 0,seq)

<filter object at 0x000000002BD27748>

> list(filter(lambda item: item%2 == 0,seq))

[2, 4]
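The same results can be written as list comprehensions, which many readers find easier to scan than map and filter with a lambda. This is an alternative phrasing, not a replacement for the calls above.

```python
seq = [1, 2, 3, 4, 5]

# Equivalent to list(map(lambda var: var*2, seq))
doubled = [item * 2 for item in seq]
print(doubled)   # [2, 4, 6, 8, 10]

# Equivalent to list(filter(lambda item: item%2 == 0, seq))
evens = [item for item in seq if item % 2 == 0]
print(evens)     # [2, 4]
```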

Methods

You can retrieve a list of available methods by pressing <tab> after typing an object name followed by a period.

> st = 'hello my name is Sam'
+ st.lower()

'hello my name is sam'

> st.upper()

'HELLO MY NAME IS SAM'
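Where tab completion is not available, the built-in dir() lists an object's attributes and methods. The sketch below filters out the names that start with an underscore; the variable name public_methods is illustrative.

```python
st = 'hello my name is Sam'

# dir() returns all attribute names; skip the underscore-prefixed ones.
public_methods = [name for name in dir(st) if not name.startswith('_')]

# Show a few of them.
print(public_methods[:5])
print('lower' in public_methods)   # True
```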

You can also split a string. In R you can do the same with stringr::str_split().

> st.split()

['hello', 'my', 'name', 'is', 'Sam']

> tweet = 'Go Sports! #Sports'
+ tweet.split('#')

['Go Sports! ', 'Sports']

> tweet.split('#')[1]

'Sports'

> d = {'k1':1,'k2':2}
+ d

{'k1': 1, 'k2': 2}

> d.keys()

dict_keys(['k1', 'k2'])

> d.items()

dict_items([('k1', 1), ('k2', 2)])

> d.values()

dict_values([1, 2])

.pop() removes and returns the last element of a list (or the element at a given index), and it modifies the list in place.

> lst = [1,2,3]
+ lst.pop()

> lst

[1, 2]

> lst = [1,2,3,4,5]
+ first=lst.pop(0)
+ first
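A short sketch that tracks both the returned value and the list itself after each call:

```python
lst = [1, 2, 3, 4, 5]

# With no argument, pop() removes and returns the last element.
last = lst.pop()
print(last)    # 5

# With an index, it removes and returns the element at that position.
first = lst.pop(0)
print(first)   # 1

# The list itself has been changed by both calls.
print(lst)     # [2, 3, 4]
```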

> 'x' in [1,2,3]

False

> 'x' in ['x','y','z']

True

> x = [(1,2),(3,4),(5,6)]
+ x[0]

(1, 2)

> x[0][0]

Tuple Unpacking

> for item in x:
+     print(item)

(1, 2)
(3, 4)
(5, 6)

> for (a,b) in x:
+     print(a)

1
3
5

> for a,b in x:
+     print(a)
+     print(b)
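Tuple unpacking is also what makes it convenient to loop over d.items(), since each item is a (key, value) tuple. A minimal sketch; the pairs list is there only to collect the results.

```python
d = {'k1': 1, 'k2': 2}

# Each item is a (key, value) tuple, unpacked into two names.
pairs = []
for key, value in d.items():
    pairs.append(f'{key}={value}')
print(pairs)   # ['k1=1', 'k2=2']

# Unpacking also works outside of loops.
a, b = (1, 2)
print(a, b)    # 1 2
```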

Exercises

What is 7 to the power of 4?

> 7**4

Split this string into a list: s = "Hi there Sam!"

> s = 'Hi there Sam!'
+ s.split()

['Hi', 'there', 'Sam!']

Given the variables:

planet = "Earth"
diameter = 12742

Use .format() to print the following string:

The diameter of Earth is 12742 kilometers.

> planet = "Earth"
+ diameter = 12742
+ print('The diameter of {one} is {two} kilometers.'.
+       format(one=planet,two=diameter))

The diameter of Earth is 12742 kilometers.

Given this nested list, use indexing to grab the word “hello”:

> lst = [1,2,[3,4],[5,[100,200,['hello']],23,11],1,7]
+ lst[3][1][2][0]

'hello'

Given this nested dictionary, grab the word “hello”:

> d = {'k1':[1,2,3,{'tricky':['oh','man',
+ 'inception',{'target':[1,2,3,'hello']}]}]}
+ 
+ d['k1'][3]['tricky'][3]['target'][3]

'hello'

What is the main difference between a tuple and a list?

> # Tuple is immutable

Create a function that grabs the email website domain from a string in the form:

user@domain.com

> def domainGet(email):
+     return email.split('@')[-1]
+ domainGet('user@domain.com')

'domain.com'

Create a basic function that returns True if the word ‘dog’ is contained in the input string. Don’t worry about edge cases like punctuation being attached to the word dog, but do account for capitalization.

> def findDog(st):
+     return 'dog' in st.lower().split()
+ 
+ findDog('Is there a dog here?')

True

Create a function that counts the number of times the word “dog” occurs in a string. Again, ignore edge cases.

> def countDog(st):
+     count = 0
+     for word in st.lower().split():
+         if word == 'dog':
+             count += 1
+     return count
+ 
+ countDog('This dog runs faster than the other dog dude!')

Use lambda expressions and the filter() function to filter out words from a list that don’t start with the letter ‘s’. For example:

seq = ['soup','dog','salad','cat','great']

should be filtered down to:

['soup','salad']

> seq = ['soup','dog','salad','cat','great']
+ list(filter(lambda word: word[0]=='s',seq))

['soup', 'salad']

You are driving a little too fast, and a police officer stops you. Write a function to return one of 3 possible results: “No Ticket”, “Small Ticket”, or “Big Ticket”. If your speed is 60 or less, the result is “No Ticket”. If speed is between 61 and 80 inclusive, the result is “Small Ticket”. If speed is 81 or more, the result is “Big Ticket”. The exception is your birthday (encoded as a boolean parameter of the function): on your birthday, your speed can be 5 higher in all cases.

> def caught_speeding(speed, is_birthday):
+     
+     if is_birthday:
+         speeding = speed - 5
+     else:
+         speeding = speed
+     
+     if speeding > 80:
+         return 'Big Ticket'
+     elif speeding > 60:
+         return 'Small Ticket'
+     else:
+         return 'No Ticket'
+         
+ 
+ caught_speeding(81,True)

'Small Ticket'

> caught_speeding(81,False)

'Big Ticket'
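As a quick check of the boundary values, the sketch below repeats the function in a slightly condensed form and collects the results at each threshold, with and without the birthday allowance. The results dictionary is an illustrative helper, not part of the exercise.

```python
def caught_speeding(speed, is_birthday):
    # On a birthday, 5 is subtracted before comparing against the limits.
    speeding = speed - 5 if is_birthday else speed

    if speeding > 80:
        return 'Big Ticket'
    elif speeding > 60:
        return 'Small Ticket'
    else:
        return 'No Ticket'

# Check each boundary on a normal day and on a birthday.
results = {
    (60, False): caught_speeding(60, False),
    (61, False): caught_speeding(61, False),
    (80, False): caught_speeding(80, False),
    (81, False): caught_speeding(81, False),
    (65, True): caught_speeding(65, True),
    (66, True): caught_speeding(66, True),
    (85, True): caught_speeding(85, True),
    (86, True): caught_speeding(86, True),
}
for args, outcome in results.items():
    print(args, outcome)
```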

Python Crash Course

Python code in R Markdown

Paul Jozefek

2020-09-08