Course Description
It’s time to push forward and develop your Python chops even further. There are tons of fantastic functions in Python and its library ecosystem. However, as a data scientist, you’ll constantly need to write your own functions to solve problems that are dictated by your data. You will learn the art of function writing in this first Python Data Science Toolbox course. You’ll come out of this course being able to write your very own custom functions, complete with multiple parameters and multiple return values, along with default arguments and variable-length arguments. You’ll gain insight into scoping in Python and be able to write lambda functions and handle errors in your function writing practice. And you’ll wrap up each chapter by using your new skills to write functions that analyze Twitter DataFrames.
In this chapter, you’ll learn how to write simple functions, as well as functions that accept multiple arguments and return multiple values. You’ll also have the opportunity to apply these new skills to questions commonly encountered by data scientists.
In the video, you learned of another standard Python datatype, strings. Recall that these represent textual data. To assign the string 'DataCamp' to a variable company, you execute:
company = 'DataCamp'
You’ve also learned to use the operations + and * with strings. Unlike with numeric types such as ints and floats, the + operator concatenates strings together, while the * operator concatenates multiple copies of a string together. In this exercise, you will use the + and * operations on strings to answer the question below.
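As a quick illustration of the two operators, here is a minimal sketch. The variable names and values are hypothetical and deliberately differ from the ones used in the question.

```python
# Hypothetical values, not the ones used in the exercise
greeting = "py" + "thon"   # + joins two strings end to end
repeated = "ab" * 3        # * repeats a string
number = 2 * 3             # with ints, * is ordinary multiplication

print(greeting)  # python
print(repeated)  # ababab
print(number)    # 6
```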
Execute the following code in the shell:
object1 = "data" + "analysis" + "visualization"
object2 = 1 * 3
object3 = "1" * 3
What are the values in object1, object2, and object3, respectively?
- object1 contains "data + analysis + visualization", object2 contains "1*3", object3 contains 13.
- object1 contains "data+analysis+visualization", object2 contains 3, object3 contains "13".
- object1 contains "dataanalysisvisualization", object2 contains 3, object3 contains "111".
In the video, Hugo briefly examined the return behavior of the built-in functions print() and str(). Here, you will use both functions and examine their return values. A variable x has been preloaded for this exercise. Run the code below in the console. Pay close attention to the results to answer the question that follows.
- Assign str(x) to a variable y1: y1 = str(x)
- Assign print(x) to a variable y2: y2 = print(x)
- Check the types of the variables x, y1, and y2.
What are the types of x, y1, and y2?
- They are all str types.
- x is a float, y1 is a float, and y2 is a str.
- x is a float, y1 is a str, and y2 is a NoneType.
- They are all NoneType types.
In the last video, Hugo described the basics of how to define a function. You will now write your own function!
Define a function, shout(), which simply prints out a string with three exclamation marks '!!!' at the end. The code for the square() function that we wrote earlier is found below. You can use it as a pattern to define shout().
def square():
    new_value = 4 ** 2
    return new_value
Note that the function body is indented 4 spaces already for you. Function bodies need to be indented by a consistent number of spaces and the choice of 4 is common.
- Complete the function header by adding the function name, shout.
- In the function body, concatenate the string 'congratulations' with another string, '!!!'. Assign the result to shout_word.
- Print the value of shout_word.
- Call the shout function.
# Define the function shout
def shout():
    """Print a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = 'congratulations' + '!!!'
    # Print shout_word
    print(shout_word)
# Call shout
shout()
Congratulations! You have successfully defined and called your own function! That’s pretty cool.
In the previous exercise, you defined and called the function shout(), which printed out a string concatenated with '!!!'. You will now update shout() by adding a parameter so that it can accept and process any string argument passed to it. Also note that shout(word), the part of the header that specifies the function name and parameter(s), is known as the signature of the function. You may encounter this term in the wild!
- Complete the function header by adding the parameter name, word.
- Assign the result of concatenating word with '!!!' to shout_word.
- Print the value of shout_word.
- Call the shout() function, passing to it the string, 'congratulations'.
# Define shout with the parameter, word
def shout(word):
    """Print a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = word + '!!!'
    # Print shout_word
    print(shout_word)
# Call shout with the string 'congratulations'
shout('congratulations')
You’re getting very good at this! Try your hand at another modification to the shout() function so that it now returns a single value instead of printing within the function. Recall that the return keyword lets you return values from functions. Parts of the function shout(), which you wrote earlier, are shown. Returning values is generally more desirable than printing them out because, as you saw earlier, print() itself returns None, so a variable assigned the result of a print() call has type NoneType.
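To make the difference concrete, here is a minimal sketch. The names shout_print and shout_return are hypothetical and exist only for this comparison. A function that only prints hands back None, while a function that returns gives the caller a value to reuse.

```python
def shout_print(word):
    """Print the shouted word; implicitly returns None."""
    print(word + '!!!')

def shout_return(word):
    """Return the shouted word so the caller can reuse it."""
    return word + '!!!'

printed = shout_print('hello')     # prints hello!!! as a side effect
returned = shout_return('hello')   # prints nothing

print(type(printed).__name__)   # NoneType
print(returned.upper())         # HELLO!!!
```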
- In the function body, concatenate the string in word with '!!!' and assign to shout_word.
- Replace the print() statement with the appropriate return statement.
- Call the shout() function, passing to it the string, 'congratulations', and assigning the call to the variable, yell.
- To check if yell contains the value returned by shout(), print the value of yell.
# Define shout with the parameter, word
def shout(word):
    """Return a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = word + '!!!'
    # Replace print with return
    return shout_word
# Pass 'congratulations' to shout: yell
yell = shout('congratulations')
# Print yell
print(yell)
Hugo discussed the use of multiple parameters in defining functions in the last lecture. You are now going to use what you’ve learned to modify the shout() function further. Here, you will modify shout() to accept two arguments. Parts of the function shout(), which you wrote earlier, are shown.
- Modify the function header such that it accepts two parameters, word1 and word2, in that order.
- Concatenate each of word1 and word2 with '!!!' and assign to shout1 and shout2, respectively.
- Concatenate shout1 and shout2 together, in that order, and assign to new_shout.
- Pass the strings 'congratulations' and 'you', in that order, to a call to shout(). Assign the return value to yell.
# Define shout with parameters word1 and word2
def shout(word1, word2):
    """Concatenate strings with three exclamation marks"""
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    # Concatenate shout1 with shout2: new_shout
    new_shout = shout1 + shout2
    # Return new_shout
    return new_shout
# Pass 'congratulations' and 'you' to shout: yell
yell = shout('congratulations', 'you')
# Print yell
print(yell)
Alongside learning about functions, you’ve also learned about tuples! Here, you will practice what you’ve learned about tuples: how to construct, unpack, and access tuple elements. Recall how Hugo unpacked the tuple even_nums in the video:
a, b, c = even_nums
A three-element tuple named nums has been preloaded for this exercise. Before completing the script, perform the following:
- Print out the value of nums in the IPython shell. Note the elements in the tuple.
- In the IPython shell, try to change the first element of nums to the value 2 by doing an assignment: nums[0] = 2. What happens?
- Unpack nums to the variables num1, num2, and num3.
- Construct a new tuple, even_nums, composed of the same elements in nums, but with the 1st element replaced with the value, 2.
# edited/added
nums = (3, 4, 6)
# Unpack nums into num1, num2, and num3
num1, num2, num3 = nums
# Construct even_nums
even_nums = (2, num2, num3)
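The assignment nums[0] = 2 suggested above fails because tuples are immutable. The sketch below shows the error being caught. It also shows one alternative way to build the new tuple, using slicing rather than unpacking; this is an illustration, not the exercise's expected solution.

```python
nums = (3, 4, 6)

# Tuples are immutable, so item assignment raises a TypeError
try:
    nums[0] = 2
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Building a new tuple is the way to "change" an element
even_nums = (2,) + nums[1:]
print(even_nums)  # (2, 4, 6)
```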
In the previous exercise, you constructed tuples, assigned tuples to variables, and unpacked tuples. Here you will return multiple values from a function using tuples. Let’s now update our shout() function to return multiple values. Instead of returning just one string, we will return two strings with the string '!!!' concatenated to each.
Note that the return statement return x, y has the same result as return (x, y): the former actually packs x and y into a tuple under the hood!
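A small sketch of this packing behavior, using a hypothetical helper min_max that is not part of the course material:

```python
def min_max(values):
    """Return the smallest and largest value (hypothetical helper)."""
    return min(values), max(values)   # packed into a tuple automatically

result = min_max([4, 1, 7])
print(type(result).__name__)  # tuple
print(result)                 # (1, 7)

# The returned tuple can be unpacked directly at the call site
lowest, highest = min_max([4, 1, 7])
```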
- Modify the function header such that the function name is now shout_all, and it accepts two parameters, word1 and word2, in that order.
- Concatenate the string '!!!' to each of word1 and word2 and assign to shout1 and shout2, respectively.
- Construct a tuple shout_words, composed of shout1 and shout2.
- Call shout_all() with the strings 'congratulations' and 'you' and assign the result to yell1 and yell2 (remember, shout_all() returns 2 values!).
# Define shout_all with parameters word1 and word2
def shout_all(word1, word2):
    """Return a tuple of strings"""
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    # Construct a tuple with shout1 and shout2: shout_words
    shout_words = (shout1, shout2)
    # Return shout_words
    return shout_words
# Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
yell1, yell2 = shout_all('congratulations', 'you')
# Print yell1 and yell2
print(yell1)
print(yell2)
You’ve got your first taste of writing your own functions in the previous exercises. You’ve learned how to add parameters to your own function definitions, return a value or multiple values with tuples, and how to call the functions you’ve defined.
In this and the following exercise, you will bring together all these concepts and apply them to a simple data science problem. You will load a dataset and develop functionalities to extract simple insights from the data.
For this exercise, your goal is to recall how to load a dataset into a DataFrame. The dataset contains Twitter data and you will iterate over entries in a column to build a dictionary in which the keys are the names of languages and the values are the number of tweets in the given language. The file tweets.csv is available in your current directory.
Be aware that this is real data from Twitter and as such there is always a risk that it may contain profanity or other offensive content (in this exercise, and any following exercises that also use real Twitter data).
- Import the pandas package with the alias pd.
- Import the file 'tweets.csv' using the pandas function read_csv(). Assign the resulting DataFrame to df.
- Complete the for loop by iterating over col, the 'lang' column in the DataFrame df.
- Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to the value corresponding to this key in the dictionary, else add the key to langs_count and set the corresponding value to 1. Use the loop variable entry in your code.
# Import pandas
import pandas as pd
# Import Twitter data as DataFrame: df
df = pd.read_csv('tweets.csv')
# Initialize an empty dictionary: langs_count
langs_count = {}
# Extract column from DataFrame: col
col = df['lang']
# Iterate over lang column in DataFrame
for entry in col:
    # If the language is in langs_count, add 1
    if entry in langs_count.keys():
        langs_count[entry] += 1
    # Else add the language to langs_count, set the value to 1
    else:
        langs_count[entry] = 1
# Print the populated dictionary
print(langs_count)
Great job! You’ve now defined the functionality for iterating over entries in a column and building a dictionary whose keys are the names of languages and whose values are the number of tweets in each language.
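The same counting pattern can be tried without the CSV file. In the sketch below, the list langs is a hypothetical stand-in for the 'lang' column, and collections.Counter from the standard library is used only as a cross-check; it is not something the exercise asks for.

```python
from collections import Counter

# Hypothetical stand-in for the 'lang' column of the DataFrame
langs = ['en', 'et', 'en', 'und', 'en', 'et']

langs_count = {}
for entry in langs:
    if entry in langs_count:
        langs_count[entry] += 1
    else:
        langs_count[entry] = 1

print(langs_count)  # {'en': 3, 'et': 2, 'und': 1}

# Counter from the standard library produces the same counts
assert langs_count == dict(Counter(langs))
```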
In this exercise, you will define a function with the functionality you developed in the previous exercise, return the resulting dictionary from within the function, and call the function with the appropriate arguments.
For your convenience, the pandas package has been imported as pd and the 'tweets.csv' file has been imported into the tweets_df variable.
- Define the function count_entries(), which has two parameters. The first parameter is df for the DataFrame and the second is col_name for the column name.
- Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to its current value, else add the key to langs_count and set its value to 1. Use the loop variable entry in your code.
- Return the langs_count dictionary from inside the count_entries() function.
- Call the count_entries() function by passing to it tweets_df and the name of the column, 'lang'. Assign the result of the call to the variable result.
# edited/added
tweets_df = pd.read_csv('tweets.csv')
# Define count_entries()
def count_entries(df, col_name):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: langs_count
    langs_count = {}
    # Extract column from DataFrame: col
    col = df[col_name]
    # Iterate over lang column in DataFrame
    for entry in col:
        # If the language is in langs_count, add 1
        if entry in langs_count.keys():
            langs_count[entry] += 1
        # Else add the language to langs_count, set the value to 1
        else:
            langs_count[entry] = 1
    # Return the langs_count dictionary
    return langs_count
# Call count_entries(): result
result = count_entries(tweets_df, 'lang')
# Print the result
print(result)
Congratulations, you’re now a bona fide Python function writer. On top of that, you have just written your very first data science function.
At this point, although you can write basic functions, you’ve really just scratched the surface of function writing capabilities. In the following chapters, you’ll learn how to write functions that have default arguments so that when you call them, you don’t always have to specify all the arguments; you’ll learn how to write functions that can accept an arbitrary number of arguments and how to nest functions within one another; on top of this, you’ll learn how to handle errors when writing functions, which will make your functions as robust as they need to be. Moreover, you’ll see the importance of such techniques in data science by writing functions that are pertinent to the data science sphere, like the Twitter DataFrame analysis that you just performed.
I am pumped for this and can’t wait to see you in the next chapter!
In this chapter, you’ll learn to write functions with default arguments so that the user doesn’t always need to specify them, and variable-length arguments so they can pass an arbitrary number of arguments on to your functions. You’ll also learn about the essential concept of scope.
In this exercise, you will practice what you’ve learned about scope in functions. The variable num has been predefined as 5, alongside the following function definitions:
def func1():
    num = 3
    print(num)
def func2():
    global num
    double_num = num * 2
    num = 6
    print(double_num)
Try calling func1() and func2() in the shell, then answer the following questions:
What are the values printed out when you call func1() and func2()?
What is the value of num in the global scope after calling func1() and func2()?
- func1() prints out 3, func2() prints out 6, and the value of num in the global scope is 3.
- func1() prints out 3, func2() prints out 3, and the value of num in the global scope is 3.
- func1() prints out 3, func2() prints out 10, and the value of num in the global scope is 10.
- func1() prints out 3, func2() prints out 10, and the value of num in the global scope is 6.
Let’s work more on your mastery of scope. In this exercise, you will use the keyword global within a function to alter the value of a variable defined in the global scope.
- Use the keyword global to alter the object team in the global scope.
- Change the value of team in the global scope to the string "justice league". Assign the result to team.
- See how executing your newly defined function change_team() changes the value of the name team!
# Create a string: team
team = "teen titans"
# Define change_team()
def change_team():
    """Change the value of the global variable team."""
    # Use team in global scope
    global team
    # Change the value of team in global: team
    team = "justice league"
# Print team
print(team)
# Call change_team()
change_team()
# Print team
print(team)
Here you’re going to check out Python’s built-in scope, which is really just a built-in module called builtins. However, to query builtins, you’ll need to import builtins ‘because the name builtins is not itself built in…No, I’m serious!’ (Learning Python, 5th edition, Mark Lutz). After executing import builtins in the IPython Shell, execute dir(builtins) to print a list of all the names in the module builtins. Have a look and you’ll see a bunch of names that you’ll recognize! Which of the following names is NOT in the module builtins?
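Rather than scanning the whole printed list, a candidate name can be checked with a membership test on dir(builtins). A minimal sketch follows; the name not_a_builtin is made up for illustration.

```python
import builtins

names = dir(builtins)

# 'len' and 'print' are built-in functions, so both are listed
print('len' in names)     # True
print('print' in names)   # True

# A made-up name (hypothetical) is not part of the module
print('not_a_builtin' in names)  # False
```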
You’ve learned in the last video about nesting functions within functions. One reason why you’d like to do this is to avoid writing out the same computations within functions repeatedly. There’s nothing new about defining nested functions: you simply define one as you would a regular function with def and embed it inside another function!
In this exercise, inside a function three_shouts(), you will define a nested function inner() that concatenates a string object with '!!!'. three_shouts() then returns a tuple of three elements, each a string concatenated with '!!!' using inner(). Go for it!
- Complete the function header of the nested function with the function name inner() and a single parameter word.
- Complete the return value: each element of the tuple should be a call to inner(), passing in the parameters from three_shouts() as arguments to each call.
# Define three_shouts
def three_shouts(word1, word2, word3):
    """Returns a tuple of strings
    concatenated with '!!!'."""
    # Define inner
    def inner(word):
        """Returns a string concatenated with '!!!'."""
        return word + '!!!'
    # Return a tuple of strings
    return (inner(word1), inner(word2), inner(word3))
# Call three_shouts() and print
print(three_shouts('a', 'b', 'c'))
Great job, you’ve just nested a function within another function. One other pretty cool reason for nesting functions is the idea of a closure. This means that the nested or inner function remembers the state of its enclosing scope when called. Thus, anything defined locally in the enclosing scope is available to the inner function even when the outer function has finished execution.
Let’s move forward then! In this exercise, you will complete the definition of the inner function inner_echo() and then call echo() a couple of times, each with a different argument. Complete the exercise and see what the output will be!
- Complete the function header of the inner function with the function name inner_echo() and a single parameter word1.
- Complete the function echo() so that it returns inner_echo.
- We have called echo(), passing 2 as an argument, and assigned the resulting function to twice. Your job is to call echo(), passing 3 as an argument. Assign the resulting function to thrice.
- Call twice() and thrice() and print the results.
# Define echo
def echo(n):
    """Return the inner_echo function."""
    # Define inner_echo
    def inner_echo(word1):
        """Concatenate n copies of word1."""
        echo_word = word1 * n
        return echo_word
    # Return inner_echo
    return inner_echo
# Call echo: twice
twice = echo(2)
# Call echo: thrice
thrice = echo(3)
# Call twice() and thrice() then print
print(twice('hello'), thrice('hello'))
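To see that each returned function really does remember its own n, the captured value can be inspected through the function's __closure__ attribute. This is a sketch for exploration only; reading __closure__ is rarely needed in everyday code, and the variable names captured_twice and captured_thrice are hypothetical.

```python
def echo(n):
    """Return a function that repeats its argument n times."""
    def inner_echo(word1):
        return word1 * n
    return inner_echo

twice = echo(2)
thrice = echo(3)

# Each returned function carries its own captured value of n
captured_twice = twice.__closure__[0].cell_contents
captured_thrice = thrice.__closure__[0].cell_contents
print(captured_twice, captured_thrice)  # 2 3
print(twice('hi'), thrice('hi'))        # hihi hihihi
```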
Let’s once again work further on your mastery of scope! In this exercise, you will use the keyword nonlocal within a nested function to alter the value of a variable defined in the enclosing scope.
- Assign to echo_word the string word, concatenated with itself.
- Use the keyword nonlocal to alter the value of echo_word in the enclosing scope.
- Alter echo_word to echo_word concatenated with '!!!'.
- Call the function echo_shout(), passing it a single argument 'hello'.
# Define echo_shout()
def echo_shout(word):
    """Change the value of a nonlocal variable"""
    # Concatenate word with itself: echo_word
    echo_word = word * 2
    # Print echo_word
    print(echo_word)
    # Define inner function shout()
    def shout():
        """Alter a variable in the enclosing scope"""
        # Use echo_word in nonlocal scope
        nonlocal echo_word
        # Change echo_word to echo_word concatenated with '!!!'
        echo_word = echo_word + '!!!'
    # Call function shout()
    shout()
    # Print echo_word
    print(echo_word)
# Call function echo_shout() with argument 'hello'
echo_shout('hello')
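Another common use of nonlocal is keeping state between calls of a nested function. The sketch below uses hypothetical names (make_counter, increment). Without the nonlocal line, count += 1 would raise an UnboundLocalError, because the assignment would make count local to increment.

```python
def make_counter():
    """Return a function that counts how often it has been called."""
    count = 0
    def increment():
        nonlocal count   # rebind count in make_counter's scope
        count += 1
        return count
    return increment

counter = make_counter()
first = counter()
second = counter()
print(first, second)  # 1 2
```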
In the previous chapter, you learned to define functions with more than one parameter and then call those functions by passing the required number of arguments. In the last video, Hugo built on this idea by showing you how to define functions with default arguments. You will practice that skill in this exercise by writing a function that uses a default argument and then calling the function a couple of times.
- Complete the function header with the function name shout_echo. It accepts an argument word1 and a default argument echo with default value 1, in that order.
- Use the * operator to concatenate echo copies of word1. Assign the result to echo_word.
- Call shout_echo() with just the string, "Hey". Assign the result to no_echo.
- Call shout_echo() with the string "Hey" and the value 5 for the default argument, echo. Assign the result to with_echo.
# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    # Concatenate echo copies of word1 using *: echo_word
    echo_word = word1 * echo
    # Concatenate '!!!' to echo_word: shout_word
    shout_word = echo_word + '!!!'
    # Return shout_word
    return shout_word
# Call shout_echo() with "Hey": no_echo
no_echo = shout_echo("Hey")
# Call shout_echo() with "Hey" and echo=5: with_echo
with_echo = shout_echo("Hey", echo=5)
# Print no_echo and with_echo
print(no_echo)
print(with_echo)
You’ve now defined a function that uses a default argument - don’t stop there just yet! You will now try your hand at defining a function with more than one default argument and then calling this function in various ways.
After defining the function, you will call it by supplying values to all the default arguments of the function. Additionally, you will call the function by not passing a value to one of the default arguments - see how that changes the output of your function!
- Complete the function header with the function name shout_echo. It accepts an argument word1, a default argument echo with default value 1 and a default argument intense with default value False, in that order.
- In the body of the if statement, make the string object echo_word upper case by applying the method .upper() on it.
- Call shout_echo() with the string, "Hey", the value 5 for echo and the value True for intense. Assign the result to with_big_echo.
- Call shout_echo() with the string "Hey" and the value True for intense. Assign the result to big_no_echo.
# Define shout_echo
def shout_echo(word1, echo=1, intense=False):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    # Concatenate echo copies of word1 using *: echo_word
    echo_word = word1 * echo
    # Make echo_word uppercase if intense is True
    if intense is True:
        # Make uppercase and concatenate '!!!': echo_word_new
        echo_word_new = echo_word.upper() + '!!!'
    else:
        # Concatenate '!!!' to echo_word: echo_word_new
        echo_word_new = echo_word + '!!!'
    # Return echo_word_new
    return echo_word_new
# Call shout_echo() with "Hey", echo=5 and intense=True: with_big_echo
with_big_echo = shout_echo("Hey", echo=5, intense=True)
# Call shout_echo() with "Hey" and intense=True: big_no_echo
big_no_echo = shout_echo("Hey", intense=True)
# Print with_big_echo and big_no_echo
print(with_big_echo)
print(big_no_echo)
Flexible arguments enable you to pass a variable number of arguments to a function. In this exercise, you will practice defining a function that accepts a variable number of string arguments.
The function you will define is gibberish(), which can accept a variable number of string values. Its return value is a single string composed of all the string arguments concatenated together in the order they were passed to the function call. You will call the function with a single string argument and see how the output changes with another call using more than one string argument. Recall from the previous video that, within the function definition, args is a tuple.
- Complete the function header with the function name gibberish. It accepts a single flexible argument *args.
- Initialize a variable hodgepodge to an empty string.
- Return the variable hodgepodge at the end of the function body.
- Call gibberish() with the single string, "luke". Assign the result to one_word.
- Call gibberish() with multiple arguments and print the values to the Shell.
# Define gibberish
def gibberish(*args):
    """Concatenate strings in *args together."""
    # Initialize an empty string: hodgepodge
    hodgepodge = ''
    # Concatenate the strings in args
    for word in args:
        hodgepodge += word
    # Return hodgepodge
    return hodgepodge
# Call gibberish() with one string: one_word
one_word = gibberish("luke")
# Call gibberish() with five strings: many_words
many_words = gibberish("luke", "leia", "han", "obi", "darth")
# Print one_word and many_words
print(one_word)
print(many_words)
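The claim that args is a tuple can be checked directly. The sketch below uses a hypothetical helper, describe_args, and also shows the reverse direction: spreading an existing list into separate positional arguments with *.

```python
def describe_args(*args):
    """Return the type name and length of args (hypothetical helper)."""
    return type(args).__name__, len(args)

print(describe_args())                # ('tuple', 0)
print(describe_args('luke', 'leia'))  # ('tuple', 2)

# An existing list can be spread into separate arguments with *
names = ['luke', 'leia', 'han']
kind, count = describe_args(*names)
print(kind, count)  # tuple 3
```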
Let’s push further on what you’ve learned about flexible arguments - you’ve used *args, you’re now going to use **kwargs! What makes **kwargs different is that it allows you to pass a variable number of keyword arguments to functions. Recall from the previous video that, within the function definition, kwargs is a dictionary.
To understand this idea better, you’re going to use **kwargs in this exercise to define a function that accepts a variable number of keyword arguments. The function simulates a simple status report system that prints out the status of a character in a movie.
- Complete the function header with the function name report_status. It accepts a single flexible argument **kwargs.
- Iterate over the key-value pairs of kwargs to print out the keys and values, separated by a colon ':'.
- In the first call to report_status(), pass the following keyword-value pairs: name="luke", affiliation="jedi" and status="missing".
- In the second call to report_status(), pass the following keyword-value pairs: name="anakin", affiliation="sith lord" and status="deceased".
# Define report_status
def report_status(**kwargs):
    """Print out the status of a movie character."""
    print("\nBEGIN: REPORT\n")
    # Iterate over the key-value pairs of kwargs
    for key, value in kwargs.items():
        # Print out the keys and values, separated by a colon ':'
        print(key + ": " + value)
    print("\nEND REPORT")
# First call to report_status()
report_status(name="luke", affiliation="jedi", status="missing")
# Second call to report_status()
report_status(name="anakin", affiliation="sith lord", status="deceased")
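That kwargs arrives as an ordinary dictionary can be confirmed with a small sketch. The helper collect is hypothetical and simply hands kwargs back for inspection; the second call shows an existing dictionary being spread into keyword arguments with **.

```python
def collect(**kwargs):
    """Return kwargs unchanged so it can be inspected (hypothetical helper)."""
    return kwargs

status = collect(name="luke", affiliation="jedi")
print(type(status).__name__)  # dict
print(status)                 # {'name': 'luke', 'affiliation': 'jedi'}

# An existing dictionary can be spread into keyword arguments with **
profile = {"name": "anakin", "status": "deceased"}
same = collect(**profile)
```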
Recall the Bringing it all together exercise in the previous chapter where you did a simple Twitter analysis by developing a function that counts how many tweets are in certain languages. The output of your function was a dictionary that had the language as the keys and the counts of tweets in that language as the value.
In this exercise, we will generalize the Twitter language analysis that you did in the previous chapter. You will do that by including a default argument that takes a column name.
For your convenience, pandas has been imported as pd and the 'tweets.csv' file has been imported into the DataFrame tweets_df. Parts of the code from your previous work are also provided.
- Complete the function header by supplying the parameter for the DataFrame, df, and the parameter col_name with a default value of 'lang' for the DataFrame column name.
- Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the result to result1. Note that since 'lang' is the default value of the col_name parameter, you don’t have to specify it here.
- Call count_entries() by passing the tweets_df DataFrame and the column name 'source'. Assign the result to result2.
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Extract column from DataFrame: col
    col = df[col_name]
    # Iterate over the column in DataFrame
    for entry in col:
        # If entry is in cols_count, add 1
        if entry in cols_count.keys():
            cols_count[entry] += 1
        # Else add the entry to cols_count, set the value to 1
        else:
            cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
result1 = count_entries(tweets_df, col_name='lang')
# Call count_entries(): result2
result2 = count_entries(tweets_df, col_name='source')
# Print result1 and result2
print(result1)
print(result2)
Wow, you’ve just generalized your Twitter language analysis that you did in the previous chapter to include a default argument for the column name. You’re now going to generalize this function one step further by allowing the user to pass it a flexible argument, that is, in this case, as many column names as the user would like!
Once again, for your convenience, pandas has been imported as pd and the 'tweets.csv' file has been imported into the DataFrame tweets_df. Parts of the code from your previous work are also provided.
- Complete the function header by supplying the parameter for the DataFrame, df, and the flexible argument *args.
- Complete the for loop within the function definition so that the loop occurs over the tuple args.
- Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the result to result1.
- Call count_entries() by passing the tweets_df DataFrame and the column names 'lang' and 'source'. Assign the result to result2.
# Define count_entries()
def count_entries(df, *args):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Iterate over column names in args
    for col_name in args:
        # Extract column from DataFrame: col
        col = df[col_name]
        # Iterate over the column in DataFrame
        for entry in col:
            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1
            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
# Call count_entries(): result2
result2 = count_entries(tweets_df, 'lang', 'source')
# Print result1 and result2
print(result1)
print(result2)
Learn about lambda functions, which allow you to write functions quickly and on the fly. You’ll also practice handling errors in your functions, which is an essential skill. Then, apply your new skills to answer data science questions.
In this exercise, you will practice writing a simple lambda function and calling this function. Recall what you know about lambda functions and answer the following questions:
- How would you write a lambda function add_bangs that adds three exclamation points '!!!' to the end of a string a?
- How would you call add_bangs with the argument 'hello'?
You may use the IPython shell to test your code.
- The lambda function definition is: add_bangs = (a + '!!!'), and the function call is: add_bangs('hello').
- The lambda function definition is: add_bangs = (lambda a: a + '!!!'), and the function call is: add_bangs('hello').
- The lambda function definition is: (lambda a: a + '!!!') = add_bangs, and the function call is: add_bangs('hello').
Some function definitions are simple enough that they can be converted to a lambda function. By doing this, you write fewer lines of code, which is pretty awesome and will come in handy, especially when you’re writing and maintaining big programs. In this exercise, you will use what you know about lambda functions to convert a function that does a simple task into a lambda function. Take a look at this function definition:
def echo_word(word1, echo):
    """Concatenate echo copies of word1."""
    words = word1 * echo
    return words
The function echo_word takes 2 parameters: a string value, word1, and an integer value, echo. It returns a string that is a concatenation of echo copies of word1. Your task is to convert this simple function into a lambda function.
- Define the lambda function echo_word using the variables word1 and echo. Replicate what the original function definition for echo_word() does above.
- Call echo_word() with the string argument 'hey' and the value 5, in that order. Assign the call to result.
# Define echo_word as a lambda function: echo_word
echo_word = (lambda word1, echo: word1 * echo)
# Call echo_word: result
result = echo_word('hey', 5)
# Print result
print(result)
So far, you’ve used lambda functions to write short, simple functions
as well as to redefine functions with simple functionality. The best use
case for lambda functions, however, are for when you want these simple
functionalities to be anonymously embedded within larger expressions.
What that means is that the functionality is not stored in the
environment, unlike a function defined with def
. To
understand this idea better, you will use a lambda function in the
context of the map()
function.
Recall from the video that map()
applies a function over
an object, such as a list. Here, you can use lambda functions to define
the function that map()
will use to process the object. For
example:
nums = [2, 4, 6, 8, 10]
result = map(lambda a: a ** 2, nums)
You can see here that a lambda function, which raises a value a to the power of 2, is passed to map() alongside a list of numbers, nums. The map object that results from the call to map() is stored in result. You will now practice the use of lambda functions with map(). For this exercise, you will map the functionality of the add_bangs() function you defined in previous exercises over a list of strings.
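One detail worth seeing in action is that, in Python 3, the map object is a lazy iterator rather than a list. The minimal sketch below reuses the nums example; the names squares and leftover are introduced here only for illustration.

```python
nums = [2, 4, 6, 8, 10]

# map() returns a lazy map object; nothing is computed yet
result = map(lambda a: a ** 2, nums)

# Converting to a list consumes the map object
squares = list(result)
print(squares)   # [4, 16, 36, 64, 100]

# The map object is now exhausted, so a second conversion is empty
leftover = list(result)
print(leftover)  # []
```

This is why the exercises convert the map object to a list before printing it.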
In the map() call, pass a lambda function that concatenates the string '!!!' to a string item; also pass the list of strings, spells. Assign the resulting map object to shout_spells.
Convert shout_spells to a list and print out the list.
# Create a list of strings: spells
spells = ['protego', 'accio', 'expecto patronum', 'legilimens']
# Use map() to apply a lambda function over spells: shout_spells
shout_spells = map(lambda item: item + '!!!', spells)
# Convert shout_spells to a list: shout_spells_list
shout_spells_list = list(shout_spells)
# Print the result
print(shout_spells_list)
In the previous exercise, you used lambda functions to anonymously embed an operation within map(). You will practice this again in this exercise by using a lambda function with filter(), which may be new to you! The function filter() offers a way to filter out elements from a list that don’t satisfy certain criteria.
Your goal in this exercise is to use filter() to create, from an input list of strings, a new list that contains only strings that have more than 6 characters.
In the filter() call, pass a lambda function and the list of strings, fellowship. The lambda function should check if the number of characters in a string member is greater than 6; use the len() function to do this. Assign the resulting filter object to result.
Convert result to a list and print out the list.
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']
# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda member: len(member) > 6, fellowship)
# Convert result to a list: result_list
result_list = list(result)
# Print result_list
print(result_list)
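For comparison, the same selection can be written as a list comprehension. This is only a sketch of an equivalent form, not part of the exercise; the names with_filter and with_comprehension are made up for the illustration.

```python
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn',
              'boromir', 'legolas', 'gimli', 'gandalf']

# filter() keeps the members for which the lambda returns True
with_filter = list(filter(lambda member: len(member) > 6, fellowship))

# A list comprehension with an if clause expresses the same condition
with_comprehension = [member for member in fellowship if len(member) > 6]

print(with_filter)         # ['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']
print(with_comprehension)  # same list
```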
You’re getting very good at using lambda functions! Here’s one more function to add to your repertoire of skills. The reduce() function is useful for performing some computation on a list and, unlike map() and filter(), returns a single value as a result. To use reduce(), you must import it from the functools module.
Remember gibberish() from a few exercises back?
# Define gibberish
def gibberish(*args):
    """Concatenate strings in *args together."""
    hodgepodge = ''
    for word in args:
        hodgepodge += word
    return hodgepodge
gibberish() simply takes any number of strings as arguments and returns, as a single-value result, the concatenation of all of these strings. In this exercise, you will replicate this functionality by using reduce() and a lambda function that concatenates strings together.
Import the reduce function from the functools module.
In the reduce() call, pass a lambda function that takes two string arguments item1 and item2 and concatenates them; also pass the list of strings, stark. Assign the result to result. The first argument to reduce() should be the lambda function and the second argument is the list stark.
# Import reduce from functools
from functools import reduce
# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']
# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda item1, item2: item1 + item2, stark)
# Print the result
print(result)
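To make the accumulation a bit more concrete, here is a hedged sketch that spells out what reduce() does with a plain loop, and shows the optional third argument (an initial value) for the empty-list case. The names accumulated and empty_result are introduced here only for illustration.

```python
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# reduce() combines the first two items, then combines that result
# with the third item, and so on until one value remains
result = reduce(lambda item1, item2: item1 + item2, stark)
print(result)  # robbsansaaryabrandonrickon

# The same accumulation written as an explicit loop
accumulated = stark[0]
for name in stark[1:]:
    accumulated = accumulated + name
print(accumulated == result)  # True

# With an empty list, reduce() needs an initial value as a third
# argument; without one it raises a TypeError
empty_result = reduce(lambda item1, item2: item1 + item2, [], '')
print(repr(empty_result))  # ''
```

For string concatenation specifically, ''.join(stark) gives the same result; reduce() is used here because the exercise is about passing lambda functions to it.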
In the video, Hugo talked about how errors happen when functions are supplied arguments that they are unable to work with. In this exercise, you will identify which function call raises an error and what type of error is raised.
Take a look at the following function calls to len():
len('There is a beast in every man and it stirs when you put a sword in his hand.')
len(['robb', 'sansa', 'arya', 'eddard', 'jon'])
len(525600)
len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'))
Which of the function calls raises an error and what type of error is raised?
len('There is a beast in every man and it stirs when you put a sword in his hand.') raises a TypeError.
len(['robb', 'sansa', 'arya', 'eddard', 'jon']) raises an IndexError.
len(525600) raises a TypeError.
len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey')) raises a NameError.
A good practice in writing your own functions is also anticipating the ways in which other people (or yourself, if you accidentally misuse your own function) might use the function you defined.
In the previous exercise, you saw that the len() function is able to handle input arguments such as strings, lists, and tuples, but not ints, and that it raises an appropriate error and error message when it encounters invalid input arguments. One way to handle invalid input in your own functions is through exception handling with the try-except block.
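As a small sketch of the idea, the helper below wraps len() in a try-except block. The function safe_len is hypothetical and exists only for this illustration; it is not part of the course code.

```python
def safe_len(obj):
    """Return len(obj), or None if obj has no length (hypothetical helper)."""
    try:
        return len(obj)
    except TypeError as err:
        # len() raises a TypeError for objects without a length, such as ints
        print('Could not compute length:', err)
        return None

print(safe_len('abc'))             # 3
print(safe_len(['robb', 'arya']))  # 2
print(safe_len(525600))            # prints the error message, then None
```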
In this exercise, you will define a function as well as use a try-except block for handling cases when incorrect input arguments are passed to the function.
Recall the shout_echo() function you defined in previous exercises; parts of the function definition are provided in the sample code. Your goal is to complete the exception handling code in the function definition and provide an appropriate error message when raising an error.
Initialize the variables echo_word and shout_words to empty strings.
Add the keywords try and except in the appropriate locations for the exception handling block.
Use the * operator to concatenate echo copies of word1. Assign the result to echo_word.
Concatenate the string '!!!' to echo_word. Assign the result to shout_words.
# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    # Initialize empty strings: echo_word, shout_words
    echo_word = ''
    shout_words = ''
    # Add exception handling with try-except
    try:
        # Concatenate echo copies of word1 using *: echo_word
        echo_word = word1 * echo
        # Concatenate '!!!' to echo_word: shout_words
        shout_words = echo_word + '!!!'
    except:
        # Print error message
        print("word1 must be a string and echo must be an integer.")
    # Return shout_words
    return shout_words
# Call shout_echo
shout_echo("particle", echo="accelerator")
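A bare except catches every exception, including ones unrelated to bad argument types. The following is a sketch of a narrower variant that catches only TypeError, which is what multiplying a string by a non-integer raises. The name shout_echo_strict is made up for this illustration and is not the exercise's solution.

```python
def shout_echo_strict(word1, echo=1):
    """Variant of shout_echo that catches only TypeError (hypothetical name)."""
    shout_words = ''
    try:
        # str * int repeats the string; str * str raises a TypeError
        echo_word = word1 * echo
        shout_words = echo_word + '!!!'
    except TypeError:
        print("word1 must be a string and echo must be an integer.")
    return shout_words

valid = shout_echo_strict("particle", echo=2)
invalid = shout_echo_strict("particle", echo="accelerator")
print(valid)          # particleparticle!!!
print(repr(invalid))  # ''
```

As in the exercise version, an invalid call prints the message and returns the empty string that shout_words was initialized to.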
Another way to raise an error is by using raise. In this exercise, you will add a raise statement to the shout_echo() function you defined before to raise an error message when the value supplied by the user to the echo argument is less than 0.
The call to shout_echo() uses valid argument values. To test and see how the raise statement works, simply change the value for the echo argument to a negative value. Don’t forget to change it back to valid values to move on to the next exercise!
Complete the if statement by checking if the value of echo is less than 0.
In the body of the if statement, add a raise statement that raises a ValueError with message 'echo must be greater than or equal to 0' when the value supplied by the user to echo is less than 0.
# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    # Raise an error with raise
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')
    # Concatenate echo copies of word1 using *: echo_word
    echo_word = word1 * echo
    # Concatenate '!!!' to echo_word: shout_word
    shout_word = echo_word + '!!!'
    # Return shout_word
    return shout_word
# Call shout_echo
shout_echo("particle", echo=5)
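To see the raise statement fire without stopping the script, the call with a negative value can be wrapped in a try-except block. This is a minimal sketch; the function body is condensed from the exercise, and the variable name message is introduced here for illustration.

```python
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three exclamation marks."""
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')
    return word1 * echo + '!!!'

# A valid call returns the shouted string
print(shout_echo("particle", echo=2))  # particleparticle!!!

# A negative echo raises a ValueError, which the caller can catch
try:
    shout_echo("particle", echo=-1)
except ValueError as err:
    message = str(err)
    print('Caught:', message)
```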
This is awesome! You have now learned how to write anonymous functions using lambda, how to pass lambda functions as arguments to other functions such as map(), filter(), and reduce(), as well as how to raise errors and output custom error messages within your functions. You will now put these skills to good use by working with a Twitter dataset. Before practicing your new error handling skills, you will write a lambda function in this exercise and use filter() to select retweets, that is, tweets that begin with the string 'RT'.
To help you accomplish this, the Twitter data has been imported into the DataFrame, tweets_df. Go for it!
In the filter() call, pass a lambda function and the sequence of tweets as strings, tweets_df['text']. The lambda function should check if the first 2 characters in a tweet x are 'RT'. Assign the resulting filter object to result. To get the first 2 characters in a tweet x, use x[0:2]. To check equality, use a Boolean filter with ==.
Convert result to a list and print out the list.
# Select retweets from the Twitter DataFrame: result
result = filter(lambda x: x[0:2] == 'RT', tweets_df['text'])
# Create list from filter object result: res_list
res_list = list(result)
# Print all retweets in res_list
for tweet in res_list:
    print(tweet)
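The same check can be tried without the Twitter data. The sketch below uses a plain list of made-up strings as a stand-in for tweets_df['text'] (the list contents are hypothetical) and compares the slicing test with the str.startswith() method.

```python
# Hypothetical stand-in for tweets_df['text']; not real Twitter data
tweets = ['RT @user: hello', 'Just a normal tweet', 'RT',
          'rt in lowercase does not match', '']

# Slicing never fails on short strings: ''[0:2] is simply ''
by_slice = list(filter(lambda x: x[0:2] == 'RT', tweets))

# str.startswith() expresses the same test more directly
by_startswith = list(filter(lambda x: x.startswith('RT'), tweets))

print(by_slice)       # ['RT @user: hello', 'RT']
print(by_startswith)  # ['RT @user: hello', 'RT']
```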
Sometimes, we make mistakes when calling functions, even ones we wrote ourselves. But don’t fret! In this exercise, you will improve on your previous work with the count_entries() function in the last chapter by adding a try-except block to it. This will allow your function to provide a helpful message when the user calls your count_entries() function but provides a column name that isn’t in the DataFrame.
Once again, for your convenience, pandas has been imported as pd and the 'tweets.csv' file has been imported into the DataFrame tweets_df. Parts of the code from your previous work are also provided.
Add a try block so that when the function is called with the correct arguments, it processes the DataFrame and returns a dictionary of results.
Add an except block so that when the function is called incorrectly, it displays the following error message: 'The DataFrame does not have a ' + col_name + ' column.'.
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Add try block
    try:
        # Extract column from DataFrame: col
        col = df[col_name]
        # Iterate over the column in DataFrame
        for entry in col:
            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1
            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1
        # Return the cols_count dictionary
        return cols_count
    # Add except block
    except:
        print('The DataFrame does not have a ' + col_name + ' column.')
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
# Print result1
print(result1)
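One consequence of printing inside the except block is that the function then returns None. The sketch below illustrates this with a plain dictionary of lists standing in for the DataFrame (an assumption made so the example runs without pandas); it catches KeyError, which is what a missing key raises for the dictionary and what a missing column label raises for a DataFrame. The names fake_df, good, and bad are made up.

```python
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of occurrences as value for each key."""
    cols_count = {}
    try:
        col = df[col_name]
        for entry in col:
            if entry in cols_count:
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
        return cols_count
    except KeyError:
        # No return here, so the function falls through and returns None
        print('The DataFrame does not have a ' + col_name + ' column.')

# Hypothetical stand-in for tweets_df: a dict mapping column names to lists
fake_df = {'lang': ['en', 'en', 'et', 'und']}

good = count_entries(fake_df, 'lang')
bad = count_entries(fake_df, 'lang1')
print(good)  # {'en': 2, 'et': 1, 'und': 1}
print(bad)   # None
```

Code that uses the return value therefore has to be prepared for None after a bad column name.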
In the previous exercise, you built on your function count_entries() to add a try-except block. This was so that users would get helpful messages when calling your count_entries() function and providing a column name that isn’t in the DataFrame. In this exercise, you’ll instead raise a ValueError in the case that the user provides a column name that isn’t in the DataFrame.
Once again, for your convenience, pandas has been imported as pd and the 'tweets.csv' file has been imported into the DataFrame tweets_df. Parts of the code from your previous work are also provided.
If col_name is not a column in the DataFrame df, raise a ValueError 'The DataFrame does not have a ' + col_name + ' column.'.
Call your new function count_entries() to analyze the 'lang' column of tweets_df. Store the result in result1.
Print result1. This has been done for you, so hit ‘Submit Answer’ to check out the result. In the next exercise, you’ll see that it raises the necessary ValueErrors.
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Raise a ValueError if col_name is NOT in DataFrame
    if col_name not in df.columns:
        raise ValueError('The DataFrame does not have a ' + col_name + ' column.')
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Extract column from DataFrame: col
    col = df[col_name]
    # Iterate over the column in DataFrame
    for entry in col:
        # If entry is in cols_count, add 1
        if entry in cols_count.keys():
            cols_count[entry] += 1
        # Else add the entry to cols_count, set the value to 1
        else:
            cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
# Print result1
print(result1)
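The validate-then-count pattern can also be sketched with collections.Counter from the standard library doing the tallying. This is an alternative shown for illustration, not the exercise's solution. It assumes a plain dictionary of lists as a stand-in for the DataFrame, so the membership test is written as `col_name not in df` instead of checking `df.columns`; the names fake_df and error_message are made up.

```python
from collections import Counter

def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of occurrences as value for each key."""
    # Assumption: df is a dict stand-in here, so membership is tested on df
    # itself; the exercise checks df.columns on a real DataFrame
    if col_name not in df:
        raise ValueError('The DataFrame does not have a ' + col_name + ' column.')
    # Counter tallies each distinct entry in one pass
    return dict(Counter(df[col_name]))

# Hypothetical stand-in for tweets_df
fake_df = {'lang': ['en', 'en', 'et', 'und']}

result1 = count_entries(fake_df, 'lang')
print(result1)  # {'en': 2, 'et': 1, 'und': 1}

# A missing column raises the ValueError before any counting happens
try:
    count_entries(fake_df, 'lang1')
except ValueError as err:
    error_message = str(err)
    print(error_message)
```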
You have just written error handling into your count_entries() function so that, when the user passes the function a column (as 2nd argument) NOT contained in the DataFrame (1st argument), a ValueError is thrown. You’re now going to play with this function: it is loaded into pre-exercise code, as is the DataFrame tweets_df. Try calling count_entries(tweets_df, 'lang') to confirm that the function behaves as it should. Then call count_entries(tweets_df, 'lang1'): what is the last line of the output?
Well done. You’re now well on your way to being a Pythonista Data Science ninja.
You’re now able to write functions in Python that accept single and multiple arguments and can return as many values as you please. You’re also adept at using default and flexible arguments and keyword arguments. You’ve gained insight into scoping in Python, can write lambda functions and handle errors in your very own function writing practice. You’ve also gained invaluable practice in using all of these techniques to write functions that are useful in a Data Science context. You have come a long way in your developing practice as a budding Pythonista Data Scientist.
There are more basic skills that you will need to learn in Python to be valuable as a working Data Scientist, and many of these we’ll cover in the sequel to this course. So if you’re finding yourself still thirsty for more Pythonista Data Science chops, I’d head over there right now. There you’ll learn all about list comprehensions, which allow you to wrangle data in lists to create other lists, a tool utilized by all Data Scientists working in Python. You’ll also learn about iterators, which you have already seen in the context of for loops without having necessarily known it. Iterators are everywhere in PythonLand and, to put it simply, allow you to rapidly iterate Data Science protocols and procedures over sets of objects. These are a couple of the cool functionalities in PythonLand you’ll encounter in the sequel to this course, which will conclude with an entire chapter devoted to a case study in which you’ll apply, time and time again, techniques learnt in both of these courses.
I’m looking forward to seeing you there and congratulations once again!