Datacamp course notes on writing functions and error handling.
User-Defined Functions
Defining a function
Return
If we don’t want to print the value directly, but instead want to return it and assign it to a variable, use return.
Note: Assigning the result of a function that prints a value but does not return one leaves the variable holding None, which is of type NoneType.
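A minimal sketch of the difference, assuming two hypothetical functions, print_square and return_square, that are not part of the course code:

```python
def print_square(value):
    """Print the square of a value (there is no return statement)."""
    print(value ** 2)


def return_square(value):
    """Return the square of a value."""
    return value ** 2


printed = print_square(4)    # prints 16, but the call itself evaluates to None
returned = return_square(4)  # nothing is printed; the result is stored

print(type(printed))  # <class 'NoneType'>
print(returned)       # 16
```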
Docstrings
- describe what your function does
- serve as documentation for your function
- placed on the line immediately after the function header, between triple double quotes (""")
def square(value):
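As a sketch of how these pieces fit together, here is one way a documented square function might look. The body follows the version used in the scope examples later in these notes; the calls at the end are illustrative additions.

```python
def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value


# The docstring is stored on the function object; help(square) displays it too.
print(square.__doc__)  # Return the square of a value.
print(square(4))       # 16
```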
Multiple function parameters and returns
To have multiple parameters, simply accept more than one parameter when defining the function. When the function is called, the number of arguments passed should match the number of parameters.
To return multiple values, we use tuples.
Tuples:
- Like a list - can contain multiple values
- Immutable - values can’t be modified, which means you cannot update an element of the tuple with x[0] = a
- Constructed using parentheses ()
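A short sketch of immutability in practice (the variable names are illustrative): assigning to an element raises a TypeError, so a "changed" tuple has to be built as a new object.

```python
even_nums = (2, 4, 6)

try:
    even_nums[0] = 8  # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# To "change" a value, build a new tuple instead.
updated = (8,) + even_nums[1:]
print(updated)  # (8, 4, 6)
```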
Unpack a tuple into several variables:
even_nums = (2, 4, 6)
a, b, c = even_nums
print(a)  # prints 2
Access tuple elements with zero-based indexing, as with lists:
print(even_nums[1])  # prints 4
second_num = even_nums[1]
Example:
def raise_both(value1, value2):  # Function header
    """Raise value1 to the power of value2 and vice versa."""
    new_value1 = value1 ** value2  # Function body
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple
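A usage sketch for this example: the returned tuple can be unpacked directly into two names. The function is repeated so the snippet runs on its own; the names first and second are illustrative.

```python
def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    return (new_value1, new_value2)


# Unpack the two return values in one assignment.
first, second = raise_both(2, 3)
print(first, second)  # 8 9
```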
Case Study: Twitter language counts
# Define count_entries()
Scope and User-Defined Functions
Scope is the part of the program where an object or name is accessible, since not all objects are accessible everywhere in a script.
- Global scope: defined in the main body of a script or Python program
- Local scope: defined inside a function. Once the function finishes executing, any name inside the local scope ceases to exist
- Built-in scope: names in the pre-defined built-ins module that Python provides, such as print() and sum()
The sequence of scopes that Python searches when resolving a name is: local scope -> enclosing functions (if any) -> global scope -> built-in scope (the LEGB rule).
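The lookup order can be sketched with one name defined at several levels. All function and variable names here are made up for illustration.

```python
name = "global"


def outer():
    name = "enclosing"

    def inner():
        name = "local"
        return name  # found in the local scope first

    def inner_no_local():
        return name  # not local, so the enclosing function's name is used

    return inner(), inner_no_local()


def uses_global():
    return name  # no local or enclosing name, so the global one is used


def uses_builtin():
    return len("abc")  # len is defined neither here nor globally: built-in scope


print(outer())         # ('local', 'enclosing')
print(uses_global())   # global
print(uses_builtin())  # 3
```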
Examples
First we define a function:
def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value
new_value  # raises NameError
We cannot access the variable new_value outside the function, since this variable is defined only within the local scope of the function and is not defined globally.
Below, we define the name globally before defining and calling the function:
new_value = 10
def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value
square(3)  # returns 9
new_value  # still 10
- Anytime we refer to the name in the global scope, we access the global name.
- Anytime we refer to the name in the local scope of the function, Python looks first in the local scope (that’s why square(3) results in 9 instead of 10). If Python cannot find the name in the local scope, it will then, and only then, look in the global scope.
Below, we access new_val, which is defined globally, within the function square. Note that the global value accessed is the value at the time the function is called, not the value when the function is defined.
Thus, if we reassign new_val and call the function square again, we can see that the new value of new_val is accessed:
new_val = 10
def square(value):
    """Return the square of a value."""
    new_value2 = new_val ** 2  # refers to the name `new_val` in the global scope
    return new_value2
square(3)  # returns 100
new_val = 20
square(3)  # returns 400
What if we want to alter the value of a global name within a function call? We can use global to specify that:
new_val = 10
def square(value):
    """Return the square of a value."""
    global new_val  # this is the variable that we wish to access and alter
    new_val = new_val ** 2  # refers to the name `new_val` in the global scope
    return new_val
square(3)  # returns 100
new_val  # now 100
Another example:
num = 5
def func2():
    global num
    double_num = num * 2
    num = 6
    print(double_num)
func2()  # prints 10
num  # now 6
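For contrast, a sketch of what happens when the global declaration is left out. The function name without_global is hypothetical. Because the body assigns to num, Python treats num as local for the whole function, and reading it before the assignment fails.

```python
num = 5


def without_global():
    # Assigning to num anywhere in the body makes it local to this function,
    # so reading it before the assignment raises UnboundLocalError.
    double_num = num * 2
    num = 6
    return double_num


try:
    without_global()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # UnboundLocalError
print(num)      # 5, the global name is untouched
```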
Nested functions
Nesting helps when multiple similar computations are needed:
def mod2plus5(x1, x2, x3):
    """Returns the remainder plus 5 of three values."""
    def inner(x):
        """Returns the remainder plus 5 of a value."""
        return x % 2 + 5
    return (inner(x1), inner(x2), inner(x3))
print(mod2plus5(1, 2, 3))  # prints (6, 5, 6)
A nested function can also be used to return a function:
def raise_val(n):
    """Return the inner function."""
    def inner(x):
        """Raise x to the power of n."""
        raised = x ** n
        return raised
    return inner
square = raise_val(2)
cube = raise_val(3)
print(square(2), cube(4))  # prints 4 64
Using nonlocal to access and alter names in an enclosing scope:
def outer():
    """Prints the value of n."""
    n = 1
    def inner():
        nonlocal n  # like global, but accesses and alters the name in the enclosing scope
        n = 2
        print(n)
    inner()
    print(n)
outer()  # prints 2 twice: inner() changed n in outer's scope, so it is no longer 1
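A common use of nonlocal is keeping state between calls of a returned function. The sketch below is illustrative only; make_counter and increment are made-up names, not part of the course material.

```python
def make_counter():
    """Return a function that counts how many times it has been called."""
    count = 0

    def increment():
        nonlocal count  # rebind count in make_counter's scope, not a new local
        count += 1
        return count

    return increment


counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, for the same reason an assignment inside a function hides a global name.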
Default and Flexible Arguments
Add a default argument:
def power(number, pow=1):
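A minimal sketch of a complete function with a default argument. The body and the example calls are assumptions, since the notes show only the function header.

```python
def power(number, pow=1):
    """Raise number to the power of pow (pow defaults to 1)."""
    new_value = number ** pow
    return new_value


print(power(9, 2))      # 81
print(power(9))         # 9, pow falls back to its default of 1
print(power(9, pow=0))  # 1, the default can also be overridden by keyword
```

Note that the parameter name pow, kept from the header, shadows the built-in pow() inside the function.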
Flexible arguments
We use flexible arguments, written *args, when we do not know in advance how many arguments will be passed to the function:
def add_all(*args):
    """Sum all values in the *args together, irrespective of how many they are."""
    # Initialize sum
    sum_all = 0
    # Accumulate the sum
    for num in args:
        sum_all += num
    return sum_all
add_all(1)  # 1
add_all(1, 2)  # 3
add_all(5, 10, 15, 20)  # 50
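Inside the function, args is an ordinary tuple. The sketch below repeats add_all so it runs on its own and adds a hypothetical helper, show_args, purely to make that visible. It also shows the reverse direction: spreading an existing list into separate arguments with *.

```python
def add_all(*args):
    """Sum all values in *args, however many there are."""
    sum_all = 0
    for num in args:
        sum_all += num
    return sum_all


def show_args(*args):
    """Hypothetical helper: return args unchanged to show what it is."""
    return args


print(show_args(1, 2, 3))  # (1, 2, 3)

values = [5, 10, 15, 20]
print(add_all(*values))  # 50, the * spreads the list into separate arguments
print(add_all())         # 0, calling with no arguments is also allowed
```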
We can also pass an arbitrary number of keyword arguments, that is, arguments preceded by identifiers, with **kwargs. Inside the function, kwargs is a dictionary:
# Define report_status
def report_status(**kwargs):
    """Print out the status of a movie character."""
    print("\nBEGIN: REPORT\n")
    # Iterate over the key-value pairs of kwargs
    for key, value in kwargs.items():
        # Print out the keys and values, separated by a colon ':'
        print(key + ": " + value)
    print("\nEND REPORT")
# First call to report_status()
report_status(name='luke',
              affiliation='jedi',
              status='missing')
# Second call to report_status()
report_status(name="anakin",
              affiliation="sith lord",
              status="deceased")
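A small sketch of kwargs as a dictionary, using a hypothetical helper named describe. It formats with an f-string because key + ": " + value, as in report_status, only works when every value is a string.

```python
def describe(**kwargs):
    """Hypothetical helper: return one 'key: value' line per keyword argument."""
    lines = []
    for key, value in kwargs.items():
        # An f-string formats any value; string concatenation needs a str.
        lines.append(f"{key}: {value}")
    return lines


print(describe(name="luke", age=19))  # ['name: luke', 'age: 19']

# An existing dictionary can be passed as keyword arguments with **.
status = {"name": "anakin", "status": "deceased"}
print(describe(**status))  # ['name: anakin', 'status: deceased']
```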
Case study
Generalize the previous function to count the occurrences in any column of the DataFrame:
# Define count_entries()
def count_entries(df, *args):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Iterate over column names in args
    for col_name in args:
        # Extract column from DataFrame: col
        col = df[col_name]
        # Iterate over the column in DataFrame
        for entry in col:
            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1
            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
# Call count_entries(): result2
result2 = count_entries(tweets_df, 'lang', 'source')
# Print result1 and result2
print(result1)
print(result2)
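The counting logic can be tried without the course's Twitter data. The sketch below assumes a plain dictionary of lists (fake_df, with made-up values) in place of tweets_df. It supports the same df[col_name] lookup, but it is not a pandas DataFrame. The if/else is swapped for dict.get, which is a stylistic variation and not the course's version.

```python
def count_entries(df, *args):
    """Return a dictionary with counts of occurrences as value for each key."""
    cols_count = {}
    for col_name in args:
        col = df[col_name]
        for entry in col:
            # dict.get returns 0 for unseen entries, replacing the if/else
            cols_count[entry] = cols_count.get(entry, 0) + 1
    return cols_count


# Made-up stand-in for tweets_df: column name -> list of values.
fake_df = {
    "lang": ["en", "en", "et", "en"],
    "source": ["web", "phone", "web", "web"],
}

print(count_entries(fake_df, "lang"))
# {'en': 3, 'et': 1}
print(count_entries(fake_df, "lang", "source"))
# {'en': 3, 'et': 1, 'web': 3, 'phone': 1}
```

Counts from every requested column land in the same dictionary, so a value that appears in two columns would be counted under a single key.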
Lambda Functions & Error-Handling
Lambda Functions
Lambda allows you to write a function in a quick and potentially dirty way.
raise_to_power = lambda x, y: x ** y
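A usage sketch: the lambda is called like any other function. The def version, raise_to_power_def, is a hypothetical name added only for comparison.

```python
raise_to_power = lambda x, y: x ** y

print(raise_to_power(2, 3))  # 8


# The same function written with def, for comparison.
def raise_to_power_def(x, y):
    return x ** y


print(raise_to_power_def(2, 3))  # 8
```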
map()
- takes two arguments: map(func, seq)
- applies the function to all elements in the sequence
- In this case, the function does not even need to have a name, and is thus referred to as an anonymous function.
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)
print(square_all)  # only shows that this is a map object, not its contents
print(list(square_all))  # converting to a list makes the results printable: [2304, 36, 81, 441, 1]
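One consequence worth sketching: in Python 3 a map object is an iterator, so it can be consumed only once. The names first_pass and second_pass are illustrative.

```python
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)

first_pass = list(square_all)   # consumes the iterator
second_pass = list(square_all)  # nothing is left to yield

print(first_pass)   # [2304, 36, 81, 441, 1]
print(second_pass)  # []
```

If the results are needed more than once, store the list rather than the map object.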
filter()
- filters out elements from a list that don’t satisfy certain criteria
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda member: len(member) > 6, fellowship)
# Convert result to a list: result_list
result_list = list(result)
# Print result_list
print(result_list)
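For reference, a sketch of what this filter keeps, next to an equivalent list comprehension. The comprehension is an alternative spelling added here for comparison, not part of the course exercise.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Keep only the names longer than six characters.
result_list = list(filter(lambda member: len(member) > 6, fellowship))
comprehension = [member for member in fellowship if len(member) > 6]

print(result_list)                   # ['samwise', 'aragorn', 'legolas', 'boromir']
print(result_list == comprehension)  # True
```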
reduce()
- useful for performing some computation on a list and returning a single value as a result
- needs to be imported from the functools module before use
# Import reduce from functools
from functools import reduce
# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'eddard', 'jon']
# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda item1, item2: item1 + item2, stark)
# Print the result
print(result)  # prints robbsansaaryaeddardjon
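A sketch of how reduce folds the list step by step, plus the optional starting value. The numeric example and the names total and empty_total are illustrative additions.

```python
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'eddard', 'jon']

# reduce combines the first two items, then combines that result with the
# third item, and so on: (((robb + sansa) + arya) + eddard) + jon
result = reduce(lambda item1, item2: item1 + item2, stark)
print(result)  # robbsansaaryaeddardjon

# An optional third argument gives a starting value, which is also what
# reduce returns for an empty sequence.
total = reduce(lambda a, b: a + b, [1, 2, 3, 4], 0)
empty_total = reduce(lambda a, b: a + b, [], 0)
print(total, empty_total)  # 10 0
```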
Error Handling
We should endeavor to produce useful error messages for the functions that we write: catch exceptions during execution with a try & except clause
- Runs the code following try
- If there’s an exception, runs the code following except
def sqrt(x):
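A minimal sketch of the try/except pattern just described, assuming a sqrt function like the one named here. The body is an assumption, since only the header appears in these notes; except Exception is used so that any ordinary error raised in the try block is caught.

```python
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except Exception:
        # Any ordinary error raised in the try block lands here.
        print('x must be an int or float')


print(sqrt(4))        # 2.0
print(sqrt('hello'))  # prints the message, then None (nothing was returned)
```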
If we only wish to catch type errors and let other errors pass through (more exception types can be specified; refer to the online documentation):
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')
If we don’t want our function to work when some specific criteria are met, we can manually raise an error with an if clause (e.g. the input must be non-negative):
def sqrt(x):
    """Returns the square root of a number."""
    if x < 0:
        raise ValueError('x must be non-negative')
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')
sqrt(-2)  # raises ValueError with the message 'x must be non-negative'
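In this version the x < 0 comparison runs before the try block, so a string argument raises a TypeError that the except branch never sees. The sketch below is a variation, not the course's code: it moves the check inside try and shows a caller catching the ValueError. The name caught is illustrative.

```python
def sqrt(x):
    """Returns the square root of a number."""
    try:
        # The comparison sits inside try so that a non-numeric x is also
        # reported through the TypeError branch.
        if x < 0:
            raise ValueError('x must be non-negative')
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')


# The ValueError is not handled inside sqrt, so the caller can catch it.
try:
    sqrt(-2)
    caught = None
except ValueError as err:
    caught = str(err)

print(caught)      # x must be non-negative
print(sqrt(9))     # 3.0
print(sqrt('hi'))  # prints the TypeError message, then None
```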
Case Study
Selecting all the retweets
# Select retweets from the Twitter DataFrame: result
result = filter(lambda x: x[0:2] == 'RT', tweets_df['text'])
# Create list from filter object result: res_list
res_list = list(result)
# Print all retweets in res_list
for tweet in res_list:
    print(tweet)
Add a try-except block to the function defined in the previous case study:
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    cols_count = {}
    # Add try block
    try:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
        return cols_count
    # Add except block
    except:
        print('The DataFrame does not have a ' + col_name + ' column.')
# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')
print(result1)
# Call count_entries(): result2
result2 = count_entries(tweets_df, 'lang1')  # prints the message from the except block
Raise a ValueError with an if clause:
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Raise a ValueError if col_name is NOT in DataFrame
    if col_name not in df.columns:
        raise ValueError('The DataFrame does not have a ' + col_name + ' column.')
    cols_count = {}
    col = df[col_name]
    for entry in col:
        if entry in cols_count.keys():
            cols_count[entry] += 1
        else:
            cols_count[entry] = 1
    return cols_count
result1 = count_entries(tweets_df, 'lang')
print(result1)
count_entries(tweets_df, 'lang1')  # ValueError with specified message