This lecture explains modules and loops, with a brief introduction to input/output processes in Python. Ideally, modules would have been part of the previous lecture (on Python functions). The split was, however, necessary to keep lecture 6 at a manageable size.

Python modules

We have already used Python modules extensively in the past lectures, homework, and quizzes, although we never discussed them. To put it simply, a Python module is a collection of Python definitions (variables, functions, …) that can be reused as a library in the future.

The import statement

We have already used the math module on multiple occasions, using the import statement. Here is an example:

In [11]: import math

In [12]: value = math.factorial(5)

In [13]: print(value)
120

In [14]: math.pi
Out[14]: 3.141592653589793

In [15]: math.e
Out[15]: 2.718281828459045


In its simplest form, the import has the following syntax:

import module1[, module2[, ... moduleN]]


like,

import math, cmath, numpy


The standard approach for accessing the names and definitions (variables, functions, …) inside a module is to use the module name as a prefix, as in the examples above. To use the module's names without the prefix, use the following form of the import statement,

In [16]: from math import *

In [17]: factorial(5)
Out[17]: 120


To import only specific names, use the following format,

from math import pi,e,factorial,erf


This will import the four names pi, e, factorial, and erf from the math module. You could also rename the imported module, or specific names from it, upon importing it into your code, using the import ... as statement,

In [16]: import numpy as np

In [17]: np.double(5)
Out[17]: 5.0

In [20]: from numpy import double as dble

In [21]: dble(13)
Out[21]: 13.0


A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement.

Also, note that the practice of from mod_name import * is in general discouraged, since it often leads to poorly readable code. It is, however, very useful for saving time and extra typing in interactive sessions such as IPython or Jupyter.
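
One reason wildcard imports hurt readability is name shadowing. As a minimal sketch (using the standard math and cmath modules, which both define sqrt), importing the same name from two modules silently rebinds it. A from math import * followed by from cmath import * would do the same for every name the two modules share:

```python
from math import sqrt    # sqrt now refers to math.sqrt
real_root = sqrt(4)

from cmath import sqrt   # same name: sqrt now refers to cmath.sqrt
complex_root = sqrt(4)

print(real_root)     # 2.0
print(complex_root)  # (2+0j)
```

With the module-name prefix (math.sqrt, cmath.sqrt) there is never any doubt about which function is being called.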


Listing all names in an imported module

To get a list of all available names in an imported module, use the dir() function.

In [11]: import math

In [13]: dir(math)
Out[13]:
['__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'ceil',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'e',
 'erf',
 'erfc',
 'exp',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'hypot',
 'inf',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'log1p',
 'log2',
 'modf',
 'nan',
 'pi',
 'pow',
 'radians',
 'sin',
 'sinh',
 'sqrt',
 'tan',
 'tanh',
 'trunc']


Python standard Modules

Python comes with a set of standard modules as its library, the so-called Python Standard Library. Some of these modules are built into the Python interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, for efficiency and other reasons.

Creating modules

To make a Python module, simply collect all the functions that constitute the module in one single file with a given filename, for example, mymodule.py. This file will automatically be a module, named mymodule, from which you can import functions and definitions in the standard way described above.

Why and when do you need to create a module?

Sometimes you want to reuse a function from an old program in a new program. The simplest way to do this is to copy and paste the old source code into the new program. However, this is not good programming practice, because you then over time end up with multiple identical versions of the same function. When you want to improve the function or correct a bug, you need to remember to do the same update in all files with a copy of the function, and in real life most programmers fail to do so. You easily end up with a mess of different versions with different quality of basically the same code. Therefore, a golden rule of programming is to have one and only one version of a piece of code. All programs that want to use this piece of code must access one and only one place where the source code is kept. This principle is easy to implement if we create a module containing the code we want to reuse later in different programs.

Note that modules can import other modules. It is customary but not required to place all import statements at the beginning of a module (or script, for that matter). The imported module names are placed in the importing module’s global symbol table.

Executing modules as scripts

When a Python module is called from the Bash command prompt like,

python mycode.py


the code in the module will be executed, just as if you had imported it inside another program. This is good, but can sometimes become problematic. Let’s explain this with an example from the midterm exam, a script that finds and reports all prime numbers smaller than a given input number $n$.

When you execute this code as a standalone Python script, it will ask you for an integer and then report all prime numbers that are smaller than the input number. Now suppose you wanted to import this script as a Python module into your code. If you do so, the Python interpreter will run all statements in the script as part of the import, including the ones that ask you to input an integer.

In [5]: import find_primes
Enter an integer number:
n = 13

 Here is a list of all prime numbers smaller than 13:
13
11
7
5
3
2


This is not necessarily what we want. For example, we may only want to use the functions get_primes and is_prime in this script, without asking the user to input an integer and finding all smaller primes. The solution is to put the part of the script that we do not want to be executed on import, that is,

print('Enter an integer number: ')
n = int(input('n = '))
print('\n Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)


inside the following if-block,

if __name__ == "__main__":
    print('Enter an integer number: ')
    n = int(input('n = '))
    print('Here is a list of all prime numbers smaller than {}:'.format(n))
    get_primes(n)


When the code is run as a standalone script, the __name__ variable of the code is set to '__main__'. However, when the script is imported as a module into another program, __name__ is automatically set to the name of the module, here find_primes. Thus, on import, the above if-block will not be executed, but the rest of the code (the two functions) will be properly imported. The corrected script is named mod_find_primes.py and can be downloaded from here.

In [6]: import mod_find_primes
In [7]: mod_find_primes.__name__
Out[7]: 'mod_find_primes'
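
The whole pattern can be sketched in a single file. The function bodies below are my own assumption and not the contents of mod_find_primes.py. Also, get_primes here returns a list (largest prime first, including n itself when n is prime, as in the sample output above) instead of printing:

```python
def is_prime(n):
    """Return True if the integer n is a prime number."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:   # divisors up to sqrt(n) are enough
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def get_primes(n):
    """Return all primes not exceeding n, largest first."""
    return [k for k in range(n, 1, -1) if is_prime(k)]


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not on import.
    print(get_primes(13))
```

Importing such a file only defines the two functions; running it as a script additionally executes the guarded block.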


You could also import specific names or functions from your own module, for example,

In [11]: from mod_find_primes import is_prime


In summary,

Add test blocks in your modules

It is recommended to only have functions and not any statements outside functions in a module. The reason is that the module file is executed from top to bottom during the import. With function definitions only in the module file, and no main program, there will be no calculations or output from the import, just definitions of functions. But in case you need to write a module that can be run standalone, then put all script statements for the standalone part of the module inside a test block (the if-block described above).


Command line arguments

Test blocks are especially useful when your module can also be run as a standalone Python script that takes command-line arguments. Here is a modified version of the mod_find_primes module, now named cmd_find_primes, that reads the integer from the Bash command line instead of using the input() function. To do so, you need to modify the last part of the original module to the following, using Python’s standard sys module,

if __name__ == "__main__":
    import sys
    if len( sys.argv ) != 2: # sys.argv[0] is the script name, so one user-supplied argument gives a length of exactly 2.
        print('''
    Error: Exactly two arguments must be given on the command line.
    Usage:''')
        print("     ", sys.argv[0], "<a positive integer number>", '\n')
        sys.exit('     Program stopped.\n')
    else:
        n = int(sys.argv[1])
        print('Here is a list of all prime numbers smaller than {}:'.format(n))
        get_primes(n)


Now if you run this code, from the Bash command line, or inside IPython, like the following,

In [14]: run cmd_find_primes.py

    Error: Exactly two arguments must be given on the command line.
    Usage:
      cmd_find_primes.py <a positive integer number>

An exception has occurred, use %tb to see the full traceback.

SystemExit:      Program stopped.


The code expects you to enter an integer right after the name of the script,

In [15]: run cmd_find_primes.py 13
Here is a list of all prime numbers smaller than 13:
13
11
7
5
3
2


In general, I recommend using the sys module for input arguments instead of Python’s input() function.
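
One way to keep such a script easy to test is to move the argument handling into a function that receives the argument list. The following is only a sketch; the name parse_limit is hypothetical and not part of cmd_find_primes.py:

```python
def parse_limit(argv):
    """Return the integer given as the single command-line argument.

    argv is expected to look like sys.argv: the script name followed
    by exactly one user-supplied argument.
    """
    if len(argv) != 2:
        raise SystemExit("Usage: {} <a positive integer number>".format(argv[0]))
    try:
        return int(argv[1])
    except ValueError:
        raise SystemExit("Error: {!r} is not an integer.".format(argv[1]))


# Simulate the command line "cmd_find_primes.py 13" with an explicit list.
print(parse_limit(["cmd_find_primes.py", "13"]))  # 13
```

Inside a test block you would call parse_limit(sys.argv) after import sys, so the same function serves both the command line and any other code that imports the module.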

Modules and main functions

If you have some functions and a main program in some program file, just move the main program to the test block. Then the file can act as a module, giving access to all the functions in other files, or the file can be executed from the command line, in the same way as the original program.


Test blocks for module code verification

It is a good programming habit to let the test block do one or more of three things:

  1. provide information on how the module or program is used,
  2. test if the module functions work properly,
  3. offer interaction with users such that the module file can be applied as a useful program.

To achieve the second task, we have to write functions that verify the implementation in a module. The general advice is to write test functions that,

  1. have names starting with test_,
  2. express the success or failure of a test through a boolean variable, say success,
  3. run assert success, msg to raise an AssertionError with an optional message msg in case the test fails.
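
The three conventions above can be sketched as follows. The function double is only a stand-in for a real module function and is my own assumption:

```python
def double(x):
    """Function under test (a stand-in for a real module function)."""
    return 2 * x


def test_double():
    """Check double() against a value computed by hand."""
    expected = 8
    computed = double(4)
    success = computed == expected
    msg = "double(4) returned {}, expected {}".format(computed, expected)
    assert success, msg


test_double()  # silent if the test passes, AssertionError otherwise
```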

We will talk about this later in this course.

Doc-strings in modules

It is a good habit to include a doc-string at the beginning of your module file. This doc-string should explain the purpose and use of the module.


Scope of definitions in your module

Once you have created your module, you can import it into your program just like any other module, for example,

In [22]: import cmd_find_primes

In [23]: dir(cmd_find_primes)
Out[23]:
['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'get_primes',
 'is_prime']


However, more often than not, you may want to have variables in your module that are only to be used inside the module and not accessed by the user. The convention is to start the names of these variables with an underscore. For example,

_course = "Python programming"


This, however, does not prevent the import of the variable _course into your code from the module containing it. One solution is to delete, at the end of the module, the variables that we do not want the user to have access to,

del _course


such that the module containing the above statement will give,

In [28]: import mod_cmd_find_primes_del

In [29]: dir( mod_cmd_find_primes_del )
Out[29]:
['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'get_primes',
 'is_prime']


However, note that if you import all definitions in your module as standalone definitions like the following,

In [4]: from mod_cmd_find_primes_all import *

In [5]: dir()
Out[5]:
['In',
 'Out',
 '_',
 '_3',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_sh',
 'exit',
 'get_ipython',
 'get_primes',
 'is_prime',
 'quit']


you see that the variable _course is not imported, because from ... import * skips names that begin with an underscore when the module does not define __all__. In general, to avoid confusion, it is best to define an __all__ variable in your module, which contains a list of all variable and function names that are to be imported as standalone definitions using from mymodule import *. For example, add the following to the above module,

__all__ = ['get_primes']


Upon importing this module with from ... import *, only the function get_primes will now be imported, and not _course or is_prime.
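
The effect of __all__ can be checked without touching your own files. The sketch below writes a throwaway module (the name demo_all_module and its function bodies are assumptions made for illustration) into a temporary folder and then inspects which names a wildcard import creates:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a small throwaway module (hypothetical name: demo_all_module).
source = '''
__all__ = ["get_primes"]

_course = "Python programming"

def is_prime(n):
    return n > 1 and all(n % k for k in range(2, int(n ** 0.5) + 1))

def get_primes(n):
    return [k for k in range(n, 1, -1) if is_prime(k)]
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "demo_all_module.py").write_text(source)
    sys.path.insert(0, folder)        # make the folder importable
    importlib.invalidate_caches()
    try:
        namespace = {}
        # Run the wildcard import inside a separate namespace so that
        # we can inspect exactly which names it created.
        exec("from demo_all_module import *", namespace)
    finally:
        sys.path.remove(folder)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['get_primes']
```

Note that __all__ only affects the wildcard form; import demo_all_module would still give access to demo_all_module.is_prime.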

The path to your modules

When you create a module, if it is in the current directory of your code, then it will be automatically found by the Python interpreter. This is, however, not generally the case if your module lives in a directory other than the current working directory of the Python interpreter. To add the module’s directory to the search path of your Python interpreter, use the following,

In [5]: myModuleFolder = 'the path to your module'

In [6]: import sys

In [7]: sys.path
Out[7]:
['',
 'C:\\Program Files\\Anaconda3\\Scripts',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\lmfit-0.9.5_44_gb2041c3-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\python35.zip',
 'C:\\Program Files\\Anaconda3\\DLLs',
 'C:\\Program Files\\Anaconda3\\lib',
 'C:\\Program Files\\Anaconda3',
 'c:\\program files\\anaconda3\\lib\\site-packages\\setuptools-20.3-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Sphinx-1.3.5-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32\\lib',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Pythonwin',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\Amir\\.ipython']

In [8]: sys.path.insert(0,myModuleFolder)

In [9]: sys.path
Out[9]:
['the path to your module',
 '',
 'C:\\Program Files\\Anaconda3\\Scripts',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\lmfit-0.9.5_44_gb2041c3-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\python35.zip',
 'C:\\Program Files\\Anaconda3\\DLLs',
 'C:\\Program Files\\Anaconda3\\lib',
 'C:\\Program Files\\Anaconda3',
 'c:\\program files\\anaconda3\\lib\\site-packages\\setuptools-20.3-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Sphinx-1.3.5-py3.5.egg',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32\\lib',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Pythonwin',
 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\Amir\\.ipython']


In the above, we added the path to our module to the list of all paths that the Python interpreter searches in order to find a module requested for import. (Note that 'the path to your module' is not a real system path; it is just a placeholder.)
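
The same steps can be tried end to end with a temporary folder standing in for the module's directory. This is a sketch only; the module name demo_path_module and its content are made up for the example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Create a tiny module in a folder that is not on the search path.
    Path(folder, "demo_path_module.py").write_text("answer = 42\n")

    sys.path.insert(0, folder)       # searched first from now on
    importlib.invalidate_caches()    # make sure the new file is noticed
    try:
        import demo_path_module
    finally:
        sys.path.remove(folder)      # leave sys.path as we found it

print(demo_path_module.answer)  # 42
```

Inserting at position 0 gives the new folder priority over all other locations; sys.path.append would instead place it last in the search order.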

The collections module

One of the greatest strengths of Python as a scientific programming language is that, for almost any task you can imagine, someone has already written the code, so there is no reason to reinvent the wheel. Throughout your career you will get to know many of the most important modules for your own domain of science. Here I will introduce only one general-purpose module, collections, which has some interesting and rather useful functionality. Specifically, this module contains additional data types, beyond Python’s builtin ones, that can be very handy at times.

The Counter data type

The Counter class from the collections module takes in a list (or any other iterable) and creates a dictionary-like object whose keys are the unique elements of the input list and whose values are the number of times each key appears in the list. For example,

from collections import Counter
mylist = [1,1,1,2,3,34,45,34,34,7,8,34,3,3,6,4,4,4,0,34,9,0]
c = Counter(mylist)
c
Counter({0: 2, 1: 3, 2: 1, 3: 3, 4: 3, 6: 1, 7: 1, 8: 1, 9: 1, 34: 5, 45: 1})

There are basically three ways of creating a Counter,

c1 = Counter(['a', 'b', 'c', 'a', 'b', 'b']) # input a list directly into Counter
c2 = Counter({'a':2, 'b':3, 'c':1}) # Give it the Counter dictionary
c3 = Counter(a=2, b=3, c=1) # or simply give it the counts
c1 == c2 == c3
True

What is Counter useful for?

Suppose you have a long string of letters, and for some reason you need to count the number of times each letter appears in it. You can achieve this as in the following example,

s = 'amirshahmoradijakelucerotravismike'
c = Counter(s)
for key in c.keys():
    print('The letter {} appears only {} times in the string'.format(key,c[key]))
The letter v appears only 1 times in the string
The letter a appears only 5 times in the string
The letter u appears only 1 times in the string
The letter l appears only 1 times in the string
The letter j appears only 1 times in the string
The letter d appears only 1 times in the string
The letter h appears only 2 times in the string
The letter o appears only 2 times in the string
The letter i appears only 4 times in the string
The letter k appears only 2 times in the string
The letter c appears only 1 times in the string
The letter t appears only 1 times in the string
The letter s appears only 2 times in the string
The letter m appears only 3 times in the string
The letter r appears only 4 times in the string
The letter e appears only 3 times in the string

Now suppose you wanted to count the number of times different words appear in a given text,

text = "Engineering Computation Lab (COE111L) is a new course that is offered by the department of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin, starting Spring 2017. "
c = Counter(text.split())
for word in c.keys():
    print('The word "{}" appears only {} times in the text'.format(word,c[word]))
The word "Computation" appears only 1 times in the text
The word "a" appears only 1 times in the text
The word "Engineering" appears only 3 times in the text
The word "the" appears only 2 times in the text
The word "(COE111L)" appears only 1 times in the text
The word "offered" appears only 1 times in the text
The word "is" appears only 2 times in the text
The word "at" appears only 2 times in the text
The word "of" appears only 2 times in the text
The word "Lab" appears only 1 times in the text
The word "course" appears only 1 times in the text
The word "department" appears only 1 times in the text
The word "by" appears only 1 times in the text
The word "and" appears only 1 times in the text
The word "Texas" appears only 1 times in the text
The word "Mechanics" appears only 1 times in the text
The word "2017." appears only 1 times in the text
The word "new" appears only 1 times in the text
The word "University" appears only 1 times in the text
The word "starting" appears only 1 times in the text
The word "Austin," appears only 1 times in the text
The word "that" appears only 1 times in the text
The word "Spring" appears only 1 times in the text
The word "Aerospace" appears only 1 times in the text  

You can also apply all the methods that exist for the Counter data type to the variable c in the above case. For example, you could ask for the 3 most common words in the text,

c.most_common(3)
[('Engineering', 3), ('the', 2), ('is', 2)]
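
Notice that splitting on whitespace leaves punctuation attached to the words ("Austin," and "2017." above) and treats differently capitalized words as distinct. A minimal sketch of one way to normalize the words before counting is given below; the sample sentence is made up for the example:

```python
from collections import Counter
import string

text = "Engineering is fun, and engineering is hard. Engineering pays."

# Strip surrounding punctuation and ignore capitalization before counting.
words = [w.strip(string.punctuation).lower() for w in text.split()]
counts = Counter(words)

print(counts.most_common(1))  # [('engineering', 3)]
print(counts["is"])           # 2
print(counts["physics"])      # 0, a missing key counts as zero
```

Unlike a plain dictionary, a Counter returns 0 for a key it has never seen instead of raising a KeyError.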

The OrderedDict data type

This is also a subclass of the dictionary data type. It provides all the methods provided by dict, and it also retains the order in which elements are added to the dictionary. (The collections module also offers defaultdict, a dictionary that assigns a default value to any key that does not yet exist and automatically adds it to the dictionary.)

In Python versions before 3.7, a normal dictionary was not guaranteed to conserve the order in which elements were added to it, as the following output from an older interpreter shows. Since Python 3.7, a plain dict preserves insertion order as well,

d = {5:5,3:3,6:6,1:1}
for i,j in d.items():
    print(i,j)


1 1
3 3
5 5
6 6

To preserve the order of the elements, you can use OrderedDict,

from collections import OrderedDict as od
d = od([(5,5),(3,3),(6,6),(1,1)])
for i,j in d.items():
    print(i,j)
5 5
3 3
6 6
1 1

Keep in mind that two ordered dictionaries with the same content are not necessarily equal, since the order of their content also matters.
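
A short sketch of this order-sensitive comparison, using two small dictionaries made up for the example:

```python
from collections import OrderedDict

a = OrderedDict([(5, 5), (3, 3)])
b = OrderedDict([(3, 3), (5, 5)])

print(a == b)              # False: same items, different order
print(dict(a) == dict(b))  # True: plain dicts ignore order when comparing
print(a == dict(b))        # True: comparing with a plain dict ignores order
```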


The timeit module

This is a module that provides some useful functions for timing the performance and speed of pieces of your Python code.

import timeit as tt
tt.timeit('"-".join(str(n) for n in range(100))', number=10000)

The first input to the timeit function above is the operation which we would like to time, given as a string of Python code (if you pass the result of the expression instead of the code itself, timeit never runs the operation you meant to time). The second input tells the function how many times to repeat the task (if the operation takes a tiny amount of time, you would want to repeat it many times in order to get a sensible timing). The function returns the total time in seconds for all repetitions, which depends on your machine. Here is the same operation as above, but now using the map function,

tt.timeit('"-".join(map(str, range(100)))', number=10000)

In IPython or Jupyter, you can do the timing operation in a smarter way using IPython magic function %timeit,

%timeit "-".join(str(n) for n in range(100))
10000 loops, best of 3: 36.6 µs per loop

IPython’s magic function automatically figures out how many times it should run the operation to get a sensible timing.

%timeit "-".join( map(str,range(100)))
10000 loops, best of 3: 21 µs per loop

As you can see in the above example, the function map can be noticeably faster than the equivalent explicit for-style iteration, here a generator expression (about 21 µs versus 37 µs per loop).
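
Outside IPython, the same comparison can be sketched with timeit.timeit, which accepts the code to be timed either as a string or as a callable that takes no arguments. The variable names below are my own; the measured values depend on the machine, so none are shown:

```python
import timeit

# Statements are passed as strings so that timeit can run them repeatedly.
generator_stmt = '"-".join(str(n) for n in range(100))'
map_stmt = '"-".join(map(str, range(100)))'

t_generator = timeit.timeit(generator_stmt, number=10000)
t_map = timeit.timeit(map_stmt, number=10000)

# A callable without arguments works as well.
t_callable = timeit.timeit(lambda: "-".join(map(str, range(100))), number=10000)

print("generator: {:.4f} s, map: {:.4f} s, callable: {:.4f} s".format(
    t_generator, t_map, t_callable))
```

Each returned value is the total time for all 10000 repetitions, so dividing by number gives the time per run.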


The time module

More generally, if you want to measure the CPU time spent on a specific part of your code, you can use the process_time() function from the time module (older code uses time.clock() for this purpose, but that function was removed in Python 3.8),

import time
# do some work
t0 = time.process_time()   # get the initial CPU time
# do some further work which you want to time
t1 = time.process_time()   # get the final CPU time
cpu_time = t1 - t0         # This is the CPU time spent on the task being timed.


The time.process_time() function returns the CPU time used by the program; only the difference between two calls is meaningful. If the interest is in the total elapsed time, also including reading and writing files, time.time() is the appropriate function to call. Now suppose you had a list of functions that performed the same task, but using different methods, and you wanted to time their performance. Since functions are ordinary objects in Python, making a list of functions is no more special than making a list of strings or numbers. You can therefore create a list of functions and call them one by one inside a loop, timing each one in turn.

import time
# func1, ..., func10 are assumed to be defined elsewhere
functions = [func1, func2, func3, func4, func5, func6, func7, func8, func9, func10]
timings = [] # timings[i] holds CPU time for functions[i]
for function in functions:
    t0 = time.process_time()
    function(*inputs)   # inputs: a tuple holding the input variables
    t1 = time.process_time()
    cpu_time = t1 - t0
    timings.append(cpu_time)
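
A runnable version of this pattern is sketched below with three made-up functions that all compute the sum 0 + 1 + ... + (n-1). The function names are my own and serve only as an illustration; CPU time is measured with time.process_time():

```python
import time


def sum_with_loop(n):
    total = 0
    for k in range(n):
        total += k
    return total


def sum_with_builtin(n):
    return sum(range(n))


def sum_with_formula(n):
    return n * (n - 1) // 2


functions = [sum_with_loop, sum_with_builtin, sum_with_formula]
timings = []   # timings[i] holds the CPU time for functions[i]
results = []   # results[i] holds the value returned by functions[i]
for function in functions:
    t0 = time.process_time()
    results.append(function(100000))
    t1 = time.process_time()
    timings.append(t1 - t0)

for function, cpu_time in zip(functions, timings):
    print("{:<18s} {:.6f} s".format(function.__name__, cpu_time))
```

For very fast functions a single call is too short to time reliably; in that case the timeit module, which repeats the call many times, is the better tool.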


Loops in Python

We have already seen, both in homework and the midterm, what a pain it can be to repeat a certain number of tasks using recursive functions and if-blocks. Fortunately, Python has loop statements that can greatly simplify the task of repeating certain statements a certain number of times.

While loop

One such statement is the while-loop:

while this_logical_statement_holds_true : 
    perform_statements


For example, here is a code that prints all positive integers smaller than a given input integer,

n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
while n > 1:
    n -= 1
    print(n)
input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1

Another useful way of writing while-loops is the following (using the example above),

n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
while True:
    n -= 1
    print(n)
    if n == 1: break
input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1

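In this case, the loop will continue forever, unless the condition n==1 is met at some point during the iteration. Note that for an input of 1 or less, n never equals 1 after the first decrement, so this loop never terminates.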
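
A slightly more defensive variant is sketched below: the exit condition is tested before decrementing and uses <= instead of ==, so the loop also terminates for inputs of 1 or less. The function name countdown is hypothetical, and the values are collected in a list instead of being printed:

```python
def countdown(n):
    """Return all positive integers smaller than n, largest first."""
    values = []
    while True:
        if n <= 1:      # nothing left to report: leave the loop
            break
        n -= 1
        values.append(n)
    return values


print(countdown(7))  # [6, 5, 4, 3, 2, 1]
print(countdown(1))  # []
print(countdown(0))  # []
```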

For loop

If you come from a Fortran, C, or C++ background, you may be more accustomed to counting loops than to while-loops. Python does not have a direct construct for counting loops; however, there is a for-loop syntax that loops over the elements of a list, tuple, or other sequence. For example, if we wanted to rewrite the above code using a for-loop, one solution would be like the following,

n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
my_range = range(n-1,0,-1)
for n in my_range:
    print(n)
input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1

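Here, Python’s builtin function range([start,] stop [, step]) creates a sequence of integers that starts from start and runs up to, but not including, stop, with a distance of size step between the elements. In Python 3, range returns a lazy range object rather than a list; wrap it in list() when an actual list is needed. Here is another way of doing the same thing as in the above example, this time building the list explicitly,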

n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
mylist = list(range(n-1,0,-1))
for n in mylist:
    print(n)
input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1

Note how I have used the range function in order to get the same output as in the previous example.

n = int(input('input a positive integer: '))
mylist = list(range(n-1,0,-1))
print(mylist)
input a positive integer: 7  
[6, 5, 4, 3, 2, 1]
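
A few more calls illustrate how the start, stop, and step arguments of range behave. This is a small sketch with arbitrarily chosen values:

```python
r = range(6, 0, -1)

print(r)                       # range(6, 0, -1), a lazy range object
print(list(r))                 # [6, 5, 4, 3, 2, 1]
print(list(range(5)))          # [0, 1, 2, 3, 4], start defaults to 0
print(list(range(2, 10, 3)))   # [2, 5, 8], stop is never included
print(len(r), r[0], 3 in r)    # 6 6 True
```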

For-loop with list indices

Instead of iterating over a list directly, as illustrated above, one could iterate over the indices of a list,

mylist = ['amir','jake','lecero','mike','travis']
for i in range(len(mylist)):
    print(mylist[i])
amir
jake
lecero
mike
travis

Iterating over list indices, instead of list elements, is particularly useful when you have to work with multiple lists in a for-loop.
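
As a small sketch of this use case, the loop below walks through two lists of equal length with a shared index. The scores are made-up values for illustration:

```python
names = ['amir', 'jake', 'mike']
scores = [90, 85, 77]   # hypothetical values, one per name

report = []
for i in range(len(names)):
    # The same index i selects matching entries from both lists.
    report.append('{} scored {}'.format(names[i], scores[i]))

print(report)  # ['amir scored 90', 'jake scored 85', 'mike scored 77']
```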


Manipulating lists using for-loop

Note that when you want to change the elements of a list in a for-loop, you have to change the list itself, and not simply the for-loop variable.

mydigits = [1,3,5,7,9]
for i in mydigits:
    i -= 1
mydigits
[1, 3, 5, 7, 9]

The above code won’t change the values in the list; it changes only the for-loop variable. If you want to change the list itself, you have to operate on the list elements directly,

mydigits = [1,3,5,7,9]
for i in range(len(mydigits)):
    mydigits[i] -= 1
mydigits
[0, 2, 4, 6, 8]

List comprehension

Frequently in Python programming you may need to create long lists of regularly ordered items. Python therefore has a special concise syntax for such tasks, called list comprehension, which uses a for-loop. For example, suppose you have a list of odd digits as in the example above, and you want to create a list of even digits from it. You could achieve this using the following simple syntax,

odd_digits = [1,3,5,7,9]
even_digits = [i-1 for i in odd_digits]
even_digits
[0, 2, 4, 6, 8]
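
A list comprehension can also carry a condition that filters the elements. The sketch below, with example values of my own choosing, shows such a comprehension next to the explicit for-loop it replaces:

```python
numbers = list(range(10))

# Comprehension with a filter: squares of the even numbers only.
even_squares = [n**2 for n in numbers if n % 2 == 0]

# The equivalent explicit for-loop.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n**2)

print(even_squares)                       # [0, 4, 16, 36, 64]
print(even_squares == even_squares_loop)  # True
```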

Simultaneous looping over multiple lists

Suppose you have two or more lists of the same length, over the elements of which you want to perform a specific set of tasks simultaneously. To do so, it suffices to pair up the elements using Python’s builtin function zip, which yields one tuple per position, and loop over these tuples. For example, let’s assume that you wanted to create a list of the sums of the corresponding elements in the above two lists, odd_digits and even_digits. One way to do it would be the following,

sum_even_odd = []
for i,j in zip(odd_digits,even_digits):
    sum_even_odd.append(i+j) 
sum_even_odd
[1, 5, 9, 13, 17]
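
The same result can be written as a list comprehension over zip. The sketch below also shows that zip stops at the end of the shortest input, which is worth remembering when the lists differ in length (the list of letters is made up for the example):

```python
odd_digits = [1, 3, 5, 7, 9]
even_digits = [0, 2, 4, 6, 8]

sum_even_odd = [i + j for i, j in zip(odd_digits, even_digits)]
print(sum_even_odd)  # [1, 5, 9, 13, 17]

# zip yields tuples lazily; list() collects them. Extra elements of the
# longer input are silently ignored.
pairs = list(zip(odd_digits, ['a', 'b']))
print(pairs)  # [(1, 'a'), (3, 'b')]
```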


