This lecture provides an introduction to Python programming for beginners: how to install Python, which Python distributions are available, how to write Jupyter notebooks, and how to perform simple arithmetic operations with Python.


Python: a brief history

Python was developed around the beginning of the 1990s by Guido van Rossum, a former employee of Google who is now an employee of Dropbox. The name of the language is a tribute to the British sketch comedy Monty Python’s Flying Circus. As of 2016, Python seems to be the fastest growing language for data science. Python has the following features and attributes.

  • Python is a fourth-generation, high-level programming language. Recall from our zeroth lecture that a high-level programming language provides a high level of abstraction from the details of the computer and its machine code. For comparison, Fortran, C++, and C are considered high-, medium-, and low-level programming languages, respectively.

  • Python is a general-purpose programming language, meaning that it is designed for writing software in a wide variety of application domains, such as scientific computation, web and internet development, education, and software development. For more information, visit this page.

  • Python is a multi-paradigm programming language. A programming paradigm is a style of structuring and writing computer programs. Python allows the programmer to use the following major programming paradigms.

   Later on, we will get to each of these programming paradigms in Python.

  • The core philosophy of Python programming: simplicity, readability, and a preference for complex over complicated.

  • Python is an interpreted language. A programming language implementation is a system for executing computer programs. There are two general approaches to programming language implementation:

    • Interpretation: An interpreter takes as input a program in some language, and performs the actions written in that language on some machine.
    • Compilation: A compiler takes as input a program in some language, and translates that program into some other language, which may serve as input to another interpreter or another compiler.

      Python is an interpreted language, meaning that as soon as you type a Python statement on the Python command line and press Enter, the Python interpreter executes the statement. Python programs can also be compiled, to be executed later when desired. This topic will be covered later in this course.

  • The most popular major implementation of Python is CPython. Other major implementations include IronPython, Jython, MicroPython, PyPy, each of which is designed for a specific purpose. Throughout this course we will be using CPython.

  • The extension for human-readable Python source code files is “.py”. There are other extensions for Python program files as well, each of which represents a specific type of Python file. For example, “.pyc” represents compiled (binary) Python source code, and “.pyo” is used for optimized Python files.
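The relation between “.py” and “.pyc” files described above can be tried out with the standard-library module py_compile. The following is only a minimal sketch: the file name hello.py and the helper compile_to_bytecode are hypothetical names chosen for illustration, and the compiled file is deliberately written next to the source file rather than in the default cache folder.

```python
import os
import py_compile
import tempfile


def compile_to_bytecode(source_text):
    """Write source_text to a temporary .py file and byte-compile it.

    Returns the path of the source file and the path of the compiled file.
    """
    workdir = tempfile.mkdtemp()
    source_path = os.path.join(workdir, "hello.py")  # hypothetical file name
    with open(source_path, "w") as handle:
        handle.write(source_text)
    # cfile is given explicitly so that the .pyc file lands next to the source
    compiled_path = py_compile.compile(
        source_path, cfile=source_path + "c", doraise=True
    )
    return source_path, compiled_path


source_path, compiled_path = compile_to_bytecode("print('Hello, Python!')\n")
print(source_path)    # human-readable source, ends with .py
print(compiled_path)  # compiled bytecode, ends with .pyc
```

The compiled file is not human-readable. It contains bytecode that the CPython interpreter can load without parsing the source text again.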

Python installation

Depending on your operating system, you can download and install a specific version of Python for your personal computer from one of the major Python distributors, for example, the official CPython distribution. For this course, we will rely on the CPython implementation.

Basic Python installation

The official CPython implementation of Python can be found at python.org. Once you go to this webpage, you will see that two versions of Python are available for download (for Windows systems):

For operating systems other than Windows, the installation files can be found here for Linux, and here for Mac.

In addition to the basic Python distribution that you can obtain from python.org, there are also other popular Python distributions that, by default, contain some highly useful Python libraries, advanced Python editors, and integrated development environments (IDEs). A Python distribution is basically the Python core bundled together with many useful Python libraries and IDEs. For example, the basic Python distribution from python.org comes bundled with a simple, primitive integrated development environment for Python coding, called IDLE.

Aside from the official basic CPython distribution of Python available from python.org, there are other Python distributions that are based on CPython. A comprehensive list can be found here. Some of the most popular and useful Python distributions for scientific computing are the following:

  • Anaconda from Continuum Analytics. According to the company, Anaconda is the leading open data science platform powered by Python. The open source version of Anaconda is a high performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science. Additionally, the Anaconda user has access to over 720 packages that can be easily installed with conda. Conda is a language-agnostic package manager and environment management system that is developed and maintained by Continuum Analytics. The package Conda is itself written in Python.

    The Anaconda distribution of Python is the one that we will use throughout this course.

    The latest version of Anaconda includes an easy installation of Python (2.7.13, 3.4.5, 3.5.2, and/or 3.6.0) and updates of over 100 pre-built and tested scientific and analytic Python packages. These packages include NumPy, Pandas, SciPy, Matplotlib, and Jupyter. Over 620 more packages are available. You can install any of them with just one command,
    conda install package-name
    

    (NOTE: Replace “package-name” with the name of the package you want to install.)

  • Canopy Python from Enthought. According to the company, Canopy Python is a comprehensive Python analysis environment that provides easy installation of over 450 core scientific and analytic Python packages, creating a robust platform you can explore, develop, and visualize on. In addition to its pre-built, tested Python distribution, Enthought Canopy has tools for iterative data analysis, visualization, and application development. Like Anaconda, Canopy is available in both free and paid licensed versions.

Installing external Python packages

One of the greatest advantages of Python over other languages, and one of the reasons for its popularity, is the extensive set of libraries that have been written for Python over the past two decades. As a professional Python programmer, you will virtually always need some of these packages. In the event that you need a Python library that is not already installed on your device, you can get the instructions for Linux installation from this page. For Windows devices, you can get precompiled versions of external Python libraries, ready for installation, from Christoph Gohlke’s personal website.
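Before installing a library, it can help to check whether it is already available to your Python interpreter. The sketch below uses the standard-library function importlib.util.find_spec for this purpose. The helper names is_installed and report are hypothetical, and the sketch assumes that the import name of each package is the same as the name you would pass to conda install, which holds for the three packages listed here but not for every package.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


def report(module_names):
    """Map each module name to True or False depending on availability."""
    return {name: is_installed(name) for name in module_names}


# Import names of three packages mentioned earlier in this lecture.
status = report(["numpy", "scipy", "matplotlib"])
for name, found in status.items():
    if found:
        print(name, "is installed")
    else:
        print(name, "is missing; try: conda install", name)
```

The output depends on what is installed on your device, so no particular result should be expected.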

Python editors and IDEs

The simple Python code editor, IDLE, that comes with the basic CPython distribution of Python is mostly not helpful enough for educational and professional programming. As a result, a myriad of Python code editors and IDEs have also been developed over the past decade. A rather complete list of the most popular Python IDEs can be found here and here. Some of the most useful for our class and your future professional use are likely the following:

  • Spyder
    Spyder (formerly Pydee) is an open source, cross-platform IDE for scientific Python programming. Its design is probably the most similar to the MATLAB environment. Therefore, it is likely a good first IDE for those who are already familiar and comfortable with the design of the MATLAB environment. Spyder integrates NumPy, SciPy, Matplotlib, and IPython, as well as other open source software.

  • PyCharm
    PyCharm is a full-featured IDE for Python. It is available as a free and open source edition that fully supports Python, as well as a proprietary Professional Edition with Django, Flask, Pyramid, and Google App Engine support.

  • IPython
    IPython is an enhanced interactive Python shell. It offers significant enhancements for interactive Python programming, such as tab completion (autocompletion), inline Python syntax highlighting, command history, etc. It is highly useful for immediately testing small snippets of a larger code on the IPython shell. IPython is installed on your computer as part of the Anaconda installation.

  • Jupyter
    Project Jupyter was born out of the IPython Project in 2014 as it evolved to support interactive data science and scientific computing across all programming languages. The name Jupyter refers to the Julia, Python, and R programming languages. The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.

  • Notepad++ (available only on Windows)
    The last, but in my opinion not least important, Python editor is Notepad++. It is a highly versatile text and source code editor for use with Microsoft Windows. In my opinion, it is arguably the most powerful general-purpose text editor currently available. Notepad++ automatically identifies the type of code a file contains based on the file extension and highlights the code syntax accordingly. However, you should keep in mind that it is not specifically designed for Python. If you are a professional multi-language programmer, you will soon find hidden gems in Notepad++ that, as far as I am aware, are not available in any other editor (including Python-specific editors) as of today.

Which Python standard version should you use?

Like any other programming language, Python has evolved significantly since its inception in 1991. Normally, a good programming language should be backward-compatible, meaning that a newer language standard should not violate the previous standards. For example, old Python code should be executable on the most recent implementation of the Python standard. Sometimes, however, this is not the case as a language evolves. It probably happens to all languages that, at some point, a new standard violates the syntax of the older standard, causing runtime and compile-time errors for code written to the old standard.

For Python, this backward-incompatibility occurred between Python versions 2.x and 3.x. A list of the key differences between the two standards can be found here. If you would like to know which Python version is likely most useful for your future professional project, consult this page. However, it is important to keep in mind that the Python 2.x standard is legacy, and Python 3.x is the present and future of the Python language. The official plan is to end security updates and support for Python 2.x by the year 2020, and most of the major Python packages have already started migrating to Python 3. Therefore, for the rest of this course we will be using Python 3 syntax.
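To make the incompatibility concrete, here is a short sketch of three well-known behaviors of Python 3 that differ from Python 2. These three were chosen for illustration and are not a complete list of the differences. The variable names are arbitrary.

```python
import sys

# In Python 3, print is a function, so the parentheses are required.
# In Python 2, print was a statement: print "Running Python 2"
print("Running Python", sys.version_info.major)

# In Python 3, "/" is true division and "//" is floor division.
# In Python 2, 7 / 2 between two integers gave 3.
true_quotient = 7 / 2
floor_quotient = 7 // 2
print(true_quotient)   # 3.5
print(floor_quotient)  # 3

# In Python 3, str holds Unicode text and bytes is a separate type.
text = "caf\u00e9"  # four characters, the last one is non-ASCII
encoded = text.encode("utf-8")
print(type(text).__name__, len(text))        # str 4
print(type(encoded).__name__, len(encoded))  # bytes 5
```

Running the same file with a Python 2 interpreter would give different results for the division and for the string types, which is why code written for one standard cannot be assumed to work on the other.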

Setting up Jupyter

There are two ways to set up and run a Jupyter notebook:

  1. on your personal device
  2. online, on the Jupyter website

    In the following, both methods will be explained.

Running Jupyter on Personal Device

Now, if you have already installed Anaconda on your device, you should also have Jupyter and IPython installed automatically. To open a new Jupyter notebook, follow the instructions below (for Windows). The steps are similar on other operating systems.

1. Open Windows’ start menu and search for jupyter.

2. By clicking on Jupyter Notebook, a Windows Command Prompt for Jupyter will open up, initializing the Jupyter server. Then a new page will open in your default web browser. This page lists the contents of your home directory on your personal device, as in the following figure.

3. Now click on the New button at the top-right part of the page, and choose Python 3. If you have installed Python 2 as well, you will also see an option for Python 2. For this course, however, proceed with Python 3.

4. Once you choose and click on your Python version, a new browser tab will open, which contains your Jupyter notebook, as illustrated in the following figure.

Your Jupyter notebook file is stored in the home directory of your device, likely with the name Untitled.ipynb. A very useful feature of Jupyter notebooks is that you can also export your notebook as a Markdown, PDF, or HTML file, or as a single Python file (with the .py extension), as illustrated in the figure below.
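A notebook file with the .ipynb extension is a plain JSON document, so its content can also be inspected with ordinary Python. The following is a minimal sketch of how the code cells of a notebook could be collected into a single piece of Python source. It is not how Jupyter itself performs the export. The function code_from_notebook and the tiny hand-written notebook are hypothetical and only serve as an illustration.

```python
import json


def code_from_notebook(notebook_text):
    """Join the source of all code cells in a notebook JSON string."""
    notebook = json.loads(notebook_text)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # "source" may be a single string or a list of lines
            if isinstance(source, list):
                source = "".join(source)
            chunks.append(source)
    return "\n\n".join(chunks)


# A tiny hand-written notebook, used only for illustration.
example_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1 + 2\n", "print(x)"]},
    ],
})

print(code_from_notebook(example_notebook))
```

For a real notebook, you could read the text of Untitled.ipynb from disk and pass it to the same function. The Markdown cell is skipped because only cells of type "code" are collected.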

Running Jupyter online

The instructions for setting up your online Jupyter notebooks are very similar to the above instructions for your local device, except for the very first step: instead of searching in Windows for Jupyter, you have to visit Jupyter’s website at https://try.jupyter.org/.

IPython / Jupyter helpful commands

Every time you start IPython on your local device, the following list of IPython commands is shown on the command line.

Amir@CCBB-Amir MINGW64 ~
$ ipython
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]:


Since Jupyter is an extension of IPython, these commands are also executable in Jupyter notebooks. Here is an example of the last command, object?.

In [14]: test = 'test'

In [15]: test?
Type:        str
String form: test
Length:      4
Docstring:
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.
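Some of the fields shown above can also be obtained in plain Python, outside IPython, with built-in functions. The sketch below is only an approximation of what object? displays. The helper describe is a hypothetical name, and only the first line of the docstring is kept.

```python
test = 'test'


def describe(obj):
    """Collect a few of the fields that IPython's `obj?` displays."""
    info = {
        "Type": type(obj).__name__,
        "String form": str(obj),
    }
    # Not every object has a length, so check before calling len().
    if hasattr(obj, "__len__"):
        info["Length"] = len(obj)
    doc = type(obj).__doc__
    if doc:
        # Keep only the first line of the docstring of the object's type.
        info["Docstring"] = doc.strip().splitlines()[0]
    return info


for field, value in describe(test).items():
    print(field + ":", value)
```

Python's own help system, mentioned in the IPython banner above, gives the full documentation: calling help(str) prints the complete docstring of the string type.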


ATTENTION

Note that each cell in a Jupyter notebook can contain either Python code or Markdown text, or any other cell type that you can select from the dropdown menu at the top of the notebook.


Jupyter cheatsheet and keyboard shortcuts

There are many useful keyboard shortcuts in Jupyter that facilitate editing and revising your Jupyter notebook cells. A Jupyter cheatsheet can be downloaded from here. The following tables summarize some of the most useful shortcuts, adapted from the Jupyter website.

Table 1: Some useful shortcuts for Jupyter cells in command mode (press Esc to switch to command mode).
Keyboard shortcut    Description of effect
Enter                enter edit mode
Shift + Enter        run cell, select below
Ctrl + Enter         run cell
Alt + Enter          run cell, insert below
Y                    to code
M                    to markdown
R                    to raw
1                    to heading 1
2,3,4,5,6            to heading 2,3,4,5,6
Up/K                 select cell above
Down/J               select cell below
A/B                  insert cell above/below
X                    cut selected cell
C                    copy selected cell
Shift + V            paste cell above
V                    paste cell below
Z                    undo last cell deletion
D,D                  delete selected cell
Shift + M            merge cell below
Ctrl + S             Save and Checkpoint
L                    toggle line numbers
O                    toggle output
Shift + O            toggle output scrolling
Esc                  close pager
H                    show keyboard shortcut help dialog
I,I                  interrupt kernel
0,0                  restart kernel
Space                scroll down
Shift + Space        scroll up
Shift                ignore



Table 2: Some useful shortcuts for Jupyter cells in edit mode (press Enter to switch to edit mode).
Keyboard shortcut          Description of effect
Tab                        code completion or indent
Shift + Tab                tooltip
Ctrl + ]                   indent
Ctrl + [                   dedent
Ctrl + A                   select all
Ctrl + Z                   undo
Ctrl + Shift + Z           redo
Ctrl + Y                   redo
Ctrl + Home                go to cell start
Ctrl + Up                  go to cell start
Ctrl + End                 go to cell end
Ctrl + Down                go to cell end
Ctrl + Left                go one word left
Ctrl + Right               go one word right
Ctrl + Backspace           delete word before
Ctrl + Delete              delete word after
Esc                        command mode
Ctrl + M                   command mode
Shift + Enter              run cell, select below
Ctrl + Enter               run cell
Alt + Enter                run cell, insert below
Ctrl + Shift + Subtract    split cell
Ctrl + Shift + -           split cell
Ctrl + S                   Save and Checkpoint
Up                         move cursor up or previous cell
Down                       move cursor down or next cell
Ctrl + /                   toggle comment on current or selected lines


