2. Introduction to Python & Computational Notebooks#

Attention

Until we set up our GitHub classroom, please use the download button at the top of this page to download this notebook to your local computer. You should have a file called python.ipynb that you can open using JupyterLab or VSCode based on the previous tutorial.

This notebook provides a high-level overview of Python programming basics and an introduction to Jupyter Notebooks: the primary way you’ll read/write code & prose and submit assignments and projects in this class. This isn’t meant to be a comprehensive introduction to Python, but rather a quick overview of the basics that you’ll need to know to get started.

Computational Notebooks#

We will primarily be using Jupyter Notebooks to interface with Python. Confusingly, a Jupyter notebook is simply a file that ends with .ipynb and has nothing to do with whether you are using JupyterLab or VSCode as your coding environment. Both programs can read/edit notebook files!

A Jupyter notebook consists of cells that demarcate blocks of text or code. The two main types of cells you will use are:

  1. code cell: contains actual code that you want to run. You can specify a cell as a code cell using the pulldown menu in the toolbar in your Jupyter notebook (or command palette in VSCode). Otherwise, you can hit esc and then y (denoted “esc, y”) while a cell is selected to specify that it is a code cell. Note that you will have to hit enter after doing this to start editing it. If you want to execute the code in a code cell, hit “shift + enter.” Note that code cells run in the order you execute them, not the order in which they appear in the notebook. That is to say, the ordering of the cells for which you hit “shift + enter” is the order in which the code is executed. If you did not explicitly execute a cell early in the document, its results are not known to the Python interpreter!

  2. markdown cell: contains text. The text is written in markdown, a lightweight markup language. You can read about its syntax here and use it to format your text (e.g. bold, italics) or embed images and links. Note that you can also insert HTML into markdown cells, and this will be rendered properly. As you are typing the contents of these cells, the results appear as text. Hitting “shift + enter” renders the text in the formatting you specify. You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting “esc, m” in the cell. Again, you have to hit enter after using the quick keys to bring the cell into edit mode.

Selecting a cell as a whole (after hitting esc) and editing its contents (after hitting enter) are known as “command mode” and “edit mode” respectively. Both modes have different keyboard shortcuts that can be helpful as you gain more experience.

# This is a code cell that executes the following line of Python code, with its output displayed below:
print("Hello World")

Working Interactively#

As with RStudio or Matlab, Jupyter notebooks are designed for you to work interactively. In other words, you should write some code, immediately evaluate the output, and create/update code in a feedback loop, rather than write an entire notebook from top to bottom before you run any code (e.g. like a script).

# Oops this outputted an error telling me I missed a '
print('hello world)
# Here's the fixed version
print("hello world")

Python basics#

Variables#

A variable is a named object, i.e. a thing that Python knows has a particular value. It’s often useful to write code that incorporates named variables wherever possible (rather than hard-coding in specific numerical values). This way of abstracting away the specific values from the set of operations you want to perform on those values allows you to reuse the same line of code with different inputs.

To define a variable, you use the assignment operator, =. The name of your variable goes on the left side of the assignment operator, and the value you want to assign to that variable goes on the right side. Play around with the example below to see how it works. For example, change the values of x and y and see how the answers change.
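
A minimal sketch of the kind of example this paragraph describes. The names x and y come from the text; the specific values and the names total and product are assumptions chosen only for illustration.

```python
# Assign values to two variables (illustrative values)
x = 3
y = 4

# These lines work for whatever values x and y currently hold
total = x + y
product = x * y

print(total)    # sum of x and y
print(product)  # product of x and y

# Reassigning x changes later results without rewriting the formula
x = 10
print(x + y)
```

Try changing the values assigned to x and y and re-running the cell to see how the printed answers change.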

Types of objects in Python#

The objects that Python works with can take on different types of values, called data types. Here are some of the data types you’ll likely encounter frequently:

  • Integers (int): non-decimal scalar values (e.g., -50, 326, 0, 2500, etc.)

  • Floating point numbers (float): Real-valued scalars (e.g., 1.2345, -10.923, 0.01, 2.0, etc.).

  • Boolean (bool): True or False.

  • Strings (str): sequences of characters or symbols, enclosed in single or double quotes (e.g., 'hello', 'This is a single quoted string!', "This is a double quoted string...", etc.).

  • Null value (None): a special data type that doesn’t have any specified value. Useful as a “default” value, e.g. before you have enough information to compute the answer.

x = 3
type(x)
# simply adding a decimal place tells Python we're dealing with floating point values!
x = 3.0
type(x)
# strings can be individual characters
x = "a"
type(x)
# Or combinations of characters
x = "abc"
type(x)

Working with strings#

Python makes it very easy to work with strings relative to many other languages. In particular you can easily insert variable values into strings to create dynamic strings called f-strings. You’ll use these a lot for things like:

  • looping over lists of participant data-files and changing a folder name each time

  • creating templates of text you fill out with evaluated variables like result print-out

my_name = "Eshin"

# Notice that we pre-fix the string with f to be able to use the {} syntax for variable
my_string = f"Hello {my_name}"

print(my_string)
# F-string short-hands allow you to quickly print the variable name and value
my_age = 999
print(f"{my_age=}")

# or even truncate by decimal place for prettier outputs
my_age = 999.12914
print(f"{my_age:3.1f}")

Typecasting#

Many data types can be converted to other data types using typecasting. For example, float(3) converts the integer 3 into a floating point decimal. In the next cell, explore what happens when you try to convert between different data types. You can use the type function to ask Python what the data type of a given entity is.

# Integer
a = 1
print(type(a))

# Float
b = 1.0
print(type(b))

# String
c = "hello"
print(type(c))

# Boolean
d = True
print(type(d))

# None
e = None
print(type(e))

# Cast integer to string
print(type(str(a)))
print(type(3))
list(str(float(int(3.4))))  # what's happening here?
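
One way to answer the question in the last line is to unpack the nested calls one step at a time. This is a sketch; the intermediate names step1 through step4 are hypothetical and exist only to make each conversion visible.

```python
value = 3.4

step1 = int(value)    # int() drops the fractional part -> 3
step2 = float(step1)  # back to a float -> 3.0
step3 = str(step2)    # the float written out as text -> '3.0'
step4 = list(step3)   # a string splits into its characters -> ['3', '.', '0']

print(step1, step2, step3, step4)
```

Note that the original 0.4 is lost at the first step and is not recovered by converting back to a float.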

Container types#

In addition to basic variable types, you’ll often be working with collections of things. The following types will be useful here:

Lists#

In Python, a list is a mutable sequence of values. Mutable means that we can change individual entries within a list. For a more in-depth tutorial on lists look here.

  • Each value in the list is an element or item

  • Elements can be any Python data type

  • Lists can mix data types

  • Lists are initialized with [] or list()

l = [1,2,3]

Elements within a list are indexed (starting with 0)

l[0]

Elements can be nested lists

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Lists can be sliced.

l[start:stop:stride]

Like all Python containers, lists have many useful methods and functions that can be applied:

a.insert(index, new_element)
a.append(element_to_add_at_end)
len(a)

List comprehension is a very powerful technique allowing for efficient construction of new lists.

[a for a in l]
# Indexing and Slicing
a = ["lists", "are", "arrays"]
print(a[0])
print(a[1:3])

# List methods
a.insert(2, "python")
a.append(".")
print(a)
print(len(a))

# List Comprehension
print([x.upper() for x in a])
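
The start:stop:stride slicing syntax and the idea of mutability can be sketched as follows. The list letters and the result names are assumptions made up for this illustration.

```python
letters = ["a", "b", "c", "d", "e", "f"]

middle = letters[1:4]        # start at index 1, stop before index 4
first_three = letters[:3]    # an omitted start means "from the beginning"
every_other = letters[::2]   # a stride of 2 skips every second element
backwards = letters[::-1]    # a negative stride walks the list in reverse
last = letters[-1]           # negative indices count from the end

print(middle, first_three, every_other, backwards, last)

# Lists are mutable, so an individual entry can be replaced in place
letters[0] = "z"
print(letters)
```

Slices return new lists, so middle and the others are unaffected by the later assignment to letters[0].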

Dictionaries#

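  • In Python, a dictionary (or dict) is a mapping between a set of indices (keys) and a set of values

  • The items in a dictionary are key-value pairs

  • Keys can be any hashable Python data type (e.g., strings, numbers, or tuples); mutable types such as lists cannot be keys

  • Dictionaries have no positional index; since Python 3.7 they preserve the order in which keys were inserted

  • Here is a more in-depth tutorial on dictionaries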

# Dictionaries
eng2sp = {}
eng2sp["one"] = "uno"
print(eng2sp)

eng2sp = {"one": "uno", "two": "dos", "three": "tres"}
print(eng2sp)

print(eng2sp.keys())
print(eng2sp.values())
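
Looking values up by key is the most common thing you will do with a dictionary. The sketch below reuses the eng2sp dictionary from above; the key "four" and the fallback string "unknown" are assumptions chosen to show what happens when a key is missing.

```python
eng2sp = {"one": "uno", "two": "dos", "three": "tres"}

# Look up a value by its key
print(eng2sp["two"])

# .get() returns a fallback instead of raising a KeyError for a missing key
print(eng2sp.get("four", "unknown"))

# The in operator checks whether a key exists
print("one" in eng2sp)

# .items() lets you loop over key-value pairs together
for english, spanish in eng2sp.items():
    print(f"{english} -> {spanish}")
```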

Tuples#

In Python, a tuple is an immutable sequence of values, meaning its contents can’t be changed after it is created.

  • Each value in the tuple is an element or item

  • Elements can be any Python data type

  • Tuples can mix data types

  • Elements can be nested tuples

  • Essentially tuples are immutable lists

Here is a nice tutorial on tuples

numbers = (1, 2, 3, 4)
print(numbers)

t2 = 1, 2
print(t2)
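
To see what immutability means in practice, the sketch below tries to change a tuple element and catches the resulting error. It also shows unpacking a tuple into separate variables; the names first and second are assumptions for illustration.

```python
numbers = (1, 2, 3, 4)

# Assigning to an element of a tuple raises a TypeError
try:
    numbers[0] = 99
except TypeError as error:
    message = str(error)
    print(f"Cannot modify a tuple: {message}")

# Tuples can be unpacked into separate variables
first, second = (1, 2)
print(first, second)
```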

Sets#

In Python, a set is an unordered collection of unique items that supports efficient “membership” checking

  • set is like a dict but only with keys and without values

  • a set can also perform set operations (e.g., union, intersection)

  • Here is more info on sets

# Union
print({1, 2, 3, "mom", "dad"} | {2, 3, 10})

# Intersection
print({1, 2, 3, "mom", "dad"} & {2, 3, 10})

# Difference
print({1, 2, 3, "mom", "dad"} - {2, 3, 10})
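
Membership checking and removing duplicates can be sketched like this. The participant labels below are made-up example data, not values from the course.

```python
# Hypothetical list with repeated entries
participants = ["sub01", "sub02", "sub01", "sub03", "sub02"]

# Converting to a set keeps only the unique items
unique_participants = set(participants)
print(len(unique_participants))

# The in operator checks membership
print("sub03" in unique_participants)
print("sub99" in unique_participants)
```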

Additional objects and types#

In this class we’ll be making use of additional scientific Python libraries that define some new types such as an array or dataframe. Under-the-hood these more complicated objects are simply clever arrangements of the basic types in Python or another language (e.g. C for faster math)!

Operators#

In addition to the assignment operator (=), there are several other operators built into Python:

Math operators#

  • the addition operator (+)

  • the subtraction operator (-)

  • the multiplication operator (*)

  • the division operator (/)

  • the power operator (**) raises the value of the first thing to the power of the second thing

  • the modulo operator (%) returns the remainder after division

# Addition
a = 2 + 7
print(a)

# Subtraction
b = a - 5
print(b)

# Multiplication
print(b * 2)

# Exponentiation
print(b**2)

# Modulo
print(4 % 9)

# Division
print(4 / 9)
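
Two details of the division and modulo operators are easy to miss, so here is a small sketch. The numbers and the names quotient, remainder, and is_even are arbitrary choices for illustration.

```python
# Division with / always produces a float, even when the result is whole
quotient = 8 / 2
print(quotient, type(quotient))

# Modulo gives what is left over after dividing
remainder = 9 % 4  # 9 = 2 * 4 + 1
print(remainder)

# When the left side is smaller than the right, the remainder is the left side
print(4 % 9)

# A common use: a number is even if dividing by 2 leaves no remainder
is_even = 10 % 2 == 0
print(is_even)
```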

String Operators#

  • Some of the arithmetic operators also have meaning for strings. E.g. for string concatenation use + sign

  • String repetition: Use * sign with a number of repetitions

# Combine string
a = "Hello"
b = "World"
print(a + b)

# Repeat String
print(a * 5)

Logical Operators#

Logical operators compare objects or combine truth values. Comparisons and not always return a boolean value; and and or return one of their operands, which then behaves as True or False when used in a condition.

  • x == y: is x equal to y?

  • x != y: is x not equal to y?

  • x > y: is x greater than y?

  • x < y: is x less than y?

  • x >= y: is x greater than or equal to y?

  • x <= y: is x less than or equal to y?

  • or as in x or y: True if either x is “truth-y” OR y is “truth-y”, and False otherwise

  • and as in x and y: True if both x is “truth-y” AND y is “truth-y”, and False otherwise

  • not: not x is True if x is False, and is False if x is True.

  • any: a built-in function that checks if ANY value in an iterable (anything you can loop over) is truth-y

  • all: a built-in function that checks if ALL values in an iterable (anything you can loop over) are truth-y

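| X     | not X |
|-------|-------|
| True  | False |
| False | True  |

| X     | Y     | X AND Y | X OR Y |
|-------|-------|---------|--------|
| True  | True  | True    | True   |
| True  | False | False   | True   |
| False | True  | False   | True   |
| False | False | False   | False  |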

One special consideration is checking whether an object is None. Instead of using the value comparison x == None, we use the identity operator is, as in x is None.

# Works for string
a = "hello"
b = "world"
c = "Hello"
print(a == b)
print(a == c)
print(a != b)

# Works for numeric
d = 5
e = 8
print(d < e)
x = None
x == None
x is None  # same, but preferred
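
The idea of “truth-y” values, along with any and all, can be sketched as follows. The list values and the result names are arbitrary examples.

```python
values = [0, 1, 2]

# 0 is false-y and non-zero numbers are truth-y
has_any = any(values)  # at least one element is truth-y
has_all = all(values)  # not every element is truth-y, because of the 0
print(has_any, has_all)

# Empty containers and empty strings are false-y; non-empty ones are truth-y
print(bool(""), bool("text"))
print(bool([]), bool([0]))

# or and and return one of their operands rather than a strict True/False
fallback = 0 or "fallback"
second = 1 and "second"
print(fallback, second)
```

In an if statement the distinction rarely matters, since the returned operand is simply treated as truth-y or false-y.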

Conditional Logic#

Unlike most other languages, Python is whitespace-sensitive, meaning it uses indentation to mark blocks of code rather than delimiters like curly braces {} or parentheses (), or closing statements like end or fi. This makes it easier to read, but requires you to make sure your code is formatted properly!

condition = True

# Clean and simple
if condition:
    print('condition!')
# But your spacing matters as this won't work!
# Running the lines below raises an IndentationError
if condition:
print('condition!') # oops should be indented (tabbed to the right)

if statements#

Python contains a number of keywords that allow you to control the flow of instructions that the computer executes. One of the main keywords is the if statement. It runs one or more lines of code only if the quantity being evaluated is True:

x = 3
if x == 3:  # notice the colon
    print("Run this line")  # all lines in the body of the if statement are indented
n = 1

if n:
    print("n is non-0")

if n is None:
    print("n is None")

if n is not None:
    print("n is not None")

elif and else statements#

Whereas the body (indented part) of an if statement will simply be skipped if the condition passed to the if statement evaluates to False, you can also specify what to do under other possible circumstances:

  • The elif statement comes right after an if statement. It allows you to specify an alternative set of conditions. You can use multiple elif statements in sequence; once any of them evaluates to True the body of that statement is run and the sequence is aborted (no other elif statements are tested).

  • The else keyword comes after an if statement and (optionally) one or more elif statements. The body of an else statement runs only if none of the preceding if or elif statements ran.

my_name = "Eshin Jolly"

if my_name == "Eshin Jolly":
    print("You are the course instructor.")
elif my_name == "Yggdrasil":
    print("You are just a silly cat!")
elif my_name == "Marie Curie":
    print("You won two Nobel Prizes.  Also, you are dead.")
else:
    print(
        "I don't know you-- nice to meet you!"
    )  # note we used double quotes to enclose the single quote

Flow control (loops)#

It’s often useful to carry out a similar operation many times. For example, you might want to read in each file in a folder and apply the same basic set of commands to each file’s contents. Loops provide a way of writing efficient and flexible code that involves doing an operation several times.

There are two types of loops in Python: for loops and while loops.

for loops#

This type of loop carries out one or more operations on each element in a given list (or any other iterable, such as a string). The syntax is:

for i in <list of items>:
  <instruction 1>
  <instruction 2>
  ...
  <instruction N>

where i is, in turn, set to the value of each element of the given list, and the instructions defined in the body of the loop are carried out. (Here i is just an example variable name; in practice any variable name may be used as a stand-in for i.) In other words, the instructions in a for loop are carried out for each value in the given list.

# Notice that the variable c contains the value of each character
# rather than an index/position of where we are in the string
my_string = "Python is going to make conducting research easier"
for c in my_string:
    print(c)

If you need access to both the value and the index you can use the enumerate() function:

# Now i is the position of a character and c is its value
my_string = "Python is going to make conducting research easier"
for i, c in enumerate(my_string):
    print(i, c)
# Looping over a list
for x in ["a", "b", "c", "d"]:
    print(x)

while loops#

This type of loop carries out one or more operations while the given logic statement holds true. The syntax is:

while <statement>:
  <instruction 1>
  <instruction 2>
  ...
  <instruction N>

where <statement> (i.e. the loop condition) is any Python expression that can be typecast to a bool. These types of loops are useful when the number of repetitions needed to carry out a particular task is not known in advance.

# Starting values
counter = 0
limit = 10
cumulative_sum = 0

# Exit condition
while counter < limit:
    # Update sum
    cumulative_sum += counter

    # Print values
    # Notice the use of f-string syntax here!
    print(f"{counter=}, {cumulative_sum=}")

    # Update counter
    counter += 1

print(f"Final value = {cumulative_sum}")

Infinite loops#

It is important that the statement used to determine whether the while loop continues with another execution or terminates is modified within the body of the loop. In other words, the parameters of the condition that is being tested for should be adjusted each time the loop executes another cycle. If the loop condition never changes its value from True, the while loop will continue looping forever; this is called an infinite loop. Infinite loops will freeze your program until they are manually halted, either by pressing ctrl + c in a terminal or by interrupting the kernel in a notebook.
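
One defensive habit, sketched here under assumptions, is to add a safety cap to the loop condition so that a mistake in the update step cannot loop forever. The names value, steps, and max_steps, and the halving task itself, are made up for illustration.

```python
value = 100.0
steps = 0
max_steps = 1000  # assumed safety cap; pick one that suits your task

# The loop ends when value drops to 1 or below, or when the cap is reached
while value > 1 and steps < max_steps:
    value = value / 2  # this update is what eventually makes the condition False
    steps += 1

print(f"{steps=}, {value=}")
```

If the line that halves value were accidentally removed, the cap would still stop the loop after max_steps iterations.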

Nested loops#

Both for and while loops may themselves contain other loops (of either type). For example, nested loops can be useful when you want to carry out some sequence of operations on each combination of a set of things.
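
As a sketch of the "each combination" use case, the nested for loops below visit every pairing of two lists. The condition labels and run numbers are hypothetical example data.

```python
conditions = ["easy", "hard"]
runs = [1, 2, 3]

combinations = []
for condition in conditions:  # outer loop: one pass per condition
    for run in runs:          # inner loop: runs fully for each condition
        combinations.append((condition, run))
        print(f"condition={condition}, run={run}")

print(len(combinations))
```

The inner loop restarts from its first element every time the outer loop moves to its next element, so the body runs once per pairing.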

Functions#

We’ve already come across several functions, such as print, type, and various operators (e.g. +, -, *, /, **; operators are a special type of function). A function is a special type of object that takes in zero or more arguments (i.e. inputs) and produces zero or more actions or outputs.

Using functions#

Functions can be called by giving them any required arguments between their ():

my_str = "hello"
# print is a function that we call here with 1 argument, my_str
print(my_str)
# the len() function returns the length of an iterable
# in this case the length of my_str
len(my_str)

Getting help with functions#

In addition to looking up the documentation for a function online or on our course glossary, you can use function_name? to print out the doc-string for that function right here in a notebook:

print?

Creating functions#

You can create a function by giving it a name using the special def keyword, specifying a sequence of statements, and optionally specifying values to return. Then you can call it like any other function:

def make_upper_case(text):
    return text.upper()

  • The expression in the parentheses is the argument.

  • It is common to say that a function “takes” an argument and “returns” a result.

  • The result is called the return value.

The first line of the function definition is called the header; the rest is called the body.

The header has to end with a colon and the body has to be indented. The body in Python ends whenever a statement begins at the original level of indentation. If you want your function to output a value you need to use the return keyword. Otherwise your function will execute its statements and return None. There is no end or fed or any other identifier to signal the end of a function.

def make_upper_case(text):
    return text.upper()


string = "Python is going to make conducting research easier"

print(make_upper_case(string))
def square(x):
    return x**2


print(square(1))
print(square(2))
print(square(3))
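
The difference between a function that uses return and one that does not can be sketched as follows. The function names greet and make_greeting and the name "Ada" are hypothetical examples.

```python
def greet(name):
    # No return statement: the function prints, then implicitly returns None
    print(f"Hello {name}")


def make_greeting(name):
    # return hands the value back to the caller instead of printing it
    return f"Hello {name}!"


result = greet("Ada")
print(result)

greeting = make_greeting("Ada")
print(greeting)
```

Because greet has no return statement, result holds None even though the function printed something when it ran.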

Methods (functions that belong to objects)#

Sometimes a specific type of object will have a set of functions that are specific to that type of object. These are called methods and are invoked using dot-notation. In the example above, text.upper(), upper is a method of the text object which is a string.

x = 100
y = 'hello'
# to_bytes is a method on an integer
# (length and byte order are passed explicitly; Python versions before 3.11 require them)
x.to_bytes(1, "big")
# lower() is a method on strings
y.lower()

Just like functions you can get help with methods using ?

y.lower?

Modules/libraries and importing new functions#

Python comes with a set of built-in functions. Calling dir() with no arguments lists the names currently defined in your namespace, and dir(__builtins__) lists the built-in ones.

You can import new functions from another Python module. A module is a Python file that contains a collection of related function definitions. Python has hundreds of standard modules. These are organized into what is known as the Python Standard Library. You can also create and use your own modules. To use functionality from a module, you first have to import the entire module or parts of it into your namespace.

To import an entire module, use the import keyword. Then use . to indicate what functions from that module you would like to use:

import math

# We're asking for the value of the constant pi from the math module
print(math.pi)

To import just a subset of functionality from a module use the from keyword:

from math import pi

# No more . needed
print(pi)

You can also customize the name of what you import using the as keyword. This can be helpful for functions with the same name from different modules or for naming things with standard practice:

from math import pi as super_pi

print(super_pi)
import math as m
from math import degrees as deg

m.sin(m.pi) - m.cos(m.pi)
# We often just need the glob function from the glob module
# This saves us from typing glob.glob each time!

from glob import glob

type(glob)
# Convention to import the `numpy` module as `np`
import numpy as np

# Call the randn() function from within the random module within the numpy library
np.random.randn(1)

Looking forward#

As we use more advanced scientific Python libraries we’ll be making use of functions and methods that are beyond the standard library. For example we’ll be doing things like:

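# Import a new object (model class)
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data made up for illustration only: 4 observations of 1 feature
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Create an instance of the model class
reg = LinearRegression()

# Call its fit method with some data
reg.fit(X, y)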

Optional exercises#

Find Even Numbers#

Let’s say I give you a list saved in a variable: a = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]. Make a new list that has only the even elements of this list in it.

Find Maximal Range#

Given a list of ints with length 1 or more, return the difference between the largest and smallest values in the list.

Duplicated Numbers#

Find the numbers in list a that are also in list b:

a = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361]

b = [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]

Speeding Ticket Fine#

You are driving a little too fast on the highway, and a police officer stops you. Write a function that takes the speed as an input and returns the fine.

If speed is 60 or less, the result is $0. If speed is between 61 and 80 inclusive, the result is $100. If speed is 81 or more, the result is $500.