Python Review
This document isn't intended to teach Python from the beginning, especially for non-programmers. Instead, it's intended to refresh the memory for people who have programmed in Python before, but maybe not for a while, and have forgotten aspects of it.
The first part of this document is mostly review. You can also skip down to a few new things that are not covered in CS 111.
- About Python
- Versions of Python
- Python as an Environment
- Python as a Language
- Functions in Python
- Datatypes in Python
- Strings
- Lists
- Tuples
- Comprehensions
- Dictionaries (Hashes)
- Iteration in Python
- Variables and Scope in Python
- String Interpolation
- Executable Python Scripts and Modules
- Modules in Python
- PyDoc
- Programming Methodology
About Python¶
Python is a very likeable language. Some things that I like about it:
- Its syntax, while bizarre compared to conventional languages, is spare and almost beautiful. It will take some getting used to, but it's quite readable, even for a beginner.
- It has a lot of powerful, dynamic features: dynamic creation of objects, functions, and pretty much anything.
- It has a passable attempt at closures.
- It has a lot of powerful packages.
- It is easily portable.
- It has a read-eval-print-loop which makes experimentation and playing a joy.
- Its object-oriented programming is good but optional.
- It has spam, silly walks and a dead parrot, because Guido van Rossum, the creator of Python and Benevolent Dictator for Life, is a huge Monty Python fan.
There are lots of good tutorials for beginners online, so it would be foolish for me to try to write one. Here are some:
- Think Python, by my friend Allen Downey. Read his textbook manifesto, and I think you'll find it irresistible to try his book. And it's free (though you can buy a bound copy if you want). I would suggest Chapter 2, Chapter 3, and Chapter 4. All these chapters are relatively short: 8-10 pages each.
- The Python Tutorial. You can't get much more official than this. I would suggest sections 1–5.
- Categorized Python tutorials at Awaretek.com, including many for people without programming background (read those to feel really smart).
Nevertheless, I'll give a few high-level observations.
Versions of Python¶
Python is in active development, and new versions come out regularly. However, the two major version lines are Python 2.x and Python 3.x. (There will probably never be a Python 4.x.) Python 2 is obsolete: it reached its end of life in 2020, so everyone should be switching to Python 3.
As of this writing, the default Python on the CS server is Python 3 (specifically 3.6.12). That's what we will use in CS 304.
Python as an Environment¶
One of the best ways to learn Python is to type expressions into the
Read-Eval-Print Loop (REPL) to see what they do. (Other sources will
call this the interpreter, but you can have an interpreter without
having this ability to type in expressions and have them evaluated and
the results printed.) Just give the command python to your Linux or
Mac shell, and start typing expressions.
(Note: some people find it confusing that there is a program/command
called python, but that program is the thing that understands
the Python language and executes/runs your programs.)
In these notes, I'll show Python REPL interactions like the
following. Don't type the >>>; that's the prompt printed by the Python
REPL. Similarly, don't type the $ before the python command; that
stands for the shell prompt, which is probably some detailed string
like [youracct@tempest dir]. (A prompt just means that something
is ready for your next input.)
$ python
>>> a=3
>>> b=4
>>> a+b
7
>>> import math
>>> c = math.sqrt(a*a+b*b)
>>> c
5.0
>>> quit()
$
To exit, invoke the quit() function or type a control-D. Notice how
that returns us to the Unix prompt.
Python as a Language¶
Python code looks like no other code that I'm familiar with. It's a
complete departure from the C family of languages.
# Python uses # as an end-of-line comment character
import math # packages are "loaded" by the import statement
a = 3 # no need to declare types; very dynamic
b = 4
c = math.sqrt(a*a+b*b)
So far, not too bad. Let's see some syntax:
if a == b:
    print('a and b are the same')
else:
    print('a and b differ')
print('go on')
Hmm. Where are the parens and braces? Gone! Python knows that the
else: section is over because the indentation ends. That's right, the
indentation has syntactic meaning in Python. So, the following two
programs are different in Python.
i = 0
while i < 10:
    i += 1
print(i)

i = 0
while i < 10:
    i += 1
    print(i)
The first one prints the last number, after the loop, while the second one prints every number, because the print statement is inside the loop.
Functions in Python¶
Functions in Python are simple: a name, a formal argument list (no datatypes), and a body. The end of the body is, as expected, signalled by the end of indentation.
def mean(a,b):
    return (a+b)/2
Even better is to add a string as the first line of the body. Later, we'll see a tool that will use these for self-documenting files:
def mean(a,b):
    "returns the arithmetic mean of the two numbers"
    return (a+b)/2
Here is a whole file of function definitions:
import math

def hypo(a,b):
    """Returns the length of the hypotenuse of a right triangle with the given legs"""
    # algorithm based on the Pythagorean theorem
    return math.sqrt(a*a+b*b)

def fibonacci(n):
    """Returns the nth Fibonacci number, for `n' a non-negative integer"""
    if type(n) != type(1) or n<0:
        raise Exception('bad argument to fibonacci')
    if n<2:
        return n
    else:
        # what a horrible algorithm! Never do this!!
        return fibonacci(n-1)+fibonacci(n-2)

def gcd(a,b):
    """Returns the greatest common divisor of the two arguments.
    Example: gcd(9,8)=1, since 9 and 8 are relatively prime, but
    gcd(24,30)=6, since 6 divides both 24 and 30."""
    # This implementation is Dijkstra's method
    print("a is {a} and b is {b}".format(a=a,b=b))
    if a == b:
        return a
    elif a > b:
        return gcd(a-b,b)
    else:
        return gcd(a,b-a)

def triangular(max):
    """Generates a triangular list of lists up to the given max"""
    result = []
    # range() gives you a list of integers; "for" iterates over lists
    for n in range(max):
        result.append(list(range(n)))
    for elt in result:
        print(elt)

if __name__ == '__main__':
    if hypo(3,4) != 5:
        print('error in hypo: hypo(3,4) returns ',hypo(3,4))
    print('The first ten Fibonacci numbers are')
    for i in range(10):
        print(fibonacci(i),' ', end=' ')
    # this empty print statement just gives us a blank line
    print()
    print('testing gcd(20,45)')
    if gcd(20,45) != 5:
        print('error in gcd: gcd(20,45) returns ',gcd(20,45))
    print("here's a list of 5 lists")
    triangular(5)
You can try these by downloading the mathfuns.py Python file, importing it into Python, and running the functions:
$ python
>>> import mathfuns
>>> mathfuns.hypo(5,12)
13.0
>>> mathfuns.gcd(55,89)
1
You can avoid typing the module name (which is also the filename, minus the .py) by importing particular members:
>>> from math import sqrt
>>> sqrt(9)
3.0
It's even possible to import every member of a module, but this is considered poor practice:
>>> from mathfuns import *
>>> fibonacci(100) # too long!
>>> gcd(30,50)
a is 30 and b is 50
a is 30 and b is 20
a is 10 and b is 20
a is 10 and b is 10
10
It's generally considered better not to do this, because it can
become less clear where a function (like sqrt, fibonacci and gcd
above) is defined.
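Here's a minimal sketch of the clarity problem: an imported name silently replaces any earlier definition with the same name. The home-grown sqrt function below is hypothetical, made up just for this illustration; math.sqrt is the real library function.

```python
def sqrt(x):
    """A hypothetical, home-grown sqrt that only labels its argument"""
    return 'square root of {x}'.format(x=x)

before = sqrt(9)          # uses our own definition

from math import sqrt     # silently rebinds the name sqrt

after = sqrt(9)           # now uses math.sqrt

print(before)   # prints: square root of 9
print(after)    # prints: 3.0

import math
qualified = math.sqrt(16)   # the module prefix shows where sqrt comes from
print(qualified)            # prints: 4.0
```

With the module prefix, a reader never has to guess which sqrt is being called.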
Datatypes in Python¶
All my examples so far have been numeric, for no good reason but that numbers don't need much introduction. Let's look at some more interesting datatypes. To play with these test values, download this sampledata file.
Strings¶
Strings pretty much work as you expect. You can concatenate them with
the + operator. You can take their length. You can print them.
>>> x = 'spam, '
>>> x
'spam, '
>>> y = 'eggs, '
>>> y
'eggs, '
>>> x+x
'spam, spam, '
>>> x+x+y+' and '+x
'spam, spam, eggs,  and spam, '
>>> x+x+y+'and '+x
'spam, spam, eggs, and spam, '
>>> len(x)
6
>>> len(x+y)
12
>>> print(x+y)
spam, eggs,
Lists¶
Lists are denoted with square brackets with commas between the elements. You can index them numerically, and extract sub-lists. You can append items onto the end (actually, either end). You can store into them.
>>> cheeses = [ 'swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie' ]
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']
>>> len(cheeses)
6
>>> cheeses[0]
'swiss'
>>> cheeses[1:3]
['gruyere', 'cheddar']
>>> cheeses[1:4]
['gruyere', 'cheddar', 'stilton']
>>> cheeses.append('gouda')
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie', 'gouda']
>>> cheeses[0] = 'emmentaler'
>>> cheeses
['emmentaler', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie', 'gouda']
The append shows how to invoke a method on a list, and that lists
are mutable, unlike tuples.
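As a small sketch of working at either end of a list: append always adds at the end, while the standard insert and pop methods can work at the front as well. The shortened cheeses list here is just sample data for this example.

```python
cheeses = ['swiss', 'gruyere', 'cheddar']

cheeses.append('gouda')      # add to the end
cheeses.insert(0, 'brie')    # add to the front (before index 0)
print(cheeses)    # prints: ['brie', 'swiss', 'gruyere', 'cheddar', 'gouda']

last = cheeses.pop()         # remove and return the last item
first = cheeses.pop(0)       # remove and return the first item
print(last, first)   # prints: gouda brie
print(cheeses)       # prints: ['swiss', 'gruyere', 'cheddar']

# Negative indexes count back from the end of the list
final = cheeses[-1]
print(final)         # prints: cheddar
```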
Tuples¶
Tuples are just like lists, except that they use parentheses instead of square brackets and they are immutable.
>>> troupe = ('Cleese', 'Palin', 'Idle', 'Chapman', 'Gilliam', 'Jones')
>>> troupe
('Cleese', 'Palin', 'Idle', 'Chapman', 'Gilliam', 'Jones')
>>> len(troupe)
6
>>> troupe[0] = 'Homer' # won't work
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>
One annoying fact about tuples is that parentheses have too much work to do in the language, since they also enclose expressions. For example, consider the following assignments:
x = (1+1) # not a tuple
y = (2+2,3+3) # a tuple of two values
z = (4+4,) # a tuple of one value, because of the comma
The parentheses in the assignment to x are just for grouping
(unnecessary here), so they don't produce a tuple, unlike the ones in
the assignment to y. To get a tuple of one element, the trick in
Python is to put a comma after the first and only element in the
tuple.
Personally, I usually use lists rather than tuples, to avoid this ugly issue.
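If you want to check the three assignments above for yourself, here is a short sketch that asks Python for the type of each value, and then shows that converting between lists and tuples is easy with the built-in list() and tuple() functions.

```python
x = (1+1)         # parentheses only group here, so x is an int
y = (2+2, 3+3)    # a tuple of two values
z = (4+4,)        # a tuple of one value, because of the comma

print(type(x).__name__, x)   # prints: int 2
print(type(y).__name__, y)   # prints: tuple (4, 6)
print(type(z).__name__, z)   # prints: tuple (8,)

# Converting between the two kinds of sequence
as_list = list(y)
as_list.append(10)           # fine, because lists are mutable
back = tuple(as_list)
print(back)                  # prints: (4, 6, 10)
```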
Comprehensions¶
Both lists and tuples are sequences, and as such can be easily iterated over, sometimes building new lists on the way:
>>> from sampledata import *
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']
>>> for c in cheeses:
...     print(c) # you have to indent this yourself
... # type a newline to indicate that you're done
swiss
gruyere
cheddar
stilton
roquefort
brie
>>> [ len(c) for c in cheeses ]
[5, 7, 7, 7, 9, 4]
>>> troupe
('Cleese', 'Palin', 'Idle', 'Chapman', 'Gilliam', 'Jones')
>>> [ len(x) for x in troupe ]
[6, 5, 4, 7, 7, 5]
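To make the bracketed expressions above less mysterious, here is a sketch that spells out the loop a list comprehension stands for. The last part uses the optional if clause, which the examples above don't show, to keep only some of the elements. The cheeses list is repeated here so that the example runs on its own.

```python
cheeses = ['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']

# [ len(c) for c in cheeses ] is shorthand for this loop
lengths = []
for c in cheeses:
    lengths.append(len(c))
print(lengths)    # prints: [5, 7, 7, 7, 9, 4]

same = [len(c) for c in cheeses]
print(same == lengths)   # prints: True

# An optional "if" clause filters the elements before they are collected
short = [c.upper() for c in cheeses if len(c) < 6]
print(short)      # prints: ['SWISS', 'BRIE']
```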
Dictionaries (Hashes)¶
Like all civilized languages, Python has dictionaries built-in (sometimes called hashtables in other languages, such as Java). They act a little like arrays that have strings as indexes. You can store into hashes and iterate over them easily.
>>> college = { 'Cleese': 'Cambridge', 'Chapman' : 'Cambridge', 'Palin': 'Oxford', 'Jones': 'Oxford', 'Idle': 'Cambridge', 'Gilliam': 'Occidental' }
>>> college
{'Cleese': 'Cambridge', 'Chapman': 'Cambridge', 'Palin': 'Oxford', 'Jones': 'Oxford', 'Idle': 'Cambridge', 'Gilliam': 'Occidental'}
>>> college['Palin']
'Oxford'
>>> college['Palin'] = 'Oxford University'
>>> college['Palin']
'Oxford University'
>>> college.keys()
dict_keys(['Cleese', 'Chapman', 'Palin', 'Jones', 'Idle', 'Gilliam'])
>>> for k, v in college.items():
...     print(k, v)
...
Cleese Cambridge
Chapman Cambridge
Palin Oxford University
Jones Oxford
Idle Cambridge
Gilliam Occidental
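A few more everyday dictionary operations, as a sketch: testing whether a key exists, looking up a key with a fallback value, deleting a key, and counting with a dictionary. The smaller college dictionary here is just sample data for this example; in, get() and del are standard Python.

```python
college = {'Cleese': 'Cambridge', 'Palin': 'Oxford', 'Gilliam': 'Occidental'}

# "in" tests whether a key is present
print('Palin' in college)    # prints: True
print('Homer' in college)    # prints: False

# get() returns a default value instead of raising KeyError
missing = college.get('Homer', 'unknown')
print(missing)               # prints: unknown

# Adding and deleting entries
college['Idle'] = 'Cambridge'
del college['Gilliam']
names = sorted(college.keys())
print(names)                 # prints: ['Cleese', 'Idle', 'Palin']

# Counting how many troupe members attended each college
counts = {}
for name, school in college.items():
    counts[school] = counts.get(school, 0) + 1
print(counts['Cambridge'])   # prints: 2
```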
Iteration in Python¶
We've seen the usual while loop above, which is very normal. We've
also seen how the for loop iterates over a list. What if you want to
iterate over a series of numbers, like a C-style for loop? You can
do that with the range() function, though I don't think you'll often
have to in CS 304. Other than in purely numeric code, most C-style for
loops are just iterating over some data structure via numerical
indices, which Python's for loop does directly. Nevertheless, here's
an example from mathfuns.py:
def triangular(max):
    """Generates a triangular list of lists up to the given max"""
    result = []
    # range() gives you a list of integers; "for" iterates over lists
    for n in range(max):
        result.append(list(range(n)))
    for elt in result:
        print(elt)
Note that we have to wrap the range(n) with list() to convert it
into a list of numbers. (In Python 3, range() returns a special
range object, a lazy sequence that produces its numbers on demand,
but we won't worry about that.)
>>> from mathfuns import triangular
>>> triangular(10)
[]
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
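Here's a brief sketch of the other forms of range(), which also accepts a start value and a step, plus the built-in enumerate() function for the times when you want both the index and the element. The short cheeses list is just sample data for this example.

```python
print(list(range(5)))           # prints: [0, 1, 2, 3, 4]
print(list(range(2, 6)))        # prints: [2, 3, 4, 5]
print(list(range(10, 0, -3)))   # prints: [10, 7, 4, 1]

cheeses = ['swiss', 'gruyere', 'cheddar']

# C-style: loop over the indexes and look up each element
by_index = []
for i in range(len(cheeses)):
    by_index.append((i, cheeses[i]))

# Python style: enumerate() hands you the index and the element together
pairs = list(enumerate(cheeses))
print(pairs)   # prints: [(0, 'swiss'), (1, 'gruyere'), (2, 'cheddar')]
print(pairs == by_index)   # prints: True
```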
Variables and Scope in Python¶
Variables are created when they are assigned; no declaration needed. If the variable is created inside a function, the variable is local to that function. If the variable is created outside a function, it is global (to that file/module).
Function code can refer to (get values from) a global variable that
already exists, but if it wants to assign to the global variable
rather than create a new local variable, it must use the global
keyword.
The following code has two global variables: one to keep track of how
much we've spent (SpendingTotal) and one that is the limit on our
spending (BudgetMax). The code also has two functions to spend money,
spend1 and spend2. Finally, there's a function called testspending
that takes one of our spending functions as its argument and invokes
it on each of the items we'd like to buy, printing the total spent
when it's done.
BudgetMax = 1200 # in dollars per month
SpendingTotal = 0 # in dollars

def spend1(item,amt):
    SpendingTotal = 0 # this is a local, not the global
    if SpendingTotal + amt < BudgetMax:
        print('Yes! I can afford %s' % item)
        SpendingTotal += amt # doesn't do what you think it does

def spend2(item,amt):
    global SpendingTotal # says that I want to modify the global
    if SpendingTotal + amt < BudgetMax:
        print('Yes! I can afford %s' % item)
        SpendingTotal += amt

def testspending(fun):
    fun('bike',1000)
    fun('iPad',800)
    fun('weekend getaway',1100)
    fun('charitable giving',900)
    print('Total amount spent is %d' % SpendingTotal)

print("Let's see how the first function works")
testspending(spend1)
print()
print("Now, let's try the second function")
testspending(spend2)
Let's run the script. Notice that we are running this script from the command line, as if it were a Java program.
$ python budget.py
Let's see how the first function works
Yes! I can afford bike
Yes! I can afford iPad
Yes! I can afford weekend getaway
Yes! I can afford charitable giving
Total amount spent is 0
Now, let's try the second function
Yes! I can afford bike
Total amount spent is 1000
Notice that there's no error message if you do this wrong; you just get code that doesn't work. Be careful!
In CS 304, we will be avoiding global variables, so the issues in this section will not arise often, if ever.
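One way to avoid the global altogether, sketched here under my own assumptions, is to pass the running total in as an argument and return the new total. The names spend3 and BUDGET_MAX are hypothetical and are not part of budget.py.

```python
BUDGET_MAX = 1200   # hypothetical constant, in dollars per month

def spend3(total, item, amt, limit=BUDGET_MAX):
    """Returns the new total after buying the item, or the old
    total unchanged if the item would break the budget."""
    if total + amt < limit:
        print('Yes! I can afford %s' % item)
        return total + amt
    return total

total = 0
for item, amt in [('bike', 1000), ('iPad', 800), ('weekend getaway', 1100)]:
    total = spend3(total, item, amt)

print('Total amount spent is %d' % total)   # prints: Total amount spent is 1000
```

Because the function only reads its arguments and returns a value, there is no hidden state to get wrong, and the function is easy to test by itself.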
Stuff Not Covered in CS 111
The following sections cover things that you might not have seen in CS 111, so this is worth reading.
String Interpolation¶
In CS 111, we would typically use the + operator to construct a
string out of string constants and variables, like this:
msg = 'The value of x is ' + str(x) + ' and y is ' + str(y)
print(msg)
That works, but it can be cumbersome, particularly when the expression gets long. Another approach is to think about there being a long string constant with "placeholders" within it. The placeholders can be replaced by the value of variables. This process is often called string interpolation.
(Python's string interpolation was upgraded in Python 2.6 with the
format method, and again in Python 3.6 with "f-strings". We'll
ignore f-strings for now, focussing on the format method.)
Each placeholder can be given a name, surrounded by braces. It's the
braces that make it a placeholder. The format method looks for the
placeholders and replaces them with values that are given in its
arguments. Each name is replaced by the corresponding value.
Here are some short examples. There's just a single template string,
with several uses of the .format() method on that template.
>>> template = 'the value of x is {x} and y is {y}'
>>> template.format(x=3, y=4)
'the value of x is 3 and y is 4'
>>> template.format(x=7, y=42)
'the value of x is 7 and y is 42'
>>> a = 9
>>> b = 11
>>> template.format(x=a, y=b)
'the value of x is 9 and y is 11'
The return value is always a new string, with the template remaining unchanged.
Here's another example using a Monty Python catchphrase.
>>> catchphrase = 'This is {adjective1}. Now for {noun1} completely {adjective2}'
>>> catchphrase
'This is {adjective1}. Now for {noun1} completely {adjective2}'
>>> catchphrase.format(adjective1='silly', noun1='something', adjective2='different')
'This is silly. Now for something completely different'
>>> catchphrase.format(adjective1='stupid', noun1='a topic', adjective2='smart')
'This is stupid. Now for a topic completely smart'
One thing that's interesting and nice about the format method is
that you don't have to give the replacement values in the same order
as the placeholders.
>>> catchphrase.format(adjective1='silly',adjective2='different',noun1='something')
'This is silly. Now for something completely different'
>>> template.format(y=7, x=4)
'the value of x is 4 and y is 7'
You can also use the values more than once:
>>> tmpl = '{x}^2 is {x} times {x} or {sqr}'
>>> tmpl.format(x=3, sqr=9)
'3^2 is 3 times 3 or 9'
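A few more things the format method can do, as a short sketch: the values can come from a dictionary (using Python's standard ** unpacking), placeholders can be numbered instead of named, and a placeholder can carry a format specification such as .2f for two decimal places. The variable names here are just for illustration.

```python
template = 'the value of x is {x} and y is {y}'

# The values can come from a dictionary, unpacked with **
values = {'x': 3, 'y': 4}
from_dict = template.format(**values)
print(from_dict)    # prints: the value of x is 3 and y is 4

# Placeholders can be numbered; the numbers refer to argument positions
numbered = '{0} and {1} and {0}'.format('spam', 'eggs')
print(numbered)     # prints: spam and eggs and spam

# A format specification after a colon controls how the value is shown
price = 'total: ${amt:.2f}'.format(amt=3.14159)
print(price)        # prints: total: $3.14
```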
Executable Python Scripts and Modules¶
You can put a bunch of Python code, including function definitions and such, into a file and run it. Look at now_v1.py:
from datetime import datetime
now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S")) # like the internet standard
You can run it from the shell as follows:
$ python now_v1.py
That's a useful way to write our main programs.
Modules in Python¶
We've also seen that we can import functions and other useful stuff
from files into the Python environment, as when we imported some
functions from the math module.
It's smart to write your Python code in a modular way, grouping related functions in a file that can then be imported into other parts of the program and into the main program. (You did a lot with importing code in CS 230.)
Look at now_v2.py:
from datetime import datetime

def now():
    """Returns a string for the current day and time
    the format YYYY-MM-DD HH:MM:SS is like the internet format"""
    now = datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S") # like the internet standard
That file demonstrates how we could write a module.
Here's how we could use that module:
$ python
>>> import now_v2
>>> now_v2.now()
'2022-02-17'
>>> print(now_v2.now())
2022-02-17
But now the file doesn't work as a shell script. Can we do both? Amazingly, the answer is yes!
Here's the trick. You can put an if statement in your file that
checks to see if the __name__ variable has the value '__main__'. If
it does, the file is being run as a script, rather than being loaded
as a module.
Therefore,
- you can put your module-like function definitions above that line, and
- you put your script-like code to run below that line, indented within the conditional.
Look at now_v3.py:
from datetime import datetime

def now():
    """Returns a string for the current day and time
    the format YYYY-MM-DD HH:MM:SS is like the internet format"""
    now = datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S") # like the internet standard

# the following code is only executed if this file is invoked from the
# command line as a script, rather than loaded as a module.
if __name__ == '__main__':
    print(now())
And as a script, we can run it like this:
$ python now_v3.py
2022-02-17
As a module, it works just like now_v2.py:
$ python
>>> import now_v3
>>> now_v3.now()
'2022-02-17'
>>> print(now_v3.now())
2022-02-17
We will use this trick a lot in CS 304!
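A common variation on this trick, sketched below, is to put the script-like code in a function of its own (conventionally called main) and call it from the conditional. The file name greet.py and both function names are hypothetical, chosen just for this example.

```python
# greet.py: a hypothetical file that works both as a module and as a script

def greeting(name):
    """Returns a friendly greeting for the given name"""
    return 'Hello, {name}!'.format(name=name)

def main():
    """Script-like code lives here, so importing the file runs nothing"""
    print(greeting('world'))

# Only true when the file is run as a script, not when it is imported
if __name__ == '__main__':
    main()
```

Running the file as a script prints the greeting, while importing it just defines greeting and main for the importer to call.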
PyDoc¶
To get the documentation on a Python module, including one you write
yourself, you can use the pydoc shell command:
$ pydoc mathfuns
That prints the documentation for mathfuns right to your screen. (You can also set up pydoc as a web server, which is very cool.)
Of course, the documentation that Pydoc gives you comes from the author of the module, and when you write Python code, you shoulder the responsibility of documenting what you create.
Give every function a meaningful documentation string. Write the kind of documentation you'd like to read if you wanted to know how to use the function. The string goes (in triple-quotes, which allows for multiple lines) as the first element of the function definition.
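As a sketch of what such a documentation string might look like, here is a function with a multi-line docstring that says what it returns, gives an example, and mentions the error case. This version of mean, which takes a list, is hypothetical and is not from mathfuns.py; inspect.getdoc is a standard-library function that returns the docstring with its indentation cleaned up.

```python
import inspect

def mean(numbers):
    """Returns the arithmetic mean of a non-empty list of numbers.

    Example: mean([1, 2, 3, 4]) returns 2.5.
    Raises ValueError if the list is empty."""
    if len(numbers) == 0:
        raise ValueError('mean() of an empty list')
    return sum(numbers) / len(numbers)

# Read the documentation back, much as a documentation tool would
doc = inspect.getdoc(mean)
print(doc)

result = mean([1, 2, 3, 4])
print(result)   # prints: 2.5
```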
Some additional guidelines and information:
Programming Methodology¶
You've probably noticed that I haven't covered Object-Oriented Programming (OOP) for either PHP or Python. OOP is new, modern, better, and we all should use it, right? Both languages have OOP, and you're welcome to use it, but we won't (necessarily) be using it. Why?
- OOP is for controlling complexity, and we don't have that kind of complexity.
- Objects are for modeling entities with state and behavior, and that doesn't fit the kind of information processing we're doing: we're doing "filtering" or "transformation" processing.
However, we can and should use procedural modularity and abstraction.
That said, if you come across an aspect of your coding that you feel would be improved by using OOP, please ask! I'd be glad to help you with it.