This document is a little out of date and needs to be re-written. It'll get you started, but there are probably better options out there. Talk to Scott if you want help. Note added on 2/22/2022.
Python is a very likeable language. Some things that I like about it:
There are lots of good tutorials for beginners online, so it would be foolish for me to try to write one. Here are some:
officialthan this. I would suggest sections 1–5.
Nevertheless, I'll give a few high-level observations.
Python is in active development, and new versions come out regularly. Two major versions are Python 2.x and Python 3.x. Python 2 is obsolete: it reached its end-of-life on January 1, 2020, so everyone should be using Python 3.
We have both Python 2 and 3 installed on Tempest. There are hard-to-remember incantations to choose between them, so as sysadmin, I implemented a script that lets you choose one temporarily in just that shell. (Actually a sub-shell.) See:
[cs304guest@tempest ~]$ wc-switch-python 2
[cs304guest@tempest ~]$ python -V
Python 2.7.16
[cs304guest@tempest ~]$ which python
/opt/rh/python27/root/usr/bin/python
[cs304guest@tempest ~]$ exit
exit
[cs304guest@tempest ~]$ wc-switch-python 3
[cs304guest@tempest ~]$ python -V
Python 3.6.3
[cs304guest@tempest ~]$ which python
/opt/rh/rh-python36/root/usr/bin/python
[cs304guest@tempest ~]$ exit
exit
[cs304guest@tempest ~]$
One of the best ways to learn Python is to type expressions into the Read-Eval-Print Loop (REPL) to see what they do. (Other sources will call this the interpreter, but you can have an interpreter without having this ability to type in expressions and have them evaluated and the results printed.) Just give the command python to your Linux or Mac shell, and start typing expressions. (Note: some people find it confusing that there is a program/command called python, but the program is the thing that understands the Python language and runs your programs.)
In these notes, commands that you're supposed to type are shown like this.
$ python
>>> a=3
>>> b=4
>>> a+b
7
>>> import math
>>> c = math.sqrt(a*a+b*b)
>>> c
5.0
>>> quit()
To exit, invoke the quit() function or type a control-D.
Python code looks like no other code that I'm familiar with. It's a complete departure from the C family of languages.
# Like shell scripts, it uses # as an end-of-line comment character
import math     # packages are "loaded" by the import statement
a = 3           # no need to declare types; very dynamic
b = 4
c = math.sqrt(a*a+b*b)
So far, not too bad. Let's see some syntax:
if a == b:
    print('a and b are the same')
else:
    print('a and b differ')
print("let's go on")
Hmm. Where are the parens around the condition and the braces around the blocks? Gone! Python knows that the else section is over because the indentation ends. That's right, the indentation has syntactic meaning in Python. So, the following two programs are different in Python.
i = 0
while i < 10:
    i += 1
print(i)

i = 0
while i < 10:
    i += 1
    print(i)

The first program prints only the last number, while the second prints every number, because its print call is inside the loop.
Functions in Python are simple: a name, a formal argument list (no datatypes), and a body. The end of the body is, as expected, signalled by the end of indentation.
def mean(a,b):
    return (a+b)/2
Even better is to add a string as the first line of the body. Later, we'll see a tool that will use these for self-documenting files:
def mean(a,b):
    "returns the arithmetic mean of the two numbers"
    return (a+b)/2
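One caution about mean, as a small sketch with made-up inputs: in Python 3 the / operator always does true division, so the result is a float, while // does floor division. (In Python 2, / on two integers truncated, which is one more reason to prefer Python 3.)

```python
def mean(a, b):
    "returns the arithmetic mean of the two numbers"
    return (a + b) / 2

print(mean(3, 4))     # true division gives 3.5
print(mean(2, 4))     # still a float: 3.0
print((3 + 4) // 2)   # floor division gives 3
```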
Here is a whole file of function definitions:
import math

def hypo(a,b):
    """Returns the length of the hypotenuse of a right triangle with the given legs"""
    # algorithm based on the Pythagorean theorem
    return math.sqrt(a*a+b*b)

def fibonacci(n):
    """Returns the nth Fibonacci number, for `n' a non-negative integer"""
    if type(n) != type(1) or n<0:
        raise Exception('bad argument to fibonacci')
    if n<2:
        return n
    else:
        # what a horrible algorithm!  Never do this!!
        return fibonacci(n-1)+fibonacci(n-2)

def gcd(a,b):
    """Returns the greatest common divisor of the two arguments.
    Example: gcd(9,8)=1, since 9 and 8 are relatively prime, but
    gcd(24,30)=6, since 6 divides both 24 and 30."""
    # This implementation is Dijkstra's method
    print("a is {a} and b is {b}".format(a=a,b=b))
    if a == b:
        return a
    elif a > b:
        return gcd(a-b,b)
    else:
        return gcd(a,b-a)

def triangular(max):
    """Generates a triangular list of lists up to the given max"""
    result = []
    # range() produces a sequence of integers; "for" iterates over sequences
    for n in range(max):
        result.append(list(range(n)))
    for elt in result:
        print(elt)

if __name__ == '__main__':
    if hypo(3,4) != 5:
        print('error in hypo: hypo(3,4) returns ',hypo(3,4))
    print('The first ten Fibonacci numbers are')
    for i in range(10):
        print(fibonacci(i),' ', end=' ')
    # this empty print call just ends the line of numbers
    print()
    print('testing gcd(20,45)')
    if gcd(20,45) != 5:
        print('error in gcd: gcd(20,45) returns ',gcd(20,45))
    print("here's a list of 5 lists")
    triangular(5)
You can try these by downloading the mathfuns.py python file, importing the contents into python, and running the functions:
$ python
>>> import mathfuns
>>> mathfuns.hypo(5,12)
13.0
>>> mathfuns.gcd(55,89)    # gcd also prints a trace of its arguments, omitted here
1
You can avoid the filename (which is also the name of the module) by importing particular members or all members:
>>> from math import sqrt
>>> sqrt(9)
3.0
>>> from mathfuns import *
>>> fibonacci(100)    # too long!
>>> gcd(30,50)
a is 30 and b is 50
a is 30 and b is 20
a is 10 and b is 20
a is 10 and b is 10
10
All my examples so far have been numeric, for no good reason but that numbers don't need much introduction. Let's look at some more interesting datatypes. To play with these test values, download this sampledata.py file.
Strings pretty much work as you expect. You can concatenate them with the + operator. You can take their length. You can print them.
>>> from sampledata import *
>>> x
'spam, '
>>> x+x
'spam, spam, '
>>> x+x+y+' and '+x
'spam, spam, eggs,  and spam, '
>>> x+x+y+'and '+x
'spam, spam, eggs, and spam, '
>>> len(x)
6
>>> len(x+y)
12
>>> print(x+y)
spam, eggs, 
You can substitute stuff into strings like the C printf function, using % and a letter code to indicate the type and format. The %d format is for decimal integers. Here is more info on the different string formatting operations.
>>> 'The sum of %d and %d is %d' % (3,4,3+4)
'The sum of 3 and 4 is 7'
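Here is a small sketch of a few other letter codes: %s substitutes a string, and %.2f formats a floating-point number with two digits after the decimal point. The cheese name, price, and weight are invented for the example.

```python
name = 'roquefort'
price = 12.5
grams = 250

# The values in the tuple are matched to the % codes from left to right.
line = '%s costs %.2f dollars for %d grams' % (name, price, grams)
print(line)   # roquefort costs 12.50 dollars for 250 grams
```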
Lists are denoted with square brackets with commas between the elements. You can index them numerically, and extract sub-lists. You can append stuff onto the end (actually, either end). You can store into them.
>>> from sampledata import *
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']
>>> len(cheeses)
6
>>> cheeses[0]
'swiss'
>>> cheeses[1:3]
['gruyere', 'cheddar']
>>> cheeses[1:4]
['gruyere', 'cheddar', 'stilton']
>>> cheeses.append('gouda')
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie', 'gouda']
>>> cheeses[0] = 'emmentaler'
>>> cheeses
['emmentaler', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie', 'gouda']
The append shows how to invoke a method on a list, and that lists are mutable, unlike tuples.
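As a sketch of adding at either end, using a shortened cheese list made up for this example: append adds at the back, insert with index 0 adds at the front, and pop removes and returns the last element.

```python
cheeses = ['swiss', 'gruyere', 'cheddar']

cheeses.append('gouda')      # add at the end
cheeses.insert(0, 'feta')    # add at the front (index 0)
last = cheeses.pop()         # remove and return the last element

print(cheeses)   # ['feta', 'swiss', 'gruyere', 'cheddar']
print(last)      # gouda
```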
Tuples are just like lists, except that they use parentheses instead of square brackets and they are immutable.
>>> from sampledata import *
>>> troupe
('Cleese', 'Palin', 'Idle', 'Chapman', 'Gilliam', 'Jones')
>>> len(troupe)
6
>>> troupe[0] = 'Homer'    # won't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>
One annoying fact about tuples is that parentheses have too much work to do in the language, since they also enclose expressions. For example, consider the following assignments:
x = (1+1)       # not a tuple
y = (2+2,3+3)   # a tuple of two values
z = (4+4,)      # a tuple of one value, because of the comma
The parentheses in the assignment to x are just for grouping (unnecessary here), so they don't produce a tuple, like the ones in the assignment to y. To get a tuple of one element, the trick in Python is to put a comma after the first and only element in the tuple.
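You can check this for yourself with a quick sketch. It repeats the three assignments and adds one more (w, my own addition) to show that it is really the comma, not the parentheses, that makes a tuple.

```python
x = (1+1)        # just the integer 2
y = (2+2, 3+3)   # a tuple of two values
z = (4+4,)       # a tuple of one value
w = 5, 6         # the comma alone is enough to build a tuple

for value in (x, y, z, w):
    print(type(value).__name__, value)
```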
Both lists and tuples are sequences, and as such can be easily iterated over, sometimes building new lists on the way:
>>> from sampledata import *
>>> cheeses
['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']
>>> for l in cheeses:
...     print(l)    # you have to indent this yourself
...                 # type a newline to indicate that you're done
swiss
gruyere
cheddar
stilton
roquefort
brie
>>> [ len(l) for l in cheeses ]
[5, 7, 7, 7, 9, 4]
>>> troupe
('Cleese', 'Palin', 'Idle', 'Chapman', 'Gilliam', 'Jones')
>>> [ len(x) for x in troupe ]
[6, 5, 4, 7, 7, 5]
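The bracketed form that builds a new list can also filter as it goes. A minimal sketch, with the cheese list typed in directly rather than imported from sampledata.py, and a length cutoff chosen arbitrarily:

```python
cheeses = ['swiss', 'gruyere', 'cheddar', 'stilton', 'roquefort', 'brie']

# Keep only the names longer than five letters, and uppercase them.
long_names = [c.upper() for c in cheeses if len(c) > 5]
print(long_names)   # ['GRUYERE', 'CHEDDAR', 'STILTON', 'ROQUEFORT']
```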
Like all civilized languages, Python has hashtables built-in, except that Python calls them dictionaries (like Smalltalk). You can store into dictionaries and iterate over them easily.
>>> from sampledata import *
>>> college
{'Jones': 'Oxford', 'Gilliam': 'Occidental', 'Cleese': 'Cambridge', 'Chapman': 'Cambridge', 'Idle': 'Cambridge', 'Palin': 'Oxford'}
>>> college['Palin']
'Oxford'
>>> college['Palin'] = 'Oxford University'
>>> college['Palin']
'Oxford University'
>>> list(college.keys())
['Jones', 'Gilliam', 'Cleese', 'Chapman', 'Idle', 'Palin']
>>> for k, v in college.items():
...     print(k, v)
...
Jones Oxford
Gilliam Occidental
Cleese Cambridge
Chapman Cambridge
Idle Cambridge
Palin Oxford University
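Two more dictionary operations come up constantly, so here is a sketch of them using a cut-down copy of the college data: the in operator tests whether a key is present, and get returns a default instead of raising KeyError when the key is missing. The default string 'unknown' is just my choice.

```python
college = {'Cleese': 'Cambridge', 'Palin': 'Oxford', 'Gilliam': 'Occidental'}

print('Palin' in college)               # True
print('Homer' in college)               # False
print(college.get('Homer', 'unknown'))  # unknown

# sorted() on a dictionary gives its keys in alphabetical order
for name in sorted(college):
    print(name, college[name])
```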
We've seen the usual while loop above, which is very normal. We've also seen how the for loop iterates over a list. What if you want to iterate over a series of numbers, like a C-style for loop? You can do that with the range() function, though I don't think you'll often have to. Other than in purely numeric code, most for loops iterate directly over some data structure rather than via numerical indices.
Nevertheless, here's an example from mathfuns.py:
def triangular(max):
    """Generates a triangular list of lists up to the given max"""
    result = []
    # range() produces a sequence of integers; "for" iterates over sequences
    for n in range(max):
        result.append(list(range(n)))
    for elt in result:
        print(elt)
>>> from mathfuns import triangular
>>> triangular(10)
[]
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
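A short sketch of the other forms of range(), plus enumerate(), which is usually what you want when you need both an index and the element. The particular numbers and the two-cheese list are arbitrary.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 10, 3)))   # start, stop, step: [2, 5, 8]

# enumerate() pairs each element with its index, so no range(len(...)) needed
numbered = []
for i, cheese in enumerate(['swiss', 'brie']):
    numbered.append('%d: %s' % (i, cheese))
print(numbered)                # ['0: swiss', '1: brie']
```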
Variables are created when they are assigned; no declaration needed. If the variable is created inside a function, the variable is local to that function. If the variable is created outside a function, it is global (to that file/module).
Function code can refer to (get values from) a global variable that already exists, but if it wants to assign to the global variable rather than create a new local variable, it must use the global keyword. This is the same as with PHP.
The following code has two global variables: one to keep track of how much we've spent (SpendingTotal) and one that is the limit on our spending (BudgetMax). The code also has two functions to spend money, spend1 and spend2. Finally, there's a function called testspending that takes one of our spending functions as its argument and invokes it on each of the items we'd like to buy, printing the total spent when it's done.
BudgetMax = 1200    # in dollars per month
SpendingTotal = 0   # in dollars

def spend1(item,amt):
    SpendingTotal = 0   # this is a local, not the global
    if SpendingTotal + amt < BudgetMax:
        print('Yes! I can afford %s' % item)
        SpendingTotal += amt   # doesn't do what you think it does

def spend2(item,amt):
    global SpendingTotal   # says that I want to modify the global
    if SpendingTotal + amt < BudgetMax:
        print('Yes! I can afford %s' % item)
        SpendingTotal += amt

def testspending(fun):
    fun('bike',1000)
    fun('iPad',800)
    fun('weekend getaway',1100)
    fun('charitable giving',900)
    print('Total amount spent is %d' % SpendingTotal)

print("Let's see how the first function works")
testspending(spend1)
print()
print("Now, let's try the second function")
testspending(spend2)
Let's run the script. Notice that we are running this script from the command line, as if it were a Java program.
$ python budget.py
Let's see how the first function works
Yes! I can afford bike
Yes! I can afford iPad
Yes! I can afford weekend getaway
Yes! I can afford charitable giving
Total amount spent is 0
Now, let's try the second function
Yes! I can afford bike
Total amount spent is 1000
Notice that there's no error message if you do this wrong; you just get code that doesn't work. Be careful!
Python's string interpolation was upgraded in Python 2.6, and the new version is quite different from Perl and PHP, but it's really nice.
The idea is that the substitution places in the string have names, and the format method takes keyword arguments that map those names to values. Each name is replaced by the corresponding value.
>>> from sampledata import *
>>> catchphrase
'This is {adjective1}. Now for {noun1} completely {adjective2}'
>>> catchphrase.format(adjective1='silly',noun1='something',adjective2='different')
'This is silly. Now for something completely different'
>>> catchphrase.format(adjective1='silly',adjective2='different',noun1='something')
'This is silly. Now for something completely different'
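If the values are already in a dictionary, you can hand the whole thing to format with the ** syntax, which unpacks the dictionary into keyword arguments. A sketch, with the catchphrase typed in directly and a dictionary named words that I made up:

```python
catchphrase = 'This is {adjective1}. Now for {noun1} completely {adjective2}'
words = {'adjective1': 'silly', 'noun1': 'something', 'adjective2': 'different'}

# **words is equivalent to writing adjective1='silly', noun1='something', ...
result = catchphrase.format(**words)
print(result)

# Places can also be numbered instead of named
print('{0}, {0}, {1} and {0}'.format('spam', 'eggs'))
```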
You can put a bunch of Python code, including function definitions and such, into a file and run it. Look at now_v1.py:
from datetime import datetime

now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S"))   # like the internet standard
You can run it from the shell as follows:
$ python now_v1.py
We've also seen that we can import functions and other useful stuff from files into the Python environment. It's smart to write your Python code so that you can use it that way, invoking its functions from other Python code. Look at now_v2.py:
from datetime import datetime

def now():
    """Returns a string for the current day and time
    the format YYYY-MM-DD HH:MM:SS is like the internet format"""
    now = datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S")   # like the internet standard
Here's how we could use it:
$ python
>>> import now_v2
>>> now_v2.now()
'2019-02-17'
>>> print(now_v2.now())
2019-02-17
But now it doesn't work as a shell script. Can we do both? Yes, there's a trick! You can put an if statement in your script that checks to see if the __name__ variable has the value '__main__'. If it does, the file is being run as a script, rather than being loaded as a module.
Look at now_v3.py:
from datetime import datetime

def now():
    """Returns a string for the current day and time
    the format YYYY-MM-DD HH:MM:SS is like the internet format"""
    now = datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S")   # like the internet standard

# the following code is only executed if this file is invoked from the
# command line as a script, rather than loaded as a module.
if __name__ == '__main__':
    print(now())
And as a script:
$ python now_v3.py
2019-02-17
One of the things that any program might want to do is to accept input from the user by taking arguments on the command line. In Java, you did this via an array of strings that was an argument to main:
public static void main(String args[]) { ...
You can do the same thing in Python, except that it's accessed slightly differently. Here's a script that echoes back the arguments you give it, but yells them (prints them in uppercase):
#!/usr/bin/python

import sys

def yell(strings):
    for s in strings:
        print(s.upper(), end=' ')   # stay on the same line
    print()

if __name__ == '__main__':
    args = sys.argv[1:]   # argv[0] is the command name, so skip that
    argc = len(args)
    print("Got {n} arguments.  Yelling them back".format(n=argc))
    yell(args)
As you can see, the command-line arguments are in a variable named argv that is in the sys module, so you have to import that module. The zeroth element of that list is the command-name itself, so here we skip that when we copy the argument list, to make the args variable more like the Java one.
The variable argc (for argument count) is a traditional name for the number of arguments, inherited from old C programs.
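One thing to keep in mind is that every command-line argument arrives as a string, so numbers have to be converted. A sketch with a hypothetical function add_args and a simulated argument list, so you can try it in the REPL without a real command line:

```python
import sys

def add_args(argv):
    """Returns the sum of the numeric arguments after the command name"""
    numbers = [float(s) for s in argv[1:]]   # skip argv[0], convert the rest
    return sum(numbers)

# Simulated command line, as if the user had typed: python add.py 3 4.5
print(add_args(['add.py', '3', '4.5']))      # 7.5

# In a real script you would call add_args(sys.argv) instead.
```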
To get the documentation on a Python module, including one you write yourself, you can use the pydoc shell command:
$ pydoc mathfuns
produces the following documentation for mathfuns right to your screen. (You can also set up pydoc as a web server, which is very cool.)
Of course, the documentation that Pydoc gives you comes from the author of the module, and when you write Python code, you shoulder the responsibility of documenting what you create.
Give every function a meaningful documentation string. Write the kind of documentation you'd like to read if you wanted to know how to use the function. The string goes (in triple-quotes, which allows for multiple lines) as the first element of the function definition.
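As a sketch of what that looks like, here is hypo from mathfuns.py with a slightly longer documentation string (the example line in it is my addition). The string is stored on the function as __doc__, which is where help() in the REPL and pydoc find it.

```python
import math

def hypo(a, b):
    """Returns the length of the hypotenuse of a right triangle
    with the given legs.

    Example: hypo(3, 4) is 5.0"""
    return math.sqrt(a*a + b*b)

# Print the documentation string directly; help(hypo) shows the same text.
print(hypo.__doc__)
```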
Some additional guidelines and information:
Python is a language that is byte-compiled and interpreted on the fly, like Perl, PHP, JavaScript (but not Java) and many other scripting languages. It is dynamically typed, with types on objects rather than on variables. It is mostly lexically scoped. Objects are allocated from the heap; almost nothing is allocated on the stack, so you can return objects from functions.
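A brief sketch of two of these points, with names (first, second, make_list) invented for the example: the same variable can refer to objects of different types at different times, and an object created inside a function lives on after the function returns.

```python
x = 3
first = type(x).__name__     # the type belongs to the object 3
x = 'three'                  # same variable, now bound to a str object
second = type(x).__name__
print(first, second)         # int str

def make_list(n):
    """Builds a new list object and returns it to the caller"""
    squares = [i * i for i in range(n)]
    return squares           # the list outlives the function call

print(make_list(4))          # [0, 1, 4, 9]
```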
You've probably noticed that I haven't covered Object-Oriented Programming (OOP) for either PHP or Python. OOP is new, modern, better, and we all should use it, right? Both languages have OOP, and you're welcome to use it, but we won't (necessarily) be using it. Why?
However, we can and should use procedural modularity and abstraction.
That said, if you come across an aspect of your coding that you feel would be improved by using OOP, please ask! I'd be glad to help you with it.