Threads
Threads are not specific to this class or web applications. They are an important part of many kinds of computer systems. The Wikipedia article on threads is a good place to start learning more.
However, at the end of this reading, there are some specific recommendations for writing code that is thread safe. Your Flask project applications must be thread-safe.
- Thread Applications
- Threads in the Server-Side of Web Applications
- Threads to the Rescue
- Processes Vs Threads
- Apache is Multi-Threaded
- Producer/Consumer
- Threads in Python
- Synchronization
- Producer/Consumer
- Examples/Demos
- Daemonic Threads
- Producer-Consumer with Locking
- Producer-Consumer with Conditions
- Producer-Consumer with Queue
- Example Summary
- Videos
- MySQL is Multi-Threaded
- MySQL and Flask
- Thread Safety Checklist
- Conclusions
Thread Applications¶
Consider a web browser, which is a kind of computer program. You launch Firefox, say, and, after a while, a window appears on your desktop and you can start browsing the web. That's a process: a process is a program that is running, so it has memory, with data structures allocated in that memory (such as the HTML and CSS of the page you're viewing) and so forth.
You open up a second tab and load a second web page in it. (The browser could still be a single process, doing this single thing.) That web page is loading kinda slowly, though, so you leave it loading while you return to the first tab, scroll down a bit, mouse over a menu, which changes color in response to the location of your mouse, and stuff like that. Meanwhile, the second tab finishes loading, which you notice because your tab bar changes appearance.
Your web browser is doing (at least) two things at once: paying attention to you, and loading a page. This is only possible with threads.
Threads in the Server-Side of Web Applications¶
We've seen how threads can be useful in web browsers, but what about on the server side? In web technologies like PHP and in CGI, scripts get read from disk, parsed and byte-compiled, run, and then finish. They are more ephemeral than a mayfly: they live for a few milliseconds, and then die. If we make 100 web requests, 100 processes are born, run and die. If they are doing something relatively simple, there's a lot of overhead for not much accomplishment. Here's a way to picture it:
So, what's wrong with that? First of all, it's expensive to invoke a CGI program, particularly a PHP/Perl/Python program:
- A separate process has to be created. That, by itself, is somewhat expensive, compared to a thread. The jargon for this, by the way, is forking, because the operating system call that creates a new process is fork.
- The PHP/Perl/Python program may have to be read from disk (if it's not cached). If the disk has to turn a jillionth of an inch, that takes forever. Heaven help us if the disk arm has to move.
- The PHP/Perl/Python program has to be byte-compiled. Every single time (barring caching).
A second problem is that the CGI program typically has to be stateless: since it dies at the end of each transaction, if it wants to remember anything across invocations, it has to write that information to files or databases. (We noticed this problem and addressed it with sessions.)
There are many partial solutions to these problems. For example, the mod_python module is an extension to Apache that essentially builds in a Python interpreter. A deployed Flask application uses a similar built-in interpreter (these days typically via mod_wsgi), so a deployed Flask application will be multi-threaded. That's important for us and for how we code our Flask apps.
Threads to the Rescue¶
One important solution (certainly not the only possible solution) is to avoid forking (creating new processes) by using threads.
What are threads? As you know, a program (whether written in Java,
Python, PHP, JavaScript or any other language) executes one line at a
time from some starting point (typically the main
method). Method calls and such change the location
where the
program is,
but there's always exactly one place where the
program is.
(You also know that, in real life, the high-level language is translated into a compiled form, and it's that compiled code that executes one line at a time, but that's an unimportant difference for this discussion. It's sufficient to think of a program executing one line at a time.)
In a course on computer organization and assembly language, you learn
that the location
of the program is the program counter,
which is a special register on the processor that holds the memory
address where the next instruction will be fetched from. (If that
sentence made no sense to you, take CS 240 or talk to me.) In that
course or some other, you learn that during a method call, the
previous location (value of the program counter) is kept on a stack,
so that the program can resume the calling method at the correct
location when the method is done. Using a stack allows recursion.
This might be called the control flow
of your program and the
sequence of locations might be called the thread of control. Imagine
Theseus executing your code, trailing the thread that Ariadne gave him
as he does so. That thread marks the sequence of locations that your
program traverses as it executes.
Now, imagine that you can have two or even more threads of control at once! Your program can be in more than one place at the same time. However, the rules for control flow are exactly the same for each thread of control.
To implement this, the virtual machine really only needs two things for each thread, a program counter and a stack.
Processes Vs Threads¶
The main distinction in this discussion is between processes and threads. Processes are completely separate and protected from one another, while threads share memory. In the figure about Apache CGI requests, each CGI program is a separate process. If we zoom in on the memory representation of a process, it might look like this:
After the CGI request finishes, all that stuff is discarded. When the next request comes in, it gets created and discarded again.
In contrast, creating a thread doesn't require as much. Here's a similar picture of the memory representation after creating a second thread:
Threads are:
- Cheap to create, since there is not much overhead for them. That is, it's easy to allocate a program counter and a stack.
- Cheap to switch between: a context switch (changing which thread is executing) stays within the same program, so there's no swapping pages in and out of memory and none of the other safeguards that the OS puts in place to protect one process from another.
- Risky, since one thread can modify a shared data structure and inadvertently (or intentionally) mess up other threads. This is what we saw with transactions.
- Risky, since one errant thread can bring down the whole program.
If the threads are cooperating on solving a problem, multi-threaded programming can be complex and difficult, and debugging can be a nightmare, since more than one thing is happening at once. On the other hand, some problems are much better solved with multi-threading, and there are enormous opportunities for parallelism to speed up execution.
Alternatively, when the threads are essentially separate, each solving an equivalent problem using the shared resources of code and memory, the programming need not be much more complicated than single-thread programming, and can also yield speedups from parallelism. You have to be careful about shared data, but hopefully that can be minimized.
In development mode, Flask uses only one thread. (That means, by the way, that if a request takes N milliseconds to complete, no other request can even get attended to for N milliseconds.) However, deployed Flask applications can use multiple threads, so we need to learn how to program with that in mind.
In Flask applications, and indeed in any multi-threaded web
application (including Apache), each thread will be handling an HTTP
request (a GET or POST request), and will be essentially
independent. Thus, we can get better performance at a relatively small
cost. We'll also see that having long-lived data in memory can be
useful, where by long-lived
we mean that it outlives the
particular HTTP request.
Apache is Multi-Threaded¶
Now, the web server software itself, such as Apache, is
multi-threaded. The structure of Apache is to have a listener
thread that does nothing but listen on the port, grab the incoming web
requests, and put them on a work
queue. Various worker
threads then grab requests off the queue and actually do the work,
whether it's reading a file off the disk and sending it to the
browser, or something more complicated like reading and executing a
PHP script, or creating a separate process for a CGI script.
A deployed Flask app works similarly, except that Apache gets the requests and puts the ones for Flask onto a Flask-only queue, from which the Flask worker threads get them.
Producer/Consumer¶
An abstraction of the way that Apache/Flask works is the
producer/consumer
problem, where some threads (the listener
thread) produce stuff (work) and other threads (the workers) consume
the stuff. We'll see an example of that in Python later, but we first
have to learn how threads work in Python.
Threads in Python¶
For those of you unfamiliar with thread programming in Python, a few notes:
- Any object can be threaded; it just has to inherit from the Thread class.
- It has to implement a run() method. This method is closely analogous to main(), in that it is the start of a sequence of execution steps (an entry point).
- You start a thread by first making an instance of the runnable class; call it runner. You then invoke the .start() method.
- If some other thread wants to wait for the runner to end, it can invoke the .join() method on the other thread object (the runner).
Here's a good place to start learning more about threads in Python.
Here's the documentation on Python Threading.
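Putting those pieces together, here is a minimal sketch (the Greeter class and its messages are made up just for illustration):

from threading import Thread
import time

class Greeter(Thread):
    # a runnable class: it inherits from Thread and implements run()
    def run(self):
        # run() is this thread's entry point, analogous to main()
        for i in range(3):
            print('hello from', self.name)
            time.sleep(0.5)

runner = Greeter()    # make an instance of the runnable class
runner.start()        # start it: run() begins executing in a new thread
runner.join()         # the main thread waits here until the runner is done
print('runner finished')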
Synchronization¶
The key concern with using threads is the worry that one thread will modify a shared data structure while another thread is using it, so that the result is inconsistent or even disastrous (following null pointers and the like). We saw these issues with transactions as well.
One of the main tools to avoid problems is to make sure that we lock any shared data structures whenever we use them and release the lock when we are done. This is exactly the same idea as locking tables in MySQL transactions.
Here's an excerpt of the code we saw that incremented a global counter without locking:
from threading import Thread

# define a global variable
some_var = 0

class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable and then increment it
        global some_var
        read_value = some_var
        print("some_var in {} before increment is {}"
              .format(self.name, read_value))
        some_var = read_value + 1
        print("some_var in {} after increment is {}"
              .format(self.name, some_var))
Here it is with the accesses to the global counter synchronized using locks:
from threading import Thread, Lock

# define a global, shared lock
lock = Lock()                # <------------------- create lock

# define a global variable
some_var = 0

class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable and then increment it
        lock.acquire()       # <------------------- acquire it
        global some_var
        read_value = some_var
        print("some_var in {} before increment is {}"
              .format(self.name, read_value))
        some_var = read_value + 1
        print("some_var in {} after increment is {}"
              .format(self.name, some_var))
        lock.release()       # <------------------- release it
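A short driver for this demo might look like the following sketch (the thread count of 50 is arbitrary). Without the lock, the increments can interleave and the final count can come up short; with the lock, it is always exactly the number of threads.

# sketch of a driver for the IncrementThread example above
threads = [IncrementThread() for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()    # wait for every increment to finish
print('final value of some_var is', some_var)   # 50 when the increments are safe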
Next, we'll look at more sophisticated examples of threads, including the producer-consumer problem.
Producer/Consumer¶
The Producer/Consumer problem is a useful abstraction of many real-world problems of threading. Indeed, it's pretty close to what Apache and Flask do:
- The main thread listens to the network (say on ports 80 and 443), and when a request comes in, it puts the request on a work queue and returns to listening. Thus, it's a producer.
- The other threads wait for work to be added to the work queue, and when there's something to do, they do it (respond to the request). Thus, they are consumers.
There are some useful tutorials about the Producer/Consumer problem online.
The next few sections will discuss three different solutions to the producer/consumer problem in Python. These are important concepts and techniques for your knowledge of computer science, but you will probably not use any of these coding techniques in your projects, so you don't have to dig deeply into the code. Read for concepts.
Examples/Demos¶
These examples are in the threads
folder:
cd ~/cs304
cp -r ~cs304/pub/downloads/threads threads
cd threads
Daemonic Threads¶
The Python threading API defines two kinds of threads: daemons and non-daemons. A Python program ends when all of its non-daemon threads are done. So, if a non-daemon thread runs forever, which such threads often do, it will never finish and your program becomes hard to exit.
Therefore, in the examples below, I go to a little extra effort to
make the threads daemons so that we can easily kill the
program. This is easily done with the setDaemon
method:
t1 = MyThread()
t1.setDaemon(True)    # equivalently, t1.daemon = True in current Python
t1.start()
If we don't set the thread to be a daemon, the Python program becomes
hard to kill. ^C
, which usually works, doesn't. Instead, we have to
^Z
to suspend the process and then kill it:
$ python daemon.py N
starting thread
thread started. Waiting for you to type something
hi
jklsd
die!
^C
^C
^Z
[1]+ Stopped python daemon.py N
$ jobs
[1]+ Stopped python daemon.py N
$ kill %1
Note that killing a thread is usually the wrong thing. What if it's holding a resource or doing something to a data structure when it gets killed? What if it's halfway through a balance transfer in our banking/babysitting app? Killing it could leave that data structure in a broken state or fail an audit.
Usually, it's better to set up a communication, possibly using a shared variable, that requests that the thread exit. Something like this:
from threading import Thread
import time

keepGoing = None

class play(Thread):
    def run(self):
        global keepGoing
        keepGoing = True
        while keepGoing:
            time.sleep(1)
            print(' play')
        print(' darn it!')

def main():
    global keepGoing
    kid = play()
    print('go and play')
    kid.start()
    time.sleep(10)
    print('time to stop!')
    keepGoing = False

main()
The code above is in frolic.py. So, the moral is: avoid killing a thread; have it exit gracefully.
Producer-Consumer with Locking¶
Our first producer-consumer example is
producer_consumer_lock.py
. Here's an excerpt from the code. The lines
with comments are the important ones.
from threading import Thread, Lock
import random
import time

work = []       # shared data structure ===
lock = Lock()   # create lock =============

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            lock.acquire()       # acquire it ==============
            work.append(num)
            print('Produced ', num, 'work is ', work)
            lock.release()       # release it ==============
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            lock.acquire()       # acquire it ==============
            if len(work) == 0:
                num = -1
            else:
                num = work.pop()
            print('Consumed ', num, 'work is ', work)
            lock.release()       # release it =============
            time.sleep(random.random())
Observations:
- The code prints the contents of the work queue whenever anything happens.
- If there's nothing to consume, the consumer thread just spins (loops endlessly), taking up CPU time doing nothing but checking whether there's something to do. This is a busy-wait. Busy-waits are usually bad because they waste the CPU.
- It would be better for the consumer to go to sleep, and have the producer wake it up when there's work to be done.
Producer-Consumer with Conditions¶
Conditions are objects similar to locks, except that they have a list of threads waiting on that condition, and a thread can add itself to the list. When the condition that they are waiting for happens, the threads can be notified (woken up). That avoids the problem of busy-wait; instead the thread just sleeps until there is work to do.
This version of producer-consumer uses a Condition; the code is in the same threads folder. Here's an excerpt from the code. The lines with comments are the important ones.
from threading import Thread, Condition
import random
import time

work = []                 # shared data structure
condition = Condition()   # create condition

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            condition.acquire()     # acquire it
            work.append(num)
            print('Produced ', num, 'work is ', work)
            condition.notify()      # --- new! -------
            condition.release()     # release it
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            condition.acquire()     # acquire it
            if len(work) == 0:      # with several consumers, a while loop is safer here
                condition.wait()    # --- new! -------
            num = work.pop()
            print('Consumed ', num, 'work is ', work)
            condition.release()     # release it
            time.sleep(random.random())
Producer-Consumer with Queue¶
The Queue object encapsulates the blocking and the notification; see the Python documentation for Queue. The code works like the condition version, but is simpler; it is also in the threads folder. Here's an excerpt from the code. The lines with comments are the important ones.
import queue
import random
import time
from threading import Thread

work = queue.Queue()     # ----- new data structure ----

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            work.put(num)            # ---- automatically synchronized
            print('Produced ', num, 'work len ', work.qsize())
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            num = work.get(True)     # block if work queue empty ------
            print('Consumed ', num, 'work len ', work.qsize())
            time.sleep(random.random())
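Any of the three versions can be driven the same way. Here is a sketch of a driver that starts one producer and one consumer as daemon threads (so the program is easy to exit, as discussed above) and lets them run for a while:

# sketch of a driver, assuming one of the ProducerThread/ConsumerThread
# pairs defined above (and its imports) is in the same file
producer = ProducerThread()
consumer = ConsumerThread()
for t in (producer, consumer):
    t.setDaemon(True)    # daemon threads, so the program can exit easily
    t.start()
time.sleep(10)           # let them produce and consume for ten seconds, then exit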
Example Summary¶
Let's summarize our producer-consumer examples before turning to our last topics.
- Threads should not be killed; they should be coded so they can exit gracefully.
- Access to shared data structures should be managed for thread safety.
- Locks can give exclusive access to a shared data structure.
- Conditions also give exclusive access to a shared data structure, but avoid busy-waiting.
- The Python library has thread-safe data structures, such as Queue.
Videos¶
I have created videos of these demos in action. See the videos page.
MySQL is Multi-Threaded¶
MySQL obviously allows concurrent access, which is why locks were
necessary. Threads allow multiple concurrent connections to the
database. You can use the status
command to find out how many
threads are currently running on the server.
This is a good time to remind you of an important feature of
MySQL. When you insert a row into a table with an auto_increment
column, like an ID, what value just got inserted? It's not necessarily
the largest value in the table, because that might belong to a row
that was just inserted in some other thread, moments after yours.
The answer is that MySQL keeps track of the last inserted ID on a per-connection basis. You can find that out using the special function last_insert_id().
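For example, using the pymysql connector (a sketch: the connection parameters and the person table with an auto_increment id are hypothetical), the inserted ID comes back on the same connection that did the insert:

import pymysql

# hypothetical connection parameters and table
conn = pymysql.connect(host='localhost', user='me', password='secret',
                       database='mydb')
curs = conn.cursor()
curs.execute('INSERT INTO person(name) VALUES (%s)', ['Harry'])
conn.commit()
curs.execute('SELECT last_insert_id()')
print('new id is', curs.fetchone()[0])   # the ID generated on *this* connection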
MySQL and Flask¶
Because Flask might be multi-threaded, you don't want two threads to use the same MySQL connection: if they did, last_insert_id() could report an ID inserted by the other thread, which is exactly the auto_increment trouble we were trying to avoid.
Therefore, each Flask request should get its own database connection. Yes, that's wasteful when they could be re-used, but it's better to be a little wasteful than to risk subtle threading bugs. (In a serious web application, we would probably have a pool of connection objects, and grab one from the pool instead of having to connect, returning it to the pool when our request is finished.)
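One common way to arrange that in Flask is sketched below (assuming the pymysql connector; the connection parameters are placeholders): each request lazily opens its own connection, stores it in the request-local object flask.g, and closes it when the request's application context is torn down.

from flask import Flask, g
import pymysql

app = Flask(__name__)

def get_conn():
    # one connection per request, cached in the request-local object g
    if 'conn' not in g:
        g.conn = pymysql.connect(host='localhost', user='me',
                                 password='secret', database='mydb')
    return g.conn

@app.teardown_appcontext
def close_conn(exception):
    # runs at the end of every request: close that request's connection
    conn = g.pop('conn', None)
    if conn is not None:
        conn.close()

A route handler would then call get_conn() whenever it needs the database, instead of keeping a single global connection.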
As an analogy, imagine that I started a MySQL shell and then allowed
each of you to walk up to my machine and run queries and
updates. Thus, everyone shares the same connection (mine). Mostly,
that would work fine. But, if Abby walks up and does an insert into an
auto_increment
table, and Betty walks up and does likewise, and then
Abby asks for the last_insert_id()
, you can see that she'll get
Betty's value, not her own. That's why each request should get its own
database connection.
Furthermore, in Flask we should avoid global variables, because global variables are shared among threads and are therefore not thread-safe unless access to them is controlled. (A value that only one thread can access can be modified without any kind of locking.) However, a read-only global variable is fine. It's global variables that get updated (like the global counter in raceto50.py and the work queue in producer-consumer) that are problematic.
Thread Safety Checklist¶
Now that we've learned about threads and the notion of code being thread-safe, we can require that our Flask apps be thread-safe. What does that mean?
- Global variables are either read-only or have access properly controlled (such as using locks); see the sketch after this list.
- Each request gets its own database connection, so that different requests use different connections. That means that connection-specific functions like last_insert_id() will work correctly.
- Any request that does multiple SQL operations needs to consider whether those operations produce "race conditions". For example, the common pattern of (a) checking whether something is in a table and (b) inserting it if it is not, is not thread-safe: two threads executing that sequence at once can cause problems. This was described in the reading on transactions.
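As a sketch of the first item in the checklist (the route and the counter are made up for illustration), here is a global that gets updated only while holding a lock:

from threading import Lock
from flask import Flask

app = Flask(__name__)

visit_count = 0        # a global that gets updated, so access must be controlled
visit_lock = Lock()

@app.route('/visit/')
def visit():
    global visit_count
    with visit_lock:           # only one thread at a time updates the counter
        visit_count += 1
        count = visit_count
    return 'you are visitor number {}'.format(count)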
Conclusions¶
Threads are pretty cool. They
- Allow the server to save computation time because it is cheaper to start a thread than to compile and run a program.
- Allow us to retain values between web requests.
- Allow us to have long-running computations that are easy to interact with.
Use last_insert_id() in MySQL to find out the auto_increment value that was generated in your connection.
In Flask, avoid global variables, though read-only ones are fine (for constants, configuration variables, and such). Updates to global variables, if any, need to be synchronized using locks. (Our Flask apps will only rarely need globals of this sort.) Also, each request should get its own database connection, rather than sharing a single global database connection.