Threads
Threads are not specific to this class or web applications. They are an important part of many kinds of computer systems. The Wikipedia article on threads is a good place to start learning more.
However, threads are relevant to this course because a deployed Flask application is multi-threaded. Therefore, it's important that we know how to write thread-safe code, meaning that the code can safely be run with multiple threads.
At the end of this reading, there are some specific recommendations for writing code that is thread safe. Your Flask project applications must be thread-safe.
Thread Applications¶
Consider a web browser, which is a kind of computer program. You launch Firefox, say, and, after a while, a window appears on your desktop and you can start browsing the web. That's a process: a process is a program that is running, so it has memory, with data structures allocated in that memory (such as the HTML and CSS of the page you're viewing) and so forth.
You open up a second tab and load a second web page in it. (The browser could still be a single process, doing this single thing.) That web page is loading kinda slowly, though, so you leave it loading while you return to the first tab, scroll down a bit, mouse over a menu, which changes color in response to the location of your mouse, and stuff like that. Meanwhile, the second tab finishes loading, which you notice because your tab bar changes appearance.
Your web browser is doing (at least) two things at once: paying attention to you, and loading a page. This is only possible with threads.
Threads in the Server-Side of Web Applications¶
We've seen how threads can be useful in web browsers, but what about on the server side? In web technologies like PHP and CGI, scripts get read from disk, parsed and byte-compiled, run, and then finish. They are more ephemeral than a mayfly: they live for a few milliseconds, and then die. If we make 100 web requests, 100 processes are born, run, and die. If they are doing something relatively simple, there's a lot of overhead for not much accomplishment. Here's a way to picture it:
So, what's wrong with that? First of all, it's expensive to invoke a program, particularly a PHP/Perl/Python program:
- A separate process has to be created. That, by itself, is somewhat expensive compared to a thread. The jargon for this, by the way, is forking, because the operating system call that creates a new process is fork.
- The PHP/Perl/Python program may have to be read from disk (if it's not cached). If the disk has to turn a jillionth of an inch, that takes forever. Heaven help us if the disk arm has to move.
- The PHP/Perl/Python program has to be byte-compiled. Every single time (barring caching).
A second problem is that the external program typically has to be stateless: since it dies at the end of each transaction, if it wants to remember anything across invocations it has to write it to files or databases. (We noticed this problem and addressed it with sessions.)
There are many partial solutions to these problems. For example, the mod_python module is an extension to Apache that essentially builds in a Python interpreter. A deployed Flask application uses this built-in Python, so a deployed Flask application will be multi-threaded. That's important for us and for how we code our Flask apps.
Multi-Threaded Flask¶
A multi-threaded deployed Flask application might look like this:
(Note that I haven't pictured the handoff of the request from Apache to Flask, but that's an irrelevant and minor complication. Another wrinkle is that a new thread is created for each request, up to some configured limit of concurrent threads, and I thought that reusing thread1 better illustrated that constraint.)
This particular Flask app is configured with a maximum of two threads, but it could easily be configured to have many more.
Each request gets handled by a thread. That thread devotes itself entirely to that request until the request is finished and a response is sent to the browser. In the picture above, the first request was assigned to thread 2. The second request goes to thread 1, which finishes quickly, and so request 3 also goes to thread 1. And so on.
If no thread is available for a request, the request either waits or gets lost. So, you want to configure enough threads to handle the anticipated demand. Or you can dynamically add more threads, which is what Apache does.
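For a concrete sense of where that thread limit comes from, here is a minimal sketch using the standalone waitress WSGI server. This is not how our Apache deployment is set up; it's just an illustration of telling a server how many worker threads to use, with the host, port, and thread count as placeholder values.

```python
# Sketch only: serving a Flask app with a fixed pool of worker threads.
# The waitress server is used here purely for illustration; our Apache-based
# deployment configures its thread limit differently.
from flask import Flask
from waitress import serve

app = Flask(__name__)

@app.route('/')
def index():
    return 'hello from one of the worker threads'

if __name__ == '__main__':
    # At most 2 requests are handled concurrently; extra requests wait in a queue.
    serve(app, host='0.0.0.0', port=8080, threads=2)
```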
So, we need to understand threads a bit more.
Threads to the Rescue¶
One important solution (certainly not the only possible solution) is to avoid forking (creating new processes) by using threads.
What are threads? As you know, a program (whether written in Java, Python, PHP, JavaScript or any other language) executes one line at a time from some starting point (such as the main method). Function and method calls change the location where the program is, but there's always exactly one place where the program is.
(You also know that, in real life, the high-level language is translated into a compiled form, and it's that compiled code that executes one instruction at a time, but that's an unimportant difference for this discussion. It's sufficient to think of a program executing one line at a time.)
In a course on computer organization and assembly language, you learn that the location of the program is the program counter, which is a special register on the processor that holds the memory address where the next instruction will be fetched from. (If that sentence made no sense to you, take CS 240 or talk to me.) In that course or some other, you learn that during a method call, the previous location (value of the program counter) is kept on a stack, so that the program can resume the calling method at the correct location when the method is done. Using a stack allows recursion.

This might be called the control flow of your program, and the sequence of locations might be called the thread of control. Imagine Theseus executing your code, trailing the thread that Ariadne gave him as he does so. That thread marks the sequence of locations that your program traverses as it executes.
Now, imagine that you can have two or even more threads of control at once! Your program can be in more than one place at the same time. However, the rules for control flow are exactly the same for each thread of control.
To implement this, the virtual machine really only needs two things for each thread: a program counter and a stack. Note that the stack is where function arguments and local variables are stored, so arguments and local variables are always thread-safe.
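Here is a minimal sketch (plain Python threading, nothing Flask-specific) of two threads running the same function. Each call's arguments and local variables live on that thread's own stack, so the threads cannot interfere with each other's values. The function and thread names are invented for illustration.

```python
# Sketch: each thread runs count_up with its own stack, so 'name', 'n', and
# 'total' are private to that thread.
import threading

def count_up(name, n):
    total = 0                      # local variable: one copy per thread's stack
    for i in range(n):
        total += i
    print(f"{name} computed {total}")

t1 = threading.Thread(target=count_up, args=("thread-1", 10))
t2 = threading.Thread(target=count_up, args=("thread-2", 1000))
t1.start()
t2.start()
t1.join()
t2.join()
```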
Processes vs. Threads¶
The main distinction in this discussion is between processes and threads. Processes are completely separate and protected from one another, while threads share memory. In the figure about Apache CGI requests, each CGI program is a separate process. If we zoom in on the memory representation of a process, it might look like this:
After the request finishes, all that stuff is discarded. When the next request comes in, it gets created and discarded again.
In contrast, creating a thread doesn't require as much. Here's a similar picture of the memory representation after creating a second thread:
In fact, we can re-write our Flask diagram earlier like this:
So, our two threads are sharing the code and heap from our Flask app.
Thread Properties¶
Threads are:
- Cheap, since there is not much overhead for them. That is, it's easy to allocate a program counter and a stack.
- Cheap to switch between: context switches (changing which thread is executing) are also cheap, since the same program is being executed, so there's no swapping of pages in and out of memory or other safeguards that the OS puts in place to protect one process from another.
- Risky, since one thread can modify a shared data structure and inadvertently (or intentionally) mess up other threads. This is what we saw with transactions.
- Risky, since one errant thread can bring down the whole program.
If the threads are cooperating on solving a problem, multi-threaded programming can be complex and difficult, and debugging can be a nightmare, since more than one thing is happening at once. On the other hand, some problems are much better solved with multi-threading, and there are enormous opportunities for parallelism to speed up execution.
Alternatively, when the threads are essentially separate, each solving an equivalent problem using the shared resources of code and memory, the programming need not be much more complicated than single-thread programming, and can also yield speedups from parallelism. You have to be careful about shared data, but hopefully that can be minimized.
In Flask applications, and indeed in any multi-threaded web application (including Apache), each thread will be handling an HTTP request (a GET or POST request), and will be essentially independent. Thus, we can get better performance at a relatively small cost. We'll also see that having long-lived data in memory can be useful, where by long-lived we mean that it outlives the particular HTTP request.
(In development mode, Flask uses only one thread. That means, by the way, that if a request takes N milliseconds to complete, no other request can even be attended to for those N milliseconds. However, deployed Flask applications can use multiple threads, so we need to learn how to program with that in mind.)
Apache is Multi-Threaded¶
Now, the web server software itself, such as Apache, is multi-threaded. The structure of Apache is to have a listener thread that does nothing but listen on the port, grab the incoming web requests, and put them on a work queue. Various worker threads then grab requests off the queue and actually do the work, whether it's reading a file off the disk and sending it to the browser, or something more complicated like reading and executing a PHP script, or creating a separate process for a CGI script.
A deployed Flask app works similarly, except that Apache gets the requests and puts the ones for Flask onto a Flask-only queue, from which the Flask worker threads get them.
Producer/Consumer¶
An abstraction of the way that Apache/Flask works is the producer/consumer problem, where some threads (the listener thread) produce stuff (work) and other threads (the workers) consume the stuff. We'll see an example of that in Python later, but we first have to learn how threads work in Python.
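Just to give a flavor of the pattern before the (optional) full treatment, here is a bare-bones sketch using Python's thread-safe queue module; the "request" strings and worker names are placeholders for real work.

```python
# Sketch of producer/consumer: a producer puts work on a thread-safe queue,
# and worker threads take items off and handle them.
import queue
import threading

work = queue.Queue()

def producer():
    for i in range(5):
        work.put(f"request {i}")   # stand-in for an incoming web request

def worker(name):
    while True:
        item = work.get()
        if item is None:           # sentinel value: no more work
            break
        print(f"{name} handling {item}")

workers = [threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(2)]
for w in workers:
    w.start()
producer()
for _ in workers:
    work.put(None)                 # one sentinel per worker so each one stops
for w in workers:
    w.join()
```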
Since we will not be creating threads in CS 304, I've made this part of the reading optional and moved it to this page on producer-consumer in Python. I encourage you to read it and/or watch the video demonstrations, but I understand that you might not have time to focus on that right now.
The important ideas to get out of that reading are:
- If you have a global data structure in your Flask app,
- it will be shared among the different threads (requests), and so
- you should ensure that its use is thread-safe by using
  - locking,
  - conditions, or
  - thread-safe Python libraries like queue
You probably will not need to have a shared global data structure in your projects, but in case you do, I want you to be prepared.
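Here is a minimal sketch of what that preparation looks like: a shared, updatable global in a Flask app, protected by a lock. The route and the visit_counts dictionary are invented for illustration; a thread-safe structure from the queue module would be another option.

```python
# Sketch: a global, updatable data structure in a Flask app, guarded by a lock
# so that concurrent requests (threads) cannot corrupt it.
import threading
from flask import Flask

app = Flask(__name__)

visit_counts = {}                     # shared among all request threads
visit_counts_lock = threading.Lock()

@app.route('/page/<name>')
def page(name):
    with visit_counts_lock:           # only one thread updates at a time
        visit_counts[name] = visit_counts.get(name, 0) + 1
        count = visit_counts[name]
    return f"{name} has been visited {count} times"
```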
Now let's turn to other aspects of concurrency.
MySQL is Multi-Threaded¶
MySQL obviously allows concurrent access, which is why locks were necessary, as we learned last time. Threads allow multiple concurrent connections to the database. If you are curious, you can use the status command to find out how many threads are currently running on the server.
This is a good time to remind you of an important feature of MySQL. When you insert a row into a table with an auto_increment column, like an ID, what value just got inserted? It's not necessarily the largest value in the table, because that might belong to a row that was just inserted by some other thread, moments after yours. The answer is that MySQL keeps track of the last inserted ID on a per-connection basis. You can find that out using the special function last_insert_id().
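As a sketch of what that looks like in code (using pymysql as a stand-in for whatever MySQL library we use in class, with an invented table and made-up credentials), note that the SELECT runs on the same connection as the INSERT:

```python
# Sketch: insert a row and then ask *this connection* which auto_increment ID
# it just generated. Table, column, and credentials are invented.
import pymysql   # or whichever MySQL library the course uses

conn = pymysql.connect(host='localhost', user='me',
                       password='secret', database='pets_db')
curs = conn.cursor()
curs.execute('insert into pet(name) values (%s)', ['Fluffy'])
conn.commit()
curs.execute('select last_insert_id()')
(new_id,) = curs.fetchone()   # the ID generated by *our* insert, on *our* connection
print('inserted pet with id', new_id)
```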
MySQL and Flask¶
Because Flask might be multi-threaded, you don't want two threads to use the same MySQL connection, because then we might have exactly the trouble with auto_increment that we were trying to avoid by using last_insert_id().
Therefore, each Flask request should get its own database connection. Yes, that's wasteful when they could be re-used, but it's better to be a little wasteful than to risk subtle threading bugs. (In a serious web application, we would probably have a pool of connection objects, and grab one from the pool instead of having to connect, and return it to the pool when our request is finished.)
As an analogy, imagine that I started a MySQL shell and then allowed each of you to walk up to my machine and run queries and updates. Thus, everyone shares the same connection (mine). Mostly, that would work fine. But, if Abby walks up and does an insert into an auto_increment table, and Betty walks up and does likewise, and then Abby asks for the last_insert_id(), you can see that she'll get Betty's value, not her own. That's why each request should get its own database connection.
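Here is one common way to arrange that in Flask, again sketched with pymysql as a stand-in for our course's library: open a connection lazily the first time a request needs it, keep it in Flask's per-request g object, and close it when the request ends. The helper names and connection details are placeholders.

```python
# Sketch: give every request its own database connection, stored in flask.g,
# which is private to the current request (and therefore to its thread).
import pymysql
from flask import Flask, g

app = Flask(__name__)

def get_db():
    if 'db' not in g:
        # connection details are placeholders
        g.db = pymysql.connect(host='localhost', user='me',
                               password='secret', database='pets_db')
    return g.db

@app.teardown_appcontext
def close_db(exception):
    db = g.pop('db', None)
    if db is not None:
        db.close()

@app.route('/pets')
def pets():
    curs = get_db().cursor()
    curs.execute('select name from pet')
    return ', '.join(row[0] for row in curs.fetchall())
```

Because g is specific to the current request, no two requests ever share a connection, so last_insert_id() behaves the way we want.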
Furthermore, in Flask we should avoid global variables, because global variables are shared among threads and therefore are not thread-safe. (Thread-safe here means that only one thread has access to the value, and so it can be modified without any kind of locking.) However, a read-only global variable is fine. It's global variables that get updated (like the global counter in raceto50.py and the work queue in producer-consumer) that are problematic.
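In the spirit of the raceto50.py example (this is a stand-alone sketch, not that file), here is how an updated global goes wrong: the read and the write of the shared counter are separate steps, so two threads can interleave between them and lose updates, unless the whole update is protected by a lock.

```python
# Sketch: two threads update a shared global. Updates get lost unless the
# read-modify-write is protected by a lock.
import threading
import time

counter = 0
lock = threading.Lock()

def unsafe_increment(times):
    global counter
    for _ in range(times):
        current = counter        # read the shared value
        time.sleep(0)            # let other threads run right here
        counter = current + 1    # write back a possibly stale value

def safe_increment(times):
    global counter
    for _ in range(times):
        with lock:               # the read-modify-write happens as one unit
            counter = counter + 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print('without a lock:', run(unsafe_increment))   # typically well under 2000
print('with a lock:   ', run(safe_increment))     # always 2000
```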
Thread Safety Checklist¶
Now that we've learned about threads and the notion of code being thread-safe, we can require that our Flask apps be thread-safe. What does that mean?
- Global variables are either read-only or have access properly controlled (such as using locks).
- Each request gets its own database connection, so that different requests use different connections. That means that connection-specific functions like last_insert_id() will work correctly.
- Any request that does multiple SQL operations needs to consider whether those operations produce "race conditions". For example, the common pattern of (a) checking if something is in a table and (b) inserting it if it is not, is not thread-safe: two threads executing that sequence at once can cause problems. This was described in the reading on transactions; a sketch of one fix appears below.
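Here is a hedged sketch of one way to make the check-then-insert pattern safe: declare a uniqueness constraint in the schema and let a single INSERT statement handle the already-there case. The fan table, its columns, and the function name are all invented for illustration; your schema will differ.

```python
# Sketch: instead of "SELECT to check, then INSERT if absent" (two steps that
# another request's thread can interleave with), let the database enforce
# uniqueness and do the whole thing in one statement.
def rsvp(conn, user_id, concert_id):
    """Record that user_id is attending concert_id, safely, even if two
    requests try to do so at the same time. All names here are invented."""
    curs = conn.cursor()
    # Assumes the table was created with a uniqueness constraint, e.g.
    #   create table fan (uid int, concert int, unique (uid, concert));
    curs.execute('''insert into fan (uid, concert) values (%s, %s)
                    on duplicate key update uid = uid''',
                 [user_id, concert_id])
    conn.commit()
```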
Conclusions¶
Threads are pretty cool. They
- Allow the server to save computation time because it is cheaper to start a thread than to compile and run a program.
- Allow us to retain values between web requests.
- Allow us to have long-running computations that are easy to interact with.
Use last_insert_id() in MySQL to find out the auto_increment value that was generated in your connection.
In Flask, avoid global variables, though read-only ones are fine (for constants and configuration variables and such). Updates to global variables, if any, need to be synchronized using locks. (Our Flask apps will only rarely need globals of this sort.) Also, each request should get its own database connection, rather than sharing a single global database connection.