Threads

Threads are not specific to this class or web applications. They are an important part of many kinds of computer systems. The Wikipedia article on threads is a good place to start learning more.

However, threads are relevant to this course because a deployed Flask application is multi-threaded. Therefore, it's important that we know how to write thread-safe code, meaning that the code can safely be run with multiple threads.

At the end of this reading, there are some specific recommendations for writing code that is thread safe. Your Flask project applications must be thread-safe.

Thread Applications

Consider a web browser, which is a kind of computer program. You launch Firefox, say, and, after a while, a window appears on your desktop and you can start browsing the web. That's a process: a process is a program that is running, so it has memory, with data structures allocated in that memory (such as the HTML and CSS of the page you're viewing) and so forth.

You open up a second tab and load a second web page in it. (The browser could still be a single process, doing this single thing.) That web page is loading kinda slowly, though, so you leave it loading while you return to the first tab, scroll down a bit, mouse over a menu, which changes color in response to the location of your mouse, and stuff like that. Meanwhile, the second tab finishes loading, which you notice because your tab bar changes appearance.

Your web browser is doing (at least) two things at once: paying attention to you, and loading a page. Threads are one common way to make this possible.

Threads in the Server-Side of Web Applications

We've seen how threads can be useful in web browsers, but what about on the server side? In web technologies like PHP and in CGI, scripts get read from disk, parsed and byte-compiled, run, and then finish. They are more ephemeral than a mayfly: they live for a few milliseconds, and then die. If we make 100 web requests, 100 processes are born, run and die. If they are doing something relatively simple, there's a lot of overhead for not much accomplishment. Here's a way to picture it:

Each request results in a new process being created

So, what's wrong with that? First of all, it's expensive to invoke a program, particularly a PHP/Perl/Python program:

  • A separate process has to be created. That, by itself, is somewhat expensive compared to creating a thread. The jargon for this, by the way, is forking, because the operating system call that creates a new process is fork.
  • The PHP/Perl/Python program may have to be read from disk (if it's not cached). If the disk has to turn a jillionth of an inch, that takes forever. Heaven help us if the disk arm has to move.
  • The PHP/Perl/Python program has to be byte-compiled. Every single time (barring caching).

A second problem is that the external program typically has to be stateless: since it dies at the end of each transaction, if it wants to remember anything across invocations, it has to write it to files or databases. (We noticed this problem and addressed it with sessions.)

There are many partial solutions to these problems. For example, the mod_python module is an extension to Apache that essentially builds in a Python interpreter. A deployed Flask application uses this built-in Python, so a deployed Flask application will be multi-threaded. That's important for us and how we code our Flask apps.

Multi-Threaded Flask

A multi-threaded deployed Flask application might look like this:

multi-threaded Flask with two threads

(Note that I haven't pictured the handoff of the request from Apache to Flask, but that's an irrelevant and minor complication. Another wrinkle is that a new thread is created for each request, up to some configured limit of concurrent threads, and I thought that reusing thread1 better illustrated that constraint.)

This particular Flask app is configured with a maximum of two threads, but it could easily be configured to have many more.

Each request gets handled by a thread. That thread devotes itself entirely to that request until the request is finished and a response is sent to the browser. In the picture above, the first request was assigned to thread 2. The second request goes to thread 1, which finishes quickly, and so request 3 also goes to thread 1. And so on.

If no thread is available for a request, the request either waits or gets lost. So, you want to configure enough threads to handle the anticipated demand. Or you can dynamically add more threads, which is what Apache does.
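
Here's a minimal sketch of that idea using Python's standard library rather than Flask or Apache. The handle_request function and the "worker" thread names are made up for illustration, and this particular pool reuses its two threads and makes extra requests wait in line (it doesn't lose them), which may differ from how your deployment is configured.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id):
    """Hypothetical request handler: pretend to work, then report which thread ran it."""
    time.sleep(0.01)
    return request_id, threading.current_thread().name

# At most two worker threads, like the two-thread app described above.
# Six requests arrive; the extras wait until a thread is free.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker") as pool:
    results = list(pool.map(handle_request, range(6)))

threads_used = {name for _, name in results}
for request_id, name in results:
    print(f"request {request_id} handled by {name}")
```

Which thread handles which request can vary from run to run, but no more than two distinct thread names will ever appear.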

So, we need to understand threads a bit more.

Threads to the Rescue

One important solution (certainly not the only possible solution) is to avoid forking (creating new processes) by using threads.

What are threads? As you know, a program (whether written in Java, Python, PHP, JavaScript or any other language) executes one line at a time from some starting point (such as the main method). Function calls and method calls and such change the location where the program is, but there's always exactly one place where the program is.

(You also know that, in real life, the high-level language is translated into a compiled form and it's that compiled code that executes one instruction at a time, but that's an unimportant difference for this discussion. It's sufficient to think of a program executing one line at a time.)

In a course on computer organization and assembly language, you learn that the location of the program is the program counter, which is a special register on the processor that holds the memory address where the next instruction will be fetched from. (If that sentence made no sense to you, take CS 240 or talk to me.) In that course or some other, you learn that during a method call, the previous location (value of the program counter) is kept on a stack, so that the program can resume the calling method at the correct location when the method is done. Using a stack allows recursion.

This might be called the control flow of your program and the sequence of locations might be called the thread of control. Imagine Theseus executing your code, trailing the thread that Ariadne gave him as he does so. That thread marks the sequence of locations that your program traverses as it executes.

Now, imagine that you can have two or even more threads of control at once! Your program can be in more than one place at the same time. However, the rules for control flow are exactly the same for each thread of control.

To implement this, the virtual machine really only needs two things for each thread, a program counter and a stack. Note that the stack is where function arguments and local variables of functions are stored, so arguments and local variables are always thread-safe.
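
To make that concrete, here's a small sketch (the function name sum_to is just for illustration). Three threads run the same function at the same time, and each one has its own n, total, and i on its own stack, so they can't interfere with one another.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_to(n):
    # n, total, and i are an argument and local variables, so every
    # thread running this function gets its own private copies.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# Run the same function in three threads at once.
with ThreadPoolExecutor(max_workers=3) as pool:
    totals = list(pool.map(sum_to, [10, 100, 1000]))

print(totals)  # [55, 5050, 500500]
```

One caveat: the local name is private to the thread, but if it refers to an object that other threads can also reach (say, a global list), that object is still shared.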

Processes Vs Threads

The main distinction in this discussion is between processes and threads. Processes are completely separate and protected from one another, while threads share memory. In the figure about Apache CGI requests, each CGI program is a separate process. If we zoom in on the memory representation of a process, it might look like this:

A process in memory has code, a heap, and a stack

After the request finishes, all that stuff is discarded. When the next request comes in, it gets created and discarded again.

In contrast, creating a thread doesn't require as much. Here's a similar picture of the memory representation after creating a second thread:

A process with two threads that share code and heap, but have their own stack.

In fact, we can re-write our Flask diagram earlier like this:

multi-threaded Flask with two threads

So, our two threads are sharing the code and heap from our Flask app.

Thread Properties

Threads are:

  • Cheap, since there is not much overhead for them. That is, it's easy to allocate a program counter and a stack.
  • Cheap to switch between. Context switches (changing which thread is executing) are inexpensive because the same program is being executed, so the OS doesn't have to swap pages in and out of memory or apply the other safeguards it uses to protect one process from another.
  • Risky, since one thread can modify a shared data structure and inadvertently (or intentionally) mess up other threads. This is what we saw with transactions.
  • Risky, since one errant thread can bring down the whole program.
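
Here's a sketch of the first risk, the "lost update". Real races depend on timing and are hard to reproduce, so this example cheats: it uses a threading.Barrier to force the unlucky interleaving, where both threads read the shared counter before either one writes it back. The names counter and unsafe_increment are made up for illustration.

```python
import threading

counter = 0                      # shared by all threads
barrier = threading.Barrier(2)   # only here to force the bad timing

def unsafe_increment():
    global counter
    seen = counter       # step 1: read the shared value
    barrier.wait()       # wait until the other thread has also read it
    counter = seen + 1   # step 2: write back a value based on a stale read

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 1, even though two increments ran
```

Without the barrier you would usually get 2, which is exactly what makes this kind of bug so hard to find.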

If the threads are cooperating on solving a problem, multi-threaded programming can be complex and difficult, and debugging can be a nightmare, since more than one thing is happening at once. On the other hand, some problems are much better solved with multi-threading, and there are enormous opportunities for parallelism to speed up execution.

Alternatively, when the threads are essentially separate, each solving an equivalent problem using the shared resources of code and memory, the programming need not be much more complicated than single-thread programming, and can also yield speedups from parallelism. You have to be careful about shared data, but hopefully that can be minimized.

In Flask applications, and indeed in any multi-threaded web application (including Apache), each thread will be handling an HTTP request (a GET or POST request), and will be essentially independent. Thus, we can get better performance at a relatively small cost. We'll also see that having long-lived data in memory can be useful, where by long-lived we mean that it outlives the particular HTTP request.

(In development mode, Flask uses only one thread. That means, by the way, that if a request takes N milliseconds to complete, no other request can even get attended to for N milliseconds. However, deployed Flask applications can use multiple threads, so we need to learn how to program with that in mind.)

Apache is Multi-Threaded

Now, the web server software itself, such as Apache, is multi-threaded. The structure of Apache is to have a listener thread that does nothing but listen on the port, grab the incoming web requests, and put them on a work queue. Various worker threads then grab requests off the queue and actually do the work, whether it's reading a file off the disk and sending it to the browser, or something more complicated like reading and executing a PHP script, or creating a separate process for a CGI script.

Apache with listener thread, work queue of pending requests, and four worker threads

A deployed Flask app works similarly, except that Apache gets the requests and puts the ones for Flask onto a Flask-only queue, from which the Flask worker threads get them.

Producer/Consumer

An abstraction of the way that Apache/Flask works is the producer/consumer problem, where some threads (the listener thread) produce stuff (work) and other threads (the workers) consume the stuff. We'll see an example of that in Python later, but we first have to learn how threads work in Python.
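
In the meantime, here's a rough sketch of the pattern using the thread-safe queue module from Python's standard library. This is not how Apache or Flask is actually implemented. The names listener, worker, and STOP are made up for illustration, and "handling" a request here just means doubling its number.

```python
import queue
import threading

NUM_WORKERS = 3
STOP = object()              # sentinel that tells a worker to quit
work_queue = queue.Queue()   # pending requests; Queue does its own locking
done = queue.Queue()         # finished work

def listener(num_requests):
    """Producer: put requests on the queue, then one STOP per worker."""
    for request_id in range(num_requests):
        work_queue.put(request_id)
    for _ in range(NUM_WORKERS):
        work_queue.put(STOP)

def worker():
    """Consumer: take requests off the queue until told to stop."""
    while True:
        item = work_queue.get()   # blocks until something is available
        if item is STOP:
            break
        done.put(item * 2)        # stand-in for real work

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
producer = threading.Thread(target=listener, args=(10,))
producer.start()
producer.join()
for w in workers:
    w.join()

handled = sorted(done.get() for _ in range(done.qsize()))
print(handled)
```

The results are sorted at the end because the order in which workers finish is not predictable.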

Since we will not be creating threads in CS 304, I've made this part of the reading optional and moved it to this page on producer-consumer in Python. I encourage you to read it and/or watch the video demonstrations, but I understand that you might not have time to focus on that right now.

The important ideas to get out of that reading are:

  • If you have a global data structure in your Flask app,
  • it will be shared among the different threads (requests) and so,
  • you should ensure that its use is thread-safe by using
    • locking,
    • conditions, or
    • thread-safe Python libraries like queue

You probably will not need to have a shared global data structure in your projects, but in case you do, I want you to be prepared.
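
Just in case, here's a minimal sketch of the locking option. The names hit_count and record_hits are hypothetical. In a Flask app, the global would live at module level and the update would happen inside a route function.

```python
import threading

hit_count = 0                       # global shared by every thread
hit_count_lock = threading.Lock()   # guards every update to hit_count

def record_hits(n):
    global hit_count
    for _ in range(n):
        # Only one thread at a time can be inside this block, so the
        # read-modify-write of hit_count can't be interleaved.
        with hit_count_lock:
            hit_count += 1

threads = [threading.Thread(target=record_hits, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(hit_count)  # 40000
```

The rule of thumb is that every piece of code that updates the global has to use the same lock. One forgotten spot is enough to bring the race back.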

Now let's turn to other aspects of concurrency.

MySQL is Multi-Threaded

MySQL obviously allows concurrent access, which is why locks were necessary, as we learned last time. Threads allow multiple concurrent connections to the database. If you are curious, you can use the status command to find out how many threads are currently running on the server.

This is a good time to remind you of an important feature of MySQL. When you insert a row into a table with an auto_increment column, like an ID, what value just got inserted? It's not necessarily the largest value in the table, because that might belong to a row that was just inserted in some other thread, moments after yours.

The answer is that MySQL keeps track of the last inserted ID on a per connection basis. You can find that out using the special function last_insert_id().
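
Here's a sketch of that per-connection behavior. To keep it runnable without a MySQL server, it uses SQLite (from Python's standard library) as a stand-in. SQLite's function is named last_insert_rowid() rather than last_insert_id(), but it is likewise tracked per connection. The table and connection names are made up.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")

    # Two separate connections to the same database.
    conn_a = sqlite3.connect(db_path)
    conn_b = sqlite3.connect(db_path)

    conn_a.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
    conn_a.commit()

    # Connection A inserts first; connection B inserts moments later.
    conn_a.execute("INSERT INTO person (name) VALUES ('first')")
    conn_a.commit()
    conn_b.execute("INSERT INTO person (name) VALUES ('second')")
    conn_b.commit()

    # Each connection remembers its own most recent insert.
    id_a = conn_a.execute("SELECT last_insert_rowid()").fetchone()[0]
    id_b = conn_b.execute("SELECT last_insert_rowid()").fetchone()[0]
    max_id = conn_a.execute("SELECT max(id) FROM person").fetchone()[0]

    conn_a.close()
    conn_b.close()

print(id_a, id_b, max_id)  # 1 2 2
```

Notice that connection A still gets 1, even though the largest ID in the table is now 2.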

MySQL and Flask

Because Flask might be multi-threaded, you don't want two threads to use the same MySQL connection, because then we might have exactly the trouble with auto_increment that we were trying to avoid by using last_insert_id().

Therefore, each Flask request should get its own database connection. Yes, that's wasteful when they could be re-used, but it's better to be a little wasteful than to risk subtle threading bugs. (In a serious web application, we would probably have a pool of connection objects, and grab one from the pool instead of having to connect, and return it to the pool when our request is finished.)

As an analogy, imagine that I started a MySQL shell and then allowed each of you to walk up to my machine and run queries and updates. Thus, everyone shares the same connection (mine). Mostly, that would work fine. But, if Abby walks up and does an insert into an auto_increment table, and Betty walks up and does likewise, and then Abby asks for the last_insert_id(), you can see that she'll get Betty's value, not her own. That's why each request should get its own database connection.

Furthermore, in Flask we should avoid global variables, because global variables are shared among threads and therefore are not thread-safe. (Here, a value is thread-safe when only one thread has access to it, so it can be modified without any kind of locking.) However, a read-only global variable is fine. It's global variables that get updated (like the global counter in raceto50.py and the work queue in producer-consumer) that are problematic.

Thread Safety Checklist

Now that we've learned about threads and the notion of code being thread-safe, we can require that our Flask apps be thread-safe. What does that mean?

  1. Global variables are either read-only or have access properly controlled (such as using locks).
  2. Each request gets its own database connection, so that different requests use different connections. That means that connection-specific functions like last_insert_id() will work correctly.
  3. Any request that does multiple SQL operations needs to consider whether those operations produce "race conditions". For example, the common pattern of (a) checking if something is in a table and (b) inserting it if it is not, is not thread-safe: two threads executing that sequence at once can cause problems. This was described in the reading on transactions.
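
For item 3, one possible approach (not necessarily the one from the transactions reading) is to skip the separate check and let the database enforce uniqueness, so that the check and the insert happen as a single step. The sketch below assumes the column has a uniqueness constraint and uses SQLite as a stand-in for MySQL. The table and the claim_username function are hypothetical. With a MySQL driver you would catch that driver's integrity error instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRIMARY KEY makes the database itself reject duplicate names.
conn.execute("CREATE TABLE username (name TEXT PRIMARY KEY)")

def claim_username(conn, name):
    """Try to insert name. Return True if we got it, False if it was taken."""
    try:
        # No separate "is it there?" query: just insert and see what happens.
        conn.execute("INSERT INTO username (name) VALUES (?)", (name,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False

first = claim_username(conn, "wendy")
second = claim_username(conn, "wendy")
row_count = conn.execute("SELECT count(*) FROM username").fetchone()[0]
conn.close()

print(first, second, row_count)  # True False 1
```

Because the database decides who wins, there is no window between the check and the insert for another thread to sneak into.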

Conclusions

Threads are pretty cool. They

  • Allow the server to save computation time because it is cheaper to start a thread than to compile and run a program.
  • Allow us to retain values between web requests.
  • Allow us to have long-running computations that are easy to interact with.

Use last_insert_id() in MySQL to find out the auto_increment value that was generated in your connection.

In Flask, avoid global variables, though read-only ones are fine (for constants and configuration variables and such). Updates to global variables, if any, need to be synchronized using locks. (Our Flask apps will only rarely need globals of this sort.) Also, each request should get its own database connection, rather than sharing a single global database connection.