Threads

Threads are not specific to this class or web applications. They are an important part of many kinds of computer systems. The Wikipedia article on threads is a good place to start learning more.

However, at the end of this reading, there are some specific recommendations for writing code that is thread safe. Your Flask project applications must be thread-safe.

Thread Applications

Consider a web browser, which is a kind of computer program. You launch Firefox, say, and, after a while, a window appears on your desktop and you can start browsing the web. That's a process: a process is a program that is running, so it has memory, with data structures allocated in that memory (such as the HTML and CSS of the page you're viewing) and so forth.

You open up a second tab and load a second web page in it. (The browser could still be a single process, doing this single thing.) That web page is loading kinda slowly, though, so you leave it loading while you return to the first tab, scroll down a bit, mouse over a menu, which changes color in response to the location of your mouse, and stuff like that. Meanwhile, the second tab finishes loading, which you notice because your tab bar changes appearance.

Your web browser is doing (at least) two things at once: paying attention to you, and loading a page. Threads are what make this kind of concurrency possible.

Threads in the Server-Side of Web Applications

We've seen how threads can be useful in web browsers, but what about on the server side? In web technologies like PHP and in CGI, scripts get read from disk, parsed and byte-compiled, run, and then finish. They are more ephemeral than a mayfly: they live for a few milliseconds, and then die. If we make 100 web requests, 100 processes are born, run and die. If they are doing something relatively simple, there's a lot of overhead for not much accomplishment. Here's a way to picture it:

Each CGI request results in a new process being created

So, what's wrong with that? First of all, it's expensive to invoke a CGI program, particularly a PHP/Perl/Python program:

  • A separate process has to be created. That, by itself, is somewhat expensive compared to a thread. The jargon for this, by the way, is forking, because the operating system call that creates a new process is fork.
  • The PHP/Perl/Python program may have to be read from disk (if it's not cached). If the disk has to turn a jillionth of an inch, that takes forever. Heaven help us if the disk arm has to move.
  • The PHP/Perl/Python program has to be byte-compiled. Every single time (barring caching).

A second problem is that the CGI program typically has to be stateless: since it dies at the end of each transaction, if it wants to remember anything across invocations it has to write it to files or databases. (We noticed this problem and addressed it with sessions.)

There are many partial solutions to these problems. For example, the mod_python module is an extension to Apache that essentially builds in a Python interpreter. A deployed Flask application uses this built-in Python, so a deployed Flask application will be multi-threaded. That's important for us and how we code our Flask apps.

Threads to the Rescue

One important solution (certainly not the only possible solution) is to avoid forking (creating new processes) by using threads.

What are threads? As you know, a program (whether written in Java, Python, PHP, JavaScript or any other language) executes one line at a time from some starting point (typically the main method). Method calls and such change the location where the program is, but there's always exactly one place where the program is.

(You also know that, in real life, the high-level language is translated into a compiled form, and it's that compiled code that executes one line at a time, but that's an unimportant difference for this discussion. It's sufficient to think of a program executing one line at a time.)

In a course on computer organization and assembly language, you learn that the location of the program is the program counter, which is a special register on the processor that holds the memory address where the next instruction will be fetched from. (If that sentence made no sense to you, take CS 240 or talk to me.) In that course or some other, you learn that during a method call, the previous location (value of the program counter) is kept on a stack, so that the program can resume the calling method at the correct location when the method is done. Using a stack allows recursion.

This might be called the control flow of your program and the sequence of locations might be called the thread of control. Imagine Theseus executing your code, trailing the thread that Ariadne gave him as he does so. That thread marks the sequence of locations that your program traverses as it executes.

Now, imagine that you can have two or even more threads of control at once! Your program can be in more than one place at the same time. However, the rules for control flow are exactly the same for each thread of control.
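As a tiny illustration (a sketch of my own, not from the course files), here are two threads of control executing the same function at the same time:

```python
from threading import Thread
import time

def count(label):
    # each thread runs this same code, but with its own
    # program counter and stack, so the two runs interleave
    for i in range(3):
        print(label, i)
        time.sleep(0.01)

a = Thread(target=count, args=("A",))
b = Thread(target=count, args=("B",))
a.start()
b.start()
a.join()
b.join()
```

The exact interleaving of the A and B lines varies from run to run; that nondeterminism is the hallmark of having more than one thread of control.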

To implement this, the virtual machine really only needs two things for each thread: a program counter and a stack.

Processes vs. Threads

The main distinction in this discussion is between processes and threads. Processes are completely separate and protected from one another, while threads share memory. In the figure about Apache CGI requests, each CGI program is a separate process. If we zoom in on the memory representation of a process, it might look like this:

A process in memory has code, a heap, and a stack

After the CGI request finishes, all that stuff is discarded. When the next request comes in, it all gets created again, then discarded again.

In contrast, creating a thread doesn't require as much. Here's a similar picture of the memory representation after creating a second thread:

A process with two threads that share code and heap, but have their own stack.

Threads are:

  • Cheap to create, since there is not much overhead: it's easy to allocate a program counter and a stack.
  • Cheap to switch between: a context switch (changing which thread is executing) stays within the same program, so there's none of the page swapping and other safeguards the OS puts in place to protect one process from another.
  • Risky, since one thread can modify a shared data structure and inadvertently (or intentionally) mess up other threads. This is what we saw with transactions.
  • Risky, since one errant thread can bring down the whole program.

If the threads are cooperating on solving a problem, multi-threaded programming can be complex and difficult, and debugging can be a nightmare, since more than one thing is happening at once. On the other hand, some problems are much better solved with multi-threading, and there are enormous opportunities for parallelism to speed up execution.

Alternatively, when the threads are essentially separate, each solving an equivalent problem using the shared resources of code and memory, the programming need not be much more complicated than single-thread programming, and can also yield speedups from parallelism. You have to be careful about shared data, but hopefully that can be minimized.

In development mode, Flask uses only one thread. (That means, by the way, that if a request takes N milliseconds to complete, no other request can even get attended to for N milliseconds.) However, deployed Flask applications can use multiple threads, so we need to learn how to program with that in mind.

In Flask applications, and indeed in any multi-threaded web application (including Apache), each thread will be handling an HTTP request (a GET or POST request), and will be essentially independent. Thus, we can get better performance at a relatively small cost. We'll also see that having long-lived data in memory can be useful, where by long-lived we mean that it outlives the particular HTTP request.

Apache is Multi-Threaded

Now, the web server software itself, such as Apache, is multi-threaded. The structure of Apache is to have a listener thread that does nothing but listen on the port, grab the incoming web requests, and put them on a work queue. Various worker threads then grab requests off the queue and actually do the work, whether it's reading a file off the disk and sending it to the browser, or something more complicated like reading and executing a PHP script, or creating a separate process for a CGI script.

Apache with listener thread, work queue of pending requests, and four worker threads

A deployed Flask app works similarly, except that Apache gets the requests and puts the ones for Flask onto a Flask-only queue, from which the Flask worker threads get them.

Producer/Consumer

An abstraction of the way that Apache/Flask works is the producer/consumer problem, where some threads (the listener thread) produce stuff (work) and other threads (the workers) consume the stuff. We'll see an example of that in Python later, but we first have to learn how threads work in Python.

Threads in Python

For those of you unfamiliar with thread programming in Python, a few notes:

  • Any class can be made threadable; it just has to inherit from the Thread class.
  • It has to implement a run() method. This method is closely analogous to main(), in that it is the start of a sequence of execution steps (an entry point).
  • You start a thread by first making an instance of the threadable class; call it runner. You then invoke runner.start().
  • If some other thread wants to wait for runner to finish, it can invoke runner.join().
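Putting those bullets together, a minimal sketch looks like this (the class and variable names are mine, not from the course code):

```python
from threading import Thread

class Greeter(Thread):
    def run(self):
        # run() is the thread's entry point, analogous to main()
        print("hello from", self.name)

runner = Greeter()   # make an instance of the threadable class
runner.start()       # begin executing run() in a new thread
runner.join()        # wait here until runner finishes
print("runner finished")
```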

Here's a good place to start learning more about threads in Python.

Here's the documentation on Python Threading

Synchronization

The key concern with using threads is the worry that one thread will modify a shared data structure while another thread is using it, so that the result is inconsistent or even disastrous (following null pointers and the like). We saw these issues with transactions as well.

One of the main tools to avoid problems is to make sure that we lock any shared data structures whenever we use them and release the lock when we are done. This is exactly the same idea as locking tables in MySQL transactions.

Here's an excerpt of the code we saw that incremented a global counter without locking:

from threading import Thread

# define a global variable
some_var = 0

class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable
        # and then increment it
        global some_var
        read_value = some_var
        print("some_var in {} before increment is {}"
              .format(self.name, read_value))
        some_var = read_value + 1
        print("some_var in {} after increment is {}"
              .format(self.name, some_var))

Here it is with the accesses to the global counter synchronized using locks:

from threading import Thread, Lock

# define a global, shared lock
lock = Lock()                  # <------------------- create lock

# define a global variable
some_var = 0

class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable
        # and then increment it
        lock.acquire()          # <-------------------- acquire it
        global some_var
        read_value = some_var
        print("some_var in {} before increment is {}"
              .format(self.name, read_value))
        some_var = read_value + 1
        print("some_var in {} after increment is {}"
              .format(self.name, some_var))
        lock.release()          # <-------------------- release it
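A note on idiom: Python locks are also context managers, so the acquire/release pair can be written as a with statement, which releases the lock even if the code inside raises an exception. A sketch (the fifty-thread driver at the bottom is my addition, not part of the course code):

```python
from threading import Thread, Lock

lock = Lock()
some_var = 0

class IncrementThread(Thread):
    def run(self):
        global some_var
        # acquires the lock here; releases it at the end of the block,
        # even if an exception is raised inside
        with lock:
            some_var = some_var + 1

threads = [IncrementThread() for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(some_var)   # 50 every time, since the increments can't interleave
```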

Next, we'll look at more sophisticated examples of threads later in the course, when we look at synchronization and locking, including the producer-consumer problem.

Producer/Consumer

The Producer/Consumer problem is a useful abstraction of many real-world problems of threading. Indeed, it's pretty close to what Apache and Flask do:

  • The main thread listens to the network (say on ports 80 and 443), and when a request comes in, it puts the request on a work queue and returns to listening. Thus, it's a producer.
  • The other threads wait for work to be added to the work queue, and when there's something to do, they do it (respond to the request). Thus, they are consumers.

Some useful tutorials about the Producer/Consumer problem

The next few sections will discuss three different solutions to the producer/consumer problem in Python. These are important concepts and techniques for your knowledge of computer science, but you will probably not use any of these coding techniques in your projects, so you don't have to dig deeply into the code. Read for concepts.

Examples/Demos

These examples are in the threads folder:

cd ~/cs304
cp -r ~cs304/pub/downloads/threads threads
cd threads

Daemonic Threads

The Python threading API defines two kinds of threads: daemons and non-daemons. A Python program is defined to end when all non-daemon threads are done. So, if a non-daemon thread runs an infinite loop, as they often do, it will never finish and the program becomes hard to exit.

Therefore, in the examples below, I go to a little extra effort to make the threads daemons so that we can easily kill the program. This is easily done with the setDaemon method:

t1 = MyThread()
t1.setDaemon(True)
t1.start()
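In recent versions of Python, setDaemon is deprecated in favor of the daemon attribute, which can also be passed to the Thread constructor. A sketch of the equivalent modern spelling (the tick function is my stand-in for an infinite thread):

```python
from threading import Thread
import time

def tick():
    # an infinite thread, like the ones described above
    while True:
        time.sleep(0.1)

t1 = Thread(target=tick, daemon=True)   # same effect as setDaemon(True)
t1.start()
print(t1.daemon)
```

Because t1 is a daemon, the program exits normally even though tick() never returns.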

If we don't set the thread to be a daemon, the Python program becomes hard to kill. ^C, which usually works, doesn't. Instead, we have to ^Z to suspend the process and then kill it:

$ python daemon.py N 
starting thread 
thread started. Waiting for you to type something 
hi 
jklsd 
die! 
^C 
^C 
^Z 
[1]+ Stopped      python daemon.py N 
$ jobs 
[1]+ Stopped      python daemon.py N 
$ kill %1 

Note that killing a thread is usually the wrong thing. What if it's holding a resource or doing something to a data structure when it gets killed? What if it's halfway through a balance transfer in our banking/babysitting app? Killing it could leave that data structure in a broken state or fail an audit.

Usually, it's better to set up a communication, possibly using a shared variable, that requests that the thread exit. Something like this:

from threading import Thread
import time

keepGoing = None

class play(Thread):
    def run(self):
        global keepGoing
        keepGoing = True
        while keepGoing:
            time.sleep(1)
            print('    play')
        print('    darn it!')

def main():
    global keepGoing
    kid = play()
    print('go and play')
    kid.start()
    time.sleep(10)
    print('time to stop!')
    keepGoing = False

main()

The code above is in frolic.py
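The threading library also provides an Event object that packages up exactly this kind of shared flag. Here's the same idea as frolic.py re-sketched with an Event (the names and timings are mine, not from the course files):

```python
from threading import Thread, Event
import time

stop = Event()           # shared flag, initially false

def play():
    while not stop.is_set():
        time.sleep(0.05)
        print('    play')
    print('    darn it!')

kid = Thread(target=play)
kid.start()
time.sleep(0.2)
print('time to stop!')
stop.set()               # politely request a graceful exit
kid.join()               # wait for the thread to notice and finish
```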

So, the moral is:

avoid killing a thread. Have it exit gracefully.

Producer-Consumer with Locking

Our first producer-consumer example is producer_consumer_lock.py. Here's an excerpt from the code. The lines with comments are the important ones.

import random
import time
from threading import Thread, Lock

work = []                       # shared data structure ===
lock = Lock()                   # create lock =============

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            lock.acquire()      # acquire it ==============
            work.append(num)
            print('Produced ',num,
                  'work is ',work)
            lock.release()      # release it ==============
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            lock.acquire()      # acquire it ==============
            if len(work) == 0:
                num = -1
            else:
                num = work.pop()
            print('Consumed ',num,
                  'work is ',work)
            lock.release()      # release it =============
            time.sleep(random.random())

Observations:

  • The code prints the contents of the work queue whenever anything happens.
  • If there's nothing to consume, the consumer thread just spins (loops endlessly), taking up CPU time doing nothing but checking whether there's something to do. This is a busy wait. Busy-waits are usually bad because they waste the CPU.
  • It's better for the consumers to go to sleep, and have the producer wake them up when there's work to be done.

Producer-Consumer with Conditions

Conditions are objects similar to locks, except that they have a list of threads waiting on that condition, and a thread can add itself to the list. When the condition that they are waiting for happens, the threads can be notified (woken up). That avoids the problem of busy-wait; instead the thread just sleeps until there is work to do.

This example of producer-consumer is producer_consumer_condition.py. Here's an excerpt from the code. The lines with comments are the important ones.

import random
import time
from threading import Thread, Condition

work = []                       # shared data structure
condition = Condition()         # create condition

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            condition.acquire() # acquire it
            work.append(num)
            print('Produced ',num,
                  'work is ',work)
            condition.notify()  # --- new! -------
            condition.release() # release it
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            condition.acquire() # acquire it
            while len(work) == 0:
                condition.wait() # --- new! -------
            num = work.pop()
            print('Consumed ',num,
                  'work is ',work)
            condition.release() # release it
            time.sleep(random.random())

Producer-Consumer with Queue

The Queue object encapsulates the blocking and notification. See the documentation on Queue. The code works like the condition version, but is simpler.

This example is producer_consumer_queue.py. Here's an excerpt from the code. The lines with comments are the important ones.

import queue
import random
import time
from threading import Thread

work = queue.Queue()  # ----- new data structure ----

class ProducerThread(Thread):
    def run(self):
        while True:
            num = random.randint(1,10)
            work.put(num)      # ---- automatically synchronized ----
            print('Produced ',num,
                  'work len ',work.qsize())
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        while True:
            num = work.get(True) # ---- block if work queue empty ----
            print('Consumed ',num,
                  'work len ',work.qsize())
            time.sleep(random.random())
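To see the Queue version run end-to-end, here's a self-contained sketch with a bounded number of items; the sentinel trick for shutting down the consumer is my addition, not part of the course excerpt:

```python
import queue
from threading import Thread

work = queue.Queue()
consumed = []

def producer():
    for num in range(5):
        work.put(num)        # put() is thread-safe; no explicit lock needed
    work.put(None)           # sentinel: tells the consumer we're done

def consumer():
    while True:
        num = work.get()     # blocks if the queue is empty
        if num is None:
            break
        consumed.append(num)

p = Thread(target=producer)
c = Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed)   # [0, 1, 2, 3, 4], in FIFO order
```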

Example Summary

Let's summarize our producer-consumer examples before turning to our last topics.

  • threads should not be killed; they should be coded so they can gracefully exit
  • access to shared data structures should be managed for thread safety
  • locks can give exclusive access to shared data structure
  • conditions can also give exclusive access to shared data structures, while avoiding busy-waits.
  • the Python library has thread-safe data structures

Videos

I have created videos of these demos in action. See the videos page.

MySQL is Multi-Threaded

MySQL obviously allows concurrent access, which is why locks were necessary. Threads allow multiple concurrent connections to the database. You can use the status command to find out how many threads are currently running on the server.

This is a good time to remind you of an important feature of MySQL. When you insert a row into a table with an auto_increment column, like an ID, what value just got inserted? It's not necessarily the largest value in the table, because that might belong to a row that was just inserted in some other thread, moments after yours.

The answer is that MySQL keeps track of the last inserted ID on a per connection basis. You can find that out using the special function last_insert_id().
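MySQL's last_insert_id() is per-connection, so it's hard to demo without a server, but sqlite3 in the Python standard library illustrates the same idea: each cursor remembers the id of its own last insert, regardless of what other connections are doing. A sketch (table and names are mine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE guest (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

cur = conn.cursor()
cur.execute("INSERT INTO guest (name) VALUES (?)", ("Abby",))
print(cur.lastrowid)   # 1: the id this cursor just generated

cur2 = conn.cursor()
cur2.execute("INSERT INTO guest (name) VALUES (?)", ("Betty",))
print(cur2.lastrowid)  # 2: each cursor reports its own last insert
```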

MySQL and Flask

Because Flask might be multi-threaded, you don't want two threads to use the same MySQL connection, because then we might have exactly the trouble with auto_increment that we were trying to avoid by using last_insert_id().

Therefore, each Flask request should get its own database connection. Yes, that's wasteful when they could be re-used, but it's better to be a little wasteful than to risk subtle threading bugs. (In a serious web application, we would probably have a pool of connection objects, and grab one from the pool instead of having to connect, and return it to the pool when our request is finished.)

As an analogy, imagine that I started a MySQL shell and then allowed each of you to walk up to my machine and run queries and updates. Thus, everyone shares the same connection (mine). Mostly, that would work fine. But, if Abby walks up and does an insert into an auto_increment table, and Betty walks up and does likewise, and then Abby asks for the last_insert_id(), you can see that she'll get Betty's value, not her own. That's why each request should get its own database connection.

Furthermore, in Flask we should avoid global variables, because global variables are shared among threads, and so updating them is not thread-safe. (Code is thread-safe if it behaves correctly even when several threads run it at the same time.) However, a read-only global variable is fine. It's global variables that get updated (like the global counter in raceto50.py and the work queue in producer-consumer) that are problematic.

Thread Safety Checklist

Now that we've learned about threads and the notion of code being thread-safe, we can require that our Flask apps be thread-safe. What does that mean?

  1. Global variables are either read-only or have access properly controlled (such as using locks).
  2. Each request gets its own database connection, so that different requests use different connections. That means that connection-specific functions like last_insert_id() will work correctly.
  3. Any request that does multiple SQL operations needs to consider whether those operations produce "race conditions". For example, the common pattern of (a) checking if something is in a table and (b) inserting it if it is not, is not thread-safe: two threads executing that sequence at once can cause problems. This was described in the reading on transactions.
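The check-then-insert race in item 3 is easy to reproduce in plain Python, and the fix has the same shape as in SQL: make the whole check-and-insert sequence atomic. An in-memory sketch (a set standing in for the table; names are mine):

```python
from threading import Thread, Lock

lock = Lock()
usernames = set()
added = []               # records which calls actually inserted

def register(name):
    # the check and the insert must happen under one lock;
    # otherwise two threads could both see "absent" and both insert
    with lock:
        if name not in usernames:
            usernames.add(name)
            added.append(name)

threads = [Thread(target=register, args=("abby",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(added))   # 1: exactly one thread's insert succeeded
```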

Conclusions

Threads are pretty cool. They

  • Allow the server to save computation time because it is cheaper to start a thread than to compile and run a program.
  • Allow us to retain values between web requests.
  • Allow us to have long-running computations that are easy to interact with.

Use last_insert_id() in MySQL to find out the value of the auto_increment value that was generated in your connection.

In Flask, avoid global variables, though read-only ones are fine (for constants and configuration variables and such). Updates to global variables, if any, need to be synchronized using locks. (Our Flask apps will only rarely need globals of this sort.) Also, each request should get its own database connection, rather than sharing a single global database connection.