Deploying a Flask Application

Deploying a Flask application means integrating it with Apache, so that someone can use your app from a normal browser by going to http://cs.wellesley.edu/yourapp or, even better, https://cs.wellesley.edu/yourapp (the HTTPS protocol means the connection is secure and encrypted).

To do this, you have to:

  • write a configuration file for use by Apache
  • get the system administrator to install that file and restart Apache so that the configuration file is used

Fortunately, I'm the sysadmin for the CS server, so I can help you with the second part.

External Readings. The connection between Apache and a Python-based external application like our Flask apps is mediated by an API called WSGI and implemented by the Apache module called mod_wsgi. Most of our work uses a directive called WSGIDaemonProcess. These readings are optional, but the links are here if you want to pursue them.

Python Version

Note that Python comes in two versions: Python 2 and Python 3. They are fundamentally different at their core, including differences in handling Unicode strings, division of integers, iterators, the print function, and much more.
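As a small illustration of two of these differences (integer division and print), here is a sketch that uses `__future__` imports, one common way to get Python 3 behavior under Python 2.7. The function name `halves` is made up for this example; it is not part of the course code.

```python
# Written to behave the same under Python 2.7 and Python 3.
from __future__ import division, print_function


def halves(n):
    """Return (true quotient, floor quotient) of n divided by 2."""
    # With the __future__ import, / is true division in both versions;
    # without it, Python 2 would compute 7 / 2 as 3.
    return n / 2, n // 2


print(halves(7))
```

Without the first import line, the two numbers in the result would be identical under Python 2 but different under Python 3, which is exactly the kind of silent change to watch for.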

You can read more in many places, but here is one comparison of Python 2 versus Python 3.

When we deploy a Flask app, it uses the version of Python that is precompiled into Apache. On Tempest as of this writing (October 2020), our Apache uses Python 2. Fortunately, it turns out that our Flask code usually runs fine under Apache's Python 2.

Unfortunately, there are a few ugly things we have to do to deploy our Flask apps in this odd execution environment.

Running your app in Python2

Since the python2 in our Apache will be running your app, using its own Python path, you need to check that your app runs in Python 2. With your virtual env deactivated, run your app.py using python2:

python2 app.py

If that works, you can proceed. If not, you may have to modify your code to be backward-compatible with Python2. Our flask-starter app runs fine in Python2, so that's a good starting point. I'm happy to help if you have trouble.

Aside about venv: you'll note that we aren't using the venv with the deployed app. It's possible to use a venv, but I'm going to reserve that until we are using a version of Apache with Python 3 compiled in. For now, I've installed flask, pymysql, bcrypt, and cs304dbi in the site-packages folder for the python2 that Apache uses.
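If you want a quick check of the environment before running the whole app, a sketch like the following reports the interpreter version and whether each of those packages can be imported. The file name `check_env.py` and the function `check_imports` are made up for this example; run it with python2 and your venv deactivated.

```python
# Run as: python2 check_env.py   (check_env.py is a hypothetical file name)
from __future__ import print_function

import importlib
import sys


def check_imports(names):
    """Map each module name to True if it imports, else False."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    # The packages this reading says are installed for Apache's python2.
    wanted = ["flask", "pymysql", "bcrypt", "cs304dbi"]
    for name, ok in sorted(check_imports(wanted).items()):
        print(name, "ok" if ok else "MISSING")
```

If your app imports other modules, add them to the list; anything reported as MISSING will also be missing when Apache runs your app.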

Configuring Apache

Apache is a huge, multi-faceted piece of software, and so configuring it is a huge task. Fortunately, the configuration can be broken up into separate chunks, each of which has its own file. These files are loaded when Apache starts (at boot time or when the sysadmin restarts the daemon). These files are stored in /etc/httpd/conf.d/.

Here is an excerpt of such a configuration file, namely wsgi-scott.conf in that directory:

# home is the home directory
# python-path is list of directories to add to the path; this adds the home dir
#     so that app.wsgi can import app.py and so forth
# inactivity-timeout is the number of seconds to be idle before shutting the process down. 

WSGIDaemonProcess scott \
    user=anderson \
    processes=1 \
    threads=2 \
    display-name=httpd-scott \
    home=/home/anderson/wsgi/ \
    python-path=/home/anderson/wsgi/ \
    inactivity-timeout=3600
WSGIScriptAlias /scott /home/anderson/wsgi/app.wsgi process-group=scott

A note about the syntax here. Technically, each configuration variable must be on a single line. However, you are allowed to continue on the next line if the last character on the line is a backslash. But be careful: even a single space character after the backslash breaks the continuation and you'll get a syntax error.
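Since a stray space after a backslash is invisible in most editors, it can help to check for it mechanically. Here is a minimal sketch; the function name `find_bad_continuations` is made up for this example, and it only looks for spaces or tabs after a trailing backslash, nothing else about Apache's syntax.

```python
import re

# A backslash followed by at least one space or tab at the end of a line.
BAD_CONTINUATION = re.compile(r"\\[ \t]+$")


def find_bad_continuations(text):
    """Return 1-based numbers of lines with whitespace after a trailing backslash."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if BAD_CONTINUATION.search(line)]


# The second line has a space after its backslash; the first does not.
sample = "WSGIDaemonProcess scott \\\n    user=anderson \\ \n    threads=2\n"
print(find_bad_continuations(sample))
```

To check a real file, read it with `open(path).read()` and pass the text to the function; an empty list means no broken continuations were found.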

Let's break down the parts of that configuration. (All of these are documented in greater detail and precision in the docs linked at top of the reading.)

  • user is the user that the app will run as (that is, the username that maps to a given UID). Any read/write permissions on files will depend on that UID. This should be your account or your team account, depending on where the files are.
  • processes is the number of processes to create (we learned about processes when we learned about threads).
  • threads is the number of threads to create.
  • display-name is cosmetic. It determines how the process(es) will show up in a listing of processes running on the server (see note 1 at the end of this reading).
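  • home is the home directory of the application (not necessarily the home directory of the user). I have an example application in the wsgi sub-folder of my personal account, so that's the home directory here. If you were deploying your beta version, this might be /students/teamacct/beta/.
  • python-path is a list of directories to add to Python's module search path. Here it adds the application's home directory, so that app.wsgi can import app.py and any other modules in that folder.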
  • inactivity-timeout is the number of seconds of inactivity before the process is restarted. This reclaims memory if your app runs and uses up a bunch of memory and then doesn't run for a while.

All of the WSGIDaemonProcess configuration information creates a process group that runs your Flask app (much like running your app.py yourself). In the example above, we named the process group scott.

Then, the WSGIScriptAlias maps a URL to your group. This is a URL that maps to your / route, and prefixes all your routes. This is where you come up with a name for your app, which we referred to as yourapp at the top of this reading.

Fortunately, you don't have to re-write all your URLs because you used url_for(), remember?

So, in the example above, you can get to my app by accessing https://cs.wellesley.edu/scott.
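Why does url_for() cope with the prefix? Under WSGI, the mount point (here /scott) reaches the app as SCRIPT_NAME in the request environment, and the rest of the path arrives as PATH_INFO; as I understand it, url_for() puts SCRIPT_NAME in front of the URLs it builds. The sketch below imitates that with the standard library only, no Flask. The function `link_to` is a made-up stand-in for url_for(), not Flask's actual implementation.

```python
from wsgiref.util import setup_testing_defaults


def link_to(environ, route):
    """Build a URL path for a route, prefixed by the mount point."""
    # SCRIPT_NAME is the mount point (for example "/scott");
    # it is the empty string when the app is mounted at the root.
    return environ.get("SCRIPT_NAME", "") + route


# Simulate a request for /scott/search on an app mounted at /scott.
environ = {"SCRIPT_NAME": "/scott", "PATH_INFO": "/search"}
setup_testing_defaults(environ)  # fills in the remaining WSGI keys

print(link_to(environ, "/login"))
```

A URL hard-coded as "/login" in a template would skip the prefix and break once deployed, which is why using url_for() everywhere pays off here.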

app.wsgi

The configuration of WSGIScriptAlias referred to a file called app.wsgi:

WSGIScriptAlias /scott /home/anderson/wsgi/app.wsgi process-group=scott

That file lives in your app's directory and imports your app from app.py, making it the value of application, which is the interface required by WSGI. So, it's essentially just renaming your global variable from app to application. Having a separate file helps in a few other ways as well, as we'll see below when we get to redeploying. Here's the complete app.wsgi file:

from app import app as application

You should test that this works in Python2 as well; it should:

python2 app.wsgi

If that all works, you can proceed to steps about allowing access to your app.
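If you're curious what "the interface required by WSGI" amounts to, here is a minimal sketch of an application callable written without Flask, plus a helper that calls it roughly the way a server would. A Flask app object can be called in the same way. The helper `call_app` and the response text are made up for this example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI callable: it takes the request environment and a callback."""
    body = b"Hello from WSGI\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path):
    """Call a WSGI app roughly as a server would; return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/"))
```

mod_wsgi looks for the name application in app.wsgi, which is why the one-line import renames app.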

Enabling and Restricting Access

By default, Apache doesn't allow any access to directories without special permission. So, another thing we must do is grant permission to the directory. In all of the examples below, I'm setting permissions to the folder that contains my app, namely /home/anderson/wsgi. You would change that to the directory that contains your app, such as /students/teamacct/beta/ or whatever.

At this point, we have some choices to make. If you want to open your app up to the world, that's very easy:

<Directory /home/anderson/wsgi>
    Require all granted
</Directory>

On the other hand, you might want to restrict access to just the Wellesley community by allowing access only from on-campus IP addresses (or people using the VPN). The on-campus IP addresses start with 149.130:

<Directory /home/anderson/wsgi>
    # wellesley.edu
    Require ip 149.130
</Directory>
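A partial address like 149.130 in a Require ip line matches every address that starts with those bytes, which is the same set as the network 149.130.0.0/16. The sketch below makes that concrete using Python 3's ipaddress module (so run it with python3, not Apache's python2). The function `prefix_to_network` is made up for this example and handles only IPv4 prefixes of one to four octets.

```python
import ipaddress


def prefix_to_network(prefix):
    """Turn a partial IPv4 address such as '149.130' into a network."""
    octets = prefix.rstrip(".").split(".")
    # Pad the missing octets with zeros; each given octet fixes 8 bits.
    padded = octets + ["0"] * (4 - len(octets))
    cidr = "{}/{}".format(".".join(padded), 8 * len(octets))
    return ipaddress.ip_network(cidr)


campus = prefix_to_network("149.130")
print(campus)
print(ipaddress.ip_address("149.130.12.34") in campus)
```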

On the third hand, you might want to further loosen that up to allow either an on-campus IP or the guest credentials we've used through this course. The following allows either condition to hold (that's the RequireAny).

<Directory /home/anderson/wsgi>
    AuthUserFile /var/www/htpasswd-guest
    AuthGroupFile /dev/null
    # AuthName seems to have no effect on Chrome but otherwise appears
    # in the prompt to the user for the htpassword
    AuthName "Wellesley Community"
    AuthType Basic
    <RequireAny>
        Require user guest
        # wellesley.edu
        Require ip 149.130
    </RequireAny>
</Directory>

The AuthUserFile is a file on the server that contains usernames and (encrypted) passwords; see the AuthUserFile documentation. When an off-campus visitor visits the page, the browser will ask them for a username and password, as we've seen ourselves. The /var/www/htpasswd-guest file has the credentials we've used elsewhere in this course. If you want to set other credentials, see me about that. If you're curious about options for encrypting the password, you can check the password formats. One option is bcrypt.

On the fourth hand, I often allow both of those, plus access from several validator websites, so the Directory grows even bigger:

<Directory /home/anderson/wsgi>
    AuthUserFile /var/www/htpasswd-guest
    AuthGroupFile /dev/null
    # AuthName seems to have no effect on Chrome but otherwise appears
    # in the prompt to the user for the htpassword
    AuthName "Wellesley Community"
    AuthType Basic
    <RequireAny>
        Require user guest
        # wellesley.edu
        Require ip 149.130
        # for the validators
        # w3.org
        Require ip 128.30.52
        # webaim.org
        Require ip 67.207.157
    </RequireAny>
</Directory>

But these various Directory settings are just variations on a theme. If none of these are right for you, consult the Apache documentation for access and talk to me.
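To make the "any one condition is enough" logic of RequireAny concrete, here is a rough Python 3 model of the largest Directory block above. It is only a sketch of the decision, not how Apache implements it; the function `is_allowed` is made up, the three networks are the partial addresses from the block written out in CIDR form, and "authenticated user" stands for someone who has already passed the password prompt.

```python
import ipaddress

# The partial addresses from the <Directory> block, written as networks.
ALLOWED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in
                    ("149.130.0.0/16", "128.30.52.0/24", "67.207.157.0/24")]


def is_allowed(client_ip, authenticated_user=None):
    """Mimic <RequireAny>: access is granted if any one condition holds."""
    address = ipaddress.ip_address(client_ip)
    return (authenticated_user == "guest"
            or any(address in network for network in ALLOWED_NETWORKS))


print(is_allowed("149.130.1.2"))       # on campus, no login needed
print(is_allowed("8.8.8.8"))           # off campus, not logged in
print(is_allowed("8.8.8.8", "guest"))  # off campus, logged in as guest
```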

Deploying

Combine the configuration settings of the WSGIDaemonProcess and WSGIScriptAlias and the <Directory> permissions into a single file. Call it something perspicuous like wsgi-myapp.conf.
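If you'd rather generate the file than type it, here is a sketch that fills in a template modeled on the wsgi-scott.conf excerpt, using the open-to-the-world Directory block. The names myapp and teamacct are placeholders, and `render_conf` is a made-up helper; swap in one of the other Directory blocks if you want to restrict access. One side benefit is that the template has no stray spaces after the backslashes.

```python
CONF_TEMPLATE = """\
WSGIDaemonProcess {name} \\
    user={user} \\
    processes=1 \\
    threads=2 \\
    display-name=httpd-{name} \\
    home={app_dir}/ \\
    python-path={app_dir}/ \\
    inactivity-timeout=3600
WSGIScriptAlias /{name} {app_dir}/app.wsgi process-group={name}

<Directory {app_dir}>
    Require all granted
</Directory>
"""


def render_conf(name, user, app_dir):
    """Return the text of a wsgi-NAME.conf file for one app."""
    # Drop any trailing slash so the template can add it where needed.
    app_dir = app_dir.rstrip("/")
    return CONF_TEMPLATE.format(name=name, user=user, app_dir=app_dir)


# Placeholder values: replace with your app name, account, and folder.
print(render_conf("myapp", "teamacct", "/students/teamacct/beta/"))
```

Save the printed text as wsgi-myapp.conf and send that file along.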

If you want to see other examples of that file, you can look in /etc/httpd/conf.d/. Once you have such a file, contact the system administrator (me) and I can copy your file into that directory and restart Apache, so that it will use that new configuration file.

Then, you can visit your app at https://cs.wellesley.edu/myapp where myapp is the value you specified in your WSGIScriptAlias configuration variable.

Redeploying

Suppose you find and fix a bug in your app, or you add a new feature. How can you test it? How can you ask Apache to load your new code?

First, you can continue development of your app in the usual way. You probably should use git branches, or do your testing in a separate folder from your deployed app. You wouldn't want Apache to accidentally reload a broken version of your app.

But, assuming you've gotten the bug fixed or the new feature working, how can you get Apache to re-load your code? Do you have to ask the sysadmin to restart Apache?

It turns out the answer is no: you can reload your app just by indicating that the app.wsgi file has been modified. That will trigger Apache to reload your code the next time someone visits the URL. You don't even have to modify the app.wsgi file; it's sufficient to modify the date on the file, which is done by the Unix touch command:

touch app.wsgi
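If you want to do the same thing from a Python script (say, as the last step of your own deployment script), the following sketch updates the file's modification time in the same way. The function name `touch` is my own choice here; `os.utime` is the standard-library call doing the work.

```python
import os


def touch(path):
    """Set the file's access and modification times to now, like `touch`."""
    # The file must already exist; unlike the shell command, this sketch
    # does not create a missing file.
    os.utime(path, None)
```

From your app's directory, you would call it as touch("app.wsgi").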

Important note, though: That should be done after debugging and testing. The reason is that error messages from deployed apps are written to system log files that you can't read. Calls to print() in your Python code no longer help (they don't print to your terminal, though you could write Python code to write to a log file). In short, debugging a deployed app is difficult at best. So, you want to have high confidence that it will work before re-deploying.
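As one way of writing to a log file, here is a minimal sketch using Python's standard logging module. The helper `make_file_logger`, the logger name myapp, and any log path you pass in are placeholders of mine; the file has to be somewhere that the user named in your WSGIDaemonProcess settings is allowed to write.

```python
import logging


def make_file_logger(log_path, name="myapp"):
    """Return a logger that appends timestamped messages to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Avoid adding a second handler if this is called more than once.
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

In app.py you might create the logger once near the top, for example log = make_file_logger("/students/teamacct/beta/app.log"), and then call log.info("search for %s", term) wherever you would have used print().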

Summary

  • Deploying is important, since it's the goal of writing Flask apps. However, for the purposes of this course, it is entirely optional. You've learned the concepts and techniques, and you have a video demo that you can share with friends, family and potential employers. You have code on GitHub if someone really wants to see what you wrote. Very few CS 304 apps have been deployed. But if you want to, you can.
  • The main ideas of deploying are:
    • getting your code to run using the version of Python that is compiled into Apache.
    • configuring settings like number of processes and threads, home directory, python path and the like
    • setting the main "entry point" using WSGIScriptAlias
    • configuring access to your app
    • re-deploying as necessary by touching the app.wsgi file.

  1. You can see a full listing of every process by running the ps -ef command on the server. Warning: the list is long; typically close to 1000 processes. I almost always grep through the result, like ps -ef | grep httpd. In fact I do that so often that I've defined an alias as a shortcut.