The Python Requests Module

The Python requests module allows your Python program to make web requests. In a sense, your program acts like a web browser, so you can use it to scrape websites (perhaps parsing the web pages with something like Beautiful Soup). However, we would usually rather not parse a web page in order to get the information, so it's nice when the server provides the data in a machine-friendly form such as JSON. This reading focuses on that case. It will be helpful in the Ajax assignment, where I'll give you a testing script (written using the requests module in a manner similar to this reading) and you'll write the server side.

This reading is organized as follows:

  • simple GET requests
  • simple POST requests
  • sessions with authentication

Practical Note

The requests module is available in our default Python environment on Tempest, so you can use it without a virtual environment. If you do use it with a virtual environment, you'll need to install it using pip:

pip install requests   # after activating your virtual environment

Simple Server

For the sake of concreteness, we'll start with a simple server application, based somewhat on our people app. It provides the following API:

  • GET /people/ which returns a list of all the people in the database, as a list of dictionaries.
  • GET /people-born-in-month/<month> which returns a list of all the people in the database born in the given month, as a list of dictionaries.
  • POST /people-noauth/ which creates a new person. The posted data has to include at least nm and name. It can also include addedby and birthdate, but those are optional. This endpoint doesn't require any authentication.
  • POST /login/ which logs the user in, enabling them to POST new people to the database using the next endpoint.
  • POST /people-auth/ which is just like /people-noauth/ except that authentication is required.

Whether to require logins depends on how the app is run. When the app starts, it asks whether to require authentication (login) and the username and password to use. In real life, we would probably omit the endpoint that doesn't require authentication, unless our app were on an intranet so all the clients were trustworthy.

Of course, the endpoints above are shorthands. If your UID is 2345, the request is actually made to http://cs.wellesley.edu:2345/ plus the given endpoint.

The responses from the server will be in JSON. The format I've adopted is a dictionary with two keys: error and data. If there's an error, the error value will be an error message, and the data can be ignored. Otherwise, the error value will be False and the data can be used. For example, shown as the Python values you get after decoding the JSON:

{'error': 'must login', 'data': None}

{'error': False, 'data': [{},{},{}...]}

In the example server, the data will almost always be a list of dictionaries, each dictionary representing a person in the WMDB.
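
To make the error/data convention concrete, here is a minimal sketch that decodes two response bodies using only the standard json module. The helper name unwrap and the sample bodies are hypothetical, chosen for illustration; the real server's exact output may differ.

```python
import json

def unwrap(body_text):
    """Decode a JSON response body and return its 'data', or raise on error."""
    envelope = json.loads(body_text)
    if envelope['error']:
        # The error value is a message string when something went wrong.
        raise RuntimeError(envelope['error'])
    return envelope['data']

# Hypothetical response bodies, written as the raw JSON text a server might send.
error_body = '{"error": "must login", "data": null}'
ok_body = '{"error": false, "data": [{"nm": 234, "name": "Charlize Theron"}]}'

print(json.loads(error_body))   # JSON null is decoded as Python None
print(unwrap(ok_body))          # JSON false is decoded as Python False, so no error is raised
```

Note that JSON itself spells these values false and null; they only become False and None once decoded in Python.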

GET Requests

Let's start simple. To get a list of all the people in the database, just do the following. (Remember to edit the code to specify your port.)

import requests
port = 1942
prefix = f'http://cs.wellesley.edu:{port}'

resp = requests.get(prefix+'/people/')

The resp variable holds an object that contains the response from the server. Some useful properties/methods:

  • .status_code is an HTTP response code. 200 is success, 302 is a redirect, 404 is not found, 500 is a server error, etc.
  • .text is the text of the response, as a string. A web scraper might use this to get the HTML page.
  • .json() is a method that returns the response decoded from JSON into Python data (dictionaries, lists, strings, and so on).
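
If you want a readable name for a status code, Python's standard library knows the standard reason phrases. The sketch below is only a convenience for printing; the helper name describe_status is hypothetical and is not part of the requests module.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '404 Not Found' for an HTTP status code."""
    try:
        return f'{code} {HTTPStatus(code).phrase}'
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not know.
        return f'{code} (unrecognized status code)'

for code in (200, 302, 404, 500):
    print(describe_status(code))
```

With a real response, you could call describe_status(resp.status_code).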

In our case, you can work with the data like this:

val = resp.json()

print(val['error']) # hopefully False

print(len(val['data']))  # how many people?

# this is just normal Python, working with a list of dictionaries
for person in val['data']:
    nm = person['nm']
    name = person['name']
    bday = person['birthdate']
    print(f'{name} ({nm}) was born on {bday}')

You can run the server and then start a second Python terminal and run the code above. We'll do this in class.

We should probably do a little error checking, so here is slightly more polished code, suitable for putting in a script.

import requests
port = 1942
prefix = f'http://cs.wellesley.edu:{port}'

resp = requests.get(prefix+'/people/')
if resp.status_code != 200:
    print('error', resp.status_code)
else:
    val = resp.json()
    if val['error']:
        print('error', val['error'])
    else:
        for person in val['data']:
            nm = person['nm']
            name = person['name']
            bday = person['birthdate']
            print(f'{name} ({nm}) was born on {bday}')

Our finished client code will define two helper functions. The first just checks for errors, so that the client will quit if something goes wrong.

def check_response_for_error(resp):
    '''Raises an exception if there was an error, otherwise returns the decoded JSON response.'''
    if resp.status_code != 200:
        raise Exception(f'HTTP status {resp.status_code}')
    else:
        val = resp.json()
        if val['error']:
            raise Exception(val['error'])
        else:
            return val

The second just prints the first 3 and last 3 people in the list, so that the output isn't too lengthy:

def print_people(dict_list, head=3, tail=3):
    '''prints part of a list of people, represented as dictionaries. 
       Prints the first 'head' elements and then last 'tail' elements.'''
    n = len(dict_list)
    print(f'\nA list of {n} people:')
    for person in dict_list[0:head]:
        nm = person['nm']
        name = person['name']
        bday = person['birthdate']
        print(f'{name} ({nm}) was born on {bday}')
    print('...')
    for person in dict_list[-tail:]:
        nm = person['nm']
        name = person['name']
        bday = person['birthdate']
        print(f'{name} ({nm}) was born on {bday}')

We'll use these from now on.
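
One caveat: when the list has fewer than head + tail people, the two slices in print_people overlap, so some people are printed twice. If that matters, a small helper along the following lines could pick the two slices first. This is a sketch of one possible approach; the name head_and_tail is hypothetical.

```python
def head_and_tail(items, head=3, tail=3):
    """Return (first, last) slices of items that never overlap."""
    n = len(items)
    if n <= head + tail:
        # Short list: everything goes in the first part, nothing is repeated.
        return items, []
    # Slicing from n - tail (rather than -tail) also behaves sensibly when tail is 0.
    return items[:head], items[n - tail:]

first, last = head_and_tail(list(range(10)))
print(first, '...', last)
```

The two returned lists can then be printed with the same loop body that print_people uses.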

We can also try parameterized requests. Again, remember to edit the URL to specify your port.

Don't copy/paste this all at once. Do it a line at a time.

month = input('what month? ')
resp = requests.get(prefix+'/people-born-in-month/'+month)
val = check_response_for_error(resp)
print_people(val['data'])
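
Because the month comes straight from input(), it may be worth checking it before building the URL. Here is a minimal sketch, assuming the server expects a month number from 1 to 12 (as in /people-born-in-month/8); the helper name month_url is hypothetical.

```python
def month_url(prefix, month):
    """Build the URL for /people-born-in-month/<month>, checking the month first."""
    month_number = int(month)   # raises ValueError for input such as 'August'
    if not 1 <= month_number <= 12:
        raise ValueError(f'month must be between 1 and 12, got {month_number}')
    return f'{prefix}/people-born-in-month/{month_number}'

print(month_url('http://cs.wellesley.edu:1942', '8'))
```

The resulting string can be passed to requests.get() in place of the concatenated URL above.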

POST Requests

Remember that we use POST for requests where:

  • The back-end (database) is modified (insert, update, or delete)
  • The state of the interaction is modified (e.g. login)
  • The request should not be cached
  • The request should not be re-submitted without a warning
  • The request is too big to fit in a URL
  • The request has files attached, which requires a different encoding
  • The request has sensitive information that shouldn't be put in the URL

In this example, we'll use POST to insert a new person into the database. The modifications are pretty minor. The data to be delivered is represented as a dictionary, which comprises name:value pairs just like HTML forms.

import requests
port = 1942
prefix = f'http://cs.wellesley.edu:{port}'

payload = {'nm': 234, 'name': 'Charlize Theron'}
resp = requests.post(prefix+'/people-noauth/', data=payload)
val = check_response_for_error(resp)
print(val)

Here, the response isn't a list of people. It's just a confirmation.
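
To see what "name:value pairs just like HTML forms" means, the sketch below form-encodes the same payload with the standard library. When you pass a dictionary as data=, requests sends it in this form-encoded style; the code here does not use requests at all and is only meant to show the encoding.

```python
from urllib.parse import urlencode, parse_qs

payload = {'nm': 234, 'name': 'Charlize Theron'}

# Encode the dictionary the way an HTML form submission would be encoded.
encoded = urlencode(payload)
print(encoded)

# Decode it again, roughly as a server would; every value comes back as a string.
print(parse_qs(encoded))
```

Notice that the number 234 arrives as the string '234', so the server has to convert it if it needs an integer.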

Sessions

If we want to update the database using POST, we are usually required to log in (authenticate) first. To simplify the previous example, the server implemented the /people-noauth/ endpoint. As mentioned earlier, in most circumstances we would probably omit the noauth endpoint, unless all the clients were trustworthy.

As you recall, sessions are built on cookies, and cookies are sent back and forth between client and server. In this case, we will authenticate with the /login/ endpoint, which will send back a cookie. Our program will then send that cookie along with the next request, to the /people-auth/ endpoint.
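
At the HTTP level, the server sets a cookie with a Set-Cookie response header, and the client returns it in a Cookie request header. The sketch below imitates that round trip with the standard library only. The cookie name session and the value abc123 are made up for illustration; the real server's cookie will look different.

```python
from http.cookies import SimpleCookie

def cookie_header_from(set_cookie_value):
    """Turn a Set-Cookie header value into the Cookie header value to send back."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Only the name=value pairs go back to the server; attributes such as Path do not.
    return '; '.join(f'{name}={morsel.value}' for name, morsel in jar.items())

set_cookie = 'session=abc123; Path=/'   # hypothetical header value from a login response
print(cookie_header_from(set_cookie))
```

The requests module does this bookkeeping for us, which is why the client code only has to pass the cookies along.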

Here's how that works.

First, we log in:

credentials = {'username': 'me', 'password': 'secret'}
login_resp = requests.post(prefix+'/login/', data=credentials)
login_val = check_response_for_error(login_resp)
print(login_val)

Then, we save the cookies (which include the session cookie) in a variable, so that we can send them with the next request.

session = login_resp.cookies

Next, we make the POST request, sending the saved cookies along with the new person's data.

payload = {'nm': 234, 'name': 'Charlize Theron', 'birthdate': '1975-08-07'}
resp = requests.post(prefix+'/people-auth/', cookies=session, data=payload)
val = check_response_for_error(resp)
print(val)

Finally, let's see if it worked:

resp = requests.get(prefix+'/people-born-in-month/8', cookies=session)
val = resp.json()
print(val['data'][-1])

Woo-hoo!
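
As an alternative to saving and passing cookies by hand, the requests module provides a Session object that remembers cookies from earlier responses and sends them automatically on later requests. Here is a minimal sketch of the same login-then-post sequence written that way. The function names check and login_and_add_person are hypothetical, and the session is passed in as a parameter so the function can also be tried out without a running server.

```python
def check(resp):
    """Return the decoded response, or raise if the request or the app reported an error."""
    if resp.status_code != 200:
        raise RuntimeError(f'HTTP status {resp.status_code}')
    val = resp.json()
    if val['error']:
        raise RuntimeError(val['error'])
    return val

def login_and_add_person(session, prefix, credentials, person):
    """Log in, then add a person, using one session object for both requests.

    session is expected to be a requests.Session(); it keeps the cookies from
    the login response and sends them with the later request.
    """
    check(session.post(prefix + '/login/', data=credentials))
    return check(session.post(prefix + '/people-auth/', data=person))

# Usage (needs the example server running on your port):
#   import requests
#   with requests.Session() as s:
#       val = login_and_add_person(s, 'http://cs.wellesley.edu:1942',
#                                  {'username': 'me', 'password': 'secret'},
#                                  {'nm': 234, 'name': 'Charlize Theron'})
#       print(val)
```

Either style works for this course; the explicit cookies= version above makes the mechanism more visible.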

Summary

There are many other things that the requests module does, but this is sufficient for our usage in CS 304.

The requests module allows us

  • to make GET requests
  • to find out the response status code
  • to get the returned web page
  • to get the data in JSON format
  • to make POST requests
  • to use sessions by saving a cookie and sending it in a later request

Examples:

resp = requests.get(url)
print(resp.status_code) 
page = resp.text
data = resp.json()
resp = requests.post(url, data=payload_dictionary)
session = resp.cookies
resp = requests.get(url, cookies=session)

Complete Client Code

Click here for the complete client code

Complete Server Code

Click here for the complete server code