Welcome!
Everything is fine.

Logins, Bcrypt and Sessions

Today, we'll talk about several interesting security issues:

  • Bcrypt
  • Flask Login Sessions

Plan

  1. Announcements
  2. (15) Recap Logins, Salt, and Bcrypt
  3. Quiz Questions
  4. (15) Bcrypt demos
  5. Breakouts

Announcements

  1. CRUD target date is Wednesday
  2. Testing CRUD

Election Remarks

I'll just say a few things about the election:

  • Emotions are, understandably, high. People are stressed and anxious.
  • Take care of yourself and others.
  • Attend class if you can, but I will be extra-understanding
  • If you will miss a lot of class, seek help from the Stone Center, etc.

Login and Bcrypt Summary

BCRYPT

  • Passwords should never be stored in plaintext in your database; they should be one-way hashed with a cryptographically secure hash algorithm, such as SHA-256
  • Passwords can still be cracked by brute force.
  • To thwart a brute-force approach, we can use an extremely slow hashing algorithm, such as bcrypt
  • To use bcrypt, we have to encode strings as byte arrays. I suggest UTF-8.
  • The bcrypt algorithm also uses salt, which is an additional random string that means that two accounts that use the same password will have different hash values. The function is bcrypt.gensalt().
  • bcrypt stores all the info it needs in the hash value, so you only need to store that one value, which is a 60-byte array.
  • The byte array should be decoded (again using UTF-8) to get a string that you can store in the database.
  • Practical Code:
    • register: hashed1 = bcrypt.hashpw(register.encode('utf-8'), bcrypt.gensalt()).decode('utf-8')
    • login: hashed2 = bcrypt.hashpw(login.encode('utf-8'), hashed1.encode('utf-8')).decode('utf-8')
  • A purported password matches if and only if the newly hashed value matches the old hashed value (both as strings): hashed2 == hashed1

LOGINS

  • Your app can log someone in by putting their userid in the session
  • Your app can log someone out by removing their userid from the session
  • Because of concurrency, when registering someone new by inserting someone into a table with an auto_increment column, your app should use the MySQL last_insert_id() function to determine the ID; not anything involving, say, max.
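
The first two bullets can be sketched with an ordinary dictionary. This is only an illustration: the dict below stands in for Flask's session object, and log_in, log_out and is_logged_in are hypothetical helper names, not part of Flask or the course code.

```python
# A plain dict stands in for Flask's session object (an assumption for
# illustration; a real app would use flask.session instead).
session = {}

def log_in(session, uid, username):
    '''Mark the user as logged in by storing identifying keys.'''
    session['uid'] = uid
    session['username'] = username

def log_out(session):
    '''Remove the login keys. pop() with a default never raises KeyError.'''
    session.pop('uid', None)
    session.pop('username', None)

def is_logged_in(session):
    '''A user counts as logged in exactly when a uid is in the session.'''
    return 'uid' in session

log_in(session, 17, 'neville')
print(is_logged_in(session))   # True
log_out(session)
print(is_logged_in(session))   # False
log_out(session)               # logging out twice is harmless
```

Using pop with a default value means the logout helper does not need to check first whether the user was logged in.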

Logins and Passwords

Let's turn to Logins and Passwords. It was a long reading, but let's recap:

  • We store a hash of the password, rather than the plaintext password
  • Never store a plaintext password
  • We use a slow hash algorithm, preventing the hacker from using brute force to find the password
  • bcrypt is a slow hash algorithm that can be made slower as processors get faster
  • bcrypt uses byte arrays rather than strings, and so we have to use encode and decode to convert the types.
  • There are a variety of encodings. We'll use UTF-8

Why Slowness is Important

  • If you read some of the linked articles, they talk about how, thanks to faster processors, clever use of graphics cards to do parallel computation, and other things, the bad guys can use massive amounts of computational power to try to crack passwords.
  • If it takes 1 microsecond (1 millionth of a second) to hash a possible password, you can try several billion in an hour and crack most short passwords in a very reasonable amount of time.
  • But if you use a really slow hashing algorithm, like bcrypt, so that it takes 1 second (a million times slower) to hash a possible password, you can only try a few thousand in an hour, and you're unlikely to crack a decent password in a reasonable amount of time.

But someone who types the correct password only has to wait one second, which is acceptable.
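
The arithmetic above can be checked with a few lines of Python. This is a rough sketch: the function names are made up for illustration, and it assumes a single attacker doing one hash at a time, which understates what real hardware can do.

```python
SECONDS_PER_HOUR = 60 * 60

def guesses_per_hour(hashes_per_second):
    '''How many candidate passwords can be hashed in one hour.'''
    return hashes_per_second * SECONDS_PER_HOUR

def hours_to_exhaust(alphabet_size, length, hashes_per_second):
    '''Hours needed to try every password of exactly `length` characters.'''
    candidates = alphabet_size ** length
    return candidates / guesses_per_hour(hashes_per_second)

fast = 1_000_000   # 1 microsecond per hash
slow = 1           # 1 second per hash

print(guesses_per_hour(fast))   # 3600000000
print(guesses_per_hour(slow))   # 3600

# Every 6-letter lowercase password: 26**6 candidates
print(hours_to_exhaust(26, 6, fast))   # a small fraction of an hour
print(hours_to_exhaust(26, 6, slow))   # tens of thousands of hours
```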

Salt

  • Because people use poor passwords and re-use their passwords, the system can add "salt" to make them more variable.
  • Salt is essentially just an array of random bytes.
  • Salt needs to be created once, when the password is set
  • Salt is used every time the password is checked
  • Therefore, salt needs to be stored
  • Bcrypt stores the salt in the hashed password.
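
To see why salt has to be stored, here is a minimal sketch that manages the salt by hand. It deliberately does not use bcrypt: it swaps in PBKDF2 from Python's standard hashlib module so that the salt is visible as a separate value. The function names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

def hash_with_salt(password, salt, iterations=100_000):
    '''Hash password (a str) with salt (bytes) using PBKDF2-HMAC-SHA256.'''
    return hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                               salt, iterations)

def register(password):
    '''Create the salt once; both salt and digest must be stored.'''
    salt = os.urandom(16)
    return salt, hash_with_salt(password, salt)

def check(password, salt, digest):
    '''Re-hash with the stored salt and compare in constant time.'''
    return hmac.compare_digest(hash_with_salt(password, salt), digest)

salt_a, digest_a = register('secret')
salt_b, digest_b = register('secret')
print(digest_a == digest_b)                # same password, different hashes
print(check('secret', salt_a, digest_a))   # True
print(check('secret', salt_b, digest_a))   # False: wrong salt
```

With bcrypt you do not need the separate salt column, because the salt travels inside the 60-character hash.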

Bcrypt

The Python bcrypt module defines several functions:

  • gensalt(), which returns a randomly generated "salt" (as bytes)
  • hashpw(plain, hashed/salt), which hashes the plaintext password, plain, along with the salt pulled from the second arg, and returns the hash.

The second arg can be either a salt or a previously hashed and stored value, so we can use this same function to generate the hash the first time and to re-hash a password on later logins.

  • The hashed value is a 60 element array of bytes (your database column can be char(60))
  • bcrypt requires inputs to be bytes not string, hence
  • encode() to convert strings to bytes and
  • decode() to convert bytes to strings

Password Quiz Questions

People weren't feeling great, so feel free to ask more questions.

I'll answer your quiz questions.

Bcrypt Activities/Demos

We'll do these together. This activity works from the ground up:

  • demonstrating bcrypt,
  • demonstrating a registration/login app that uses plaintext passwords, and
  • then having you build a registration/login app that uses bcrypt passwords

Install Bcrypt

You'll have to pip install bcrypt in your virtual environment:

source ~/cs304/venv/bin/activate 
pip install bcrypt 

Then you can try these examples.

Demos and Code for Today

Instead of one long demo, this is broken up into several parts:

  • encode and decode in python3
  • bcrypt in python3
  • salts and bcrypt in python3
  • timing of bcrypt in python3

Copy the demo code for today:

cd ~/cs304/
cp -r ~cs304flask/pub/downloads/bcrypt_demo bcrypt_demo
cd bcrypt_demo
ls

Example Code

import datetime
import bcrypt

straw = "le goût des fraises d'été"
ete = "été"

def bytes2hex(byte_array: bytes):
    '''converts the byte_array into a string like [41 43 54]:
    a series of pairs of hexadecimal digits, surrounded by square brackets'''
    if not isinstance(byte_array, bytes):
        raise TypeError('argument is not an array of bytes', byte_array)
    # avoid shadowing the built-in hex() function
    hex_pairs = ' '.join( format(b, '02x') for b in byte_array )
    return f'[{hex_pairs}]'

def demo_encode_decode(x: str):
    '''convert the argument (a string) into an array of bytes by
    encoding as UTF8, and convert back, printing results to
    demonstrate encoding and decoding.'''
    if not isinstance(x, str):
        raise TypeError('argument is not a string', x)
    print(f'encoding x: {x}')
    b = x.encode('utf-8')
    y = b.decode('utf-8')
    print('x', x, type(x))
    print('b', b, type(b))
    print('y', y, type(y))
    print(bytes2hex(b))
    print('are x and y equal?', x==y)

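def bcrypt_string(input_string, prior=None, encoding='utf8'):
    '''Hash input_string with bcrypt and return the hash as a string.
    If prior (an earlier hash, as a string) is given, re-use its salt
    and work factor; otherwise generate a fresh salt.'''
    if prior is None: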
        prior = bcrypt.gensalt()
    else:
        prior = prior.encode(encoding)
    x = input_string.encode(encoding)
    y = bcrypt.hashpw(x, prior)
    output_string = y.decode(encoding)
    return output_string

def bcrypt_timing(min_work=12, max_work=18):
    '''Bcrypt a particular string with a variety of work factors'''
    last_time = datetime.datetime.now()
    for i in range(min_work, max_work):
        # work factor is embedded in the salt part of the string
        salt = bcrypt.gensalt(rounds=i)
        h = bcrypt.hashpw('secret'.encode('utf8'), salt)
        now = datetime.datetime.now()
        time_diff = now - last_time
        last_time = now
        # total_seconds() combines the seconds and microseconds fields
        seconds = time_diff.total_seconds()
        print(f"hashing 'secret' with work {i} takes {seconds} seconds")

def signup(passwd, encoding='utf8'):
    '''Returns a hashed password, as a string, suitable for
    storing in a database.'''
    prior = bcrypt.gensalt()
    x = passwd.encode(encoding)
    y = bcrypt.hashpw(x, prior)
    output_string = y.decode(encoding)
    return output_string

def login(passwd, prior, encoding='utf8'):
    '''Returns true/false as to whether the user entered the correct
    password, given that the value stored in the database is 'prior'.'''
    p = prior.encode(encoding)
    x = passwd.encode(encoding)
    y = bcrypt.hashpw(x, p)
    output_string = y.decode(encoding)
    print(passwd, prior, x, y, output_string, sep="\n")
    return prior == output_string

Unicode, Strings, Encode and Decode

Python2 is now passé (obsolete), so we are all now using Python3. What's the difference?

The main difference is strings:

  • Python2 strings were arrays of bytes
  • Python3 strings and byte arrays are different things:
    • convert a string to a byte array by encoding
    • convert a byte array to a string by decoding
  • There are different encoding schemes, but we will use UTF8

Run Python and try the following:

from bcrypt_demo import *
demo_encode_decode('CAB')
demo_encode_decode('CAT')
demo_encode_decode('ACT')
print(ete)
demo_encode_decode(ete)
print(straw)
demo_encode_decode(straw)

Strings in the English language are puzzling because the printed form of the encoding looks just the same as the string, but that's just a historical artefact: UTF-8 encodes each ASCII character as the same single byte. The French strings show that a string is not the same as its encoding as an array of bytes.

BTW, UTF8 is the default, so s.encode('utf8') is the same as s.encode(). But I like being explicit.
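
If you don't have the demo file handy, the following stand-alone sketch shows the same idea. The word "été" is written with escapes so that the example does not depend on how your editor saves the file.

```python
ete = '\u00e9t\u00e9'          # the word "été", written with escapes

b = ete.encode('utf-8')
print(len(ete))   # 3 characters
print(len(b))     # 5 bytes: each é takes two bytes in UTF-8
print(b)          # b'\xc3\xa9t\xc3\xa9'

# Decoding with the same encoding round-trips the string.
print(b.decode('utf-8') == ete)     # True

# Decoding with a different encoding silently gives a different string.
print(b.decode('latin-1') == ete)   # False

# Pure ASCII text encodes to bytes that print the same as the string.
print('CAB'.encode('utf-8'))        # b'CAB'
```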

Bcrypt in python3

The following works in Python 3:

bcrypt_string('secret')
h1 = bcrypt_string('secret')
h1
h2 = bcrypt_string('secret', h1)
h1 == h2

Here's an example from one run:

>>> bcrypt_string('secret')
'$2b$12$D7AcAPBj76BZwIKHuGWJu.DXT2gmcMKJ2OEKMKNb46FIdjEyMjLYi'
>>> h1 = bcrypt_string('secret')
>>> h1
'$2b$12$NAu4nPQFaWdU6iayBx2yTOibUCR1PQLZ34iHeC7Mxlt9PWS78UAn6'
>>> h2 = bcrypt_string('secret', h1)
>>> h2
'$2b$12$NAu4nPQFaWdU6iayBx2yTOibUCR1PQLZ34iHeC7Mxlt9PWS78UAn6'
>>> h1 == h2
True

Salts and bcrypt

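Many students have trouble understanding where the salt is stored. It's stored in the hashed password. Literally, the first 29 characters of the 60-character hash result are the value returned by gensalt(). (Strictly, the first 7 of those characters, such as $2b$12$, record the algorithm version and the work factor; the remaining 22 are the random salt itself.)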

Here's some python code to show that:

import bcrypt
salt = bcrypt.gensalt()
salt
hashed = bcrypt.hashpw('secret'.encode(), salt)
hashed
len(salt)
hashed[0:29]
salt
hashed[0:29] == salt

Here's one transcript of that code in action:

>>> import bcrypt
>>> salt = bcrypt.gensalt()
>>> salt
b'$2b$12$IkhVgnCtC/HGtKLDA3QMEO'
>>> hashed = bcrypt.hashpw('secret'.encode(), salt)
>>> hashed
b'$2b$12$IkhVgnCtC/HGtKLDA3QMEOG8ujSJX7to2Sz7ZmvdBeR0YY9m74S0q'
>>> len(salt)
29
>>> hashed[0:29]
b'$2b$12$IkhVgnCtC/HGtKLDA3QMEO'
>>> salt
b'$2b$12$IkhVgnCtC/HGtKLDA3QMEO'
>>> hashed[0:29] == salt
True
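
The layout of the hash can also be pulled apart with plain string operations. The helper below is a hypothetical sketch (it is not part of the bcrypt module) and it assumes the usual 60-character form with three $ separators, like the one in the transcript above.

```python
def split_bcrypt_hash(hashed):
    '''Split a 60-character bcrypt hash string into labelled parts.'''
    if len(hashed) != 60:
        raise ValueError('expected a 60-character bcrypt hash')
    # The string starts with '$', so the first piece is empty.
    _, version, cost, rest = hashed.split('$')
    return {
        'version': version,        # e.g. '2b'
        'cost': int(cost),         # the work factor (rounds)
        'salt': rest[:22],         # 22 characters of encoded random salt
        'digest': rest[22:],       # 31 characters of encoded hash output
        'salt_arg': hashed[:29],   # prefix plus salt, as gensalt() returns
    }

# The hash from the transcript above
h = '$2b$12$IkhVgnCtC/HGtKLDA3QMEOG8ujSJX7to2Sz7ZmvdBeR0YY9m74S0q'
parts = split_bcrypt_hash(h)
for name, value in parts.items():
    print(name, value)
```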

Timing Of Bcrypt

That's the basic mechanism. The key is timing, so let's take a little aside to look at that. The bcrypt_timing function allows us to try a range of different work factors.

>>> from bcrypt_demo import *
>>> bcrypt_timing()
hashing 'secret' with work 12 takes 0.293267 seconds
hashing 'secret' with work 13 takes 0.496393 seconds
hashing 'secret' with work 14 takes 0.99661 seconds
hashing 'secret' with work 15 takes 2.210069 seconds
hashing 'secret' with work 16 takes 4.019593 seconds
hashing 'secret' with work 17 takes 8.310716 seconds

As more computation power becomes available, we just increase the rounds (the work factor) when we generate the salt.
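
The timings above roughly double with each step because bcrypt does 2 to the power of the work factor iterations. Here is a small sketch of that arithmetic. The function names are made up, and the baseline of 0.293267 seconds at work factor 12 is just the number from the run above; your machine will differ.

```python
def predicted_seconds(rounds, base_rounds=12, base_seconds=0.293267):
    '''Predict hashing time, assuming each extra round doubles the work.'''
    return base_seconds * 2 ** (rounds - base_rounds)

def smallest_rounds_for(target_seconds, base_rounds=12,
                        base_seconds=0.293267):
    '''Smallest work factor (at or above base_rounds) whose predicted
    time is at least target_seconds.'''
    rounds = base_rounds
    while predicted_seconds(rounds, base_rounds, base_seconds) < target_seconds:
        rounds += 1
    return rounds

# Compare these predictions with the measured times in the transcript.
for r in range(12, 18):
    print(r, round(predicted_seconds(r), 3))

print(smallest_rounds_for(1.0))   # 14
```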

Signup and Login

We use Bcrypt in two distinct ways:

  • when a user signs up, we hash their new password with brand new salt and store the hashed password
  • when a user attempts to log in, we
    • read the stored hash from the database
    • hash the entered password using the stored hash as the second argument, because it contains the meta-data: the work factor and the salt
    • if the new hash matches the stored one, allow them in

Let's try it:

h1 = signup('dilligrout')
h1
login('dilligrout', h1)    # Neville tries to log in
login('i hate potter', h1) # Malfoy tries to break in

Here's an example of the code above:

>>> from bcrypt_demo import *
>>> h1 = signup('dilligrout')
>>> h1
'$2b$12$eVTo1sSWHmfyOnPHcQjdveOfKG1ynuNRdJGhqc3tQkkmztGc0VIcC'
>>> login('dilligrout', h1)
dilligrout
$2b$12$eVTo1sSWHmfyOnPHcQjdveOfKG1ynuNRdJGhqc3tQkkmztGc0VIcC
b'dilligrout'
b'$2b$12$eVTo1sSWHmfyOnPHcQjdveOfKG1ynuNRdJGhqc3tQkkmztGc0VIcC'
$2b$12$eVTo1sSWHmfyOnPHcQjdveOfKG1ynuNRdJGhqc3tQkkmztGc0VIcC
True
>>> login('i hate potter', h1)
i hate potter
$2b$12$eVTo1sSWHmfyOnPHcQjdveOfKG1ynuNRdJGhqc3tQkkmztGc0VIcC
b'i hate potter'
b'$2b$12$eVTo1sSWHmfyOnPHcQjdveJ56eCTpOsNx/0m4HNEO.cnLy6j9wBPW'
$2b$12$eVTo1sSWHmfyOnPHcQjdveJ56eCTpOsNx/0m4HNEO.cnLy6j9wBPW
False

A Login/Logout Flask App

Copy this code for today:

cd ~/cs304
cp -r  ~cs304flask/pub/downloads/login login
cd login

We will open start.py in VS Code and explore.

from flask import (Flask, render_template, make_response, url_for, request,
                   redirect, flash, session, send_from_directory)
app = Flask(__name__)

import cs304dbi as dbi
import secrets

app.secret_key = secrets.token_hex()

@app.route('/')
def index():
    return render_template('main.html', page_title='My App: Welcome')

@app.route('/join/', methods=["POST"])
def join():
    try:
        username = request.form['username']
        passwd1 = request.form['password1']
        passwd2 = request.form['password2']
        if passwd1 != passwd2:
            flash('passwords do not match')
            return redirect( url_for('index'))
        hashed = passwd1
        print(passwd1, type(passwd1))
        conn = dbi.connect()
        curs = dbi.cursor(conn)
        try:
            curs.execute('''INSERT INTO userpass(uid,username,hashed)
                            VALUES(null,%s,%s)''',
                        [username, hashed])
            conn.commit()
        except Exception as err:
            flash('That username is taken: {}'.format(repr(err)))
            return redirect(url_for('index'))
        curs.execute('select last_insert_id()')
        row = curs.fetchone()
        uid = row[0]
        flash('FYI, you were issued UID {}'.format(uid))
        session['username'] = username
        session['uid'] = uid
        session['logged_in'] = True
        session['visits'] = 1
        return redirect( url_for('user', username=username) )
    except Exception as err:
        flash('form submission error '+str(err))
        return redirect( url_for('index') )
        
@app.route('/login/', methods=["POST"])
def login():
    try:
        username = request.form['username']
        passwd = request.form['password']
        conn = dbi.connect()
        curs = dbi.dict_cursor(conn)
        curs.execute('''SELECT uid,hashed
                      FROM userpass
                      WHERE username = %s''',
                     [username])
        row = curs.fetchone()
        if row is None:
            # Same response as wrong password,
            # so no information about what went wrong
            flash('login incorrect. Try again or join')
            return redirect( url_for('index'))
        hashed = row['hashed']
        if hashed == passwd:
            flash('successfully logged in as '+username)
            session['username'] = username
            session['uid'] = row['uid']
            session['logged_in'] = True
            session['visits'] = 1
            return redirect( url_for('user', username=username) )
        else:
            flash('login incorrect. Try again or join')
            return redirect( url_for('index'))
    except Exception as err:
        flash('form submission error '+str(err))
        return redirect( url_for('index') )


@app.route('/user/<username>')
def user(username):
    try:
        # don't trust the URL; it's only there for decoration
        if 'username' in session:
            username = session['username']
            uid = session['uid']
            session['visits'] = 1+int(session['visits'])
            return render_template('greet.html',
                                   page_title='My App: Welcome {}'.format(username),
                                   name=username,
                                   uid=uid,
                                   visits=session['visits'])

        else:
            flash('you are not logged in. Please login or join')
            return redirect( url_for('index') )
    except Exception as err:
        flash('some kind of error '+str(err))
        return redirect( url_for('index') )

@app.route('/logout/')
def logout():
    try:
        if 'username' in session:
            username = session['username']
            session.pop('username')
            session.pop('uid')
            session.pop('logged_in')
            flash('You are logged out')
            return redirect(url_for('index'))
        else:
            flash('you are not logged in. Please login or join')
            return redirect( url_for('index') )
    except Exception as err:
        flash('some kind of error '+str(err))
        return redirect( url_for('index') )


if __name__ == '__main__':
    import sys,os
    if len(sys.argv) > 1:
        # arg, if any, is the desired port number
        port = int(sys.argv[1])
        assert(port>1024)
    else:
        port = os.getuid()
    dbi.cache_cnf()             # use my personal database
    conn = dbi.connect()
    curs = dbi.dict_cursor(conn)
    curs.execute('select database() as db')
    row = curs.fetchone()
    print('Connected to {}'.format(row['db']))
    app.debug = True
    app.run('0.0.0.0',port)

Observations:

  1. The app requires a new table, userpass. I've given you a script, userpass-recreate.sql that will (re-)create it for you.
  2. The join form checks that the passwords match in JavaScript
  3. The join route also checks that the passwords match
  4. We check that the username doesn't already exist
  5. The username is stored in the session and is used to check that the user is really logged in.
  6. The URL in the /user/ route is not trusted. It's just there to remind the user who they are logged in as.
  7. Routes are surrounded by try/except; it sometimes helps with inscrutable errors. (The errors could just be printed to the console rather than flashed but I like this during development). Alternatively, remove it and get the backtrace.
  8. login gives the same error message for wrong username as wrong password. Why?
  9. Both /login/ and /join/ just do something and re-direct.
  10. The logout route removes the username, uid, and logged_in keys from the session in order to log someone out.
  11. Note that logged_in doesn't actually do anything.
  12. All in less than 150 lines of code!

But it doesn't use bcrypt and it should. See exercise below.

CS304login

The basic use of bcrypt is complicated enough, but dealing with registrations and logins creates extra layers of complexity:

  • Registration:
    • Because of concurrency, rather than check and then insert, we will just insert and hope for the best. An exception is raised if we are unlucky.
    • If the inserted username is already in use, PyMySQL raises an IntegrityError whose error code is ER.DUP_ENTRY. We want to catch that and report the problem to the user.
    • If some other IntegrityError is raised, we want to tell the user about that.
    • Otherwise, we want to return the new user ID
  • Login
    • The username might be invalid, in which case we return False, False
    • The password might be invalid, in which case we return False, False
    • Otherwise we return True, UID

Here's code that does that, packaged up in a Python file/module called cs304login.py.

import cs304dbi as dbi
import pymysql
import bcrypt

def insert_user(conn, username, password, verbose=False):
    '''inserts given username & password into the userpass table.  
Returns three values: the uid, whether there was a duplicate key error, 
and either false or an exception object.
    '''
    hashed = bcrypt.hashpw(password.encode('utf-8'),
                           bcrypt.gensalt())
    curs = dbi.cursor(conn)
    try: 
        curs.execute('''INSERT INTO userpass(username, hashed) 
                        VALUES(%s, %s)''',
                     [username, hashed.decode('utf-8')])
        conn.commit()
        curs.execute('select last_insert_id()')
        row = curs.fetchone()
        return (row[0], False, False)
    except pymysql.err.IntegrityError as err:
        details = err.args
        if verbose:
            print('error inserting user',details)
        if details[0] == pymysql.constants.ER.DUP_ENTRY:
            if verbose:
                print('duplicate key for username {}'.format(username))
            return (False, True, False)
        else:
            if verbose:
                print('some other error!')
            return (False, False, err)

def login_user(conn, username, password):
    '''tries to log the user in given username & password. 
Returns True if success and returns the uid as the second value.
Otherwise, False, False.'''
    curs = dbi.cursor(conn)
    curs.execute('''SELECT uid, hashed FROM userpass 
                    WHERE username = %s''',
                 [username])
    row = curs.fetchone()
    if row is None:
        # no such user
        return (False, False)
    uid, hashed = row
    hashed2_bytes = bcrypt.hashpw(password.encode('utf-8'),
                                  hashed.encode('utf-8'))
    hashed2 = hashed2_bytes.decode('utf-8')
    if hashed == hashed2:
        return (True, uid)
    else:
        # password incorrect
        return (False, False)

def delete_user(conn, username):
    curs = dbi.cursor(conn)
    curs.execute('''DELETE FROM userpass WHERE username = %s''',
                 [username])
    conn.commit()

if __name__ == '__main__':
    conn = dbi.connect()
    delete_user(conn, 'fred')
    delete_user(conn, 'george')

    for username in ['fred', 'george', 'fred']:
        print('inserting', username)
        print('\t', insert_user(conn, username, 'secret', True))
        print('login', username)
        print('\t', login_user(conn, username, 'secret'))

You are welcome to use that module in your own projects. You can install it in your virtual environment like this:

cp ~cs304flask/pub/downloads/login/cs304login.py ~/cs304/venv/lib/python3.9/site-packages/
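
If you want to experiment with the "insert and hope for the best" idea without a MySQL server, here is a stand-alone sketch. It swaps in Python's built-in sqlite3 module, so the exception class and the way to get the new ID (lastrowid) differ from the PyMySQL code above, and the simplified two-value return is an assumption made for illustration.

```python
import sqlite3

def insert_user(conn, username, hashed):
    '''Insert and hope for the best. Returns (uid, is_duplicate).'''
    try:
        curs = conn.execute(
            'INSERT INTO userpass(username, hashed) VALUES (?, ?)',
            [username, hashed])
        conn.commit()
        return (curs.lastrowid, False)
    except sqlite3.IntegrityError:
        # In this sketch, any integrity error is treated as a duplicate
        # username, since UNIQUE is the constraint we expect to fail.
        return (None, True)

conn = sqlite3.connect(':memory:')
conn.execute('''CREATE TABLE userpass(
                  uid INTEGER PRIMARY KEY AUTOINCREMENT,
                  username TEXT UNIQUE NOT NULL,
                  hashed CHAR(60) NOT NULL)''')

# The second 'fred' is rejected by the database, not by our code.
for name in ['fred', 'george', 'fred']:
    print(name, insert_user(conn, name, 'placeholder-hash'))
```

The database enforces uniqueness, so there is no window between checking and inserting in which another request could grab the same username.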

Breakouts

I'll leave you on your own. You can

  • work on the CRUD assignment
  • work on your project draft
  • work on the Flask login exercise, below

Use bcrypt for Flask Login

Convert the app to use bcrypt. You'll have to

  1. Create a suitable table in your MySQL database for your usernames and passwords. I gave you two .SQL files to help with that. Read them and use them.
  2. Edit start.py
  3. import the bcrypt library
  4. change the way passwords are stored
  5. change the way passwords are checked

My solutions are in done.py and in done2.py. We'll walk through them next time, to make sure you'll be able to use them in your projects, or adapt them any way you want.
