Notes on SQLite

SQLite is a Database Management System (DBMS) similar in spirit to MySQL but very different in architecture. These notes can help you use SQLite instead of MySQL. Since SQLite is easy to install on your personal laptop, it can be a good alternative to MySQL in some circumstances, such as having poor internet connectivity (so ssh to Tempest is a problem) or wanting to use a database on your personal machine.

Storage

With SQLite, the database is stored in a special binary file format, but the entire database is in a single file. The file is read into memory when the sqlite3 command is run. The sqlite3 command plays the role that the mysql client does.

For example:

[cs304@tempest sqlite]$ sqlite3 chinook.db 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
albums          employees       invoices        playlists     
artists         genres          media_types     tracks        
customers       invoice_items   playlist_track
sqlite> .schema albums
CREATE TABLE "albums"
(
    [AlbumId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    [Title] NVARCHAR(160)  NOT NULL,
    [ArtistId] INTEGER  NOT NULL,
    FOREIGN KEY ([ArtistId]) REFERENCES "artists" ([ArtistId]) 
        ON DELETE NO ACTION ON UPDATE NO ACTION
);
CREATE INDEX [IFK_AlbumArtistId] ON "albums" ([ArtistId]);
sqlite> select count(*) from albums;
347
sqlite> select * from albums limit 10;
1|For Those About To Rock We Salute You|1
2|Balls to the Wall|2
3|Restless and Wild|2
4|Let There Be Rock|1
5|Big Ones|3
6|Jagged Little Pill|4
7|Facelift|5
8|Warner 25 Anos|6
9|Plays Metallica By Four Cellos|7
10|Audioslave|8
sqlite> 
[cs304@tempest sqlite]$ 

But with SQLite, there's no database server; there's only the client. Therefore, there's no client/server architecture, and coordinating multiple users (concurrency) is very different. But for our development mode in CS 304, we can use SQLite and the sqlite3 python module.

Tutorial

I've found the following SQLite tutorial site to be very helpful in understanding SQLite in general. Remember that SQLite is not just a Python module; it's a DBMS.

I have written some examples, which I'll describe below. All the example files are in ~cs304/pub/downloads/sqlite.

Note that the chinook.db file that I mentioned above is the demo database available at the SQLite tutorial. So, if you want to follow that tutorial and try some of those examples on the CS server, you can get the database by copying the sqlite folder:

cp -r ~cs304/pub/downloads/sqlite ~/cs304/sqlite

If you want to work locally (on your own laptop), those directions are below.

Data Types

Datatypes are much more flexible in SQLite: a column can store multiple datatypes. SQLite also doesn't have a native "datetime" datatype, but values can be stored as strings or integers (number of days since sometime long ago) to store date values and there are functions to parse and format them.

WMDB2

I've created a "lowest common denominator" version of the WMDB that uses only datatypes and features that are in common between the two systems, MySQL and SQLite. I'm calling it wmdb_sqlite. A copy of it is in the download folder.

Differences between wmdb and wmdb_sqlite include:

  • int datatypes are all just integer; SQLite doesn't have unsigned
  • date datatypes are just text

I kept all the column names the same, so your queries should work unchanged using either database.

If you are curious about the script to create and copy the wmdb to wmdb_sqlite, you are welcome to look at wmdb_sqlite-create-tables.sql in the downloads/sqlite directory.

API

Obviously, in CS 304, you'll need to interact with the DBMS from Python, for which you'll need an API. In switching from PyMySQL to SQLite3, we see some advantages of a consistent API, but also some gaps. First, Python has a standard Database API, which you can read about here: Python DB API. It specifies a number of things that we've seen before:

  • a connect function that returns a connection object
  • cursor objects returned from connection objects
  • an execute method on a cursor object, including prepared queries
  • methods fetchone and fetchall on cursor objects

Both PyMySQL and SQLite3 follow that API, but there are some gaps and inconsistencies. Some of these are understandable, and don't affect us too much. Some do, and will pose difficulties that we will have to overcome.

  • There is no authentication with SQLite. So the step of reading the ~/.my.cnf file is unnecessary.
  • The database is stored in a file, so the connect function has a different interface: you specify the file that the database is stored in.
  • There is no conn.select_db() method, since that's not an option in SQLite.

CS304dbi and SQlite3

The preceding only affect our startup and connections. I implemented a variant of cs304dbi.py called cs304dbi_sqlite3.py that papers over some of these differences. The read_cnf() and cache_cnf() exist but are no-ops (meaning they don't do anything), though there is an object that caches connection information. The use() function stores the name of the database in that cache. The connect() function uses the name of the database from the cache to connect. So, our old startup code looked like this:

    import cs304dbi as dbi
    ...
    dbi.cache_cnf()   # defaults to ~/.my.cnf
    dbi.use('wmdb')
    conn = dbi.connect()

The new code looks like this:

    import cs304dbi_sqlite3 as dbi
    ...
    dbi.cache_cnf()   # or you can omit this
    dbi.use('wmdb_sqlite')
    conn = dbi.connect()

Essentially identical except for the import! This is the beauty and power of APIs, even thin "wrapper" modules like the cs304dbi modules. If you rename the wmdb_sqlite.db file to be wmdb.db, you can even erase that difference, though I don't think it's worth it.

Dictionary Cursors

Another difference between PyMySQL and SQLite3 is that, out of the box, SQLite3 doesn't support dictionary cursors. It can do them, with a little work, and it has a different API for creating them.

I added dictionary cursors to the cs304dbi_sqlite3 module and you can create them using the same functions as in the original cs304dbi module, so your code doesn't need to change at all. For example, the getPeople() function from people_app is completely unchanged:

def getPeople(conn):
    '''Returns the name and birthdate of all the entries in
the person table, as a list of dictionaries.
    '''
    curs = dbi.dict_cursor(conn)
    curs.execute('select name,birthdate from person')
    return curs.fetchall()

Prepared Queries

The SQLite3 API does support prepared queries (and the documentation even refers to the XKCD cartoon as motivation!). However, it uses a different placeholder symbol. It uses a ? symbol rather than the %s placeholder that PyMySQL does.

Unfortunately, papering over that difference is a bit harder, and I have not (yet) done that. I'm afraid I have more pressing tasks. (If anyone is feeling ambitious and wants to help me with this, let me know.) Consequently, you will have to modify your code in order to use prepared queries. It's a pretty easy change: just replace %s in your code with ?. Most editors, including Visual Studio Code, can do this systematically (see the file > replace menu item).

Another difference is that the underlying DBMS implements different functions. For example, MySQL implements date objects and functions like month(), which extracts the month part of a date object. SQLite does not. So the people_born_in_month() function from the people_app example had to be substantively changed. I re-wrote the query to do a wildcard like search searching for birthdates that look like %-MM-% where the MM is the month number with a leading zero. So, the pattern is built like this:

    pattern = '%-{}-%'.format(str(month).zfill(2))

Recall that the month variable is an integer, so we convert it to a string, zero-fill it to 2 places, and then insert it into the pattern. I'm sure you could write this in a variety of ways, but this is what I did. (A query that extracts a substring instead of using a LIKE search might be more efficient, but only by a tiny amount.) The resulting function looks like this:

def people_born_in_month(conn,month):
    '''Returns the name and birthdate of people born in given month,
    as a list of dictionaries
    '''
    curs = dbi.dict_cursor(conn)
    # this uses a prepared query, since the 'month' comes
    # from an untrustworthy source, though the fact that we
    # have to convert it to a pattern is sufficient
    pattern = '%-{}-%'.format(str(month).zfill(2))
    curs.execute('''
        select name,birthdate from person 
        where birthdate like ?''',
                 [pattern])
    return curs.fetchall()

That's it! Getting the dictionary cursor and executing the prepared query is pretty much identical (except notice the ? in the static SQL replacing the %s). For comparision, here's the original code:

def people_born_in_month(conn,month):
    '''Returns the name and birthdate of people born in given month,
    as a list of dictionaries
    '''
    curs = dbi.dict_cursor(conn)
    # this uses a prepared query, since the 'month' comes
    # from an untrustworthy source
    curs.execute('''
        select name,birthdate from person 
        where month(birthdate) = %s''',
                 [month])
    return curs.fetchall()

With these changes in place, our people_app works, now using SQLite instead of MySQL.

All in all, it didn't take long to re-write the people_app and the longest time was in converting the people_born_in_month query.

Other Transition Steps

If you're going to develop locally on your own laptop, you'll need to replicate some of the setup that we have done in CS 304. Here's a to-do list:

  • create a directory on your machine for your CS 304 work, like cs304
  • create a virtual environment in that directory, like venv
  • activate the virtual environment and use pip to install flask
  • luckily, you do not have to install SQLite! It is included in standard Python implementations since Python 2.5.
  • You might have to install the sqlite3 shell command. However, it came pre-installed on my Mac, you might not have to.

Setup on a Mac

I'll show how I set up the local flask app on my personal laptop. Hopefully your experience will be similar.

Python3

First, we are using Python3 in this course, so I need to check that:

ScottAndersonMBP:cs304 scott$ which python
/usr/bin/python
ScottAndersonMBP:cs304 scott$ python -V
Python 2.7.16
ScottAndersonMBP:cs304 scott$ python3 -V
Python 3.5.2

As that transcript shows, the default python is python 2.7.16. But I do have python3; I just have to use a different command. That'll change once I set up the virtual environment.

Virtual Environment

To create the virtual environment, we'll use python3 and the venv module (via the -m command line argument). We will also name the directory venv, just like we did on tempest:

ScottAndersonMBP:cs304 scott$ python3 -m venv venv
ScottAndersonMBP:cs304 scott$ ls
sqlite      sqlite.tgz  venv
ScottAndersonMBP:cs304 scott$ source venv/bin/activate
(venv) ScottAndersonMBP:cs304 scott$ which python
/Users/scott/cs304/venv/bin/python
(venv) ScottAndersonMBP:cs304 scott$ python -V
Python 3.5.2

The beauty of this way of creating the virtual environment is that you know that it's being created with python3. Also, note that once we have activated the virtual environment, we can just say python and it'll use the one in the virtual environment, which is the correct one.

Installing Flask

When I first tried to install Flask, it failed:

(venv) ScottAndersonMBP:cs304 scott$ pip install flask
Collecting flask
  Could not fetch URL https://pypi.python.org/simple/flask/: There was a problem confirming the ssl certificate: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:645) - skipping
  Could not find a version that satisfies the requirement flask (from versions: )
No matching distribution found for flask

A little Googling for the error message led me to this page about pip SSL certificates. That had some commands to check the Pip version and the SSL version from the command line:

(venv) ScottAndersonMBP:cs304 scott$ pip -V
pip 8.1.1 from /Users/scott/cs304/venv/lib/python3.5/site-packages (python 3.5)
(venv) ScottAndersonMBP:cs304 scott$ python -c "import ssl; print (ssl.OPENSSL_VERSION)"
OpenSSL 0.9.8zh 14 Jan 2016

Whew, version 8.1.1 is even older than the person who posted the error, but the version of SSL is the same one, and is clearly the problem. (My laptop is getting a little old; I should replace it someday.) Fortunately, the page above also gives an incantation to upgrade pip:

(venv) ScottAndersonMBP:cs304 scott$ curl https://bootstrap.pypa.io/get-pip.py | python
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1764k  100 1764k    0     0  2072k      0 --:--:-- --:--:-- --:--:-- 2071k
Collecting pip
  Downloading pip-20.0.2-py2.py3-none-any.whl (1.4 MB)
     |████████████████████████████████| 1.4 MB 7.6 MB/s 
Collecting wheel
  Downloading wheel-0.34.2-py2.py3-none-any.whl (26 kB)
Installing collected packages: pip, wheel
  Attempting uninstall: pip
    Found existing installation: pip 8.1.1
    Uninstalling pip-8.1.1:
      Successfully uninstalled pip-8.1.1
Successfully installed pip-20.0.2 wheel-0.34.2
(venv) ScottAndersonMBP:cs304 scott$ pip -V
pip 20.0.2 from /Users/scott/cs304/venv/lib/python3.5/site-packages/pip (python 3.5)

Woo, hoo! Pip version 20 seems like a great improvement. Does that work?

(venv) ScottAndersonMBP:cs304 scott$ pip install flask
Collecting flask
  Downloading Flask-1.1.1-py2.py3-none-any.whl (94 kB)
     |████████████████████████████████| 94 kB 7.7 MB/s 
Collecting click>=5.1
  Downloading click-7.1.1-py2.py3-none-any.whl (82 kB)
     |████████████████████████████████| 82 kB 3.0 MB/s 
Collecting Werkzeug>=0.15
  Downloading Werkzeug-1.0.0-py2.py3-none-any.whl (298 kB)
     |████████████████████████████████| 298 kB 32.4 MB/s 
Collecting itsdangerous>=0.24
  Downloading itsdangerous-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting Jinja2>=2.10.1
  Downloading Jinja2-2.11.1-py2.py3-none-any.whl (126 kB)
     |████████████████████████████████| 126 kB 31.5 MB/s 
Collecting MarkupSafe>=0.23
  Downloading MarkupSafe-1.1.1-cp35-cp35m-macosx_10_6_intel.whl (18 kB)
Installing collected packages: click, Werkzeug, itsdangerous, MarkupSafe, Jinja2, flask
Successfully installed Jinja2-2.11.1 MarkupSafe-1.1.1 Werkzeug-1.0.0 click-7.1.1 flask-1.1.1 itsdangerous-1.1.0
(venv) ScottAndersonMBP:cs304 scott$ 

It does! Hopefully, you'll have less trouble than I did, but I was able to get it to work.

The SQLITE Example

Once you've done that setup, you can download the sqlite example directory. Here's how to do it. I'll describe it twice: once conceptually, and then with commands to run.

  • You'll download a zipfile of the example directory using curl
  • You'll unzip it, re-creating the example directory

Here are the two commands; you can copy/paste/edit them if you want. If you type them, note that the switch to the curl command is a capital letter O, for "output".

curl -O https://cs.wellesley.edu/~cs304/zip/sqlite.zip
unzip sqlite.zip

Here's what it will look like:

scott@Scotts-MBP cs304 % curl -O https://cs.wellesley.edu/~cs304/zip/sqlite.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2131k  100 2131k    0     0  3289k      0 --:--:-- --:--:-- --:--:-- 3284k
scott@Scotts-MBP cs304 % unzip sqlite.zip 
Archive:  sqlite.zip
   creating: sqlite/
  inflating: sqlite/wmdb_sqlite.db   
  inflating: sqlite/cs304dbi_sqlite3.py  
  inflating: sqlite/people.py        
  inflating: sqlite/movie_lens_sqlite.db  
   creating: sqlite/templates/
  inflating: sqlite/templates/people-list.html  
  inflating: sqlite/wmdb_sqlite-create-tables.sql  
  inflating: sqlite/servertime.py    
  inflating: sqlite/chinook.db       
  inflating: sqlite/wmdb_sqlite_ddl.sql  
  inflating: sqlite/app.py           
scott@Scotts-MBP cs304 % ls  
sqlite      sqlite.zip

Example Files

The sqlite example directory that you just downloaded has the following files, which are described in the subsections below:

scott@Scotts-MBP cs304 % ls -1 sqlite
app.py
chinook.db
cs304dbi_sqlite3.py
movie_lens_sqlite.db
people.py
servertime.py
templates
wmdb_sqlite-create-tables.sql
wmdb_sqlite.db
wmdb_sqlite_ddl.sql
scott@Scotts-MBP cs304 % 

cs304dbi

The cs304dbi_sqlite3.py module is the modified database API. You'll import that in your python modules.

TO DO: You should copy that to your virtual environment's site_packages directory.

cp cs304dbi_sqlite3.py $VIRTUAL_ENV/lib/python3.6/site-packages

Note that you may have to edit that command if you're using a different minor version of Python (say, 3.5 instead of 3.6).

the WMDB database

wmdb_sqlite.db is the SQLite version of the WMDB database.

You can work with it using the sqlite3 client. Below is an example. The .schema command is like the describe command that we used with the mysql client. Entering a query is exactly the same.

ScottAndersonMBP:~ scott$ cd cs304/sqlite
ScottAndersonMBP:sqlite scott$ sqlite3 wmdb_sqlite.db 
SQLite version 3.19.3 2017-06-27 16:48:08
Enter ".help" for usage hints.
sqlite> .schema person
CREATE TABLE IF NOT EXISTS "person" (
  "nm" int(11) NOT NULL,
  "name" text,
  "birthdate" text,
  "addedby" int(11) DEFAULT NULL,
  PRIMARY KEY ("nm"),
  CONSTRAINT "person_ibfk_1" FOREIGN KEY ("addedby") REFERENCES "staff" ("uid") ON DELETE SET NULL
);
sqlite> select * from person limit 10;
0|Alan Smithee||1
1|Fred Astaire|1899-05-10|167
2|Lauren Bacall|1924-09-16|1207
3|Brigitte Bardot|1934-09-28|1
4|John Belushi|1949-03-05|
5|Ingmar Bergman|1918-07-14|1
6|Ingrid Bergman|1915-08-29|1
7|Humphrey Bogart|1899-12-25|1247
8|Marlon Brando|1924-04-03|1
9|Richard Burton|1925-11-10|64
sqlite> 

People_app

The directory also has the adapted people_app, which has these files:

  • app.py a few small modifications, like importing the cs304dbi_sqlite3 module
  • people.py modified as described above.
  • servertime.py no modifications!
  • templates the templates folder, with no modifications!

Miscellaneous

A few other files that you might find interesting:

  • wmdb_sqlite-ddl.sql is how the tables are defined for wmdb_sqlite. Pretty much the same as for our wmdb. This is just for reference.
  • chinook.db the SQLite database used in the SQLite tutorial. Otherwise ignore/delete it.
  • movie_lens_sqlite.db a database of movies and ratings from a large public database, converted to SQLite.

Running Flask and Ports

In the normal version of CS 304, all our Flask apps are running on Tempest and therefore need to open a non-priviledged personal port. We used the UID for that, since our UIDs are > 1024 (and therefore not privileged) and are unique to us.

When we are running locally, neither of these considerations apply. We can and should revert to the way Flask normally runs, which is to open an internal network interface, running on localhost and port 5000. So, I have changed the startup part of your code at the end of app.py to the following:

if __name__ == '__main__':
    dbi.use('wmdb_sqlite')
    app.debug = True
    app.run()

The People_app in action

The People App now works, running on localhost:5000 and the Python terminal looks like this:

(venv) ScottAndersonMBP:sqlite scott$ python app.py
 * Serving Flask app "app" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 249-245-801
127.0.0.1 - - [23/Mar/2020 12:07:15] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [23/Mar/2020 12:07:16] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [23/Mar/2020 12:11:45] "GET /people/ HTTP/1.1" 200 -
127.0.0.1 - - [23/Mar/2020 12:11:49] "GET /people-born-in/9 HTTP/1.1" 200 -
127.0.0.1 - - [23/Mar/2020 12:11:53] "GET /people-born-in/11 HTTP/1.1" 200 -

Conclusion

I'm confident that those of you who would prefer to work locally will be able to do so using SQLite, the Python SQLite3 database API. It'll take a little setup, but most of the steps are skills you've learned in CS 304, and the SQLite stuff is a variation on a theme.

Do let me know if you have any difficulty. I'm happy to help!

Don't have Python?

If your development machine (let's call it your laptop) doesn't have Python3 and you still want to develop locally, you'll have to install Python. I discovered that my college Mac doesn't have Python3, so I had to install it.

I found these directions to install Python3 on Mac OSX

I had a few minor hitches, but those directions worked for me.

Python is widely supported, so there should be a pre-compiled binary for most computers/operating systems. Do a web search for "install python" and include your OS, and you'll probably find something.

Make sure that the implementation also includes pip and virtualenv. Actually, with Python3, the virtualenv command is not necessary; you should be able just do the following, which we saw above.

python3 -m venv /path/to/new/virtual/environment

Let me know if you have any trouble.