
MongoDB and NoSQL Databases
MongoDB is one of a class of so-called "NoSQL" database systems. It has some nice performance properties that account for its recent surge in popularity and the decline in market share for relational database systems (RDBMS) like MySQL. Please read the following brief introductions.
- Will NoSQL Databases Live Up to Their Promise, by Neal Leavitt. This is the best description of NoSQL and comparison with RDBMSs that I've read. It's only three pages long. If you read nothing else, read this.
- NoSQL Wikipedia article gives more information and motivation. It's about 10 pages, but I suggest reading just through the section entitled "document store", which is the first 3 pages.
- NoSQL Explained. This article is from the folks at MongoDB, so it's a little biased, but it does a good job and is a bit more comprehensive than the first two. It is about 8 pages long. It repeats some of the ideas from the first two, but with a bit more detail.
- MongoDB Tutorial. This is a multiple web-page tutorial. Just read the first two, please. Each page is short, bulked up with lots of ads.
MongoDB vs SQL
Take a moment to look at this comparison with SQL which helps to translate the terminology that we are familiar with (tables, rows) to the NoSQL analogs (collections, documents).
MongoDB has a client-server architecture just like MySQL does. There is a daemon process that controls the database files; it's called mongod. You can connect to the database using the client program, which is called mongo. Both are installed on the CS server.
MongoDB
- help
- db.help()
- db.collection.help()
- show dbs
- use <db>
- show collections
MongoDB shell
MongoDB is installed on the CS server. You will need to log in to the CS server, but then you can run it like this. Note that a number of warning messages are printed when it starts; you can ignore those.
Unlike MySQL, where the database administrator (me) has to create a database for you, MongoDB creates them on the fly. I suggest that you use your username, followed by the letters "db", as the name of your database. So, Hermione Granger would say use hgrangerdb. Below, I'll use scottdb:
$ mongo
MongoDB server version: 4.2.10
> use scottdb;
switched to scottdb
> db.actors.find() // empty
> db.actors.insertOne({"name":"Colin Firth"})
// Salma has more info; no need to be consistent
> db.actors.insertOne({"name":"Salma Hayek","birthdate":"9/2/1966"})
> db.actors.find();
{ "_id" : ObjectId("535dfee55ed6d98999b62c71"), "name" : "Colin Firth" }
{ "_id" : ObjectId("535dfef95ed6d98999b62c72"), "name" : "Salma Hayek", "birthdate" : "9/2/1966" }
> db.actors.find().pretty();
{ "_id" : ObjectId("535dfee55ed6d98999b62c71"), "name" : "Colin Firth" }
{
"_id" : ObjectId("535dfef95ed6d98999b62c72"),
"name" : "Salma Hayek",
"birthdate" : "9/2/1966"
}
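The session above stores two documents with different fields in the same collection. As a rough illustration of that flexibility, the sketch below uses plain JavaScript objects as stand-ins for documents. No MongoDB is involved, and matches is a hypothetical helper that only imitates a simple equality filter such as db.actors.find({"name":"Salma Hayek"}).

```javascript
// Plain objects standing in for the two documents inserted above.
// There is no schema: the second document simply has an extra field.
const actors = [
  { name: 'Colin Firth' },
  { name: 'Salma Hayek', birthdate: '9/2/1966' }
];

// Hypothetical helper: true if every key in `query` has an equal value
// in `doc`. This mimics only top-level equality matching, not MongoDB's
// full query language.
function matches(doc, query) {
  return Object.keys(query).every(function (key) {
    return doc[key] === query[key];
  });
}

// An empty query matches everything, like find() with no arguments.
const all = actors.filter(function (doc) { return matches(doc, {}); });

// Documents that lack the field simply don't match.
const born = actors.filter(function (doc) {
  return matches(doc, { birthdate: '9/2/1966' });
});

console.log(all.length);    // 2
console.log(born[0].name);  // Salma Hayek
```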
MongoDB, Node.js and Callbacks
The Mongo shell application above actually returns values, or seems to, so what about this callback-oriented programming style?
Indeed, the mongo shell returns values, just as we would expect. But if we connect to the MongoDB server using a Node.js program, we are required to use callback-style coding. We'll see that in a moment.
Aside: Promises
There's a version of the MongoDB API that uses Promises, which I'm still learning but which have some nice semantic and syntactic properties. I've yet to find or write a good introduction/tutorial, but this Primer on Promises by Jake Archibald looks good and is also amusing.
Just put this on your long list of things I should learn more about.
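To give a flavor of the difference, here is a small sketch that reads a file twice, once with a callback and once with a promise. It uses only Node's built-in fs, os, path, and util modules rather than MongoDB; the scratch file name promise-demo.txt and the function names are made up for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const util = require('util');

// Hypothetical scratch file so the example is self-contained.
const demoFile = path.join(os.tmpdir(), 'promise-demo.txt');
fs.writeFileSync(demoFile, 'raindrops on roses');

// Callback style: the result arrives as an argument to our function.
function readWithCallback(callback) {
  fs.readFile(demoFile, 'utf8', function (err, data) {
    if (err) return callback(err);
    callback(null, data.toUpperCase());
  });
}

// Promise style: util.promisify wraps readFile so that it returns a
// promise, and our function returns a promise instead of taking a callback.
const readFileP = util.promisify(fs.readFile);
function readWithPromise() {
  return readFileP(demoFile, 'utf8').then(function (data) {
    return data.toUpperCase();
  });
}

readWithCallback(function (err, text) {
  if (err) throw err;
  console.log('callback:', text);
});

readWithPromise()
  .then(function (text) { console.log('promise:', text); })
  .catch(function (err) { console.error(err); });
```

Both versions are still asynchronous. The promise version mainly changes where the "what happens next" code lives: in a chained then rather than in a nested function argument.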
Practical Examples
There are some examples that you can run in the course account. (You can copy them to your own account in the usual way if you want to edit/adapt them, but that's not necessary and saves us some disk space if you don't. See me if you want to do that.)
You can run my examples by just cd-ing to the folder in the course account:
cd ~cs304/pub/downloads/mongo
There are several scripts in there that create/read/delete some things in my database. The collection is called things because the example is to have a collection of "my favorite things".
Running the examples
Whether you make your own copy or use the one in the course folder, you run the examples like this:
node list-things.js
node insert-things.js
node list-things.js
node insert-zhivago.js
node insert-things.js
node list-things.js
node delete-things.js
node list-things.js
Query example
I won't show and discuss all the code, but we'll look at one of those scripts, namely the list-things.js script, which prints all the favorite things from a collection (table) called things in the scottdb database. It runs and completes, rather than starting up a server, but you run it using node. We'll see that after looking at the code.
// Followed example at
// http://mongodb.github.io/node-mongodb-native/3.2/tutorials/connect/
const MongoClient = require('mongodb').MongoClient;
const assert = require('assert');
// Connection URL
const url = 'mongodb://localhost:27017';
// Database Name
const dbName = 'scottdb';
// Create a new MongoClient
const client = new MongoClient(url, { useUnifiedTopology: true} );
var findThings = function(db, callback) {
    var col = db.collection('things');
    col.find({}).toArray( function(err, docs) {
        assert.equal(null, err);
        console.log('Found the following documents:');
        for( var i = 0; i < docs.length; i++ ) {
            console.log(i+': ',docs[i].thing);
        }
        console.log('after listing all documents');
        callback();
    });
};
// Use connect method to connect to the Server
client.connect(function(err) {
    assert.equal(null, err);
    console.log("Connected successfully to server");
    const db = client.db(dbName);
    var close = function () {
        client.close();
        console.log('after closing database');
    };
    findThings(db, close);
    console.log('after executing findThings');
});
Things to note in the code:
- the accumulation of callbacks, though promises would ameliorate that
- the way that the database connection is closed after we are done, namely by passing a callback function through several layers of function calls so that after we are done with the database, it gets closed.
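As one illustration of the first point, the listing logic can be written with async/await. This is a sketch, not one of the course scripts: it relies on the driver's connect, toArray, and close methods returning promises when called without a callback, and it takes the client as a parameter so that it can be tried without a server. The name listThings is hypothetical.

```javascript
// Sketch of the list-things logic in promise style. `client` is expected
// to behave like a MongoClient: connect(), db(name), and close().
async function listThings(client, dbName) {
  await client.connect();
  try {
    const col = client.db(dbName).collection('things');
    // With no callback, toArray() returns a promise for the documents.
    const docs = await col.find({}).toArray();
    console.log('Found the following documents:');
    docs.forEach(function (doc, i) {
      console.log(i + ': ', doc.thing);
    });
    return docs;
  } finally {
    // Runs whether or not the query succeeded, so the connection is
    // closed without threading a callback through each layer.
    await client.close();
  }
}
```

With the real driver, this would presumably be called as listThings(new MongoClient(url, { useUnifiedTopology: true }), 'scottdb'), using the same url as in the script above.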
When we run list-things.js with an empty collection, the output looks like this:
$ node list-things.js
Connected successfully to server
after executing findThings
Found the following documents:
after listing all documents
after closing database
$
Note the order in which the "after" strings are printed when the code is run. This demonstrates exactly the execution order from our very first node example, where ' World' is printed before the readFile completes:
console.log('Hello');
fs.readFile('/path/to/file', function(err, data) {
// do something ...
});
console.log(' World');
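The same ordering can be reproduced without a database. In this sketch, setImmediate stands in for the database I/O, and fakeConnect and fakeFind are made-up names; the messages are collected in an array so that the order can be inspected.

```javascript
const messages = [];

// Stand-ins for the asynchronous driver calls: each one returns
// immediately and invokes its callback on a later turn of the event loop.
function fakeConnect(callback) {
  setImmediate(function () { callback(null); });
}
function fakeFind(callback) {
  setImmediate(function () { callback(null, []); });  // empty collection
}

function findThings(callback) {
  fakeFind(function (err, docs) {
    messages.push('Found the following documents:');
    messages.push('after listing all documents');
    callback();
  });
}

fakeConnect(function (err) {
  messages.push('Connected successfully to server');
  findThings(function close() {
    messages.push('after closing database');
    console.log(messages.join('\n'));
  });
  // This line runs before the find callback, just as in list-things.js.
  messages.push('after executing findThings');
});
```

The printed lines come out in the same order as the list-things.js output shown earlier, because findThings only schedules its work and returns.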
A web application combining Node.js with MongoDB
By popular demand, I've created a tiny web application that puts all these pieces together. There's a video on the videos page.
If we combine Node.js responding to requests with MongoDB to store data, we can have a complete database-backed web application that uses an event-loop rather than threads. Our app will support three main routes: a route to list the current favorite things, a route to insert a new favorite thing, and a route to delete a particular thing. So this is most of our CRUD API, skipping only the ability to edit a favorite thing.
All database interactions will be done using Ajax.
Here's the code for the app.js file. Notice the three routes:
- /things/ returns a list (JSON) of all documents in the collection.
- /append/ adds another thing to the collection.
- /delete/:oid removes a thing from the collection using its unique object ID (OID).
// Following the tutorial at https://zellwk.com/blog/crud-express-mongodb/
const myPort = 1942;
const dbName = 'scottdb';
const fs = require('fs');
const express = require('express');
const bodyParser = require('body-parser')
const mongo = require('mongodb');
const assert = require('assert');
const url = "mongodb://localhost:27017/"+dbName;
const client = new mongo.MongoClient(url, { useUnifiedTopology: true} );
var db;
var app = express();
app.use(bodyParser.urlencoded({extended: true}));
// app.use(express.static('static_files'))
const page = fs.readFileSync('page.html','utf8');
// respond with the page when a GET request is made to the homepage
app.get('/', function (req, res) {
    res.send(page);
});
function findThings(db, callback) {
    var col = db.collection('things');
    col.find({}).toArray( function(err, docs) {
        assert.equal(null, err);
        console.log('Found the following documents:');
        for( var i = 0; i < docs.length; i++ ) {
            console.log(i+': ',docs[i].thing);
        }
        if(callback) callback(docs);
    });
}
// refresh the list of things
app.get('/things/', function (req,res) {
    findThings(db, function (docs) { res.send(docs); });
});
// append a new thing; minimal response
// (the response is sent without waiting for the insert to complete)
app.post('/append/', function (req,res) {
    console.log(req.body);
    var col = db.collection('things');
    console.log('appending ',req.body.desc);
    col.insertOne({thing: req.body.desc});
    console.log('ok');
    res.send('ok');
});
// delete a thing; minimal response
// (the response is sent without waiting for the delete to complete)
app.post('/delete/:oid', function (req,res) {
    let oid = req.params.oid;
    var col = db.collection('things');
    console.log('deleting ',oid);
    var obj = {_id: mongo.ObjectId(oid)};
    console.log(obj);
    col.deleteOne(obj, function (err) { console.log('after deleteOne',err)});
    console.log('ok');
    res.send('ok');
});
// Startup code. Load the page once; all updates done via Ajax
client.connect(function (err) {
    assert.equal(null, err);
    // set global database connection
    db = client.db(dbName);
    // test finding things
    findThings(db);
    let msg = `Example app listening on port http://0.0.0.0:${myPort}!`
    app.listen(myPort, () => console.log(msg));
});
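One thing the /delete/:oid route above takes on trust is that oid is a well-formed object ID. A small guard could check that first. The sketch below assumes the page sends the ID as the 24-character hexadecimal string seen in the shell output earlier; looksLikeOid is a hypothetical helper, not part of the driver.

```javascript
// Hypothetical helper: true if `s` is a string of exactly 24 hex digits,
// the textual form of an ObjectId such as 535dfee55ed6d98999b62c71.
function looksLikeOid(s) {
  return typeof s === 'string' && /^[0-9a-fA-F]{24}$/.test(s);
}

console.log(looksLikeOid('535dfee55ed6d98999b62c71'));  // true
console.log(looksLikeOid('not-an-oid'));                // false
```

In the route, a failed check could produce something like res.status(400).send('bad oid') instead of calling deleteOne.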
Besides those three routes, there's the / route that just returns the page. Below is the HTML page. Notice the (empty) list of things and the JavaScript Ajax functions to get the list of things, append a thing to the collection, and delete something using its unique object ID (OID). The last uses a delegated handler.
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<!-- for mobile-friendly pages -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name=author content="Scott D. Anderson">
<meta name=description content="">
<meta name=keywords content="">
<link rel="stylesheet" href="http://cs.wellesley.edu/~anderson/sda-style.css">
<title></title>
</head>
<body>
<h1>Example of Node+Express+MongoDB</h1>
<h2>Favorite Things</h2>
<ol id="things"></ol>
<h2>UI</h2>
<p id="msg"></p>
<ul>
<li> <button id="get">get list of favorite things</button> </li>
<li>
<form>
<label for="thing_desc">Description</label>
<textarea id="thing_desc"></textarea>
<button type="button" id="put">add</button>
</form></li>
</ul>
<script src="https://code.jquery.com/jquery-3.4.1.min.js"></script>
<script>
$("#get").click(function (evt) {
    console.log('refreshing list');
    $.get('/things/', function (docs) {
        $("#things").empty();
        g = docs;
        console.log('got '+docs.length+' things');
        docs.forEach(function (doc) {
            $('<li>')
                .attr('data-oid',doc._id)
                .text(doc.thing)
                .appendTo("#things"); });
    });
});
function refreshNeeded() {
    // tell the user to refresh to see the change
    $("#msg").text('refresh things to see the effect').show();
    setTimeout(function() { $("#msg").fadeOut(); }, 2000);
}
$("#put").click(function (evt) {
    $.post('/append/',{'desc': $("#thing_desc").val()})
    $("form")[0].reset();
    refreshNeeded();
});
$("#things").on('click','[data-oid]', function (evt) {
    let oid=$(this).attr('data-oid');
    $.post('/delete/'+oid);
    refreshNeeded();
});
</script>
</body>
</html>
That's a total of fewer than 150 lines of code for a moderately complete, though tiny, web application.
Notice that the UI is intentionally a little annoying. I force the user to refresh the list of things, rather than using JavaScript to update the page. This is for several reasons:
- concurrency: it would be unrealistic to just have the front-end update the page, because that assumes no one else is editing the database. But in real life, they might, which means that the back end has to respond with an updated list of documents.
- simplicity: the back-end routes don't have to return an updated list of documents and send them back, and the front end doesn't have to handle the response.
Nevertheless, this would be easy to add. I'll leave that as an exercise for the reader.
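One possible shape for that exercise, on the back-end side, is to wait for the insert to finish and then send back the fresh list. This is a sketch under assumptions, not the app's code: appendAndList is a hypothetical helper, and it relies on insertOne and toArray returning promises when no callback is given. Here col is anything with those two methods, so an in-memory stand-in works for experimenting.

```javascript
// Hypothetical helper: insert a thing, then return every document.
// Awaiting insertOne means the list is read only after the write completes.
async function appendAndList(col, desc) {
  await col.insertOne({ thing: desc });
  return col.find({}).toArray();
}

// In-memory stand-in for a collection, only for trying the helper out.
function makeFakeCollection() {
  const docs = [];
  return {
    insertOne: function (doc) { docs.push(doc); return Promise.resolve(); },
    find: function () {
      return { toArray: function () { return Promise.resolve(docs.slice()); } };
    }
  };
}

const demoCol = makeFakeCollection();
appendAndList(demoCol, 'whiskers on kittens')
  .then(function (docs) { console.log(docs.length + ' thing(s)'); })
  .catch(function (err) { console.error(err); });
```

In app.js, the /append/ route would call appendAndList(db.collection('things'), req.body.desc) and pass the result to res.send, and the $.post callback on the page would redraw the list the same way the #get handler does.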
Summary
- Node.js and MongoDB are real players in the database world for good reason: performance.
- They do certain things very well:
  - Inserting documents (rows)
  - Iterating over documents to search
  - Handling I/O-bound HTTP requests
  - Sharding to spread databases over multiple servers
- They (mostly) don't do other things:
  - Joins, so no normalized databases
  - Compute-bound HTTP requests (though they can with additional threads)
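To make the point about joins concrete, here is a sketch with plain JavaScript objects. No MongoDB is involved, and the small data set and the castOf helper are only for illustration. A document store favors embedding related data in one document; with separate collections, the "join" has to be done by the application.

```javascript
// Embedded (denormalized): one document carries its related data.
const movieDoc = {
  title: 'Frida',
  cast: [{ name: 'Salma Hayek' }]
};

// Normalized: two separate collections linked by an ID...
const movies = [{ _id: 1, title: 'Frida' }];
const credits = [{ movieId: 1, name: 'Salma Hayek' }];

// ...so the application performs the join itself.
function castOf(title) {
  const movie = movies.find(function (m) { return m.title === title; });
  if (!movie) return [];
  return credits
    .filter(function (c) { return c.movieId === movie._id; })
    .map(function (c) { return c.name; });
}

console.log(movieDoc.cast[0].name);  // Salma Hayek
console.log(castOf('Frida'));        // [ 'Salma Hayek' ]
```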