Welcome!
Everything is fine.

Mongo Query Language

Today, we'll start working with MongoDB, at least understanding demos. In lab, we'll start modifying node.js scripts.

Plan

  1. Announcements
  2. Flashback to async and await
  3. Recap Databases and MongoDB
  4. Recap MQL, basic queries
  5. Recap simple Regexp
  6. Quiz Questions
  7. Examples
  8. Using Mongosh
  9. Exercises

Announcements

  1. Mongo versus SQL market share
  2. My goal is to teach you modern stuff
  3. You now all have username/password to Mongo Atlas (cloud Mongo server).

Flashback to Async and Await

  • await allows us to write asynchronous code with the illusion that it is synchronous
  • we must use await for any function that returns a promise
  • a promise is an unresolved computation. Eventually, it will resolve: succeed or fail.
  • Any function with await in it is asynchronous, which is announced with the keyword async.
  • If foo calls helper functions bar and baz and one of the helpers is asynchronous, then so is foo
    • you can't turn an asynchronous function into a synchronous one

We'll talk briefly about the final (Ajax) part of the quizzes assignment

Databases and MongoDB

  • Databases store entities in collections
  • In MongoDB, each entity is a document: a JSON (BSON) dictionary, possibily with sub-structure
  • In principle, each entity could be very different. In practice, they are similar or even identical. But not always.

WMDB

Modeled on the real IMDB.

Three main collections:

  • staff: uid, name
  • people: nm, name, birthdate, addedby, movies
  • movies: tt, title, release, addedby, director, cast

Note the plural names. Not always the case, and you won't get an error message if you say "movie" instead of "movies"; you just get an empty collection.

Keys

When searching, it's useful to have a unique identifier.

Mongo issues each document a _id when it's stored, but those aren't always the most convenient.

I've used integer counters like the real IMDB: namely NM (for names) and TT (for titles).

MQL

The Mongo Query Language is big and complicated, but don't panic!

We won't use all or even most of it. Almost all our queries in assignments and projects will be fairly straightforward: queries by keys (after all, that's what keys are for). But I didn't want to give you the mistaken impression that it was so limited, so we'll explore more complex queries in the next assignment.

Find

Let's review find 123

The find123.js file gives us a template for many of our queries.

Still, it's tedious to create a file just to test a search, so we'll spend a little time with the mongo shell a little later today.

Regular Expressions

Regular Expressions are a powerful language, but we'll just scratch the surface. To find a movie if you know the whole title, you can do this:

    let results = await db.collection('movies')
        .find({title: "Everything, Everywhere All At Once"})
        .toArray();

But if you don't get the title exactly right, there's no match. Regular expressions may match more than one title, but are easier:

    let results = await db.collection('movies')
        .find({title: /Everything/})
        .toArray();

or even

    let results = await db.collection('movies')
        .find({title: /Everything/i})
        .toArray();

If you have to build them dynamically, you can use:

let pattern = new RegExp(userVal, "i");
let query = {title: {$regex: pattern}};
let results = await db.collection('movies').find(query).toArray();

Quiz Questions

How did people feel?

  • 0
  • 0
  • 6
  • 11
  • 0

That's fantastic!

Let's look at your quiz questions

Mongosh

Mongosh is the MongoDb shell, meaning it lets you enter expressions and it evaluates them and prints the result.

(This is called a Read-Eval-Print-Loop or REPL. The JS Console in the browser is a REPL. Node.js is a REPL. Python is a REPL. ....)

By default, mongo connects to a local installation of MongoDb (which we have) but in this course, we're using the one that is hosted in the cloud.

To connect to that, we need to have a fancy URI (the more general term for a URL) that includes the hostname, and our username and password.

The username and password is stored in a file in your home directory called .cs304env and .cs304env.sh.

To use the latter, you have to source it (have your shell read it for commands):

source ~/.cs304env.sh

Do that. We'll learn more about that in lab.

Then, you can run mongosh, like this:

mongosh $MONGO_URI

We'll do that together.

Mongosh summary

  • The program is mongosh (Mongo SHell)
  • Needs a URL (URI) to connect to
  • That URI is defined in a file in your account: ~/.cs304env.sh
  • You need to source that file to define the environment variable
  • Then, you can use the variable on the command line:
source ~/.cs304env.sh   # once per login
mongosh $MONGO_URI      # the command

Or, alternatively:

~cs304node/public_html/bin/mongo-atlas

Try both

Mongo Commands

  • help
  • show dbs -- shows the available databases
  • show collections -- shows the collections in the current database
  • use foo make foo be the current database
  • db.bar.find() -- finds everything in the bar collection
  • db.bar.find({a: 1}) -- finds matching documents in the bar collection

Examples

Let's try some examples. First, go into the WMDB

use wmdb

Now, see some people:

db.people.find()

Note that mongosh is synchronous, so you don't have to do await, the way we will in our node.js scripts.

That's overwhelming. Let's just return the name, by supplying a different dictionary as the second argument. (Remember, the first argument is the query, and the second argument is the projection. Supplying an empty dictionary as the query returns all documents in the collection.)

This find only returns 20 at a time; you can type it to get the next 20.

db.people.find({}, {name: 1})

Now, see some of the movies, but just the titles.

db.movies.find({}, {title: 1})

And some people who built the database:

db.staff.find({}, {name: 1})

Movie Lens DB

The Movie Lens DB is slightly different:

use movie_lens_db

and

show collections

Let's see a sample:

db.movie.find({},{title: 1})

Here's one movie:

db.movie.find({title: "Toy Story (1995)"})

And, more prettily:

db.movie.find({title: "Toy Story (1995)"}).pretty()

Fancier Queries (Exercises)

In the WMDB

  1. Find the movie with TT = 666
  2. Look up Michelle Yeoh
  3. Is your favorite actor in the WMDB?
  4. Find all the "Harry Potter" movies in the WMDB.
  5. Find movies with Michelle Yeoh in the cast
  6. Find people with NM between 100 and 200
  7. The same people, but listing only their name and nm
  8. The same data, but sorted by name, ascending

In the Movie Lens DB

  1. Find all the "Harry Potter" movies
  2. Find "children's" movies in the movie lens db, meaning movies for children. That's a genre.
db.movies.find({tt: 666})
db.people.find({name: "Michelle Yeoh"}).pretty()
db.people.find({name: "John Malkovich"})
db.movies.find({title: /Harry Potter/})
db.movies.find({"cast.nm": 123})
db.people.find({nm: {$gt: 100, $lt: 200}})
db.people.find({nm: {$gt: 100, $lt: 200}}, {nm: 1, name: 1})
db.people.find({nm: {$gt: 100, $lt: 200}}, {nm: 1, name: 1}).sort({name: 1})
//
db.movie.find({title: /Harry Potter/})
db.movie.find({genres: /Children/})

Scripts

Let's return to the stuff from lab, and run those examples and then look at the code.

Your copy is in a folder:

cd ~/cs304/apps/exampleQueries
node find123.js
node exampleQueries.js

Let's look at the first program, find123.js1:

// File to find the person with nm 123
// Scott D. Anderson and Olivia Giandrea

const path = require('path');
require("dotenv").config({ path: path.join(process.env.HOME, '.cs304env')});
const { Connection } = require('./connection');
const cs304 = require('./cs304');

const mongoUri = cs304.getMongoUri();

async function main() {
    let db = await Connection.open(mongoUri, 'wmdb');
    // here's the one line that does the query
    let results = await db.collection('people').find({nm: 123}).toArray();
    console.log('found', results.length, 'results');
    results.forEach((r) => console.log(r));
    await Connection.close();
    console.log('done');
}

main();

Now, let's look at the more complicated one, exampleQueries.js

// standard modules, loaded from node_modules
const path = require('path');
require("dotenv").config({ path: path.join(process.env.HOME, '.cs304env')});
const { Connection } = require('./connection');
const cs304 = require('./cs304');
const mongoUri = cs304.getMongoUri();

// ================================================================

/** returns titles of movies containing a substring. 
 */

function titleIncludesSubstring(db, substring) {
  const re = new RegExp(substring, "i");
  return db                 // identify database (in this case, a stored variable)
    .collection("movie")    // movie collection
    .find({title: re})      // find titles containing "og" substring
    .project({title: 1})    // only return "title" field in final array
    .sort({title: 1})       // sort title in ascending alphabetical order
    .toArray();             // convert data to array to work with it in nodejs
}

/** returns titles of movies containing a substring where the genres
 * also includes the given substring.
 */

function searchTitleAndGenre(db, titleSubstring, genreSubstring) {
  const re1 = new RegExp(titleSubstring, "i");
  const re2 = new RegExp(genreSubstring, "i");
  return db                           // identify database (in this case, an argument)
    .collection("movie")              // use movie collection
    .find({title: re1, genres: re2})  // match titles and genres
    .project({title: 1, genres: 1})   // only return title and genre fields
    .sort({title: 1})                 // sort by title in ascending alphabetical order
    .toArray();                       // convert data to array to work with it in nodejs
}

/** Finds the youngest person in the WMDB collection. Computes locally in Node.js 
 */

async function youngestPerson(db) {
  // find the youngest actor in the wmdb database's people collection
  const people = await db               // identify db (in this case, a stored variable)
    .collection("people")               // use people collection
    .find()                             // find all entries
    .project({name: 1, birthdate: 1})   // only return name and birthdate fields
    .toArray();                         // convert data to array to work with it in nodejs
  let youngest = {name: "", birthdate: '01-01-1800'}; // set arbitrary "oldest" birthdate

  people.forEach(person => {
    // iterate over each person in found people
    if (Date.parse(person.birthdate) > Date.parse(youngest.birthdate)) {
      youngest = person; // if person is younger than the current best, replace the value
    }
  }) 

  return youngest;
}

/** Finds the youngest person in the WMDB collection. Computes in
 * Mongo and just gets one.
 */



async function youngestPersonMongo(db) {
  // find the youngest person in the wmdb database's people collection
  const youngest = await db           // identify db (in this case, a stored variable)
    .collection("people")             // use people collection
    .find()                           // find all entries
    .project({name: 1, birthdate: 1}) // only return name and birthdate fields
    .sort({birthdate: -1})            // sort by birthdate descending (most to least recent)
    .limit(1)                         // only return the "top field" 
    .toArray();                       // convert data to array to work with it in nodejs
    // data is returned in an array, so return the first (and only) element
  return youngest[0]; 
}

async function mostCredits(db) {
  // find person with most credits in wmdb
  const people = await db           // identify db (in this case, a stored variable)
    .collection("people")           // use people collection
    .find()                         // find all entries
    .project({name: 1, movies: 1})  // only return name and movies fields
    .toArray();                     // convert data to array to work with it in nodejs

  let longest = {name: "", movies_length: 0}; // starting value
  people.forEach(person => {                  // iterate over each found person
    if (person.movies !== undefined) {
      if (person.movies.length > longest.movies_length) { // if they're in more movies
        longest.name = person.name;                       // replace longest
        longest.movies_length = person.movies.length;
      }
    }
  }) 

  return longest;
}

async function peopleOlderThanDate(db, dateString) {
  // find all people in wmdb born after dateString 
  const people = await db             // define db (in this case, a stored variable)
    .collection("people")             // use people collection
    .find()                           // find all entries
    .project({name: 1, birthdate: 1}) // only return name and birthdate fields
    .toArray();                       // 

  dateObj = new Date(dateString);
  let youngins = people.filter((person) => {
    return Date.parse(person.birthdate) > dateObj; // remove any people born before 2000
  });

  return youngins;
}

/** find all people in wmdb with NM >= given value
 */

function peopleMinId(db, minId) {
  return db                                       // define db (in this case, a stored variable)
    .collection("people")                         // use people collection
    .find({nm: {$gte: minId}})                    // find all people with nm >= minId
    .project({nm: 1, name: 1, birthdate: 1})      // only return nm, name and birthdate fields
    .toArray();                                   // 
}

/** find all movies with TT <= given value
 */

function moviesMaxId(db, maxId) {
  return db                       // define db (in this case, a stored variable)
    .collection("movies")          // use movies collection
    .find({tt: {$lte: maxId}})      // find all entries with TT < maxId
    .project({title: 1, tt: 1})   // only return title and tt fields
    .toArray();                   // convert data to array to work with it in nodejs
}

async function main() {

    console.log('starting function check...\n');

    const movie_lens_db = await Connection.open(mongoUri, 'movie_lens_db');
    q1 = await titleIncludesSubstring(movie_lens_db, "harry");
    console.log("titleIncludesSubstring:",
                q1.length, "harry movies found, such as", q1[0]);
    q2 = await searchTitleAndGenre(movie_lens_db, "harry", "adventure");
    console.log("searchTitleAndGenre:",
                q2.length, "harry adventure movies found, such as", q2[0]);
    await Connection.close();

    console.log("\n");

    const wmdb = await Connection.open(mongoUri, 'wmdb');
    q6 = await youngestPerson(wmdb);
    console.log("youngestPerson",
                "the youngest person in wmdb is " + q6.name + ", born on " + q6.birthdate);
    q6mongo = await youngestPersonMongo(wmdb);
    console.log("youngestPersonMongo",
                "the youngest person is " + q6mongo.name + ", born on " + q6mongo.birthdate);
    q7 = await mostCredits(wmdb);
    console.log("mostCredits ",
                q7.name + " has the longest credit list of " + q7.movies_length + " movies");
    q8 = await peopleOlderThanDate(wmdb, '2000-01-01');
    console.log("peopleOlderThanDate:",
                q8.length, "people found with birthdates after 2000, such as", q8[0]);
    q9greater = await peopleMinId(wmdb, 10000);  
    console.log("peopleMinId:",
                q9greater.length, "people found with nm > 10000, such as", q9greater[0]);
    q9less = await moviesMaxId(wmdb, 5000);  
    console.log("maxId:",
                q9less.length, "movies found with tt < 5000, such as", q9less[0]);

  await Connection.close();

  console.log("\n");
  
}

main().catch(console.error);