
Welcome!
Everything is fine.
Mongo Query Language¶
Today, we'll start working with MongoDB, at least understanding demos. In lab, we'll start modifying node.js scripts.
Plan¶
- Announcements
- Flashback to
async
andawait
- Recap Databases and MongoDB
- Recap MQL, basic queries
- Recap simple Regexp
- Quiz Questions
- Examples
- Using Mongosh
- Exercises
Announcements¶
- Mongo versus SQL market share
- My goal is to teach you modern stuff
- You now all have username/password to Mongo Atlas (cloud Mongo server).
Flashback to Async and Await¶
await
allows us to write asynchronous code with the illusion that it is synchronous- we must use
await
for any function that returns a promise - a promise is an unresolved computation. Eventually, it will resolve: succeed or fail.
- Any function with
await
in it is asynchronous, which is announced with the keywordasync
. - If
foo
calls helper functionsbar
andbaz
and one of the helpers is asynchronous, then so isfoo
- you can't turn an asynchronous function into a synchronous one
We'll talk briefly about the final (Ajax) part of the quizzes assignment
Databases and MongoDB¶
- Databases store entities in collections
- In MongoDB, each entity is a document: a JSON (BSON) dictionary, possibily with sub-structure
- In principle, each entity could be very different. In practice, they are similar or even identical. But not always.
WMDB¶
Modeled on the real IMDB.
Three main collections:
- staff: uid, name
- people: nm, name, birthdate, addedby, movies
- movies: tt, title, release, addedby, director, cast
Note the plural names. Not always the case, and you won't get an error message if you say "movie" instead of "movies"; you just get an empty collection.
Keys¶
When searching, it's useful to have a unique identifier.
Mongo issues each document a _id
when it's stored, but those aren't
always the most convenient.
I've used integer counters like the real IMDB: namely NM (for names) and TT (for titles).
MQL¶
The Mongo Query Language is big and complicated, but don't panic!
We won't use all or even most of it. Almost all our queries in assignments and projects will be fairly straightforward: queries by keys (after all, that's what keys are for). But I didn't want to give you the mistaken impression that it was so limited, so we'll explore more complex queries in the next assignment.
Find¶
Let's review find 123
The find123.js file gives us a template for many of our queries.
Still, it's tedious to create a file just to test a search, so we'll spend a little time with the mongo shell a little later today.
Regular Expressions¶
Regular Expressions are a powerful language, but we'll just scratch the surface. To find a movie if you know the whole title, you can do this:
let results = await db.collection('movies')
.find({title: "Everything, Everywhere All At Once"})
.toArray();
But if you don't get the title exactly right, there's no match. Regular expressions may match more than one title, but are easier:
let results = await db.collection('movies')
.find({title: /Everything/})
.toArray();
or even
let results = await db.collection('movies')
.find({title: /Everything/i})
.toArray();
If you have to build them dynamically, you can use:
let pattern = new RegExp(userVal, "i");
let query = {title: {$regex: pattern}};
let results = await db.collection('movies').find(query).toArray();
Quiz Questions¶
How did people feel?
- 0
- 0
- 6
- 11
- 0
That's fantastic!
Let's look at your quiz questions
Mongosh¶
Mongosh is the MongoDb shell, meaning it lets you enter expressions and it evaluates them and prints the result.
(This is called a Read-Eval-Print-Loop or REPL. The JS Console in the browser is a REPL. Node.js is a REPL. Python is a REPL. ....)
By default, mongo connects to a local installation of MongoDb (which we have) but in this course, we're using the one that is hosted in the cloud.
To connect to that, we need to have a fancy URI (the more general term for a URL) that includes the hostname, and our username and password.
The username and password is stored in a file in your home directory
called .cs304env
and .cs304env.sh
.
To use the latter, you have to source it (have your shell read it for commands):
source ~/.cs304env.sh
Do that. We'll learn more about that in lab.
Then, you can run mongosh, like this:
mongosh $MONGO_URI
We'll do that together.
Mongosh summary¶
- The program is
mongosh
(Mongo SHell) - Needs a URL (URI) to connect to
- That URI is defined in a file in your account:
~/.cs304env.sh
- You need to source that file to define the environment variable
- Then, you can use the variable on the command line:
source ~/.cs304env.sh # once per login
mongosh $MONGO_URI # the command
Or, alternatively:
~cs304node/public_html/bin/mongo-atlas
Try both
Mongo Commands¶
help
show dbs
-- shows the available databasesshow collections
-- shows the collections in the current databaseuse foo
makefoo
be the current databasedb.bar.find()
-- finds everything in thebar
collectiondb.bar.find({a: 1})
-- finds matching documents in thebar
collection
Examples¶
Let's try some examples. First, go into the WMDB
use wmdb
Now, see some people:
db.people.find()
Note that mongosh is synchronous, so you don't have to do await
,
the way we will in our node.js scripts.
That's overwhelming. Let's just return the name, by supplying a different dictionary as the second argument. (Remember, the first argument is the query, and the second argument is the projection. Supplying an empty dictionary as the query returns all documents in the collection.)
This find
only returns 20 at a time; you can type it
to get the
next 20.
db.people.find({}, {name: 1})
Now, see some of the movies, but just the titles.
db.movies.find({}, {title: 1})
And some people who built the database:
db.staff.find({}, {name: 1})
Movie Lens DB¶
The Movie Lens DB is slightly different:
use movie_lens_db
and
show collections
Let's see a sample:
db.movie.find({},{title: 1})
Here's one movie:
db.movie.find({title: "Toy Story (1995)"})
And, more prettily:
db.movie.find({title: "Toy Story (1995)"}).pretty()
Fancier Queries (Exercises)¶
In the WMDB
- Find the movie with TT = 666
- Look up Michelle Yeoh
- Is your favorite actor in the WMDB?
- Find all the "Harry Potter" movies in the WMDB.
- Find movies with Michelle Yeoh in the cast
- Find people with NM between 100 and 200
- The same people, but listing only their name and nm
- The same data, but sorted by name, ascending
In the Movie Lens DB
- Find all the "Harry Potter" movies
- Find "children's" movies in the movie lens db, meaning movies for children. That's a genre.
db.movies.find({tt: 666}) db.people.find({name: "Michelle Yeoh"}).pretty() db.people.find({name: "John Malkovich"}) db.movies.find({title: /Harry Potter/}) db.movies.find({"cast.nm": 123}) db.people.find({nm: {$gt: 100, $lt: 200}}) db.people.find({nm: {$gt: 100, $lt: 200}}, {nm: 1, name: 1}) db.people.find({nm: {$gt: 100, $lt: 200}}, {nm: 1, name: 1}).sort({name: 1}) // db.movie.find({title: /Harry Potter/}) db.movie.find({genres: /Children/})
Scripts¶
Let's return to the stuff from lab, and run those examples and then look at the code.
Your copy is in a folder:
cd ~/cs304/apps/exampleQueries
node find123.js
node exampleQueries.js
Let's look at the first program, find123.js1
:
// File to find the person with nm 123
// Scott D. Anderson and Olivia Giandrea
const path = require('path');
require("dotenv").config({ path: path.join(process.env.HOME, '.cs304env')});
const { Connection } = require('./connection');
const cs304 = require('./cs304');
const mongoUri = cs304.getMongoUri();
async function main() {
let db = await Connection.open(mongoUri, 'wmdb');
// here's the one line that does the query
let results = await db.collection('people').find({nm: 123}).toArray();
console.log('found', results.length, 'results');
results.forEach((r) => console.log(r));
await Connection.close();
console.log('done');
}
main();
Now, let's look at the more complicated one, exampleQueries.js
// standard modules, loaded from node_modules
const path = require('path');
require("dotenv").config({ path: path.join(process.env.HOME, '.cs304env')});
const { Connection } = require('./connection');
const cs304 = require('./cs304');
const mongoUri = cs304.getMongoUri();
// ================================================================
/** returns titles of movies containing a substring.
*/
function titleIncludesSubstring(db, substring) {
const re = new RegExp(substring, "i");
return db // identify database (in this case, a stored variable)
.collection("movie") // movie collection
.find({title: re}) // find titles containing "og" substring
.project({title: 1}) // only return "title" field in final array
.sort({title: 1}) // sort title in ascending alphabetical order
.toArray(); // convert data to array to work with it in nodejs
}
/** returns titles of movies containing a substring where the genres
* also includes the given substring.
*/
function searchTitleAndGenre(db, titleSubstring, genreSubstring) {
const re1 = new RegExp(titleSubstring, "i");
const re2 = new RegExp(genreSubstring, "i");
return db // identify database (in this case, an argument)
.collection("movie") // use movie collection
.find({title: re1, genres: re2}) // match titles and genres
.project({title: 1, genres: 1}) // only return title and genre fields
.sort({title: 1}) // sort by title in ascending alphabetical order
.toArray(); // convert data to array to work with it in nodejs
}
/** Finds the youngest person in the WMDB collection. Computes locally in Node.js
*/
async function youngestPerson(db) {
// find the youngest actor in the wmdb database's people collection
const people = await db // identify db (in this case, a stored variable)
.collection("people") // use people collection
.find() // find all entries
.project({name: 1, birthdate: 1}) // only return name and birthdate fields
.toArray(); // convert data to array to work with it in nodejs
let youngest = {name: "", birthdate: '01-01-1800'}; // set arbitrary "oldest" birthdate
people.forEach(person => {
// iterate over each person in found people
if (Date.parse(person.birthdate) > Date.parse(youngest.birthdate)) {
youngest = person; // if person is younger than the current best, replace the value
}
})
return youngest;
}
/** Finds the youngest person in the WMDB collection. Computes in
* Mongo and just gets one.
*/
async function youngestPersonMongo(db) {
// find the youngest person in the wmdb database's people collection
const youngest = await db // identify db (in this case, a stored variable)
.collection("people") // use people collection
.find() // find all entries
.project({name: 1, birthdate: 1}) // only return name and birthdate fields
.sort({birthdate: -1}) // sort by birthdate descending (most to least recent)
.limit(1) // only return the "top field"
.toArray(); // convert data to array to work with it in nodejs
// data is returned in an array, so return the first (and only) element
return youngest[0];
}
async function mostCredits(db) {
// find person with most credits in wmdb
const people = await db // identify db (in this case, a stored variable)
.collection("people") // use people collection
.find() // find all entries
.project({name: 1, movies: 1}) // only return name and movies fields
.toArray(); // convert data to array to work with it in nodejs
let longest = {name: "", movies_length: 0}; // starting value
people.forEach(person => { // iterate over each found person
if (person.movies !== undefined) {
if (person.movies.length > longest.movies_length) { // if they're in more movies
longest.name = person.name; // replace longest
longest.movies_length = person.movies.length;
}
}
})
return longest;
}
async function peopleOlderThanDate(db, dateString) {
// find all people in wmdb born after dateString
const people = await db // define db (in this case, a stored variable)
.collection("people") // use people collection
.find() // find all entries
.project({name: 1, birthdate: 1}) // only return name and birthdate fields
.toArray(); //
dateObj = new Date(dateString);
let youngins = people.filter((person) => {
return Date.parse(person.birthdate) > dateObj; // remove any people born before 2000
});
return youngins;
}
/** find all people in wmdb with NM >= given value
*/
function peopleMinId(db, minId) {
return db // define db (in this case, a stored variable)
.collection("people") // use people collection
.find({nm: {$gte: minId}}) // find all people with nm >= minId
.project({nm: 1, name: 1, birthdate: 1}) // only return nm, name and birthdate fields
.toArray(); //
}
/** find all movies with TT <= given value
*/
function moviesMaxId(db, maxId) {
return db // define db (in this case, a stored variable)
.collection("movies") // use movies collection
.find({tt: {$lte: maxId}}) // find all entries with TT < maxId
.project({title: 1, tt: 1}) // only return title and tt fields
.toArray(); // convert data to array to work with it in nodejs
}
async function main() {
console.log('starting function check...\n');
const movie_lens_db = await Connection.open(mongoUri, 'movie_lens_db');
q1 = await titleIncludesSubstring(movie_lens_db, "harry");
console.log("titleIncludesSubstring:",
q1.length, "harry movies found, such as", q1[0]);
q2 = await searchTitleAndGenre(movie_lens_db, "harry", "adventure");
console.log("searchTitleAndGenre:",
q2.length, "harry adventure movies found, such as", q2[0]);
await Connection.close();
console.log("\n");
const wmdb = await Connection.open(mongoUri, 'wmdb');
q6 = await youngestPerson(wmdb);
console.log("youngestPerson",
"the youngest person in wmdb is " + q6.name + ", born on " + q6.birthdate);
q6mongo = await youngestPersonMongo(wmdb);
console.log("youngestPersonMongo",
"the youngest person is " + q6mongo.name + ", born on " + q6mongo.birthdate);
q7 = await mostCredits(wmdb);
console.log("mostCredits ",
q7.name + " has the longest credit list of " + q7.movies_length + " movies");
q8 = await peopleOlderThanDate(wmdb, '2000-01-01');
console.log("peopleOlderThanDate:",
q8.length, "people found with birthdates after 2000, such as", q8[0]);
q9greater = await peopleMinId(wmdb, 10000);
console.log("peopleMinId:",
q9greater.length, "people found with nm > 10000, such as", q9greater[0]);
q9less = await moviesMaxId(wmdb, 5000);
console.log("maxId:",
q9less.length, "movies found with tt < 5000, such as", q9less[0]);
await Connection.close();
console.log("\n");
}
main().catch(console.error);