MongoDB Queries

In this assignment, you'll implement a number of queries to the MongoDB databases that we are using for this class. Your implementation will be a queries.js file that will run all the query functions and print the results.

Outline

In lab6, we discussed symlinks and the symlinks connecting our personal ~/cs304/apps/ folder (and its subfolders) with the ones in the course account, ~cs304node. That's all coming into play now.

Your work will be in your personal ~/cs304/apps/queries/ folder, which we created in lab6 that day. You should review and do those steps if you haven't already.

  1. I suggest copying the queriesTemplate.js file to queriesHwk.js. That file will be where all your work is.
  2. Implement the queries below. I suggest writing and testing them one at time, commenting them out when they work.
  3. When you are ready to turn in your work, please uncomment them all, so I can run them in one batch without editing your file.
  4. You can compare your output to mine.
  5. Drop the finished queriesHwk.js file and upload a gradescope item.

Database Quirks

The movie_lens_db has some odd quirks in it. The title field contains both the title of the movie (e.g. Gravity) and the release year (Gravity was released in 2013). The release year is at the end of the field, in parentheses. So Gravity appears as "Gravity (2013)"

So, that means that the title is relatively unique and useable as a key. However, there are additional quirks. Some shows have two entries, with different data including different movieId values, despite having the same title. Here's one:

Atlas atlas-kumn99-shard-0 [primary] movie_lens_db> db.movie.find({title: /^Confessions of a Dangerous Mind/})
[
  {
    _id: ObjectId('63920187ee9e1a9c62bd381d'),
    movieId: 6003,
    title: 'Confessions of a Dangerous Mind (2002)',
    genres: 'Comedy|Crime|Drama|Thriller',
    imdbId: 290538,
    tmdbId: '4912',
  },
  {
    _id: ObjectId('63920187ee9e1a9c62bd4b66'),
    movieId: 144606,
    title: 'Confessions of a Dangerous Mind (2002)',
    genres: 'Comedy|Crime|Drama|Romance|Thriller',
    imdbId: 270288,
    tmdbId: '4912',
  }
]

There are 5 such movies:

  { _id: 'Saturn 3 (1980)', numMovies: 2, minId: 2851, maxId: 168358 },
  { _id: 'War of the Worlds (2005)', numMovies: 2, minId: 34048, maxId: 64997 },
  { _id: 'Confessions of a Dangerous Mind (2002)', numMovies: 2, minId: 6003, maxId: 144606 },
  { _id: 'Eros (2004)', numMovies: 2, minId: 32600, maxId: 147002 },
  { _id: 'Emma (1996)', numMovies: 2, minId: 838, maxId: 26958 }

I suggest just using the title field as the unique identifier in the movie_lens_db. In the wmdb, you should use tt as the unique identifier.

Queries

Each of the following should be its own function, called by main, which will print the results.

Q1 2010 Movies

In the movie_lens_db, return an array of all movies released in 2010, sorted by title in ascending order.

main should print the number of movies and the first one.

Q2 2010 Comedies

In the movie_lens_db, return an array of all comedies released in 2010, sorted by title in ascending order.

main should print the number of movies and the first one.

Q3 610 Ratings

In the movie_lens_db, return an array of all movies rated by the user with id 610. Sort by rating in descending order and then by movie title in ascending order alphabetically.

main should print the number of movies and the first one, which will be the alphabetically first 5-star movie (if any).

Q4 Average Rating of the Princess Bride

In the movie_lens_db, compute and return the average rating of The Princess Bride. You can look up the movieId of that movie, or find the exact title, as you choose.

main should print the average rating.

Q5 Count Distinct Ratings

In the movie_lens_db, return the number of distinct movies rated. That means that if "The Princess Bride" is rated 10 times, it only counts once.

main should print the count.

Q6 Oldest Person

Switching now to the wmdb database.

Return the oldest person in the WMDB. main should print the person's name and birthdate.

Q7 Longest Cast

In the wmdb, return the movie that has the longest cast. main should print the title of the movie and the length of the cast.

Q8 Movies before 1990

In the wmdb, return an array of movies with release dates before 1990, sorted by release year ascending and then by title, ascending. main should print the number of movies and the first one, but just the _id, title and release.

CRUD

The next three queries are in your personal database. The database name should match your Wellesley username, like ww123. You'll write some queries to insert a document into a collection of pets, find it, update it, and delete it.

Q9 Insert

Write a function to insert a new pet you got for your birthday. Decide some reasonable attributes that the document should have, but the pet should have at least a name and a weight.

Q10 Read

Write a function to search your collection for the pet by name or by some other ID, returning the document. main should call Q9 and Q10 and print the returned document.

Q11 Update

Your pet is lazy and pampered and has now gained some weight. Write a function to update the document with the new weight. main should call Q11 and Q10 and print the returned document, which will then show the updated weight.

Q12 Delete

You're off at college now and can't care for your pet properly, so you have asked your parents to care for it. Write a function to delete the pet from your collection. main should call Q12 and Q10, which will then show that the document is gone.

Efficiency

Efficiency is not your primary concern. However, some of these queries can be done either computing the answer locally in Node.js or in the database. Extra bragging rights for solving it in the database.

Documentation

Each function should be documented; this should go without saying. Arguments and return values described, and comments about any tricky parts of the implementation.

Testing

In this assignments folder is the output from my solution to this assignment. You can look at it and compare. The file is

~cs304/public_html/assignments/queries/queries-assignment-solution.out

Here's a link: solution output

Note that the solution above is captured from the database contents at a given time, so it might be out of date if someone adds or removes a movie from the WMDB. So, if you get results that are close but not quite accurate, let me know before you sink too much time into debugging.

Checklist

  • Are functions modular and reusable?
  • Do you have "use strict"; at the top of your .js file?
  • Are all functions properly documented?
  • Is the code nicely indented and readable?
  • Are there lines that are excessively long (longer than 80 characters)?
  • Does the preamble load code for modules that you don't use?
  • For the CRUD operations, did you read the response document to check that the modification worked?

Turning It In

The only file I need is your queriesHwk.js file. Drop that to the cs304 drop folder and upload the queriesHwk.js file to Gradescope (so I can comment), and you're done! Here's a reminder of the drop command:

drop cs304node queriesHwk.js

You should also drop a copy to your partner, if any, for their reference.