Quiz

  1. I think I am confused about the tradeoff between doing computation in the database versus in Node. How does aggregation affect performance time/storage?

    Suppose you work for 7-11 and want the total sales for 2025 from a collection of sales. You could connect to MongoDB, download all 10,000,000 sales for 2025, add them all up in Node, and report the total.

    Or, you could compute the total in the cloud, and just download the total.

    The latter is the same amount of arithmetic, but vastly less network traffic: one small document comes back instead of 10,000,000.

    Furthermore, it's quite possible that the work will be done more efficiently in the database (maybe the data doesn't have to be converted from BSON to JSON...).

    In general, try to compute/filter early, in the cloud, before downloading.
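
    Here is a rough sketch of the two options side by side. It assumes `sales` is a collection object from the official `mongodb` Node driver, and that each sale document has a `date` (a Date) and an `amount` (a number); those field names are made up for illustration, so adjust them to your own schema.

    ```javascript
    // Assumed (hypothetical) document shape: { date: Date, region: String, amount: Number }
    const start = new Date("2025-01-01T00:00:00Z");
    const end = new Date("2026-01-01T00:00:00Z");

    // Option A: download every 2025 sale and add them up in Node.
    async function totalInNode(sales) {
      let total = 0;
      const cursor = sales.find({ date: { $gte: start, $lt: end } });
      for await (const sale of cursor) {
        total += sale.amount; // every sale document crosses the network
      }
      return total;
    }

    // Option B: let the database add them up, then download one document.
    const totalPipeline = [
      { $match: { date: { $gte: start, $lt: end } } },       // filter early
      { $group: { _id: null, total: { $sum: "$amount" } } }, // one group holding everything
    ];

    async function totalInDatabase(sales) {
      const [result] = await sales.aggregate(totalPipeline).toArray();
      return result ? result.total : 0; // no 2025 sales means no result document
    }
    ```

    Both functions should return the same number; the difference is that Option A ships every matching document to Node, while Option B ships only the final total.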

  2. Can you talk more about the aggregation pipeline? / Perhaps another example of the aggregation pipeline? / Could you go over again what situations in this class might be helpful to use the aggregation pipeline for?

    Sure. Suppose we are again working for 7-11 and the boss wants a list of sales totals by month and region for 2025.

    So, we *group* by both month and region.

    We *sum* the sales in each group.

    We return the aggregate values:

    
        [ {region: 'NE', month: 'jan', total: 1_234_567},
          {region: 'SE', month: 'jan', total: 2_543_987},
        ... ]
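
    As a sketch, that pipeline might look something like the following. Again, the field names `date`, `region`, and `amount` are assumptions for illustration, and `sales` is assumed to be a `mongodb` driver collection. Note that `$month` gives a number from 1 to 12 rather than a name like 'jan'.

    ```javascript
    // Keep only 2025 sales (assumes a hypothetical `date` field holding a Date).
    const in2025 = {
      date: { $gte: new Date("2025-01-01T00:00:00Z"), $lt: new Date("2026-01-01T00:00:00Z") },
    };

    const monthlyRegionPipeline = [
      { $match: in2025 },
      // Group by both region and month; sum the sales in each group.
      { $group: {
          _id: { region: "$region", month: { $month: "$date" } },
          total: { $sum: "$amount" },
      } },
      // Flatten the compound _id into top-level fields.
      { $project: { _id: 0, region: "$_id.region", month: "$_id.month", total: 1 } },
    ];

    async function monthlyRegionTotals(sales) {
      return sales.aggregate(monthlyRegionPipeline).toArray();
    }
    ```

    In this class, this kind of pipeline is handy any time you want counts, sums, or averages per category instead of the raw documents.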
    
    
  3. Also in what cases is $limit used in aggregation? Does $limit: n just grab the first n inputs?

    Yes. So, suppose in the previous example we only want the top 10 by sales.

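    We take the aggregate values from the previous stage of the pipeline, sort them by total in descending order, and use $limit to keep the first 10. Note that $limit just passes along the first n documents in whatever order they arrive, so the sort has to come before it.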
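
    A sketch of the top-10 version, under the same assumptions as before (hypothetical `date`, `region`, and `amount` fields, and a `mongodb` driver collection called `sales`):

    ```javascript
    // Keep only 2025 sales (assumes a hypothetical `date` field holding a Date).
    const only2025 = {
      date: { $gte: new Date("2025-01-01T00:00:00Z"), $lt: new Date("2026-01-01T00:00:00Z") },
    };

    const top10Pipeline = [
      { $match: only2025 },
      { $group: {
          _id: { region: "$region", month: { $month: "$date" } },
          total: { $sum: "$amount" },
      } },
      { $sort: { total: -1 } }, // -1 means descending: biggest totals first
      { $limit: 10 },           // pass along only the first 10 documents
    ];

    async function top10RegionMonths(sales) {
      return sales.aggregate(top10Pipeline).toArray();
    }
    ```

    If you swapped the last two stages, you would get 10 arbitrary groups and then sort those, which is usually not what you want.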