• Why does a RDBMS require a structured datatype for each column? I'm wondering what the advantages are of requiring this as opposed to letting the user input what they want. Does it help with the amount of space used by the database or prevent errors?

    Great question! Well, as you recall, we have to define a datatype for all our columns: int, datetime, char(9), varchar(30), etc.

    Why is that? Here are some reasons:

    All of the reasons above are the same as in programming languages. Consider Java (strictly typed) versus Python (dynamic typing).

  • How do you handle relationships between collections in MongoDB if you can't use joins?

    Great question! There are several ways:

  • Could you explain what does it mean that denormalized MongoDB documents can lead to inconsistency across shards?

    Sure. That's another great question.

    Suppose that someone whose name is "Ellen Page" has been in multiple movies and those movies (and their cast lists) are in different shards. (Denormalized the actor name information.)

    Then Ellen changes his name to "Elliot Page" and we have to update the database.

    There might be a time, hopefully brief, when the name has been changed in some shards, but not in others. So the database as a whole is inconsistent.

    This can't happen in a normalized database, because the name is only stored in one place, so the update is atomic. But with denormalized data, we have to chase down all the copies.

    Brief inconsistency is maybe not a problem in the WMDB, but we can try to come up with examples where it is: maybe a banking app where there needs to be global consistency. Or an e-commerce site where you can't sell somethign more than once.