Quiz

  1. Why should use use char() for social security numbers? Are characters ('123') somehow more space efficient than numbers (123)?

    Great question. The main thing is that numbers are just different. First, numbers have some weird problems char don't. For example, there's a maximum integer, namely 2,147,483,647. max int. So, suppose we try to store a SSN of 321-45-8911 into an int. We get 2147483647 for the SSN. Hunh??? That's weird, unexpected and very unpleasant.

    There is a tiny amount of time spent converting from characters (for I/O) and numbers (internal representation) but that's not important.

  2. Now that we have seen varchar(), is there any place where it would be the best choice to use a string as the type rather than a varchar()? They seem functionally very similar to me, but varchar() I think are more space efficient?

    True. There is a tiny amount of space overhead for varchar, so varchar(2) is probably less efficient than char(2). And fixed-length fields are faster to access than variable-length fields. So if something is essentially fixed-length (like SSN or unit number) I'd prefer char. But, yeah, for any string of non-trivial length where it's probably variable, I'd use varchar().

  3. If you are using varchar and you know that the information going inside a certain column will be around a general length, should you do varchar of a little more than that general length, or should you always do varchar of a large number? Does it make a difference in terms of space and time if you do varchar(1000) vs. varchar(200)?

    Another great question. You're right, if we do varchar(1000) and, in practice, nothing is ever longer than 200, then we never use any extra space. The limit is just there for cutting things off instead of letting them get out of hand.

    Varchar doesn't care about the average length, just the worst case. So, allow for that, within reason.

  4. What is the purpose of "not null" in "uid int not null auto_increment"? At first I thought it was to tell auto_increment not to just let rows stand with NULL as the primary key. But you didn't say "not 0", and I also saw you use "uid int auto_increment", so I am confused as to what the "not null" is doing.

    Primary keys are never allowed to be NULL, so that's a standard declaration. For example, in a student database, B-number cannot be NULL.

    We might also specify that name can't be null, but address might be allowed to be NULL.

  5. Can you go over updating tables with autoincrement (and sets)?

    I like the "bakery" metaphor: the table has an internal counter, and increments it whenever an INSERT happens, and uses the counter value for the ID. Guarantees uniqueness as long as you don't try to override the counter.

  6. So, in other words, auto_increment means autogenerated? Can we auto_increment ids that include letters or other characters or is it only applicable to ints?

    Yes, it's autogenerated. Has to be a number.

  7. I'm a little confused as to how last_insert_id() solves the problem described in the reading where if you use max(uid), you might get the wrong value if someone else added a user before you could type the command. Wouldn't the same thing happen if you did last_insert_id() after another user was making changes as well?

    Understandable. We'll revisit this later in the course, when we talk about concurrency. But for now, imagine this scenario:

    • Fred and George connect to the database, independently (separate connections)
    • Fred inserts his name
    • George inserts his name
    • Fred asks what ID did I get?
    • George asks what ID did I get?

    If we use the MAX technique, they both get the same answer, namely George's ID. If we use the LAST_INSERT_ID technique, they both get the correct answer.

  8. How does deleting values work with auto-increment? Will the next value added fill in the missing space or be given the next highest key value like normal?

    AUTO_INCREMENT doesn't fill in "holes". It just increments a counter.

  9. What do we do if we want multiple datatypes for one column?

    We can't. Gotta commit to a datatype. You could use the more-general one and coerce things later.

  10. more examples regarding the difference between set and enum / Could you show another example about the difference between set and enum?

    You're responding to a wedding invitation and the RSVP card says chicken, fish or vegetarian? Is that SET or ENUM?

    You're asked for your ethnicity: SET or ENUM?

    You're asked your favorite color: SET or ENUM?

  11. Enum sorts using numerical values/they sort the order they are given in the `enum` list. I understand the second part of that statement (the `cat` precedes `dog` makes sense), but not the first part . Could you explain further?

    Sure. If we have ENUM('fish','chicken','vegetarian'), then fish is represented internally as a 1, chicken as 2 and vegetarian as a 3. So if we sort (ORDER BY), all the fish entries come first, then the chicken, and lastly the vegetarian.

    Maybe you don't care, but if you do, use ENUM('chicken','fish','vegetarian')

  12. Do changes in one table flow through to other tables in a database if some columns have the same name?

    No. There's no coordination among tables, even if they have the same name. (You can have a ID and NAME fields in Person and also in MOVIE w/o them being the same ideas or values.)

    Well, with an exception we'll talk about next time. There are also fancy techniques to set up some automatic updating, but we won't cover them.

  13. How does updating a table look like when it shares information with multiple tables? Are there ways to keep track of the changes?

    No again. Each datum is independent. That makes updating easier/faster/more concurrent. But at a price.