MySQL Summary

This is a brief summary of some of the concepts and techniques that we've learned with respect to Tempest and MySQL in the first few class meetings. It's arranged in order by when the material is presented. During those first course meetings, it's a good idea to look over what we've done and what we will do the next day.

Part 1: Getting Started with Simple Queries¶

databases comprise tables of rows of columns of different kinds of info
under the hood, databases have a data structure that allows rows to be easily searched, inserted, and updated. Examples: b-trees and hash tables.
rows often inserted/removed. Columns rarely are.
client-server architecture
Tempest is the CS server, hosting a MySQLdb server

Unix Commands

ls lists the contents of a directory; default the current one
ls -l lists more info
rm deletes a file
up-arrow or ^-p to go to previous command
cd changes the current directory
more file types out the given file
curl -O url downloads the given URL
tar xf file unpacks the given tarfile
mysql starts the MySQL client (the command line interface)
mysql < file runs mysql with input from given file

SQL Commands

select user() shows who you are logged in as
quit to exit
help
show databases shows the databases you have
source file reads commands from the given file
use database makes that database the default one
show tables shows tables in the default database

SELECT query:

 
SELECT col1, col2, ... or * 
FROM table 
WHERE boolean expression
LIMIT number of row

the last two clauses are option.

Part 2: Creating Tables, Inserting, Updating and Deleting Data¶

tables need a name, named columns, with datatypes
Usually should have a primary key
primary key is guaranteed unique and guaranteed to be the fastest lookup
datatypes determine semantics of operations and storage requirements
special value NULL for missing data.
NULL is not equal to any value. Special predicates to search for it.
values for primary key must be unique
primary key values are often just arbitrary counters, sometimes generated by auto_increment.
There are many additional datatypes (unsigned integers, integers of different sizes, enums (one of a set of options), sets (a subset of a set of options)
SQL statements to insert, update, and delete data.
the mysqldump command gives a snapshot of the state of the database at the moment the dump is taken, in a text file that can be used to re-create the state of the database at that moment. The dump is not the database.

SQL Code

CREATE table statement:

 
CREATE TABLE tablename  ( 
    colname  datatype, 
    colname  datatype, 
    colname  datatype, 
    PRIMARY KEY (colname) 
);

describe table describes the given table
datatypes:
- int for arithmetic
- float for arithmetic
- char(width) for a fixed number of characters
- varchar(width) for a variable number of characters up to some fixed limit
- datetime for storing both date and time
```
<li><code>INSERT</code> data:
```
```
 
insert into tablename(col1,col2...coln) values (val1,val2...,valn); 
```
is null or is not null to find or skip null values
auto_increment is an option for values to be generated by a counter in the database, say for ID values.
UPDATE statement:
```
 
update tablename set col1 = val1, col2=val2;
where conditional
```
the optional WHERE clause often specifies the primary key of the row to update.
DELETE statement:
```
 
delete from tablename
where conditional
```
the optional WHERE clause often specifies the primary key of the row to delete

Part 3: Joins¶

A join combines rows from two or more tables.
Almost always, the join combines rows that are semantically related, such as a pet and its owner, or a student and their courses.
Joins are how relational databases do 1-many relationships (lists) and many-to-many relationships.
Examples to keep in mind: movies and directors (many to one) and movies and actors (many to many)
Connections between tables are represented by foreign keys: a foreign key is a key in some other table.
Lists are backwards: each of the many points to the one, as opposed to the one having a list of the many.
Conceptually, a join can also be thought of as a subset of a cross-product. A cross-product combines all pairs of rows.
Many-to-many relationships require an intermediate table, listing pairs of primary keys. For example, the credit table lists pairs of an actor ID (NM) and a movie id (TT) where the pair (x,y) means that actor with NM=x acted in move with TT=y
special syntax for joins

SQL

columns can be (re)named in a query:

 
select a.name as 'scoop1', b.name as 'scoop2' 
from flavor a, flavor b;

Join on a condition:

 
select * from A inner join B on (A.x = B.y) WHERE ...     
select * from movie inner join person on (movie.director = person.nm) WHERE ...

Join on equality of a condition:

 
select * from A inner join B using (x) WHERE ...     
select * from movie inner join credit using (tt) WHERE ...     
select * from person inner join credit using (nm) WHERE ...     
select * from person inner join credit using (nm) inner join movie using (tt) WHERE ...

Part 4: Complex Queries¶

Grouping

You can group rows, reporting one result per group.
For example, grouping current Wellesley students by graduation year would yield just a few rows (one for each class year, plus Davis Scholars and a few others).
Can also group by booleans: one group for true and one for false
Sometimes you can report values that vary over the group, but you'll only get one arbitrary value, which is not what you want. You should only report aggregate values, like counts, sums, averages and such.
Can filter groups using a having clause.

Subqueries

You can use a subquery where the outer query is looking for a (1) value, (2) list, or (3) table.

Example of (1) looking for a value:

 
select * from movie 
where director = (select nm from person where name = 'Peter Jackson');

Example of (2) looking for a list:

 
select name from person inner join credit using (nm) 
where tt in  
    -- returns a list 
    select tt from movie where director = 1392;

With a sub-query that returns a list, we can use predicates like IN and NOT IN to see if a value occurs in the list. You can see that in the example above.
Example of (3) looking for a table. Here the subquery is aliased to t1 so it's like a temporary table named t1. Here, the subquery is a table of titles and director ids of movies added by Scott (UID=1). The outer query joins with the Person table to find the name of the director.
```
 
select name, title 
from person, (select title, director from movie where addedby = 1) as t1 
where person.nm = t1.director; 
```
With a sub-query that returns a table, we can use predicates like exists and not exists to see if the subquery returns any results. For example: People who don't have acting credits:
```
 
select name from person 
where not exists (select * from credit 
                  where credit.nm = person.nm) 
```

Sorting

Databases are good at sorting
the ORDER BY clause does it.
the ORDER BY clause is almost last, after everything except LIMIT.
can sort by more than one column, each ascending (ASC) or descending (DESC)

SQL

Counting rows: select count(*) from table

Eliminate duplicates by using the distinct keyword:

 
select `release` from movie;             # many duplicates 
select distinct `release` from movie;    # no duplicates

Grouping reports one row/result per value of the grouping criterion. The following example gives the release years and the number of movies in that year.
```
 
select `release`,count(*) 
from movie 
group by `release` 
```
Common aggregate functions: count(), sum(), max(), min(), avg()
Having clauses allow you to choose which groups to report. The following only reports years where we have more than 10 movies.
```
 
select `release`,count(*) 
from movie 
group by `release` 
having count(*) > 10; 
```

Sorting:

 
select name, birthdate 
from person 
order by name ASC, birthdate DESC;

Wildcards. This finds all names starting with "George"

 
select name,birthdate 
from person 
where name like 'George%';

Part 5: Referential Integrity¶

Use InnoDB for both tables
construct an index for the foreign key
specify the foreign key constraint with policies for update and delete

use webdb;

-- drop in reverse order
drop table if exists track;
drop table if exists album;

create table album ( 
    aid int auto_increment not null primary key, 
    title varchar(50),
    artist varchar(50)
    ) 
ENGINE = InnoDB;

create table track ( 
    aid int not null, 
    tid tinyint not null comment 'track id', 
    title varchar(50), 
    primary key (aid,tid),
    -- necessary for ref integ
    INDEX (aid),
    foreign key (aid) references album(aid) 
        on update restrict 
        on delete restrict
    ) 
ENGINE = InnoDB;