MySQL Summary
This is a brief summary of some of the concepts and techniques that we've learned with respect to Tempest and MySQL in the first few class meetings. It's arranged in order by when the material is presented. During those first course meetings, it's a good idea to look over what we've done and what we will do the next day.
Part 1: Getting Started with Simple Queries¶
- databases comprise tables of rows of columns of different kinds of info
- under the hood, databases have a data structure that allows rows to be easily searched, inserted, and updated. Examples: b-trees and hash tables.
- rows often inserted/removed. Columns rarely are.
- client-server architecture
- Tempest is the CS server, hosting a MySQLdb server
Unix Commands
ls
lists the contents of a directory; default the current onels -l
lists more inform
deletes a file- up-arrow or
^-p
to go to previous command cd
changes the current directorymore file
types out the given filecurl -O url
downloads the given URLtar xf file
unpacks the given tarfilemysql
starts the MySQL client (the command line interface)mysql < file
runs mysql with input from given file
SQL Commands
select user()
shows who you are logged in asquit
to exithelp
show databases
shows the databases you havesource file
reads commands from the given fileuse database
makes that database the default oneshow tables
shows tables in the default databaseSELECT
query:SELECT col1, col2, ... or * FROM table WHERE boolean expression LIMIT number of row
Part 2: Creating Tables, Inserting, Updating and Deleting Data¶
- tables need a name, named columns, with datatypes
- Usually should have a primary key
- primary key is guaranteed unique and guaranteed to be the fastest lookup
- datatypes determine semantics of operations and storage requirements
- special value NULL for missing data.
- NULL is not equal to any value. Special predicates to search for it.
- values for primary key must be unique
- primary key values are often just arbitrary counters, sometimes
generated by
auto_increment
. - There are many additional datatypes (unsigned integers, integers of different sizes, enums (one of a set of options), sets (a subset of a set of options)
- SQL statements to insert, update, and delete data.
- the
mysqldump
command gives a snapshot of the state of the database at the moment the dump is taken, in a text file that can be used to re-create the state of the database at that moment. The dump is not the database.
SQL Code
CREATE
table statement:CREATE TABLE tablename ( colname datatype, colname datatype, colname datatype, PRIMARY KEY (colname) );
describe table
describes the given table- datatypes:
int
for arithmeticfloat
for arithmeticchar(width)
for a fixed number of charactersvarchar(width)
for a variable number of characters up to some fixed limitdatetime
for storing both date and time
<li><code>INSERT</code> data:
insert into tablename(col1,col2...coln) values (val1,val2...,valn);
is null
oris not null
to find or skip null valuesauto_increment
is an option for values to be generated by a counter in the database, say for ID values.UPDATE
statement:update tablename set col1 = val1, col2=val2; where conditional
the optional WHERE clause often specifies the primary key of the row to update.
DELETE
statement:delete from tablename where conditional
the optional WHERE clause often specifies the primary key of the row to delete
Part 3: Joins¶
- A join combines rows from two or more tables.
- Almost always, the join combines rows that are semantically related, such as a pet and its owner, or a student and their courses.
- Joins are how relational databases do 1-many relationships (lists) and many-to-many relationships.
- Examples to keep in mind: movies and directors (many to one) and movies and actors (many to many)
- Connections between tables are represented by foreign keys: a foreign key is a key in some other table.
- Lists are backwards: each of the many points to the one, as opposed to the one having a list of the many.
- Conceptually, a join can also be thought of as a subset of a cross-product. A cross-product combines all pairs of rows.
- Many-to-many relationships require an intermediate table,
listing pairs of primary keys. For example,
the
credit
table lists pairs of an actor ID (NM) and a movie id (TT) where the pair (x,y) means that actor with NM=x acted in move with TT=y - special syntax for joins
SQL
- columns can be (re)named in a query:
select a.name as 'scoop1', b.name as 'scoop2' from flavor a, flavor b;
- Join on a condition:
select * from A inner join B on (A.x = B.y) WHERE ... select * from movie inner join person on (movie.director = person.nm) WHERE ...
- Join on equality of a condition:
select * from A inner join B using (x) WHERE ... select * from movie inner join credit using (tt) WHERE ... select * from person inner join credit using (nm) WHERE ... select * from person inner join credit using (nm) inner join movie using (tt) WHERE ...
Part 4: Complex Queries¶
Grouping
- You can group rows, reporting one result per group.
- For example, grouping current Wellesley students by graduation year would yield just a few rows (one for each class year, plus Davis Scholars and a few others).
- Can also group by booleans: one group for
true
and one forfalse
- Sometimes you can report values that vary over the group, but you'll only get one arbitrary value, which is not what you want. You should only report aggregate values, like counts, sums, averages and such.
- Can filter groups using a having clause.
Subqueries
- You can use a subquery where the outer query is looking for a (1) value, (2) list, or (3) table.
- Example of (1) looking for a value:
select * from movie where director = (select nm from person where name = 'Peter Jackson');
- Example of (2) looking for a list:
select name from person inner join credit using (nm) where tt in -- returns a list select tt from movie where director = 1392;
- With a sub-query that returns a list, we can use predicates
like
IN
andNOT IN
to see if a value occurs in the list. You can see that in the example above. - Example of (3) looking for a table. Here the subquery is aliased
to
t1
so it's like a temporary table namedt1
. Here, the subquery is a table of titles and director ids of movies added by Scott (UID=1). The outer query joins with the Person table to find the name of the director.select name, title from person, (select title, director from movie where addedby = 1) as t1 where person.nm = t1.director;
- With a sub-query that returns a table, we can use predicates
like
exists
andnot exists
to see if the subquery returns any results. For example: People who don't have acting credits:select name from person where not exists (select * from credit where credit.nm = person.nm)
Sorting
- Databases are good at sorting
- the ORDER BY clause does it.
- the ORDER BY clause is almost last, after everything except LIMIT.
- can sort by more than one column, each ascending (ASC) or descending (DESC)
SQL
- Counting rows:
select count(*) from table
- Eliminate duplicates by using the
distinct
keyword:select `release` from movie; # many duplicates select distinct `release` from movie; # no duplicates
- Grouping reports one row/result per value of the grouping
criterion. The following example gives the release years and the number of
movies in that year.
select `release`,count(*) from movie group by `release`
- Common aggregate functions:
count(), sum(), max(), min(), avg()
Having
clauses allow you to choose which groups to report. The following only reports years where we have more than 10 movies.select `release`,count(*) from movie group by `release` having count(*) > 10;
- Sorting:
select name, birthdate from person order by name ASC, birthdate DESC;
- Wildcards. This finds all names starting with "George"
select name,birthdate from person where name like 'George%';
Part 5: Referential Integrity¶
- Use
InnoDB
for both tables - construct an index for the foreign key
- specify the
foreign key
constraint with policies forupdate
anddelete
use webdb;
-- drop in reverse order
drop table if exists track;
drop table if exists album;
create table album (
aid int auto_increment not null primary key,
title varchar(50),
artist varchar(50)
)
ENGINE = InnoDB;
create table track (
aid int not null,
tid tinyint not null comment 'track id',
title varchar(50),
primary key (aid,tid),
-- necessary for ref integ
INDEX (aid),
foreign key (aid) references album(aid)
on update restrict
on delete restrict
)
ENGINE = InnoDB;