|SMALLINT||7502||Also integer numbers, but takes up less memory on some implementations|
|NUMERIC (p, s)||7502.098||numbers with a decimal point. p is the precision (total number of digits), s is the scale (number of digits after the decimal point)|
|DECIMAL (p,s)||7502.098||same as NUMERIC, but does not reject numbers with larger precision if permitted by the implementation|
|REAL or DOUBLE PRECISION||3.33E+6 (=3,330,000)||numbers stored in scientific notation|
|FLOAT(p)||3.33E+6 (=3,330,000)||also numbers in scientific notation, but the user gets to specify precision. Better to use than REAL or DOUBLE PRECISION, since it makes the database more portable.|
|CHAR (x) or CHARACTER(x)||'John Smith'||used for strings. x is the length of the string. Shorter strings are padded with blank spaces.|
|VARCHAR (x)||'John Smith'||used for strings. x is the length of the string. Shorter strings are not padded.|
|TIME (x)||'14:34:57.75' = 57.75 seconds after 2:34pm||time. x specified the number of fractional digits. The example is in format TIME(2) The highest value for seconds is 60.|
|TIMESTAMP(x)||'2002-02-16 14:34:57'||date and time. x specifies the number of fractional digits in time. The example is in format TIMESTAMP(0)|
CREATE TABLE Students (
The column ID is identified as the primary key, which means that this column cannot be NULL and all entries in the column must be distinct. A weaker constraint NOT NULL for Last_Name only requires that the entry is present in each row, but the entries may have repetitions. The database automatically checks that both requirements are satisfied for all data that we enter. Note that we have chosen Year field to be a string rather than an integer, since we would like to store Year (of graduation) in the form '03', instead of an integer 3.
If we want to add a column Date_of_birth, we can do it using the following command:
ALTER TABLE Students ADD COLUMN Date_of_birth DATE;
Similarly we can remove a column:
ALTER TABLE Students DROP COLUMN Date_of_birth;
Finally, if we don't need the entire table, we can get rid of it by typing
DROP TABLE Students;
Assume that we have the table Student as above and a table Courses given below. The primary key in this table is ID.
Given these tables, we define the table Enrollment which contains information about students taking courses and the student's grades. The constraints defined in the table guarantee that all students and all courses are present in their respective tables. We also define domain for possible values of grades.
CREATE DOMAIN Grade_dom CHAR(1) CHECK (Grade IN ('A', 'B', 'C', 'D', 'F'));
CREATE TABLE Enrollment (
CONSTRAINT Course_cons FOREIGN KEY (Course_ID)
REFERENCES Courses(ID) ON DELETE CASCADE
The clause "ON DELETE CASCADE" guarantees that when the record for the student or the course is deleted from the database, all corresponding entries in enrollment will be deleted as well. Alternatively, you can specify "ON DELETE SET NULL" (if the column is not restricted to non-null data). If you don't specify anything, then the entry will remain even if the student or the course record has been deleted.
INSERT INTO Students
VALUES (333, 'Ann', 'Smith', '02', 3.5, '1982-10-09');
We may also insert values for only some of the columns:
INSERT INTO Students (ID, First_name, Last_name, Date_of_birth)
VALUES (444, 'Mary', 'Scott', '1984-07-08');
Of course, we must provide values for the PRIMARY KEY and for all columns declared NOT NULL.
SET Year = '04', GPA = '3.8'
WHERE First_Name = 'Mary' AND Last_Name = 'Scott';
As the result, the columns Year and GPA of Mary Scott would be set to the new values. If there were previous values, they will be overwritten.
One has to be careful when specifying the WHERE clause of an UPDATE statement. For instance, if there are several students whose name is Mary Scott, then all such records will be updated! It's safer to rewrite the statement as
SET Year = '04', GPA = '3.8'
WHERE ID = 444;
Since IDs are guaranteed to be unique, only one record will be updated. Of course, you have to know the ID to be able to write your query this way!
DELETE statement deletes a row (or rows) in a table. It also has a WHERE clause and affects all rows that satisfy the clause. For instance, the following statement deletes records of all student who have graduated in 1940:
DELETE FROM Students
WHERE Year = '40';
This example suggests that our choice of meaning and representation of Year probably was not very wise. You can think of better options.
For these example let's assume that the tables in the database have the following information:
The simplest SELECT statement selects all information from a table:
SELECT * FROM Students;
will return the entire table Student. If we are only interested in a list of students' first and last names (say, sorted alphabetically by last name), we get the result by entering the following query:
SELECT First_Name, Last_Name FROM Sudents ORDERED BY Last_Name;
The result will be a table
We can also return only entries that satisfy some condition. For instance,
SELECT * FROM Students WHERE Year = '04';
will return the table
We can also specify more complex conditions. For instance, the following query will find all students whose first name starts with an A:
SELECT * FROM Students WHERE First_Name LIKE 'A%';
% is a wildcard that matches any string, including empty string. Underscore _ matches exactly one character. For instance, '%a%r_' matches 'Mary' and 'Alexandra'. Please keep in mind that in strings upper case characters are different from lower case ones.
You can write various conditional expressions for numbers. For instance,
SELECT * FROM Students WHERE GPA > 3.6 AND GPA <= 3.8; will select whose GPA is greater than 3.6 but not less than 3.8. You can form any combinations of conditions by using OR and NOT. You can also rewrite this condition as "WHERE GPA BETWEEN 3.6 AND 3.8". BETWEEN is also applicable to strings, the comparison is alphabetical. You can also specify a list of values, using IN:
SELECT * FROM Students WHERE Year IN ('02', '03');
DATE, TIME, and TIMESTAMP datatypes can be broken down into components (year, month, etc.) using EXTRACT:
SELECT * FROM Students WHERE (EXTRACT (MONTH FROM Date_of_birth)) = 10;
selects all students born in October.
You may also use functions to compute maximum, average, and so on. For instance,
SELECT AVG(GPA) FROM Students;
will compute the average GPA of all students.
SELECT COUNT(*) FROM Courses WHERE Semester LIKE '%01';
counts the number of courses offered in 2001. There are also functions MAX(), MIN(), and SUM() that you might find helpful. Another useful thing to know is that you can check if the value of a column is NULL. For instance, you can choose rows based on the criteria "WHERE Grade IS NULL" or "WHERE grade IS NOT NULL".
Tons of other information about SQL and a lot of examples, is available here.
Selecting from multiple tables. Now we will finally see why this kind of databases is called "relational": we will be selecting data from different tables which is related in a certain why. For instance, suppose we want to find all the students who took CS111 in Fall 2001. Here is how:
SELECT Students.Last_Name, Students.First_Name FROM Students, Enrollment
WHERE Enrollment.Course_ID = 1115 AND Students.ID = Enrollment.Student_ID;
What happens when this query gets processed? A large table gets created which includes all columns of the tables Students and Enrollment. It is filled with all possible combinations of rows of the first and the second table. We say that the two tables are joined. For instance, one row of this large table will contain all the information about CS111 of the spring 2002 semester and all information about Ann Smith. After the large table is constructed, the condition in the WHERE clause is checked for each row. As the result, the following table is returned:
OK, but what if we don't know that the ID for the course is 1115? Then we need to join three tables:
SELECT Students.Last_Name, Students.First_Name FROM Students, Enrollment, Courses
WHERE Courses.Course_Name = 'CS111' AND Courses.Semester = 'F01'
AND Courses.ID = Enrollment.Course_ID AND Students.ID = Enrollment.Student_ID;
Whenever unambiguous, you can omit the table name and specify just the column name. In our case, we can simplify the query as
SELECT Last_Name, First_Name FROM Students, Enrollment, Courses
WHERE Course_Name = 'CS111' AND Semester = 'F01'
AND Courses.ID = Course_ID AND Students.ID = Student_ID;
However, you have to be careful when simplifying queries, because adding another table into the query may create an ambiguity. For instance, what if we want to join these tables with a table Instructors which also has columns Last_Name and First_Name?
You can specify DISTINCT keyword to eliminate duplicate rows in the resulting table. For instance,
SELECT DISTINCT Course_name FROM Courses, Enrollment
WHERE Enrollment.Course_ID = Courses.ID;
will print out the list of names of all courses for which students have ever been registered. Note that it's different from just selecting distinct course names from Courses, because there may be courses that are listed in the table of courses, but have never been offered.
It's clear that the primary key for this table must be a combination of the customer ID and the product (none of these columns by itself may be used to identify rows). But the price of the product depends only on the product, not on the customer! This causes problems: firstly, we cannot record the price of a product until someone actually buys it, and secondly, we record the price of the product with every purchase, which is just a lot of repetitive information!
The reason for the problem is that the table is not in second normal form (2NF): a table is in 2NF if each attribute depends on all components of the primary key.
To remedy the problem, we need to break up this table into two: one for products and prices, the other for purchases.
Suppose we change the table in the example to allow only one entry for each customer. A resulting table may look like this:
While Customer_ID can be used as the primary key now, and the table is in 2NF, there is still a problem: suppose we want to delete the first row because the customer wants to return the bread back to the store. But if this row is deleted, the price of the bread is deleted as well! The table is not in the third normal form (3NF): A table is said to have transitive dependencies if one attribute depends on another attribute, which in turn depends on a third attribute. A table is in 3NF if it is in 2NF and doesn't have transitive dependencies. In our example the price does not depend on the primary key (Customer_ID) directly, but only through the product.
The way to fix the problem is, again, to break the table into two: one for products, and one for purchases.
There are other, more complex rules that prevent less trivial anomalities. However, the rule of thumb in designing a database is to have one table for each kind of entities that you want to describe. In our case, products and purchases are different things, and information about them should not be combined in the same table.
GRANT INSERT (ID, First_Name, Last_Name, Year, Date_of_birth)
ON Students TO Registrar;
allows the registrar to enter information about a new student into the Student table, but does not give privilagies to set the student's GPA.
You can achieve the same effect by granting Registrar INSERT privileges to the entire table and then revoking the right to insert data into GPA:
GRANT INSERT ON Students TO Registrar;
REVOKE INSERT (GPA) ON Students FROM Registrar;
Another important category of privilegies on a table is the ability of a user to reference the table, i.e. use its primary key as a foreign key in his/her own table. Consider the following situation: a company has created a table ToLayOff where it keeps dataabout employees who are going to be laid off. User Cathy does not have SELECT privilegies for the table, so she cannot check who is on the list. However, there is a round-about way to find out! Cathy finds out that the primary key in the table ToLayOff is Emp_ID and creates her own table which references ToLayOff:
CREATE TABLE Snoop (Emp_ID INTEGER REFERENCES ToLayOff);
Now she can try to insert every possible employee ID into her table. Those which she would be able to insert are on the layoff list!
To control user's ability to reference a table (i.e. to create tables which reference it), one may explicitly grant or revoke REFERENCES privileges. By default, no one can reference other owner's table. If you want to let other users reference your table, you must issue statements like this:
GRANT REFERENCES ON ToLayOff TO Personnel_manager;
If you want an object to be accessible to everyone, you may grant privileges on this object to PUBLIC. If you want to grant all of the above privileges, use GRANT ALL. You may redelegate rights to give privileges to other users:
GRANT SELECT ON Students TO Registrar
WITH GRANT OPTION;
You may revoke privileges or just the grant options from other users. For instance, the following statement revokes only the grant priveleges, but not the ability of the user himself to use the table:
REVOKE GRANT OPTION FOR INSERT, DELETE, UPDATE
ON Students FROM Assistant_registrar CASCADE;
The optional CASCADE keyword causes all INSERT, DELETE, UPDATE privileges granted by Assistant_registrar to other users to be revoked. If you omit the CASCADE keyword, then users who got their privileges on Students from Assistant_registrar retain their privileges.
A reasonable way of handling this is to go through the list of products, checking if the quantities are available, and if yes, decrement the product counter. When the first unavailable product is encountered, the transaction is rolled back, and all the quantities are restored to their initial values. All this works fine if there is only one person performing these actions at a time.
However, many things can go wrong if two or more users are performing these steps simultaneously. For instance, the first user may decrement the quantity of a product, and later roll back the transaction. However, the second user has checked the value after it has been decremented, which has not been suffiecient to fill the second order. Then the second user is forced to roll back the transaction, too, even though it could've gone through. This situation is called dirty read: the value that the second user has seen was not a real value, because the transaction that changed it has never been committed.
This situation can be avoided if the database system guarantees that all transactions are performed as if each of them was the only one running at a time, i.e. as if they have been ordered in some way. Such transactions are called serializable. Serializable transactions provide the highest level of data protection. However, they may significantly reduce efficiency. Therefore in some cases it makes sense to relax the requirement for transaction isolation. You may do so by declaring a transaction to be one of the following:
Any SQL statements which are not preceded by a SET TRANSACTION statement are considered a part of SERIALIZABLE READ WRITE transaction. The end of a transaction (which is automatically the beginning of a new one) is marked by a COMMIT or ROLLBACK statement. The following is an example of defining a transaction with a non-default behaviour:
SET TRANSACTION READ ONLY READ COMMITTED ;
... some SQL statements ...
COMMIT or ROLLBACK