Division in Relational Algebra

The "division" operator is diabolical, but fortunately rare. It also has no natural counterpart in SQL, but we will see how to do queries that demand division.

The idea is that

  • Relation A has "extra" attributes compared to relation B.
  • You want the tuples from A where the extra attributes cover all the possibilities in B.

Example: Relation R is R(professors,courses) where there is a tuple if professor P teaches course C.

  • R/(πprofessors(R)) finds courses taught by every professor.
  • R/(πcourses(R)) finds professors who have taught every course.

The denominator of the first is all the professors, so if you divide R by all the professors, you find courses that are taught by every professor.

In the second, the denominator is all the courses, so if you divide R by all courses, you find professors who have taught every course.

Example: List students who have taken all the CS courses, where there are relations Student(Name,id), Transcript(StudID,crsCode) and Course(title,number,crsCode)

Student/πcrsCodecrsCode='CS%'(Course))

The denominator is all the crsCode values corresponding to CS courses, so dividing the table of all students by that denominator yields students who have taken every CS class.

How could we compute that as a SQL query? we List students where there isn't any CS course that they didn't take:

SELECT S.id 
FROM   Student S 
WHERE  NOT EXISTS ( 
       -- all CS courses 
       (SELECT C.crsCode 
        FROM   Course C 
        WHERE  C.crsCode LIKE 'CS%') 
       EXCEPT 
       -- This student's courses 
       (SELECT R.crsCode 
        FROM   Transcript R 
        WHERE  R.StudID = S.id) ) 

MySQL doesn't seem to have the EXCEPT operator, but we can accomplish the same result using EXISTS and NOT IN. Here's an example of a division computation using the WMDB, creating a person (named "Ubiquitous") who is in every movie, and a movie (named "A Cast of Thousands") that has every actor in it.

-- An example of the division operator.  As test data, we create a tuple
-- that can satisfy each of the division queries. First, we create a
-- person with id 0 and a movie with id 0.  Then, we add a credit for
-- person 0 for every movie.  We also add a credit for every person in
-- movie 0.

use wmdb;
delete from person where nm=987654321;
insert into person(nm,name,addedby) values (987654321,'Ubiquitous',1);

-- Note that this inserts as many tuples into the credit table as there
-- are movies, and Ubiquitous is in each of them as a constant.
insert into credit(nm,tt) SELECT  987654321,tt FROM movie;

delete from movie where tt=0;
insert into movie(tt,title) values (0,'A Cast of Thousands');
insert into credit(tt,nm) SELECT 0,nm FROM person;

-- find movies that have every actor:
select tt,title
from movie M
where not exists 
    -- all people not in this movie
    (select nm from person where nm not in 
        -- all people in this movie 
        (select nm from credit C where M.tt=C.tt));

-- find actors that acted in every movie
select nm,name
from person P
where not exists
    -- all movies missing this person
    (select tt from movie where tt not in
        -- all movies this person appears in
        (select tt from credit C where C.nm=P.nm));

-- the "on cascade delete" will take care of deleting all the credits
delete from person where nm=987654321;
delete from movie  where tt=0;