Week 2 Lecture 2

Set Operations

union, intersect, except operate on relations and correspond to the respective algebra operations. Each of these operations automatically eliminate duplicates. To retain duplicates, use the corresponding multiset versions union all, intersect all and except all.

Aggregate Functions

These functions operate on the multiset of values of a column of a relation or tuples in a relation, and return a value.

  • avg - average
  • min - minimum value
  • max - maximum value
  • sum - sum of values
  • count - number of values
group by with aggregate functions

Attributes in the select clause outsie of aggregate functions must appear in a group by list.

having clause

The conditions in the having clause are applied after the formation of groups whereas conditions in the where clause are applied before forming groups.

The having clause applies to the result of an aggregate function.

Null Values

Is is possible for tuples to have a null value, denotes by null, for some of their attributes. null signifies an unknown value or that a value does not exist. The result of any arithmetic expression involving null is null.

The predicate is null can be used to check for null values.

Null values and three valued logic

P is unknown evalutates to true, if predicate P evaluates to unknown

Any comparison with null returns unknown. The result of the where clause predicate is treated as false if it evalutates to unknown.

  • Unknown or true = true
  • Unknown or false = unknown
  • Unknown or unknown = unknown
  • True and unknown = unknown
  • False and unknown = false
  • Unknown and unknown = unknown
  • not unknown = unknown

Testing for set membership

in or not in

The following code finds courses offered in Autumn 2009 and in Spring 2010

select distinct course_id
from section
where semester = 'Autumn' and year=2009 and
	course_id in (select course_id
                  from section
                  where semester = 'Spring' and year=2010);

Correlation Variables

Another way of specifying the query "Find all courses taught both in Autumn 2009 and Spring 2010"

select course_id
from section as S
where semester='Autumn' and year=2009 and
	exists (select *
            from section as T
            where semester='Spring' and year=2010
            	and S.course_id=T.course_id);

Derived Relations

SQL allows a subquery expression to be used in the from clause. For example: finding the average instructors' salaries of those departments where the average salary is greater than £42,000

select dept_name, avg_salary
from (select dept_name, avg(salary) as avg_salary
      from instructor
      group by dept_name)
where avg_salary > 42000;

Note that the having clause is not needed here

Modification to the database


delete from instructor; - This would delete all instructors

delete from instructor where dept_name='Finance'; - This would delete all instructors from the Finance department

delete from instructor
where dept_name in (select dept_name
                    from department
                    where building='Watson');

This would delete all tuples in the instructor relation for those instructors associated with a department located in the Watson building.


update instructor
	set salary=salary*1.05
    where salary<(select avg(salary) from instructor);

Gives a 5% salary raise to instructors whose salary is less than average.