Week 6 Lecture 1

Database normalisation

First normal form

This enforces that tables should be "strictly relational":

  1. There is no top-to-bottom ordering of rows.
  2. There is no left-to-right ordering of columns
  3. There are duplicate rows
  4. Rows have no hidden components such as row IDs
  5. Every row and column intersection contains exactly one value from a particular domain (and nothing else). It should be atomic.
  6. All columns are regular

A 1NF database should not expose anything other than the values to the user.

Set representation

Think of a table as a true set of the rows that are contained within it. It may not contain multiple rows that take the same values for all columns. This can be enforced if the choice of a key for all the tables is forced. In the absense of a smaller key, this is equivalent to using the whole set of attributes as keys.

Data redundancy

Data redundancy is a condition where the same piece of data is held in multiple places. This could lead to potential inconsistencies. Disallowing redundancy is the core principle in the database normalisation.

  • Update anomaly
  • Insertion anomaly
  • Deletion anomaly

Table decomposition

Splitting a large table into multiple smaller tables. The smaller tables contain attribute sets whose union is the attribute set of the large table.

Tables cannot be decomposed to avoid anomalies. It is reqired that the decomposition is lossless, else information will be lost while trying to avoid data redundancy.