DBM #03: Integrity Constraints in Relational Models

Checklist:

> Be able to identify what integrity constraints are violated when an operation is performed either in a referencing or a referenced relation.

> Understand Armstrong's Axioms on Relational Databases

> To check if two sets of FDs are equivalent.

> To find the minimal cover of a set of FD.

> To check if a relation is in a specified normalized form

A database is only good as the information stored in it. Integrity constraints are conditions specified on a database schema that places restrictions on the data that can be stored within it. They are enforced when (a) the schema is defined, and (b) when relations are modified. The types of integrity constraints are:

1. Domain integrity constraints, which restricts the domain e.g., Student_ID must be integers.

2. Key integrity constraints, which imposes a uniqueness for values of an attribute. Null values are allowed.

3. Entity integrity constraints, which say that the primary key must be unique and not null.

4. Functional dependencies, where attributes determine each other. E.g., if the city is London, it is understood that the country is United Kingdom, but not vice versa. Remember that these are defined on the schema, not on the instance.

5. Referential integrity constraint, which applies to attribute of one relation referring to another attribute of another relation. Here, the former should have a valid reference to the latter or can be null. E.g., a student cannot be taking a course that is not present in the relation for courses.

Except for referential integrity constraints, the rest are defined within a particular relation. Referential integrity constraint can also be applied within the same relation.

A foreign key is a set of columns that references unique data values (i.e., candidate keys) of some other table. The referenced attributes need not be a primary key of the referenced relation. The foreign key itself need not be a candidate key and can have null values. However, when the foreign key is composite, the referenced candidate key must also be a composite key with same number of attributes.

Example: If a foreign key of relation R, ab, references some candidate key xy of relation S, then attribute a of R will correspond to x of S and attribute b of R will correspond to y of S.

Referential Actions:

There are cases when referential integrity could get violated. In the referencing relation, referential integrity could get violated during insertion or modification of records and such operations are simply not allowed. Similarly, in the referenced relation, violations of referential integrity can happen during modification and delete operations. But, if a violation occurs as a result of any of these operations in the referenced relation, we can either choose to cascade the operation i.e., reflect the operation in the referencing relation as well, or set the values in the referencing relation to null or some defined value, or even ignore the violation. In SQL, these actions are specified in the referencing table.

Example: Assume that relation R references attribute x of relation S using a foreign key k. If we choose to cascade the violation, then modifying values in x will reflect in those records of R whose values of k are the same as the previous values of those modified in x.

---

Functional Dependencies:

Functional dependencies are usually written as X->Y, where X and Y are set of attributes.

Given an instance of a relation, we can only show that a functional dependency does NOT hold. We cannot comment about the functional dependencies that the relation holds.

1. Transitive rule: X->Y, Y->Z also gives X->Z

2. We can split/ combine the right-hand side of any functional dependency. E.g., X->Y, X->Z would give X->YZ.

3. We can have the trivial functional dependency X->Y where Y is a subset of X.

4. Augmentation: X->Y then XZ->YZ

5. We can compose A->B and C->D as AC->BD

Axioms of Functional Dependencies:

1. Reflexivity - X --> Y and X is a superset of Y.

2. Augmentation - X --> Y implies XA --> YA

3. Transitivity

Closure of an Attribute:

Closure of a set x is the set of all attributes that x can determine. This is useful in listing out the other attributes that can be determined by some attribute set.

To check if a functional dependency is correct in a schema:

A good way to check if a functional dependency is valid or not over a relational schema is by finding the closure of the attribute set on the left.

Equivalency of Two Functional Dependencies

For two sets F, G, we say that F covers G when all functional dependencies of G are implied by F i.e. G is covered by F. To show this, we find the closure of X using F for all X --> Y present in G.

The two sets F, G are said to be equivalent when F is covered by G and G is also covered by F.

Minimal Cover/ Canonical Cover/ Irreducible Functional Dependency Set

The minimal cover of a set of functional dependencies is void of any redundant functional dependency i.e., they could be implied from other functional dependencies. The minimal cover is not unique, and different minimal covers may have different number of FDs. The algorithm is as follows:

(1) Simplification of the RHS: Eliminate any trivial FDs and split the existing functional dependencies such that the RHS only has one attribute.

(2) Simplification of the LHS: {A-->C; AB-->C} would mean B is an extraneous attribute i.e. not required. In general, for an FD XA-->Y, A is redundant/ extraneous when X can determine A. Eliminate the extraneous attribute.

(3) For each of the remaining dependencies, eliminate those that are redundant. For each X->Y, see if Y can be determined using the other FDs (not the one being considered currently), which can be done by finding the closure of X. If so, the FD can be eliminated. Use this new set for subsequent eliminations.

You may get different minimal covers depending on the order of taking the FDs.

Types of Functional Dependencies:

a) Trivial functional dependencies of the form X à Y, where Y is a subset of X. They are trivial since they hold true for every relation. Otherwise, they are non-trivial

b) Partially dependent functional dependencies of the form X à Y where any proper subset of X can determine any attribute of Y. Otherwise, they are fully functionally dependent.

c) Transitive functional dependencies of the form X à a, where there exists some Y such that X àY and Yà a, and a is not in X and Y. Otherwise, they are non-transitive.

Normal Forms:

1) 1NF: The relational model is by default in 1NF since the attributes are assumed to be atomic.

2) 2NF: All non-prime attributes are fully functionally dependent on all candidate keys. This is violated when there are non-prime attributes that are partially dependent on candidate keys i.e. some proper subset of a candidate key determines some non-prime attribute.

3) 3NF: All non-prime attributes are not transitively dependent on all candidate keys. This is violated when

a. Some non-prime attribute is transitively dependent on some candidate key, or

b. Some non-prime attribute is determined is a non-super key, or

c. Some non-prime attribute is determined by another non-prime attribute.

4) BCNF: For all non-trivial functional dependencies X à Y, X is a super key. This is violated when X is not a super key, or when prime and non-prime attributes are either partially or transitively determined by candidate keys. BCNF guarantees no redundancy occurring from the functional dependencies. Other redundancies may still exist.

ThTheorem: A relation with two attributes is always in BCNF.

PRACTICE SET:

Q1. Consider a relation R(a, b, c, d, e) such that a, b, and d have at most 3 different values. If the functional dependencies {ab->c, cd->e} hold over R, find the MAXIMUM number of different values for the attribute e.

Q2: When is a relation in 3NF but not in BCNF?

Q3: T/F - A prime attribute can be transitively dependent on a key in BCNF?

Q4: T/F - X --> Y, WY-->Z implies XW --> Z.

TEST YOUR UNDERSTANDING:

T1: When are integrity constraints enforced in a database?

The Automata of Life

DBM #03: Integrity Constraints in Relational Models

Popular posts from this blog

DBMS #05: Data Independence

DBMS #01: Introduction to DBMS

DBMS #04: Joins, Decomposition and Normalization