DBM #02: Relational Model
In relational model, data is stored in a tabular fashion. They use the mathematical concept of relations.
A relation is a set of records. Each row (or tuple) in a relation represents a record and specifies a relationship between the values in that row. A relation does not allow duplicate records.
A schema is a description of data in terms of a data model. It can be seen as a structure/ outline of a relation. An instance of relation is a snapshot of it. While an instance of a relation can change, the schema however cannot.
Every attribute in a relation has a domain i.e., a set of atomic values. If a relation has three attributes with domains D1, D2 and D3, any record of the relation is in D1 X D2 X D3. This would also imply that every instance (a set of records) is a subset of D1 X D2 X D3.
The degree/ arity of a relation is the number of attributes of its relation schema. The cardinality of a relation is the number of records in the relation.
To uniquely identify a non-null record in a relation, we use a super key made up of one or more attributes. It need not be minimal and may have NULL values. The set of all attributes will always be a super key since there are no duplicate records. When the super key is minimal, we call it a candidate key. Remember that the idea of minimal is with respect to the set under consideration. So, we can have simple candidate keys, where the candidate key is just one attribute, and composite candidate keys. The primary key is the candidate key chosen to be the key for a relation. Every relation has exactly one primary key. Primary key and its attributes cannot be null.
Maximum Number of Super Keys and Candidate Keys:
Consider a relation R(a, b, c, d, e). Then the maximum number of super keys is possible when each attribute is by itself a candidate key. For instance, if 'a' is a super key, then 'ab', 'abc', etc. are also super keys. In other words, we are looking at all non-empty subsets of the attribute set. So, we will have 2^5 - 1 = 31 super keys.
For the maximum number of candidate keys, we are looking at keys that do not conflict with each other. For instance, 'a' and 'ab' cannot both be candidate keys simultaneously, whereas 'a', 'b', 'c', 'd', 'e' can. Generally, this number turns out to be C(n, n/2).
Finding Super Keys from a Candidate Key Set:
Consider a relation R(a, b, c, d, e) with candidate keys {a, bc}. Then to count the total number of super keys in R, we consider each candidate key separately. With the key 'a', we have 4 attributes to choose from, giving a superset of size 2^4. For 'bc', we will have a superset of 2^3. However, we are also counting certain keys twice. So, we consider the combined key 'abc', which has a superset of size 2^2, which needs to be excluded.
Q1: Consider a relation R(A, B, C) such that the domains of A, B and C have sizes 20, 20, 30 respectively.
(a) What is the maximum cardinality possible for any instance of R?
(b) What is the total number of instances possible for R?
Q2: Can any attribute of a composite candidate key be null?
Q3: For a relation, can we have a null record i.e., all attributes are null?
Q4: For a relation R(A, B, C, D, E), which of the following is/ are possible set(s) of candidate keys?
(a) {A, BC} (b) {A, AB}, (c) {AB, AC}, (d) { }, (e) {A, B, C, D, E}
Q5: For a relation R(a, b, c, d, e) having candidate keys {abc, acd, ae}, find the total number of super keys possible.