INSTRUCTOR’S MANUAL
TO ACCOMPANY
40th Anniversary Edition
DATABASE PROCESSING
Fundamentals, Design, and Implementation
15th Edition
Chapter 3
The Relational Model and Normalization
David M. Kroenke | David J. Auer | Scott L. Vandenberg | Robert C. Yoder
Chapter 3 The Relational Model and Normalization
Page 3-2
CHAPTER OBJECTIVES
To understand basic relational terminology
To understand the characteristics of relations
To understand alternative terminology used in describing the relational model
To be able to identify functional dependencies, determinants, and dependent
attributes
ERRATA
There are no known errors at this time. Any errors that are discovered in the future will
TEACHING SUGGESTIONS
Databases used in the text examples and solution database files for your use are
Most CS, CIS or MIS majors have dealt with database programs and many have
developed databases before taking this class. In most of those cases, the
“databases” have consisted of one or two tables and have never considered the idea
of design. This is a time to have students reflect on those databases and evaluate
their design.
Chapter 3 The Relational Model and Normalization
Page 3-3
When discussing keys, a good example is an automobile. If you ask for an identifier
for an auto, most students will jump immediately on the license number. Now, what
about autos from different states? Now you have a composite identifier (State +
LicenseNumber). But what about autos at a dealership that do not have a license?
Now we have to use the VIN. But seldom does a police officer try to check the VIN
as he/she chases you down the interstate at 80 MPH. We now have a VIN as the
primary key and the License plate as a candidate key.
so this provides a good point to discuss exactly what is meant to be a determinant.
Point out to students that a functional dependency means that if we know the value
of one (or several) data items, then we can find the value of another one (or several).
They can read a functional dependency such as X Y as “If I know the value of X,
then I can look up the value of Y” (in a table somewhere). It is also easier for
students to understand (at first) if you explain that for any given X you will get only
one value of Y (don’t worry about multivalued dependencies for a while).
Chapter 3 The Relational Model and Normalization
Page 3-4
ANSWERS TO REVIEW QUESTIONS
3.1 Name three sources for databases.
(1) Existing data, (2) the development of new information systems, and (3) the redesign of
existing databases.
3.2 What is the basic premise of this chapter?
The basic premise is that when we receive one or more tables from a source to be included in a
3.3 Explain what is wrong with the table in Figure 3-2.
3.4 Define each of the terms listed in Figure 3-3.
Term
Definition
Relation
A table-like structure of rows and columns, where the rows store
data about an entity and the columns store data about the
attributes of that entity. Each column has a unique name, and all
values in a column are for the same attribute. The cells in the table
can only hold a single value. The order of the rows doesn’t matter,
and neither does the order of the columns. The data in each row as
a whole must be unique.
value of A, we will always know (we can lookup) the corresponding
value of B.
Composite key
A key that contains two or more attributes.
Term
Definition
Chapter 3 The Relational Model and Normalization
Page 3-5
Primary key
The candidate key selected to be the “official” key of a relationship.
Referential integrity
constraint
A value constraint of a foreign key that states that no value can be
placed in the foreign key unless it already exists as a primary key
value in the linked table.
Normal form
One category in a set of categories (“normal forms”) used to
describe relations according to the type of anomalies that can occur
in the relations.
multivalued dependency exists in a separate table, but can cause a
problem if the multivalued dependency exists within a table with
other attributes.
3.5 Describe the characteristics of a table that make it a relation. Define the term domain
and explain the significance of the domain integrity constraint to a relation.
For a table to be a relation:
A. The rows must contain data about an entity
Surrogate key
A column of artificial data added to a relation to serve as the primary
key.
that is then placed in a second table to form a relationship to the
first table by storing linking values. The term foreign key refers to
the attribute or set of attributes in the second table.
Chapter 3 The Relational Model and Normalization
Page 3-6
3.6 Give an example of two tables that are not relations.
Any table that does not meet all of the criteria in 3.5 is not a relation.
Two examples other than those in the text:
SALE
SALE_ID_NUMBER
DATE
TOTAL_AMOUNT
PLANET
2011000654
10-AUG-11
$2,344.78
Mars
(2) A table where all data in rows are not unique two or more rows contain identical data:
SALE
SALE_ID_NUMBER
DATE
TOTAL_AMOUNT
2011000654
10-AUG-11
$2,344.78
3.7 Suppose that two columns in two different tables have the same column name. What
convention is used to give each a unique name?
Chapter 3 The Relational Model and Normalization
3.8 Must all the values in the same column of a relation have the same length?
No. Although the cells in a column must hold the same type data, the values do not have to be
the same length. For example, note the lengths of the values of TOTAL_AMOUNT and
CUSTOMER in the following SALE table:
SALE
SALE_ID_NUMBER
DATE
TOTAL_AMOUNT
CUSTOMER
3.9 Explain the three different sets of terms used to describe tables, columns, and rows.
From spreadsheets, we get the commonly used terms table, rows and columns. E.F. Codd used
Table
Rows
Columns
3.10 Explain the difference between functional dependencies that arise from equations and
those that do not.
All functional dependencies document the fact that the knowledge of the value of one or more
attributes can be used to uniquely determine the value of one or more associated attributes.
Chapter 3 The Relational Model and Normalization
3.11 Intuitively, what is the meaning of the functional dependency:
3.12 Explain the following statement: “The only reason for having relations is to store
instances of functional dependencies.”
If functional dependencies were always based on equations, we could calculate the results of the
related data for future use. Relations are used for that storage.
3.13 Explain the meaning of the expression:
3.14 What is a composite determinant?
3.15 If (A, B)
C, then can we also say that A
C?
3.16 If A
(B, C), then can we also say that A
B?
3.17 For the SKU_DATA table in Figure 3-1, explain why Buyer determines Department, but
Department does not determine Buyer.
Buyer Department because each value of Buyer is always paired with the same value of
3.18 For the SKU_DATA table in Figure 3-1, explain why:
SKU_Description
(SKU, Department, Buyer)
Chapter 3 The Relational Model and Normalization
3.19 If it is true that:
PartNumber
PartWeight
does that mean that PartNumber will be unique in a relation?
3.20 Under what conditions will a determinant be unique in a relation?
A determinant will be unique in a relation only if it determines every other column in the relation.
3.21 What is the best test for determining whether a determinant is unique?
3.22 What is a composite key?
A composite key is a key that has two or more attributes or columns.
3.23 What is a candidate key?
3.24 What is a primary key? Explain the significance of the entity integrity constraint to a
primary key.
The primary key is the candidate key that has been selected from the set of all candidate keys to
3.25 Explain the difference between a candidate key and a primary key.
3.26 What is a surrogate key?
A surrogate key is a column of artificial data added to a relation to serve as the primary key.
3.27 Where does the value of a surrogate key come from?
3.28 When would you use a surrogate key?
Chapter 3 The Relational Model and Normalization
Page 3-10
A surrogate key is used when the non-surrogate primary key would be too large and unwieldy.
3.29 What is a foreign key? Explain the significance of the referential integrity constraint
to a foreign key.
A foreign key is the attribute (or set of attributes) that is a primary key in one table that is then
3.30 The term domestic key is not used. If it were used, however, what do you think it
would mean?
3.31 What is a normal form?
3.32 Illustrate deletion, modification, and insertion anomalies on the STUDENT_ACTIVITY
relation in Figure 3-24.
Here is the STUDENT_ACTIVITY table:
StudentID
Activity
ActivityFee
AmounttPaid
100
Golf
65.00
65.00
100
Skiing
200.00
0.00
200
Skiing
200.00
0.00
200
Swimming
50.00
50.00
300
Skiing
200.00
100.00
300
Swimming
50.00
50.00
400
Swimming
50.00
50.00
Chapter 3 The Relational Model and Normalization
Page 3-11
StudentID
Activity
ActivityFee
AmounttPaid
200
Skiing
200.00
0.00
An insertion anomaly occurs when we cannot insert needed data for one entity until there is
matching data for a second entity present in the table. In this table, if we want to add the Activity
of Running with a cost of $25.00, we also need associated data for at least one student:
StudentID
Activity
ActivityFee
AmounttPaid
100
Golf
65.00
65.00
100
Skiing
200.00
0.00
200
Skiing
200.00
0.00
200
Swimming
50.00
50.00
300
Skiing
200.00
100.00
400
Golf
65.00
65.00
400
Swimming
50.00
50.00
???
Running
25.00
???
An update anomaly occurs when needed data is inconsistent among rows of the table. In this
table, if we change the Activity name of Swimming to Diving but only update one of the three
rows where this occurs, we will have an update anomaly:
StudentID
Activity
ActivityFee
AmountPaid
100
Golf
65.00
65.00
100
Skiing
200.00
0.00
200
Skiing
200.00
0.00
200
Diving
50.00
50.00
200
Davis
Swimming
50.00
50.00
300
Garrett
Skiing
200.00
100.00
300
Garrett
Swimming
50.00
50.00
Chapter 3 The Relational Model and Normalization
Page 3-12
300
Swimming
50.00
50.00
3.33 Explain why duplicated data leads to data integrity problems.
All occurrences of duplicated data must be maintained if updated, all occurrences must be
3.34 What relations are in 1NF?
3.35 Which normal forms are concerned with functional dependencies?
3.36 What conditions are required for a relation to be in 2NF?
3.37 What conditions are required for a relation to be in 3NF?
3.38 What conditions are required for a relation to be in BCNF?
A relation is in BCNF if (1) it is in 3NF and (2) every determinant is a candidate key.
3.39 If a relation is in BCNF, what can we say about it with regard to 2NF and 3NF?
3.40 What normal form is concerned with multivalued dependencies?
3.41 What is the premise of Fagin’s work on DK/NF?
3.42 Summarize the three categories of normalization theory.
Chapter 3 The Relational Model and Normalization
As shown in Figure 3-12, the three categories of normalization theory are (1) anomalies
3.43 In general, how can you transform a relation not in BCNF into ones that are in BCNF?
There are two methods.
1. Identify every functional dependency.
2. Identify every candidate key.
3. Check to see if all determinants are candidate keys. IF there is a functional dependency
that has a determinant that is not a candidate key, THEN:
i. Move the columns of that functional dependency to a new relation.
NOTE: IF there is more than one such functional dependency, start with the one with the
most columns.
4. Repeat Step 3 until every determinant of every relation is a candidate key.
3.44 What is a referential integrity constraint? Define the term, and give an example of its
use. Are null values allowed in foreign key columns with a referential integrity
constraint? How does the referential integrity constraint contribute to database
integrity?
A referential integrity constraint is a value constraint on a foreign key that states that no value can
3.45 Explain the role of referential integrity constraints in normalization.
Since the referential integrity constraint requires that a value of the foreign key exist as a primary
key value, we will maintain consistency between the two relations. We are protected against
Chapter 3 The Relational Model and Normalization
3.46 Why is an un-normalized relation like a paragraph with multiple themes?
According to the rules of English composition, paragraphs should only have one theme. If you
3.47 In normalization example 3, why is the EXTENDED_PRICE relation silly?
The EXTENDED_PRICE relation is silly because it contains an unnecessary column,
3.48 In normalization example 4, describe the conditions where the functional dependency
(StudentID, Activity)
ActivityFee
3.49 If a determinant is part of a candidate key, is that good enough for BCNF?
No, the determinant must be in exactly the same columns as the candidate key. To wit, “I swear
3.50 In normalization example 5, why are the following two tables not a correct solution to
the problem of normalizing SKU_DATA_3 ?
3.51 How does a multivalued dependency differ from a functional dependency?
Chapter 3 The Relational Model and Normalization
Page 3-15
3.52 Consider the relation:
PERSON (Name, Sibling, ShoeSize)
Assume the following functional dependencies exist:
Describe deletion, modification, and insertion anomalies for this relation.
Using the data in Figure 3-28 and adding Shoe_Size, we get the following relation:
Name
Sibling
ShoeSize
Chau
Eileen
C
Chau
Jonathan
C
Green
Nikki
B
Jones
Frank
E
Jones
Fred
E
Jones
Sally
E
Deletion anomaly: When we delete information about a sibling, we must also delete
information about Shoe_Size. For example, if we delete data about
Greene’s sibling Nikki, we also delete the fact that Greene wears size B
shoes.
3.53 Place the PERSON relation in Review Question 3.52 into 4NF.
To put PERSON into 4NF, put the multivalued dependency into its own relation.
Chapter 3 The Relational Model and Normalization
Page 3-16
3.54 Consider the relation:
PERSON_2 (Name, Sibling, ShoeSize, Hobby)
Assume that the following functional dependencies exist:
Name →→ Sibling
Using the data in Figure 3-28 and adding ShoeSize and Hobby, we get the following relation:
Name
Sibling
ShoeSize
Hobby
Chau
Eileen
C
Scuba diving
Chau
Jonathan
C
Scuba Diving
Green
Nikki
B
Embroidery
Jones
Frank
E
Model boats
Jones
Frank
E
Scuba diving
Jones
Fred
E
Model boats
Jones
Fred
E
Scuba diving
Jones
Sally
E
Model boats
Jones
Sally
E
Scuba diving
Chapter 3 The Relational Model and Normalization
Deletion anomaly: If a Name drops a Hobby, one line must be deleted for each Sibling.
For example, if Jones drops the Hobby of model boats, three lines will
have to be deleted one for Sibling Fred, one for Sibling Sally and one
for Sibling Frank.
3.55 Place the PERSON_2 relation in Review Question 3.54 into 4NF.
To put PERSON2 into 4NF, put the multivalued dependencies into their own relations.
PERSON_4 (Name, Shoe_Size)
3.56 What is 5NF?
5NF, also known as Project-Join Normal Form (PJ/NF) is used when we need to correct for a
3.57 How do the conditions for DK/NF correspond to the conditions for BCNF?
Chapter 3 The Relational Model and Normalization
Page 3-18
ANSWERS TO EXERCISES
3.58 Consider the table:
STAFF_MEETING (EmployeeName, ProjectName, Date)
The rows of this table record the fact that an employee from a particular project attended
a meeting on the given date. Assume that a project meets at most once per day. Also,
assume that only one employee represents a given project, but that employees can be
assigned to multiple projects.
a. State the functional dependencies in STAFF_MEETING.
b. Transform this table into one or more tables in BCNF. State the primary keys,
candidate keys, foreign keys, and referential integrity constraints.
STAFF_MEETING FUNCTIONAL DEPENDENCIES:
(ProjectName, Date) EmployeeName
Chapter 3 The Relational Model and Normalization
Page 3-19
STAFF_MEETING_2 CANDIDATE KEYS:
(ProjectName, Date)
Is every determinant a candidate key?
YES, therefore the relation is in BCNF
Is every determinant a candidate key?
REFERENTIAL INTEGRITY CONSTRAINTS:
ProjectName in STAFF_MEETING_EMPLOYEE must exist in STAFF_MEETING_2.
c. Is your design in part b an improvement over the original table? What advantages
and disadvantages does it have?
Yes, the design in part b is an improvement over the original table. The advantage is that it is
Chapter 3 The Relational Model and Normalization
Page 3-20
3.59 Consider the table:
STUDENT (StudentNumber, StudentName, Dorm, RoomType, DormCost, Club,
ClubCost, Sibling, Nickname)
Assume that students pay different dorm costs, depending on the type of room they
have, but that all members of a club pay the same cost. Assume that students can have
multiple nicknames.
a. State any multivalued dependencies.
We will assume that StudentNumber StudentName where name is not unique (i.e., there
may be more than one “John Smith”, each with a different student number). Then the
multivalued dependencies are:
b. State the functional dependencies.
We will assume that StudentNumber StudentName where name is not unique (i.e., there
may be more than one “John Smith”, each with a different student number). Then the
functional dependencies are:
StudentNumber StudentName