– Part III
7
Chapter 32 Data Warehousing Design
Review Questions
32.1 Discuss the activities associated with initiating an enterprise data warehouse (EDW) project.
To begin a data warehouse project, we need answers for questions such as: which user
requirements are most important and which data should be considered first? Also, should the
project be scaled down into something more manageable, yet at the same time provide an
infrastructure capable of ultimately delivering a full-scale enterprise-wide data warehouse?
32.2 Compare and contrast the approaches taken in the development of an EDW by In
Corporate Information
complete, it is used to implement an EDW. The EDW is then used to feed departmental
databases (data marts), which exist to meet the particular information requirements of each
department. The EDW can also provide data to other specialized decision support applications
32.3 Discuss the main prin
Lifecycle.
32.4 Discuss the concepts associated with dimensionality modeling.
– Part III
8
Every dimensional model (DM) is composed of one table with a composite primary key,
32.5 Describe how star, snowflake, and starflake schemas differ.
Star schema is a logical structure that has a fact table containing factual data in the center,
32.6
Lifecycle.
32.7 Identify the steps used in Phase I of
Lifecycle.
32.8 Identify the particular issues associated with the development of an enterprise data
warehouse.
32.9 Describe how the Oracle Warehouse Builder supports the design of a data warehouse.
Described in Section 32.5.2.
Exercises
Please note that all of the exercises listed refer to the DM shown in Figure 32.2.
32.10 Identify three types of analysis that the DM can support about property sales.
– Part III
What is the maximum offerPrice in different regions of the United Kingdom?
32.11 Identify three types of analysis that the DM cannot support about property sales.
32.12 Discuss how you would change the DM to support the queries identified in Exercise 32.11.
In order to modify the DM to support the above queries the values being queried would need
32.13 What is the granularity of the fact table shown in the property sales DM?
32.14 What is the purpose of the fact table and dimension tables shown in the property sales DM?
occur for
32.15 Identify an example of a derived attribute in the fact table and describe how it is calculated.
Are there any others that you could suggest?
32.16 Identify two examples of natural and surrogate keys in the property sales DM and discuss the
benefits associated with using surrogate keys in general.
32.17 Identify two possible examples of SCD in the property sales DM and discuss the types of
change (Type 1 or Type 2) that each represents.
– Part III
10
32.18 Identify the dimensions that make the property sales DM a star schema, rather than a
snowflake schema.
32.19 Select one dimension from the DM to demonstrate how you would change the DM into a
snowflake schema.
An example is shown in Figure 32.3.
32.20 Examine the bus matrix for a university shown in Figure 32.12. The university is organized as
schools such as the School of Computing, School of Business Studies and each school has a
portfolio of programs and modules. Students apply to university to join a program but only
some of those applications are successful. Successful applicants enrol on university programs,
which are made up of six modules per year of study. Student attendance at module classes is
monitored as well as student results for each module assessment.
(a) Describe what matrix shown in Figure 32.12 represents.
Following the establishment of analytical themes, the business processes that are associated
(b) Using the information shown in bus matrix of Figure 32.12, create a first-draft, high-level
dimensional model to represent the fact tables and dimensions tables that will form the data
warehouse for the university.
The student should present a rough design that shows each business process (listed down first
(c) Using the information in Figure 32.12 produce a dimensional model as a star schema for
the student module results business process. Based on your (assumed) knowledge of this
business process as a current or past student, add a maximum of five (possible) attributes to
each dimension table in your schema. Complete your star schema by adding a maximum of 10
(possible) attributes to your fact table. Describe how your choice of attributes can support the
analysis of student result.
11
dimDate(date (PK), dayOfWeek, week, month, trimester, season, year)
dimStudent(studentID (PK), studentNo, studentName, studentHomeAddress, studentDOB,
32.21 Examine the dimensional model (star schema) shown in Figure 32.13. This model describes
part of a database that will provide decision support for a taxi company called FastCabs. This
company provides a taxi service to clients who can book a taxi either by phone a local office
s taxi jobs to gain a better understanding of how to resource the company in the coming
years.
(a) Provide examples of the types of analysis that can be undertaken, using the star schema in
Figure 32.13.
(b) Provide examples of the types of analysis that cannot be undertaken, using the star schema
in Figure 32.13.
Examples of answers include:
– Part III
12
(c) Describe the changes that would be necessary to the star schema shown in Figure 32.13 to
support the following analysis. (At the same time consider the possible changes that would be
necessary to the transaction system providing the data.)
The age of the client at the time of booking would have to be calculated and held in the fact
table.
The date of booking would have to be captured and held in the fact table.
(d) Identify examples of natural and surrogate keys in the star schema shown in Figure 32.13
and describe the benefits associated with using surrogate keys in general.
– Part III
13
(e) Using examples taken from the dimensional model of Figure 32.13, describe why the model
(f) Consider the dimensions of Figure 32.13 and identify examples that may suffer from the
slowly changing dimension problem. Describe for each example, whether type I, II, or III
would be the most useful approach for dealing with the change.