8.0 Introduction to the relational database tool
-
8.0.1 Relational database - introduction
-
8.0.2 Relational database concepts
8.0.1 Relational database - introduction
Section 8.0.2 (Relational database concepts) describes the advantages of using a relational database over a
single-table database and considers the essential points of database design. If you are unfamiliar with
such concepts as primary keys, relations and queries, you should read section
8.0.2 before reading this section and before attempting to implement any database applications. A
little planning in the early stages will greatly benefit the effectiveness of the final database.
If you are going to be designing a new database application, you will need to read all of sections in this
section.
If you are going to supply data to an existing database, you should read
section 8.4.1 - Add a Record.
If you are going to make use of data from an existing database, you should read section 8.6.4 - Display a
Query View.
Section 8.1 describes in detail how to use all of the features of the relational database tool.
A typical relational database view is shown in Figure 8.0.1:
Figure 8.0-1: A typical relational database view
8.0.2 Relational database concepts
-
8.0.2.1 Introduction
-
8.0.2.2 Database design
-
8.0.2.3 Database issues
-
8.0.2.4 Database usage
8.0.2.1 Introduction
This section outlines the basic concepts of relational databases for those users who are unfamiliar with
them. Although it is intended to be read as preparation for using the relational database tool within
Business Collaborator, the concepts are applicable to the design and implementation of relational databases
in any database management system.
If you simply wish to store a simple table of values, you can use the relational database tool to do this.
Some of the features the Business Collaborator Relational Database provides:
-
Record locking - only one person can edit a record at a time
-
Strong type-checking
-
Error handling
-
Range checking - over user-specified ranges
-
Auto numbering of fields
-
Import/export capability
-
Multiple query views with templates and individual access rights
-
Extensible structure
Advantages of relational databases
The relational data model is appropriate for database applications requiring flexibility in the data
structures and access paths of the database. Flexibility in the data structures allows the data to be
stored as groups of logically similar data, with the groups being inter-linked as needed, rather than in a
single, monolithic structure. Flexibility of the access paths permits the database to provide the exact
data which each data consumer requires, in the most appropriate format for them. Relational databases are
suitable both for applications under production control and for those in which there is a substantial need
for ad hoc data manipulation by end users who are not computer professionals.
Relational Databases limit replication of data. By storing all the data pertaining to a particular item
together, and then linking this collection of information to related objects, there is no need to store
data about the original item in several different places. For instance, in a contact database, the
information about each organisation is stored in one place and information about individual contacts within
that organisation are linked to the relevant corporate information. There is therefore no need to store
duplicate data.
By storing the data relating to an object in a single place, there is less likelihood of incorrect or
incomplete data being stored or used. It only needs to be kept up-to-date in one place. If the data
changes, it is only necessary to edit it in one place which saves time for those entering the data and
reduces the likelihood of errors occurring on data entry. Data inconsistencies are thus more easily
avoided.
Users of the data stored in the database do not have to be aware of the underlying structure. This permits
the database designer to optimise the data storage while presenting the users with the data in the format
which they need. For instance, in the contact database example, the individual and corporate information
may be stored separately but a user who needs the address for an individual will be presented with a
combination of the individual's information (name and title, say) along with the corporate data
(company name and postal address).
Relational databases are very flexible. Because they can be used to present information in different ways,
it is easy to add new views of the data as they are required. Inexperienced users can easily obtain the
information they require without having to know anything about database design or implementation. Different
components of the data can be maintained by different individuals so that the burden of keeping the data
up-to-date can be spread over a number of people. Well-designed relational databases can provide
appropriate data storage and retrieval facilities over a long timescale.
The relational database model is noted for its simplicity and expandability. The majority of large database
applications are relational databases.
Data Protection
The data in a relational database is protected so that removing columns or tables cannot invalidate any
relations associated with them. (Business Collaborator's relational database tool does this by
preventing the deletion of any fields or tables involved in relations.)
Similarly, database queries will not permit the user to display data which has been removed or altered or
where the data can no longer be accessed. (Business Collaborator's relational database tool will not
permit a relation to be deleted if a query is using it.)
The Contact database example
In the following sections, an example of a relational database application will be used to illustrate a
number of the concepts. This example concerns a typical "Contact Database". Most organisations
maintain a store of the names and addresses of the individuals and companies with whom they do business.
This may include clients who buy their products or services, suppliers of equipment and service providers.
The detailed contact information for each company or individual in the database will vary, e.g. some
companies may not have a fax number. Also, different parts of the organisation may be in contact with
different departments of the client, e.g. the Accounts department may be in touch with their counterparts
whereas a group of technical consultants may have reason to contact an IT support group. Different users
within the organisation may wish to obtain different combinations of the data, e.g. a secretary may wish to
combine an individual's name and title with a company address to address a letter while a member of the
software support team might need the phone number of all clients who have bought a specific product.
The implementation details for the database, including the exact fields required will be dependent on the
organisations' needs. The example tables below are only an indication of some of the functionality
which might be appropriate.
Table 8.0.1: Corporate Information
|
Column Name
|
Datatype
|
Sample Data
|
|
Company Name
|
String
|
Enviros Software Solutions
|
|
Company Type
|
String
|
Environmental Consultant
|
|
Company Address
|
Paragraph
|
45 Station Rd.
Henley-on-Thames
UK
RG9 1AT
|
|
Company Telephone No.
|
String
|
01491 410 474
|
|
Company Fax No.
|
String
|
01491 576916
|
Table 8.0.2: Individual Information
|
Column Name
|
Datatype
|
Sample Data
|
|
Contact Name
|
String
|
Robert Smith
|
|
Title
|
String
|
Senior Software Engineer
|
|
Company Name
|
String 2
|
Enviros Software Solutions
|
|
Personal Telephone No.
|
String
|
864207
|
|
Christmas Card?
|
Yes/No
|
Yes
|
Table 8.0.3: Company Type1
|
Column Name
|
Datatype
|
Sample Data
|
|
Type of Company
|
String
|
Stationery Supplier
|
1 The "Company Type" table is a look-up table. The data it contains is only stored in
a table so that its values can be selected (looked up) from another field, in this case a field in the
"Company Type" column of the "Corporate Information" table.
2 The Company Name field in the "Individual Contact Information" table would be linked
to the equivalent field in the "Corporate Information" table.
The database's queries might include "Individual Phone Numbers", displaying the fields:
|
Contact Name
|
Title
|
Company Name
|
Personal Phone No.
|
Company Phone No
|
Users could then search for a particular contact name to find a known individual or sort on company name to
find all the phone numbers within a named company.
Glossary of database terms
This glossary defines a number of the most frequently used database terms and gives an example of each item
from the Contact Database example.
Table 8.0.4: Glossary of database terms
|
Term
|
Definition
|
Contact database example
|
|
Column
|
A list of fields in a table
|
"Contact Name" in the "Individual Contact Info" table
|
|
Datatype
|
One of a few pre-defined types of data which have validity ranges associated with them, e.g. the hour
part of a time cannot be greater than 25.
|
Contact names in the "Individual Contact Info" table is a string
|
|
Field
|
A single, atomic item of data in a record
|
A company's name in the company's record in the "Corporate Info" table
|
|
Foreign Key
|
A field linked to a primary key via a relation
|
Company Name in the "Individual Contact Info" table
|
|
Index
|
Used to speed up access to frequently-used database fields
|
Might be appropriate for "Personal Telephone No" in the "Individual Contact Info"
table
|
|
Meta-data
|
Data about the data e.g. the name or datatype of a field.
|
"Company Name" (string)
|
|
Primary Key
|
A data field providing a unique identifier for a record
|
Company Name
|
|
Query
|
A subset of data from the database, grouping information in a suitable form for use by data consumers
|
All sales personnel in all contact companies
|
|
Record
|
A group of fields pertaining to the same entity; stored as a row of a table - data
providers usually provide one record at a time
|
All of an individual's contact information in the "Individual Contact Info" table
|
|
Relation
|
A one-to-many (or one-to-one) association between a field in one table and another
field in a different table. (Related fields should have the same datatype.)
|
Link between the company name in the "Corporate Info" table and the company name in the
"Individual Contact Info" table
|
|
Table
|
A collection of records of the same type; its rows are the records
|
"Corporate Info"
|
|
View
|
A data display for consumers of the data and is composed of data fields from one or more tables.
|
Sales personnel's names, company names and telephone numbers (for all sales personnel in all
contact companies)
|
Database design is usually accomplished in two phases. In the first, conceptual, phase, the data to
be stored is identified and the logical design is outlined.
In the second, physical, design phase, the database designer decides how to implement the conceptual
design using the database system at his disposal.
Key stages in constructing a robust and effective database design are:
Collecting Data
The database designer must collect as much realistic data as possible to appreciate the information which
will be entered into, stored within and retrieved from the database. It is important at this stage to
understand what constitutes valid data and what steps should be taken if invalid data is entered or
detected.
Understanding the application
The database designer must understand the application so that it is clear how the data will be maintained
and how it will be used. Information to be ascertained at this stage of the design process must be the
rules for creating and accessing data - how many users will perform each of these tasks, how much expertise
they will have in using the system and how regularly the data will be updated or viewed. It is also
important to understand the circumstances in which the data may be deleted and whether only specific users
will be permitted to do this.
Organising the data to form an initial conceptual model
At this stage, the database designer begins to identify the significant groups of data which are logically
related. These logically distinct groups will correspond to database tables. Having recognised the
differences between these sets of data, it is also necessary to recognise how the data sets will be linked.
Evaluating the conceptual model
The initial conceptual design should be used as a starting point for constructing "use cases" for
each aspect of the database. Such "use cases" should cover different roles of user -
administrator, manager, data provider, data consumer - as well as different scenarios of database use. By
considering these scenarios in the context of the initial database design, it will become clear where
shortcomings of the initial model lie.
Sequential Database Design
As a rule of thumb. the order in which constituent parts of the database should be constructed is as
follows:
-
Identify tables
-
Insert columns into tables
-
Eliminate repeating groups 1
-
Identify a primary key for each table
-
Identify relationships between the tables
-
Identify foreign keys
1Repeating groups are groups of fields which recur in two or more tables. They indicate that the
data has not been divided to a sufficiently atomic level and that this group should form the core of a new
table with links to the tables requiring access to this information. Eliminating repeating groups in the
database design will eliminate later problems in updating and accessing data.
This sequence of design stages should be iterated around until the design has been finalised.
Good Database Design - Hints
-
Mapping from a file-based process directly to a relational database is not good practice; the database
should always be designed
-
Several narrow fields are better than a single wide one
-
Use character fields for numeric values such as phone numbers
-
Do not store derived data in a record - only store raw, unprocessed data
Primary Keys
Primary keys are very important because they are used to help build associations between tables. In fact,
the relational database tool will not allow you to set up a relation between two fields unless one of them
is a primary key. Primary keys must be
-
Unique (- this is ensured by the relational database tool)
-
Meaningful
-
Easy to use
-
Short (for performance reasons)
In some database applications, it may be necessary to introduce a field purely to act as the primary key.
If two candidate primary keys exist, the one which is least likely to change or has fewer characters should
be chosen.
Identifying the primary key for the records in a table will help to evaluate whether the chosen database
model is appropriate. An indication that the primary key has been chosen correctly is that the table's
fields satisfy the:
Key Dependency Rule - All the database fields depend solely on the primary key and on no other
fields.
The intuitive database design concepts described above can be formalised into a number of formal rules
about the organisation of data within a database.
Normalisation
Normalisation describes the methodical organisation of data within a database to prevent problems when the
database is in use. Without normalisation, data anomalies can occur when data is added to, edited in or
removed from the database.
There are a number of levels of normalisation to which a database can conform:
First normal form
-
The database contains no repeating groups
Second normal form
-
The database is in first normal form and
-
Every field in a table is fully functionally dependent on the table's primary key
It is good database practice to design every database to conform to third normal form.
Third normal form
-
The database is in second normal form and
-
All fields which do not constitute the primary key are mutually independent
One-to-Many Relationships
Relational databases facilitate the creation of one-to-many relationships between tables of data. In a
one-to-many relation, one data item in a table will be related to a number of data items in another
table. Relations relate the primary key of one table (the one) to a foreign key in another table (the
many).
Typical examples of one-to-many relationships are listed below; these can be read as:
"A [Single item] [Verb] many [Many items].":
|
Single item
|
Verb
|
Many items
|
|
Company
|
Employs
|
Employees
|
|
School Year
|
Contains
|
Pupils
|
|
Garage
|
Owns
|
Cars
|
This can be illustrated as shown in Figure 8.0.2:

Figure 8.0.2: A one-to-many relationship
Many-to-Many Relationships
There are, however, many instances where the relationship between data items will not be a simple
one-to-many relation. In a many-to-many relation, several data items in one table will be related to
a number of data items in another table. The following are some examples of many-to-many relations; these
can be read left to right as: "[Many items] are [Left Verb] many [Many items]." and right to left
as "Many [Many items] [Right Verb] many [Many items]."
|
Many items
|
Verb
|
Many items
|
|
Projects
|
Worked on by/Works on
|
Employees
|
|
Subjects
|
Studied by/Study
|
Pupils
|
|
Cars
|
Serviced by/Service
|
Garages
|
Many-to-many relationships may be illustrated as shown in Figure 8.0.3:

Figure 8.0.3: A many-to-many relationship
In order to implement such many-to-many relationships in a relational database, a link table must be
introduced to divide the many-to-many relationship into two one-to-many relationships. There is no data
replication but foreign keys recur in the link table.
Use of a link table is illustrated in Figure 8.0.4:

Figure 8.0.4: Using a link table
In the relational database tool, relations are only permitted between pairs of foreign and primary keys. A
primary key must be specified before it can be used in a relation. Once a relation between two fields has
been established, adding records to the foreign key becomes very straightforward - the user simply has to
select from a list box containing the values of the primary key.
8.0.2.4 Database Usage
At the database design stage, one of the key considerations is to determine how the database will be used
and who will use it. Sections 8.0.2.2 (Database Design) and 8.0.2.3 (Database Issues) considered the
database more from the perspective of the designer than the user. In this section, the emphasis is more on
how databases may be used rather than on how they are implemented.
Remember that a brief set of guidelines describing the design and implementation of any database
application is a very valuable asset, not only for the database designer and anyone who extends the
database but also for anyone using the database in future.
Who does what?
One of the relevant pieces of information about any database is which roles the users of the database will
fulfil and what information they will provide and consume.
For instance, the corporate information stored in a contact database may be maintained solely by the
managing director's secretary. In this case, it is important that only she has sufficient permission to
edit the corporate information (add new records, update existing ones or delete obsolete ones). However,
some of the corporate data such as the address of the company's headquarters should be visible to all
users who may need to write to an individual at that company. This is clearly an occasion where the data
provider and data consumers are different people.
The kinds of questions to consider in this context include:
-
Who is going to implement the database - will they continue their involvement with the database to
maintain it or should they document their design decisions and rationale so that the database can be
maintained by someone else?
-
Is the database administrator the only person who can add new tables or relations, say?
-
Who is going to add information to the database - are the information providers for each of the sets of
information the same people?
-
Are the information providers familiar with how to use the database - do they need training or any
special guidelines?
-
If the data providers are inexperienced, what actions should they be prevented from performing in order
to protect the database?
-
If the data providers are not permitted to delete data, for instance, who will ultimately be responsible
for deleting irrelevant or obsolete data?
-
Should all of the data by generally available or should some data sets be restricted to specific
individuals?
Answering these questions will go a long way towards prescribing a basic set of default access rights to be
assigned to the database, its tables and queries.
Who sees what?
As well as determining who the database users are, it is also vital to understand how users will wish to be
presented with the data. The exact format which best suits their needs should be established. Two users may
wish to search within the same subset of the data for slightly different information. For instance, one may
wish to obtain all the fax numbers for customers who have bought productX while another wishes to obtain
all the phone numbers for this same group of customers. It is at the database designer's discretion
whether this information is presented together in a single view or separated into two distinct views. The
advantage of separating the views is that access to each view can be assigned exactly to the group of users
who will require the information in this format. The disadvantage is obviously that any users who need both
components of this information will require to consult two views of the data.
In addition, different groups of users may wish to view identical subsets of the data but presented in
different forms - for instance with the fields re-ordered or sorted in a different way. Such decisions can
only be made by someone with a detailed understanding of the application and its users.
Where does the data come from?
Building up a database application does not always involve entering large volumes of data from scratch. The
raw data is invariably already available in some other format, perhaps in plain text, a word-processed
document or in a spreadsheet. Through using the data in this form, with all the problems of accessibility,
data accuracy and consistency which this method entails, it often becomes clear to users that a database
application is required. Therefore, it is often possible to partially fill some or all of the database
tables with data immediately the database is constructed. It is always worth investigating ways in which
such existing data can be transferred from its original location into the new database application as this
will reduce the volume of data to be entered into the database manually and thus minimise the potential for
data entry errors.
However, some modification of data imported in this way will usually be necessary. For example, it is
usually desirable to specify the types of data which will be stored in each field and thus ensure that
type-checking is performed. The default field type will probably differ from the one which is ultimately
required and so the field types will need to be edited.
Data will also be entered directly by users and as much guidance should be given to them as possible about
what data is required and in what format. If certain aspects of a record are crucial, e.g. the company name
in the Company Info table of the contact database example, then these components of the record should be
made "required" so that users are forced to enter at least this component.
Applying Relational Database Concepts
Now that you have a basic grasp of the vocabulary of relational databases, why not try applying some of the
ideas presented in this section? Information on how to use the relational database package is provided in
section 8.1.
|