<<<<< Section >>>>> Level Up <<<<< Page >>>>> Contents
 


8.0 Introduction to the relational database tool

8.0.1 Relational database - introduction
8.0.2 Relational database concepts


8.0.1 Relational database - introduction

Section 8.0.2 (Relational database concepts) describes the advantages of using a relational database over a single-table database and considers the essential points of database design. If you are unfamiliar with such concepts as primary keys, relations and queries, you should read section 8.0.2 before reading this section and before attempting to implement any database applications. A little planning in the early stages will greatly benefit the effectiveness of the final database.

If you are going to be designing a new database application, you will need to read all of sections in this section.

If you are going to supply data to an existing database, you should read section 8.4.1 - Add a Record.

If you are going to make use of data from an existing database, you should read section 8.6.4 - Display a Query View.

Section 8.1 describes in detail how to use all of the features of the relational database tool.

A typical relational database view is shown in Figure 8.0.1:


Figure 8.0-1: A typical relational database view



8.0.2 Relational database concepts

8.0.2.1 Introduction
8.0.2.2 Database design
8.0.2.3 Database issues
8.0.2.4 Database usage


8.0.2.1 Introduction

This section outlines the basic concepts of relational databases for those users who are unfamiliar with them. Although it is intended to be read as preparation for using the relational database tool within Business Collaborator, the concepts are applicable to the design and implementation of relational databases in any database management system.

If you simply wish to store a simple table of values, you can use the relational database tool to do this. Some of the features the Business Collaborator Relational Database provides:

  • Record locking - only one person can edit a record at a time
  • Strong type-checking
  • Error handling
  • Range checking - over user-specified ranges
  • Auto numbering of fields
  • Import/export capability
  • Multiple query views with templates and individual access rights
  • Extensible structure

Advantages of relational databases

The relational data model is appropriate for database applications requiring flexibility in the data structures and access paths of the database. Flexibility in the data structures allows the data to be stored as groups of logically similar data, with the groups being inter-linked as needed, rather than in a single, monolithic structure. Flexibility of the access paths permits the database to provide the exact data which each data consumer requires, in the most appropriate format for them. Relational databases are suitable both for applications under production control and for those in which there is a substantial need for ad hoc data manipulation by end users who are not computer professionals.

Relational Databases limit replication of data. By storing all the data pertaining to a particular item together, and then linking this collection of information to related objects, there is no need to store data about the original item in several different places. For instance, in a contact database, the information about each organisation is stored in one place and information about individual contacts within that organisation are linked to the relevant corporate information. There is therefore no need to store duplicate data.

By storing the data relating to an object in a single place, there is less likelihood of incorrect or incomplete data being stored or used. It only needs to be kept up-to-date in one place. If the data changes, it is only necessary to edit it in one place which saves time for those entering the data and reduces the likelihood of errors occurring on data entry. Data inconsistencies are thus more easily avoided.

Users of the data stored in the database do not have to be aware of the underlying structure. This permits the database designer to optimise the data storage while presenting the users with the data in the format which they need. For instance, in the contact database example, the individual and corporate information may be stored separately but a user who needs the address for an individual will be presented with a combination of the individual's information (name and title, say) along with the corporate data (company name and postal address).

Relational databases are very flexible. Because they can be used to present information in different ways, it is easy to add new views of the data as they are required. Inexperienced users can easily obtain the information they require without having to know anything about database design or implementation. Different components of the data can be maintained by different individuals so that the burden of keeping the data up-to-date can be spread over a number of people. Well-designed relational databases can provide appropriate data storage and retrieval facilities over a long timescale.

The relational database model is noted for its simplicity and expandability. The majority of large database applications are relational databases.

Data Protection

The data in a relational database is protected so that removing columns or tables cannot invalidate any relations associated with them. (Business Collaborator's relational database tool does this by preventing the deletion of any fields or tables involved in relations.)

Similarly, database queries will not permit the user to display data which has been removed or altered or where the data can no longer be accessed. (Business Collaborator's relational database tool will not permit a relation to be deleted if a query is using it.)

8.0.2.2 Database design

The Contact database example

In the following sections, an example of a relational database application will be used to illustrate a number of the concepts. This example concerns a typical "Contact Database". Most organisations maintain a store of the names and addresses of the individuals and companies with whom they do business. This may include clients who buy their products or services, suppliers of equipment and service providers. The detailed contact information for each company or individual in the database will vary, e.g. some companies may not have a fax number. Also, different parts of the organisation may be in contact with different departments of the client, e.g. the Accounts department may be in touch with their counterparts whereas a group of technical consultants may have reason to contact an IT support group. Different users within the organisation may wish to obtain different combinations of the data, e.g. a secretary may wish to combine an individual's name and title with a company address to address a letter while a member of the software support team might need the phone number of all clients who have bought a specific product.

The implementation details for the database, including the exact fields required will be dependent on the organisations' needs. The example tables below are only an indication of some of the functionality which might be appropriate.

Table 8.0.1: Corporate Information

Column Name

Datatype

Sample Data

Company Name

String

Enviros Software Solutions

Company Type

String

Environmental Consultant

Company Address

Paragraph

45 Station Rd.
Henley-on-Thames
UK
RG9 1AT

Company Telephone No.

String

01491 410 474

Company Fax No.

String

01491 576916


Table 8.0.2: Individual Information

Column Name

Datatype

Sample Data

Contact Name

String

Robert Smith

Title

String

Senior Software Engineer

Company Name

String 2

Enviros Software Solutions

Personal Telephone No.

String

864207

Christmas Card?

Yes/No

Yes


Table 8.0.3: Company Type1

Column Name

Datatype

Sample Data

Type of Company

String

Stationery Supplier

1 The "Company Type" table is a look-up table. The data it contains is only stored in a table so that its values can be selected (looked up) from another field, in this case a field in the "Company Type" column of the "Corporate Information" table.

2 The Company Name field in the "Individual Contact Information" table would be linked to the equivalent field in the "Corporate Information" table.

The database's queries might include "Individual Phone Numbers", displaying the fields:

Contact Name

Title

Company Name

Personal Phone No.

Company Phone No

Users could then search for a particular contact name to find a known individual or sort on company name to find all the phone numbers within a named company.

Glossary of database terms

This glossary defines a number of the most frequently used database terms and gives an example of each item from the Contact Database example.

Table 8.0.4: Glossary of database terms

Term

Definition

Contact database example

Column

A list of fields in a table

"Contact Name" in the "Individual Contact Info" table

Datatype

One of a few pre-defined types of data which have validity ranges associated with them, e.g. the hour part of a time cannot be greater than 25.

Contact names in the "Individual Contact Info" table is a string

Field

A single, atomic item of data in a record

A company's name in the company's record in the "Corporate Info" table

Foreign Key

A field linked to a primary key via a relation

Company Name in the "Individual Contact Info" table

Index

Used to speed up access to frequently-used database fields

Might be appropriate for "Personal Telephone No" in the "Individual Contact Info" table

Meta-data

Data about the data e.g. the name or datatype of a field.

"Company Name" (string)

Primary Key

A data field providing a unique identifier for a record

Company Name

Query

A subset of data from the database, grouping information in a suitable form for use by data consumers

All sales personnel in all contact companies

Record

A group of fields pertaining to the same entity; stored as a row of a table - data providers usually provide one record at a time

All of an individual's contact information in the "Individual Contact Info" table

Relation

A one-to-many (or one-to-one) association between a field in one table and another field in a different table. (Related fields should have the same datatype.)

Link between the company name in the "Corporate Info" table and the company name in the "Individual Contact Info" table

Table

A collection of records of the same type; its rows are the records

"Corporate Info"

View

A data display for consumers of the data and is composed of data fields from one or more tables.

Sales personnel's names, company names and telephone numbers (for all sales personnel in all contact companies)

Database design is usually accomplished in two phases. In the first, conceptual, phase, the data to be stored is identified and the logical design is outlined.

In the second, physical, design phase, the database designer decides how to implement the conceptual design using the database system at his disposal.

Key stages in constructing a robust and effective database design are:

Collecting Data

The database designer must collect as much realistic data as possible to appreciate the information which will be entered into, stored within and retrieved from the database. It is important at this stage to understand what constitutes valid data and what steps should be taken if invalid data is entered or detected.

Understanding the application

The database designer must understand the application so that it is clear how the data will be maintained and how it will be used. Information to be ascertained at this stage of the design process must be the rules for creating and accessing data - how many users will perform each of these tasks, how much expertise they will have in using the system and how regularly the data will be updated or viewed. It is also important to understand the circumstances in which the data may be deleted and whether only specific users will be permitted to do this.

Organising the data to form an initial conceptual model

At this stage, the database designer begins to identify the significant groups of data which are logically related. These logically distinct groups will correspond to database tables. Having recognised the differences between these sets of data, it is also necessary to recognise how the data sets will be linked.

Evaluating the conceptual model

The initial conceptual design should be used as a starting point for constructing "use cases" for each aspect of the database. Such "use cases" should cover different roles of user - administrator, manager, data provider, data consumer - as well as different scenarios of database use. By considering these scenarios in the context of the initial database design, it will become clear where shortcomings of the initial model lie.

Sequential Database Design

As a rule of thumb. the order in which constituent parts of the database should be constructed is as follows:

  • Identify tables
  • Insert columns into tables
  • Eliminate repeating groups 1
  • Identify a primary key for each table
  • Identify relationships between the tables
  • Identify foreign keys

1Repeating groups are groups of fields which recur in two or more tables. They indicate that the data has not been divided to a sufficiently atomic level and that this group should form the core of a new table with links to the tables requiring access to this information. Eliminating repeating groups in the database design will eliminate later problems in updating and accessing data.

This sequence of design stages should be iterated around until the design has been finalised.

Good Database Design - Hints
  • Mapping from a file-based process directly to a relational database is not good practice; the database should always be designed
  • Several narrow fields are better than a single wide one
  • Use character fields for numeric values such as phone numbers
  • Do not store derived data in a record - only store raw, unprocessed data

8.0.2.3 Database Issues

Primary Keys

Primary keys are very important because they are used to help build associations between tables. In fact, the relational database tool will not allow you to set up a relation between two fields unless one of them is a primary key. Primary keys must be

  • Unique (- this is ensured by the relational database tool)
  • Meaningful
  • Easy to use
  • Short (for performance reasons)

In some database applications, it may be necessary to introduce a field purely to act as the primary key.

If two candidate primary keys exist, the one which is least likely to change or has fewer characters should be chosen.

Identifying the primary key for the records in a table will help to evaluate whether the chosen database model is appropriate. An indication that the primary key has been chosen correctly is that the table's fields satisfy the:

Key Dependency Rule - All the database fields depend solely on the primary key and on no other fields.

The intuitive database design concepts described above can be formalised into a number of formal rules about the organisation of data within a database.

Normalisation

Normalisation describes the methodical organisation of data within a database to prevent problems when the database is in use. Without normalisation, data anomalies can occur when data is added to, edited in or removed from the database.

There are a number of levels of normalisation to which a database can conform:

First normal form
  • The database contains no repeating groups

Second normal form

  • The database is in first normal form and
  • Every field in a table is fully functionally dependent on the table's primary key

It is good database practice to design every database to conform to third normal form.

Third normal form

  • The database is in second normal form and
  • All fields which do not constitute the primary key are mutually independent

One-to-Many Relationships

Relational databases facilitate the creation of one-to-many relationships between tables of data. In a one-to-many relation, one data item in a table will be related to a number of data items in another table. Relations relate the primary key of one table (the one) to a foreign key in another table (the many).

Typical examples of one-to-many relationships are listed below; these can be read as:
"A [Single item] [Verb] many [Many items].":

Single item

Verb

Many items

Company

Employs

Employees

School Year

Contains

Pupils

Garage

Owns

Cars

This can be illustrated as shown in Figure 8.0.2:




Figure 8.0.2: A one-to-many relationship



Many-to-Many Relationships

There are, however, many instances where the relationship between data items will not be a simple one-to-many relation. In a many-to-many relation, several data items in one table will be related to a number of data items in another table. The following are some examples of many-to-many relations; these can be read left to right as: "[Many items] are [Left Verb] many [Many items]." and right to left as "Many [Many items] [Right Verb] many [Many items]."

Many items

Verb

Many items

Projects

Worked on by/Works on

Employees

Subjects

Studied by/Study

Pupils

Cars

Serviced by/Service

Garages

Many-to-many relationships may be illustrated as shown in Figure 8.0.3:




Figure 8.0.3: A many-to-many relationship


In order to implement such many-to-many relationships in a relational database, a link table must be introduced to divide the many-to-many relationship into two one-to-many relationships. There is no data replication but foreign keys recur in the link table.

Use of a link table is illustrated in Figure 8.0.4:




Figure 8.0.4: Using a link table


In the relational database tool, relations are only permitted between pairs of foreign and primary keys. A primary key must be specified before it can be used in a relation. Once a relation between two fields has been established, adding records to the foreign key becomes very straightforward - the user simply has to select from a list box containing the values of the primary key.


8.0.2.4 Database Usage

At the database design stage, one of the key considerations is to determine how the database will be used and who will use it. Sections 8.0.2.2 (Database Design) and 8.0.2.3 (Database Issues) considered the database more from the perspective of the designer than the user. In this section, the emphasis is more on how databases may be used rather than on how they are implemented.

Remember that a brief set of guidelines describing the design and implementation of any database application is a very valuable asset, not only for the database designer and anyone who extends the database but also for anyone using the database in future.

Who does what?

One of the relevant pieces of information about any database is which roles the users of the database will fulfil and what information they will provide and consume.

For instance, the corporate information stored in a contact database may be maintained solely by the managing director's secretary. In this case, it is important that only she has sufficient permission to edit the corporate information (add new records, update existing ones or delete obsolete ones). However, some of the corporate data such as the address of the company's headquarters should be visible to all users who may need to write to an individual at that company. This is clearly an occasion where the data provider and data consumers are different people.

The kinds of questions to consider in this context include:

  • Who is going to implement the database - will they continue their involvement with the database to maintain it or should they document their design decisions and rationale so that the database can be maintained by someone else?
  • Is the database administrator the only person who can add new tables or relations, say?
  • Who is going to add information to the database - are the information providers for each of the sets of information the same people?
  • Are the information providers familiar with how to use the database - do they need training or any special guidelines?
  • If the data providers are inexperienced, what actions should they be prevented from performing in order to protect the database?
  • If the data providers are not permitted to delete data, for instance, who will ultimately be responsible for deleting irrelevant or obsolete data?
  • Should all of the data by generally available or should some data sets be restricted to specific individuals?

Answering these questions will go a long way towards prescribing a basic set of default access rights to be assigned to the database, its tables and queries.

Who sees what?

As well as determining who the database users are, it is also vital to understand how users will wish to be presented with the data. The exact format which best suits their needs should be established. Two users may wish to search within the same subset of the data for slightly different information. For instance, one may wish to obtain all the fax numbers for customers who have bought productX while another wishes to obtain all the phone numbers for this same group of customers. It is at the database designer's discretion whether this information is presented together in a single view or separated into two distinct views. The advantage of separating the views is that access to each view can be assigned exactly to the group of users who will require the information in this format. The disadvantage is obviously that any users who need both components of this information will require to consult two views of the data.

In addition, different groups of users may wish to view identical subsets of the data but presented in different forms - for instance with the fields re-ordered or sorted in a different way. Such decisions can only be made by someone with a detailed understanding of the application and its users.

Where does the data come from?

Building up a database application does not always involve entering large volumes of data from scratch. The raw data is invariably already available in some other format, perhaps in plain text, a word-processed document or in a spreadsheet. Through using the data in this form, with all the problems of accessibility, data accuracy and consistency which this method entails, it often becomes clear to users that a database application is required. Therefore, it is often possible to partially fill some or all of the database tables with data immediately the database is constructed. It is always worth investigating ways in which such existing data can be transferred from its original location into the new database application as this will reduce the volume of data to be entered into the database manually and thus minimise the potential for data entry errors.

However, some modification of data imported in this way will usually be necessary. For example, it is usually desirable to specify the types of data which will be stored in each field and thus ensure that type-checking is performed. The default field type will probably differ from the one which is ultimately required and so the field types will need to be edited.

Data will also be entered directly by users and as much guidance should be given to them as possible about what data is required and in what format. If certain aspects of a record are crucial, e.g. the company name in the Company Info table of the contact database example, then these components of the record should be made "required" so that users are forced to enter at least this component.

Applying Relational Database Concepts

Now that you have a basic grasp of the vocabulary of relational databases, why not try applying some of the ideas presented in this section? Information on how to use the relational database package is provided in section 8.1.

<<<<< Section >>>>> Level Up <<<<< Page >>>>> Contents
 Business Collaborator 4.9  © 1997-2003 Business Collaborator Ltd