Basic Computer Notions Databases

Collections of data

Data may be collected, manipulated and retrieved in various ways:

Reasons for a DBMS

A DBMS is a software package for defining and managing a database.

A ‘real’ database includes

A DBMS provides Data independence.

One must be able to change storage mechanisms and formats without having to modify all application programmes. For example:

Database organization

Four main types of database organization:

Flat databases

Example of flat data organization

A single kind of record with a fixed number of fields.

Notice the repetition of data, and thus an increased chance of errors.

Hierarchical databases

Hierarchical relationships among different types of data.


Sample hierarchical database

Hierarchical example

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

First we scan through the Physician records looking for Jones.
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

Now we've found Jones ...
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

... so we scan through the relevant Mother records looking for mothers with heights > 170 cm.
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

This mother's height is > 170 cm ...
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

... so we scan through her babies looking at the weights.
Sample hierarchical database

Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?

Now we can calculate the average weight for these two babies.

We've read 6 records in all.
Sample hierarchical database

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

To get at the weights in all the baby records, we have to first read each physician record ...
Sample hierarchical database

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

... and, for each physician, read each mother record ...
Sample hierarchical database

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

... and finally, for each mother, read each baby record.
Sample hierarchical database

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

Another baby record ...
Sample hierarchical database

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

We've had to read 17 records in all, including all the mother records (which we weren't really interested in).

Relational databases

Data are organized as logically independent tables. Features:


Operations on tables

Example relational tables

Relational examples

Question #1: What is average weight of babies delivered by Jones to mothers with heights > 170 cm?

First we Join the Mother and Physician tables.
M# MHt MBirth Ph# PhName
123456 150 4605 123 Smith
392382 177 4912 220 Jones
238427 162 5204 123 Smith
649308 174 5409 123 Smith
732293 155 5810 220 Jones

Next we Project the columns that we're interested in.
PhName MHt M#
Smith 150 123456
Jones 177 392382
Smith 162 238427
Smith 174 649308
Jones 155 732293

Next we Select the records with PhName = Jones and MHt > 170.

One record satisfies the selection criteria.
PhName MHt M#
Jones 177 392382

Next we Join this with the Baby table.

The join with the baby table results in 2 records.
PhName MHt M# DelivMWtBWt
Jones 177 392382 7512 57.4 3.6
Jones 177 392382 7706 58.2 3.4

From these we can calculate the average weight.

We have carried out 4 relational operations (Join, Project, Select and Join).
Example relational tables

Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?

First we Select babies with BWt > 2.5
B# M# Deliv MWt BWt
941830392382 7512 57.4 3.6
743920392382 7706 58.2 3.4
840238238427 7801 64.1 2.7
230484123456 7803 60.3 3.4
749200649308 7902 58.7 2.9
530284123456 8005 62.3 3.7
638201649308 8101 57.9 3.1

Next we Project the columns we're interested in.
B# BWt M#
9418303.6 392382
7439203.4 392382
8402382.7 238427
2304843.4 123456
7492002.9 649308
5302843.7 123456
6382013.1 649308

Next we Join this with the Mother table.
B# BWt M# MHt MBirth Ph#
941830 3.6 392382177 4912 220
743920 3.4 392382177 4912 220
840238 2.7 238427162 5204 123
230484 3.4 123456150 4605 123
749200 2.9 649308174 5409 123
530284 3.7 123456150 4605 123
638201 3.1 649308174 5409 123

Now we can count the babies for each physician.

We've done 3 relational operations.

Relational implementation

Relational operations are computationally expensive.

A relational front end may be built on top of a hierarchical back end.

Relational software

Object-oriented databases

Object-oriented analysis is another way to model the world, involving abstraction, encapsulation, modularity and hierarchy (with inheritance).

An object consists of data and methods.

Classes are used to group objects which have the same types of data and the same methods.

Abstraction

Consider only features which are necessary for the problem at hand.

For example, a person may be defined by the ability to ask for the ID number, age, height and weight.

Encapsulation

The internal structure of an object is hidden.

For example, all we know is that we can ask for the age, not whether the age is (1) periodically updated or (2) calculated on demand from birth date and current date.

Modularity

Grouping classes and objects into ‘cohesive and loosely coupled modules’.

For example, the classes person and cat might be implemented within the same module because they share many things (like how to calculate age).

Hierarchy

Objects are instances of classes, e.g., JDoe is an instance of the class patient

Classes and objects form two different hierarchies.

Examples of class hierarchies

Person IDNum
Physician IDNum, Name
Patient IDNum
Mother IDNum, DateBirth, Height
Baby IDNum, WeightB
Event DateEvent, Physician, Patient
Ultrasound DateEvent, Physician, Patient, CRL, BPD, GestUS
Delivery DateEvent, Physician, Mother, Baby, WeightM
Numeric_value (range)
Height (range,arith_ops)
Weight (range,arith_ops)
WeightM (range, arith_ops)
WeightB (range, arith_ops)
Date (range date_ops)
DateBirth (range, date_ops)
DateEvent (range, date_ops)
ID_num (range)
String_value (char_coding, sort_ops)
Name (char_coding, sort_ops, name_ops)

Example of object hierarchy

Examples of objects

Persons

physician(1) 123 smith
physician(2) 220 jones
mother(1) 123456 194605 150
mother(2) 649308 195409 174
mother(3) 238427 195204 162
mother(4) 732293 195810 155
mother(5) 392382 194912 177
baby(1) 481148 2.5
baby(2) 941830 3.6
baby(3) 743920 3.4
baby(4) 840238 2.7
baby(5) 230484 3.4
baby(6) 749200 2.9
baby(7) 538403 2.2
baby(8) 530284 3.7
baby(9) 638201 3.1
baby(10) 539208 1.4

Events

delivery(1) 197511 physician(1) mother(3) baby(1) 61.8
delivery(2) 197512 physician(2) mother(5) baby(2) 57.4
delivery(3) 197706 physician(2) mother(5) baby(3) 58.2
delivery(4) 197801 physician(1) mother(3) baby(4) 64.1
delivery(5) 197803 physician(1) mother(1) baby(5) 60.3
delivery(6) 197902 physician(1) mother(2) baby(6) 58.7
delivery(7) 197906 physician(2) mother(4) baby(7) 59.2
delivery(8) 198005 physician(1) mother(1) baby(8) 62.3
delivery(9) 198101 physician(1) mother(2) baby(9) 57.9
delivery(10) 198205 physician(1) mother(2) baby(10) 55.2

Object-oriented implementation

An object-oriented front end may be built on top of a relational back end.

For example, open-source Hibernate.

Centralized databases

E.g., hospital information system. Complex.
Advantages:

Historical note: ‘The minimum computer memory required for average data bases varies from 30K to 160K bytes.’ (Fundamentals of Data Base Systems, S.M. Deen, 1977, p. 5). This memory requirement is less than the size of the level-2 cache in a current entry-level PC.

A centralized DBMS will offer different views of the data.

Different applications will need to look at data in different ways. For example:

Electronic Medical Records

2001 survey: EMR's used by general practitioners (ref) :

2005 study: in USA, EMR's used by

Current efforts

Importance of semantics, structured terminology.

At MUHC

Oacis

Jointly with CHUM.
Open architecture using HL7.

Used by clinicians.
Can be linked to medical devices.

(Further details on Oacis)

DRG system


Bacon home page
R. Funnell
Last modified: Wed, 2007 Mar 21 17:29:21