Data may be collected, manipulated and retrieved in various ways:
A DBMS is a software package for defining and managing a database.
A ‘real’ database includes
A DBMS provides Data independence.
One must be able to change storage mechanisms and formats without having to modify all application programmes. For example:
Four main types of database organization:
A single kind of record with a fixed number of fields.
Notice the repetition of data, and thus an increased chance of errors.
Hierarchical relationships among different types of data.
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
Question #2:
How many babies with birth weights > 2.5 kg
have been delivered by each physician?
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
First we scan through the Physician records looking for Jones.
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
Now we've found Jones ...
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
... so we scan through the relevant Mother records
looking for mothers with heights > 170 cm.
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
This mother's height is > 170 cm ...
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
... so we scan through her babies looking at the weights.
Question #1: What is the average weight of babies delivered by Jones to mothers with heights > 170 cm?
Now we can calculate the average weight for these two babies.
We've read 6 records in all.
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
To get at the weights in all the baby records, we have to first read
each physician record ...
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
... and, for each physician, read each mother record ...
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
... and finally, for each mother, read each baby record.
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
Another baby record ...
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
We've had to read 17 records in all, including all the mother records
(which we weren't really interested in).
Data are organized as logically independent tables. Features:
Join
, Project
, Select
)
to form new tables
Join
: combine tables having common columns
Project
: extract specified columns, remove
duplicate rows
Select
: select rows satisfying a condition
Question #1: What is average weight of babies delivered by Jones to mothers with heights > 170 cm?
First we Join
the Mother and Physician tables.
M# | MHt | MBirth | Ph# | PhName |
---|---|---|---|---|
123456 | 150 | 4605 | 123 | Smith |
392382 | 177 | 4912 | 220 | Jones |
238427 | 162 | 5204 | 123 | Smith |
649308 | 174 | 5409 | 123 | Smith |
732293 | 155 | 5810 | 220 | Jones |
Next we Project
the columns that
we're interested in.
PhName | MHt | M# |
---|---|---|
Smith | 150 | 123456 |
Jones | 177 | 392382 |
Smith | 162 | 238427 |
Smith | 174 | 649308 |
Jones | 155 | 732293 |
Next we Select
the records with
PhName = Jones and MHt > 170.
One record satisfies the selection criteria.
PhName | MHt | M# |
---|---|---|
Jones | 177 | 392382 |
Next we Join this with the Baby table.
The join with the baby table results in 2 records.
PhName | MHt | M# | Deliv | MWt | BWt |
---|---|---|---|---|---|
Jones | 177 | 392382 | 7512 | 57.4 | 3.6 |
Jones | 177 | 392382 | 7706 | 58.2 | 3.4 |
From these we can calculate the average weight.
We have carried out 4
relational operations
(Join, Project, Select and Join).
Question #2: How many babies with birth weights > 2.5 kg have been delivered by each physician?
First we Select
babies with BWt > 2.5
B# | M# | Deliv | MWt | BWt |
---|---|---|---|---|
941830 | 392382 | 7512 | 57.4 | 3.6 |
743920 | 392382 | 7706 | 58.2 | 3.4 |
840238 | 238427 | 7801 | 64.1 | 2.7 |
230484 | 123456 | 7803 | 60.3 | 3.4 |
749200 | 649308 | 7902 | 58.7 | 2.9 |
530284 | 123456 | 8005 | 62.3 | 3.7 |
638201 | 649308 | 8101 | 57.9 | 3.1 |
Next we Project
the columns
we're interested in.
B# | BWt | M# |
---|---|---|
941830 | 3.6 | 392382 |
743920 | 3.4 | 392382 |
840238 | 2.7 | 238427 |
230484 | 3.4 | 123456 |
749200 | 2.9 | 649308 |
530284 | 3.7 | 123456 |
638201 | 3.1 | 649308 |
Next we Join
this with the Mother table.
B# | BWt | M# | MHt | MBirth | Ph# |
---|---|---|---|---|---|
941830 | 3.6 | 392382 | 177 | 4912 | 220 |
743920 | 3.4 | 392382 | 177 | 4912 | 220 |
840238 | 2.7 | 238427 | 162 | 5204 | 123 |
230484 | 3.4 | 123456 | 150 | 4605 | 123 |
749200 | 2.9 | 649308 | 174 | 5409 | 123 |
530284 | 3.7 | 123456 | 150 | 4605 | 123 |
638201 | 3.1 | 649308 | 174 | 5409 | 123 |
Now we can count the babies for each physician.
We've done 3 relational
operations.
Relational operations are computationally expensive.
A relational front end may be built on top of a hierarchical back end.
Object-oriented analysis is another way to model the world, involving abstraction, encapsulation, modularity and hierarchy (with inheritance).
An object consists of data and methods.
Classes are used to group objects which have the same types of data and the same methods.
Consider only features which are necessary for the problem at hand.
For example, a person may be defined by the ability to ask for the ID number, age, height and weight.
The internal structure of an object is hidden.
For example, all we know is that we can ask for the age, not whether the age is (1) periodically updated or (2) calculated on demand from birth date and current date.
Grouping classes and objects into ‘cohesive and loosely coupled modules’.
For example, the classes person and cat might be implemented within the same module because they share many things (like how to calculate age).
Objects are instances of classes, e.g., JDoe is an instance of the class patient
Classes and objects form two different hierarchies.
Person | IDNum |
Physician | IDNum, Name |
Patient | IDNum |
Mother | IDNum, DateBirth, Height |
Baby | IDNum, WeightB |
Event | DateEvent, Physician, Patient |
Ultrasound | DateEvent, Physician, Patient, CRL, BPD, GestUS |
Delivery | DateEvent, Physician, Mother, Baby, WeightM |
Numeric_value | (range) |
Height | (range,arith_ops) |
Weight | (range,arith_ops) |
WeightM | (range, arith_ops) |
WeightB | (range, arith_ops) |
Date | (range date_ops) |
DateBirth | (range, date_ops) |
DateEvent | (range, date_ops) |
ID_num | (range) |
String_value | (char_coding, sort_ops) |
Name | (char_coding, sort_ops, name_ops) |
physician(1) | 123 | smith | |
physician(2) | 220 | jones | |
mother(1) | 123456 | 194605 | 150 |
mother(2) | 649308 | 195409 | 174 |
mother(3) | 238427 | 195204 | 162 |
mother(4) | 732293 | 195810 | 155 |
mother(5) | 392382 | 194912 | 177 |
baby(1) | 481148 | 2.5 | |
baby(2) | 941830 | 3.6 | |
baby(3) | 743920 | 3.4 | |
baby(4) | 840238 | 2.7 | |
baby(5) | 230484 | 3.4 | |
baby(6) | 749200 | 2.9 | |
baby(7) | 538403 | 2.2 | |
baby(8) | 530284 | 3.7 | |
baby(9) | 638201 | 3.1 | |
baby(10) | 539208 | 1.4 |
delivery(1) | 197511 | physician(1) | mother(3) | baby(1) | 61.8 |
delivery(2) | 197512 | physician(2) | mother(5) | baby(2) | 57.4 |
delivery(3) | 197706 | physician(2) | mother(5) | baby(3) | 58.2 |
delivery(4) | 197801 | physician(1) | mother(3) | baby(4) | 64.1 |
delivery(5) | 197803 | physician(1) | mother(1) | baby(5) | 60.3 |
delivery(6) | 197902 | physician(1) | mother(2) | baby(6) | 58.7 |
delivery(7) | 197906 | physician(2) | mother(4) | baby(7) | 59.2 |
delivery(8) | 198005 | physician(1) | mother(1) | baby(8) | 62.3 |
delivery(9) | 198101 | physician(1) | mother(2) | baby(9) | 57.9 |
delivery(10) | 198205 | physician(1) | mother(2) | baby(10) | 55.2 |
An object-oriented front end may be built on top of a relational back end.
For example, open-source Hibernate.
E.g., hospital information system.
Complex.
Advantages:
Historical note: ‘The minimum computer memory required for average data bases varies from 30K to 160K bytes.’ (Fundamentals of Data Base Systems, S.M. Deen, 1977, p. 5). This memory requirement is less than the size of the level-2 cache in a current entry-level PC.
A centralized DBMS will offer different views of the data.
Different applications will need to look at data in different ways. For example:
2001 survey: EMR's used by general practitioners (ref) :
2005 study: in USA, EMR's used by
Importance of semantics, structured terminology.
Jointly with CHUM.
Open architecture using
HL7.
Used by clinicians.
Can be linked to medical devices.