Skip to content

Lecture 4: Normalization

Normalization

Definition:
Normalization is the process of organizing data in a database to minimize data redundancy and update anomalies.

Objectives

  • Reduce data duplication.
  • Decompose poorly structured relations into smaller, stable tables.
  • Maintain data integrity and consistency.
  • Based on Functional Dependencies (FDs).

Common Normal Forms

  • 1NF – Eliminate repeating groups, ensure atomic values.
  • 2NF – Eliminate partial dependencies.
  • 3NF – Eliminate transitive dependencies.
  • BCNF, 4NF, 5NF – Higher normalization for advanced design refinement.

Anomalies

Database anomalies are inconsistencies that occur when modifying data in unnormalized tables.

TypeDescription
Insertion AnomalyUnable to add data due to missing other information.
Deletion AnomalyRemoving a record unintentionally deletes important data.
Update AnomalyInconsistent data after partial updates.

Goal: Remove anomalies to ensure reliable insertion, deletion, and modification operations.

Functional Dependency (FD)

A functional dependency (FD) exists when one attribute (or set of attributes) determines another attribute.

Notation:
X -> Y means attribute Y is functionally dependent on X.

  • X: Determinant
  • Y: Dependent attribute

Example:
Student_ID -> Student_Name
Each student ID maps to exactly one name, but a name may belong to multiple students.

First Normal Form (1NF)

A relation is in 1NF if:

  1. All attribute values are atomic (no multiple or repeating values).
  2. There are no repeating groups or columns.
  3. Each record can be uniquely identified using a primary key.

Example

Unnormalized Table:

Department_NameStudent_Name
BE-ITRasmus, Jens, Kim
Multimedia DesignSara

1NF Conversion:

Department_NameStudent_Name
BE-ITRasmus
BE-ITJens
BE-ITKim
Multimedia DesignSara

Each record now has atomic values.

Exercise A – 1NF

Initial Table:

Product_IDProduct_NameCategoryPrice_1Price_2Price_3
P001Wooden ChairFurniture150.00135.00120.00
P002Metal TableOffice300.00280.00250.00

Step 1: Separate composite attributes
-> Split Category into Category_Style and Category_Material.

Step 2: Remove repeating values
-> Store each price as a separate record.

Final 1NF Table:

Product_IDProduct_NameCategory_StyleCategory_MaterialPrice
P001Wooden ChairFurnitureWood150.00
P001Wooden ChairFurnitureWood135.00
P001Wooden ChairFurnitureWood120.00
P002Metal TableOfficeMetal300.00
P002Metal TableOfficeMetal280.00
P002Metal TableOfficeMetal250.00

Exercise B – 1NF

Normalize a table with multiple course columns (Course1–Course7) into atomic records by creating one record per student-course pair.

Initial Table:

Student_IDStudent_NameCourse1Course2Course3
S101John SmithMathScienceEnglish
S102Maria GarciaHistoryArtNULL

1NF Conversion:

Student_IDStudent_NameCourse
S101John SmithMath
S101John SmithScience
S101John SmithEnglish
S102Maria GarciaHistory
S102Maria GarciaArt

Second Normal Form (2NF)

A relation is in 2NF if:

  1. It is already in 1NF.
  2. Every non-key attribute is fully functionally dependent on the entire primary key, not part of it.

Key Concept: Eliminate partial dependencies (dependencies on part of a composite key).

Decomposition Process:

  • Identify attributes that depend only on part of a composite key.
  • Move them into a new table with the corresponding key.

Example – Decomposition to 2NF

Unnormalized Table:

Lecturer_IDLecturer_NameGenderDOBDepartment_IDDepartment_NameCityZip_CodeHead_of_Department
L001Dr. JohnsonM1975D01Computer ScienceAalborg9000Dr. Petersen
L002Dr. NielsenF1980D02MathematicsAalborg9000Dr. Hansen

Step 1: Split into two tables:
Lecturer Table

Lecturer_IDLecturer_NameGenderDOBDepartment_ID
L001Dr. JohnsonM1975D01
L002Dr. NielsenF1980D02

Department Table

Department_IDDepartment_NameCityZip_CodeHead_of_Department
D01Computer ScienceAalborg9000Dr. Petersen
D02MathematicsAalborg9000Dr. Hansen

Each non-key attribute now depends entirely on its table's primary key.

Exercise C – 2NF

Given supplier and product data, separate supplier attributes (status, city) from transaction attributes (product and quantity).
Result: two relations — one for suppliers, one for supply records.

Third Normal Form (3NF)

A relation is in 3NF if:

  1. It is in 2NF.
  2. There are no transitive dependencies (non-key attributes depending on other non-key attributes).

Condition:
Every non-prime attribute depends only on the primary key.

Example – 3NF Decomposition

Original Table:

Department_IDDepartment_NameStreetCityZip_Code
D01Computer ScienceSelma Lagerlöfs Vej 300Aalborg9000
D02MathematicsFredrik Bajers Vej 7GAalborg9220

Transitive Dependency:
Department_ID -> City and City -> Zip_Code

Decomposed Tables:

  1. Department Table
Department_IDDepartment_NameStreetCity
D01Computer ScienceSelma Lagerlöfs Vej 300Aalborg
D02MathematicsFredrik Bajers Vej 7GAalborg
  1. City Table
CityZip_Code
Aalborg9000
Aalborg9220

Now, Zip_Code depends directly on City, not on Department_ID.

Exercise E – 3NF

Original Table:

Customer_IDCustomer_NameAccNoBankCodeBank
C001Anna Jensen12345B01Nordea Bank
C002Peter Hansen67890B02Jyske Bank

Decompose:

  1. Customer Table:
Customer_IDCustomer_NameAccNoBankCode
C001Anna Jensen12345B01
C002Peter Hansen67890B02
  1. Bank Table:
BankCodeBank
B01Nordea Bank
B02Jyske Bank

Eliminates transitive dependency (BankCode -> Bank).

Boyce-Codd Normal Form (BCNF)

A table is in BCNF if:

  1. It is in 3NF.
  2. For every functional dependency X -> Y, X is a superkey.

In short: Every determinant must be a candidate key.

Note: BCNF is a stricter version of 3NF and eliminates remaining redundancy that 3NF may allow.

Summary

Normal FormKey ConditionGoal
1NFEliminate repeating groups; atomic valuesRemove duplication within columns
2NFNo partial dependency on a composite keyRemove redundancy from subsets of primary key
3NFNo transitive dependencyEnsure all fields depend only on primary key
BCNFEvery determinant is a candidate keyStronger version of 3NF ensuring full dependency on keys

Assignment

Task:
Design a Patient Database and normalize it up to 3NF or BCNF.

Attributes:

  • Personal details: Name, CPR, Address, Phone, DOB, Gender, MaritalStatus, Race.
  • Medical details: Reason of Hospitalization, Date of Hospitalization, Drug_Abuse, Smoker.
  • Administrative details: Insurance_Number, Insurance_Name, Coverage_Date, Nearest_Relative, Relative_Phone, DrivingLicenseNum, DrivingLicenseState.

Deliverables:

  • Create the tables.
  • Normalize them.
  • Specify keys, relationships, and cardinalities.
  • Insert 5 patient records.

Patient Database Design (3NF)

1. Patient Table

CPRNameDOBGenderMaritalStatusRaceAddressPhoneInsurance_NumberDrivingLicenseNumDrivingLicenseState
0101801234Lars Nielsen1980-01-01MMarriedCaucasianNørrebrogade 12, 9000 Aalborg12345678INS001DL12345Northern Denmark
1505905678Maria Petersen1990-05-15FSingleCaucasianVesterbro 25, 9000 Aalborg87654321INS002NULLNULL

2. Hospitalization Table

Hospitalization_IDCPRReason_of_HospitalizationDate_of_HospitalizationDrug_AbuseSmoker
H0010101801234Appendicitis2024-01-15NoYes
H0021505905678Pneumonia2024-02-20NoNo

3. Insurance Table

Insurance_NumberInsurance_NameCoverage_Date
INS001Danmark Insurance2023-05-01
INS002Tryg Forsikring2024-01-10

4. Relative Table

Relative_IDCPRNearest_RelativeRelative_Phone
R0010101801234Anna Nielsen11223344
R0021505905678Peter Petersen44332211

Relationships:

  • Patient (1) Hospitalization (1:N)
  • Patient (1) Insurance (1:1)
  • Patient (1) Relative (1:N)

Keys:

  • Patient: CPR (Primary Key)
  • Hospitalization: Hospitalization_ID (Primary Key), CPR (Foreign Key)
  • Insurance: Insurance_Number (Primary Key)
  • Relative: Relative_ID (Primary Key), CPR (Foreign Key)