Skip to content

Lecture 5: Relational Algebra Fundamentals

1.0 Introduction to Relational Query Languages

Query languages are the primary means by which users interact with databases, serving as the essential tool for retrieving and manipulating stored information. They form the bridge between a user's request for data and the database management system's ability to fulfill that request. Fundamentally, these languages can be classified into two distinct categories based on how the user formulates their query: procedural and non-procedural.

The table below outlines the key difference between these two approaches.

Language TypeUser Interaction Method
ProceduralThe user instructs the system on how to compute the desired result by specifying a sequence of operations to be performed on the database.
Non-ProceduralThe user describes the desired information without providing a specific, step-by-step procedure for obtaining that information.

This guide focuses on Relational Algebra, a foundational query language that serves as a prime example of the procedural approach.

2.0 Defining Relational Algebra

Relational Algebra is a formal, procedural query language that underpins the logic of relational databases. It provides a set of operations that act as a step-by-step toolkit for constructing queries and retrieving data. Instead of simply describing what data is needed, a user of Relational Algebra defines how to get it by combining a sequence of these fundamental operations.

The core characteristics of Relational Algebra are:

  • It is a procedural language, where queries are built by applying operators in a specific sequence.
  • Its operations are applied to either one or two relations (tables) at a time.
  • The result of every operation is always a single new relation.

We will now detail the fundamental operations that constitute Relational Algebra.

3.0 Core Relational Algebra Operations

The power of Relational Algebra comes from five core operations that serve as the building blocks for all queries. By combining these operations—Select, Project, Union, Intersection, and Difference—one can construct complex instructions to retrieve precise sets of data from a database.

3.1 The Select Operation (σ)

The Select Operation is used to filter tuples (rows) from a relation. It works by choosing only those tuples that satisfy a specific condition, known as a predicate.

The notation for the Select operation is:

σ p(r)

Here, σ (sigma) is the symbol for the Select operation, p is the selection condition or predicate, and r is the relation (table) you are querying. The condition p can include logical connectors such as AND, OR, and NOT to create more complex filters.

Example: To select tuples from the Books table where the subject is 'database' AND the price is '450'.

Query: σ subject = "database" and price = "450"(Books)

Source "Books" Relation:

subjectBook_isbnprice
database345h-e6ck300
OOP with java827g-y90j700
Artificial intelligence4h7k-0jde450
databaseLedu-4520450
DatabaseL129-8nge200

Output Relation:

subjectBook_isbnprice
databaseLedu-4520450

3.2 The Project Operation π

The Project Operation is used to select specific attributes (columns) from a relation. It creates a new relation containing only the specified attributes, effectively eliminating all others from the result.

The notation for the Project operation is:

π A1, A2, .....An(r)

In this format, π (pi) is the symbol for the Project operation, A1, A2, ... represents the list of attribute names to be included in the result, and r is the source relation.

Example: To retrieve only the Name and City for all entries in the CUSTOMER relation.

Query: π Name, City(CUSTOMER)

Source "CUSTOMER" Relation:

NameStreetCity
JonesMainHarrison
SmithNorthRye
HaysMainHarrison
BrooksSenatorBrooklyn

Output Relation:

NameCity
JonesHarrison
SmithRye
HaysHarrison
BrooksBrooklyn

3.3 The Union Operation

The Union Operation combines all tuples from two relations into a single new relation. It is analogous to the union of two sets, meaning it automatically removes any duplicate tuples from the final result.

The notation for the Union operation is R ∪ S.

For a Union operation to be valid, it must satisfy the following conditions:

  • The two relations (R and S) must have the same number of attributes.
  • Duplicate tuples are automatically eliminated from the resulting relation.

Example: To find the names of all customers who have either a borrower account, a depositor account, or both.

Query: π Customer_name(BORROW) ∪ π Customer_name(DEPOSITOR)

Source 1: **π Customer_name(BORROW)**

Customer_name
Jones
Smith
Hayes
Curry
Jackson

Source 2: **π Customer_name(DEPOSITOR)**

Customer_name
Johnson
Smith
Mayes
Johnson
Jones

Final Output Relation:

Customer_name
Johnson
Smith
Mayes
Jones
Hayes
Curry
Jackson

3.4 Intersection Operation

The Set Intersection Operation is used to find the common tuples between two relations. The resulting relation contains only those tuples that exist in both of the source relations.

The notation for the Intersection operation is R ∩ S.

Example: To find the names of customers who have both a borrower account and a depositor account.

Query: π Customer_name(BORROW) ∩ π Customer_name(DEPOSITOR)

Output Relation:

Customer_name
Smith
Jones

3.5 The Set Difference Operation (-)

The Set Difference Operation is used to find tuples that are present in one relation but not in another. The operation R - S will produce a relation containing only the tuples that are in relation R but are not in relation S.

The notation for the Difference operation is R - S.

Example: To find the names of customers who have a borrower account but not a depositor account.

Query: π Customer_name(BORROW) - π Customer_name(DEPOSITOR)

Output Relation:

Customer_name
Hayes
Curry
Jackson

4.0 Conclusion and Key Takeaways

Relational Algebra provides the formal, procedural foundation upon which modern database query languages, like SQL, are built. By understanding its core operations, one gains insight into the fundamental logic of data retrieval in relational systems. Each operation is a distinct tool that, when combined with others, allows for the precise and powerful extraction of information.

The table below provides a final summary of the five core operations.

OperationSymbolCore Function
SelectσFilters rows (tuples) from a relation based on a specified condition.
ProjectπSelects a subset of columns (attributes) from a relation, discarding the rest.
UnionCombines all rows from two relations into a single set, eliminating duplicates.
Set IntersectionReturns only the rows that are present in both of the specified relations.
Set Difference-Returns the rows that are in the first relation but not in the second.