A relation instance r defined over schema R is denoted by r (R).
The current values in a relation are specified by a table.
An element t of relation r is called a tuple and is represented by a row in a table.
Attributes
The set of allowed values for each attribute is called the domain of the attribute.
Attribute values can be required to be atomic: that is, indivisible. E.g., married, single, devoice; male, female.
The special value null is a member of every domain. Indicated that the value is “unknown”.
The null value causes complications in the definition of many operations.
Relations are Unordered
Order of tuples does not matter (tuples may be stored i an arbitrary order)
Example: instructor relation with unordered tuples.
Database Schema
Database schema (empty table)– is the logical structure of the database.
Database instance(table with contents)– is a snapshot of the data in the database at a given instant in time. This may change. E.g., delete, insert rows.
Example:
schema: instructor(ID, name, dept_name, salary)
Instance:
Keys
Let \(K \subseteq R\)
K is a superkey of R if values are sufficient to identify a unique tuple (i.e., row) of each possible relation r(R)
Example: {ID} and {ID,name} are both superkeys of instructor.
Superkey K is a candidate key if K is minimal
Example: {ID} is a candidate key for instructor, (compared to {ID,name},{ID} is minimal)
One of the candidate keys is selected to be the primary key.
Which one?
Keys continued
Foreign key constraint: Value in one relation must appear in another relation(figure on next page)
Referencing relation.
Referenced relation.
Example: dept_name in instructor is a foreign key from instructor referencing department.
Schema Diagram for University Database
Relational Algebra
A procedual language consisting of a set of operations that take one or two relations as input and produce a new relation as their result.
Seven basic operators
select: \(\sigma\)
project:\(\Pi\)
join: \(\bowtie\)
union: \(\cup\)
set difference: -
Cartesian product: x
rename: \(\rho\)
Select Operation
The select operation selects tuples that satisfy a given predicate.
Notation: \(\sigma_p (r)\)
p is called the selection predicate
Example: sekect those tuples of the instructor relation where the instructor is in the “Physics” department.
Query
\(\sigma\) dept_name =“Physics”(instructor)
Select Operation (Cont.)
We allow the following comparisons to be used in the selection predicate
=,\(\neq\),>,\(\ge\),<,\(\le\)
We can combine several predicates into a larger predicate by using the following connectives:
\(\land\)(and),\(\lor\)(or),\(\neg\)(not)
E.g. \(\sigma\) name=“Wu”\(\land\)dept_name="CS(instructor)
The select predicate may include comparisons between two attributes.
Example, find all departments whose name is the same as their building name:
\(\sigma\)dept_name=building(department)
Select Operation()
2 minute challenge:
Find all instructors in Physics with a salary greater $90,000
Find the instructors in CS with the ID 83821
Find the instructors in CS whose name is not Katz
Project Operation
A unary operation that returns its argument relation, with certain attributes left out(return some columns not all).
Notation (where \(A_1,A_2,...,A_k\) are attributes names and r is a relation name.):
\(\Pi A_1,A_2,A_3,...,A_k\)
The result is defined as the relation of k columns obtained by erasing the columns that are not listed.
Duplicate rows removed from result, since relations are sets (no duplicated elements)
Project Operation Example
Example: eliminate the dept_name attribute of instructor
Query:
\(\Pi\)ID,name,salary(instructor)
Result
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.