Basic Relational Algebra
Categories: others
Tags: Relational Algebra
Query(expression) on set of relations produces relation as a result
Example: simple college admissions database
College(cName, state, enrollment) Student(sID, sName, GPA, sizeHS) Apply(sID, cName, major, decision)
Simplest query: relations name (ex. write a query โStudentโ -> produce copy of Student relation)
Use operators to filter, slice, combine
Select operator: picks certain rows $\sigma_{condition} Relation$
-
Students with GPA > 3.7 \(\sigma_{GPA > 3.7} {Student}\) This will produce subset of the student table containing those rows where the GPA is greater than 3.7.
-
Student GPA > 3.7 and HS<1000 \(\sigma_{GPA>3.7 \and HS<1000} {Student}\)
-
Applications to UofT CS major \(\sigma_{cName='UofT'\and major='cs'} {Apply}\)
Proejct operator: picks certain columns $\Pi_{Attribute_1, A_2, โฆ} RelationName$
ID and decision of all application \(\Pi_{sID, dec} Apply\)
To pick both rows and columns
ID and name of students with GPA>3.7 \(\Pi_{sID, sName} (\sigma_{GPA>3.7} Student)\) So actually we can do both select and project on any operations.
Duplicates
List of application majors and decisions \(\Pi_{major, dec} Apply\) duplicates are always eliminated in relation algebra
SQL: based on multisets, bags Relational algebra: based on sets
Cross-product: combine two relations(Cartesian product)
Schema of the result is the union of the schemas of the two relations and the contents of the result are every combination of tuples from those relations.
If we do $Student \times Apply$ we get the result with 8 attributes. For ones with the duplicate attribute name in this case sID, we could access them by putting the relation name before the attribute like Student.sID, Apply.sID. If Student has S tuples, and Apply has A tuples, the result of cartesian product would give us S times A tuples.
Names and GPAs of students with HS>1000 who applied to CS and were rejected \(\Pi_{sName, GPS}(\sigma_{Student.sID=Apply.sID \\ \and HS>1000 \and major='cs' \and dec = 'R'} (Student \times Apply))\)
Natural Join ($\bowtie$)
- Enforce equality on all attributes with same name
- Eliminate one copy of duplicate attributes
Names and GPAs of students with HS>1000 who applied to CS and were rejected \(\Pi_{sName, GPA}(\sigma_{HS>1000 \and major='cs' \and dec='R'}(Student \bowtie Apply))\) Names and GPAs of students with HS>1000 who applied to CS at college with enr>20000 and were rejected \(\Pi_{sName, GPA}(\sigma_{HS>1000 \and major='cs' \and dec='R' \and enr>20,000}(Student \bowtie (Apply \bowtie College))\)
To Summerize \(Expression_1 \bowtie Expression_2 \equiv \Pi_{schema(E_1) \cup schema(E_2)}(\sigma_{E1_{A1} = E2_{A1} \and E1_{A2} = E2_{A2} ... } Expression_1 \times Expression_2)\)
Theta Join
\[Expression_1 \bowtie_\theta Expresion2 \equiv \sigma_\theta(Expression_1 \times Expression_2)\]Basic operation implemented in DBMS
Term โJoinโ often means theta join
basic operation: take two relations, combine all tuples, but then only keep the combination that pass the theta condition
Leave a comment