2 minute read

Categories:

Tags:

Query(expression) on set of relations produces relation as a result

Example: simple college admissions database

College(cName, state, enrollment) Student(sID, sName, GPA, sizeHS) Apply(sID, cName, major, decision)

Simplest query: relations name (ex. write a query โ€œStudentโ€ -> produce copy of Student relation)

Use operators to filter, slice, combine

Select operator: picks certain rows $\sigma_{condition} Relation$

  1. Students with GPA > 3.7 \(\sigma_{GPA > 3.7} {Student}\) This will produce subset of the student table containing those rows where the GPA is greater than 3.7.

  2. Student GPA > 3.7 and HS<1000 \(\sigma_{GPA>3.7 \and HS<1000} {Student}\)

  3. Applications to UofT CS major \(\sigma_{cName='UofT'\and major='cs'} {Apply}\)

Proejct operator: picks certain columns $\Pi_{Attribute_1, A_2, โ€ฆ} RelationName$

ID and decision of all application \(\Pi_{sID, dec} Apply\)

To pick both rows and columns

ID and name of students with GPA>3.7 \(\Pi_{sID, sName} (\sigma_{GPA>3.7} Student)\) So actually we can do both select and project on any operations.

Duplicates

List of application majors and decisions \(\Pi_{major, dec} Apply\) duplicates are always eliminated in relation algebra

SQL: based on multisets, bags Relational algebra: based on sets

Cross-product: combine two relations(Cartesian product)

Schema of the result is the union of the schemas of the two relations and the contents of the result are every combination of tuples from those relations.

If we do $Student \times Apply$ we get the result with 8 attributes. For ones with the duplicate attribute name in this case sID, we could access them by putting the relation name before the attribute like Student.sID, Apply.sID. If Student has S tuples, and Apply has A tuples, the result of cartesian product would give us S times A tuples.

Names and GPAs of students with HS>1000 who applied to CS and were rejected \(\Pi_{sName, GPS}(\sigma_{Student.sID=Apply.sID \\ \and HS>1000 \and major='cs' \and dec = 'R'} (Student \times Apply))\)

Natural Join ($\bowtie$)

  • Enforce equality on all attributes with same name
  • Eliminate one copy of duplicate attributes

Names and GPAs of students with HS>1000 who applied to CS and were rejected \(\Pi_{sName, GPA}(\sigma_{HS>1000 \and major='cs' \and dec='R'}(Student \bowtie Apply))\) Names and GPAs of students with HS>1000 who applied to CS at college with enr>20000 and were rejected \(\Pi_{sName, GPA}(\sigma_{HS>1000 \and major='cs' \and dec='R' \and enr>20,000}(Student \bowtie (Apply \bowtie College))\)

To Summerize \(Expression_1 \bowtie Expression_2 \equiv \Pi_{schema(E_1) \cup schema(E_2)}(\sigma_{E1_{A1} = E2_{A1} \and E1_{A2} = E2_{A2} ... } Expression_1 \times Expression_2)\)

Theta Join

\[Expression_1 \bowtie_\theta Expresion2 \equiv \sigma_\theta(Expression_1 \times Expression_2)\]

Basic operation implemented in DBMS

Term โ€œJoinโ€ often means theta join

basic operation: take two relations, combine all tuples, but then only keep the combination that pass the theta condition

Leave a comment