This problem will explore different join implementations and the associated 10 costs for each model. Let R(a, b), S(b,…

This problem will explore different join implementations and the associated 10 costs for each model. Let R(a, b), S(b,…

Question:

  

Transcribed Image Text:

This problem will explore different join implementations and the associated 10
costs for each model. Let R(a, b), S(b, c), and T(c,d) be tables. For the purpose of
this question, use the values provided below.
• P(R) = number of pages of R = 20
T(R) = number of tuples of R = 1600
P(S) = number of pages of S = 200
• T(S)= number of tuples of S = 15000
• P(T) = number of pages of T = 2000
P(R, S) = number of pages in output RS = 100
• P(S, T) = number of pages in output ST = 1000
• P(R, S, T) = number of pages in output RST = 500
• B= number of buffer pages = 32
Assume that no repacking is being done unless it is specified explicitly.
.
.
.
Question 4.1
Let us start by considering a simple nested loop join. Compute the 10 cost for a
simple nested loop join if R is the “outer loop” and S is the “inner loop.”
Question 4.2
Compute the 10 cost for a simple nested loop join if S is the “outer loop” and R is
the “inner loop.” Question 4.3
Now consider using a block nested loop join. Compute the 10 cost for joining R, S
and then joining the result with T. Then compute the 10 cost for joining S, T and
then joining the result with R.
Question 4.4
Now consider using a sort-merge join. Compute the 10 cost for joining R, S and then
joining the result with T. Assume that the tables are not sorted before starting. Also
assume that we do not need to do any back up as described in lecture.
Question 4.5
Again, using a sort-merge join, compute the 10 cost for joining S, T and then joining
the result with R. Assume that the tables are not sorted before starting. Also
assume that we do not need to do any back up as described in lecture.
Question 4.6
Now suppose we only want to join R and S (with sort-merge join) but this time all
values for the join attribute are the same. What would be the 10 cost now?
Question 4.7
Now consider using a hash join. Compute the 10 cost for joining R, S and then
joining the result with T.
Question 4.8
Again, using a hash join, compute the 10 cost for joining S, T and then joining the
result with R.
Question 4.9
For the query plan where join1 = R(a,b),S(b,c) and join2 = join1(a,b,c),T(c,d), find a
configuration where using hash join for join1 and sort-merge join for join2 is
cheaper than sort-merge join for join1 and hash join for join2 by adjusting buffer
sizes and the number of pages in each table. Provide your answer in terms of the
variables listed at the start of this question (such as P(R), P(R, S), B, etc.). The
output sizes you choose for P(R, S) and P(R, S, T) must be non-zero and feasible
(e.g. the maximum output size of join 1 is P(R)*P(S)).

Expert Answer:

Answer rating: 100% (QA)

Based on the provided information let s begin by addressing Question 4 1 Question 4 1 asks us to com
View the full answer