This problem will explore different join implementations and the associated 10 costs for each model. Let R(a, b), S(b,…

Question:

Transcribed Image Text:

This problem will explore different join implementations and the associated 10

costs for each model. Let R(a, b), S(b, c), and T(c,d) be tables. For the purpose of

this question, use the values provided below.

• P(R) = number of pages of R = 20

T(R) = number of tuples of R = 1600

P(S) = number of pages of S = 200

• T(S)= number of tuples of S = 15000

• P(T) = number of pages of T = 2000

P(R, S) = number of pages in output RS = 100

• P(S, T) = number of pages in output ST = 1000

• P(R, S, T) = number of pages in output RST = 500

• B= number of buffer pages = 32

Assume that no repacking is being done unless it is specified explicitly.

.

.

.

Question 4.1

Let us start by considering a simple nested loop join. Compute the 10 cost for a

simple nested loop join if R is the “outer loop” and S is the “inner loop.”

Question 4.2

Compute the 10 cost for a simple nested loop join if S is the “outer loop” and R is

the “inner loop.” Question 4.3

Now consider using a block nested loop join. Compute the 10 cost for joining R, S

and then joining the result with T. Then compute the 10 cost for joining S, T and

then joining the result with R.

Question 4.4

Now consider using a sort-merge join. Compute the 10 cost for joining R, S and then

joining the result with T. Assume that the tables are not sorted before starting. Also

assume that we do not need to do any back up as described in lecture.

Question 4.5

Again, using a sort-merge join, compute the 10 cost for joining S, T and then joining

the result with R. Assume that the tables are not sorted before starting. Also

assume that we do not need to do any back up as described in lecture.

Question 4.6

Now suppose we only want to join R and S (with sort-merge join) but this time all

values for the join attribute are the same. What would be the 10 cost now?

Question 4.7

Now consider using a hash join. Compute the 10 cost for joining R, S and then

joining the result with T.

Question 4.8

Again, using a hash join, compute the 10 cost for joining S, T and then joining the

result with R.

Question 4.9

For the query plan where join1 = R(a,b),S(b,c) and join2 = join1(a,b,c),T(c,d), find a

configuration where using hash join for join1 and sort-merge join for join2 is

cheaper than sort-merge join for join1 and hash join for join2 by adjusting buffer

sizes and the number of pages in each table. Provide your answer in terms of the

variables listed at the start of this question (such as P(R), P(R, S), B, etc.). The

output sizes you choose for P(R, S) and P(R, S, T) must be non-zero and feasible

(e.g. the maximum output size of join 1 is P(R)*P(S)).

Expert Answer:

Answer rating: 100% (QA)

Based on the provided information let s begin by addressing Question 4 1 Question 4 1 asks us to com

View the full answer