


Programming Assignment #2


Due: Friday, November 12, 1999

The purpose of this assignment is to write an Asymmetric Hash Join program that will be used to compute the following aggregate SQL query:

A query like this could be used by a brokerage firm when it buys and sells securities (stocks). The relations R and S could contain the quotations of stocks on two different exchanges, where attribute A is a sequentially assigned internal number, attribute B is the stock_id, and attribute C is the price of the stock. The difference in prices is relevant for arbitrage, in which stocks are bought at a favorable price on one exchange and sold shortly afterward on another.

When executing the Asymmetric Hash Join, you may assume that the R and S partitions assigned to a join processor, as well as the result of the local join, fit into that processor's memory. Meta-information about the relations and test data are provided.
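The in-memory assumption means the local join on each join processor can be a plain build-and-probe hash join. The following is a minimal sketch, not the required implementation. It assumes the join attribute is B (stock_id), and because the query text is not reproduced above, the aggregate it computes (the largest absolute price difference among matching tuples) is only a placeholder. The names Tuple, Node, and local_hash_join are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical tuple layout: A = internal number, B = stock_id, C = price. */
typedef struct { int a; int b; double c; } Tuple;

/* Chained hash-table entry pointing at one R tuple. */
typedef struct Node { const Tuple *t; struct Node *next; } Node;

/*
 * Build a hash table on the local R partition, then probe it with each
 * local S tuple (assumed join condition: R.B = S.B).
 * Returns the number of joined pairs, or -1 if allocation fails.
 * *max_diff receives the largest |R.C - S.C| seen (placeholder aggregate)
 * and is left unchanged when nothing matches.
 */
long local_hash_join(const Tuple *r, size_t nr,
                     const Tuple *s, size_t ns, double *max_diff)
{
    size_t nbuckets = nr ? 2 * nr : 1;
    Node **buckets = calloc(nbuckets, sizeof *buckets);
    Node *nodes = malloc((nr ? nr : 1) * sizeof *nodes);
    if (!buckets || !nodes) { free(buckets); free(nodes); return -1; }

    /* Build phase: insert every R tuple into its bucket. */
    for (size_t i = 0; i < nr; i++) {
        size_t h = (unsigned int)r[i].b % nbuckets;
        nodes[i].t = &r[i];
        nodes[i].next = buckets[h];
        buckets[h] = &nodes[i];
    }

    /* Probe phase: look up each S tuple and fold matches into the aggregate. */
    long matches = 0;
    for (size_t j = 0; j < ns; j++) {
        size_t h = (unsigned int)s[j].b % nbuckets;
        for (Node *n = buckets[h]; n; n = n->next) {
            if (n->t->b != s[j].b) continue;   /* same bucket, different key */
            double d = n->t->c - s[j].c;
            if (d < 0) d = -d;
            if (matches == 0 || d > *max_diff) *max_diff = d;
            matches++;
        }
    }

    free(nodes);
    free(buckets);
    return matches;
}
```

Each join processor would call such a routine once on the tuples it has received and send its partial aggregate back to the client, which combines the partial values into the final result.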

Your program should take the number of join processors as a parameter (test it with values between 4 and 10). The number of scan processors is fixed at four. Initially, all data resides at one client processor, which must first distribute it by range partitioning to equal numbers of R and S scan nodes (i.e., two nodes each).
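The two routing decisions in this setup can be sketched as small pure functions: the client picks a scan node by range, and a scan node picks a join processor by hashing. This is only an illustration under stated assumptions: the range-partitioning attribute is taken to be an integer key with known bounds, the hash is applied to the stock_id, and the function names range_partition and join_processor_for are hypothetical.

```c
/*
 * Range partitioning: map a key in [lo, hi] to one of n_nodes scan nodes
 * using equal-width ranges. Keys outside the bounds are clamped to the
 * first or last node.
 */
int range_partition(int key, int lo, int hi, int n_nodes)
{
    if (key <= lo) return 0;
    if (key >= hi) return n_nodes - 1;
    /* width = ceil((hi - lo + 1) / n_nodes) */
    long width = ((long)hi - lo + n_nodes) / n_nodes;
    return (int)(((long)key - lo) / width);
}

/*
 * Hash routing: choose the join processor for a tuple from its join key
 * (assumed here to be the stock_id). R and S tuples with the same key
 * always land on the same join processor, which is what makes the local
 * joins independent of one another.
 */
int join_processor_for(int stock_id, int n_join)
{
    unsigned int h = (unsigned int)stock_id * 2654435761u; /* multiplicative hash */
    return (int)(h % (unsigned int)n_join);
}
```

With two scan nodes per relation and keys 1 through 100, range_partition sends keys 1 to 50 to node 0 and keys 51 to 100 to node 1. The n_join argument would come from the program's join-processor parameter.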

After you have read all the data from the test files into the client, start measuring your execution time using the routine MPI_Wtime(), and stop the measurement once the join has finished and the client has received the final aggregate value. For this assignment, execution time is important in addition to the correctness of the algorithm.

Test Data


THURS Oct 21, 1999