Algorithms: Sequential, Parallel and Distributed
Kenneth A. Berman, Jerome L. Paul
Chapter 18
Distributed Computation Algorithms
At a Glance
Table of Contents
Overview
Objectives
Instructor Notes
Solutions to Selected Exercises
Lecture Notes
Overview
In Chapter 18 we introduce distributed computation algorithms suitable for implementation in a message-passing environment, such as is commonly used today on Beowulf clusters. Pseudocode conventions for basic functions such as send, receive, broadcast, and scatter are discussed. These functions are very similar to functions contained in the MPI programming library. The differences between synchronous and asynchronous communication functions are highlighted, and avoiding deadlock is discussed. The generic paradigm of the alternating computation/communication cycle is presented, including how it is implemented using buffered sends and the probe command. Pseudocode for the compare-exchange operation is also given.
Chapter Objectives
After completing the chapter the student will know:
The message-passing paradigm for distributed computation on a cluster
The difference between the blocking Ssend function and the buffered Bsend function
How to avoid deadlock when using the blocking Ssend function (see the sketch following this list)
The basic actions performed by Ssend, Bsend, receive, broadcast, scatter, probe, and related message-passing functions
How to alternate between computation and communication cycles using asynchronous communication as implemented using the probe function
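To make the deadlock objective concrete, here is a minimal C/MPI sketch (ours, not from the text) of the standard remedy for two processes that exchange values with the blocking synchronous send: if both processes called MPI_Ssend first, each would block waiting for a matching receive, so the calls are ordered by rank.

#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, mine, theirs;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    mine = rank + 100; /* arbitrary payload */

    if (rank == 0) {
        /* process 0 sends first, then receives */
        MPI_Ssend(&mine, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&theirs, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
    } else if (rank == 1) {
        /* process 1 receives first, then sends, breaking the wait-for cycle */
        MPI_Recv(&theirs, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        MPI_Ssend(&mine, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}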
Instructor Notes
As mentioned in this chapter, our pseudocode for distributed computation algorithms closely resembles MPI (C version). It is for this reason that some time was spent in a careful discussion of the pseudocode. For example, we typically don't specify the data type (integer, float, char, and so forth) in the pseudocode used for sequential and parallel algorithms, in order to maintain a high-level description of the algorithms. Consistent with this view, in our pseudocode for Ssend and Bsend we have amalgamated the first three data-description parameters of the corresponding MPI_Ssend and MPI_Bsend functions into a single parameter.
If an instructor does not have a Beowulf cluster available, there is software available for free download that allows an MPI program utilizing p > 1 processes to be simulated on a single-processor machine.
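As an illustration of this amalgamation (our example, not taken from the text), the pseudocode call Ssend(a, j, 0) has the message a as its first parameter, whereas the corresponding C call describes the message with an (address, count, datatype) triple:

#include <mpi.h>

/* pseudocode: Ssend(a, j, 0) -- message a, destination j, tag 0 */
void send_block(int *a, int m, int j)
{
    /* in MPI (C), the single message parameter expands to (a, m, MPI_INT) */
    MPI_Ssend(a, m, MPI_INT, j, 0, MPI_COMM_WORLD);
}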
We have created a template library to facilitate teaching message-passing parallel computing.
Solutions to Selected Exercises
18.1
In the following solution, process 0 is the dealer and process 1 is a player. This exercise tests whether the student properly alternates send and receive instructions. As a hint, students might be instructed to use only the blocking Ssend and receive functions and to make use of tags. In the following pseudocode, we assume p = 2 processes.
Ssend(card, 1, 0)
sum_dealer ← get_card()
card ← get_card()
Ssend(card, 1, 0)
sum_dealer ← sum_dealer + get_card()
if sum < sum_dealer .and. sum_dealer < 21 then
    // dealer won
    Ssend(sum_dealer, 1, DEALER_WON)
else
    // player won

// receive first couple of cards
receive(card, 0, ANY_TAG, status)
sum ← sum + card
receive(card, 0, ANY_TAG, status)
sum ← sum + card
game_num ← game_num + 1
if game_num = 10 then
    break
endif
Ssend(player_quit, 0, 0)
18.2
The while loop exits when the receive instruction encounters a message with data containing
18.3
Due to the blocking Ssend and receive, the proper ordering of these function calls must be maintained in order to avoid deadlock.
18.4
We assume that a[0:m − 1] is in process Pi and b[0:m − 1] is in process Pj.
procedure CompareSplit(a[0:m − 1], b[0:m − 1], i, j)
Model: message-passing distributed network
Input: a[0:m − 1], b[0:m − 1] (sorted arrays in processors Pi and Pj, respectively)
// Pi's a[0:m − 1] gets the m smallest elements
q ← r ← 0
for k ← 0 to m − 1 do
    if b[q] < tempa[r] then
        a[k] ← b[q]
        q ← q + 1
    else
        a[k] ← tempa[r]
        r ← r + 1
    endif
endfor
for k ← 0 to m − 1 do // store a copy of Pj's b[0:m − 1] in tempb[0:m − 1]
    tempb[k] ← b[k]
endfor
// Pj's b[0:m − 1] gets the m largest elements
q ← r ← m − 1
18.6
We basically need to emulate a tree-like scatter process, similar to how we designed our asynchronous broadcast on pages 589–590. We assume that data is an array of length n (divisible by p) and that the root is process 0 (with a different root, you can easily adjust the tree).
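A minimal C/MPI sketch of such a tree-structured (recursive-halving) scatter follows; it is our illustration, assuming p is a power of 2, and the function name tree_scatter and its parameters are ours.

#include <stdlib.h>
#include <mpi.h>

/* On entry, data (significant only at process 0) holds n ints; on exit,
 * mychunk holds this process's n/p elements. */
void tree_scatter(int *data, int n, int myid, int p, int *mychunk)
{
    MPI_Status status;
    int elems = 0; /* number of elements this process currently holds */
    int *buf = NULL;

    if (myid == 0) { /* the root starts out holding everything */
        buf = data;
        elems = n;
    }
    for (int stride = p / 2; stride >= 1; stride /= 2) {
        if (myid % (2 * stride) == 0 && elems > 0) {
            /* pass the upper half of our block down the tree */
            MPI_Ssend(buf + elems / 2, elems / 2, MPI_INT,
                      myid + stride, 0, MPI_COMM_WORLD);
            elems /= 2;
        } else if (myid % (2 * stride) == stride && elems == 0) {
            /* first contact: receive the block for our whole subtree */
            elems = (n / p) * stride;
            buf = malloc(elems * sizeof(int));
            MPI_Recv(buf, elems, MPI_INT, myid - stride, 0,
                     MPI_COMM_WORLD, &status);
        }
    }
    for (int k = 0; k < n / p; k++) /* keep our own n/p elements */
        mychunk[k] = buf[k];
    if (myid != 0) free(buf);
}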
18.8
a.
x ← rand()
if myid = 0 then
    sum ← x
18.9
x ← compute()
if myid = 0 then
18.10
The following is a high-level description of the algorithm.
1) partition the array in processor 0 into p pieces
2) scatter the pieces into each processor
3) each processor finds the maximum of its own piece
4) gather the results in processor 0
5) processor 0 will find the maximum in an array of p values
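A hedged C/MPI rendering of these five steps follows; it is our sketch, and the variable names and the value of n are assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int p, myid, n = 1024; /* assume n is divisible by p */
    int *data = NULL, *piece, *maxima = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    int m = n / p;
    piece = malloc(m * sizeof(int));
    if (myid == 0) { /* step 1: processor 0 holds and partitions the data */
        data = malloc(n * sizeof(int));
        maxima = malloc(p * sizeof(int));
        for (int i = 0; i < n; i++) data[i] = rand();
    }
    /* step 2: scatter the pieces */
    MPI_Scatter(data, m, MPI_INT, piece, m, MPI_INT, 0, MPI_COMM_WORLD);

    /* step 3: each processor finds its local maximum */
    int local_max = piece[0];
    for (int i = 1; i < m; i++)
        if (piece[i] > local_max) local_max = piece[i];

    /* step 4: gather the p local maxima in processor 0 */
    MPI_Gather(&local_max, 1, MPI_INT, maxima, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* step 5: processor 0 finds the maximum of the p values */
    if (myid == 0) {
        int global_max = maxima[0];
        for (int i = 1; i < p; i++)
            if (maxima[i] > global_max) global_max = maxima[i];
        printf("maximum = %d\n", global_max);
    }
    MPI_Finalize();
    return 0;
}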
18.12
The algorithm starts with a list of the natural numbers 2, …, n, none of them marked. The algorithm then repeats the following two steps until k² exceeds n (a sequential reference sketch is given after the distributed pseudocode below):
i. find the smallest unmarked number k
ii. mark all multiples of k between k² and n
next_prime ← find_next_prime()
broadcast(next_prime, active_p)
while next_prime > 0 do
    flag_multiples(next_prime)
    next_prime ← find_next_prime()
endwhile
broadcast(active_p, active_p)
endif
endwhile
scatter(data, myportion, 0) // in the end process 0 will contain an array 0..n
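For reference, the two steps themselves can be rendered as a compact sequential C sketch (ours; the distributed pseudocode above splits this work across the active processes):

#include <stdio.h>
#include <stdlib.h>

/* Sequential sieve of Eratosthenes: repeat step i (find the smallest
 * unmarked k) and step ii (mark k^2, k^2 + k, ..., up to n). */
void sieve(int n)
{
    char *marked = calloc(n + 1, 1);
    for (int k = 2; (long)k * k <= n; k++)
        if (!marked[k]) /* step i: k is the smallest unmarked number */
            for (long m = (long)k * k; m <= n; m += k)
                marked[m] = 1; /* step ii: mark the multiples of k */
    for (int i = 2; i <= n; i++)
        if (!marked[i]) printf("%d ", i); /* the unmarked numbers are prime */
    printf("\n");
    free(marked);
}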
18.14
The stability of EvenOddSort follows from the stability of MergeSort and CompareSplit, together with the easily seen fact that the scatter and gather procedures both preserve stability.
18.15
In EvenOddSort there are two functions that can affect the stability: MergeSort and
CompareSplit.
18.16
procedure MatrixProduct(A[0:n − 1, 0:n − 1], B[0:n − 1, 0:n − 1], C[0:n − 1, 0:n − 1])
Model: message-passing distributed network of p processors
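One common message-passing realization of this procedure is a row-block decomposition: scatter n/p rows of A to each processor, broadcast B, compute the corresponding rows of C locally, and gather them at process 0. The following C sketch of that approach is our illustration (not the book's pseudocode); it assumes n is divisible by p and that every process passes an allocated n-by-n buffer for B.

#include <stdlib.h>
#include <mpi.h>

/* C = A * B, all n-by-n in row-major order and significant at process 0. */
void MatrixProductRowBlock(double *A, double *B, double *C,
                           int n, int p, int myid)
{
    int rows = n / p;
    double *Arows = malloc(rows * n * sizeof(double));
    double *Crows = malloc(rows * n * sizeof(double));

    MPI_Scatter(A, rows * n, MPI_DOUBLE, Arows, rows * n, MPI_DOUBLE,
                0, MPI_COMM_WORLD);
    MPI_Bcast(B, n * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < rows; i++) /* compute our block of rows of C */
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += Arows[i * n + k] * B[k * n + j];
            Crows[i * n + j] = s;
        }

    MPI_Gather(Crows, rows * n, MPI_DOUBLE, C, rows * n, MPI_DOUBLE,
               0, MPI_COMM_WORLD);
    free(Arows);
    free(Crows);
}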
18.17
Master
Compute initial set of tasks task1, …, taskn
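A hedged C/MPI sketch of the master's task loop is given below; the tag constants and the representation of a task and a result as single ints are our simplifying assumptions.

#include <mpi.h>

#define TAG_TASK   1
#define TAG_RESULT 2
#define TAG_FINISH 3

/* Hand out tasks, collect results, and reuse each idle worker until the
 * task pool is exhausted; then send every worker a Finish message. */
void Master(int size, int *tasks, int num_tasks)
{
    MPI_Status status;
    int result, next = 0, outstanding = 0;

    /* seed each worker with one initial task */
    for (int w = 1; w < size && next < num_tasks; w++, next++, outstanding++)
        MPI_Send(&tasks[next], 1, MPI_INT, w, TAG_TASK, MPI_COMM_WORLD);

    while (outstanding > 0) {
        MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT,
                 MPI_COMM_WORLD, &status);
        outstanding--;
        if (next < num_tasks) { /* give the now-idle worker another task */
            MPI_Send(&tasks[next++], 1, MPI_INT, status.MPI_SOURCE,
                     TAG_TASK, MPI_COMM_WORLD);
            outstanding++;
        }
    }
    for (int w = 1; w < size; w++) /* tell all workers to stop */
        MPI_Send(&result, 0, MPI_INT, w, TAG_FINISH, MPI_COMM_WORLD);
}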
18.18
For example, due to timing issues a worker may be sending a helper message to another worker when that worker has already received the Finish message. In that case the receiving worker must still be prepared to receive and discard the stray message.
18.19
If a solution is located in a node in the state-space tree in such a way that one of the parallel
18.20
procedure CoveringSet(tasks[0:2p, 0:1 + log2 p], num_tasks, task_len)
Model: message-passing distributed network of p processes
Input: p (number of processors to create the covering set for)
Output: num_tasks (number of created initial paths)
p ← p / 2
task_len ← task_len + 1
endwhile
// create a bit map of every possible p
for i ← 0 to num_tasks − 1 do
18.21
The following pseudocode solves the n-Queens problem. An MPI version is given in the solution to Exercise 18.23.
First, an initial covering set of size n is created. Then, using the master-worker paradigm, each worker is assigned a path from the initial covering set and solves its part independently.
return
endif
endfor
X[PathLen] ← 0
HelperPath[i] ← X[i]
i ← i + 1
endwhile
if n − i < rt then return (NULL) endif // refuse help, since not enough work is left
if i ≥ btLevel then
    k ← k + 1
    btLevel ← i
    btLevelShift ← 0
endif
btLevelShift ← btLevelShift + 1
return
endif
while iterations < ComputeCycleLength .and. k ≥ btLevel .and. solution_found = .false. do
else
    X[k] ← X[k] + 1
endif
else
    if IsSafePos(k, X[k]) then
        if k = btLevel then
            btLevelShift ← btLevelShift + 1
            X[k] ← btLevelShift
        else
            X[k] ← X[k] + 1
        endif
    endif
endif // if/else X[k] = n
if X[k] ≤ n then break endif
endwhile
cur_tasks ← 1
tasks[0][0] ← 0 // empty path; keep the path length in tasks[i][0]
while cur_tasks < requested_tasks do
    // extend all tasks by n
    prev_cur_tasks ← cur_tasks
18.22
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int compare(const void *x, const void *y)
{
    /* ascending comparator for qsort on ints */
    return (*(const int *)x - *(const int *)y);
}
MPI_Ssend(a,m,MPI_INT,j,0,MPI_COMM_WORLD);
MPI_Recv(tempa,m,MPI_INT,j,0,MPI_COMM_WORLD,&status);
for(k = 0; k < m; k++) {
temp[k] = a[k];
}
//a gets the m smallest elements
MPI_Ssend(a,m,MPI_INT,i,0,MPI_COMM_WORLD);
for(k = 0; k < m; k++) {
temp[k] = a[k];
}
q = m - 1; r = m - 1;
for(k = m - 1; k >= 0; k--) {
MPI_Comm_size(MPI_COMM_WORLD,&p);
m = n/p;
MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
a = (int *) malloc(m*sizeof(int));
MPI_Scatter(L,m,MPI_INT,a,m,MPI_INT,0,MPI_COMM_WORLD);
free(a);
}
#define M 10000 /* per processor chunk */
int main(int argc, char *argv[])
{
int size, rank, *arr=NULL, n, i;
MPI_Init(&argc, &argv);
}
18.23
main.c
/*
 * nQueens problem
 * main.c
 * - general framework
 */
int main(int argc, char *argv[]){
int size, rank, ret;
void *buffer;
int buffer_size;
ret = MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
buffer_size = 1000000 + MPI_BSEND_OVERHEAD; /* hypothetical size for buffered sends */
buffer = malloc(buffer_size);
MPI_Buffer_attach(buffer, buffer_size);
if (rank == 0)
{
Master(size);
}
else
{
Worker(rank, size);
}
    MPI_Finalize();
    return 0;
}