Sending and Receiving with Halo Exchanges
From our examination of the illustrations for column-major and row-major languages, we conclude that we should split our two-dimensional grid by columns for Fortran and similar languages, and by rows for C++, Python, and others with that array orientation.
Let us assume for both cases that we have nr rows and nc columns in our grid. In a one-dimensional decomposition, we split only one of those values among the processes; call the local number of rows and columns nrl and ncl. When the boundary rows and columns (“ghost” or real) are added, each subdomain will contain nrl+2 rows and ncl+2 columns. Our computed domains will extend from $1$ to $nrl$ and $1$ to $ncl$, with boundaries at $0$ and at $nrl+1$ and $ncl+1$. The two exchanges, one in each direction, will require two Sendrecv invocations.
C++ and Python
For these languages $nrl = nr \div nprocs$ and $ncl=nc$. We will send the first computed row up to row $nrl+1$ of the process at $rank-1$ and the last computed row down to row $0$ of the process at $rank+1$.
C++ syntax:
MPI_Sendrecv(&w[1][1], nc, MPI_DOUBLE, up, tag,
             &w[nrl+1][1], nc, MPI_DOUBLE, down, tag, MPI_COMM_WORLD, &status);
MPI_Sendrecv(&w[nrl][1], nc, MPI_DOUBLE, down, tag,
             &w[0][1], nc, MPI_DOUBLE, up, tag, MPI_COMM_WORLD, &status);
Note that although the variable w by itself is a pointer, a specific element such as w[1][1] is not (the memory location is dereferenced), so its address must be passed to MPI with the & operator in C++.
Python syntax:
comm.Sendrecv([w[1,1:ncl+1],MPI.DOUBLE], up, tag, [w[nrl+1,1:ncl+1],MPI.DOUBLE], down, tag)
comm.Sendrecv([w[nrl,1:ncl+1],MPI.DOUBLE], down, tag, [w[0,1:ncl+1],MPI.DOUBLE], up, tag)
Fortran
For Fortran $nrl=nr$ and $ncl= nc \div nprocs$. We send the first computed column left to column $ncl+1$ of the process at $rank-1$ and the last computed column right to column $0$ of the process at $rank+1$.
call MPI_SENDRECV(w(1:nr,1),     nr, MPI_DOUBLE_PRECISION, left,  tag, &
                  w(1:nr,ncl+1), nr, MPI_DOUBLE_PRECISION, right, tag, &
                  MPI_COMM_WORLD, mpi_stat, ierr)
call MPI_SENDRECV(w(1:nr,ncl),   nr, MPI_DOUBLE_PRECISION, right, tag, &
                  w(1:nr,0),     nr, MPI_DOUBLE_PRECISION, left,  tag, &
                  MPI_COMM_WORLD, mpi_stat, ierr)
MPI_PROC_NULL
Rank $0$ has no neighbor to the top or left, and rank $nprocs-1$ has no neighbor to the bottom or right. We might think that we would have to write a complicated set of conditionals to handle these cases, but this is a common scenario and MPI provides a built-in solution: MPI_PROC_NULL.
The special value MPI_PROC_NULL can be used in place of a source or destination. It results in a “no-op”; the send or receive is not attempted and the function returns immediately.
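For example, here is a minimal sketch (separate from the exercise code below) in which the endpoint ranks are simply given MPI_PROC_NULL as a neighbor. The Sendrecv calls then look the same on every rank; where the partner is MPI_PROC_NULL, that half of the exchange is skipped.
#include <iostream>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // The first and last ranks have no neighbor in one direction.
    int up   = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
    int down = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

    // Exchange one value with each neighbor. Where the partner is MPI_PROC_NULL
    // the send or receive is skipped, so the -1. placeholders remain.
    double myval = rank, from_up = -1., from_down = -1.;
    int tag = 0;
    MPI_Status status;
    MPI_Sendrecv(&myval, 1, MPI_DOUBLE, up,   tag,
                 &from_down, 1, MPI_DOUBLE, down, tag, MPI_COMM_WORLD, &status);
    MPI_Sendrecv(&myval, 1, MPI_DOUBLE, down, tag,
                 &from_up,   1, MPI_DOUBLE, up,   tag, MPI_COMM_WORLD, &status);

    std::cout << "Rank " << rank << " got " << from_up << " from above and "
              << from_down << " from below\n";

    MPI_Finalize();
    return 0;
}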
Exercise
Use the language of your choice. Write a program that fills an array w with the uniform value of 50 in the inner grid (1 to nr, 1 to nc globally, i.e. 1 to nrl and 1 to ncl on each rank), where nr=nc=500. Set the values of row 0 to 0, and set the values of row nrl+1 and of columns 0 and ncl+1 to 100. Do not worry about whether values at corners match.
Set up the transfer and run one exchange of the halo values of w. Print some values to check your result.
Review strong and weak scaling, and try both with different runs of your code. For strong scaling, make sure that the number of processes evenly divides the number of rows or columns.
Hint: Set up the neighbors for each rank (“up” and “down” or “left” and “right”) before doing any transfers.
C++
#include <iostream>
#include <mpi.h>
using namespace std;
int main (int argc, char *argv[]) {

    int i, j;
    int N;
    double topBC=100;
    double bottomBC=0.;
    double edgeBC=100.;

    // Added for MPI
    int nr, nrl, nc, ncl;
    int rank, nprocs;
    MPI_Status status;
    int root=0, tag=0;
    int up, down;

    //Initialize MPI
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD,&nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    //Usually we would read in nr and nc; they represent the global grid RxC.
    //For this case they are both N.
    N=500;
    nr=N;
    nc=N;

    //Use ncl for clarity
    ncl=nc;

    //Weak scaling
    //nrl=nr;

    //Strong scaling
    if (nr%nprocs!=0) {
        MPI_Finalize();
        cout<<"Not an even division of the arrays.\n";
        return 0;
    }
    else {
        nrl=nr/nprocs;
    }

    //Find my neighbors
    if (rank==0) {
        up=MPI_PROC_NULL;
    }
    else {
        up=rank-1;
    }
    if (rank==nprocs-1) {
        down=MPI_PROC_NULL;
    }
    else {
        down=rank+1;
    }

    //Two extra rows and columns for boundary conditions
    int nrows=nrl+2;
    int ncols=ncl+2;
    double **w=new double*[nrows];
    double *wptr=new double[nrows*ncols];
    for (i=0;i<nrows;++i,wptr+=ncols)
        w[i] = wptr;
    if (rank==0) {
        for (int i=0;i<=ncl+1;++i){
            w[0][i]=bottomBC;
        }
    }
    if (rank==nprocs-1) {
        for (int i=0;i<=ncl+1;++i){
            w[nrl+1][i]=topBC;
        }
    }
    for (int i=0;i<=nrl+1;++i){
        w[i][0]=edgeBC;
        w[i][ncl+1]=edgeBC;
    }

    //Initialize interior only
    for ( i = 1; i <= nrl; i++ ) {
        for (j = 1; j <= ncl; j++ ) {
            w[i][j] = 50.;
        }
    }

    MPI_Sendrecv(&w[1][1],    ncl, MPI_DOUBLE, up,   tag,
                 &w[nrl+1][1], ncl, MPI_DOUBLE, down, tag, MPI_COMM_WORLD, &status);
    MPI_Sendrecv(&w[nrl][1],  ncl, MPI_DOUBLE, down, tag,
                 &w[0][1],    ncl, MPI_DOUBLE, up,   tag, MPI_COMM_WORLD, &status);

    //Spot-check results
    for (i=0;i<nprocs;++i) {
        if (i==rank) {
            cout<<i<<" "<<w[0][ncl/2]<<" "<<w[nrl+1][ncl/2]<<endl;
        }
    }

    MPI_Finalize();
}
Fortran
program halo
    use mpi
    double precision, allocatable, dimension(:,:) :: w
    integer :: N=500
    double precision :: topBC, bottomBC, edgeBC
    integer :: i
    integer :: nr, nc, nrl, ncl
    integer :: rank, nprocs, tag=0
    integer :: errcode, ierr
    integer, dimension(MPI_STATUS_SIZE) :: mpi_stat
    integer, parameter :: root=0
    integer :: left, right

    !Initialize MPI, get the local number of columns
    call MPI_INIT(ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD,nprocs,ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
    !Usually we'd read in nr and nc; they represent the global grid size RxC.
    !For this case they are both N.
    nr=N
    nc=N

    !Generally we'd just use nr and not duplicate; this is for clarity later
    nrl=nr

    ! weak scaling
    !ncl = nc

    ! strong scaling
    if (mod(nc,nprocs).ne.0) then
        call MPI_FINALIZE(ierr)
        stop 'Not an even division of the array'
    else
        ncl = nc/nprocs
    endif

    ! Find my neighbors
    if ( rank .eq. root ) then
        left = MPI_PROC_NULL
    else
        left = rank - 1
    endif
    if ( rank .eq. nprocs-1 ) then
        right = MPI_PROC_NULL
    else
        right = rank + 1
    endif

    allocate(w(0:nrl+1,0:ncl+1))

    ! Boundary conditions
    topBC=0.d0
    bottomBC=100.d0
    edgeBC=100.d0

    !Set boundary conditions
    if (rank==0) then
        w(:,0)=edgeBC
    endif
    if (rank==nprocs-1) then
        w(:,ncl+1)=edgeBC
    endif
    w(nrl+1,:)=bottomBC
    w(0,:)=topBC

    ! Initialize the array interior
    w(1:nrl,1:ncl)=50.

    ! Exchange halo values
    call MPI_SENDRECV(w(1:nrl,1),     nrl, MPI_DOUBLE_PRECISION, left,  tag, &
                      w(1:nrl,ncl+1), nrl, MPI_DOUBLE_PRECISION, right, tag, &
                      MPI_COMM_WORLD, mpi_stat, ierr)
    call MPI_SENDRECV(w(1:nrl,ncl),   nrl, MPI_DOUBLE_PRECISION, right, tag, &
                      w(1:nrl,0),     nrl, MPI_DOUBLE_PRECISION, left,  tag, &
                      MPI_COMM_WORLD, mpi_stat, ierr)

    !Spot-check result
    do i=0,nprocs-1
        if (i==rank) then
            write (*,*) rank, w(nrl/2,0), w(nrl/2,ncl+1)
        endif
    enddo

    call MPI_FINALIZE(ierr)

end program
Python
import sys
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()
N = 500
#Strong scaling
if N%nprocs==0:
    nrl = N//nprocs
else:
    print("Number of ranks should divide the number of rows evenly.")
    sys.exit()
#Weak scaling
#nrl = N
#Both
ncl = N
w = np.zeros((nrl+2, ncl+2), dtype=np.double)
#Set up boundaries
if rank == 0 :
    w[0,:] = 0.       # up
if rank == nprocs-1 :
    w[nrl+1,:] = 100. # bottom
w[:,0] = 100. # left
w[:,ncl+1] = 100. # right
#Arbitrary value for the interior.
#Be sure not to overwrite boundaries.
w[1:nrl+1,1:ncl+1] = 50.
# setting up the up and down rank for each process
if rank == 0 :
    up = MPI.PROC_NULL
else :
    up = rank - 1
if rank == nprocs - 1 :
    down = MPI.PROC_NULL
else :
    down = rank + 1
tag=0
# sending up and receiving down
comm.Sendrecv([w[1,1:ncl+1],MPI.DOUBLE], up, tag, [w[nrl+1,1:ncl+1],MPI.DOUBLE], down, tag)
# sending down and receiving up
comm.Sendrecv([w[nrl,1:ncl+1],MPI.DOUBLE], down, tag, [w[0,1:ncl+1],MPI.DOUBLE], up, tag)
# Spot-check result
for n in range(nprocs):
    if n==rank:
        print(n,w[0,ncl//2],w[nrl+1,ncl//2])
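These solutions can be checked with a small number of processes. With most MPI installations a command along the lines of mpiexec -n 4 ./halo (or mpiexec -n 4 python halo.py for the Python version, substituting whatever executable or file name you used) will run them; the launcher name and options vary by MPI distribution. For the strong-scaling variants, remember that the process count must divide 500 evenly, so 4 or 10 works but 3 does not.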