Introduction to OpenMP


What is OpenMP

  • Fork-Join model of parallel execution
    • Execution starts with one thread of control - master thread
    • Parallel regions fork off new threads on entry - team of threads
    • Threads join back together at the end of the region - only master thread continues
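
As a minimal sketch of the fork-join model above, the function below counts how many threads execute one parallel region. The module and function names are illustrative assumptions, not part of the original material. If the code is compiled without OpenMP support, the directives are treated as comments and the count is 1.

```fortran
MODULE fork_join_demo
   IMPLICIT NONE
CONTAINS
   ! Returns the size of the team that ran the parallel region.
   ! (Hypothetical helper, for illustration only.)
   INTEGER FUNCTION count_team_threads()
      INTEGER :: n
      n = 0
      ! Fork: the master thread creates a team on entry.
      !$OMP PARALLEL REDUCTION(+:n)
      n = n + 1          ! each thread in the team adds one
      !$OMP END PARALLEL
      ! Join: only the master thread continues from here.
      count_team_threads = n
   END FUNCTION count_team_threads
END MODULE fork_join_demo
```

With OMP_NUM_THREADS set to 4 the function would normally return 4, although the runtime is allowed to provide fewer threads.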


Participants


Support for OpenMP on MSI Systems


Control Constructs (Directives)


Do Scheduling

!$OMP PARALLEL DO &
!$OMP  SCHEDULE(STATIC,3)
DO J = 1, 36
     CALL Work(J)
END DO
!$OMP END PARALLEL DO

!$OMP PARALLEL DO &
!$OMP  SCHEDULE(DYNAMIC,1)
DO J = 1, 36
     CALL Work(J)
END DO
!$OMP END PARALLEL DO

!$OMP PARALLEL DO &
!$OMP  SCHEDULE(GUIDED,1)
DO J = 1, 36
     CALL Work(J)
END DO
!$OMP END PARALLEL DO
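
The STATIC case is the only one of the three whose assignment of iterations can be worked out in advance: chunks are dealt to the threads round-robin in thread-number order. The sketch below computes which thread runs a given iteration. It assumes a loop that starts at 1 with stride 1, and the function name and the team size used in the example are hypothetical.

```fortran
MODULE static_schedule_demo
   IMPLICIT NONE
CONTAINS
   ! Thread number (0-based) that executes iteration j (1-based)
   ! under SCHEDULE(STATIC,chunk) with a team of nthreads threads.
   ! Assumes the loop runs from 1 with stride 1.
   PURE INTEGER FUNCTION static_owner(j, chunk, nthreads)
      INTEGER, INTENT(IN) :: j, chunk, nthreads
      ! (j-1)/chunk is the 0-based chunk index; chunks go round-robin.
      static_owner = MOD((j - 1) / chunk, nthreads)
   END FUNCTION static_owner
END MODULE static_schedule_demo
```

For the 36-iteration loop with SCHEDULE(STATIC,3) and an assumed team of 4 threads, iterations 1-3 go to thread 0, 4-6 to thread 1, 7-9 to thread 2, 10-12 to thread 3, and 13-15 to thread 0 again. DYNAMIC and GUIDED assign chunks at run time, so no such formula exists for them.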

Nested Parallelism



Orphaned Directives

PROGRAM main
!$OMP PARALLEL
CALL foo()
CALL bar()
CALL error()
!$OMP END PARALLEL
END


SUBROUTINE error()
! Not allowed: foo() and bar() contain work-sharing constructs (DO, SECTIONS),
! which may not be nested inside this SECTIONS construct
!$OMP SECTIONS
!$OMP SECTION
CALL foo()
!$OMP SECTION
CALL bar()
!$OMP END SECTIONS
END


SUBROUTINE foo()
!$OMP DO
DO i = 1, n
       ...
END DO
!$OMP END DO
END


SUBROUTINE bar()
!$OMP SECTIONS
!$OMP SECTION
CALL section1()
!$OMP SECTION
...
!$OMP SECTION
...
!$OMP END SECTIONS
END
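
A minimal sketch of an orphaned directive in a form that can be compiled and called: the DO directive in fill_squares is not lexically inside any PARALLEL construct. All names here are illustrative assumptions. Called from inside a parallel region, the loop iterations are divided among the team; called outside one, the same routine simply runs serially.

```fortran
MODULE orphan_demo
   IMPLICIT NONE
CONTAINS
   ! Orphaned work-sharing loop (hypothetical example routine).
   SUBROUTINE fill_squares(a, n)
      INTEGER, INTENT(IN)    :: n
      INTEGER, INTENT(INOUT) :: a(n)
      INTEGER :: i
      !$OMP DO
      DO i = 1, n
         a(i) = i * i
      END DO
      !$OMP END DO
   END SUBROUTINE fill_squares

   ! Calls the same routine once inside and once outside a parallel region.
   SUBROUTINE run_both(a_par, a_ser, n)
      INTEGER, INTENT(IN)    :: n
      INTEGER, INTENT(INOUT) :: a_par(n), a_ser(n)
      !$OMP PARALLEL
      CALL fill_squares(a_par, n)   ! iterations shared by the team
      !$OMP END PARALLEL
      CALL fill_squares(a_ser, n)   ! no team: executed by one thread
   END SUBROUTINE run_both
END MODULE orphan_demo
```

Both arrays end up with the same contents, which is the point of orphaning: one routine serves parallel and serial callers.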

Synchronization

!$OMP PARALLEL
!$OMP DO
   DO I=2, N
      B(I) = (A(I) + A(I-1)) / 2.0
   ENDDO
!$OMP END DO NOWAIT
!$OMP DO
   DO I=1, M
     Y(I) = SQRT (Z(I))
   ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL

!$OMP PARALLEL DEFAULT(PRIVATE) SHARED(X,Y)
!$OMP CRITICAL (XAXIS)
       CALL DEQUEUE(IX_NEXT,X)
!$OMP END CRITICAL (XAXIS)

       CALL WORK(IX_NEXT,X)

!$OMP CRITICAL (YAXIS)
       CALL DEQUEUE(IY_NEXT,Y)
!$OMP END CRITICAL (YAXIS)

       CALL WORK(IY_NEXT,Y)

!$OMP END PARALLEL
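
The dequeue pattern above can be tried with a self-contained sketch in which the queue is replaced by a shared counter. This is an assumption made for illustration: DEQUEUE and WORK are not defined in the original, and all names below are hypothetical. Each thread takes the next item number inside a named critical section, then works on it outside the section.

```fortran
MODULE critical_demo
   IMPLICIT NONE
CONTAINS
   ! Hands out items 1..nitems one at a time from a shared counter.
   ! On return, done(k) holds how many times item k was processed.
   SUBROUTINE drain_queue(nitems, done)
      INTEGER, INTENT(IN)    :: nitems
      INTEGER, INTENT(INOUT) :: done(nitems)
      INTEGER :: next, mine
      done = 0
      next = 1
      !$OMP PARALLEL DEFAULT(NONE) SHARED(next, done, nitems) PRIVATE(mine)
      DO
         !$OMP CRITICAL (queue)
         mine = next          ! take the next item ...
         next = next + 1      ! ... and advance the shared counter
         !$OMP END CRITICAL (queue)
         IF (mine > nitems) EXIT
         ! "Work": item numbers are unique, so no two threads touch
         ! the same element and no further synchronization is needed.
         done(mine) = done(mine) + 1
      END DO
      !$OMP END PARALLEL
   END SUBROUTINE drain_queue
END MODULE critical_demo
```

Every entry of done should be exactly 1 regardless of the number of threads. Without the critical section, two threads could read the same value of next and process an item twice.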

Data environment


Example

INTEGER x(3), y(3), z
z = 0
!$OMP PARALLEL DO DEFAULT (PRIVATE), SHARED(x), &
!$OMP REDUCTION(+:z)
   DO k = 1, 3
      x(k) = k
      y(k) = k*k
      z = z + x(k) * y(k)
   END DO
!$OMP END PARALLEL DO
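
A self-contained variant of this example, written as a function so the result can be checked. It is a sketch under a few assumptions: the names are hypothetical, y is a private scalar instead of a private array, and DEFAULT(NONE) is used so that every data-sharing attribute is spelled out.

```fortran
MODULE reduction_demo
   IMPLICIT NONE
CONTAINS
   ! Returns the sum of k**3 for k = 1..n, computed as x(k)*y.
   INTEGER FUNCTION sum_of_cubes(n)
      INTEGER, INTENT(IN) :: n
      INTEGER :: x(n), y, z, k
      z = 0                 ! the reduction variable needs a defined start value
      !$OMP PARALLEL DO DEFAULT(NONE) SHARED(x, n) PRIVATE(y) REDUCTION(+:z)
      DO k = 1, n
         x(k) = k           ! shared array, each thread writes its own elements
         y    = k*k         ! private scratch value
         z    = z + x(k)*y  ! per-thread partial sums are combined at the end
      END DO
      !$OMP END PARALLEL DO
      sum_of_cubes = z
   END FUNCTION sum_of_cubes
END MODULE reduction_demo
```

For n = 3 the result is 1 + 8 + 27 = 36, independent of the number of threads.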

OpenMP Environment and Runtime Library


Compilation on the IBM SP

Related site, vendor-specific directives
Compilation
xlf_r -qsmp -qreport=smplist source.f

  • xlf_r: thread-safe version of xlf; recognizes the trigger constants IBM* and IBMT
  • -qsmp: parallelizes the code for SMP
  • xlf_r -qsmp: recognizes the trigger constants IBM*, SMP$, $OMP, IBMP, and IBMT
  • -qreport=smplist: generates a listing of the code transformations

Compilation on the Origin 2000


Designing Parallel Programs in OpenMP

  • Partition
    • Divide problem into tasks
  • Communicate
    • Determine amount and pattern of communication
  • Agglomerate
    • Combine tasks
  • Map
    • Assign agglomerated tasks to physical processors
  • Partition
    • In OpenMP, look for any independent operations (loop parallel, task parallel)
  • Communicate
    • In OpenMP, look for synch points and dependencies
  • Agglomerate
    • In OpenMP, mark parallel loops and/or parallel sections
  • Map
    • In OpenMP, implicit or explicit scheduling
    • Data mapping goes outside the standard

Irregular Mesh: The Problem

  • The Problem
    • Given an irregular mesh of values
    • Update each value using its neighbors in the mesh
  • The approach
    • Store the mesh as a list of edges
    • Process all edges in parallel
      • Compute contribution of edge
      • Add to one endpoint, subtract from the other


Irregular Mesh: Sequential Program

REAL x(nnode), y(nnode), flux
INTEGER iedge(nedge,2)
err = tol * 1e6
DO WHILE (err > tol)
   DO i = 1, nedge
      flux = (y(iedge(i,1))-y(iedge(i,2))) / 2
      x(iedge(i,1)) = x(iedge(i,1)) - flux
      x(iedge(i,2)) = x(iedge(i,2)) + flux
      err = err + flux*flux
   END DO
   err = err / nedge
   DO j = 1, nnode
      y(j) = x(j)
   END DO
END DO

Irregular Mesh: OpenMP Partitioning


Irregular Mesh: OpenMP Communication and Agglomeration


Irregular Mesh: OpenMP Program

!$OMP PARALLEL, DEFAULT(SHARED)
!$OMP SINGLE
err = tol * 1e6
!$OMP END SINGLE
DO WHILE (err > tol)
    !$OMP DO, PRIVATE(flux), REDUCTION(+:err)
    DO i = 1, nedge
        flux = (y(iedge(i,1))-y(iedge(i,2)))/2
        !$OMP ATOMIC
        x(iedge(i,1)) = x(iedge(i,1)) - flux
        !$OMP ATOMIC
        x(iedge(i,2)) = x(iedge(i,2)) + flux
        err = err + flux*flux
    END DO
    !$OMP END DO
    !$OMP SINGLE
    err = err / nedge
    !$OMP END SINGLE
    !$OMP DO
    DO j = 1, nnode
        y(j) = x(j)
    END DO
    !$OMP END DO
END DO
!$OMP END PARALLEL
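
To try the edge loop in isolation, the sketch below performs a single sweep over the edges and returns the mean squared flux. It is a simplified illustration, not the full iteration: the module and routine names are hypothetical, the error is accumulated from zero for this one sweep, and a combined PARALLEL DO is used instead of one enclosing parallel region.

```fortran
MODULE mesh_demo
   IMPLICIT NONE
CONTAINS
   ! One sweep over all edges: updates x from y, returns mean squared flux.
   SUBROUTINE edge_sweep(x, y, iedge, nnode, nedge, err)
      INTEGER, INTENT(IN)    :: nnode, nedge
      INTEGER, INTENT(IN)    :: iedge(nedge,2)
      REAL,    INTENT(INOUT) :: x(nnode)
      REAL,    INTENT(IN)    :: y(nnode)
      REAL,    INTENT(OUT)   :: err
      REAL    :: flux, s
      INTEGER :: i
      s = 0.0
      !$OMP PARALLEL DO PRIVATE(flux) REDUCTION(+:s)
      DO i = 1, nedge
         flux = (y(iedge(i,1)) - y(iedge(i,2))) / 2
         ! Two edges may share a node, so each update of x is atomic.
         !$OMP ATOMIC
         x(iedge(i,1)) = x(iedge(i,1)) - flux
         !$OMP ATOMIC
         x(iedge(i,2)) = x(iedge(i,2)) + flux
         s = s + flux*flux
      END DO
      !$OMP END PARALLEL DO
      err = s / nedge
   END SUBROUTINE edge_sweep
END MODULE mesh_demo
```

For a three-node chain with edges (1,2) and (2,3), y = (4, 2, 0), and x initially equal to y, both fluxes are 1, so one sweep gives x = (3, 2, 1) and err = 1.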

Irregular Mesh: OpenMP Mapping


Irregular Mesh: Bad Data Ordering


Irregular Mesh: Good Data Ordering


Irregular Mesh: OpenMP Program with Data Reordering

!$OMP PARALLEL, DEFAULT(SHARED)
CALL renumber_nodes( iedge, permute_node)
!$OMP DO
DO i = 1, nnode
    x( permute_node(i) ) = old_x(i)
END DO
!$OMP END DO NOWAIT
!$OMP DO
DO i = 1, nedge
    iedge(i,1) = permute_node(iedge(i,1))
    iedge(i,2) = permute_node(iedge(i,2))
END DO
!$OMP END DO
CALL sort_edges(iedge, nedge)
!$OMP SINGLE
err = tol * 1e6
!$OMP END SINGLE
DO WHILE (err > tol)
    !$OMP DO, SCHEDULE(STATIC), PRIVATE(flux), REDUCTION(+:err)
    ...
    !$OMP END DO
END DO
!$OMP END PARALLEL

OpenMP Summary


Acknowledgements