Introduction to Parallel Debugging Tools


1. Introduction


Why do we need parallel debugging tools?

Limitations of parallel debugging tools


Commonly encountered problems when porting parallel code



2. Parallel Debugging Tools Installed on MSI Machines


On the IBM machines

On the Origins

On the Linux Cluster



3. Introduction to TotalView

TotalView is a Motif-based debugger for complex code. It is designed to reduce the frustration and delays inherent in developing complex and parallel code, and it provides strong thread debugging support. It supports the common parallel programming models, including MPI and OpenMP, so it is a suitable debugger if you employ parallelism.

TotalView has advanced support for C/C++ and F90, and understands constructs such as F90 modules and nested C++ templates. Its type mapping facility lets you display complex objects such as non-native types.

TotalView is a good choice for those working with parallelism or large amounts of data because it scales to large codes and data sets running on a large number of processes or processors. It is available on a wide variety of UNIX and Linux platforms; at the Supercomputing Institute these include the SP, the Origin, and the Netfinity Cluster.

Please see the instructions on how to use the TotalView debugger for detailed information.


4. Introduction to Pdbx on SP

pdbx is a command-line debugger built on dbx that adds functions specific to parallel programming on IBM SP systems. Because pdbx runs in the Parallel Operating Environment, it accepts all the flags supported by the poe command.

To use pdbx for interactive debugging, you first need to compile the C or Fortran 77 program and set up the execution environment as you would to invoke a parallel program with the poe command. Your program should be compiled with the -g flag in order to produce an object file with symbol table references. It is also advisable to avoid the optimization option, -O, because using the debugger on optimized code may produce inconsistent and erroneous results. For more information on the -g and -O compiler options, refer to the documentation for compiler commands such as cc and xlf.
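
The compile-then-debug sequence described above can be sketched as a small Python helper that only assembles the command lines and does not run them. The compiler wrapper name mpcc, the file names, and both helper function names are assumptions made for illustration; -procs is the poe flag for the number of tasks, passed to pdbx like any other poe flag. Check the local instructions for the exact commands on your system.

```python
def debug_build_command(source, output, compiler="mpcc", extra_flags=()):
    """Return a compile command suited to pdbx: -g on, optimization off.

    The default compiler name is an assumption; substitute your own.
    """
    # Drop any optimization flag (-O, -O2, ...) the caller passed in.
    flags = [f for f in extra_flags if not f.startswith("-O")]
    # Make sure symbol table information is produced.
    if "-g" not in flags:
        flags.insert(0, "-g")
    return [compiler, *flags, "-o", output, source]


def pdbx_command(program, nprocs):
    """Return a pdbx invocation; poe flags such as -procs follow the program."""
    return ["pdbx", program, "-procs", str(nprocs)]


if __name__ == "__main__":
    # Print the two commands instead of executing them.
    print(" ".join(debug_build_command("prog.c", "prog", extra_flags=["-O2"])))
    print(" ".join(pdbx_command("prog", 4)))
```

The helper strips optimization flags rather than rejecting them, so an existing list of build flags can be reused for a debug build.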

Please see the instructions on how to use the Pdbx debugger for detailed information.


5. Introduction to Xprofiler on SP

Xprofiler is a GUI-based performance profiling tool distributed as part of the IBM Parallel Environment for AIX. It can be used to graphically identify which functions in your code are the most CPU intensive. It provides a graphical function call tree as well as a text profile of your code. Xprofiler can be used to profile sequential and parallel C, C++, Fortran 90, Fortran 77, and HPF programs.

To use Xprofiler, you first compile and link your program to ensure that profiling is enabled, then run the program to produce gmon.out file(s) (one for each processor involved in the execution) and finally invoke the xprofiler utility to display the profiling information.
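
The three steps above can be sketched in Python as follows. The helper gathers the profile files left by a run and assembles the xprofiler command line without executing it. The gmon.out* naming pattern (for example, one file per task with a numeric suffix), the program name a.out, and the helper name are assumptions for illustration; -pg is the usual compiler flag for enabling profiling, but check the Xprofiler instructions for your compiler.

```python
import glob
import os


def xprofiler_command(program, run_dir="."):
    """Build an xprofiler command from the gmon.out files found in run_dir."""
    # Step 1 (not shown): compile and link with profiling enabled, e.g. -pg.
    # Step 2 (not shown): run the program; each process writes a gmon.out file.
    # The "gmon.out*" pattern is an assumption about how the files are named.
    profiles = sorted(glob.glob(os.path.join(run_dir, "gmon.out*")))
    if not profiles:
        raise FileNotFoundError("no gmon.out files found in " + run_dir)
    # Step 3: pass the executable followed by every profile file.
    return ["xprofiler", program, *profiles]


if __name__ == "__main__":
    try:
        print(" ".join(xprofiler_command("a.out")))
    except FileNotFoundError as err:
        print(err)
```

Raising an error when no profile files are found makes it obvious when the program was built without profiling enabled or was run in a different directory.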

Xprofiler provides CPU (busy) data only. It cannot be used to provide information such as I/O or communication data.

Please see the instructions on how to use Xprofiler.


6. Introduction to CVD on Origin

SGI cvd is the WorkShop Debugger, a source-level debugging tool that provides a window interface (views) for displaying program data and execution status. The WorkShop Debugger lets you set various types of breakpoints, watchpoints, and other types of signals. You can also view variables, expressions, structures, arrays, call stacks, and machine-level values. You can use the debugger to debug Ada, C, C++, Fortran 77, and Fortran 90 programs.

Because the Debugger is part of the WorkShop toolkit, you can move between tools within a debugging session by using the Launch Tool submenu of the Admin menu on the Debugger Main View.

Please see the man page (man cvd) for the available commands and how to use them.


7. References

SGI Insight book: http://ofs.msi.umn.edu:88/ebt-bin/nph-dweb/dynaweb