IBM BladeCenter Linux Cluster

Introduction

The Institute has purchased a Linux cluster from IBM. It is an IBM BladeCenter H with 307 LS21 nodes. Each node has two dual-core 2.6 GHz AMD Opteron processors sharing 8 GB of memory. The system has a total of 1168 cores. It is scheduled to be available for production in January 2007, with approximately 40 TB of disk. The Institute plans to emphasize parallel jobs on this new cluster.

The definition of a processor is no longer as clear as it once was. Vendors now use the word core to refer to an independent processing element that sits on the same physical chip as one or more other independent processing elements. Each core still executes independently, so it is up to the user to write code that runs on multiple cores. Each dual-core AMD Opteron processor has two cores, and each IBM LS21 BladeCenter node has two such processors. Therefore, within a single node, one can run what should be called a 4-core job. Unfortunately, to add to the confusion, this is still referred to as a 4-processor job.
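As a rough illustration of the core count described above, the following Python sketch computes the number of cores in one node and divides a range of work items into one contiguous chunk per core. The function names `cores_per_node` and `split_range` are hypothetical and used only for illustration; how the chunks are actually run in parallel (MPI, threads, separate processes) is left open, since the text does not prescribe a method.

```python
# Node layout taken from the text: two dual-core Opteron processors per node.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 2


def cores_per_node(sockets=SOCKETS_PER_NODE, cores_per_socket=CORES_PER_SOCKET):
    """Return the number of independent cores available in one node."""
    return sockets * cores_per_socket


def split_range(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal chunks.

    Hypothetical helper: the first (n_items % n_workers) chunks get one
    extra item so that all items are covered exactly once.
    """
    base, extra = divmod(n_items, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    n = cores_per_node()
    for worker, chunk in enumerate(split_range(10, n)):
        print(f"core {worker}: items {list(chunk)}")
```

A "4-processor job" in the terminology above would then correspond to four workers, one per chunk, all running within a single node.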

Hardware and Configuration

There will be one type of compute node.

Each node has two dual-core 2.6 GHz AMD Opteron processors sharing 8 GB of memory.

Scratch Spaces

A local scratch directory is mounted at /scratch on each compute node. These node-local scratch spaces are not available on the interactive nodes. Two global scratch spaces, mounted at /scratch1 and /scratch2, are available on both the compute nodes and the interactive nodes.
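Because /scratch exists only on the compute nodes, a script that may run on either kind of node cannot assume it is present. The sketch below shows one way to pick a usable scratch directory at run time. The function name `pick_scratch` is hypothetical, and preferring the local /scratch over the global spaces is an assumption made for illustration, not a stated site policy.

```python
import os

# Mount points taken from the text. The order (local first, then global)
# is an assumption for this sketch, not a documented recommendation.
SCRATCH_CANDIDATES = ("/scratch", "/scratch1", "/scratch2")


def pick_scratch(candidates=SCRATCH_CANDIDATES):
    """Return the first candidate that is an existing, writable directory.

    Returns None if no candidate is usable, for example when the script
    runs on a machine where none of the scratch spaces are mounted.
    """
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


if __name__ == "__main__":
    print("scratch directory:", pick_scratch())
```

On an interactive node this would skip /scratch and fall through to one of the global spaces, provided they are writable by the user.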

Scratch directories are not backed up. All files in the scratch directories that have not been modified for 14 days will be deleted.
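To see which files are close to being removed under the 14-day rule, a user could scan a scratch directory for files whose modification time is older than the limit. The following is a minimal sketch, assuming the purge is based on each file's modification time (mtime) as the policy wording suggests; the actual purge mechanism is not described here, and the function name `files_at_risk` is hypothetical.

```python
import os
import time

PURGE_AGE_DAYS = 14          # from the stated scratch policy
SECONDS_PER_DAY = 86400


def files_at_risk(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Return a sorted list of files under root not modified for max_age_days.

    Assumes the purge looks at modification time; 'now' can be passed in
    to make the result reproducible.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Example: scan one of the global scratch spaces named in the text.
    for path in files_at_risk("/scratch1"):
        print(path)
```

Any file listed should be copied to backed-up storage if it is still needed, since scratch directories are not backed up.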

Network

The compute nodes are connected with InfiniBand and with Gigabit Ethernet.

Scientific and Math Libraries

Queues and Throttling Policies

Operations