Run parallel processes on homogeneous hosts

Parallel jobs run on multiple hosts. If your cluster has heterogeneous hosts some processes from a parallel job may for example, run on Solaris. However, for performance reasons you may want all processes of a job to run on the same type of host instead of having some processes run on one type of host and others on another type of host.

You can use the same section in the resource requirement string to indicate to LSF that processes are to run on one type or model of host. You can also use a custom resource to define the criteria for homogeneous hosts.

Run all parallel processes on the same host type

bsub -n 4 -R"select[type==HP6 || type==SOL11] same[type]" myjob

Allocate 4 processors on the same host type—either HP, or Solaris 11, but not both.

Run all parallel processes on the same host type and model

bsub -n 6 -R"select[type==any] same[type:model]" myjob

Allocate 6 processors on any host type or model as long as all the processors are on the same host type and model.

Run all parallel processes on hosts in the same high-speed connection group

bsub -n 12 -R "select[type==any && (hgconnect==hg1 |
| hgconnect==hg2 || hgconnect==hg3)] same[hgconnect:type]" myjob

For performance reasons, you want to have LSF allocate 12 processors on hosts in high-speed connection group hg1, hg2, or hg3, but not across hosts in hg1, hg2 or hg3 at the same time. You also want hosts that are chosen to be of the same host type.

This example reflects a network in which network connections among hosts in the same group are high-speed, and network connections between host groups are low-speed.

In order to specify this, you create a custom resource hgconnect in lsf.shared.
Begin Resource
hgconnect       STRING        ()             ()           ()      (OS release)
End Resource
In the lsf.cluster.cluster_name file, identify groups of hosts that share high-speed connections.
Begin ResourceMap
hgconnect       (hg1@[hostA hostB] hg2@[hostD hostE] hg3@[hostF hostG hostX])
End ResourceMap
If you want to specify the same resource requirement at the queue level, define a custom resource in lsf.shared as in the previous example, map hosts to high-speed connection groups in lsf.cluster.cluster_name, and define the following queue in lsb.queues:
Begin Queue 
QUEUE_NAME = My_test 
NICE = 20 RES_REQ = "select[mem > 1000 && type==any && (hgconnect==hg1 || 
hgconnect==hg2 || hgconnect=hg3)]same[hgconnect:type]"
DESCRIPTION = either hg1 or hg2 or hg3
End Queue
This example allocates processors on hosts that:
  • Have more than 1000 MB in memory

  • Are of the same host type

  • Are in high-speed connection group hg1 or hg2 or hg3