Welcome
- About This Guide
- What's New in the Platform LSF Version 6.1
- What's New in the Platform LSF Version 6.0
- Upgrade and Compatibility Notes
- Learning About Platform Products
- Getting Technical Support
1 About Platform LSF
- Cluster Concepts
- Job Life Cycle
2 How the System Works
- Job Submission
- Job Scheduling and Dispatch
- Host Selection
- Job Execution Environment
- Fault Tolerance
Part I: Managing Your Cluster
3 Working with Your Cluster
- Viewing Cluster Information
- Default Directory Structures
- Cluster Administrators
- Controlling Daemons
- Controlling mbatchd
- Reconfiguring Your Cluster
4 Working with Hosts
- Host States
- Viewing Host Information
- Controlling Hosts
- Adding a Host
- Removing a Host
- Adding Hosts Dynamically
- Adding Host Types and Host Models to lsf.shared
- Registering Service Ports
- Host Naming
- Hosts with Multiple Addresses
- Host Groups
- Tuning CPU Factors
- Handling Host-level Job Exceptions
5 Working with Queues
- Queue States
- Viewing Queue Information
- Controlling Queues
- Adding and Removing Queues
- Managing Queues
- Handling Job Exceptions
6 Managing Jobs
- Job States
- Viewing Job Information
- Changing Job Order Within Queues
- Switching Jobs from One Queue to Another
- Forcing Job Execution
- Suspending and Resuming Jobs
- Killing Jobs
- Sending a Signal to a Job
- Using Job Groups
7 Managing Users and User Groups
- Viewing User and User Group Information
- About User Groups
- Existing User Groups as LSF User Groups
- LSF User Groups
Part II: Working with Resources
8 Understanding Resources
- About LSF Resources
- How Resources are Classified
- How LSF Uses Resources
- Load Indices
- Static Resources
- Automatic Detection of Hardware Reconfiguration
9 Adding Resources
- About Configured Resources
- Adding New Resources to Your Cluster
- Configuring lsf.shared Resource Section
- Configuring lsf.cluster.cluster_name ResourceMap Section
- Static Shared Resource Reservation
- External Load Indices and ELIM
- Modifying a Built-In Load Index
10 Managing Software Licenses with LSF
- Using Licensed Software with LSF
- Host Locked Licenses
- Counted Host Locked Licenses
- Network Floating Licenses
Part III: Scheduling Policies
11 Time Syntax and Configuration
- Specifying Time Values
- Specifying Time Windows
- Specifying Time Expressions
- Automatic Time-based Configuration
12 Deadline Constraint and Exclusive Scheduling
- Deadline Constraint Scheduling
- Exclusive Scheduling
13 Preemptive Scheduling
- About Preemptive Scheduling
- How Preemptive Scheduling Works
- Configuring Preemptive Scheduling
14 Specifying Resource Requirements
- About Resource Requirements
- Queue-Level Resource Requirements
- Job-Level Resource Requirements
- About Resource Requirement Strings
- Selection String
- Order String
- Usage String
- Span String
- Same String
15 Fairshare Scheduling
- About Fairshare Scheduling
- User Share Assignments
- Dynamic User Priority
- How Fairshare Affects Job Dispatch Order
- Host Partition User-Based Fairshare
- Queue-Level User-Based Fairshare
- Cross-Queue User-Based Fairshare
- Hierarchical User-Based Fairshare
- Queue-Based Fairshare
- Configuring Queue-Based Fairshare
- Viewing Queue-Based Fairshare Allocations
- Typical Slot Allocation Scenarios
- Using Historical and Committed Run Time
- Users Affected by Multiple Fairshare Policies
- Ways to Configure Fairshare
16 Goal-Oriented SLA-Driven Scheduling
- Using Goal-Oriented SLA Scheduling
- Configuring Service Classes for SLA Scheduling
- Viewing Information about SLAs and Service Classes
- Understanding Service Class Behavior
Part IV: Job Scheduling and Dispatch
17 Resource Allocation Limits
- About Resource Allocation Limits
- Configuring Resource Allocation Limits
- Viewing Information about Resource Allocation Limits
18 Reserving Resources
- About Resource Reservation
- Using Resource Reservation
- Memory Reservation for Pending Jobs
- Viewing Resource Reservation Information
19 Advance Reservation
- About Advance Reservation
- Configuring Advance Reservation
- Using Advance Reservation
20 Dispatch and Run Windows
- Dispatch and Run Windows
- Run Windows
- Dispatch Windows
21 Job Dependencies
- Job Dependency Scheduling
- Dependency Conditions
22 Job Priorities
- User-Assigned Job Priority
- Automatic Job Priority Escalation
23 Job Requeue and Job Rerun
- About Job Requeue
- Automatic Job Requeue
- Reverse Requeue
- Exclusive Job Requeue
- User-Specified Job Requeue
- Automatic Job Rerun
24 Job Checkpoint, Restart, and Migration
- Checkpointing Jobs
- Approaches to Checkpointing
- Creating Custom echkpnt and erestart for Application-level Checkpointing
- Checkpointing a Job
- The Checkpoint Directory
- Making Jobs Checkpointable
- Manually Checkpointing Jobs
- Enabling Periodic Checkpointing
- Automatically Checkpointing Jobs
- Restarting Checkpointed Jobs
- Migrating Jobs
25 Chunk Job Dispatch
- About Job Chunking
- Configuring a Chunk Job Dispatch
- Submitting and Controlling Chunk Jobs
26 Job Arrays
- Creating a Job Array
- Handling Input and Output Files
- Redirecting Standard Input and Output
- Passing Arguments on the Command Line
- Job Array Dependencies
- Monitoring Job Arrays
- Controlling Job Arrays
- Requeuing a Job Array
- Job Array Job Slot Limit
Part V: Controlling Job Execution
27 Runtime Resource Usage Limits
- About Resource Usage Limits
- Specifying Resource Usage Limits
- Supported Resource Usage Limits and Syntax
- CPU Time and Run Time Normalization
28 Load Thresholds
- Automatic Job Suspension
- Suspending Conditions
29 Pre-Execution and Post-Execution Commands
- About Pre-Execution and Post-Execution Commands
- Configuring Pre- and Post-Execution Commands
30 Job Starters
- About Job Starters
- Command-Level Job Starters
- Queue-Level Job Starters
- Controlling Execution Environment Using Job Starters
31 External Job Submission and Execution Controls
- Understanding External Executables
- Using esub
- Working with eexec
32 Configuring Job Controls
- Default Job Control Actions
- Configuring Job Control Actions
- Customizing Cross-Platform Signal Conversion
Part VI: Interactive Jobs
33 Interactive Jobs with bsub
- About Interactive Jobs
- Submitting Interactive Jobs
- Performance Tuning for Interactive Batch Jobs
- Interactive Batch Job Messaging
- Running X Applications with bsub
- Writing Job Scripts
- Registering utmp File Entries for Interactive Batch Jobs
34 Running Interactive and Remote Tasks
- Running Remote Tasks
- Interactive Tasks
- Load Sharing Interactive Sessions
- Load Sharing X Applications
Part VII: Running Parallel Jobs
35 Running Parallel Jobs
- How LSF Runs Parallel Jobs
- Preparing Your Environment to Submit Parallel Jobs to LSF
- Submitting Parallel Jobs
- Starting Parallel Tasks with LSF Utilities
- Job Slot Limits For Parallel Jobs
- Specifying a Minimum and Maximum Number of Processors
- Specifying a Mandatory First Execution Host
- Controlling Processor Allocation Across Hosts
- Running Parallel Processes on Homogeneous Hosts
- Using LSF Make to Run Parallel Jobs
- Limiting the Number of Processors Allocated
- Reserving Processors
- Reserving Memory for Pending Parallel Jobs
- Allowing Jobs to Use Reserved Job Slots
- Parallel Fairshare
- How Deadline Constraint Scheduling Works For Parallel Jobs
- Optimized Preemption of Parallel Jobs
Part VIII: Monitoring Your Cluster
36 Achieving Performance and Scalability
- Optimizing Performance in Large Sites
- Tuning UNIX for Large Clusters
- Tuning LSF for Large Clusters
37 Event Generation
- Event Generation
38 Tuning the Cluster
- Tuning LIM
- Adjusting LIM Parameters
- Load Thresholds
- Changing Default LIM Behavior to Improve Performance
- Tuning mbatchd on UNIX
39 Authentication
- About User Authentication
- About Host Authentication
- About Daemon Authentication
- LSF in Multiple Authentication Environments
- User Account Mapping
40 Job Email, and Job File Spooling
- Mail Notification When a Job Starts
- File Spooling for Job Input, Output, and Command Files
41 Non-Shared File Systems
- About Directories and Files
- Using LSF with Non-Shared File Systems
- Remote File Access
- File Transfer Mechanism (lsrcp)
42 Error and Event Logging
- System Directories and Log Files
- Managing Error Logs
- System Event Log
- Duplicate Logging of Event Logs
- LSF Job Termination Reason Logging
43 Troubleshooting and Error Messages
- Shared File Access
- Common LSF Problems
- Error Messages
- Setting Daemon Message Log to Debug Level
- Setting Daemon Timing Levels
Part IX: LSF Utilities
44 Using lstcsh
- About lstcsh
- Task Lists
- Local and Remote Modes
- Automatic Remote Execution
- Differences from Other Shells
- Limitations
- Starting lstcsh
- Using lstcsh as Your Login Shell
- Host Redirection
- Task Control
- Built-in Commands
- Writing Shell Scripts in lstcsh
Index
Date Modified: June 06, 2005
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2005 Platform Computing Corporation. All rights reserved.