Downtime Schedules

When downtime is required, users are informed by email.

Downtime schedules and other information are also posted in the system "Message-of-the-Day".

During downtime, some batch jobs may have to be removed. Each user with an affected job will receive a separate email.

Academic Use Policy

Job Scheduling

We have two job scheduling systems in use: the SGI systems use the LSF scheduler from Platform Computing (IBM), and the Mercury cluster uses PBS Torque/Maui from Adaptive Computing. All work is handled through these scheduling systems so that resources are managed effectively.

LSF summary

LSF is a workload management system. It uses a group of configurable queues and runs each job according to the job's resource requirements and the availability of system resources.
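As a rough illustration of how a job's resource requirements reach LSF, the Python sketch below assembles a bsub command line. The options used (-q queue, -n number of slots, -W run time limit, -J job name, -o output file) are standard bsub options, but the helper name build_bsub_command, the queue name "normal", and the program name are hypothetical; check the queue definitions on your system before using anything like this.

```python
import shlex


def build_bsub_command(command, queue, cores=1, runtime_minutes=None,
                       job_name=None, output_file=None):
    """Return the argument list for an LSF bsub submission (illustrative helper)."""
    args = ["bsub", "-q", queue, "-n", str(cores)]
    if runtime_minutes is not None:
        # -W sets the run time limit for the job, given here in minutes
        args += ["-W", str(runtime_minutes)]
    if job_name:
        args += ["-J", job_name]
    if output_file:
        # LSF replaces %J in the file name with the job ID
        args += ["-o", output_file]
    # The program to run and its arguments come last
    args += shlex.split(command)
    return args


if __name__ == "__main__":
    # "normal" and "./my_program" are placeholder names, not site-specific values
    cmd = build_bsub_command("./my_program input.dat", queue="normal",
                             cores=4, runtime_minutes=120,
                             job_name="demo", output_file="demo.%J.out")
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a login node, or the printed line can be pasted into a shell.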

A limit of 30 CPU minutes has been placed on all processes started interactively during a session. When an interactive task reaches this limit, it will be killed by the system. Jobs run via LSF are not affected by this limit and are controlled by the LSF queue definition.
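For a long-running interactive task, it can help to know how much CPU time has already been consumed. The Python sketch below is one possible way to check, using the standard resource module (Unix only). It assumes the 30 CPU-minute figure stated above; how the limit is actually enforced on these systems is not documented here, so the RLIMIT_CPU lookup is only a guess at where it might be visible.

```python
import resource

# The 30 CPU-minute interactive limit described above, in seconds
INTERACTIVE_CPU_LIMIT_SECONDS = 30 * 60


def cpu_seconds_used():
    """CPU time (user + system) consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def cpu_seconds_remaining(limit=INTERACTIVE_CPU_LIMIT_SECONDS):
    """Seconds of CPU time left before the given limit is reached."""
    return max(0.0, limit - cpu_seconds_used())


def soft_cpu_limit():
    """Soft RLIMIT_CPU in seconds, or None if the process has no such limit.

    Assumption: the site limit may or may not be implemented via RLIMIT_CPU.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_CPU)
    return None if soft == resource.RLIM_INFINITY else soft


if __name__ == "__main__":
    print(f"CPU seconds used:      {cpu_seconds_used():.1f}")
    print(f"CPU seconds remaining: {cpu_seconds_remaining():.1f}")
    print(f"Soft RLIMIT_CPU:       {soft_cpu_limit()}")
```

Work that is expected to come close to the limit is better submitted as an LSF job, since queued jobs are governed by the queue definition instead.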

Useful LSF files on Zeus: the directory /usr/local/lsf/ contains several Adobe Acrobat (PDF) files that can be downloaded and printed for your use:

  • lsf_qrefcard_6.0.pdf | A quick reference card for normal LSF commands
  • lsf_using_6.0.pdf | The user guide for LSF
  • running_jobs.pdf | A tutorial on running jobs with LSF

PBS Torque/Maui

PBS Torque is an open source resource manager derived from OpenPBS, a relative of the commercial PBS Pro, and is a standard in a large number of HPC environments. Maui is an open source job scheduler and the predecessor of Adaptive’s MOAB product. Together these programs manage scheduling and resource allocation across the entire cluster.
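To give a sense of what a Torque submission looks like, the Python sketch below generates a minimal PBS job script. The #PBS directives shown (-N, -l nodes=...:ppn=..., -l walltime=..., -j oe, -q) and the PBS_O_WORKDIR variable are standard Torque features, but the helper name build_pbs_script, the default values, and the program name are hypothetical; queue names and sensible resource limits depend on the Mercury configuration.

```python
def build_pbs_script(command, job_name, nodes=1, ppn=1,
                     walltime="01:00:00", queue=None):
    """Return the text of a minimal PBS/Torque job script (illustrative helper)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # wall-clock limit, HH:MM:SS
        "#PBS -j oe",                        # merge stderr into stdout
    ]
    if queue:
        lines.append(f"#PBS -q {queue}")     # submit to a specific queue
    lines += [
        "",
        # Torque starts jobs in the home directory; return to the submit directory
        "cd $PBS_O_WORKDIR",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_program" is a placeholder for the real executable
    script = build_pbs_script("./my_program input.dat", job_name="demo",
                              nodes=2, ppn=8, walltime="02:00:00")
    with open("job.pbs", "w") as handle:
        handle.write(script)
    print(script)
```

The generated file would then be submitted with qsub job.pbs, after which Torque tracks the requested resources and Maui decides when the job starts.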