EPIC - Sun Grid Engine Integration with the Globus Toolkit
Introduction
This page describes how to configure a Globus Toolkit server so that
it can submit jobs for execution on a local Sun Grid Engine
installation. It includes links to the software packages needed to link
the two systems, which were developed here at the London e-Science Centre.
Features
- The packages have been tested with version 5.3 of SGE and SGEEE
(Enterprise Edition) and versions 2.2.3, 2.2.4, 3.0.1, 3.0.2 and 3.2.0
of the Globus Toolkit. Other versions may also work, but they have not
been tested.
- The packages support single, multiple, and mpi jobtypes.
  - Jobs of type single with a count greater than one are submitted as job arrays.
  - Jobs of type multiple are submitted as one job and execute their command(s) count times.
- The sysadmin can specify a default MPI Parallel Environment (PE), which will be used for MPI/multiple jobs. Only PEs that are integrated into SGE/SGEEE on your system are supported. By default, no PE is specified. This value can be overridden by the SGE_PE/GRD_PE value in an RSL job specification.
- PE and Queue Validation is also supported. The PEs and queues available on a system can be automatically discovered and added to the RVF validation file $GLOBUS_LOCATION/share/globus_gram_job_manager/sge.rvf. This feature is disabled by default.
- Email job monitoring is supported through the use of the RSL attributes email_address, emailonexecution, emailontermination, emailonabort and emailonsuspend.
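
The RSL attributes mentioned above can be illustrated with a small sketch that assembles a GRAM RSL string in Python. This is only an illustration, not part of the packages: the helper name build_rsl, the executable path, the PE name "mpich", the lower-case spelling sge_pe, and the value "yes" for emailontermination are all assumptions, so check the installed sge.rvf file for the attribute names and values your site accepts.

```python
def build_rsl(executable, count=1, jobtype="single", **extra):
    """Return a GRAM RSL string of the form &(name=value)(name=value)...

    Hypothetical helper for illustration only. Values are emitted unquoted,
    so values containing spaces or RSL special characters would need quoting.
    """
    attrs = {"executable": executable, "count": count, "jobtype": jobtype}
    attrs.update(extra)  # extra attributes keep the order they were passed in
    return "&" + "".join(f"({name}={value})" for name, value in attrs.items())


# An MPI job that overrides the default PE and asks for mail on termination.
rsl = build_rsl(
    "/home/user/a.out",                # hypothetical executable path
    count=8,
    jobtype="mpi",
    sge_pe="mpich",                    # assumed attribute spelling and PE name
    email_address="user@example.org",  # placeholder address
    emailontermination="yes",          # assumed value
)
print(rsl)
```

The resulting string could then be passed to a GRAM client such as globusrun. A job of type single with count greater than one would, per the feature list above, be submitted as a job array.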
Licensing
These packages are licensed under the terms of the Globus Toolkit Public License,
version 3, except for the file sge.in, which is licensed under the more permissive terms of the
GNU Lesser General Public License, v2.1.
Files
SGE Reporter package (GT2 only):
SGE Service configuration packages (GT3 only):
Toolkit-specific instructions
The installation instructions for the SGE integration code differ depending on the release of the Globus
Toolkit that you have deployed. Please select the version of the toolkit you have installed to obtain the appropriate instructions:
Changelog
- 24th Oct 2003 - Release of SGE JobManager 0.11; changes:
  - Job monitoring scalability improvements
  - Fix for "lost job output" issue
  - Fix for "job enters error state" issue
  - Workaround for "job array output" issue, see Globus bug #1287
- 8th Sept 2003 - Release of SGE JobManager 0.10; changes:
  - Modified job manager to support GT3 environment.
- 4th June 2003 - first release of SGE reporter.
- 2nd June 2003 - first release of SGE job manager.
Acknowledgements
The SGE integration code was originally written by Marko Krznaric (marko.krznaric@imperial.ac.uk) with some assistance from Paulo Tiberio Bulhoes; GT3 support and scalability improvements were contributed by David McBride (david.mcbride@imperial.ac.uk).
We would like to thank Dr. Constantinos Evangelinos at MIT and Andy Schwierskott at Sun Microsystems for their assistance and suggestions.
The MMJFS and MJS GT3 configuration packages are based on the PBS configuration packages which ship with GT3.
References
For further information, please contact lesc@imperial.ac.uk.