UK e-Science Grid, by Dr Steven Newhouse
The Grid
The ultimate vision of the Grid is to provide a ubiquitous infrastructure that allows flexible,
secure, coordinated resource sharing among dynamic collections of individuals, institutions, and
resources. Ideally, it will come to provide data and computing capability on demand to its users in
the same way that the electrical power grid delivers electricity to our homes, or the World Wide
Web delivers information through a web browser. Such a Grid infrastructure will be built through Grid
middleware: the software that brings together the resources owned by individuals within real
organisations to build virtual organisations (VOs). Within a VO, the services offered by an organisation,
and who is allowed to use them, are clearly defined, enabling a user to pick the services they need
to meet their own specific requirements.
The UK research community is currently in the middle of an ambitious three-year programme that is
demonstrating the opportunities offered by e-Science through a series of multi-disciplinary pilot
projects that couple the needs of applied scientists with the skills of computer scientists to
deliver innovative solutions to real-world requirements in engineering, physics, medicine and the
environment. These applied activities are supported by a 'Core' programme, partially funded by the
Department of Trade and Industry, which provides a network of eight regional e-Science centres, such
as the London e-Science Centre based at Imperial College London, to support their local e-Science
communities while engaging in collaborative projects with industry.
UK e-Science Grid
To support the UK e-Science activities, members of the Grid Engineering Task Force, drawn from the
regional e-Science centres, have over the last 18 months been deploying existing Grid middleware
products to build a UK e-Science Grid.
This has been built upon the current de facto standard Grid middleware - the
Globus Toolkit v2.0 - which comprises three core services:
- Job Submission: Enables a job to be started on a Grid resource
- Data Transfer: Allows the input or output data files of a job to be moved around the Grid
- Monitoring and Discovery: Provides a constantly updated status report of the Grid.
These core services use the Grid Security Infrastructure (GSI), implemented within the Globus
Toolkit by an established Public Key Infrastructure (PKI), to provide a basic security framework
that establishes identity and enables controlled access to Grid resources. This toolkit may be
deployed to form a basic but functional Grid, or used as a foundation to develop higher-level
services, such as schedulers that effectively match jobs to Grid resources.
Deployment of any Grid infrastructure presents many challenges. The UK e-Science Grid has been
built from the currently available resources at each regional e-Science Centre. Consequently, each
centre's resources are owned and managed by different organisations, leading to local management
policies and procedures that have to be integrated within those of the virtual organisation. The
heterogeneous nature of the resources - both in terms of the operating systems and architectures -
has made the maintenance of interoperable versions of the middleware extremely challenging.
To address some of the early challenges that have emerged from the management of the UK e-Science
Grid, work has focussed on two new software packages:
- The Grid Integration Test Scripts being led by Simon Cox at the Southampton e-Science Centre
- Virtual Organisation Management Portal being led by Steven Newhouse at the London e-Science Centre.
The Grid Integration Test Scripts are designed to mimic typical user actions within the Grid,
e.g. discover a resource, run a job on the resource, transfer a data file to the resource, etc.
These scripts are now being run by each regional centre against their own and other resources
within the UK Grid. Through this process we are able to test (typically daily) that the resources
are accessible and usable. This process has dramatically improved the reliability of the UK Grid
fabric.
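
The idea behind such integration tests can be illustrated with a minimal sketch. It is not the Grid Integration Test Scripts themselves: the check functions, resource names and result structure below are all hypothetical placeholders, and a real check would invoke the deployed Grid middleware rather than the stand-ins shown here. The sketch only shows the overall pattern of running each user-like action against each resource and collecting the failures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    resource: str
    check: str
    ok: bool
    detail: str = ""


# A check takes a resource name and raises an exception if the action fails.
Check = Callable[[str], None]


def run_checks(resources: List[str], checks: Dict[str, Check]) -> List[CheckResult]:
    """Run every check against every resource and record the outcome."""
    results = []
    for resource in resources:
        for name, check in checks.items():
            try:
                check(resource)
                results.append(CheckResult(resource, name, True))
            except Exception as exc:  # one failing check must not stop the others
                results.append(CheckResult(resource, name, False, str(exc)))
    return results


def failing_resources(results: List[CheckResult]) -> List[str]:
    """Return the sorted names of resources with at least one failed check."""
    return sorted({r.resource for r in results if not r.ok})


# Hypothetical stand-ins for real user actions (assumptions, not real middleware calls).
def fake_discover(resource: str) -> None:
    if resource.startswith("down"):
        raise RuntimeError("resource not found in the information service")


def fake_run_job(resource: str) -> None:
    pass  # a real check would submit a trivial job and wait for it to finish


def fake_transfer(resource: str) -> None:
    pass  # a real check would copy a small file to the resource and back


CHECKS = {"discover": fake_discover, "run_job": fake_run_job, "transfer": fake_transfer}

if __name__ == "__main__":
    report = run_checks(["alpha.example.org", "down.example.org"], CHECKS)
    for name in failing_resources(report):
        print("FAILED:", name)
```

Running the same set of checks from every centre against every resource, for example from a daily scheduled task, gives each site an outside view of whether its resources are actually usable.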
The Virtual Organisation Management portal provides a mechanism for enrolling new users into a
virtual organisation and monitoring how the resources are used within it. New users authenticate
themselves to the portal using an X.509 certificate issued by the UK e-Science CA or other approved
organisation, and once authorised by the VO manager, accounts are created on local resources under
the control of the local resource manager. The local resource manager has to install software to
download the latest copy of the authorisation file for the local resources and to provide a
mechanism to upload local usage information into the VOM portal. Activity taking place throughout
the VO may then be monitored centrally.
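
The enrolment and accounting flow described above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the VOM portal's implementation: the class, its method names, the issuer string and the distinguished names are all hypothetical, and certificate verification is reduced to comparing an issuer name, which a real portal would replace with proper X.509 validation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class VirtualOrganisation:
    trusted_issuers: Set[str]
    pending: Dict[str, str] = field(default_factory=dict)  # subject DN -> issuer
    authorised: Set[str] = field(default_factory=set)
    usage: Dict[str, float] = field(default_factory=dict)  # subject DN -> CPU hours

    def request_enrolment(self, subject_dn: str, issuer: str) -> None:
        """A new user presents a certificate; only approved issuers are accepted."""
        if issuer not in self.trusted_issuers:
            raise PermissionError(f"issuer not approved: {issuer}")
        self.pending[subject_dn] = issuer

    def approve(self, subject_dn: str) -> None:
        """The VO manager authorises a pending user."""
        if subject_dn not in self.pending:
            raise KeyError(f"no pending request for {subject_dn}")
        del self.pending[subject_dn]
        self.authorised.add(subject_dn)

    def authorisation_file(self) -> List[str]:
        """What a local resource manager would download to create accounts."""
        return sorted(self.authorised)

    def upload_usage(self, records: Dict[str, float]) -> None:
        """Local usage reported back to the VO; unknown users are ignored."""
        for subject_dn, hours in records.items():
            if subject_dn in self.authorised:
                self.usage[subject_dn] = self.usage.get(subject_dn, 0.0) + hours


if __name__ == "__main__":
    vo = VirtualOrganisation(trusted_issuers={"Example CA"})
    vo.request_enrolment("/O=Example/CN=alice", "Example CA")
    vo.approve("/O=Example/CN=alice")
    vo.upload_usage({"/O=Example/CN=alice": 2.5})
    print(vo.authorisation_file(), vo.usage)
```

Keeping the authorisation list as something each site pulls, and usage as something each site pushes, leaves account creation under the control of the local resource manager, as the text describes.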
Grid Activity at Imperial College London
The London e-Science Centre based at Imperial is actively developing and deploying Grid
infrastructures within the College to meet the diverse requirements of the research community.
Further details of the projects are on the Centre's web pages. As part of the Centre's activities as
a 'Sun Centre of Excellence in e-Science' we have deployed the open source Sun Grid Engine
Enterprise Edition across our resources encompassing different operating systems and architectures
within a single execution and management environment. The deployment of Condor
across the Department of Computing's desktop machines has supplemented the Centre's dedicated
computing resources. This has enabled scientists in the GENIE project to explore the sensitivity and
stability of the climate to deep ocean currents.
Future
Recent activity within the Grid community under the auspices of the Global Grid Forum
has led to the development of the Open Grid Services Architecture and the Open Grid Services
Infrastructure. The former is defining the types of services needed to support Grid activities, while
the latter leverages industrial standards from the web services community to define the standard
behaviour of such 'Grid Services'. A reference implementation of this infrastructure and the core set
of services is beginning to emerge from the Globus team in the form of Globus Toolkit 3. It is
expected that an experimental version of the UK e-Science Grid built using this infrastructure will
emerge in early 2004 before entering production use later in the year.
ICENI
The Imperial College e-Science Networked Infrastructure (ICENI) is next-generation Grid
middleware being developed at the London e-Science Centre to support its involvement in a
variety of e-Science projects. It is built using Java and Jini (a Java-based distributed
computing infrastructure) and can interoperate with the Open Grid Services Architecture.
A key goal within ICENI is to effectively exploit the distributed resources (computing,
storage, software, etc.) that comprise the Grid. To this end its primary focus is on
information relating to the jobs users want to run, the capability of the resources to meet
the job's needs, and the policy of the resource owners as to how they want their resources to
be used. Through an augmented component programming model it is possible to describe the
behaviour of a particular software component and how it relates to other components that it
is composed with to form an overall picture of the application's workflow. All this
information is exploited by the scheduler to effectively place the application on to the best
available resources.
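
How the three kinds of information (job requirements, resource capability and owner policy) might combine in a placement decision can be sketched as follows. This is a deliberately simple illustration, not ICENI's scheduler: ICENI is written in Java, and the data fields, the group-based policy and the single `speed` score used for ranking are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Job:
    owner_group: str  # who is asking (checked against the owner's policy)
    cpus: int         # what the job needs
    memory_gb: int


@dataclass
class Resource:
    name: str
    free_cpus: int            # capability currently on offer
    memory_gb: int
    allowed_groups: Set[str]  # resource owner's policy (hypothetical form)
    speed: float              # hypothetical relative performance score


def eligible(job: Job, resource: Resource) -> bool:
    """A resource qualifies only if policy allows the user and capability fits."""
    return (
        job.owner_group in resource.allowed_groups
        and resource.free_cpus >= job.cpus
        and resource.memory_gb >= job.memory_gb
    )


def place(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Pick the fastest eligible resource, or None if nothing qualifies."""
    candidates = [r for r in resources if eligible(job, r)]
    return max(candidates, key=lambda r: r.speed, default=None)


if __name__ == "__main__":
    pool = [
        Resource("cluster-a", free_cpus=16, memory_gb=64, allowed_groups={"climate"}, speed=1.0),
        Resource("cluster-b", free_cpus=64, memory_gb=256, allowed_groups={"physics"}, speed=2.0),
    ]
    chosen = place(Job(owner_group="climate", cpus=8, memory_gb=32), pool)
    print(chosen.name if chosen else "no suitable resource")
```

A scheduler that also understands the application's workflow, as described above, would make this decision per component and account for the data movement between them, which this sketch leaves out.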
The user is able to browse the available services and specify the application and workflow
structure through a graphical user interface. This is designed to hide the complexity of the
Grid infrastructure from the user. All the interactions with the resources are made through
a well-defined interface that authenticates users through their certificate and authorises
their use against the resource's access policy.
![[netbeans.jpg]](netbeans.jpg) Composing an ICENI application in the NetBeans environment