[Slides]

enabling the e-Scientist
Steven Newhouse
Technical Director
London e-Science Centre
Imperial College, London
Grid2001 - 12th November 2001
Contents
• The Grid and e-Science
• ICENI – Imperial College E-Science
Networked Infrastructure
• The need for higher-level meta-data
• Exploiting the meta-data
• Component Based Linear Solver
• Summary & Acknowledgements
From Computational Grids to e-Science
• The Grid: complex networked resources
– Don’t know which resources will be available and when
– Potentially fragile resource connectivity and availability
– Link instruments / storage / computation / HPC
• E-Science: higher-level applied use of these resources
– Complex applications have to retain performance
• Analysis of very large distributed data sets
• Coupled execution for multi-physics or visualisation
– Ease of use for domain specialists
• For the Developer – higher level abstractions
• For the User – higher level interaction, e.g. portals, PSE’s
– Better utilisation of resources and improve throughput
– Support scientific collaboration and knowledge management
Building Computational Communities
• Need to federate resources from real organisations
– Express ownership and retain control
– Grid resources are ‘not’ free – accountability / payment!
– Manage the resources as a single unit
• Computational Communities (a.k.a. Virtual Organisations)
– Share and combine heterogeneous resources
– Ensure transparency by hiding complexity from the user
– Optimise resource utilisation for all jobs and users
– Easy to contribute (and withdraw) resources
• Collect and retain information relating to:
– The resources
– The applications
– The users
ICENI
The Iceni, under Queen Boudicca,
united the tribes of South-East
England in a revolt against the
occupying Roman forces in AD60
• IC E-Science Networked Infrastructure
• Developed by Grid Middleware Group,
London E-Science Centre
• Collect and provide relevant Grid meta-data
• Use to define and develop higher-level services
Open Extensible Technologies
• XML used to define:
– Resource, application, user meta-data
– Protocol between different services in the system
• Java used to construct the framework, run-time
representation and interfaces to the assembled
components
• Jini used to provide a wide-area transport layer that
supports dynamic discovery and join
• Examining SOAP, WSDL, UDDI, .NET, SunONE etc.
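As a minimal sketch of XML-annotated resource meta-data, the Python fragment below parses a made-up resource description and separates static from dynamic attributes. The element and attribute names (`resource`, `attribute`, `dynamic`) and the `load` value are assumptions for illustration only; they are not the ICENI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource description. The tag and attribute names are
# illustrative assumptions, not the schema used by ICENI.
RESOURCE_XML = """
<resource type="computational" name="atlas">
  <attribute name="processor" dynamic="false">Alpha 667MHz</attribute>
  <attribute name="processors" dynamic="false">9</attribute>
  <attribute name="load" dynamic="true">0.25</attribute>
</resource>
"""


def parse_resource(xml_text):
    """Return (name, type, static attributes, names of dynamic attributes)."""
    root = ET.fromstring(xml_text.strip())
    static, dynamic = {}, []
    for attr in root.findall("attribute"):
        if attr.get("dynamic") == "true":
            # Dynamic values change over time, so only the name is recorded;
            # a client would query the resource for the current value.
            dynamic.append(attr.get("name"))
        else:
            static[attr.get("name")] = attr.text
    return root.get("name"), root.get("type"), static, dynamic


name, kind, static, dynamic = parse_resource(RESOURCE_XML)
print(name, kind, static, dynamic)
```

Keeping the description in XML means new attribute names can be added without changing the parser, which is the extensibility the slide asks for.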
Need Higher Level Knowledge
• Resources:
– Availability, capability, environment, access
• Application:
– Composition, behaviour, performance
• User:
– Who, what, when, where
Related talk: Anthony Mayer, Thursday 2.00, A201/205
Managing Internal Resources
• The resources advertise their capabilities to a
private domain
• Extensible resource abstractions:
– Computational resources
– Storage resources
– Software resources
– …
• Annotate using XML
[Figure: Private Administrative Domain containing several Resources, Managers (including the Resource Manager) and the Jini Lookup Service]
The Domain Manager
Sole route between the private and the public
areas of the infrastructure
• Imposes local usage requirements and access
and publicly advertises its resources
• Authenticates requests to use the local
resources (through the Identity Manager)
• Access route to & from multiple
computational communities
[Figure: Domain Manager on the boundary between the Private Administrative Domain, which contains the Identity Manager, and the Public area]
Trusting Organisations & Users
• Recognise three entities: individuals, groups
and organisations (or domains)
• Entities verified through the Identity Manager
using public key certificates (X.509)
• Locally managed access control lists determine
which entities have access to local resources
• Non-local users can be mapped to individual,
single, or guest accounts (resource-dependent
policy)
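The mapping from a verified entity to a local account can be sketched as a lookup against an access control list. This is an illustrative sketch only: it assumes certificate verification has already taken place, reads "single" as one shared account, and uses invented names (`Entity`, `ACL`, `map_to_account`) and entries that are not part of ICENI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """An already-verified requester (X.509 verification is assumed done)."""
    individual: str
    group: str
    organisation: str


# Hypothetical locally managed access control list.
# Each key is (entity kind, identifier); each value is a local account.
ACL = {
    ("individual", "alice"): "alice",              # individual account
    ("group", "solvers"): "solvers_shared",        # single shared account
    ("organisation", "partner.example.org"): "guest",  # guest account
}


def map_to_account(entity, acl=ACL):
    """Return the local account for an entity, or None if access is denied.

    The most specific match wins: individual, then group, then organisation.
    """
    for kind, value in (
        ("individual", entity.individual),
        ("group", entity.group),
        ("organisation", entity.organisation),
    ):
        account = acl.get((kind, value))
        if account is not None:
            return account
    return None


print(map_to_account(Entity("alice", "solvers", "partner.example.org")))
print(map_to_account(Entity("bob", "solvers", "partner.example.org")))
print(map_to_account(Entity("carol", "visitors", "partner.example.org")))
print(map_to_account(Entity("dave", "visitors", "unknown.example.org")))
```

Because the list is held locally, each resource owner can choose a different mapping, which is what makes the policy resource-dependent.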
Interaction with the Computational Community
[Figure: Users and Domain Managers connected to Public Computational Communities]
The Computational Community allows Users
& Domain Managers to advertise
their capabilities and requirements.
The Resource Browser GUI
Resources in a Computational Community
A resource might be:
- Available: shown with all its public
attributes and access policies. It is possible to
get the values of the dynamic attribute(s) and
to connect to the resource
- Temporarily unavailable: only the access
policies are shown
- Always unavailable: no information is
shown
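The three cases above amount to a small visibility rule, sketched below. The dictionary layout and the names `Availability` and `browser_view` are assumptions made for this example, not the Resource Browser's actual interface.

```python
from enum import Enum


class Availability(Enum):
    AVAILABLE = "available"
    TEMPORARILY_UNAVAILABLE = "temporarily unavailable"
    ALWAYS_UNAVAILABLE = "always unavailable"


def browser_view(resource, availability):
    """Return the fields a browser would show for a resource in a given state."""
    if availability is Availability.AVAILABLE:
        # Everything public is visible and the user may connect.
        return {
            "attributes": resource["public_attributes"],
            "access_policies": resource["access_policies"],
            "can_connect": True,
        }
    if availability is Availability.TEMPORARILY_UNAVAILABLE:
        # Only the access policies are visible.
        return {
            "access_policies": resource["access_policies"],
            "can_connect": False,
        }
    # Always unavailable: nothing is shown.
    return {}


# Hypothetical resource record used only to exercise the rule.
atlas = {
    "public_attributes": {"processor": "Alpha 667MHz"},
    "access_policies": ["members only"],
}
for state in Availability:
    print(state.value, browser_view(atlas, state))
```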
ICENI Architecture
[Figure: ICENI architecture diagram, split into Private and Public regions. Labels: Private Administrative Domain, Identity Manager, Policy Manager, ResourceManager, Computational Resource, Software Resource, CR, SR, Domain Manager, Public Computational Community, Resource Browser, Resource Broker, Application Mapper]
Domain Manager: gateway between private and public regions
Higher Levels of Abstraction
• Developer
– Create numerical implementations
– Provide performance & implementation meta-data
– Extensible abstract objects or interfaces
• Scientist
– Add domain specific knowledge to implementations
– Define (new?) component interfaces
– Specify component meta-data
• End-User
– Customise component instances
– Application defined by interacting component instances
– Use components in local (or remote) repositories
Separating Meta-Information from Implementation
[Figure: a Component Repository holds Code items, each with attached meta-data; the meta-data is used for application construction, optimisation, verification etc., and scheduling]
CXML – Component Meta-Data
[Figure: CXML workflow diagram. Labels: Developer; annotated Java wrapper class around a native library; Component + implementation; Repository; Components; User; Component Network; CXML; Implementations; Grid Resources; Optimiser; Execution Plan]
User Information
• Security
– Need secure trusted identity for use on the Grid
– Need to define what it can be used for
• Application
– Develop through visual programming
– Drag & drop functionality from elsewhere
• Policy
– Restrict where the application runs
• Accounting
– Which project (account) and how much?
– Defining resource exchange rates
Exploiting The Meta-data
• The Computational Community contains
information:
– On resource capability, usage and access policy
– On the user’s job and their application’s capability
• Uses this information to ensure:
– The user’s job is run as specified (e.g. deadline)
– All resources are fully utilised (if possible)
– The mapping of jobs to resources is ‘optimal’
Higher-level Grid Services
• Application Mapper
– Uses resource and application information to optimise
resource selection for an application
(e.g. are 16 PC processors better than 8 Alpha processors?)
• Resource Broker
– Uses computational economics to balance priorities
(e.g. 16 underused PC processors may cost less for longer
than 8 oversubscribed Alpha processors)
– Factor queue length into resource selection
(e.g. Will a slow unloaded resource provide shorter job
turnaround time than a faster but heavily loaded resource?)
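The trade-offs in the bullets above can be sketched as a selection over candidate offers. This is not ICENI's mapper or broker: the `Offer` fields, the `select` function, the policy names and every number are hypothetical, chosen only to show how run time, queue wait and price pull in different directions.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    name: str
    run_time_s: float    # estimated execution time (the mapper's concern)
    queue_wait_s: float  # estimated time waiting in the queue
    cost_per_s: float    # price per second of execution, arbitrary units

    @property
    def turnaround_s(self):
        # Turnaround counts queueing as well as execution.
        return self.queue_wait_s + self.run_time_s

    @property
    def total_cost(self):
        return self.run_time_s * self.cost_per_s


def select(offers, policy, deadline_s=None):
    """Pick the best offer under a policy, optionally subject to a deadline.

    policy is "min_time" (shortest turnaround) or "min_cost" (cheapest run).
    Returns None when no offer meets the deadline.
    """
    candidates = [
        o for o in offers if deadline_s is None or o.turnaround_s <= deadline_s
    ]
    if not candidates:
        return None
    keys = {
        "min_time": lambda o: o.turnaround_s,
        "min_cost": lambda o: o.total_cost,
    }
    return min(candidates, key=keys[policy])


# Invented figures: the PCs are slower but idle and cheap,
# the Alphas are faster but queued and expensive.
offers = [
    Offer("16 PC processors", run_time_s=400, queue_wait_s=0, cost_per_s=1.0),
    Offer("8 Alpha processors", run_time_s=200, queue_wait_s=100, cost_per_s=4.0),
]
print(select(offers, "min_time").name)
print(select(offers, "min_cost").name)
print(select(offers, "min_cost", deadline_s=350).name)
```

With these made-up figures the cheapest choice and the fastest choice differ, and adding a deadline can force the more expensive resource.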
Example: Linear Solver
[Figure: component network for the linear solver example. Components: Linear Equation Source, Linear Equation Solver, Display Vector. Data labels: DoF, Matrix, Unsymmetric Matrix, Vector. Implementation labels: LU, BiCG, C, Java, ScaLAPACK]
Local Computational Community
System       Processor             Number   Language          Solution
PC (Linux)   AMD 900MHz            1        Java + C LAPACK   LU + BCG; LU
Atlas        Alpha 667MHz          1,4,9    ScaLAPACK         LU + BCG
AP3000       UltraSparcII 300MHz   4,9,16   ScaLAPACK         LU + BCG
Dynamic Resource Cost
System       Cost Model 1   Cost Model 2
PC (Linux)   80             80
Atlas        475            425
AP3000       100            100
Influence of Resource Selection Policy on Execution Times
[Figure: time (s) against number of unknowns for three resource selection policies: Cost Model 1, Cost Model 2 and Minimum Time. Curve annotations name the selected implementation and resource: Java(1)/Atlas+BCG(4,9), Java(1)/Atlas+BCG(1,4,9), Java(1)/AP3000+BCG(4,9,16), Java(1)/Atlas+BCG(16), C(1)/Linux+BCG(1), C(1)/Atlas+BCG(16)]
Cost savings relative to minimum time
Layered Grid Architecture
Exploit Layered Architecture
Integration with the Grid
[Figure: the ICENI architecture diagram (Private Administrative Domain, Identity Manager, Policy Manager, ResourceManager, Domain Manager as gateway between private and public regions, Public Computational Community, Resource Browser, Resource Broker, Application Mapper, CR, SR) with additional Grid integration labels: SRB, HTC, Portals (JServlet), Applied Portals, Globus, Java CoG]
Summary
• To enable the e-Scientist to fully exploit the resources
in the Grid we need information relating to the:
– Resources
– Applications
– Users
• Represent the meta-data in an open, extensible manner
using XML-derived syntax.
• Collect and hold the meta-data in Java / Jini Grid
middleware, which has inherent fault-tolerance.
• Allow higher level services to use this meta-data to
transparently guide resource allocation within the
computational community for defined goals.
Acknowledgements
• London e-Science Centre, Grid Middleware Group:
– John Darlington, Steven Newhouse, Tony Field
– Anthony Mayer, Nathalie Furmento, Stephen McGough
– James Stanton, Yong Xie
• Further information:
– http://www-icpc.doc.ic.ac.uk/components/
– http://www.lesc.ic.ac.uk/
• Contact: [email protected]
• Funding: EPSRC GR/N13371
• Related talk: Anthony Mayer, Thursday 2.00, A201/205