Version 5 (modified by 18 years ago) ( diff ) | ,
---|
Operations
Hardware
System Overview
The ORBIT Testbed consists of 416 nodes, 26 Servers, and 45 ethernet switches. Nodes, servers, and switches are grouped into ORBIT resources which are referred to as "grid", and "sb1" through "sb8". The grid consists of the 400 nodes, a server that acts as a console, and 30 switches that are seperated into control, data, and CM networks. The eight sandboxes consist of 2 nodes, a console server, and a switch which aggregates all three networks.
Each resource is connected to the ORBIT back-end via the control, data, and CM networks. Each network of each resource is a seperate subnet following RFC 1981 and all route back to a Cisco PIX 515E Firewall apliance. Each subnet is connected to individual DMZ interfaces on the firewall and, therefore, has a set of security rules governing all traffic to and from each network. The firewall is configured to allow traffic from the external login machines to the ORBIT resources. Traffic generated on one resource will be blocked at the firewall if it's destination is in another resource. The purpose of this is the logical seperation of control planes for each resource; one user's experiment cannot interfere with that of another.
The Control network is comprised of 10 discrete switches on the grid, and shared switches on the sandboxes. Its purpose is to allow remote access to the nodes via ssh as well as provide a back channel for nodehandler communication and measurments collection.
Each resource shares the same ORBIT back-end which consists of 17 servers connected via a series of gigabit ethernet switches. The back-end servers run a variety of services ranging from industry standard services, such as DNS and DHCP, to ORBIT specific services.
Software
Access control to each resource is done via OpenLDAP. Each user is represented by an entry in the LDAP database with a set of attributes corresponding to the user's experiment group name, resource reservations, and email address. ORBIT services can use the information in this database to notify the user of scheduling conflicts, grant access to a resource for a requested time slot, and allow other users in his/her experiment group access to the same resources.
When a user requests a timeslot on a resource, the user accesses the ORBIT schedule webpage and selects slots. Each slot, by default, remains in the pending state until an administrator approves the request. To alleviate the human aspect of approving slots, an auto approver approves pending slots 3 minutes before the start time. Upon auto or manual approval, the schedule page generates and sends an email to the user's email address specified in the LDAP database informing the user of the state change. During the start of the slot, the auto approval service modify's the user's entry in LDAP to allow access to the approved resource. Once access is granted in LDAP, the console of the resource will detect the new entry in the user's LDAP profile and allow them access to that resource.