Changes between Version 60 and Version 61 of Documentation/CGettingStarted


Timestamp:
Jan 29, 2013, 3:54:58 AM (11 years ago)
Author:
seskar
  • Documentation/CGettingStarted

    v60 v61  
    11[wiki:WikiStart Orbit] > GettingStarted
     2
     3[[TOC(*)]]
    24
    35= How to get started =
    46
    5 First, you need an account. Please check the UsagePolicy if you are eligible. In order to get an account please [http://www.orbit-lab.org/userManagement/register register here].
     7First, you will need an ORBIT account. Please check the [wiki:Documents/About/UsagePolicy usage policy] to see whether you are eligible. Typically, to get an account you [http://www.orbit-lab.org/userManagement/register register for an account] and have it approved by the PI in charge of the project or institution you wish to join. If your institution is not listed, the appropriate PI can [http://www.orbit-lab.org/userManagement/orgReg register for an institutional account].
    68
    7 A typical experiment requires the following three steps:
     9A typical experiment requires the following five steps:
    810
    9  * [#Reservations Reservation]
    10  * [#LoadImage Loading an Image]
    11  * [wiki:Tutorial/RunningExpirment Running the experiment]
    12  * [wiki:Tutorial/AnalyzeResults Analyzing the results]
    13  * [#NextStep Next steps]
     11 * Creating reservation: before you can access the testbed, you need to [http://www.orbit-lab.org/scheduler/ make a reservation] and get it approved by the [wiki:Documentation/Scheduler] reservation service. First-time users are '''highly''' encouraged to reserve time on a sandbox instead of the main grid, and start with the built-in [wiki:/Tutorials/HelloWorld Hello World] experiment (including ability to access all 400 nodes in the grid).
     12 * [wiki:Documentation/Short/Login Log in to the reserved domain:] after you receive the confirmation email, you can access the reserved domain by SSH-ing into the corresponding console.
     13 * [#LoadImage Loading an Image on the nodes]
     14 * [#Run Running the experiment]
     15 * [#Analyze Analyzing the results]
    1416
    15 == Reservations == #Reservations
     17== Reservations == #Reserve
    1618
    17 As this is a wireless testbed, it is difficult to run multiple experiments without interference. Therefore, we currently only support one experiment at a time on the individual grid. In Orbit speak, a grid is a set of nodes together with the controlling console which can be used to run experiments. In the present setup, the testbed consists of a single large grid (main grid) with 400 nodes and an array of sandboxes i.e. "grids" with only 2 nodes and a console, which are development and test environments intended to reduce the time experimenters need on the main grid. Ideally, experimenters develop their software (application programs, routing protocols, measurement instrumentation, etc.) on off-site machines and then use the sandboxes for integration with the orbit environment and orbit software infrastructure. Once the experiment runs successfully in the sandbox environment, it can be moved to the main grid.
     19First time users are '''highly''' encouraged to reserve time on a sandbox instead of the main grid, and start with the built-in [wiki:/Tutorials/HelloWorld Hello World] experiment (including ability to access all 400 nodes in the grid).
    1820
    19 ||[[Image(Schedule-howto3.jpg, width=500)]]||
    20 ||Figure 1: Scheduler web page||
    21 
    22 Reservations for Orbit resources (the main grid or any of the sandboxes) can be made on the [https://www.orbit-lab.org/schedule/ ORBIT Schedule page]. The scheduler main screen is illustrated in Figure 1.
    23 
    24 To reserve a resource, first navigate to the table for the day you wish to make the reservation on (please note that you can advance the calendar by using navigation links at the bottom of the page; also you cannot reserve a resource for a date/time that has passed or the one for which you don't have permission). Once you have located the table for the requested day, click on the time slot you want to use as a start time of your resource reservation. This will open the form showing the detail of the reservation. Currently slot duration is limited to 2 hours per request.
    25 
    26 [[Image(Schedule-howto4.jpg, border:solid 2px black)]]
    27 [[Image(Schedule-howto5.jpg, title="Newly created reservation", align=right, width=250, border:solid 2px black)]]
    28 
    29 After saving the reservation, the pop-up window closes and the main scheduler table is updated with the newly created reservation slot in yellow, indicating that it is in the "pending approval" state. An email notification about the reservation is also sent to your registered email address.
    30 
    31 Reservation slots are approved by the scheduling service based on the [wiki:Documentation/Scheduling#ReservationApprovalPolicies two stage approval policy]. Once a slot has been approved, its color changes to dark blue, an approval email is sent to the requester, and the requester can access the console of the resource whose reservation was just approved.
    32 
    33 === Conflicts ===
    34 
    35 It is possible to ask for a particular slot even if another user has already reserved it. The procedure is the same as for requesting an empty slot, except that the slot's color changes to red once there are multiple simultaneous (conflicting) reservations, as shown in Figure 2.
    36 
    37 ||[[Image(Schedule-HowTo6.jpg)]]||
    38 ||Figure 2: Scheduling Conflicts||
    39 
    40 === Reservation Approval Policies ===
    41 
    42 The reservation approval process is based on a two-stage algorithm. In the first (pre-approval) stage, scheduling requests received before noon are pre-approved for the following day. For example, if you ask on Tuesday morning, before noon, for 4 to 6 in the evening on Wednesday, you will know for sure whether you have this time by 2 in the afternoon on Tuesday. Users are limited to two hours a day of pre-approved time on the main grid.
    43 
    44 Reservations that are made less than twelve hours in advance, or that exceed 2 hours a day, are approved automatically at the beginning of the slot (the second, or just-in-time, approval stage).
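
As a rough illustration of the two-stage policy, the sketch below classifies a request as pre-approved or just-in-time. It is one reading of the policy text, not the scheduler's actual code: the function name `approval_stage`, its parameters, and the choice of "noon on the day before the slot" as the cutoff are all assumptions made for this example.

```python
from datetime import datetime, time, timedelta

PREAPPROVED_HOURS_PER_DAY = 2    # stated daily limit on the main grid
MIN_LEAD = timedelta(hours=12)   # shorter lead times fall to just-in-time


def approval_stage(requested_at, slot_start, slot_hours,
                   hours_already_preapproved=0.0):
    """Return 'pre-approval' or 'just-in-time' for a reservation request.

    Assumption: a request qualifies for pre-approval only if it arrives
    before noon on the day before the slot, is at least 12 hours ahead,
    and keeps the user's pre-approved total for that day within 2 hours.
    """
    day_before = slot_start.date() - timedelta(days=1)
    cutoff = datetime.combine(day_before, time(12, 0))
    in_time = (requested_at < cutoff
               and slot_start - requested_at >= MIN_LEAD)
    within_quota = (hours_already_preapproved + slot_hours
                    <= PREAPPROVED_HOURS_PER_DAY)
    return "pre-approval" if in_time and within_quota else "just-in-time"


# The example from the text: asked Tuesday morning for Wednesday 4 to 6 pm.
tuesday_morning = datetime(2013, 1, 29, 10, 0)
wednesday_4pm = datetime(2013, 1, 30, 16, 0)
stage = approval_stage(tuesday_morning, wednesday_4pm, slot_hours=2)
```

Under these assumptions the Tuesday-morning request lands in the pre-approval stage, while the same request made on Wednesday morning, or a request that pushes the day's total past two hours, would be handled just-in-time.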
    45 
    46 === Conflict Resolution ===
    47 
    48 Conflicts will be resolved based on how much time you've already used over the last two weeks. Those who have used less time on the main grid in the last two weeks will be more likely to have their requests approved for the conflicting slots.
    49 
    50 Due to the complexity of the conflict resolution algorithm, please refrain from creating conflicts on slots that are less than 2 hours in the future, since the just-in-time approval process will not try to resolve conflicts.
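
The usage-based idea behind conflict resolution can be sketched as a simple ranking. This is a deliberately simplified, deterministic stand-in: the text only says that lighter users are "more likely" to win, and the real algorithm is described as complex. The `usage_log` tuple format and both function names are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=14)  # "the last two weeks"


def hours_used(usage_log, user, now):
    """Sum the hours `user` consumed within the last two weeks.

    usage_log is a list of (user, slot_start, hours) tuples; this format
    is made up for the example.
    """
    return sum(hours for who, start, hours in usage_log
               if who == user and now - WINDOW <= start <= now)


def rank_requesters(requesters, usage_log, now):
    """Order conflicting requesters, lightest recent user first."""
    return sorted(requesters, key=lambda u: hours_used(usage_log, u, now))


now = datetime(2013, 1, 29, 12, 0)
log = [
    ("alice", datetime(2013, 1, 25, 14, 0), 6),
    ("bob", datetime(2013, 1, 27, 10, 0), 2),
    ("bob", datetime(2013, 1, 5, 10, 0), 10),   # older than two weeks, ignored
]
ranking = rank_requesters(["alice", "bob"], log, now)
```

Here `bob` ranks ahead of `alice` because only usage inside the two-week window counts.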
     21[[Include(Documentation/Short/Login)]]
    5122
    5223== Loading an Image == #LoadImage
    5324
    54 During your approved time slot, you will be able to ssh into the console of the respective grid. A console is a dedicated machine that allows access to all resources on that grid.
     25When you have successfully logged in, you can start an experiment using the [wiki:Software/cOMF#ExperimentController Orbit Management Framework (OMF)].
    5526
    56 '''During your approved time slot''', you can then log into the '''console''' corresponding to the following table using SSH:
    57 || Name || Nodes || Console FQDN || Special Resources ||
    58 ||Main grid || 400 || console.grid.orbit-lab.org || USRP2, USRP1, Bluetooth, Zigbee, etc. ||
    59 ||Sandbox 1 || 2 || console.sb1.orbit-lab.org || None ||
    60 ||Sandbox 2 || 2 || console.sb2.orbit-lab.org || None ||
    61 ||Sandbox 3 || 2 || console.sb3.orbit-lab.org || USRP2 ||
    62 ||Sandbox 4 || 9 || console.sb4.orbit-lab.org || RF isolated nodes + mixer ||
    63 ||Sandbox 5 || 2 || console.sb5.orbit-lab.org || USRP1 ||
    64 ||Sandbox 6 || 2 || console.sb6.orbit-lab.org || WinC2R ||
    65 ||Sandbox 7 || 2 || console.sb7.orbit-lab.org || None ||
    66 ||Sandbox 8 || 2 || console.sb8.orbit-lab.org || None ||
    67 ||Sandbox 9 || 11 || console.sb9.orbit-lab.org || NetFPGA + OpenFlow ||
    68 ||Outdoor || Variable || console.outdoor.orbit-lab.org || Variable ||
    69 
    70 
    71 For example, to access sandbox 1:
    72 {{{
    73 yourhost>ssh username@console.sb1.orbit-lab.org
    74 }}}
    75 
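
The console names in the table follow a regular pattern, so the SSH target can be derived from a short resource name. The sketch below assumes that pattern holds for exactly the resources listed in the table; the helper names `console_fqdn` and `ssh_command` are invented for this example.

```python
# Short names taken from the console table: main grid, sandboxes 1-9, outdoor.
KNOWN_RESOURCES = {"grid", "outdoor"} | {f"sb{i}" for i in range(1, 10)}


def console_fqdn(resource):
    """Return the console FQDN for a resource short name such as 'sb1'."""
    if resource not in KNOWN_RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    return f"console.{resource}.orbit-lab.org"


def ssh_command(username, resource):
    """Build the argument list for logging in to a resource's console."""
    return ["ssh", f"{username}@{console_fqdn(resource)}"]


cmd = ssh_command("username", "sb1")
```

The resulting list can be passed to `subprocess.run` during an approved time slot; outside of one, the login is expected to be refused.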
    76 When you have successfully logged in, you can start an experiment using the [wiki:Software/cOMF#ExperimentController Orbit Management Framework (OMF)]. First time users are '''highly''' encouraged to reserve time on a sandbox instead of the main grid, and start with the built-in [wiki:/Tutorials/HelloWorld Hello World] experiment.
    77  1. Before we begin using the nodes, it's a good idea to check their status first. This is done with the omf stat command. This will typically produce a result like:
    78     {{{
    79  user@console.sb7:~$ omf stat
    80 
    81  INFO NodeHandler: OMF Experiment Controller 5.4 (git c005675)
    82  INFO NodeHandler: Slice ID: default_slice (default)
    83  INFO NodeHandler: Experiment ID: default_slice-2013-01-16t15.28.15-05.00
    84  INFO NodeHandler: Message authentication is disabled
    85  INFO Experiment: load system:exp:stdlib
    86  INFO property.resetDelay: resetDelay = 230 (Fixnum)
    87  INFO property.resetTries: resetTries = 1 (Fixnum)
    88  INFO Experiment: load system:exp:eventlib
    89  INFO Experiment: load system:exp:stat
    90  INFO Topology: Loading topology ''.
    91  INFO property.nodes: nodes = "system:topo:all" (String)
    92  INFO property.summary: summary = false (FalseClass)
    93  INFO Topology: Loading topology 'system:topo:all'.
    94  Talking to the CMC service, please wait
    95  -----------------------------------------------
    96  Domain: sb7.orbit-lab.org
    97  Node: node1-1.sb7.orbit-lab.org         State: POWEROFF
    98  Node: node1-2.sb7.orbit-lab.org         State: POWEROFF
    99  -----------------------------------------------
    100  INFO EXPERIMENT_DONE: Event triggered. Starting the associated tasks.
    101  INFO NodeHandler:
    102  INFO NodeHandler: Shutting down experiment, please wait...
    103  INFO NodeHandler:
    104  INFO run: Experiment default_slice-2013-01-16t15.28.15-05.00 finished after 0:6
    105     }}}
    106     Individual nodes are identified by their fully qualified domain name (FQDN). This establishes their "coordinates" and the "domain" to which they belong. Nodes in
    107     different domains can NOT see each other.
    108 
    109  2. A node can be in one of three states:
    110 
    111     || POWEROFF       || Node is Available for use but turned off ||
    112     || POWERON        || Node is Available and is on ||
    113     || NOT REGISTERED || Node is not Available for use ||
    114 
    115  3. It is recommended that the node be in the POWEROFF state prior to any experiment process. If the node is in the POWERON state you can use the omf tell command
    116     to get the node into the off state.
    117     {{{
    118     username@console.domain:~$ omf tell -a offh -t TOPOLOGY
    119     }}}
    120     The ''TOPOLOGY'' can take many forms, the simplest being a comma-separated list of FQDNs. There are also special predefined topologies such as: all, system:topo:circle, ...
    121     For more details see the [wiki:/Software/cOMF OMF documentation].
    122     If the node is in the NOT REGISTERED state, you may need to wait for it to return to the POWEROFF state (it sometimes takes a few moments for the services to sync up). If
    123     the node never comes out of the NOT REGISTERED state, please contact an administrator.
    124 
    125  4. Prior to the experiment, users need to install an image on the hard disks of the nodes. If you have not created a custom image use the default starting image:
    126     '''baseline.ndz'''. This image is built on top of '''Ubuntu 12.04''', and is pre-configured with the proper modules and start up scripts to take advantage of the rest of
    127     the Orbit services / hardware.  Loading an image is done with the [wiki:/Software/cOMF#load omf load command].
    128     {{{
    129     username@console.domain:~$ omf load -t TOPOLOGY -i IMAGENAME
    130     }}}
    131     Where ''TOPOLOGY'' is the set of nodes you wish to image, and !IMAGENAME is the name of the image you wish to load. The most common sandbox starting image command
    132     would look like
    133     {{{
    134     username@console.domain:~$ omf load -t all -i baseline.ndz
    135     }}}
    136     which will load all the nodes of sandbox 1 (totaling 2) with the [wiki:Documentation/SupportedImages baseline] image. An example run on sandbox 7 looks like:
    137     {{{
    138 user@console.sb7:~$ omf load -t all -i baseline.ndz
    139 
    140  INFO NodeHandler: OMF Experiment Controller 5.4 (git c005675)
    141  INFO NodeHandler: Slice ID: pxe_slice
    142  INFO NodeHandler: Experiment ID: pxe_slice-2013-01-16t14.56.02-05.00
    143  INFO NodeHandler: Message authentication is disabled
    144  INFO Experiment: load system:exp:stdlib
    145  INFO property.resetDelay: resetDelay = 230 (Fixnum)
    146  INFO property.resetTries: resetTries = 1 (Fixnum)
    147  INFO Experiment: load system:exp:eventlib
    148  INFO Experiment: load system:exp:imageNode
    149  INFO property.nodes: nodes = "system:topo:all" (String)
    150  INFO property.image: image = "baseline.ndz" (String)
    151  INFO property.domain: domain = "sb7.orbit-lab.org" (String)
    152  INFO property.outpath: outpath = "/tmp" (String)
    153  INFO property.outprefix: outprefix = "pxe_slice-2013-01-16t14.56.02-05.00" (String)
    154  INFO property.timeout: timeout = 800 (Fixnum)                                                                                         
    155  INFO property.resize: resize = nil (NilClass)
    156  INFO Topology: Loading topology 'system:topo:all'.
    157  INFO Experiment: Resetting resources
    158  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [0 sec.]
    159  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [10 sec.]
    160  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [20 sec.]
    161  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [30 sec.]
    162  INFO ALL_UP: Event triggered. Starting the associated tasks.
    163  INFO exp: Progress(0/0/2): 0/0/0 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 760 sec.
    164  INFO exp: Progress(0/0/2): 10/10/10 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 750 sec.
    165  INFO exp: Progress(0/0/2): 10/15/20 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 740 sec.
    166  INFO exp: Progress(0/0/2): 20/25/30 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 730 sec.
    167  INFO exp: Progress(0/0/2): 30/35/40 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 720 sec.
    168  INFO exp: Progress(0/0/2): 40/40/40 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 710 sec.
    169  INFO exp: Progress(0/0/2): 40/45/50 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 700 sec.
    170  INFO exp: Progress(0/0/2): 50/55/60 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 690 sec.
    171  INFO exp: Progress(0/0/2): 60/65/70 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 680 sec.
    172  INFO exp: Progress(0/0/2): 60/65/70 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 670 sec.
    173  INFO exp: Progress(0/0/2): 70/75/80 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 660 sec.
    174  INFO exp: Progress(0/0/2): 90/90/90 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 650 sec.
    175  INFO exp: Progress(1/0/2): 90/95/100 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 640 sec.
    176  INFO exp: Progress(2/0/2): 100/100/100 min()/avg/max (30) - Timeout: 630 sec.
    177  INFO exp:  -----------------------------
    178  INFO exp:  Imaging Process Done
    179  INFO exp:  2 nodes successfully imaged - Topology saved in '/tmp/pxe_slice-2013-01-16t14.56.02-05.00-topo-success.rb'
    180  INFO exp:  -----------------------------
    181  INFO EXPERIMENT_DONE: Event triggered. Starting the associated tasks.
    182  INFO NodeHandler:
    183  INFO NodeHandler: Shutting down experiment, please wait...
    184  INFO NodeHandler:
    185  INFO NodeHandler: Shutdown flag is set - Turning Off the resources
    186  INFO run: Experiment pxe_slice-2013-01-16t14.56.02-05.00 finished after 3:13
    187     }}}
     27 
    18828
    18929 5. The imaging process turns the nodes back off once imaging completes. At this point the nodes' disks are imaged with the ''baseline'' image.
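
The status check and state handling described in the steps above can be scripted against captured `omf stat` output. The sketch below is a minimal parser that assumes the `Node: ... State: ...` line layout shown in the sample run and the three state names from the state table. The function names and the wording of the suggested actions are illustrative only and are not part of OMF.

```python
import re

# Matches lines such as:
#   " Node: node1-1.sb7.orbit-lab.org         State: POWEROFF"
NODE_LINE = re.compile(r"Node:\s+(\S+)\s+State:\s+(.+?)\s*$")

# Follow-up per state, paraphrasing the steps in the text.
ACTIONS = {
    "POWEROFF": "ready for imaging",
    "POWERON": "power off with omf tell before continuing",
    "NOT REGISTERED": "wait for services to sync, else contact an administrator",
}


def parse_node_states(output):
    """Map each node FQDN found in captured `omf stat` output to its state."""
    states = {}
    for line in output.splitlines():
        match = NODE_LINE.search(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


def next_steps(states):
    """Suggest a follow-up for every node based on its reported state."""
    return {node: ACTIONS.get(state, "unknown state")
            for node, state in states.items()}


sample = """\
 INFO Topology: Loading topology 'system:topo:all'.
 Domain: sb7.orbit-lab.org
 Node: node1-1.sb7.orbit-lab.org         State: POWEROFF
 Node: node1-2.sb7.orbit-lab.org         State: POWERON
 INFO NodeHandler: Shutting down experiment, please wait...
"""
states = parse_node_states(sample)
steps = next_steps(states)
```

Log lines that merely mention `NodeHandler:` are skipped because the pattern requires the literal `Node:` label followed by a `State:` field.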