Changes in nodehandler to address message losses and other issues

Nodehandler tests

Imaging 400 nodes

  1. After starting nodehandler (for both imaging and experimentation), start the communication layer process (ind1).
  2. Four communication groups are created for imaging all nodes. Each group is responsible for a prespecified set of nodes. (This assignment could be moved to a config file.)
  3. The communication layer has to be started manually, but nodehandler terminates it automatically at the end of the experiment.
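The note above suggests moving the node-to-group assignment into a config file. The following is only a minimal sketch of what that could look like, not the nodehandler's actual format: the JSON layout, the group names, the node names, and the helpers `load_groups` and `group_of` are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical config: four communication groups, each listing its nodes.
# Group and node names are placeholders, not taken from the testbed.
EXAMPLE_CONFIG = """
{
  "group1": ["node1-1", "node1-2"],
  "group2": ["node2-1", "node2-2"],
  "group3": ["node3-1"],
  "group4": ["node4-1"]
}
"""


def load_groups(text: str) -> Dict[str, List[str]]:
    """Parse the group config and check that no node is listed in two groups."""
    groups = json.loads(text)
    owner: Dict[str, str] = {}
    for group, nodes in groups.items():
        for node in nodes:
            if node in owner:
                raise ValueError(
                    f"{node} is assigned to both {owner[node]} and {group}"
                )
            owner[node] = group
    return groups


def group_of(groups: Dict[str, List[str]], node: str) -> str:
    """Return the name of the group responsible for the given node."""
    for group, nodes in groups.items():
        if node in nodes:
            return group
    raise KeyError(node)


groups = load_groups(EXAMPLE_CONFIG)
```

Checking for duplicate assignments at load time keeps each node under exactly one group, which matches the "each group is responsible for prespecified nodes" description.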
  • Main steps
    1. 80 is the magic number for the group size.
    2. Switch on nodes in groups of 80.
    3. Retry up to three times.
    4. Give up on nodes that do not boot into PXE.
    5. Then switch on the next group of 80, and so on.
    6. Continue until whenAll is reached, then start the frisbee process.
    7. Switch off nodes in order of completion.
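The group-by-group boot procedure in the main steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the nodehandler code: `power_on` and `booted_into_pxe` are hypothetical callbacks standing in for whatever the nodehandler uses to switch a node on and to check its status, and "retry up to three times" is read here as three attempts in total.

```python
from typing import Callable, List, Sequence, Tuple

GROUP_SIZE = 80    # group size from the notes
MAX_ATTEMPTS = 3   # assumption: three power-on attempts per node in total


def chunk(nodes: Sequence[str], size: int) -> List[List[str]]:
    """Split the node list into consecutive groups of at most `size` nodes."""
    return [list(nodes[i:i + size]) for i in range(0, len(nodes), size)]


def boot_in_groups(
    nodes: Sequence[str],
    power_on: Callable[[str], None],          # hypothetical: switch a node on
    booted_into_pxe: Callable[[str], bool],   # hypothetical: did the node reach PXE?
    group_size: int = GROUP_SIZE,
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[List[str], List[str]]:
    """Boot nodes one group at a time; return (ready, given_up)."""
    ready: List[str] = []
    given_up: List[str] = []
    for group in chunk(nodes, group_size):
        pending = list(group)
        for _ in range(max_attempts):
            if not pending:
                break
            for node in pending:
                power_on(node)
            # A real run would wait for a boot timeout before checking status.
            pending = [n for n in pending if not booted_into_pxe(n)]
        failed = set(pending)
        given_up.extend(pending)
        ready.extend(n for n in group if n not in failed)
    return ready, given_up


# Simulated run: 400 nodes, two of which never boot into PXE.
all_nodes = [f"node{i}" for i in range(400)]
bad = {"node5", "node100"}
power_on_calls: List[str] = []
ready, given_up = boot_in_groups(
    all_nodes, power_on_calls.append, lambda n: n not in bad
)
```

Only the nodes in `ready` would then take part in the frisbee step; the nodes in `given_up` correspond to the ones the notes describe as excluded.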

Frisbee time is fairly constant; the main problem is the initial boot into the PXE image.

Total time taken: 49:23 (10 of the 400 nodes were excluded)

Tutorial with 100 nodes and OTG

The attached script was used to test the tutorial with 100 nodes.

Last modified on Sep 27, 2006, 4:41:06 PM

Attachments (1)
