Changes in nodehandler to address message losses and other issues
Nodehandler tests
Imaging 400 nodes
- After starting nodehandler (for both imaging and experimentation), start the communication layer process (ind1).
- Four communication groups are created for imaging all nodes. Each group is responsible for a prespecified set of nodes. (This assignment could be moved to a config file.)
- The communication layer has to be started manually, but nodehandler terminates it automatically at the end of the experiment.
- Main steps
  - 80 is the magic number for the group size.
  - Switch on nodes in groups of 80.
  - Retry up to three times.
  - Give up on nodes that do not boot into PXE.
  - Then switch on the next group of 80, and so on.
  - Continue until whenAll is satisfied, then start the frisbee process.
  - Switch off nodes in order of completion.
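The group-wise power-on with retries described in the main steps can be sketched roughly as follows. This is a standalone illustration, not nodehandler code: `boot_in_groups`, `GROUP_SIZE` and `MAX_ATTEMPTS` are hypothetical names, the block passed in stands in for "switch the node on and check whether it reached PXE", and "retry up to three times" is assumed here to mean three attempts in total per node.

```ruby
# Hypothetical sketch of the batched power-on procedure; not the nodehandler API.
GROUP_SIZE   = 80 # group size taken from the notes
MAX_ATTEMPTS = 3  # assumption: three attempts in total per node

# Power on `nodes` in groups of `group_size`. The caller's block switches on
# one node and returns true if that node booted into PXE.
# Returns [booted, given_up].
def boot_in_groups(nodes, group_size: GROUP_SIZE, max_attempts: MAX_ATTEMPTS)
  booted = []
  given_up = []
  nodes.each_slice(group_size) do |group|
    pending = group
    max_attempts.times do
      break if pending.empty?
      # Try every pending node; the ones that fail stay pending for the next attempt.
      ok, pending = pending.partition { |node| yield(node) }
      booted.concat(ok)
    end
    # Nodes still pending after all attempts are abandoned.
    given_up.concat(pending)
  end
  [booted, given_up]
end

# Simulated run: node 7 never boots, node 90 boots on its second attempt.
attempts = Hash.new(0)
booted, given_up = boot_in_groups((1..400).to_a) do |node|
  attempts[node] += 1
  node != 7 && !(node == 90 && attempts[node] < 2)
end
puts "booted: #{booted.size}, given up: #{given_up.inspect}"
```

Only the nodes in the `booted` list would then take part in the frisbee step. The sketch finishes each group before moving to the next, which matches the "next group of 80" wording above.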
Frisbee time is fairly constant; the main problem is the initial boot into the PXE image.
Tutorial with 200 nodes and OTG
Attachments (1)
- nh-test-100nodes.rb (6.6 KB)