ORBIT-USER: Error: Nodes do not come up
Vijay Subramanian
subramanian.vijay at gmail.com
Tue Jul 31 05:30:03 EDT 2007
Hi,
I imaged the nodes with baseline.ndz using:
imageNodes4 [1..20,1..20] baseline.ndz
I then ran my script using:
nodehandler4 -k scriptname
However, none of the nodes came up, even though I was able to ping
and ssh into them. The nodes were reset twice and then the system gave
up on them.
Any ideas what the problem might be?
Experiment ID: grid_2007_07_31_05_02_47
I then ran the script again with a different set of nodes, but it did
not help; I got the same error. (The experiment ID for this run was
grid_2007_07_31_05_19_28.)
Any and all help is appreciated.
Thanks,
Vijay
Last few lines of output follow:
INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
WARN stdlib: Giving up on node n_16_2
WARN stdlib: Giving up on node n_16_3
WARN stdlib: Giving up on node n_17_1
WARN stdlib: Giving up on node n_1_8
WARN stdlib: Giving up on node n_15_5
WARN stdlib: Giving up on node n_20_4
WARN stdlib: Giving up on node n_8_1
WARN stdlib: Giving up on node n_20_3
WARN stdlib: Giving up on node n_12_4
WARN stdlib: Giving up on node n_18_6
WARN stdlib: Giving up on node n_11_7
INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
INFO whenAll: *: 'apps/app/status[@value='INSTALLED.OK']' fires
starting soon
INFO OML: Started: {"port"=>"7600", "iface"=>"eth2", "addr"=>"224.0.0.6"}
INFO Experiment: DONE!
INFO ExecApp: Application 'commServer' finished
INFO run: Experiment grid_2007_07_31_05_02_47 finished after 10:28
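
To see at a glance which nodes were abandoned, output like the above can be summarized with a short script. This is only a rough sketch: the function name summarize and the two regular expressions are assumptions based on the log lines shown here, not part of nodehandler4, and the sample text is a few of those lines with the wrapped ones rejoined.

```python
import re

# Sample lines copied from the output above (wrapped lines rejoined).
LOG = """\
INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
WARN stdlib: Giving up on node n_16_2
WARN stdlib: Giving up on node n_16_3
WARN stdlib: Giving up on node n_17_1
"""

# Patterns are assumptions derived from the message formats seen in this log.
WAITING = re.compile(r"Waiting for nodes \(Up/Down/Total\): (\d+)/(\d+)/(\d+)")
GAVE_UP = re.compile(r"Giving up on node (\S+)")


def summarize(log_text):
    """Return (last Up/Down/Total tuple or None, list of abandoned nodes)."""
    counts = None
    abandoned = []
    for line in log_text.splitlines():
        match = WAITING.search(line)
        if match:
            # Keep only the most recent status report.
            counts = tuple(int(group) for group in match.groups())
        match = GAVE_UP.search(line)
        if match and match.group(1) not in abandoned:
            abandoned.append(match.group(1))
    return counts, abandoned


counts, abandoned = summarize(LOG)
print("Up/Down/Total:", counts)
print("Gave up on:", ", ".join(abandoned))
```

Run against the full output, this would list all eleven nodes named in the WARN lines, which makes it easier to compare the two experiment runs.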
--
Networks Lab, RPI
http://poisson.ecse.rpi.edu/~vijay