ORBIT-USER: Error: Nodes do not come up

Thierry Rakotoarivelo Thierry.Rakotoarivelo at nicta.com.au
Tue Jul 31 05:56:00 EDT 2007


Hi Vijay,

The image "baseline.ndz" points to an image that does NOT contain 
nodeAgent4. Therefore nothing on the nodes could reply to nodeHandler4 
when you tried to run the experiment script.

In order to use nodeHandler4, you need a nodeAgent4 running on each of 
your nodes.
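
To list the nodes that nodeHandler4 gave up on, you can collect the 
"Giving up on node" warnings from its output. The following is only a 
minimal sketch in Python, assuming the log format shown in the quoted 
output below; the function name nodes_given_up is hypothetical and is 
not part of the ORBIT tools.

```python
import re

# Matches lines such as " WARN stdlib: Giving up on node n_16_2".
# The pattern is an assumption based on the log lines quoted in this thread.
GIVE_UP = re.compile(r"WARN stdlib: Giving up on node (n_\d+_\d+)")


def nodes_given_up(log_text):
    """Return node names from 'Giving up on node' warnings.

    Names are returned in order of first appearance, without duplicates.
    """
    seen = []
    for name in GIVE_UP.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


sample = """\
 WARN stdlib: Giving up on node n_16_2
 WARN stdlib: Giving up on node n_16_3
 INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11
"""
print(nodes_given_up(sample))
```

The pattern is not anchored to the start of the line, so it also works 
on output that has been quoted in an email with a leading "> ".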

Please re-image your nodes with either "baseline-2.2.ndz" or 
"baseline-2.3.ndz".

Regards,
Thierry.

Vijay Subramanian wrote:
> Hi,
> I imaged the nodes with baseline.ndz using:
> imageNodes4 [1..20,1..20] baseline.ndz
> 
> I then ran my script  using:
> nodehandler4 -k scriptname
> 
> However, none of the nodes seem to be coming up even though I was able
> to ping/ssh them. The nodes were reset twice and then the system gave
> up on them.
> Any ideas what the problem might be?
> 
> Exp id:  INFO run: Experiment grid_2007_07_31_05_02_47
> 
> I then ran the script again with a different set of nodes, but it did
> not help: same error. (The experiment ID for this run was
> grid_2007_07_31_05_19_28.)
> Any and all help is appreciated.
> Thanks,
> Vijay
> 
> Last few lines of output follow:
> 
>  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
>  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
>  WARN stdlib: Giving up on node n_16_2
>  WARN stdlib: Giving up on node n_16_3
>  WARN stdlib: Giving up on node n_17_1
>  WARN stdlib: Giving up on node n_1_8
>  WARN stdlib: Giving up on node n_15_5
>  WARN stdlib: Giving up on node n_20_4
>  WARN stdlib: Giving up on node n_8_1
>  WARN stdlib: Giving up on node n_20_3
>  WARN stdlib: Giving up on node n_12_4
>  WARN stdlib: Giving up on node n_18_6
>  WARN stdlib: Giving up on node n_11_7
>  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/11/11 - (still down: n_16_2,n_16_3,n_17_1)
>  INFO whenAll: *: 'apps/app/status[@value='INSTALLED.OK']' fires
> starting soon
>  INFO OML: Started: {"port"=>"7600", "iface"=>"eth2", "addr"=>"224.0.0.6"}
>  INFO Experiment: DONE!
>  INFO ExecApp: Application 'commServer' finished
>  INFO run: Experiment grid_2007_07_31_05_02_47 finished after 10:28
> 
> 
> 

-- 
-----
Thierry Rakotoarivelo
Networks and Pervasive Computing Group (NPC) - NICTA
Locked Bag 9013, Alexandria, NSW 1435, Australia
Tel. +61 2 8374 5245 / Fax. +61 2 8374 5531
Web. www.nicta.com.au


