ORBIT-USER: error when imaging nodes
Thierry Rakotoarivelo
Thierry.Rakotoarivelo at nicta.com.au
Sun Jul 29 19:59:56 EDT 2007
Dear all,
I re-restarted the CMC gridservice, and imaging seems to work now.
From the output of a 'ps' command, it seems there were multiple CMC
gridservice processes still running on the machine.
There's only one now, and I quickly tested imageNodes4, which seems
to work fine.
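
For anyone who hits the same symptom, here is a minimal sketch of the kind
of check I mean. The function name count_procs and the pattern "gridservice"
are only illustrative assumptions; the real process name on the machine
may differ, so adjust the pattern to whatever 'ps' actually shows.

```shell
#!/bin/sh
# count_procs PATTERN
# Reads "ps"-style lines on stdin and prints how many of them contain
# PATTERN as a plain substring (no regex). Prints 0 when nothing matches.
# Hypothetical helper, not part of the ORBIT tools.
count_procs() {
    awk -v pat="$1" 'index($0, pat) > 0 { n++ } END { print n + 0 }'
}

# Example use (pattern is an assumption). The ps output is captured first,
# so the awk process itself cannot show up in the list being counted:
#   snapshot=$(ps -eo pid,args)
#   printf '%s\n' "$snapshot" | count_procs gridservice
```

A count above 1 would suggest that stale instances are still running and
should be stopped before the service is started again.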
Regards,
Thierry.
Vijay Subramanian wrote:
> Hi,
> Thanks for the prompt attention but I am still getting the same error
> for ImageNodes4.
> I tried imageNodes which got a bit further but still failed as shown below.
>
> Any ideas on how to resolve this? While we are on this, can someone
> clarify the difference between imageNodes and imageNodes4?
>
> Thanks for any pointers.
> Vijay
>
> vijay at console.grid:~$ imageNodes [1..10,1..10] baseline.ndz
> Imaging nodes: [1..10,1..10] with image baseline.ndz
> Using config /etc/nodehandler/grid.cfg
> /etc/nodehandler/grid.cfg:20: warning: Insecure world writable dir
> /tmp, mode 040777
> Using logfile /etc/nodehandler/nodehandler_log.xml
> INFO init: NodeHandler Version 3.6.4-1 (849)
> INFO init: Experiment ID: grid_2007_07_29_18_37_50
> INFO Experiment: load system:exp:stdlib
> INFO prop.resetDelay: resetDelay = 180:Fixnum
> INFO Experiment: load system:exp:imageNode
> INFO prop.nodes: nodes = [[1..10, 1..10]]:Array
> INFO prop.image: image = "baseline.ndz":String
> /tmp/eee.466/lib/util/communication.rb:127: warning: Insecure world
> writable dir /tmp, mode 040777
> INFO stdlib: 100 out of 100 node(s) still down n_10_6,n_9_5,n_7_9
> INFO n_6_9: Checked in as /ip/10.10.6.9 booting off baseline:1.0.9
> WARN n_6_9: Expected image 'pxe:1.1.4', but node reported 'baseline:1.0.9'.
> FATAL run: ServiceException: ServiceException
> No testbed known for domain '50.10.10'
> INFO n_6_9: Resseting node
> ERROR comm: While processing command '/ip/10.10.6.9 0 WHOAMI 3.4.3
> baseline:1.0.9' Error: 'ServiceException'
> /tmp/eee.466/app/nodeHandler.rb:389:in `service_call':
> ServiceException (ServiceException)
> from /tmp/eee.466/lib/handler/cmc.rb:38:in `nodeOff'
> from /tmp/eee.466/lib/handler/cmc.rb:51:in `nodeAllOff'
> from /tmp/eee.466/app/nodeHandler.rb:831:in `shutdown'
> from /tmp/eee.466/app/nodeHandler.rb:1011
> done.
>
>
> On 29/07/07, Joseph F. Miklojcik III <jfm3 at winlab.rutgers.edu> wrote:
>> On Sun, 2007-07-29 at 18:12 -0400, Vijay Subramanian wrote:
>>> I am getting an error similar to that seen by Andrea earlier. It looks
>>> like the CM service needs to be restarted. Can someone look into this
>>> and do the needful ?
>> I have restarted the CMC. Hope it helps!
>>
>> (jfm3)
>>
>>
>
>
--
-----
Thierry Rakotoarivelo
Networks and Pervasive Computing Group (NPC) - NICTA
Locked Bag 9013, Alexandria, NSW 1435, Australia
Tel. +61 2 8374 5245 / Fax. +61 2 8374 5531
Web. www.nicta.com.au