ORBIT-USER: About imageNodes for 100+ nodes

Nanyan Jiang jnyan at winlab.rutgers.edu
Wed Sep 27 08:55:25 EDT 2006


Hi Sachin,

    Thanks. I tried the shortcut you mentioned, and it worked yesterday.
However, this morning imageNodes did not seem to work, either for a
single node or for multiple nodes. The commands I used, followed by the
experiment IDs, were:

    imageNodes [10,11] file.ndz
    imageNodes [10..16,10..16] file.ndz

    Experiment ID: grid_2006_09_27_08_42_43
    Experiment ID: grid_2006_09_27_08_42_34

When I instead used
    imageNodes [9,16] file.ndz
    imageNodes [11..13,11..13] file.ndz
imaging worked fine.

The experiments were successful again. Are there any particular nodes
that should not be included in the imaging group (I guess node [10,11]
should be excluded)? Or are there any other issues to be aware of when
imaging the nodes? Thanks.
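
For reference, the range shorthand used in the commands above can be read as a cross product of two ranges. The Python sketch below is only an illustration of that reading, which is an assumption based on the explanation quoted further down ("row 1 and columns 1 to 10"). The helper names expand_range and expand_selector are hypothetical and are not part of imageNodes.

```python
def expand_range(spec):
    """Expand '10..16' into [10, ..., 16]; a bare '9' becomes [9]."""
    if ".." in spec:
        lo, hi = spec.split("..")
        return list(range(int(lo), int(hi) + 1))
    return [int(spec)]


def expand_selector(selector):
    """Expand a selector such as '[10..16,10..16]' into explicit pairs.

    Assumption: the selector is two comma-separated ranges, and every
    combination of the two is selected (hypothetical reading).
    """
    first, second = selector.strip().strip("[]").split(",")
    return [
        (a, b)
        for a in expand_range(first.strip())
        for b in expand_range(second.strip())
    ]


if __name__ == "__main__":
    for sel in ["[10,11]", "[10..16,10..16]", "[9,16]", "[11..13,11..13]"]:
        nodes = expand_selector(sel)
        # Report the group size and whether node (10, 11) is in it.
        print(sel, len(nodes), (10, 11) in nodes)
```

Under this reading, both selectors that failed include node [10,11] and neither of the selectors that worked does, which is consistent with the guess above.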

Best,

Nanyan

On Tue, 26 Sep 2006, Sachin Ganu wrote:

> Hi Nanyan,
>
> For 1) there is a shortcut that can be used (note that it may be
> cumbersome if your node set is not contiguous), e.g. to image nodes 1,1
> to 1,10, you can use the shortcut
>
> imageNodes 1,1..10 baseline.ndz (this means row 1 and columns 1 to 10)
>
> For 2) we are looking into that.
>
> On 9/26/06, Nanyan Jiang <jnyan at winlab.rutgers.edu> wrote:
>> Hi all,
>> 
>>     I want to have my applications running on over 100 nodes of ORBIT.
>> However, when I imageNodes on ORBIT, using
>>     imageNodes all file.ndz
>> 
>>     It will only image 60 nodes at a time. And if one of the nodes cannot
>> be checked properly (in my case, 1 of the 60 nodes, n_4_1, was still down),
>> the imaging process does not seem to start (unless I did not wait long enough).
>> (Experiment ID:  grid_2006_09_26_16_20_44). I have two questions here:
>> 
>>     (1) Is there a convenient way to image a large number of nodes on ORBIT
>> using the command imageNodes?
>> I am thinking of using
>>         imageNodes xyz file.ndz
>>       where xyz is a text file listing all the nodes that should get the
>> same image (this is not supported by imageNodes). It is really
>> hard to type over 100 node names on the command line, when
>> "imageNodes all file.ndz" only images the same 60 nodes at a
>> time. I may be missing other imageNodes options -- please let me know.
>> Thanks.
>> 
>>     (2) If a node misbehaves during imaging, is there a time-out
>> mechanism for that node, such that the imaging process can continue without
>> that node (with the missing node reported at the end of the process)?
>> Thanks.
>> 
>>     Best,
>> 
>>     Nanyan
>> 
>> 
>


