ORBIT-USER: most of the grid does not work!
chris at orderonenetworks.com
Fri Feb 16 14:44:53 EST 2007
I was a bit quick to post that code :) This version here seems better :)
Don't forget the 'chmod a+x gne'
gne
----------------------
#!/usr/bin/ruby
# Author: Chris Davies (chris at orderonenetworks.com)

# determine the output type
if (ARGV[0] == nil || (ARGV[0] != "nh" && ARGV[0] != "oh")) then
  puts " Good Node Extractor v1.0"
  puts " usage: gne nh/oh"
  puts " "
  puts " This utility extracts the nodes that imaged successfully"
  puts " from the last experiment that was run. The parameter"
  puts " specifies the output type"
  puts " "
  puts " nh - node handler format for use with imageNodes4"
  puts " oh - orbit handler format"
  puts " "
  exit
end

# get the most recent log file (ls -c lists the newest first)
command = "ls -c -1 /tmp/comm*.log"
firstfile = IO.popen(command,"r").readlines[0]
if (firstfile == nil) then
  puts "Error: log file not found."
  exit
end

# grep the file for nodes that imaged
command = "grep 'INFO.*Wrote' " + firstfile
if (ARGV[0] == "nh") then
  puts "defTopology('my:topo') { |t|"
end
IO.popen(command,"r").each { |line|
  # skip any matching line that does not carry a node name (n_X_Y>)
  line = line.split('n_')[1]
  if (line != nil) then
    line = line.split('>')[0].to_s
    first = line.split('_')[0]
    second = line.split('_')[1]
    if (first != nil && second != nil) then
      if (ARGV[0] == "nh")
        puts " t.addNode(" + first + "," + second + ");"
      else
        puts "["+first+","+second+"]"
      end
    end
  end
}
if (ARGV[0] == "nh")
  puts "}"
end
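
For anyone who wants to try the parsing step without a real log file, here is a minimal sketch of the same split logic pulled out into functions. The sample log line is made up (the real /tmp/comm*.log format may differ), and extract_node and format_node are hypothetical helper names, not part of gne.

```ruby
# Hypothetical helpers for illustration; not part of the gne script.

# Pull [x, y] out of a line containing a node name such as "n_1_10>".
# Returns nil when the line carries no such name.
def extract_node(line)
  name = line.split('n_')[1]
  return nil if name.nil?
  # to_s covers the case where nothing precedes the '>'
  x, y = name.split('>')[0].to_s.split('_')
  return nil if x.nil? || y.nil?
  [x, y]
end

# Render a node in node handler ("nh") or orbit handler ("oh") format.
def format_node(node, type)
  x, y = node
  if type == "nh"
    " t.addNode(#{x},#{y});"
  else
    "[#{x},#{y}]"
  end
end

# ASSUMPTION: this sample line is invented; check it against a real log.
sample = "INFO msg: <n_1_10> Wrote image"
node = extract_node(sample)
puts format_node(node, "nh")
puts format_node(node, "oh")
```

Keeping the parsing apart from the ls and grep calls makes it easy to check against a few pasted log lines before running it on the grid.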
> Ivan and Andrea,
>
> I'd prefer to have access to all the nodes - since the quantity counts :)
>
> I've come up with a little ruby script that may help. It basically parses
> the output from the last experiment and generates a list of nodes that
> imaged successfully. The list may be formatted in either 'nodehandler'
> format (for use with imageNodes4) or 'orbithandler' format.
>
> This basically allows you to image all the nodes, then stop the process
> when it images enough of them. Run the script to make the list of nodes
> that passed and then use that for the rest of your experiment.
>
> The code is new, so any bugs, please let me know.
>
> Thanks,
> Chris
>
>
> sample output:
> ------------------------------
>> gne nh
>
> defTopology('my:topo') { |t|
> t.addNode(1,10);
> t.addNode(8,4);
> t.addNode(9,1);
> }
>
>
> script (save as gne) (make sure to type 'chmod a+x gne' so it can run):
> -----------------------------------------
> #!/usr/bin/ruby
> # Author: Chris Davies (chris at orderonenetworks.com)
>
> # determine the output type
> if (ARGV[0] == nil || (ARGV[0] != "nh" && ARGV[0] != "oh")) then
> puts " Good Node Extractor v1.0"
> puts " usage: gne nh/oh"
> puts " "
> puts " This utility extracts the nodes that imaged successfully from the"
> puts " last experiment that was run. The parameter specifies the output type"
> puts " nh - node handler format for use with imageNodes4"
> puts " oh - orbit handler format"
> puts " "
> exit
> end
>
> # get the most recent log file
> command = "ls -c -1 -r /tmp/*.log"
>
> firstfile = IO.popen(command,"r").readlines[0]
>
> if (firstfile == nil) then
> puts "Error: log file not found."
> exit
> end
>
> # grep the file for nodes that imaged
> command = "grep Wrote " + firstfile
>
> if (ARGV[0] == "nh") then
> puts "defTopology('my:topo') { |t|"
> end
>
> IO.popen(command,"r").each { |line|
> #puts line
> line = line.split('msg: <n_')[1]
> line = line.split('>')[0]
>
> first = line.split('_')[0];
> second = line.split('_')[1];
> if (ARGV[0] == "nh")
> puts " t.addNode(" + first + "," + second + ");"
> else
> puts "["+first+","+second+"]"
> end
>
> }
>
> if (ARGV[0] == "nh")
> puts "}"
> end
>
>
>
>
>
>
>> Hi Andrea,
>>
>> If everybody agrees, we can try that. The problem is that people who
>> want to have as many nodes as possible even if they are not 100%
>> guaranteed to come up lose big time (once we declare nodes as
>> "administratively down" you can't access them at all). What do others
>> think about it?
>>
>> Ivan.
>>
>> -----Original Message-----
>> From: owner-orbit-user at winlab.rutgers.edu
>> [mailto:owner-orbit-user at winlab.rutgers.edu] On Behalf Of Andrea G Forte
>> Sent: Thursday, February 15, 2007 12:17 PM
>> To: orbit-user at winlab.rutgers.edu
>> Subject: Re: ORBIT-USER: most of the grid does not work!
>>
>> Ivan,
>>
>> thank you. This might be very helpful but perhaps I need to understand
>> it better. What is the difference between a node being off and being
>> unavailable? Currently the grid shows only 4 nodes as unavailable but
>> from my experiments there is a very large number of nodes that do not
>> turn on and others that turn on but do not complete the imaging process.
>>
>> In my opinion it would be very helpful to mark all of these nodes as
>> unavailable and just turn them off. In this way we would be able to
>> image the good nodes with minimum effort and without having to start the
>> imaging process again and again because of nodes getting stuck.
>> In other words, nodes that get stuck or do not turn on cause only
>> problems and should be disconnected.
>>
>> -Andrea
>>
>>
>> Ivan Seskar wrote:
>>>
>>>
>>>
>>>> From: owner-orbit-user at winlab.rutgers.edu
>>>>
>>> [mailto:owner-orbit-user at winlab.rutgers.edu] On Behalf Of Mesut Ali
>>> Ergin
>>>
>>>> Sent: Wednesday, February 14, 2007 10:38 PM
>>>> To: orbit-user at winlab.rutgers.edu
>>>> Subject: Re: ORBIT-USER: most of the grid does not work!
>>>>
>>>
>>> ...
>>>
>>> Just to add to this discussion: we are having problems with node power
>>> supplies (the ones with the red dots on the status page actually have
>>> dead power supplies). Unfortunately, the first symptoms of failing PSs
>>> are CM lockups and nodes stuck in on or off state; it looks like we
>>> will have to replace all of them which is not a trivial thing to do.
>>> We are trying to find ways of using an interim software solution that will
>>> (hopefully) prolong the life of power supplies as well as enable us to
>>> do incremental replacement (rather than force us to shut down the grid
>>> and replace all power supplies at once).
>>>
>>> Ivan.
>>>
>>> PS: Even better page for big grid status is
>>> http://www.orbit-lab.org/wiki/Status/Grid - it has a webcam feed as
>>> well :-). (status pages do not auto-refresh so you will have to do it
>>> manually - after all they are not really finished yet as you will
>>> discover if you try to select individual nodes).
>>>
>>
>>
>>
>
>
>