http://www.linuxbios.org/index.php/Welcome_to_LinuxBIOS
Justification
LinuxBIOS looks like a good bet for ORBIT, if we can port it to the ORBIT node hardware. In theory, we have enough documentation of the ORBIT node hardware to do this. LinuxBIOS will make our node imaging process far more streamlined in the following ways:
- It is relatively difficult to service 400 simultaneous DHCP requests with our network infrastructure. There are COTS solutions, but these are overfeatured and therefore unreasonably expensive. Every node gets the same answer from the DHCP server for every request it sends (based upon its position in the grid). If we could pre-program nodes with their basic network identity by running our own BIOS, we could eliminate the DHCP step entirely and go straight to image download.
- It is also difficult to tftp a PXE image down to 400 nodes simultaneously. We want to use a multicast tftp (mtftp) server (orthogonal to Frisbee), but there is no mtftp client in our present BIOS.
- We may be able to provide other useful features in BIOS. For example, we could inventory the devices on nodes without booting even as much as a PXE image.
- We almost certainly have not yet encountered the full extent of the grid/cluster computing problems presented by an installation such as ORBIT. LinuxBIOS buys us a great deal of flexibility. Also, because LinuxBIOS is used primarily on similar installations, it may already contain solutions for problems we have not encountered yet.
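The first point above rests on each node's network identity being a pure function of its grid position. A minimal sketch of that idea follows. The 20 x 20 layout, the `10.10.0.0/16` addressing scheme, the `nodeR-C` hostname pattern, and the function name `node_identity` are all assumptions made for illustration, not the actual ORBIT configuration.

```python
import ipaddress

# Assumption: 400 nodes arranged as a 20 x 20 grid, indexed from 1.
GRID_ROWS = 20
GRID_COLS = 20

# Hypothetical address plan: third octet = row, fourth octet = column.
BASE_NETWORK = ipaddress.ip_network("10.10.0.0/16")


def node_identity(row: int, col: int) -> dict:
    """Return the static network identity for the node at (row, col).

    This is the value a DHCP server would otherwise hand out on every
    request; computing it from the position lets it be written into the
    firmware once instead.
    """
    if not (1 <= row <= GRID_ROWS and 1 <= col <= GRID_COLS):
        raise ValueError(f"position ({row}, {col}) is outside the grid")
    address = BASE_NETWORK.network_address + (row << 8) + col
    return {
        "hostname": f"node{row}-{col}",
        "ip": str(address),
        "netmask": str(BASE_NETWORK.netmask),
    }


if __name__ == "__main__":
    print(node_identity(3, 7))
```

Because the mapping is deterministic and collision-free within the grid bounds, the same table could be generated once and used to program each node, with no per-boot negotiation.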
Potential Problems
Upgrading the firmware on every ORBIT node will take a significant amount of time. It will also mean calibrating the nodes. However, updating the firmware and calibrating the radios can follow a documented procedure and be carried out by (relatively) unskilled labor. We estimate the ORBIT community can tolerate a day or two in which the grid is not available.
LinuxBIOS may be worse than what we have now. There is a chance we won't discover how much worse until the whole grid has been reprogrammed.