= List of Node Failures =

|| '''Node''' || '''Failure Mode''' || '''Solution / Notes''' ||
|| [1,5] || PXE Halt - Locks up during execution of PXE code || Multiple resets (more than 1) may be required [[BR]] Might require node change ||
|| [1,5] || Dead Node ID box top LED (the blinking one) || Power cycle fixed it [[BR]] Rabbit issue? ||
|| [3,8] || First Power on Halt || Locks during the first attempt [[BR]] POST after reset ||
|| [17,4] || First Power on Halt || Locks during the first attempt [[BR]] No serial console output ||
|| [1,14] || First Power on Halt || Locks during the first attempt [[BR]] Reset fixes it [[BR]] Has new disk ||
|| [20,19] || Disk Failure || Kernel throws errors during imaging [[BR]] Disk changed ||
|| [12,9] || Disk Controller Failure || Disk controller was having issues, disks were being incorrectly recognised ||
|| [3,18] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [5,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [14,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [13,5] || Lock Up || Rabbit and node were halted [[BR]] Power cycled ||
|| [4,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [5,9] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [9,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [3,19] || Bad Node || Motherboard failure, refused to boot [[BR]] Replaced ||
|| [14,8] || Disk Failure || Kernel throws disk errors [[BR]] Disk changed ||
|| [17,9] || Disk Failure || Disk write halts, imaging times out [[BR]] Disk replaced ||
|| [18,3] || Overheat || CM measures internal temp at 106°F, fails to boot reliably ||
|| [20,2] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [8,13] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [9,10] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [5,2] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [17,13] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [12,1] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [6,14] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [17,19] || Memory Failure || Memory pins did not make proper contact, bent case and reinserted memory ||
|| [7,2] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [5,15] || Lock Up || Rabbit and node were halted, node ID box LED was solid [[BR]] Power cycled ||
|| [7,2] || Lock Up || Rabbit and node were halted, node ID box LED was off [[BR]] Power cycled ||
|| [16,1] || Lock Up || Rabbit and node were halted [[BR]] Power cycled ||
|| [1,9] || Intermittent failure || Power cycled ||
|| [1,5] || Disk Failure || Failing disk caused disk controller to fail [[BR]] CM had issues also, both replaced ||
|| [9,4] || Disk Failure || Failing disk caused disk controller to fail [[BR]] CM had issues also, both replaced ||
|| [15,6] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [18,16] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [3,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [16,19] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [5,17] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [20,4] || Node Failure || Node was replaced ||
|| [15,4] || Node Failure || Node was replaced, bad left antenna connector. Replacement was used ||
|| [5,14] || Overheat || Fan was not plugged in ||
|| [17,4] || Disk Failure || smartctl reports impending disk death ||
|| [9,9] || Memory Failure || Memory pins did not make proper contact, bent case and reinserted memory ||
|| [11,4] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [12,7] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [13,2] || Disk Failure || Successfully booted from disk, but kernel was throwing disk errors ||
|| [16,6] || Disk Failure || SMART overall-health self-assessment test result: FAILED! ||
|| [13,5] || Disk Failure || Kernel throwing disk errors ||
|| [17,3] || Disk Failure || Kernel throwing disk errors ||
|| [14,12] || PXE Halt - Locks up during execution of PXE code || '''Not Fixed''' ||
|| [11,15] || Network Failure || PXE gives media check failure [[BR]] Node replaced ||
|| [19,6] || PXE Halt || Powers down during PXE ||
|| [15,7] || PXE Halt || Halts at random stages in the PXE image download process, before control is handed over to kernel ||
|| [16,8] || CM crash || Power cycled ||
|| [20,20] || CM crash || CM light stays solid, power cycled ||
|| [7,2] || CM crash || Node ID light stays off, power cycled ||
|| [2,20] || CM crash || CM light stays solid, power cycled ||
|| [14,12] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [10,7] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,18] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [1,15] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [8,3] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [2,11] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,16] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [7,8] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [18,7] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [2,17] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [5,19] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [7,2] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [12,4] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [1,8] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [18,18] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [14,20] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [9,16] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [4,6] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [6,8] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [3,13] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [5,4] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [10,5] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [10,8] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [8,8] || Network Failure || Kernel throws network hardware complaint [wiki:Internal/NodeFailureModes/Node8.8 during dhcp] ||
|| [12,4] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [8,10] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [15,17] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [10,2] || Disk Failure || BIOS does not detect disk [[BR]] Disk replaced ||
|| [1,6] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced [[BR]] hda: dma_timer_expiry: dma status == 0x21 ||
|| [18,12] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced [[BR]] hda: dma_timer_expiry: dma status == 0x21 ||
|| [1,10] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [13,8] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [12,12] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [8,8] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [2,3] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [2,14] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [13,17] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [16,17] || Disk Failure || Kernel throwing disk errors [[BR]] Disk replaced ||
|| [1,2] || Node Failure || Can't isolate problem: seems to overheat and [wiki:Internal/NodeFailureModes/Node1.2 kernel panic] ||
|| [6,20] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [20,20] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [18,19] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [16,17] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,17] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [1,3] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [2,3] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [6,15] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [13,9] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,10] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [15,2] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [13,7] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,1] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [15,10] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [7,10] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [11,9] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [3,14] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [8,12] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [15,5] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [2,13] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [19,2] || Disk Failure || Disk write errors [[BR]] Disk replaced ||
|| [14,7] || Disk Failure || Disk write errors [[BR]] Disk replaced ||