November 10th, 2008.
Yesterday at approximately 13:25 EST, the server 72.9.151.80 entered a "deadlock" state in which the operating system responded to commands but no traffic was passing through the network interface card. When the monitoring servers received no response, several alerts were sent out and the server was rebooted at 13:30 EST using a remote reboot port.
After a few minutes without a response, it was apparent that the server would require human intervention to bring it back online. The data center was contacted at 13:33 EST for assistance, and its staff manually rebooted the server at 13:46 EST.
When the server still failed to respond, the data center technician hooked up a KVM (Keyboard Video Mouse) and found that nothing was displaying on the screen, possibly due to a hardware failure in the power supply, CPU, or motherboard.
The data center technician continued to troubleshoot the hardware before determining that it would be best to do a chassis swap while keeping the existing hard drives. This process involves removing the hard drives from the failed server and putting them in a brand new server.
At approximately 15:30 EST, authorization was given to the data center to move forward with the chassis swap. The data center then proceeded to build and deploy the new server, which now has a Core2Quad Q9550 (4 x 2.83 GHz) CPU that is considerably faster than the previous CPU.
The server was deployed at 16:12 EST but failed to recognize the hard drives due to an incorrect BIOS setting. Once the BIOS setting was corrected, the hard drives were detected; however, further problems with the operating system prevented the server from coming online.
At 16:55 EST, all issues with the operating system were resolved and the server became fully operational again. The server has been rock solid since the chassis swap, and no further hardware problems are expected. There will be software maintenance later this week to upgrade Apache, and a notice will be sent out accordingly.
Total downtime was 3 hours and 30 minutes, which brings uptime well below the 99.9% guarantee. This was a rather unfortunate event, the first major downtime in well over a year, and we sincerely apologize for any inconvenience it may have caused.
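For anyone who wants to check the numbers, the arithmetic can be sketched as follows. This is only an illustration: the start and end times are taken from this report, but the 30-day measurement period is an assumption on our part for the example, since the exact window used for the guarantee is not stated here.

```python
from datetime import datetime, timedelta

# Outage window as given in this report (EST, the day before the notice).
outage_start = datetime(2008, 11, 9, 13, 25)
outage_end = datetime(2008, 11, 9, 16, 55)

# Assumption for illustration: the guarantee is measured over a 30-day month.
period = timedelta(days=30)
guarantee = 0.999

downtime = outage_end - outage_start
# Downtime budget that a 99.9% guarantee leaves over the assumed period.
allowed = period * (1 - guarantee)
# Uptime actually achieved over the assumed period, as a percentage.
uptime_pct = 100 * (1 - downtime / period)

print(f"downtime: {downtime}")
print(f"allowed:  {allowed}")
print(f"uptime:   {uptime_pct:.2f}%")
```

Under that assumption, a 99.9% guarantee allows roughly 43 minutes of downtime in the month, while this outage lasted 210 minutes, which works out to about 99.51% uptime.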
Clients with upcoming invoices have already had their due dates pushed back another month to honor our uptime guarantee. If you are on a PayPal subscription, please email sales[at]synhosting.com when your invoice is processed to receive a full refund for this month. All overdue notices for clients affected by this outage have been voided.
Please do not hesitate to email sales[at]synhosting.com if you have any concerns about SLA credits being issued or not being issued under the uptime guarantee.