Downtime this morning #outage


Hi All,

Well, that was fun. Here's what I know right now. At 8:28am Pacific Time, one of the backend machines appeared to freeze up in a weird way. This machine takes all changes to the main database (new messages, new activity logs, etc.) and inserts them into the search cluster. For reasons I do not yet understand, this set off a chain of events that started eating up all connections to the main database. That effectively took the site down at 8:34am, which is when I got paged the first time. It took me some time to figure out that the machine was frozen and to reboot it. The site came back at 8:52am.

The site is now functioning normally, and all email sent to groups during the outage should have been queued and sent once the site was back.

