Re: Downtime this morning

Here's a follow-up on this morning's downtime. It was caused by a message with a large attachment going to a group with a lot of members. More specifically, it was caused by some really inefficient attachment-encoding code. This overwhelmed the web server, and the machine rebooted. Rebooting that way left a lot of hanging connections to the database, which did not clear on their own, so the database had no available connections when the web server came back up. This repeated a few times.

I've rewritten the offending code, which is now orders of magnitude faster. I'm also looking at some other optimizations. I still don't know why the database connections didn't clear on their own.
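
For illustration only: the message above doesn't say what made the encoding slow, and this is not the actual groups.io code. One common way attachment encoding becomes expensive for a big group is re-encoding the same bytes once per recipient. The sketch below assumes that pattern and assumes base64 encoding; the function names and the recipient list are hypothetical.

```python
import base64


def encode_per_recipient(attachment: bytes, recipients: list[str]) -> dict[str, str]:
    """Slow pattern (assumed): base64-encode the same attachment for every recipient."""
    return {
        recipient: base64.b64encode(attachment).decode("ascii")
        for recipient in recipients
    }


def encode_once(attachment: bytes, recipients: list[str]) -> dict[str, str]:
    """Faster pattern: encode the attachment one time and share the result."""
    encoded = base64.b64encode(attachment).decode("ascii")
    return {recipient: encoded for recipient in recipients}
```

Both functions return the same mapping, but the second does the encoding work once regardless of how many members the group has, so its cost no longer grows with attachment size times member count.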

Anyways, things are back to normal.

Thanks,
Mark
