Site updates #changelog

 

This is a different #changelog from normal. Almost everything outlined below represents internal changes to the site. This likely won't make sense to most people, but I'm writing it all down so that I can remember it in the future. The one new thing that will affect people is the new FBL subscriptions, outlined below.

I switched main database machines late last week because of a (minor) corruption issue. The new main database machine was running on the same type of hardware as the old machine, and I had tested it for several weeks beforehand. However, starting at the beginning of this week (we're busiest on Mondays), it was clear that for some reason the new database machine was not up to handling the load of the site. The site was crashing regularly, and I was getting paged in the middle of the night. It took me a few days to rule out other issues and conclude that it was a kernel issue. The new database machine was running an older Linux kernel, which normally shouldn't affect performance, but I believe in this case it did. As part of my debugging process, I optimized many database access patterns, which will be great going forward. I have changed the Linux kernel on all database machines, and, combined with the other optimizations I did, I believe that the performance issues have been addressed. Hopefully.

At the same time, email to Comcast, Cox and GoDaddy was being delayed. I was finally able to establish contact with both Comcast and Cox and got the throttling lifted. As part of that, Comcast asked that I subscribe to their FBL (Feedback Loop) service. An FBL is how we get notified when someone marks an email of ours as spam. So we are now receiving spam reports from several more email providers, including Comcast, Cox, Fastmail, Mail.ru, and others. Comcast's other recommendation was that we reuse SMTP connections when sending email to them, something we were not doing. So I spent the time to implement that. As of now, email to Comcast and Cox is flowing again. Email to GoDaddy/Secureserver is still being throttled, and I have not had any luck contacting them yet.
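
For readers curious what SMTP connection reuse looks like in practice, here is a minimal Python sketch. It is not karld's actual code. The function name send_batch, the mx_for_domain lookup, and the injectable connect factory are all assumptions made for illustration. The idea is simply to group outgoing messages by recipient domain and send each group over one connection instead of opening a new one per message.

```python
import smtplib
from collections import defaultdict
from email.message import EmailMessage
from email.utils import parseaddr


def send_batch(messages, mx_for_domain, connect=smtplib.SMTP):
    """Send messages using one SMTP connection per recipient domain.

    messages: iterable of EmailMessage, each with a single "To" address.
    mx_for_domain: hypothetical callable mapping a domain to a mail host.
    connect: factory returning an smtplib.SMTP-like object.
    Returns the number of connections opened.
    """
    by_domain = defaultdict(list)
    for msg in messages:
        address = parseaddr(msg["To"])[1]
        domain = address.rsplit("@", 1)[1].lower()
        by_domain[domain].append(msg)

    opened = 0
    for domain, batch in by_domain.items():
        conn = connect(mx_for_domain(domain))  # one connection per domain
        opened += 1
        try:
            for msg in batch:
                # Each message is its own SMTP transaction on the same connection.
                conn.send_message(msg)
        finally:
            conn.quit()
    return opened
```

A real sender would also need to handle a provider closing the connection mid-batch and any per-connection message limits a provider imposes; both are left out here.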

Notes: karld refers to our email sending server, named after Karl 'The Mailman' Malone. forcesummaries is the program that generates summary emails. grafana is the package we use to generate internal monitoring graphs.

4/24/20:

  • CHANGE: We now automatically delete drafts that are older than two weeks.
  • INTERNAL: We now reuse SMTP connections when sending email to major providers, instead of opening a new connection for each email.
  • INTERNAL: On advice from Comcast, subscribed to the Comcast FBL, along with several other FBLs.
  • SYSADMIN: Upgraded the kernel in the main database machine to a more recent version.
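
To illustrate what handling an FBL report can involve: many feedback loops deliver complaints in the Abuse Reporting Format (ARF, RFC 5965), a multipart/report message that wraps a small machine-readable section and a copy of the original email. Whether every provider named above uses exactly this format is an assumption, and the sketch below is not the site's actual handler. The function name parse_arf is hypothetical. It uses only Python's standard email package.

```python
import email
from email import policy


def parse_arf(raw):
    """Return (feedback_type, original_message_id) from an ARF report string.

    Either value is None if the corresponding part is missing.
    """
    report = email.message_from_string(raw, policy=policy.default)
    feedback_type = None
    original_id = None
    for part in report.walk():
        ctype = part.get_content_type()
        if ctype == "message/feedback-report":
            payload = part.get_payload()
            # The parser normally exposes the report fields as the headers
            # of a nested message; fall back to parsing a plain string.
            if isinstance(payload, list):
                fields = payload[0]
            else:
                fields = email.message_from_string(payload)
            value = fields["Feedback-Type"]
            if value is not None:
                feedback_type = str(value).strip().lower()
        elif ctype == "message/rfc822":
            original = part.get_payload(0)
            value = original["Message-ID"]
            if value is not None:
                original_id = str(value).strip()
    return feedback_type, original_id
```

Some providers redact the complaining recipient's address, so a sender typically needs its own identifier in the original message (the Message-ID here) to work out which subscription the complaint belongs to.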

4/23/20:

  • INTERNAL: Added grafana annotations to all cronjobs.
  • SYSADMIN: Reduced comcast.net karld concurrency from 10 to 5.
  • INTERNAL: Improved our email monitoring dashboards to better track when providers are not accepting email from us.
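
The karld concurrency reductions in this changelog amount to capping how many simultaneous deliveries go to one provider. A minimal Python sketch of that idea follows, assuming a threaded sender. The class DomainLimiter and the DEFAULT_LIMIT value are hypothetical; only the comcast.net and cox.net caps come from the entries in this post.

```python
import threading
from contextlib import contextmanager

# Caps taken from the changelog entries; DEFAULT_LIMIT is an assumption.
DOMAIN_LIMITS = {"comcast.net": 5, "cox.net": 5}
DEFAULT_LIMIT = 10


class DomainLimiter:
    """Limit the number of concurrent deliveries per recipient domain."""

    def __init__(self, limits, default):
        self._limits = dict(limits)
        self._default = default
        self._semaphores = {}
        self._lock = threading.Lock()

    def _semaphore(self, domain):
        # Create each domain's semaphore lazily, exactly once.
        with self._lock:
            if domain not in self._semaphores:
                cap = self._limits.get(domain, self._default)
                self._semaphores[domain] = threading.BoundedSemaphore(cap)
            return self._semaphores[domain]

    @contextmanager
    def slot(self, domain):
        """Block until a delivery slot for the domain is free, then hold it."""
        sem = self._semaphore(domain.lower())
        sem.acquire()
        try:
            yield
        finally:
            sem.release()
```

A worker would wrap each delivery in `with limiter.slot("comcast.net"):` so that no more than five deliveries to that provider run at once.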

4/22/20:

  • INTERNAL: Optimized forcesummaries some more (and again fixed its use of replicas).
  • INTERNAL: Use db replicas for wiki queries.
  • INTERNAL: Use db replicas for some auth queries.
  • SYSADMIN: Upgraded the kernel in db replica machines to a more recent version.
  • SYSADMIN: Changed when we run the stats process on db05.
  • SYSADMIN: Added pg_stat_activity counts for db04/db05.
  • SYSADMIN: Reduced comcast.net karld concurrency from 20 to 10.
  • SYSADMIN: Reduced cox.net karld concurrency from 10 to 5.
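
Moving wiki, auth and forcesummaries queries to replicas boils down to sending read-only work away from the main database. A minimal Python sketch of such routing is below. It is an illustration only: the class ConnectionRouter and the connection names are hypothetical, and the site's real code may choose replicas quite differently.

```python
import itertools


class ConnectionRouter:
    """Route read-only queries to replicas and everything else to the primary."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Rotate through replicas round-robin; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_query(self, readonly):
        """Return the connection (or host name) to use for one query."""
        if readonly and self._replicas is not None:
            return next(self._replicas)
        return self._primary


router = ConnectionRouter("primary", ["replica-a", "replica-b"])
wiki_conn = router.for_query(readonly=True)    # goes to a replica
write_conn = router.for_query(readonly=False)  # goes to the primary
```

One caveat with this pattern: replicas can lag slightly behind the primary, so a read that must see a write made a moment earlier is usually safer on the primary. That may be why only some auth queries were moved.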

4/21/20:

  • INTERNAL: Optimized the replacement of email addresses in messages with links to user profiles.
  • INTERNAL: Fixed some database indices.

4/20/20:

  • SYSADMIN: Adjusted main database parameters to deal with the current load.
  • SYSADMIN: Moved pgbouncer off main database machine to its own dedicated instance.
  • INTERNAL: Changed the forcesummaries process to use database replicas.
  • INTERNAL: Added additional stats graphing to the internal monitoring dashboard.

Take care everyone.

Mark

 

Mark,

* SYSADMIN: Reduced comcast.net karld concurrency from 10 to 5.
* SYSADMIN: Reduced comcast.net karld concurrency from 20 to 10.
* SYSADMIN: Reduced cox.net karld concurrency from 10 to 5.
Some GMF members have reported communications from orange.fr stipulating a necessary concurrency limit of 3:
https://groups.io/g/GroupManagersForum/message/30544

Shal

 

On Fri, Apr 24, 2020 at 10:28 PM Shal Farley <shals2nd@...> wrote:

Some GMF members have reported communications from orange.fr stipulating
a necessary concurrency limit of 3:
https://groups.io/g/GroupManagersForum/message/30544

I had forgotten to mention Wanadoo.fr/Orange.fr. The postmaster page for them states that they have a concurrency limit of 2, which is what we're set at now. Email has been flowing to them for several days.

Thanks,
Mark