Smile Internet, like most mail providers, relies on a group of organizations that monitor spammers and maintain blacklists of the addresses that spam is sent through. One of these organizations was dsbl.org, which recently discontinued service. Today (March 11), they took their servers offline. This caused a problem for the Smile mail servers: requests to the blacklist piled up, and the servers had trouble responding to new incoming mail.
The RBL (real-time blacklist) requests eventually timed out, allowing some mail to come through, but the backlog of incoming mail quickly caused requests to pile up again. We were unlucky in troubleshooting the problem, because our test emails arrived in a reasonable amount of time, which sent us searching in the wrong direction. After combing through logs, restarting services, and reviewing some specific error reports from customers, we isolated the issue and removed all of the blacklists. Because of the specific way this issue occurred, our alerting system did not flag it.
We then checked the blacklist services and found that dsbl.org had stopped service, and we re-enabled the other services.
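For readers curious about the mechanics, here is a minimal sketch of a blacklist lookup with a hard time limit, so that a list that has gone offline cannot hold up mail delivery. It is an illustration only, not the code our mail servers run. The function names, the two-second default, and the example zone in the test values are all assumptions made for this sketch. It relies on the common DNSBL convention of reversing the IPv4 octets and appending the list's zone name.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Small shared pool so lookups run off the main thread (size is an assumption).
_POOL = ThreadPoolExecutor(max_workers=8)


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, timeout=2.0, resolve=socket.gethostbyname):
    """Return True if listed, False if not listed, None if the list did not answer in time.

    `resolve` is injectable so the function can be exercised without network access.
    """
    future = _POOL.submit(resolve, dnsbl_query_name(ip, zone))
    try:
        future.result(timeout=timeout)
        return True            # any answer means the address is on the list
    except FutureTimeout:
        # The list is slow or gone: give up waiting and let the mail through.
        # The worker thread may keep running until the resolver itself returns.
        return None
    except socket.gaierror:
        return False           # name did not resolve: address is not listed
```

Returning None on a timeout lets the caller treat an unresponsive list as "no opinion" rather than waiting on it, which is the failure mode described above.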
Currently the servers are working through a very large queue of incoming mail, and we expect mail delays to continue into the evening. We are monitoring the problem and enhancing our alerting to give us better information about queue issues, so that if another blacklist goes down we can quickly remedy the problem and keep service at 100%.
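As a rough illustration of the kind of queue alerting described above, the sketch below raises an alert only when the mail queue stays above a threshold for several consecutive samples, so a brief spike does not trigger it. The class name, threshold, and window size are hypothetical and chosen for this example. It does not describe our actual monitoring setup.

```python
from collections import deque


class QueueAlert:
    """Alert when queue depth exceeds `threshold` for `window` samples in a row."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # keeps only the most recent samples

    def observe(self, depth):
        """Record one queue-depth sample and return True if an alert should fire."""
        self.samples.append(depth)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(d > self.threshold for d in self.samples)


# Usage: sample the queue depth periodically and feed each reading in.
alert = QueueAlert(threshold=1000, window=3)
readings = [200, 1500, 1800, 2400]
fired = [alert.observe(r) for r in readings]
```

Because the alert watches the queue itself instead of individual test messages, it would still fire in a case like this one, where some mail arrives on time while the backlog keeps growing.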