I was really annoyed at Proofpoint; but really it was (partially) my fault

sudo postconf -e 'maximal_queue_lifetime = 1d'

I get it: some random guy sets up a new mail server, and you don’t know him from Adam, so you block his email.

They blocked me with a 400 series error. My new Postfix server had the default configuration of waiting five days before giving up. That wasn’t great, because I thought I’d sent an email, but the recipient never got it.
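The one-liner at the top of this post changes a single Postfix setting. As a minimal sketch of how I could check what the box is actually using: the function name print_queue_settings is made up for illustration, and the 4h value in the comments is only an example, not something I tested against Proofpoint.

```shell
# Show the Postfix settings that control how long deferred mail lingers.
# print_queue_settings is a hypothetical helper name, not a Postfix tool.
print_queue_settings() {
  if ! command -v postconf >/dev/null 2>&1; then
    echo "postconf not found; run this on the Postfix host"
    return 0
  fi
  # maximal_queue_lifetime: how long to retry before bouncing the message
  # bounce_queue_lifetime:  the same limit, for bounce messages themselves
  # delay_warning_time:     when to send a "still trying" notice to the sender
  postconf maximal_queue_lifetime bounce_queue_lifetime delay_warning_time
}

print_queue_settings

# To get an early "your mail is delayed" notice (4h is an assumed example):
#   sudo postconf -e 'delay_warning_time = 4h'
#   sudo postfix reload
```

A delay warning like that would have told me within hours that the message was stuck, instead of finding out days later.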

This failure was particularly irritating because my financial advisor needed me to sign an actual piece of paper to change the business relationship. She emailed me a document. It came across in her email that she was concerned about timing; this was rather urgent. I printed the document, signed it, and then emailed her back (on January 4) that I had signed the document, but my schedule was such that I wouldn’t be able to bring the document to her office until Friday the 6th. Now I find out that Proofpoint never let my server send her that email; from her point of view I just completely shined her on.

That explains the weird look the receptionist gave me when I showed up on Friday with the signed document. They thought I had ghosted her. I didn’t: Proofpoint had decided to be a bully by stringing my mail server along with a 400 series error, until my mail server gave up and I finally got the 500 series error: blocked.

500 series errors are “it didn’t work, and it’s never going to work, what you are trying failed, we’re done.” 400 series errors on the other hand are “whoops – something went wrong, but it may be on our end rather than yours.”

I think it is a little disingenuous for Proofpoint to reply with a 400 series error “whoops – something went wrong, but it may be on our end rather than yours” when it was never going to get better. They blocked me because my mail server is new, and doesn’t have any reputation with them. That’s not going to get better on its own.

The big difference between 400 and 500 series errors is that 500 series errors are known problems. Because they are known problems, the results should be sent back immediately. 400 series errors have traditionally been allowed five days before the sending mail server reports an error. The idea behind 400 series errors was that perhaps you need time to fix a problem: so the sending mail server will try for five days before giving up. Perhaps you are moving to a new data center and you need a whole weekend to migrate. If you start Friday night and are done by Monday morning, the people sending you email will still get their messages to your users. It took several days, but eventually the email got there.

On the other hand, if I am sending an email to not-an-actual-user@some-real-domain.tld then the receiving mail server can immediately give me a 500 series error: not-an-actual-user does not exist. It didn’t work, it’s never going to work, what you are trying failed, we’re done. The sender (me) learns immediately that I mis-typed the email address or something.
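To make the 400-versus-500 distinction concrete, here is a minimal sketch that sorts an SMTP reply code by its first digit. The function name smtp_class and the labels it prints are my own invention for illustration; a real mail server does this internally.

```shell
# Classify an SMTP reply code by its first digit.
# smtp_class is a hypothetical helper, not part of Postfix.
smtp_class() {
  case "$1" in
    2[0-9][0-9]) echo "success" ;;    # delivered or accepted
    4[0-9][0-9]) echo "transient" ;;  # try again later
    5[0-9][0-9]) echo "permanent" ;;  # give up and tell the sender now
    *)           echo "other" ;;      # anything else, e.g. 3xx mid-dialogue
  esac
}

smtp_class 450   # prints: transient
smtp_class 550   # prints: permanent
```

A sending server keeps "transient" mail in its queue and retries; "permanent" goes straight back to the sender as a bounce.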

I’ve recently gotten some DSN Fails trying to send to @icloud.com users. Same reason: they don’t know my mail server from Adam; but at least Apple was kind enough to let me know immediately that the mail didn’t go. I can at least text the person what I was going to send them in email.

Off-topic but mildly interesting: I had a user at work mis-type an email address by tacking on an extra period at the end. Like they finished a sentence with a period, but this was an email address. DNS has a top level, but to climb the tree means adding a trailing period and doing another DNS query. In other words, my computer could try to look up “mybox” and if that doesn’t resolve, tack a period on the end and look again. That would resolve to “mybox.mydomain”. If that doesn’t resolve, tack a period on the end and look again. That would resolve to “mybox.mydomain.tld”. So when the resolution process finally gets to the very top Top Level Domain (TLD), searching stops. Those TLDs are a defined list; which is how DNS knows that there’s no more searching to be done. But whoops if the user asked for mybox.mydomain.tld. <-- notice the trailing period. Then DNS is going to keep returning a 400 series error: it looks like there should be an upper level domain, but nothing responded, so maybe that box is just down right now? Try again in a few minutes. Five days later, my user finally got the DSN Fail. He was ticked off: why couldn’t the computer tell him he’d mistyped the email address? Well, because it kept looking for a mail server higher up in the DNS name hierarchy than .com.
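A check like the one below could have caught that typo before the message ever left. This is only a sketch: has_trailing_dot is a made-up helper, and I am assuming the address arrives as a plain string with nothing after it.

```shell
# Return success (0) if the address ends with a period, which is almost
# certainly a typo in an email address typed by a person.
# has_trailing_dot is a hypothetical helper name.
has_trailing_dot() {
  case "$1" in
    *.) return 0 ;;
    *)  return 1 ;;
  esac
}

if has_trailing_dot "user@mybox.mydomain.tld."; then
  echo "That address ends with a period. Typo?"
fi
```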

The Helm migration is complete

As I mentioned before, The Helm email appliance company is calling it quits, which I understand. If the business isn’t going to make it, it is better to pull the plug than just keep letting things linger. Best of luck to them on their next adventure.

So, what did I do?

  • (there was a detour while Amazon pissed on their customers wanting to run Mail-In-A-Box) (me)
  • I provisioned the smallest Ubuntu 22.04 LTS machine that Linode has.
    • Mildly annoyed that it doesn’t really support LVM (Logical Volume Manager); they have a backup service that runs an agent inside their machines, and that agent doesn’t do LVM. Still, I know that I’m going to need to grow disks, so I had to learn how to re-partition the Linode so I could do LVM. LVM done.
  • I made a mail server on the Linode machine at a domain name I have that I don’t really use. I followed the excellent guide from Christoph Haas at workaround.org: ISPmail guide for Debian 11 “Bullseye”
  • I got RoundCube webmail working for the domain name; complete with SPF and DKIM.
  • I got Thunderbird to send and receive from the domain name.
  • Then I added Nextcloud to the same box. I wanted CalDAV and CardDAV for calendar and contacts, when I eventually hook my iPhone to it.
    • The Nextcloud documentation really needs a lot of work here. If I were retired, I would like to help them with their documentation.
    • Finally, I have the files.example.tld function of The Helm replaced, although at a different domain name.
    • Rspamd uses Redis, but so does Nextcloud. But one uses the network stack, and the other, Unix sockets. Set them both up the same way.
  • Then I added Duplicati backup. This wasn’t great, as it added a ton of overhead in the form of Mono, just for a graphical user interface.
  • I realize that I’m going to want to host my WordPress here too. I don’t want to have to wrangle four Let’s Encrypt SSL certificates, one for each domain. What about a single wildcard SSL certificate?
    • Yes, that can be done, but: my domain names registrar doesn’t support it. Linode does, though. I install the Linode DNS agent on my machine, and spin up Linode DNS servers to do the DNS work. I have to configure my domain names registrar to tell the rest of the world that Linode is where my name servers are.
    • Somewhere in there I installed the Unbound DNS resolver. Looks like I need this on my home machine, too, for Home Assistant.io[1].
  • I got to the point where I could request the domain name transfer. Turns out the people at The Helm were going through Gandi.net. Gandi.net took as long as they legally could, before actually doing the DNS transfer.
    • Gandi -> registrar, then registrar to point to Linode. Linode DNS needs to be reconfigured for SPF and DKIM. I had gotten some DNS records wrong, too.
  • Thunderbird to connect to the mail.domain.tld, and though the name hasn’t changed, everything underneath has. Thunderbird is not happy; I lose all my old mail.
    • Well, I didn’t, but it is in a new folder now, so that I’ve got an old version of my mailbox and a new version of my mailbox, and they are separate. Not ideal. Perhaps I could have done an IMAP to IMAP transfer, if I hadn’t already moved the domain name.
  • Hey, looky there: one of the volumes filled up (but everything else was unaffected). Time to grow a disk using LVM.
  • iPhone to connect to CalDAV; phew that was not well documented and had tons of conflicting information.
  • Not really happy with Duplicati, so I remove it and Mono, and install Restic backup instead.
  • Okay, so the last thing left to do is to migrate this blog from Amazon to this new Linode machine. The transfer using NS Cloner goes well, as it usually does. But domain names need to be updated via Let’s Encrypt certbot.
    • Crud. I’m on holiday out of town with family, and have only a Windows laptop with me. Per best practice security protocols, I can only ssh in from home. Logging in via root@ is blocked, and I don’t think I can even do a ssh-copy-id without getting in first and lowering the root login barrier. The certbot to add gerisch.org to the domains list is going to have to wait.
  • Here I am, at home, and I’m done. Dovecot, Postfix, RoundCube, Nextcloud, and WordPress all on one box.
  • While I was on holiday, I took the .mp3 files on the Nextcloud, and made Nextcloud Music Player playlists for the different types of files. Then on the 16 hour drive home, my iPhone logged in to the Nextcloud web interface and played playlists.
    • It’s a bit of nirvana to me, to have a large list of songs (randomized of course) playing absolutely advertising-free because I paid for the songs in the first place.
  1. I ended up not connecting Home Assistant to their cloud.
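For the "grow a disk using LVM" step in the list above, a minimal sketch of the idea follows. The volume path /dev/vg0/mail and the 5G size are placeholders I made up, and APPLY is a hypothetical switch so that the function only prints the command unless explicitly told to run it.

```shell
# Print (or, with APPLY=1, actually run) the command that grows a logical
# volume and its filesystem in one step. grow_lv is a hypothetical helper.
grow_lv() {
  lv_path="$1"   # e.g. /dev/vg0/mail (made-up name)
  extra="$2"     # e.g. 5G
  if [ "${APPLY:-0}" = "1" ]; then
    sudo lvextend --resizefs --size "+$extra" "$lv_path"
  else
    echo "lvextend --resizefs --size +$extra $lv_path"
  fi
}

# Check free space in the volume group first (on the real box):
#   sudo vgs
grow_lv /dev/vg0/mail 5G
```

The dry-run default is deliberate: I would rather read the command once before letting it touch a disk.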

The Helm migration

I really liked my The Helm email appliance. But because the company running the service behind it is going to exit this business, I need to migrate stuff. Oh so much stuff…..

Of course, really, it becomes so-much-stuff because once I’m in a little, I want to pile on more. If Reddit hadn’t become so much trash, I’d have probably been living in /r/SelfHosted these past few weeks. Well, that and except that I’m cloud hosting for myself instead of keeping a box here at home.

Anyway, The Helm provided me with an SMTP server on its own domain name, and NextCloud Files. (It did not include any other parts of NextCloud, though) (I think. Maybe contacts, too?). The company provided DNS services, too. And because no ISP is going to let me run an SMTP server here inside my home, it provided VPN services to AWS where boxes on the public Internet could send port 25 mail from.

I needed to move, and move quick. I’ve seen before how “oh I’ve got plenty of time” turned into “oh crap! It’s due tomorrow‽” enough times to remember the pain.

So now I have learned and am running:

  • A Dovecot and Postfix and rspamd server, with Redis
  • RoundCube attached to same
  • ISPMail attached to same (which is a web administration console for accounts in Dovecot and Postfix)
  • A caching DNS server on same
  • A Linode DNS server, so that Certbot can authorize a wildcard Let’s Encrypt SSL certificate.
  • NextCloud (full suite)
  • Duplicati for backup
  • and I haven’t ever added WordPress yet

I’m least happy with NextCloud. There is a lot of stuff that doesn’t work, and the documentation is poor, and a lot of the forum answers are “just read the documentation, newbie.”

I’m also not really happy with Duplicati. I loved it in version 1, because it was “just” a Python script. It ran on Windows, and I could very easily back up to Amazon S3. In fact, it was my introduction to learning AWS. Version 2 comes with its own web server so that it can be cross-platform and have a GUI; but that means adding Mono to my previously somewhat lean Linux server. By the way, accessing a web site on a “localhost” only web server? Here’s a reminder of how.

I started seeing a memory leak, and now I have to reboot the server once in a while. As Tenets of IT number 6 points out, rebooting is a band-aid. Really, I should remove the code that creates the memory leak. I think I’ll move to Restic and Backblaze.

Though I really want to add WordPress and migrate this blog there, next.

Certbot and wildcard domains and --expand, oh my!

Nope, you cannot use --expand if you are using a wildcard helper (in my case --dns-linode).

The command that worked was

certbot certonly --dns-linode --dns-linode-credentials ~/somefolder/somefile.ini -d davidgerisch.com -d gerisch.me -d '*.davidgerisch.com' -d '*.gerisch.me' --cert-name davidgerisch.com

certbot --expand was no good because of --dns-linode. My only choice was certbot certonly.

But leaving off the original certificate name created a new certificate in a new location with -0001 tacked on to the name. No way do I want to have to wrangle the original certificate with its expiration date and this new certificate and its other expiration date. Besides, my web server is already configured for the original certificate. Reconfiguring the web server was less than ideal.

So the secret was to use the --cert-name option to specifically update the existing certificate.
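Since the domain list only gets longer, here is a small sketch that assembles that same kind of command from a list of domains, adding the wildcard for each one. build_certbot_cmd is a made-up helper and example.tld, example.org, and the credentials path are placeholders; it only prints the command so I can read it before running it.

```shell
# Build (but do not run) a certbot command that updates one existing
# certificate to cover every listed domain plus its wildcard.
# build_certbot_cmd is a hypothetical helper, not part of certbot.
build_certbot_cmd() {
  cert_name="$1"   # name of the existing certificate to update
  creds="$2"       # path to the Linode credentials .ini file
  shift 2
  cmd="certbot certonly --dns-linode --dns-linode-credentials $creds --cert-name $cert_name"
  for domain in "$@"; do
    # Quote the wildcard so the shell never tries to expand it as a glob.
    cmd="$cmd -d $domain -d '*.$domain'"
  done
  printf '%s\n' "$cmd"
}

build_certbot_cmd example.tld /path/creds.ini example.tld example.org
```

Running `certbot certificates` first shows the exact certificate name to pass in, which matters because a wrong or missing name is what produces the -0001 copy.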

2022-12-27 Update: if you go to add another domain (which happened to be this one) and you get the error “Certbot failed to authenticate some domains (authenticator: dns-linode). The Certificate Authority reported these problems:
 Domain: newdomain.tld
 Type:   unauthorized
 Detail: No TXT record found at _acme-challenge.newdomain.tld

 Domain: firstdomain.tld
 Type:   unauthorized
 Detail: No TXT record found at _acme-challenge.firstdomain.tld

Hint: The Certificate Authority failed to verify the DNS TXT records created by --dns-linode. Ensure the above domains are hosted by this DNS provider, or try increasing --dns-linode-propagation-seconds (currently 120 seconds).”

The problem may actually be a leftover file at /etc/letsencrypt/renewal

I had two files in there: firstdomain.tld.conf and firstdomain.tld-0001.conf

Certbot was trying to use the -0001.conf file instead of the real file. The real file pointed to the actual certificates being served up. The -0001.conf file was pointing to certificates with -0001 in their name, which were never served up to any of my web sites.
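A quick way to spot that kind of leftover is to list any renewal files whose names end in a four-digit suffix. This is a sketch under my own assumptions: list_numbered_renewals is a made-up helper, and I am assuming stray copies always follow the -0001 naming pattern I saw.

```shell
# List renewal config files that look like numbered duplicates
# (name-0001.conf and so on). list_numbered_renewals is a hypothetical helper.
list_numbered_renewals() {
  dir="${1:-/etc/letsencrypt/renewal}"
  for f in "$dir"/*-[0-9][0-9][0-9][0-9].conf; do
    # If nothing matches, the pattern stays unexpanded; skip it.
    [ -e "$f" ] && basename "$f"
  done
  return 0
}

list_numbered_renewals

# Once I am sure a duplicate is not being served anywhere, certbot can
# remove it and its files (the name here is the example from above):
#   sudo certbot delete --cert-name firstdomain.tld-0001
```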

The Helm email appliance – you were a good product

I really liked my Helm email appliance. It has done well by me.

Unfortunately, the business behind it doesn’t see its future getting better, so they are going to call it quits. I have until December 31, 2022 to build a replacement email server. This is turning out to be a larger project than I’d like.

I do appreciate that The Helm company gave me plenty of warning (I got the email more than two weeks ago). I hope the people at the company find something else they can do which brings more success to them. You have my many thanks for your years of solid service.