Hosting your own Email with DANE


So I run my own email server, a combination of Postfix, Dovecot, and OpenDKIM. I followed a guide on linuxbabe.com and it runs on a cheap $1/mo server. But there are consequences to running your own email server: the reputation of your IP address, domain, and ASN can result in your emails being dropped before they even show up in the spam box. I have not had any spam problems yet, and I don't run a web-based mail server, so that hasn't been a concern for me.

Mail rejected due to antispam policy

It seems that after a few months, my server is doing better but not best. At first my mail was not trusted at all. Now it seems that half the time my mail gets through.

Email Aliases

Another thing: people can send email to common addresses at your domain, like postmaster and abuse, but it won’t go anywhere. The system may automatically email root, but if root was never set up with an inbox, that mail will float around in the mail transfer agent (MTA) for who knows how long. There’s even a special one called tls-reports! Google sends gzipped JSON here whenever there’s activity.

TLS Reports email

To resolve this without creating yet more accounts there’s a special file at /etc/aliases.

/etc/aliases file

But editing it is not enough. This is the human-friendly file; there is a machine-friendly file nearby at /etc/aliases.db. To rebuild that database, run newaliases. This step cannot be skipped: if you forget it, the new aliases do not take effect and a hint is left behind in syslog.
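As a minimal sketch of that workflow, the helper below appends an alias only when the name is not already present, so it can be run repeatedly. The function name add_alias and the choice of root as the target are assumptions made for this example; they are not taken from the setup described here.

```shell
# add_alias FILE NAME TARGET
# Appends "NAME: TARGET" to FILE unless FILE already has an entry for NAME.
# (Hypothetical helper for illustration.)
add_alias() {
    aliases_file="$1"
    alias_name="$2"
    alias_target="$3"
    # An entry is "name: target"; compare the text before the first colon exactly.
    if awk -F: -v n="$alias_name" '$1 == n { found = 1 } END { exit !found }' \
        "$aliases_file" 2>/dev/null; then
        return 0
    fi
    printf '%s: %s\n' "$alias_name" "$alias_target" >> "$aliases_file"
}

# Example use (as root), followed by the rebuild step:
#   add_alias /etc/aliases postmaster root
#   add_alias /etc/aliases abuse root
#   add_alias /etc/aliases tls-reports root
#   newaliases
```

The last alias to get right is root itself: it needs to point at an account that really has an inbox, otherwise everything funneled to root still has nowhere to land.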

Syslog showing root doesn't have an email address

Once newaliases has run, records of bounced emails will start flowing in from the MTA after a while, now that root@example.org has a valid destination.

Root receiving emails that root couldn't receive an email

Well now that seems over..

DANE

Except here’s what really led me to tinker with my email server today!

DANE survey notice

Viktor and maybe others are automating the discovery of DANE deployments and notifying operators when their records are invalid. More info at About the DNSSEC/DANE measurement survey.

DANE, or DNS-based Authentication of Named Entities, is a protocol of sorts that uses TLSA DNS records to specify the certificate or public key (usually as a digest) that a service at a given name and port will present.
Originally I set up DANE with TLSA records in DNS for internet points. My script seemed to work, but I’d need to see how it functioned after a few Let’s Encrypt renewal cycles. That time has come but I totally forgot to schedule it.

So I look at the link Viktor sent me, and I’m actually surprised that my account name was CC’d. That suggests a human was behind sending me this email! Whoa, did not expect that.

DANE report on TLSA record failure

Wait, what’s going on here, two bad ones?

Cloudflare DNS showing many records for TLSA with old and current keys

Then I find out that I have three keys set up in Cloudflare for mail and HTTPS. Looks like my script isn’t working as intended! Time to correct that...

I identify three problems. 1) I have certbot running in two cron jobs, one by itself and another as part of my DNS update script. 2) My update script was deleting the new DNS record instead of the old ones. 3) My update script fails if it tries to create a record that already exists.

I comment out the cron job in /etc/cron.d/certbot and then fix my script where I had a == instead of a !=. Silly me. Finally, I add an error check so that the script only fails when the error message is NOT “Record already exists.”

Included below is the script I use! Secrets redacted of course.

#!/bin/bash

HOST="mail.cendyne.dev"
ZONE_ID=REDACTED
TOKEN=REDACTED
PEM="/etc/letsencrypt/live/$HOST/cert.pem"
PORTS="25 443"
PROTOCOL="tcp"
NAMES=""

for port in $PORTS; do
NAMES="$NAMES _$port._$PROTOCOL.$HOST"
done

function hex() {
echo "$1" | xxd -p -c 1000000
}

function gettlsa() {
curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records?type=TLSA&match=all" \
     -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json"
}

function digestpem() {
openssl x509 -in "$1" -noout -pubkey  | openssl pkey -pubin -outform DER | openssl dgst -sha256 -binary | xxd -p -c 100000
}

DIGEST="$(digestpem "$PEM")"
EXPECTED="3 1 1 $DIGEST"
ZONES="$(gettlsa)"
echo $EXPECTED

function extractentries() {
echo "$1" | jq -r ".result[] | {content: .content, id: .id, name: .name} | @base64"
}

TODELETE=""

for row in $(extractentries "$ZONES"); do

ROW="$(echo "$row" | base64 --decode)"
CONTENT="$(echo "$ROW" | jq -r ".content")"
ID="$(echo "$ROW" | jq -r ".id")"
NAME="$(echo "$ROW" | jq -r ".name")"
if echo "$NAMES" | grep -w "$NAME" > /dev/null; then
echo "id: $ID	name: $NAME	content: $CONTENT"
if [[ "$CONTENT" != "$EXPECTED" ]]; then
TODELETE="$TODELETE $ID"
fi
fi
done
echo "EXPECTED: $EXPECTED"
echo "TO Delete $TODELETE"

certbot renew --quiet

NEWDIGEST="$(digestpem "$PEM")"
NEWCONTENT="3 1 1 $NEWDIGEST"

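# Nothing to do if the certificate did not change and no stale records exist.
# Compare against $EXPECTED (the pre-renewal value), not $CONTENT, which only
# holds whichever record the loop above happened to read last.
if [[ "$NEWCONTENT" == "$EXPECTED" && -z "$TODELETE" ]]; then
echo "Renewal is a no-op, no changes to be done"
exit
fi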

function puttlsa() {
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
     -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     --data "{\"type\":\"TLSA\",\"name\":\"$1\",\"content\":\"$2\",\"ttl\":1,\"data\":{\"usage\":3,\"selector\":1,\"matching_type\":1,\"certificate\":\"$3\"}}"
}

for name in $NAMES; do
echo "create dns record for $name with $NEWCONTENT"
RESULT="$(puttlsa "$name" "$NEWCONTENT" "$NEWDIGEST")"
echo $RESULT
SUCCESS="$(echo "$RESULT" | jq -r '.success')"
if [[ "$SUCCESS" != "true" ]]; then
ERROR="$(echo "$RESULT" | jq -r '.errors[0].message')"
if [[ "$ERROR" != "Record already exists." ]]; then
echo "Failure in applying '$name' with '$NEWCONTENT' due to $ERROR"
exit 1
fi #end error != ...
fi #end success != true

done

echo "Sleeping for 5 minutes"
sleep 300

# Reload applications that rely on certificate
echo "Reloading"
systemctl reload postfix dovecot nginx

echo "Sleeping for 5 minutes"
sleep 300

function deletedns() {
curl -X DELETE "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$1" \
     -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json"
echo ""
}

echo "Deleting old records"
for id in $TODELETE; do
echo "Deleting stale dns record $id"
deletedns "$id"
done

echo "Done!"
GitHub Gist of the above

With that fixed, I ran the script and the old TLSA records were removed and the correct one matching my current certificate was retained.

DANE authentication check

One last check and it looks good!

Okay but what’s the deal with DANE? DNS-based Authentication of Named Entities. It’s a way to delegate what to trust when a client connects to a service. The client can check DNS (which can be secured with DNSSEC) and find the available hashes of the public keys that may be presented upon connecting.

A TLS server supplies a certificate with a public key to a client. While a certificate authority signs the certificate to attest that they trust it, TLSA records like mine specify the SHA-256 digest of the public key in DER form. A signature is implicitly supplied through DNSSEC, from the name server up the chain through the top-level domain (such as com, org, or dev) to the root zone.
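To make the pieces of a TLSA record concrete, here is a small sketch that builds the owner name and the record data separately. The helper names tlsa_owner and tlsa_rdata are invented for this example, and it assumes sha256sum from coreutils is available; the "3 1 1" prefix matches the records used in the script above (usage 3, selector 1, matching type 1).

```shell
# tlsa_owner PORT PROTOCOL HOST
# Prints the DNS name that holds the TLSA record for a service.
# (Hypothetical helper for illustration.)
tlsa_owner() {
    printf '_%s._%s.%s\n' "$1" "$2" "$3"
}

# tlsa_rdata
# Reads a DER-encoded public key on stdin and prints "3 1 1 <sha256 hex>".
# (Hypothetical helper; assumes sha256sum is installed.)
tlsa_rdata() {
    printf '3 1 1 %s\n' "$(sha256sum | cut -d ' ' -f 1)"
}

# Example with a real certificate, using the same openssl pipeline as the script:
#   openssl x509 -in cert.pem -noout -pubkey \
#     | openssl pkey -pubin -outform DER \
#     | tlsa_rdata
tlsa_owner 25 tcp mail.example.org
```

Because the digest covers only the public key, a renewed certificate that reuses the same key would keep the same record; a renewal that generates a new key needs a new record.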

Why not just trust that key or certificate upon connecting in the first place? It could be signed by Let’s Encrypt! Sure, it could. But a certificate could also be obtained through an ACME challenge on a compromised machine and then deployed on an adversary-controlled endpoint. The client would trust it because the Let’s Encrypt root certificate is in their machine’s trust store. But so was DigiNotar’s.

What DANE does is distribute the trust relationship across the named entity (such as my domain) and a certificate authority. I have control of my domain’s DNS records and the certificate authority does not. I do not control the certificate authority, but I must satisfy its challenge to get a certificate. ACME challenges usually require DNS control or the ability to fulfill a challenge on whatever host DNS is pointing at (such as HTTP challenges). In both cases, DNS is a central place of trust that clients rely upon.

Whether or not the client actually verifies the DNSSEC chain is another matter entirely.
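One rough way to see whether a resolver validated the chain is to look for the "ad" (authenticated data) flag in the header of dig output. The sketch below only parses text that has already been captured; the function name has_ad_flag is made up for this example, and the flag is only meaningful if you trust the resolver and the path to it.

```shell
# has_ad_flag
# Reads dig output on stdin and succeeds if the response header carries the
# "ad" flag. The header line looks like:
#   ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
# (Hypothetical helper for illustration.)
has_ad_flag() {
    grep '^;; flags:' \
        | sed 's/;; flags:\([^;]*\);.*/\1/' \
        | grep -qw 'ad'
}

# Example against a validating resolver:
#   dig +dnssec TLSA _25._tcp.mail.example.org | has_ad_flag && echo "validated"
```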

DNSSEC Analysis, shows no problems

Now why might you choose not to deploy DANE? Simple: synchronizing things is hard. If you update your certificate before you update DNS (which takes time to propagate), then clients may reject the connection because the server is using a certificate that is not identified in the TLSA records. My solution is to retrieve the new cert but not load it into the app yet. Then add the TLSA record. Wait a while. Reload the app. Wait a while. And finally remove the old TLSA record. You will find this in the script I shared above.
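The "wait a while" steps could be backed by an explicit check that the new record is visible before the services are reloaded. This is a sketch under assumptions: the helper names normalize_tlsa and tlsa_published are invented here, and the input is expected to be one TLSA record per line, for example from dig +short, which may print the hex in uppercase and split it with spaces.

```shell
# normalize_tlsa
# Lowercases hex digits and strips spaces and tabs so that
# "3 1 1 ABCD EF01" and "3 1 1 abcdef01" compare equal.
normalize_tlsa() {
    tr -d ' \t' | tr 'A-F' 'a-f'
}

# tlsa_published EXPECTED
# Reads published TLSA record data on stdin, one record per line, and
# succeeds if any of them matches EXPECTED.
# (Hypothetical helpers for illustration.)
tlsa_published() {
    want="$(printf '%s\n' "$1" | normalize_tlsa)"
    normalize_tlsa | grep -qxF "$want"
}

# Example gate before reloading, with $NEWCONTENT as in the script above:
#   dig +short TLSA _25._tcp.mail.example.org | tlsa_published "$NEWCONTENT" \
#     && systemctl reload postfix dovecot nginx
```

A check like this only shows what one resolver sees. Other resolvers may still hold a cached answer until the TTL expires, so some waiting is still needed before the old record is removed.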

Second, and it is a bit of a chicken-and-egg problem: not everyone is verifying it yet because not everyone supports it yet. Well, I’m not a marketing department that wants 10 SPF records on the company domain. I can tweak things and fix them without causing downtime for a vital marketing resource: email.

Personally, I’m doing this for “fun”, for the experience, and I find security to be a constantly engaging field. DANE is a tangible security gadget that I can implement and learn from.

Follow Up

Later this evening Viktor responded. I really do enjoy personal constructive outreach over technical subjects like this.

Viktor's Response

Also, checking the survey results once more...

No more issues for DNSSEC and SMTP DANE TLS!

This issue has been cleared and the automation should be good going forward!