Hosting your own Email with DANE
So I run my own email server, a combination of postfix, dovecot, and opendkim. I followed a guide on linuxbabe.com and it runs on a cheap $1/mo server. But there are consequences to running your own email server: the reputation of your IP address, domain, and ASN can result in your emails being dropped before they even show up in the spam box. I have not had any spam problems yet and I don't run a web based mail server so that hasn't been a concern for me.
It seems that after a few months, my server is doing better but not best. At first my mail was not trusted at all. Now it seems that half the time my mail gets through.
Email Aliases
Another thing: people can send emails to common addresses like postmaster and abuse at your domain, but they won’t go anywhere. The system may automatically email root, but if root is never set up with an inbox, that mail will be floating in the mail transfer agent (MTA) for who knows how long. There’s even a special one called tls-reports! Google sends gzipped JSON here whenever there’s activity.
To resolve this without creating yet more accounts, there’s a special file at /etc/aliases.
But editing it is not enough: this is the human friendly file, and there is a machine friendly file nearby at /etc/aliases.db. To rebuild that database, run newaliases. This step cannot be skipped; if you forget it, a hint will be left behind in syslog.
Once newaliases has run, after a while records of bounced emails will start flowing in from the MTA now that root@example.org has a valid destination.
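Here is a rough sketch of the aliases step. It assumes everything should be delivered to root and that root in turn forwards to a real mailbox; the helper name `add_alias` and the `ALIASES_FILE` variable are made up for illustration and are not part of postfix.

```shell
#!/bin/bash
# Sketch: append role aliases to an aliases file without duplicating entries.
# ALIASES_FILE defaults to /etc/aliases; add_alias is a hypothetical helper.
ALIASES_FILE="${ALIASES_FILE:-/etc/aliases}"

add_alias() {
    local name="$1" dest="$2"
    # Leave an existing entry for this name untouched
    if grep -q "^${name}:" "$ALIASES_FILE" 2>/dev/null; then
        return 0
    fi
    printf '%s: %s\n' "$name" "$dest" >> "$ALIASES_FILE"
}

# Example (as root; "youruser" is a placeholder account):
#   add_alias postmaster root
#   add_alias abuse root
#   add_alias tls-reports root
#   add_alias root youruser
#   newaliases   # rebuild /etc/aliases.db so the MTA sees the changes
```

The helper only appends, so an alias you already configured by hand is never overwritten.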
DANE
Except here’s what really led me to tinker with my email server today!
Viktor and maybe others are automating discovery and validity notification of DANE deployment, more info at About the DNSSEC/DANE measurement survey.
So I look at the link Viktor sent me, and I’m actually surprised that my account name was CC’d, that suggests a human was behind sending me this email! Whoa, did not expect that.
Wait, what’s going on here, two bad ones?
I identify three problems. 1) I have certbot running in two cron jobs, one by itself and another with my DNS Update script. 2) My update script was deleting the new DNS record instead of the old ones. 3) My update script fails if it tries to apply a record that already exists.
I comment out the cronjob in /etc/cron.d/certbot and then fix my script where I had a == instead of a !=. Silly me. Finally I add an error check where if the message is NOT “Record already exists.” then it actually fails.
Included below is the script I use! Secrets redacted of course.
#!/bin/bash
HOST="mail.cendyne.dev"
ZONE_ID=REDACTED
TOKEN=REDACTED
PEM="/etc/letsencrypt/live/$HOST/cert.pem"
PORTS="25 443"
PROTOCOL="tcp"

# Build the TLSA owner names, e.g. _25._tcp.mail.cendyne.dev
NAMES=""
for port in $PORTS; do
  NAMES="$NAMES _$port._$PROTOCOL.$HOST"
done

function hex() {
  echo "$1" | xxd -p -c 1000000
}

# List every TLSA record in the zone
function gettlsa() {
  curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records?type=TLSA&match=all" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
}

# SHA-256 of the certificate's DER-encoded public key, as hex (the "1 1" in "3 1 1")
function digestpem() {
  openssl x509 -in "$1" -noout -pubkey | openssl pkey -pubin -outform DER | openssl dgst -sha256 -binary | xxd -p -c 100000
}

# Record content that matches the certificate currently on disk
DIGEST="$(digestpem "$PEM")"
EXPECTED="3 1 1 $DIGEST"
ZONES="$(gettlsa)"
echo "$EXPECTED"

function extractentries() {
  echo "$1" | jq -r ".result[] | {content: .content, id: .id, name: .name} | @base64"
}

# Collect the ids of records for our names that do NOT match the current certificate
TODELETE=""
for row in $(extractentries "$ZONES"); do
  ROW="$(echo "$row" | base64 --decode)"
  CONTENT="$(echo "$ROW" | jq -r ".content")"
  ID="$(echo "$ROW" | jq -r ".id")"
  NAME="$(echo "$ROW" | jq -r ".name")"
  if echo "$NAMES" | grep -qwF -- "$NAME"; then
    echo "id: $ID name: $NAME content: $CONTENT"
    if [[ "$CONTENT" != "$EXPECTED" ]]; then
      TODELETE="$TODELETE $ID"
    fi
  fi
done
echo "EXPECTED: $EXPECTED"
echo "TO Delete $TODELETE"

# Fetch a new certificate if one is due; services keep serving the old one until reloaded
certbot renew --quiet
NEWDIGEST="$(digestpem "$PEM")"
NEWCONTENT="3 1 1 $NEWDIGEST"
# Compare against the pre-renewal value (not the last row's $CONTENT),
# and only stop early when there is also nothing stale to clean up
if [[ "$NEWCONTENT" == "$EXPECTED" && -z "$TODELETE" ]]; then
  echo "Renewal is a no-op, no changes to be done"
  exit
fi

function puttlsa() {
  curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    --data "{\"type\":\"TLSA\",\"name\":\"$1\",\"content\":\"$2\",\"ttl\":1,\"data\":{\"usage\":3,\"selector\":1,\"matching_type\":1,\"certificate\":\"$3\"}}"
}

for name in $NAMES; do
  echo "create dns record for $name with $NEWCONTENT"
  RESULT="$(puttlsa "$name" "$NEWCONTENT" "$NEWDIGEST")"
  echo "$RESULT"
  SUCCESS="$(echo "$RESULT" | jq -r '.success')"
  if [[ "$SUCCESS" != "true" ]]; then
    ERROR="$(echo "$RESULT" | jq -r '.errors[0].message')"
    # An existing identical record is fine; anything else is a real failure
    if [[ "$ERROR" != "Record already exists." ]]; then
      echo "Failure in applying '$name' with '$NEWCONTENT' due to $ERROR"
      exit 1
    fi #end error != ...
  fi #end success != true
done

echo "Sleeping for 5 minutes"
sleep 300
# Reload applications that rely on certificate
echo "Reloading"
systemctl reload postfix dovecot nginx
echo "Sleeping for 5 minutes"
sleep 300

function deletedns() {
  curl -X DELETE "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$1" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
  echo ""
}

echo "Deleting old records"
for id in $TODELETE; do
  echo "Deleting stale dns record $id"
  deletedns "$id"
done
echo "Done!"
GitHub Gist of the above
With that fixed, I ran the script and the old TLSA records were removed and the correct one matching my current certificate was retained.
Okay, but what’s the deal with DANE? It stands for DNS-based Authentication of Named Entities. It’s a way to publish, in DNS, what a client should trust when it connects to a service. The client can check DNS (which can be secured with DNSSEC) and find the hashes of the public keys that may be presented upon connecting.
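To make that check concrete, here is a minimal sketch of comparing a certificate against published TLSA records by hand. It assumes "3 1 1" records (DANE-EE, public key, SHA-256) like the ones my script publishes, and that `openssl`, `sha256sum`, and `dig` are installed. The function names are my own for illustration, and mail.example.org is a placeholder.

```shell
#!/bin/bash
# Sketch: does a certificate's "3 1 1" value appear among published TLSA records?

# SHA-256 of the DER-encoded public key (selector 1, matching type 1), lowercase hex.
tlsa_311_from_pem() {
    openssl x509 -in "$1" -noout -pubkey \
        | openssl pkey -pubin -outform DER \
        | sha256sum | awk '{print $1}'
}

# Normalize one TLSA record: lowercase it and rejoin hex that a tool
# such as dig may print as several space-separated chunks.
normalize_tlsa() {
    echo "$1" | tr 'A-Z' 'a-z' \
        | awk '{ hex = ""; for (i = 4; i <= NF; i++) hex = hex $i; print $1, $2, $3, hex }'
}

# Read TLSA records on stdin, one per line; succeed if any equals "3 1 1 <digest>".
tlsa_published() {
    local want="3 1 1 $1" line
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        if [ "$(normalize_tlsa "$line")" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# Example:
#   digest="$(tlsa_311_from_pem /etc/letsencrypt/live/mail.example.org/cert.pem)"
#   dig +short TLSA _25._tcp.mail.example.org | tlsa_published "$digest" && echo "published"
```

This only compares strings. A real DANE client also validates the DNSSEC chain for the TLSA answer, which this sketch does not attempt.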
Why not just trust that key or certificate upon connecting in the first place? It could be signed by Let’s Encrypt! Sure, it could. But it could also be obtained through an ACME challenge on a compromised machine and then deployed on an adversary controlled endpoint. That client could trust Let’s Encrypt because its root certificate is in the trust store on their machine. But so was DigiNotar. What DANE does is distribute the trust relationship across the named entity (such as my domain) and a certificate authority. I have control of my domain’s DNS records and the certificate authority does not; likewise, I do not control the certificate authority, but I must comply with their needs to fulfill a challenge. ACME challenges usually require DNS control or the ability to fulfill a challenge offered by whatever DNS is pointing at (such as HTTP challenges). In both cases, DNS is a central place of trust that clients rely upon.
Now why might you choose not to deploy DANE? Simple: synchronizing things is hard. If you update your certificate before you update DNS (which takes time) then clients may reject the connection while the server uses a certificate that is not identified in the TLSA records. My solution is to retrieve the new cert but not refresh it into the app. Then add the TLSA record. Wait a while. Refresh the app. Wait a while. And finally remove the old TLSA record. You will find this in the script I shared above.
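The ordering is the whole trick, so here it is stripped down to a dry-run sketch. `publish_new_tlsa` and `delete_old_tlsa` are hypothetical placeholders for the DNS API calls, and the 300 second wait is just the value I use; as a rule of thumb it should be at least as long as the TTL on the TLSA records.

```shell
#!/bin/bash
# Sketch of the rollover order. With DRY_RUN=1 (the default here) steps are only printed.
DRY_RUN="${DRY_RUN:-1}"
WAIT_SECONDS="${WAIT_SECONDS:-300}"

# Print a step and, outside of dry-run mode, execute it.
step() {
    echo "step: $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

rollover() {
    step certbot renew --quiet || return 1                   # 1. fetch the new cert; services still serve the old one
    step publish_new_tlsa || return 1                        # 2. add a TLSA record for the new key (placeholder)
    step sleep "$WAIT_SECONDS" || return 1                   # 3. give resolvers time to pick up the new record
    step systemctl reload postfix dovecot nginx || return 1  # 4. start serving the new certificate
    step sleep "$WAIT_SECONDS" || return 1                   # 5. let clients with cached answers catch up
    step delete_old_tlsa || return 1                         # 6. remove TLSA records for the old key (placeholder)
}

# Example: DRY_RUN=1 rollover
```

Stopping at the first failed step matters: reloading the services before the new TLSA record is published is exactly the failure mode described above.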
Second, and it is a bit of a chicken and the egg problem, not everyone is verifying it yet because not everyone supports it yet. Well I’m not a marketing department that wants 10 SPF records on the company domain, I can tweak things and fix them without causing downtime for a vital marketing resource: email.
Follow Up
Later this evening Viktor responded. I really do enjoy personal constructive outreach over technical subjects like this.
Also, checking the survey results once more...