S3 static webhosting, DKIM signature size errors & why DNS prefers UDP

This weekend I spent some time migrating a few low-traffic websites from Nginx to AWS S3's static web hosting service.

In theory, this is a straightforward process: move content from the old webroot to an S3 bucket that shares the name of the domain, enable static web hosting for the bucket & set a security policy that enables anonymous web users to see that content.

In practice, there's a bit more involved:

1. S3 bucket resource paths can change, which will result in DNS failures unless you use a Route 53 hosted zone. You don't need to buy a domain from Amazon to do this, but you do need to use their nameservers. This isn't free, and there is an extra fee for DNSSEC.
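The Route 53 step above amounts to creating an alias A record pointing at the S3 website endpoint. A minimal sketch of the change batch you'd hand to the Route 53 API, assuming a us-east-1 bucket (the hosted zone IDs for S3 website endpoints are fixed, per-region values published by AWS; the domain and zone ID here are placeholders):

```python
def s3_alias_change_batch(domain: str, region_zone_id: str, endpoint: str) -> dict:
    """Build an UPSERT for an alias A record targeting an S3 website endpoint."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "AliasTarget": {
                    # NOT your own hosted zone ID: this is the fixed zone ID
                    # of the S3 website endpoint for the bucket's region.
                    "HostedZoneId": region_zone_id,
                    "DNSName": endpoint,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }

batch = s3_alias_change_batch(
    "example.com.",
    "Z3AQBSTGFYJSTF",                      # us-east-1 S3 website endpoint zone
    "s3-website-us-east-1.amazonaws.com.",
)
# In practice you would pass this to:
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="YOUR_ZONE_ID", ChangeBatch=batch)
```

The alias record is why the hosted zone matters: unlike a CNAME, an alias can sit at the zone apex and tracks whatever IPs the S3 endpoint currently resolves to.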

2. Want an SSL/TLS certificate? Of course you do. This means generating a certificate within AWS Certificate Manager. In most circumstances (without "legacy" client support, for example), there is no charge for the certificate. But serving traffic using that certificate requires provisioning a web distribution on Amazon's CDN, CloudFront.

3. If you are a power user (ha) you might even have some honest-to-god web applications, like a CGI. S3 can't handle this on its own, but there are tools that can address it while hosting with S3. Some folks like to host, say, their WordPress site within S3 by running the installation process in a "normal" webserver, then migrating the files to S3 (the database can stay where it is or move to RDS). But that is a hack, and not a great one. S3 still will not run CGIs, and it's hard to see the value of WordPress when none of its dynamic functionality works and its themes cannot execute PHP. For my use case, I was dealing with small business card-type websites - there is no authentication, but there might be some simple script functionality, like a contact form. Fortunately, AWS' serverless architecture tools enabled me to work around this fairly quickly using a combination of AWS Lambda, Amazon API Gateway & Amazon SES. There is a great blog post detailing this process here. The workflow looks like this:
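A minimal sketch of the Lambda side of that contact-form workflow, assuming an API Gateway proxy integration and a form with `name`, `email`, and `message` fields (the field names and addresses are assumptions, not from the blog post mentioned above):

```python
import json

def build_ses_message(form: dict, sender: str, recipient: str) -> dict:
    """Translate submitted form fields into keyword arguments for SES send_email."""
    body = "From: {name} <{email}>\n\n{message}".format(
        name=form.get("name", "(no name)"),
        email=form.get("email", "(no email)"),
        message=form.get("message", ""),
    )
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Message": {
            "Subject": {"Data": "Contact form submission"},
            "Body": {"Text": {"Data": body}},
        },
    }

def lambda_handler(event, context):
    # With a proxy integration, API Gateway delivers the POST body as a JSON string.
    form = json.loads(event.get("body") or "{}")
    kwargs = build_ses_message(form, "noreply@example.com", "owner@example.com")
    import boto3  # bundled in the Lambda Python runtime
    boto3.client("ses").send_email(**kwargs)
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

Both addresses would need to be verified in SES (or the account moved out of the SES sandbox) before this could actually send mail.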

So, all of this is well and good and didn't provide me with too many obstacles (except for the one domain I missed the DNSSEC DS record deletion for).

The only error I encountered during this process was a bit of a surprise: I was unable to directly transfer my pre-existing DKIM records to Route 53. When I tried, I was greeted with the error "CharacterStringTooLong (Value is too long) encountered with {Value}". An example of this error is pictured here:

The problem is that there is an RFC-mandated 255-character maximum for each string in a TXT record value, and the TXT record for a 2048-bit DKIM public key requires more characters than that. This limitation is not unique to Route 53 or Amazon: the 255-character convention was established in RFC 1035 ... in 1987. So why do so many people first encounter this error in Route 53?

In most circumstances, users do not modify DNS zone files directly. Instead, they use some sort of application or interface created by the domain registrar to modify those zones. Those interfaces tend to reconfigure user input into an RFC-compliant zone on the backend. There are positives to this (DNS confuses the hell out of a lot of people) and negatives (it's a crutch that can lead to fundamental misunderstandings re: zone format). IMO, it's fine that different registrars cater to different customer skill levels, but I would prefer if registrars that seamlessly modify user zone input were more forthcoming about what is occurring. For example, it wouldn't be too hard to have some JavaScript break up TXT record input into smaller blocks after a user has finished entering the record. This would be trivial to implement, wouldn't require users to know TXT length rules, and would also help to educate them.

Anyway, that's all very interesting. How do we get a working DKIM signature in Route 53, though?

One option is to use DKIM with a 1024-bit key. Is this as good as a 2048-bit key? No. But it is much better than nothing: a bit like wearing a cloth mask vs an N95 mask.

The optimal workaround, however, is to take advantage of a provision for extending TXT values established in RFC4408 (Amazon has a blog post describing how to do this):

3.1.3. Multiple Strings in a Single DNS record

As defined in [RFC1035] sections 3.3.14 and 3.3, a single text DNS record (either TXT or SPF RR types) can be composed of more than one string. If a published record contains multiple strings, then the record MUST be treated as if those strings are concatenated together without adding spaces. For example:

IN TXT "v=spf1 .... first" "second string..."

MUST be treated as equivalent to

IN TXT "v=spf1 .... firstsecond string..."

SPF or TXT records containing multiple strings are useful in constructing records that would exceed the 255-byte maximum length of a string within a single TXT or SPF RR record.
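The splitting the RFC describes is mechanical enough to sketch in a few lines: break the value into 255-character chunks, quote each one, and resolvers will concatenate them back together (the example key below is padding, not a real key):

```python
def split_txt_value(value: str, limit: int = 255) -> str:
    """Split a long TXT value into quoted <=255-character strings,
    which resolvers concatenate per RFC 1035 / RFC 4408."""
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)]
    return " ".join('"%s"' % chunk for chunk in chunks)

# A 2048-bit DKIM public key record runs to roughly 400 characters of base64:
record = "v=DKIM1; k=rsa; p=" + "A" * 390
print(split_txt_value(record))
```

This is exactly the form Route 53 accepts in its console: one value, multiple quoted strings separated by spaces.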

Notice how the RFC doesn't mention DKIM. SPF was invented seven years prior to DKIM, in the heady days of 1997 (although it would be many more years before they both became an industry standard).

But why is the size of DNS records such a problem? Are those smarty pants at the IETF just inventing arbitrary rules to, like, keep us down, man? It turns out that the size issue doesn't follow directly from DNS at all - the size limit is the result of the amount of information that can be included within a UDP packet.  Again, quoting RFC4408:

The published SPF record for a given domain name SHOULD remain small enough that the results of a query for it will fit within 512 octets. This will keep even older DNS implementations from falling over to TCP. Since the answer size is dependent on many things outside the scope of this document, it is only possible to give this guideline: If the combined length of the DNS name and the text of all the records of a given type (TXT or SPF) is under 450 characters, then DNS answers should fit in UDP packets. Note that when computing the sizes for queries of the TXT format, one must take into account any other TXT records published at the domain name. Records that are too long to fit in a single UDP packet MAY be silently ignored by SPF clients.
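The 512-octet guideline above can be sanity-checked with a rough size estimate. This sketch assumes name compression (2-byte pointers) in the answer section and ignores EDNS and additional records, so treat it as a back-of-the-envelope tool rather than an exact wire-format calculator:

```python
def estimate_txt_answer_size(qname: str, txt_values: list) -> int:
    """Rough wire-format size of a DNS response carrying TXT records."""
    # QNAME: one length byte per label plus the label bytes, plus the root byte.
    qname_len = sum(1 + len(label) for label in qname.rstrip(".").split(".")) + 1
    size = 12                      # fixed DNS header
    size += qname_len + 4          # question section: QNAME + QTYPE + QCLASS
    for value in txt_values:
        # Each answer record: compressed name (2) + TYPE (2) + CLASS (2)
        # + TTL (4) + RDLENGTH (2), then one length byte per <=255-byte string.
        n_strings = max(1, -(-len(value) // 255))   # ceiling division
        size += 2 + 2 + 2 + 4 + 2 + n_strings + len(value)
    return size

print(estimate_txt_answer_size(
    "example.com", ["v=spf1 include:_spf.example.com ~all"]))
```

Run it against all the TXT records published at a name (SPF, DKIM, site-verification tokens) and it becomes obvious how quickly a zone can creep toward the 512-octet line.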

This merely pushes the question back a level: instead of an arbitrary size limit for DNS records, we now have an arbitrary choice of transport protocol, UDP, with its own arbitrary size limits. Why does DNS use UDP, when we could use TCP and stuff the collected works of Shakespeare into a TXT record? Computer Networking: A Top-Down Approach by Kurose & Ross provides us with several reasons (h/t to alhelal for the quote):

1. No connection establishment - the three-way handshake that makes TCP so reliable is slow. DNS doesn't need a reliable, ongoing connection. All that is needed is the answer to a single question: which IP belongs to this hostname? Kurose & Ross state this is the main reason why DNS uses UDP. It isn't particularly mysterious, but it makes sense: DNS precedes a panoply of other transactions. If DNS is slow, all of those other transactions will be slow, as well.

2. No connection state - saving state information introduces substantial overhead. A UDP-based DNS resolver can handle many more connections than a TCP-based DNS resolver. The advantages of maintaining state - identifying packet transmission order, congestion control - do not make a lot of sense to address within the DNS session itself.

3. Smaller packet header overhead - a TCP header is 20 bytes vs 8 bytes for a UDP header. Those bytes add up fast.
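The three points above are easiest to appreciate when you see how small a DNS query actually is. A sketch that hand-assembles the payload of a single UDP datagram (the transaction ID and hostname are arbitrary examples):

```python
import struct

def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Hand-assemble a DNS query packet. qtype 1 = A record."""
    header = struct.pack(">HHHHHH",
                         0x1234,   # transaction ID (arbitrary)
                         0x0100,   # flags: standard query, recursion desired
                         1,        # QDCOUNT: one question
                         0, 0, 0)  # no answer/authority/additional records
    question = b""
    for label in hostname.rstrip(".").split("."):
        question += bytes([len(label)]) + label.encode("ascii")
    question += b"\x00"                       # root label terminates the name
    question += struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question

query = build_dns_query("example.com")
print(len(query))  # the entire question fits in a few dozen bytes
# To actually send it, no handshake required:
#   import socket
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   s.sendto(query, ("8.8.8.8", 53))
```

One datagram out, one datagram back: that fire-and-forget exchange is the whole transaction, which is exactly why the connectionless, stateless, 8-byte-header protocol wins here.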