Biggest problems seen in DNSSEC operation in .gov


(Editor’s note:  Scott Rose, a National Institute of Standards and Technology computer scientist, is a DNS and DNSSEC expert and the NIST project lead for DNSSEC deployment.  He is a co-author of the DNSSEC RFCs and of NIST Special Publication 800-81, which defines DNSSEC requirements in the federal system.  Here, he shares observations from reviewing signed delegations throughout the U.S. government as it continues deploying DNSSEC; the views presented here are his own and do not necessarily represent the views or policies of NIST.)

The .gov top-level domain (TLD) was the first to mandate signing for a portion of its delegated children (all Federal delegations).  Now, 18 months after the TLD was signed, deployment at the lower levels has been a mixed bag.  While the majority of the signed delegations (over 600 seen) have been operating without problems, a minority of zones appear to have some sort of problem.  These zones vary from week to week and usually number around 12 to 20, with the number decreasing over time.  The problems we have seen can be grouped into three categories:

  1. KSK rollover issues have been the most common error so far, accounting for roughly three out of four problematic zones.  The subzone has rolled its KSK but has either failed to upload the new KSK to the parent or did so incorrectly.  In this case, there are a couple of options:  the subzone operator can either upload the correct KSK or re-sign the zone with the old KSK.  Either way, the zone will be invalid for a period of time.  On the parent side, one solution is to pull the subzone’s DS RR (which now points to a missing KSK).  The zone then becomes insecure (there is no secure delegation), but it is no longer invalid.
  2. Timing issues are the next most common error.  Either the signatures in the subzone have all expired or (in a few cases) they were published before they became valid.  If the zone’s signatures have expired (i.e., the RRSIG expiration time is in the past), the easy solution is to re-sign the zone with the current keys and republish it.  In the rare case that the signatures are published too soon (i.e., the inception time is in the future), the issue may be that the system used to sign the zone has an incorrect clock.  The solution there is to ensure that the signing system’s clock is correct before the zone is signed.
  3. Signed zones served on DNSSEC-unaware systems are (thankfully) becoming increasingly rare.  In this situation, the admin has correctly signed the zone but has not configured the server to be DNSSEC-aware, or the server cannot serve a DNSSEC-signed zone correctly.  Either way, the zone returns traditional DNS responses, and if there is a DS RR in .gov, the lack of RRSIGs in a response results in a validation failure.  The solution is to make sure that all of the authoritative servers for the zone (primary and secondary) are DNSSEC-capable and have been configured to send DNSSEC responses when queried.
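
The timing checks in the second category are simple to automate.  What follows is a minimal sketch in Python, not a description of any tool used for .gov: it assumes the RRSIG inception and expiration values are already in hand as the 14-digit YYYYMMDDHHMMSS UTC strings used in zone-file presentation format, and the function names are made up for illustration.

```python
from datetime import datetime, timezone


def parse_rrsig_time(text):
    """Parse an RRSIG timestamp given as YYYYMMDDHHMMSS (always UTC)."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_timing_status(inception, expiration, now=None):
    """Classify one signature's validity window relative to `now`.

    Returns "not-yet-valid" if the inception time is in the future,
    "expired" if the expiration time is in the past, and "valid" otherwise.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if now < parse_rrsig_time(inception):
        return "not-yet-valid"   # signer's clock may be wrong
    if now > parse_rrsig_time(expiration):
        return "expired"         # zone needs to be re-signed
    return "valid"
```

Passing `now` explicitly makes the check easy to test; leaving it out makes the result depend on the clock of the machine running the check, so that clock needs to be correct too.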

I believe most of these problems could have been prevented with better planning and education of operational staff before deployment.  Failing that, these issues are easily caught at the registry level by monitoring signed delegations.  The registry could warn of issues before they happen and, in the worst case, pull the DS RRs from the delegation to prevent validation errors until the delegation is fixed.  In the case of split registry/registrar models, the registrar could perform this function for the delegations it handles.  This may not prevent all the problems, since administrators still need to fix their own zones, but it keeps the zone from “going dark” for DNSSEC-enabled clients.  Not a perfect situation, but workable until the problem is fixed.
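
As a rough illustration of what such monitoring might decide, here is a hypothetical Python sketch.  The `DelegationSnapshot` record, the `registry_action` function, the action names, and the three-day warning window are all assumptions made for this example rather than anything the .gov registry is known to use.  Comparing key tags is also a simplification: a real check would compare the DS digest against the DNSKEY.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class DelegationSnapshot:
    """What a (hypothetical) monitor observed for one delegation."""
    ds_key_tags: set                          # key tags in the parent's DS RRset
    dnskey_key_tags: set                      # key tags of DNSKEYs the child serves
    has_rrsigs: bool                          # did DNSSEC queries return RRSIGs?
    earliest_expiration: Optional[datetime]   # soonest RRSIG expiration seen


def registry_action(snap, now, warn_window=timedelta(days=3)):
    """Return "ok", "warn", or "pull-ds" for one delegation."""
    if not snap.ds_key_tags:
        return "ok"        # no DS RR: insecure delegation, nothing to break
    if not snap.has_rrsigs or snap.earliest_expiration is None:
        return "pull-ds"   # category 3: signed zone on a DNSSEC-unaware server
    if not (snap.ds_key_tags & snap.dnskey_key_tags):
        return "pull-ds"   # category 1: DS points to a KSK that is gone
    if snap.earliest_expiration <= now:
        return "pull-ds"   # category 2: signatures have expired
    if snap.earliest_expiration - now <= warn_window:
        return "warn"      # signatures expire soon: notify the operator
    return "ok"
```

For example, a delegation whose DS key tag matches a served DNSKEY but whose signatures expire within a day would get "warn", giving the operator a chance to re-sign before validation starts to fail.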
