Unlike most DNS servers (such as BIND and NSD), which “compile” (i.e. check) the data they will be serving before successfully loading a zone, PowerDNS has to make do with what it finds in one of its sundry back-end databases. And let me tell you that it sometimes has to cope with very weird data. Here are some examples I’ve been finding recently:
- Zones with missing SOA records
- SOA records with impossible data in them (e.g. an mname with a colon in it), some of the SOA timers with a 0 in them, etc.
- Zones without NS records
- CNAME records with other data, which is forbidden
- Domain names with non-ASCII characters in them
- Domain names with white space in the names
- Impossible IP addresses for A (e.g. 10.1.2) and AAAA (e.g. 10.1.2.1) records
- Illegal RR types in the records table
- Differing case of domain names in the records table associated with a single zone
- Unqualified domain names in records table
- Qualified mname and rname fields in SOA records
These are just some of the highlights.
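To make the "impossible IP address" item concrete, here is a minimal Python sketch of the kind of check that would catch it. It is not taken from PowerDNS or from any provisioning system; the function name is_valid_address is made up for illustration, and it relies only on Python's standard ipaddress module.

```python
import ipaddress

def is_valid_address(rrtype: str, content: str) -> bool:
    """Return True if content is a well-formed address for an A or AAAA record."""
    # Pick the parser that matches the record type (hypothetical helper).
    parser = {"A": ipaddress.IPv4Address, "AAAA": ipaddress.IPv6Address}[rrtype.upper()]
    try:
        parser(content)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False
    return True

if __name__ == "__main__":
    samples = [("A", "10.1.2"), ("AAAA", "10.1.2.1"),
               ("A", "10.1.2.1"), ("AAAA", "2001:db8::1")]
    for rrtype, content in samples:
        print(rrtype, content, is_valid_address(rrtype, content))
```

The first two samples are the bad examples from the list above; the last two are well-formed addresses added for contrast.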
PowerDNS is lenient when you query it and it finds bad data; it tries to fix some of the stuff on the fly. On the one hand that is good, because the program helps you around your bad data; on the other hand it is really bad when you want that data served by a different brand of DNS server.
Case in point is the following scenario, where PowerDNS is configured as a hidden master server to an NSD, Yadifa, Knot, or BIND server:
The slave servers on the right of the diagram will obtain their data using zone transfers (AXFR) from PowerDNS, and PowerDNS will gladly give them what it has. However, as BIND, say, checks the validity of incoming data, it may refuse to load the zone.
I was recently at a client who had a lot of incorrect data in their database.
Let me repeat that: a lot. Several tens of thousands of zones that BIND would
have either refused to load or which would have caused it to go a bit bonkers.
(Look at the list above, and you may find some reasons why that would happen.)
I’ll show you a simple example. Consider the following row in the records table:
Suppose you query PowerDNS for an A resource record, what would you expect to see? The answer is
which is not what I would have expected. Be that as it may, it’s rubbish; nobody noticed when it was inserted years ago, and it hasn’t hurt anybody (because nobody queried that particular record, or nobody cared about the result), but it’s rubbish none the less. This single entry isn’t particularly painful, but imagine an SOA refresh timer set to a few seconds, on thousands of zones … I’ll let your imagination wander. :-)
We urgently needed to clean up the data in the back-end database, which we did. We concocted all sorts of programs which ran through and repaired stuff. End of story. Everybody is happy.
Almost everybody; I am not.
I’m convinced that this a) is going to happen to other people, and b) can be handled. But how?
Let me re-iterate: this is absolutely no fault of PowerDNS; it’s the fault of the “provisioning” systems which allow incorrect data into whatever PowerDNS uses as a back-end database. Some people have provisioning systems that perform very careful checks, but others don’t: they might use a command-line interface (mysql) to add data, or maybe they use a flashy Web interface that doesn’t do enough checking before issuing the final SQL INSERT INTO table ... or UPDATE table SET ... statement.
Be that as it may, bad data (from a DNS perspective) can all too easily make it into a bunch of database tables.
A longish example
Let me show you a few records for a (not so fictitious) zone called a1.aa, taken directly from a database which is being served by PowerDNS. I’ve recreated the records to protect the innocent.
How many pitfalls do you see there? Two? Five? More? There are quite a few, even if some are hard to spot.
As a first step, let us ask PowerDNS some questions. Remember please, I know things are going to break. Actually it’s rather cool how PowerDNS handles this stuff to avoid breakage…
Hmm. We’ve discussed the first issue already, but look at the second record in the reply: it’s good, in spite of the white space in the content column!
How about an ANY query?
Oops. The PowerDNS console (or log) shows the following correct and fair answer:
Let’s see what the (relatively new) pdnssec utility says about the underlying data:
All in all, what “looked” somewhat OK to us from the database, is going to cause a number of problems. Thankfully, PowerDNS has been tightening up on the somewhat lax rules it used to apply (or rather: not apply) to its back-end data. This is necessary in the transition to DNSSEC.
pdnssec’s check-zone doesn’t find all errors – it wasn’t designed to do that. For example, CNAME and other records weren’t caught if their owner names differed in case. That has meanwhile been fixed.
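The case problem is easy to reproduce outside of PowerDNS. The following is a minimal Python sketch, not pdnssec's actual logic: it groups rows by lowercased owner name and reports owners that carry a CNAME together with any other type. The function name cname_conflicts and the sample rows are made up for illustration, and the sketch deliberately ignores other problems such as two CNAMEs on one owner.

```python
from collections import defaultdict

def cname_conflicts(rows):
    """rows: iterable of (name, type) pairs.

    Returns the sorted, lowercased owner names that have a CNAME
    alongside at least one other record type.
    """
    types_by_owner = defaultdict(set)
    for name, rrtype in rows:
        # DNS names compare case-insensitively, so normalise before grouping.
        types_by_owner[name.lower()].add(rrtype.upper())
    return sorted(owner for owner, types in types_by_owner.items()
                  if "CNAME" in types and len(types) > 1)

if __name__ == "__main__":
    rows = [  # invented sample rows
        ("www.a1.aa", "CNAME"),
        ("WWW.A1.aa", "A"),
        ("mail.a1.aa", "A"),
    ]
    print(cname_conflicts(rows))
```

Comparing the names byte for byte instead of lowercasing them first would miss the conflict in the sample rows, which is exactly the kind of slip described above.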
Be that as it may, is there a solution for this?
Is there a cure?
Warning (insert guy shoveling pic)
I strongly believe the only solution is to enforce rules so that the data in the database tables used by PowerDNS is correct the moment it enters said database. I posted an idea to the PowerDNS mailing-list and feedback has been “mixed”. I won’t say we’ve been having fist fights, but only because the good #powerdns people on IRC and I were geographically distant from one another. ;-)
There are basically two schools of thought:
- Those that say: “make sure your data is good before it enters the database”
- Those that say: “make sure your data can be stored in the database only if it’s good”. That’s me. ;-)
To cut a very long story short, I’ve been prototyping a couple of MySQL User Defined Functions which, together with a few triggers, should ensure that illegal data cannot be inserted into the database tables.
It’s early days, but the prototype looks pretty good, if I may say so myself.
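If you want to get a feel for the UDF-plus-trigger idea without building a MySQL plugin, here is a rough Python sketch that swaps in SQLite, which lets you register a Python function as an SQL function. This is not my MySQL prototype: the table layout is reduced to three columns, the regular expression is a hypothetical and overly strict rule for owner names, and the stand-in function merely borrows the name checkname from one of the UDFs introduced below.

```python
import re
import sqlite3

# Hypothetical rule: dot-separated labels of letters, digits, "_" and "-".
LABEL = r"[a-z0-9_](?:[a-z0-9_-]{0,61}[a-z0-9])?"
OWNER_NAME = re.compile(rf"{LABEL}(?:\.{LABEL})*", re.IGNORECASE)

def checkname(name):
    """Stand-in for a name-checking UDF: 'Y' if the name looks sane, else 'N'."""
    ok = isinstance(name, str) and OWNER_NAME.fullmatch(name) is not None
    return "Y" if ok else "N"

def open_guarded_db():
    """Create an in-memory table whose INSERTs are vetted by a trigger."""
    conn = sqlite3.connect(":memory:")
    # Make sure application-defined functions may be called from triggers.
    conn.execute("PRAGMA trusted_schema = ON")
    conn.create_function("checkname", 1, checkname)
    conn.executescript("""
        CREATE TABLE records (name TEXT, type TEXT, content TEXT);
        CREATE TRIGGER records_check_name BEFORE INSERT ON records
        WHEN substr(checkname(NEW.name), 1, 1) <> 'Y'
        BEGIN
            SELECT RAISE(ABORT, 'invalid owner name');
        END;
    """)
    return conn

def try_insert(conn, name, rrtype, content):
    """Attempt an INSERT and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO records VALUES (?, ?, ?)", (name, rrtype, content))
    except sqlite3.DatabaseError as exc:
        return f"rejected: {exc}"
    return "stored"

if __name__ == "__main__":
    db = open_guarded_db()
    print(try_insert(db, "www.a1.aa", "A", "10.1.2.1"))
    print(try_insert(db, "bad name.a1.aa", "A", "10.1.2.1"))
```

The point of the design is that the check lives in the database itself, so it applies no matter which program issues the INSERT.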
Broadly speaking, there are two UDFs: one is called checkname, and the other is called checkrr. The former applies regular expressions (and spare me please; I know most of the jokes) to owner names, and the latter applies similar rules to the rdata (i.e. the content column). These rules are taken from a very lightweight and fast TinyCDB database on the fly, which is compiled from input such as this:
Each record type (NS, AAAA, etc.) can have any number of regular expressions applied to its name or content; if none of the rules match, the check fails.
For example, for the database record
the rulesets for NS:name are applied to the name column, and the rulesets for NS:content are applied to the content column. You can easily tell from the NS:content regular expression that the content check will fail for this value of content because it’s an IP address and not a domain name.
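The matching logic can be sketched in a few lines of Python. Treat this as an illustration under assumptions, not as the UDF itself: the HOSTNAME pattern is a simplistic rule I made up for this example (it would reject perfectly legal names such as those under an xn-- TLD), the rules live in a plain dictionary rather than a TinyCDB file, and check is a hypothetical helper name.

```python
import re

# Simplistic, invented pattern: dot-separated labels ending in an alphabetic TLD.
HOSTNAME = r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}"

# Rules are keyed "TYPE:column"; each key may hold any number of expressions.
RULES = {
    "NS:name": [re.compile(HOSTNAME, re.IGNORECASE)],
    "NS:content": [re.compile(HOSTNAME, re.IGNORECASE)],
}

def check(rrtype, column, value):
    """True if at least one rule for TYPE:column matches the whole value.

    A record type without any rules fails the check, because nothing matches.
    """
    rules = RULES.get(f"{rrtype.upper()}:{column}", [])
    return any(rule.fullmatch(value) for rule in rules)

if __name__ == "__main__":
    print(check("NS", "name", "a1.aa"))          # owner name of the NS record
    print(check("NS", "content", "ns1.a1.aa"))   # a domain name
    print(check("NS", "content", "10.1.2.1"))    # an IP address
```

With these invented rules the IP address fails the NS:content check simply because its last label is not alphabetic, so no expression matches the whole string.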
The special content rulesets @IP and @SOA perform inet_pton() and SOA checks respectively. The latter splits the SOA record into tokens and applies the SOA:mname, SOA:rname, … rules to the individual portions of the SOA record. As a special case, we can define, say, minimum and maximum values for the numeric portions of the SOA record. The example above specifies that the minimum for the refresh timer should be 600 seconds, and its maximum must be less than or equal to 7201 seconds. (Yes, I’ve also implemented an SOA:xxxx:equals rule.)
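Here is a minimal Python sketch of the tokenise-and-check idea for SOA content. It is an approximation, not the UDF's code: it assumes all seven SOA fields are present in the content column, it only checks the numeric fields (the mname and rname tokens would go through name rules), and of the bounds only the 600 to 7201 range for refresh comes from the text above. The names SOA_FIELDS, BOUNDS and check_soa are invented for the example.

```python
import re

SOA_FIELDS = ("mname", "rname", "serial", "refresh", "retry", "expire", "minimum")

# Inclusive (min, max) bounds per numeric field; only refresh is constrained here.
BOUNDS = {"refresh": (600, 7201)}

def check_soa(content):
    """Return a list of problems found in an SOA content string (empty means OK)."""
    tokens = content.split()
    if len(tokens) != len(SOA_FIELDS):
        return [f"expected {len(SOA_FIELDS)} fields, got {len(tokens)}"]
    soa = dict(zip(SOA_FIELDS, tokens))
    problems = []
    for field in SOA_FIELDS[2:]:  # serial and the four timers
        value = soa[field]
        if not re.fullmatch(r"[0-9]+", value):
            problems.append(f"{field} is not a number: {value!r}")
            continue
        low, high = BOUNDS.get(field, (0, 2**32 - 1))
        if not low <= int(value) <= high:
            problems.append(f"{field} {value} outside {low}..{high}")
    return problems

if __name__ == "__main__":
    print(check_soa("ns1.a1.aa hostmaster.a1.aa 2012010101 3600 900 604800 86400"))
    print(check_soa("ns1.a1.aa hostmaster.a1.aa 2012010101 5 900 604800 86400"))
```

Returning a list of problems rather than a bare yes or no makes it easier to tell the person who supplied the data what to fix.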
Let me show you what I have already. I’ll apply the UDFs to the records table from above.
The first character of the response of the checkname UDF contains a Y or N depending on whether the name is correct or not. (The rest of the string currently has debugging information in it, which is why I’m omitting that here; I’ve replaced that with manually added comments.)
Let’s now check the content column for the same records; this time I’ll leave the debug info in the column to save me typing it out:
Pay particular attention to row 36, where the check has failed because the refresh timer isn’t within specified bounds. Row 44 is also bad: there’s an IPv4 address in an AAAA record. And so on, and so on.
Adding these UDFs to a MySQL trigger then results in the following, when I try to insert such a record:
Interfaces
One of the very valid arguments on IRC was that all this would be useless if existing Web interfaces to PowerDNS (of which there are far too many) couldn’t benefit from it. Now, I know that many people use them; others have rolled their own interfaces or provision differently.
So the question is: how does, say, PowerAdmin, one of the more popular interfaces, react when the underlying database fails on an INSERT? The answer: it fails nicely. :-)
I’m going to leave this here for a bit, and let it all sink in. I’m not yet convinced this is a good idea, mainly due to the heavy use of, yeah, regular expressions. But who knows: with a bit of work, maybe this could turn into something useful. On the other hand, the use of MySQL UDFs deprives PostgreSQL users of the benefits of however much work we put into this.
If you feel this is a Good Thing, tell me. If, on the other hand, you feel this is a Stupid Idea, then by all means, tell me. I’m awaiting your feedback. :-)
Update: I’ve put what I have in the way of code up for grabs. Fix it, make it good, and send me lots of pull requests. :)