DNS Name Notation and Message Compression Technique

maimang09

Last modified: 2022-03-17 10:25:05

First published: 2022-03-17 10:15:15

Original link: http://www.tcpipguide.com/free/t_DNSNameNotationandMessageCompressionTechnique.htm

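This article describes the special notation that DNS (the Domain Name System) uses to represent domain names, subdomains and object names, along with the DNS message compression technique. In this notation, each label is preceded by one byte giving the number of characters in the label, followed by the label's characters, and the name ends with the zero-length root label. E-mail addresses are represented the same way, with the '@' treated as another dot. To reduce message size, DNS uses message compression, in which pointers replace repeated names and so save space. This technique works well for messages containing many domain names that share common elements.

Summary generated automatically by CSDN.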

The TCP/IP Guide - DNS Name Notation and Message Compression Technique

Obviously, the entire Domain Name System protocol is oriented around dealing with names for domains, subdomains and objects. We've seen in the preceding topics that there are many fields in DNS messages and resource records that carry the names of objects, name servers and so forth. DNS uses a special notation for encoding names in resource records and fields, a variation of this notation for e-mail addresses, and a special compression method that reduces the size of messages for efficiency.

Standard DNS Name Notation

In the section describing the DNS name space we saw how DNS names are constructed. Each node in the name hierarchy has a label associated with it. The fully-qualified domain name (FQDN) for a particular device consists of the sequence of labels that starts from the root of the tree and progresses down to that device. The labels at each level in the hierarchy are listed in sequence, starting with the highest level, from right to left, separated by dots. This results in the domain names we are used to working with, such as “www.xyzindustries.com”.

It would be possible to encode these names into resource records or other DNS message fields directly: put the letter “w” into each of the first three bytes of the name, then put a “.” into the fourth byte, an “x” into the fifth and so on. The disadvantage of this is that as a computer was reading the name, it wouldn't be able to tell when each name was finished. We would need to include a length field for each name.

Instead, DNS uses a special notation for DNS names. Each label is encoded one after the next in the name field. Before each label, a single byte is used that holds a binary number indicating the number of characters in the label. Then, the label's characters are encoded, one per byte. The end of the name is indicated by a null label, representing the root; this of course has a length of zero, so each name ends with a single byte whose value is 0 (the binary number zero, not the ASCII character “0”), indicating this zero-length root label.

Note that the “dots” between the labels aren't necessary, since the length numbers delineate the labels. The computer reading the name also knows how many bytes are in each label as it reads the name, so it can easily allocate space for the label as it reads it from the name.

For example, “www.xyzindustries.com” would be encoded as:

“[3] w w w [13] x y z i n d u s t r i e s [3] c o m [0]”

I have shown the label lengths in square brackets to distinguish them. Remember that these label lengths are binary encoded numbers, so a single byte can hold a value from 0 to 255; that “[13]” is one byte and not two, as you can see in Figure 252. Labels are actually limited to a maximum of 63 characters, and we'll see shortly why this is significant.

Figure 252: DNS Standard Name Notation

In DNS every named object or other name is represented by a sequence of label lengths and then labels, with each label length taking one byte and each label taking one byte per character. This example shows the encoding of the name “www.xyzindustries.com”.
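
The encoding just described can be sketched in a few lines of Python. This is a minimal illustration, not a full DNS implementation: the function name encode_name is made up for this example, and it assumes plain ASCII labels with no escaping.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels ending in a zero byte."""
    # Accept both "example.com" and the fully-qualified "example.com."
    if name.endswith("."):
        name = name[:-1]
    out = bytearray()
    if name:  # an empty string here means the root itself
        for label in name.split("."):
            raw = label.encode("ascii")
            if not 1 <= len(raw) <= 63:
                raise ValueError(f"label must be 1 to 63 bytes: {label!r}")
            out.append(len(raw))  # one byte holding the label length
            out += raw            # then one byte per character
    out.append(0)                 # zero-length root label ends the name
    return bytes(out)


encoded = encode_name("www.xyzindustries.com")
print(list(encoded[:5]))  # [3, 119, 119, 119, 13]
print(len(encoded))       # 23
```

The 23 bytes are 4 for “[3] w w w”, 14 for “[13] x y z i n d u s t r i e s”, 4 for “[3] c o m” and 1 for the terminating zero byte.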

DNS Electronic Mail Address Notation

Electronic mail addresses are used in certain DNS resource records, such as the RName field in the Start Of Authority resource record. E-mail addresses of course take the form “<name>@<domain-name>”. DNS encodes these in exactly the same way as regular DNS domains, simply treating the “@” like another dot. So, “johnny@somewhere.org” would be treated as “johnny.somewhere.org” and encoded as:

“[6] j o h n n y [9] s o m e w h e r e [3] o r g [0]”.

Note that there is no specific indication that this is an e-mail address. The name is interpreted as an e-mail address instead of a device name based on context.
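
As a rough sketch of the same idea in Python, the mailbox part can be treated as one more label in front of the domain's labels. The function name mailbox_to_wire is hypothetical, and the sketch assumes ASCII-only addresses; it also assumes the whole part before the “@” becomes a single label.

```python
def mailbox_to_wire(address: str) -> bytes:
    """Encode an e-mail address in DNS name notation, '@' acting as a dot."""
    local, sep, domain = address.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a mailbox address: {address!r}")
    # Assumption: the mailbox name is one label, followed by the domain labels.
    labels = [local] + domain.rstrip(".").split(".")
    out = bytearray()
    for label in labels:
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label must be 1 to 63 bytes: {label!r}")
        out.append(len(raw))  # length byte
        out += raw            # label characters
    out.append(0)             # root label terminator
    return bytes(out)


print(mailbox_to_wire("johnny@somewhere.org"))
# b'\x06johnny\tsomewhere\x03org\x00'   (\t is the byte value 9)
```

Nothing in the resulting bytes marks them as an e-mail address, which matches the point above that the interpretation depends on the field in which the name appears.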

DNS Message Compression

A single DNS message may contain many domain names. Now, consider that when a particular name server sends a response containing multiple domain names, they are all usually in the same zone, or are related to the zone. Most of these names will have common elements to their names.

Consider our previous mail example of a client asking for an MX record for “xyzindustries.com”. The response to this client will contain, among other things, these two records:

MX Record: An MX record that has “xyzindustries.com” as the Name of the record, and “mail.xyzindustries.com” in the RData field.
A Record: Assuming the name server knows the IP address of “mail.xyzindustries.com”, the Additional section will contain an A record that has “mail.xyzindustries.com” as the Name and its address in the RData field.

This is just one small example of name duplication; it can be much more extreme with other types of DNS messages, with certain string patterns being repeated many times. Normally this would require that each name be spelled out fully using the encoding method described above. But this would be wasteful, since a large portion of these names is common.

Using Message Compression to Avoid Duplication of a Full Name

To cut down on duplication, a special technique called message compression is used. Instead of a DNS name encoded as above using the combination of labels and label-lengths, a two-byte subfield is used to represent a pointer to another location in the message where the name can be found. The first two bits of this subfield are set to one (the value “11” in binary), and the remaining 14 bits contain an offset that specifies where in the message the name can be found, counting the first byte of the message (the first byte of the ID field) as 0.

Let's go back to our example. Suppose that in the DNS message above, the RData field of the MX record, containing “mail.xyzindustries.com”, begins at byte 47. In this first instance, we would find the name encoded in full as:

“[4] m a i l [13] x y z i n d u s t r i e s [3] c o m [0]”.

However, in the second instance, where “mail.xyzindustries.com” shows up in the Name field of the A record, we would instead put two “1” bits, followed by the number 47 encoded in binary. So, this would be the 16-bit binary pattern “11000000 00101111”, or the two numeric byte values “192” (0xC0) and “47” (0x2F). This second instance now takes 2 bytes instead of duplicating the 24 bytes needed for the first instance of the name.
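
The pointer arithmetic can be checked with a short Python sketch. The helper name make_pointer is invented for this illustration; the only assumption beyond the text is that the 16-bit value is written most significant byte first, which is what the byte values 192 and 47 imply.

```python
import struct


def make_pointer(offset: int) -> bytes:
    """Build a 2-byte compression pointer: bits '11' plus a 14-bit offset."""
    if not 0 <= offset <= 0x3FFF:  # 14 bits can hold 0 through 16383
        raise ValueError(f"offset does not fit in 14 bits: {offset}")
    # 0xC000 sets the two high bits; "!H" packs a big-endian unsigned 16-bit value.
    return struct.pack("!H", 0xC000 | offset)


ptr = make_pointer(47)
print(list(ptr))                              # [192, 47]
print(f"{int.from_bytes(ptr, 'big'):016b}")   # 1100000000101111
```

Because the offset field is 14 bits wide, a pointer can only refer to a name that starts within the first 16384 bytes of the message.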

How does a device reading a name field differentiate a pointer from a “real” name? This is the reason that “11” is used at the start of the field. Doing this guarantees that the first byte of the pointer will always have a value of 192 or larger. Since labels are restricted to a length of 63 or less, when the host reads the first byte of a name, if it sees a value of 63 or less in a byte, it knows this is a “real” name; a value of 192 or more means it is a pointer.
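
A reader can make this decision by looking only at the top two bits of the byte, as in the sketch below. The function name classify_first_byte is made up for this example. The text above covers values up to 63 and values of 192 and above; the handling of 64 through 191 as "reserved" follows RFC 1035, which sets those two bit combinations aside for future use.

```python
def classify_first_byte(value: int) -> str:
    """Classify a byte found where a label length is expected."""
    if not 0 <= value <= 255:
        raise ValueError("not a byte value")
    top_bits = value & 0xC0        # keep only the two highest bits
    if top_bits == 0xC0:           # 11xxxxxx: 192..255
        return "pointer"
    if top_bits == 0x00:           # 00xxxxxx: 0..63
        return "root" if value == 0 else "label"
    return "reserved"              # 01xxxxxx or 10xxxxxx: 64..191


for value in (13, 0, 192, 100):
    print(value, classify_first_byte(value))
# 13 label
# 0 root
# 192 pointer
# 100 reserved
```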

Using Message Compression to Avoid Duplication of Part of a Name

The example above shows how pointers can be used to eliminate duplication of a whole name: the name “mail.xyzindustries.com” was used in two places and a pointer was used instead of the second. Pointers are even more powerful than this, however. They can also be used to point to only part of a real name, or can be combined with additional labels to provide a compressed representation of a name related to another name in a resource record. This provides even greater space savings.

In the example above, this means that even the first instance of “mail.xyzindustries.com” can be compressed. Recall that the MX record will have “xyzindustries.com” in the Name field and “mail.xyzindustries.com” in the RData field. If the Name field of that record starts at byte 19, then we can encode the RData field as:

“[4] m a i l [pointer-to-byte-19]”.

The device reading the record will get “mail” for the first label and then read “xyzindustries.com” from the Name field to get the complete name, “mail.xyzindustries.com”.
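
The reading process can be sketched as a small Python decoder that follows pointers. Several things here are assumptions for illustration only: the function name read_name, the toy message layout (19 filler bytes, then the name at byte 19, then the compressed name immediately after it, whereas a real message would have other record fields in between), and the jump counter, which is added so a malformed message with a pointer loop cannot hang the reader.

```python
def read_name(message: bytes, offset: int) -> str:
    """Decode the name starting at offset, following compression pointers."""
    labels = []
    jumps = 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:
            # Pointer: the low 6 bits of this byte plus the next byte give the offset.
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            jumps += 1
            if jumps > len(message):   # safety guard, not part of the notation
                raise ValueError("compression pointer loop")
            continue
        if length & 0xC0:
            raise ValueError("reserved label type")
        if length == 0:                # root label: the name is complete
            return ".".join(labels)
        start = offset + 1
        labels.append(message[start:start + length].decode("ascii"))
        offset = start + length


# Toy message: bytes 0..18 are filler, "xyzindustries.com" starts at byte 19.
full_name = b"\x0dxyzindustries\x03com\x00"   # 19 bytes, offsets 19..37
compressed = b"\x04mail" + b"\xc0\x13"        # "[4] m a i l [pointer-to-byte-19]"
message = bytes(19) + full_name + compressed

print(read_name(message, 19))   # xyzindustries.com
print(read_name(message, 38))   # mail.xyzindustries.com
```

In this toy layout the compressed form takes 7 bytes (5 for “[4] m a i l” plus 2 for the pointer) instead of the 24 bytes of the fully spelled-out name.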

Similarly, suppose we had a record in this same message that contained a reference to the parent domain for “xyzindustries.com”, which is of course “com”. This could simply be encoded as: