Bit-fields

taotaost

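Some data does not need a whole storage unit (a unit of the declared type) and can be held in one or a few bits. An on/off switch, for example, has only the states 0 and 1, so a single bit is enough. To save space and keep such data easy to handle, C provides a construct called a "bit-field" (also "bit segment"). A bit-field declaration divides the bits of a storage unit into several regions and states the width of each region in bits. Each region has a name, and the program operates on it by that name.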

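I. Defining bit-fields and declaring bit-field variables

A bit-field definition resembles a structure definition:

　　struct bit_field_struct_name
　　{ bit-field list };
　　where each entry in the bit-field list has the form: type-specifier field-name : width;
　　For example: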
　　struct bs
　　{
　　    int a:8;
　　    int b:2;
　　    int c:6;
　　};
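　　Bit-field variables are declared the same way as structure variables: define the type first and declare the variable later, define and declare together, or declare directly with an untagged structure. For example: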
　　struct bs
　　{
　　    int a:8;
　　    int b:2;
　　    int c:6;
　　}data;

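Note: the storage unit here is an int, assumed to be 4 bytes (here and below). data is a variable of type struct bs and occupies 4 bytes in total: a takes 8 bits, b takes 2 bits, c takes 6 bits, and the remaining two bytes are unused. Whether a plain int bit-field is signed or unsigned is implementation-defined, so write signed or unsigned explicitly when the sign matters.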

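A few further points about bit-field definitions:

1. With the common compilers, a bit-field is stored inside a single storage unit and does not straddle two. When the space left in a unit is too small for the next bit-field, that field is placed at the start of the next unit. (The C standard leaves this choice to the implementation.)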

            struct
            {
                unsigned int a:4;
                unsigned int  :27;
                unsigned int b:4;
                unsigned int c:1;
            }d;

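The 27 unused bits after a leave only 1 bit free in the first 32, so b starts in the next 32-bit unit, and d occupies 8 bytes. A bit-field's width cannot exceed the width of its storage unit (32 bits here); a wider field would straddle two units however it was placed.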

2. A bit-field of width 0 forces alignment. For example:

　　struct bs
　　{
　　    unsigned int a:4;
　　    unsigned int    :0;
　　    unsigned int b:4;
　　    unsigned int c:1;
　　};

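In this definition a occupies the first 4 bits of the first storage unit and the remaining 28 bits are unused; b starts at the second storage unit and takes 4 bits, followed by c with 1 bit. Note: unnamed bit-fields and padding bits cannot be accessed by name, and their values are unspecified; they are not guaranteed to be 0. If the program looks at those bits at all (here, the last 28 of the first 32 bits), for example by examining the whole object as bytes, give them a known value first. The same applies to a, b and c: assign them before reading them.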

3. Big-endian and little-endian byte order

    union {
        struct  
        {
            unsigned char a1:2;
            unsigned char a2:3;
            unsigned char a3:3;
        }x;
        unsigned char b;
    }d;
    int main(int argc, char* argv[])
    {
        d.b = 100;
        return 0;
    }

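        100 in binary is 0110 0100, so the fields map as follows:
               <<<<<<-- increasing memory address
               a3   a2 a1
               011 001 00
        The arrow is drawn this way because 011 sits in the high-order bits: on the compilers assumed here the first-declared field a1 takes the lowest bits, so a1 = 0, a2 = 1 and a3 = 3. (The order in which bit-fields are allocated within a unit is implementation-defined.) A related and common task is converting a binary number to dotted-decimal notation, which raises the question of big-endian versus little-endian order. That question did not arise above because the example is a single byte; it appears as soon as a value spans several bytes. On our computers endianness is defined in terms of bytes, although strictly speaking it can also be defined at the bit level. The Wikipedia article gives a diagram of bit-level endianness: http://en.wikipedia.org/wiki/Endianess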

Next we look at byte-level endianness. Consider the following code:

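    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        int a = 0x12345678;
        char *p = (char *)&a;   /* p[0] is the byte at the lowest address */
        char str[20];
        sprintf(str,"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);
        printf("%s\n", str);    /* do not pass data as the format string */
        return 0;
    }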

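Assume the program runs on a little-endian machine. What is the result, and how are the bytes laid out? Each byte has 8 bits, so 0x12345678 splits into four bytes, from most significant to least significant: 12, 34, 56, 78. In memory they are placed like this:
---->>>>>> increasing memory address
78 56 34 12

Because this is little-endian, the lowest address holds the least significant byte, which gives the layout above.

The next question is where p points. Does it point at the "start" of 0x12345678, the byte 12? No. 12 is the start of the number as we write it, not the start of its memory. p[1] is *(p + 1), so the address of p[1] is one higher than that of p[0]: indexing through p walks upward through memory, and p[0] is the byte at the lowest address, 0x78.

sprintf(str,"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);

str therefore holds: 120.86.52.18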

What about the reverse case?

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        int a = 0x87654321;
        char *p = (char *)&a;   /* plain char: may be signed */
        char str[20];
        sprintf(str,"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);
        printf("%s\n", str);
        return 0;
    }

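Again on a little-endian machine, the result is:
33.67.101.-121
Why is the last number negative? Because plain char is signed on this platform (the signedness of char is implementation-defined). The byte is 0x87, that is 135; this exceeds 127, so it reads as 135 - 256 = -121. To get a positive value, do this: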

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        int a = 0x87654321;
        unsigned char *p = (unsigned char *)&a;   /* bytes read as 0..255 */
        char str[20];
        sprintf(str,"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);
        printf("%s\n", str);
        return 0;
    }

Use an unsigned type. Result:
33.67.101.135

4. Signedness of bit-fields: a signed bit-field has a sign, and the sign is carried by its highest bit.

Program 1:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    int main(int argc, char** argv)
    {
        union
        {
            struct
            {
                unsigned char a:1;
                unsigned char b:2;
                unsigned char c:3;
            }d;
            unsigned char e;
        } f;
        f.e = 1;
        printf("%d\n",f.d.a);
        return 0;
    }

Program 2:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    int main(int argc, char** argv)
    {
        union
        {
            struct
            {
                char a:1;
                char b:2;
                char c:3;
            }d;
            char e;
        } f;
        f.e = 1;
        printf("%d\n",f.d.a);
        return 0;
    }

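Both run on a little-endian machine. The first prints 1, as expected; the second prints -1 (on a platform where plain char is signed).

A signed field a that is only 1 bit wide has the range -1 to 0. There is no 1 because its only bit is the sign bit: if that bit is 0 the value is 0, and if it is 1 the value is -1. It works the same way as char, which where signed ranges from -128 to 127, 256 values in total: two's complement has no negative zero, so the bit pattern 1000 0000 is free to represent -128.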

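5. Integer promotion

The final print uses %d, which expects an int, so the bit-field value is promoted. Whether the destination type is signed or unsigned, a signed source value is extended with copies of its own sign bit, and the value is unchanged as long as it is read with the same signedness. Negative values are padded with 1s and non-negative values with 0s, which is the only way to preserve the value. For instance, promoting a signed char to int prepends 24 copies of the char's highest bit: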

     char c = 0xf0; 
     int p = c;
     printf("%d %d\n",c,p);

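Output: -16 -16
p is in fact 0xfffffff0; because it is negative, its magnitude is found by inverting the bits and adding 1, which gives 16. c is negative, so when it is converted to p the high-order bits are all filled with 1 and the value does not change. Now consider: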

 char c = 0xf0;
 unsigned int x = c;
 printf("%u\n",x);

The result is 4294967280, that is 0xfffffff0. Remember to print unsigned values with %u.

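6. The address cannot be taken

Finally, a bit-field is a piece of a storage unit and has no address of its own, so the & operator cannot be applied to it.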

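II. Using bit-fields

Bit-fields are used like structure members, in the general form variable.field, and they can be printed with the usual output formats.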

#include <stdio.h>
#include <stdlib.h>
#include <string.h> 

int main(void){
       struct bs{
           unsigned a:1;
           unsigned b:3;
           unsigned c:4;
       }bit,*pbit;

       bit.a=1;
       bit.b=7;
       bit.c=15;
       printf("%d,%d,%d\n",bit.a,bit.b,bit.c);
       pbit=&bit;   /* pointer to the whole structure, not to a field */
       pbit->a=0;
       pbit->b=3;
       pbit->c=12;
       printf("%d,%d,%d\n",pbit->a,pbit->b,pbit->c);
       return 0;
}
Output:
1,7,15
0,3,12

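The program above defines the bit-field structure bs with the three fields a, b and c, and declares a variable bit of type struct bs and a pointer pbit to struct bs. This shows that bit-fields can be accessed through a pointer to the enclosing structure; a pointer to an individual bit-field is still not allowed.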

Appendix

Finally, here is a passage from The C Book:
While we're on the subject of structures, we might as well look at bitfields. They can only be declared inside a structure or a union, and allow you to specify some very small objects of a given number of bits in length. Their usefulness is limited and they aren't seen in many programs, but we'll deal with them anyway. This example should help to make things clear:

    struct {
          /* field 4 bits wide */
          unsigned field1 :4;
          /*
           * unnamed 3 bit field
           * unnamed fields allow for padding
           */
          unsigned        :3;
          /*
           * one-bit field
           * can only be 0 or -1 in two's complement!
           */
          signed field2   :1;
          /* align next field on a storage unit */
          unsigned        :0;
          unsigned field3 :6;
    }full_of_fields;

Each field is accessed and manipulated as if it were an ordinary member of a structure. The keywords signed and unsigned mean what you would expect, except that it is interesting to note that a 1-bit signed field on a two's complement machine can only take the values 0 or -1. The declarations are permitted to include the const and volatile qualifiers. The main use of bitfields is either to allow tight packing of data or to be able to specify the fields within some externally produced data files. C gives no guarantee of the ordering of fields within machine words, so if you do use them for the latter reason, your program will not only be non-portable, it will be compiler-dependent too. The Standard says that fields are packed into ‘storage units’, which are typically machine words. The packing order, and whether or not a bitfield may cross a storage unit boundary, are implementation defined. To force alignment to a storage unit boundary, a zero width field is used before the one that you want to have aligned. Be careful using them. It can require a surprising amount of run-time code to manipulate these things and you can end up using more space than they save. Bit fields do not have addresses—you can't have pointers to them or arrays of them.