String Operations

新一下的兰天

Categories: C++, C

Article link: https://blog.csdn.net/maoliran/article/details/51863302

I. String initialization
1. Fixed-length character arrays

char buf1[128] = {'a', 'b', 'c', 'd'};
printf("sizeof(buf1) = %zu\n", sizeof(buf1));    // 128
printf("strlen(buf1) = %zu\n", strlen(buf1));    // 4
printf("buf1[66] : %d\n", buf1[66]);    // 0
printf("buf1 : %s----\n", buf1);    // abcd----

This defines a 128-element character array and initializes only the first 4 characters; all remaining elements are zero-initialized.

char buf5[128] = "abcd";
printf("sizeof(buf5) = %zu\n", sizeof(buf5));    // 128
printf("strlen(buf5) = %zu\n", strlen(buf5));    // 4
printf("buf5[66] : %d\n", buf5[66]);    // 0
printf("buf5 : %s----\n", buf5);    // abcd----
printf("-------------------------\n");

2. Character arrays of unspecified length

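char buf2[] = { 'a', 'b', 'c', 'd' };
printf("sizeof(buf2) = %zu\n", sizeof(buf2));    // 4
printf("strlen(buf2) = %zu\n", strlen(buf2));    // undefined behavior: no terminator, so the scan runs on until a 0 byte is found (the author observed 16)
printf("buf2 : %s----\n", buf2);     // the author observed abcd烫烫烫烫abcd----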

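An array whose size is deduced from a brace-enclosed character list does not get a 0 appended after the last character. Printing it with %s therefore shows garbage after the real characters and keeps reading until a 0 byte happens to turn up in memory. Strictly speaking this reads past the end of the array, which is undefined behavior.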

char buf3[] = "abcd";
printf("sizeof(buf3) = %zu\n", sizeof(buf3));    // 5
printf("strlen(buf3) = %zu\n", strlen(buf3));    // 4
printf("buf3 : %s----\n", buf3);    // abcd----

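A string literal always ends with an implicit 0 terminator, so buf3 occupies the four characters plus one byte for the terminator.
The four-region memory diagram looks like this:
[Figure: four-region memory diagram]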

3. Initialization through a character pointer

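const char *buf4 = "abcd";    // points at a string literal, which must not be modified
printf("sizeof(buf4) = %zu\n", sizeof(buf4));    // size of the pointer itself: 4 on a 32-bit build, 8 on a 64-bit build
printf("strlen(buf4) = %zu\n", strlen(buf4));    // 4
printf("buf4 : %s----\n", buf4);    // abcd----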

Its four-region memory diagram:
[Figure: four-region memory diagram]

II. Accessing a string through array subscripts and pointers

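void test2() {
    char buf[128] = "abcdefg";
    char *p = NULL;
    size_t len = strlen(buf);    // compute the length once
    for (size_t i = 0; i < len; i++)
        printf("%c ", buf[i]);    // array subscript
    printf("\n");
    p = buf;    // buf converts to char*; &buf would have type char (*)[128]
    for (size_t i = 0; i < len; i++)
        printf("%c ", *(p + i));    // pointer plus offset
    printf("\n");
    for (size_t i = 0; i < len; i++)
        printf("%c ", *(buf + i));    // array name plus offset
    printf("\n");
}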

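Output:

[Screenshot: program output]

Array subscripting is essentially the same thing as pointer arithmetic; the subscript form is simply closer to how programmers are used to writing code: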

buf[i]  ==>  buf[0 + i]  ==>  *(buf + i)

This is exactly the rewrite the compiler performs.

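buf behaves like a constant pointer: it is essentially the starting address of a block of memory, and that address cannot be changed. In other words, an operation such as buf++; is not allowed:

[Screenshot: the resulting error]

The author's rationale is this: when the function returns, the system releases buf's storage based on its starting address and size. If buf could be incremented so that it referred to, say, the character 'e', the bytes holding abcd could no longer be released correctly, so the language simply forbids changing what buf refers to. More precisely, buf is an array rather than a pointer variable: there is no separate object holding an address, the name merely converts to a pointer to its first element in most expressions, and C treats an array name as a non-modifiable lvalue.

This is also the difference between a starting address (buf) and an ordinary pointer (p).

An ordinary pointer supports the following operation:

p = p + i;

A starting address does not.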

III. Memory model of first-level string pointers

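void test3() {
    char buf1[20] = "aaaa";    // stack: 20-byte array initialized from the literal
    char buf2[] = "bbbb";    // stack: 5-byte array
    const char *p1 = "1111111";    // p1 is on the stack, the literal is in the global (constant) region
    char *p2 = (char*)malloc(100);    // p2 is on the stack, the 100-byte block is on the heap
    if (p2 == NULL)
        return;
    strcpy(p2, "33333");
    free(p2);    // release the heap block before returning
}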

Memory model diagram:

[Figure: memory model diagram]

The strcpy call copies the string "33333" from the global region into the heap block.

IV. Evolution of the string copy operation

1.

void copy1() {
    char a[] = "i am a student";
    char buf[64];
    int i = 0;
    for (i = 0; a[i] != '\0'; i++)
        buf[i] = a[i];    // copies every character except the terminator
    buf[i] = '\0';    // terminate the copy by hand
    printf("a : %s\n", a);
    printf("buf : %s\n", buf);
}

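Output:

[Screenshot: program output]

This is the simplest traversal copy. One point needs attention: the loop exits when a[i] == '\0', so the terminator itself is never copied into buf. The element after the last copied character must be set to 0 by hand; otherwise printing buf keeps going until a '\0' happens to turn up in memory. Without the line

buf[i] = '\0';

the output is:

[Screenshot: program output]

that is, garbage characters appear after the copied text.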

2. The following versions are all implemented as functions (interfaces), which is what production code in a company requires

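#include <stdio.h>
#include <stdlib.h>

void copy2(const char *from, char *to) {
    for (; *from != '\0'; from++, to++) {
        *to = *from;
    }
    *to = '\0';    // the loop stops before copying the terminator
}

int main(){
    const char *from = "abcdefg";
    char buf[64];
    copy2(from, buf);
    printf("buf : %s\n", buf);
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}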

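Output:

[Screenshot: program output]

This applies the technique of modifying the caller's data indirectly through a pointer. The four-region memory diagram:

[Figure: four-region memory diagram]

At the start, the from pointers in both main and copy2 point at the first character of "abcdefg" in the global region, and to points at the first element of main's buf. During the loop, copy2's from and to keep advancing until from points at a byte holding 0. The loop exits at that point without copying the 0 into the memory to points at, so the terminator still has to be written by hand.

3.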

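#include <stdio.h>
#include <stdlib.h>

void copy3(const char *from, char *to) {
    for (; *from != '\0';) {
        *to++ = *from++;    // copy one character, then advance both pointers
    }
    *to = '\0';
}

int main(){
    const char *from = "abcdefg";
    char buf[64];
    copy3(from, buf);
    printf("buf : %s\n", buf);
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}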

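The output is the same as above. Here the assignment and the ++ operations are merged into one statement:

*to++ = *from++;

Postfix ++ has higher precedence than the * operator, so each ++ binds to the pointer rather than to the character it points at. Because it is a post-increment, the expression still yields the old pointer value, so the statement behaves like: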

*to = *from;
from++;
to++;

4. Further evolution

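#include <stdio.h>
#include <stdlib.h>

void copy4(const char *from, char *to) {
    while ((*to = *from) != '\0') {    // copy first, then test what was copied
        from++;
        to++;
    }
}

int main(){
    const char *from = "abcdefg";
    char buf[64];
    copy4(from, buf);
    printf("buf : %s\n", buf);
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}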

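The output is again the same. This version no longer needs to append the terminator by hand: the assignment in the while condition copies the '\0' before the loop exits.

5. Further evolution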

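#include <stdio.h>
#include <stdlib.h>

void copy5(const char *from, char *to) {
    while ((*to++ = *from++)) {    // the assignment is the condition; the loop ends after '\0' is copied
    }
}

int main(){
    const char *from = "abcdefg";
    char buf[64];
    copy5(from, buf);
    printf("buf : %s\n", buf);
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}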

The output is still the same.

The code gets more and more compact; this is the whole evolution of the string copy.

V. Hardening the string copy

1. Never copy into NULL

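#include <stdio.h>
#include <stdlib.h>

void copy5(const char *from, char *to) {
    while ((*to++ = *from++)) {
    }
}

int main(){
    const char *from = "abcdefg";
    char buf[64];
    {
        char* to = NULL;
        copy5(from, to);    // deliberately wrong: writes through a NULL pointer
    }
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}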

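Running this code produces an error:

[Screenshot: runtime error]

NULL is mainly used to avoid wild-pointer problems. On typical systems the address that NULL refers to is protected by the operating system, and nothing may be written there; formally, dereferencing a null pointer is undefined behavior. So as soon as a copy statement such as

*to = *from;

executes, the program crashes.

The code therefore needs to be improved: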

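#include <stdio.h>
#include <stdlib.h>

// Returns 0 on success, -1 if either pointer is NULL.
int copy6(const char *from, char *to) {
    if (to == NULL || from == NULL)
        return -1;
    while ((*to++ = *from++)) {
    }
    return 0;
}

int main(){
    int ret = 0;
    const char *from = "abcdefg";
    char buf[64];
    {
        char* to = NULL;
        ret = copy6(from, to);
        if (ret != 0)
            printf("func copy6 err: %d\n", ret);
    }
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}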

Output:

[Screenshot: program output]

The incoming pointers must be checked for NULL, and the return value tells the caller whether the function succeeded.

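2. Use temporary pointer variables when traversing through pointers inside a function

Consider a function without temporary pointer variables: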

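#include <stdio.h>
#include <stdlib.h>

int copy7_err(const char *from, char *to) {
    if (to == NULL || from == NULL)
        return -1;
    while ((*to++ = *from++)) {
    }
    // Deliberately wrong: from has been advanced one past the terminator,
    // so this read is outside the string (undefined behavior).
    printf("from : %s\n", from);
    return 0;
}

int main(){
    const char *from = "abcd";
    char buf[64];
    printf("copy7 begin\n");
    copy7_err(from, buf);
    printf("copy7 end\n");

    system("pause");    // Windows-only: keeps the console window open
    return 0;
}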

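Output:
[Screenshot: program output]

As you can see, printing from shows nothing. copy7_err keeps advancing from, and because the post-increment also runs on the iteration that copies the terminator, from ends up one past the terminating 0 rather than on it. Printing it therefore reads memory beyond the string, which is undefined behavior; in this run it happened to print nothing. Whenever a function needs the original pointer again after the loop, for example to print from, it should traverse with temporary pointer variables instead: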

#include <stdio.h>
#include <stdlib.h>

int copy7_good(const char *from, char *to) {
    if (to == NULL || from == NULL)
        return -1;
    const char *fromtemp = from;    // traverse with temporaries
    char *totemp = to;
    while ((*totemp++ = *fromtemp++)) {
    }
    printf("from : %s\n", from);    // from still points at the start of the string
    return 0;
}

int main(){
    const char *from = "abcd";
    char buf[64];
    printf("copy7 begin\n");
    copy7_good(from, buf);
    printf("copy7 end\n");

    system("pause");    // Windows-only: keeps the console window open
    return 0;
}

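Output:

[Screenshot: program output]

With temporary pointer variables, printing from works as expected: from still points at the start of the string because only the temporaries were advanced. Modifying a pointer parameter and then reusing it is a very common mistake, so keep this pattern in mind.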

VI. A project-style string function model

1. Substring search (strstr)

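#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Counts the non-overlapping occurrences of substr in mystr.
int strstr_interface(const char *mystr /*in*/, const char *substr /*in*/, int *count /*out*/) {
    int ret = 0;
    if (mystr == NULL || substr == NULL || count == NULL) {
        ret = -1;
        printf("func strstr_interface() mystr == NULL || substr == NULL || count == NULL err : %d\n", ret);
        return ret;
    }
    const char *mystrtemp = mystr;
    size_t sublen = strlen(substr);
    int counttemp = 0;
    // An empty substr matches everywhere and would never advance, so it counts as 0 matches.
    while (sublen > 0 && (mystrtemp = strstr(mystrtemp, substr)) != NULL) {
        mystrtemp += sublen;    // skip past this match
        counttemp++;
        if (*mystrtemp == '\0')    // reached the end of mystr
            break;
    }
    *count = counttemp;
    return ret;
}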

int main(){
    int ret = 0;
    const char* mystr = "11abcd2117732abcd093902shabcd239sdfjajqqabcd";
    const char* substr = "abcd";
    int count = 0;
    ret = strstr_interface(mystr, substr, &count);
    if (ret != 0) {
        printf("func strstr_interface() err: %d\n", ret);
    }
    printf("count : %d\n", count);    // 4
    system("pause");    // Windows-only: keeps the console window open
    return 0;
}