How C++ overcomes the weak spot of C strings

 

#include <cstdlib>   // system
#include <iostream>
#include <string>    // std::string
using namespace std;

int main()
{
    string s1 = "IBM";
    string s2 = "123";
    string *p;
    char *pc;
   
    cout << "s1 = " << &s1 << endl;
    cout << "s1 = " << s1 << endl;
    cout << "s2 = " << &s2 << endl;
    cout << "s2 = " << s2 << endl;
    cout << endl;
   
    cout << "s2 += \"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"" << endl;
    s2 += "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    cout << "s1 = " << &s1 << endl;
    cout << "s1 = " << s1 << endl;
    cout << "s2 = " << &s2 << endl;
    cout << "s2 = " << s2 << endl;
    cout << endl;
   
    cout << "0x22ff50 - 0x22ff40 = " << 0x22ff50 - 0x22ff40 << endl;
    p = &s2;
   
    cout << "&s2 = " << &s2 << endl;
    cout << "s2[4] = " << s2[4] << endl;
   
    cout << "sizeof(p) = " << sizeof(p) << endl;
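    // p is a string*, so p+4 advances 4 * sizeof(string) bytes, not 4 characters.
    cout << "p+4 = " << p+4 << endl;
    // Undefined behavior: p+4 does not point at s2. It only happened to land on s1 here.
    cout << "*(p+4) = " << *(p+4) << endl;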
    cout << endl;
  
    cout << "(p+1) = " << p+1 << endl;
    //cout << "*(p+1) = " << *(p+1) << endl;
    // This is an error that still compiles: p+1 does not point at a string object.
   
    cout << endl;
   
    cout << "s2[3] = " << s2[3] << endl;
    cout << "&s2[3] = " << &s2[3] << endl;
    pc = &s2[0];
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;
    pc += 1;
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;
    pc += 1;
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;
    pc += 1;
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;
    pc += 1;
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;

    cout << pc << endl;
    cout << endl;
    pc = &s2[3];     
   
    cout << "pc = s2[3]: " << endl;
    cout << "*pc = " << *pc << endl;
    cout << "pc = " << pc << endl;
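    // As shown, streaming a char* never prints the address stored in the pointer; it prints the string.
    // (Casting to void* does print the address: static_cast<void *>(pc).)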
   
    char p1[3] = {'A', 'B', '\0'};
    char *p2 = p1;
    cout << "*p2 = " << *p2 << endl;
    cout << "p2 = " << p2 << endl;
   
    system("pause");  // Windows-only: keeps the console window open
    return 0;
}
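
The comment near the end of the listing is about getting the address out of a `char *` instead of the text. A minimal sketch of the usual way to do that, casting to `void *` before streaming, is below. The helper names `streamed_text` and `streamed_address` are made up for this illustration, and the exact address format is implementation-defined.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns what operator<< prints for a char* (the text).
std::string streamed_text(const char *pc)
{
    assert(pc != nullptr);  // streaming a null char* is undefined behavior
    std::ostringstream out;
    out << pc;              // char* overload: prints characters up to '\0'
    return out.str();
}

// Hypothetical helper: returns what operator<< prints for the address itself.
std::string streamed_address(const char *pc)
{
    std::ostringstream out;
    out << static_cast<const void *>(pc);  // void* overload: prints the address
    return out.str();
}
```

With `s2` from the listing, `streamed_text(&s2[3])` gives the tail of the string, while `streamed_address(&s2[3])` gives the address that `pc` actually holds.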

 

Output:

s1 = 0x22ff50

s1 = IBM

s2 = 0x22ff40

s2 = 123

 

s2 += "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

s1 = 0x22ff50

s1 = IBM

s2 = 0x22ff40

s2 = 123ABCDEFGHIJKLMNOPQRSTUVWXYZ

 

0x22ff50 - 0x22ff40 = 16

&s2 = 0x22ff40

s2[4] = B

sizeof(p) = 4

p+4 = 0x22ff50  ---> the weak spot inherited from C: p+4 is 4 * sizeof(string) = 16 bytes past &s2, which is exactly &s1

*(p+4) = IBM

 

(p+1) = 0x22ff44  ---> dereferencing this address compiles, but the access causes a segmentation fault at run time.

 

s2[3] = A

&s2[3] = ABCDEFGHIJKLMNOPQRSTUVWXYZ   ---> as in C, a char * (or a char array) is treated as a string: the characters are read starting at that address.

*pc = 1

pc = 123ABCDEFGHIJKLMNOPQRSTUVWXYZ

*pc = 2

pc = 23ABCDEFGHIJKLMNOPQRSTUVWXYZ

*pc = 3

pc = 3ABCDEFGHIJKLMNOPQRSTUVWXYZ

*pc = A

pc = ABCDEFGHIJKLMNOPQRSTUVWXYZ

*pc = B

pc = BCDEFGHIJKLMNOPQRSTUVWXYZ

BCDEFGHIJKLMNOPQRSTUVWXYZ

 

pc = s2[3]:

*pc = A

pc = ABCDEFGHIJKLMNOPQRSTUVWXYZ

*p2 = A

p2 = AB
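
The `p+4` line in the output follows from pointer arithmetic being scaled by the size of the pointed-to type: on that build the addresses imply `sizeof(string)` is 4, so `p+4` is 16 bytes past `&s2`. A small sketch of the scaling, using a real array so that the arithmetic is well-defined, is below. `bytes_advanced` is a hypothetical helper, not something from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: how many bytes does "base + n" move for a string*?
// Only valid when base points into an array with at least n elements after it
// (or exactly n, giving the one-past-the-end pointer).
std::ptrdiff_t bytes_advanced(const std::string *base, std::ptrdiff_t n)
{
    assert(base != nullptr);
    const std::string *moved = base + n;  // steps n * sizeof(std::string) bytes
    return reinterpret_cast<const char *>(moved) -
           reinterpret_cast<const char *>(base);
}
```

For an array `std::string arr[5]`, `bytes_advanced(arr, 4)` equals `4 * sizeof(std::string)`. In the listing, `p` points at a single object rather than an array, so `*(p+4)` is undefined behavior even though it printed `IBM`.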

 

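Summary: C handles strings by type. Given a char * or a char[], it treats the address directly as the start of a string.

That means it reads the first memory location the pointer refers to, interprets each 8-bit value as a char, and looks it up in the ASCII table to get the character.

The pointer keeps moving forward until it reaches '\0'.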
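
The scan described above can be sketched as a loop, roughly what `strlen` does. `count_until_nul` is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: walk a C string byte by byte until '\0'.
std::size_t count_until_nul(const char *p)
{
    assert(p != nullptr);
    std::size_t n = 0;
    while (*p != '\0') {  // read one char (one byte) at the current address
        ++p;              // move to the next byte
        ++n;
    }
    return n;
}
```

Nothing in the loop knows how large the underlying array is. If the '\0' is missing, the scan simply runs on into whatever memory follows.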

 

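How C++ overcomes the weak spot of C strings:

In C, appending to the string s2 is done with strcat (or with strcpy into the end of s2), and the programmer must make sure that the character array behind s2 is large enough.

If it is not, the write runs past the end of the array, and the stack layout determines after how many bytes it starts overwriting s1. This is what makes C strings dangerous.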
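
A minimal sketch of the check that C leaves to the programmer is below. `append_checked` is a hypothetical helper, and it assumes `dst` already holds a '\0'-terminated string inside an array of `capacity` bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to dst only if the result, including the
// terminating '\0', fits into capacity bytes. Returns false and leaves dst
// untouched otherwise.
bool append_checked(char *dst, std::size_t capacity, const char *src)
{
    assert(dst != nullptr && src != nullptr);
    const std::size_t used = std::strlen(dst);
    const std::size_t extra = std::strlen(src);
    if (used + extra + 1 > capacity) {
        return false;  // would write past the end of the array
    }
    std::memcpy(dst + used, src, extra + 1);  // also copies the '\0'
    return true;
}
```

With `char s2[8] = "123";`, appending the 26 letters is refused, whereas a plain `strcat` would write past the array and could overwrite a neighbouring variable such as s1.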

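C++ works like this. Before the append, s2 holds the characters {'1', '2', '3', '\0'}.

s2 += "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

Afterwards s2 holds {'1', '2', '3', 'A', 'B', ..., 'Z', '\0'}, but these characters are not stored inside the s2 object on the stack.

A string object stores a pointer to a separately allocated buffer. The details are implementation-defined; in this build the object is only 4 bytes, the size of one pointer. When the buffer is too small, += allocates a larger one, copies the old characters and the appended literal into it, and updates the pointer. The literal is copied, not referenced.

The buffer is memory without a name of its own, but its address is available through &s2[0] or s2.data().

This explains why s1 is not overwritten even though the gap between s1 and s2 is too small for the new contents. The C approach of strcat into a fixed-size array would overwrite s1.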
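
What `+=` does to s2 and its neighbour can be observed with a small sketch like the one below. `AppendReport` and `append_and_report` are hypothetical names for this illustration. Whether the character buffer moves is implementation-dependent, so the sketch only reports it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical report type for this illustration.
struct AppendReport {
    bool buffer_moved;      // did the character buffer change address?
    bool neighbour_intact;  // is the other string still unchanged?
    std::size_t size;
    std::size_t capacity;
};

// Hypothetical helper: append extra to target and record what happened.
AppendReport append_and_report(std::string &target, const std::string &neighbour,
                               const char *extra)
{
    assert(extra != nullptr);
    const std::string neighbour_before = neighbour;  // snapshot for comparison
    const auto buffer_before = reinterpret_cast<std::uintptr_t>(target.data());
    target += extra;  // the string grows its own buffer as needed
    const auto buffer_after = reinterpret_cast<std::uintptr_t>(target.data());

    AppendReport report;
    report.buffer_moved = buffer_after != buffer_before;
    report.neighbour_intact = neighbour == neighbour_before;
    report.size = target.size();
    report.capacity = target.capacity();
    return report;
}
```

Called as `append_and_report(s2, s1, "ABCDEFGHIJKLMNOPQRSTUVWXYZ")`, the report shows a size of 29 and an untouched s1, no matter how close the two objects sit on the stack.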

 

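The difference between *(0x22ff34) and *p: 0x22ff34 is an integer literal, a plain number, while p is a named pointer variable whose storage holds an address.

The * operator needs an operand of pointer type, so *(0x22ff34) does not compile. The number has to be converted to a pointer first, for example *reinterpret_cast<char *>(0x22ff34), and dereferencing an arbitrary address like this is undefined behavior.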
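
A sketch of the only safe variant of this idea is below: the number must come from a valid pointer in the same program and be converted back before `*` is applied. `address_of` and `read_char_at` are hypothetical helpers, and `std::uintptr_t` is assumed to be available, as it is on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pointer -> integer.
std::uintptr_t address_of(const char *p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Hypothetical helper: integer -> pointer, then dereference.
// Only meaningful when address was produced from a valid char* in this program.
char read_char_at(std::uintptr_t address)
{
    assert(address != 0);
    const char *p = reinterpret_cast<const char *>(address);
    return *p;  // * now has a pointer operand
}
```

For `char p1[3] = {'A', 'B', '\0'};`, `read_char_at(address_of(p1))` yields `'A'`. Passing a hard-coded number such as 0x22ff34 would compile but has no guaranteed meaning.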

 

cout << "&p1 = " << &p1 << endl;
cout << "&p2 = " << &p2 << endl;

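&p1 = 0x22ff30

&p2 = 0x22ff24

Gap: 0x22ff30 - 0x22ff24 = 12 bytes. p2 itself occupies 4 of them; the rest is alignment padding or other locals. The three chars 'A', 'B' and '\0' (1 byte each) start at &p1.

&p1 = &p2 + gap holds for this build only, because the standard does not fix the layout of local variables.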

 
