Writing Your Own vector

zaizai1007

Last modified 2024-04-07 20:45:38

Category: C++ Tags: C++

First published 2024-04-07 10:45:51

Article link: https://blog.csdn.net/zaizai1007/article/details/137449832

Original video: https://youtu.be/ryRf4Jh_YC0

Reference blog post: 自己动手写Vector【Cherno C++教程】 (Writing Your Own Vector, Cherno C++ tutorial) - zhangyi1357 - 博客园 (cnblogs.com)

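Compared with a fixed-size Array, the defining feature of a vector is dynamic growth: we do not have to specify a capacity up front, and while using it we can keep appending elements at the tail in amortized O(1) time and read the element at any position in O(1) time.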

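Dynamic growth strategy

Reading any element in O(1) time requires contiguous storage, so a linked list is ruled out.

Dynamic growth: every reallocation has to move the whole array to a new memory address, which is expensive. The remedy is to allocate some extra space each time the array is reallocated. The high cost of growing is then spread over the individual insertions, giving O(1) amortized time per insertion.

So how much extra space should be allocated? One growth operation costs O(n) and must be amortized over O(n) insertions, so the added capacity has to be on the order of O(n) as well. In other words, the capacity must grow by a constant factor rather than by a constant amount.

Real implementations differ: the standard library shipped with MSVC grows by a factor of 1.5, while the one shipped with GCC (libstdc++) grows by a factor of 2. This article uses a factor of 2.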

Basic version

Basic version API

Dynamic growth
PushBack method
Overloaded operator[]
Size method

template<typename T>
void PrintVector(const Vector<T>& vector) {
    for (size_t i = 0; i < vector.Size(); ++i)
        std::cout << vector[i] << std::endl;
 
    std::cout << "---------------------------" << std::endl;
}
 
int main() {
    Vector<std::string> vector;
    vector.PushBack("Cherno");
    vector.PushBack("C++");
    vector.PushBack("Vector");
    PrintVector(vector);
 
    return 0;
}

Basic version implementation

The initialization strategy is to allocate space for two elements.

template <typename T>
class Vector {
public:
    Vector() { ReAlloc(2); }
 
    void PushBack(const T& value) {
        // check the space
        if (m_Size >= m_Capacity)
            ReAlloc(m_Size + m_Size);
 
        // push the value back and update the size
        m_Data[m_Size++] = value;
    }
 
    T& operator[](size_t index) { return m_Data[index]; }
    const T& operator[](size_t index) const { return m_Data[index]; }
 
    size_t Size() const { return m_Size; }
 
private:
    void ReAlloc(size_t newCapacity) {
        // allocate space for new block
        T* newBlock = new T[newCapacity];
 
        // ensure no overflow
        if (newCapacity < m_Size)
            m_Size = newCapacity;
 
        // move all the elements to the new block
        for (size_t i = 0; i < m_Size; ++i)  // size_t matches the type of m_Size
            newBlock[i] = m_Data[i];
 
        // delete the old space and update old members
        delete[] m_Data;
        m_Data = newBlock;
        m_Capacity = newCapacity;
    }
 
private:
    T* m_Data = nullptr;
 
    size_t m_Size = 0;
    size_t m_Capacity = 0;
};

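Move version

The basic version above provides the basic functionality, but it is inefficient because it performs many copies. We can write a class of our own to test this.

Move version API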

class Vector3 {
public:
    Vector3() {}
    Vector3(float scalar)
        : x(scalar), y(scalar), z(scalar) {}
    Vector3(float x, float y, float z)
        : x(x), y(y), z(z) {}
 
    Vector3(const Vector3& other)
        : x(other.x), y(other.y), z(other.z) {
        std::cout << "Copy" << std::endl;
    }
    Vector3(Vector3&& other)  // a move constructor takes a non-const rvalue reference
        : x(other.x), y(other.y), z(other.z) {
        std::cout << "Move" << std::endl;
    }
    ~Vector3() {
        std::cout << "Destroy" << std::endl;
    }
 
    Vector3& operator=(const Vector3& other) {
        std::cout << "Copy" << std::endl;
        x = other.x;
        y = other.y;
        z = other.z;
        return *this;
    }
    Vector3& operator=(Vector3&& other) {
        std::cout << "Move" << std::endl;
        x = other.x;
        y = other.y;
        z = other.z;
        return *this;
    }
    friend std::ostream& operator<<(std::ostream&, const Vector3&);
private:
    float x = 0.0f, y = 0.0f, z = 0.0f;
};
 
std::ostream& operator<<(std::ostream& os, const Vector3& vec) {
    os << vec.x << ", " << vec.y << ", " << vec.z;
    return os;
}
 
int main() {
    Vector<Vector3> vec;
    vec.PushBack(Vector3());
    vec.PushBack(Vector3(1.0f));
    vec.PushBack(Vector3(1.0f, 2.0f, 3.0f));
    PrintVector(vec);
 
    return 0;
}

With the basic version of Vector, the output is:

Copy
Destroy
Copy
Destroy
Copy
Copy
Destroy
Destroy
Copy
Destroy
0, 0, 0
1, 1, 1
1, 2, 3
---------------------------

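The two consecutive Copy lines followed by two Destroy lines in the middle come from the reallocation; everything else is produced by PushBack.

None of these copies is actually necessary: in PushBack we can move the value straight into its new location, and the same goes for reallocation.

Move version implementation

Eliminating the copies is simple: add a PushBack overload that takes an rvalue and moves from it, and change the reallocation loop to move as well.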

// new PushBack Method
    void PushBack(T&& value) {
        // check the space
        if (m_Size >= m_Capacity)
            ReAlloc(m_Size + m_Size);
 
        // push the value back and update the size
        m_Data[m_Size++] = std::move(value);
    }
 
// in ReAlloc
        for (size_t i = 0; i < m_Size; ++i)
            newBlock[i] = std::move(m_Data[i]);

This gives the following result:

Move
Destroy
Move
Destroy
Move
Move
Destroy
Destroy
Move
Destroy
0, 0, 0
1, 1, 1
1, 2, 3
---------------------------

EmplaceBack & Placement new

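Every PushBack call constructs an object outside the vector and then moves it in. Instead, we can hand the constructor arguments directly to the vector and have it construct the object in place at the target address.

In-place construction API

Note that EmplaceBack receives the arguments needed to construct a Vector3, not a Vector3 itself.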

int main() {
    Vector<Vector3> vec;
    vec.EmplaceBack();
    vec.EmplaceBack(1.0f);
    vec.EmplaceBack(1.0f, 2.0f, 3.0f);
    PrintVector(vec);
 
    return 0;
}

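In-place construction implementation

First comes EmplaceBack itself, whose implementation relies on template parameter pack expansion.

Note the new expression in the implementation. Unlike an ordinary new, it takes an extra argument giving the address at which the object should be constructed, so the object is built in place instead of being moved around.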

template<typename... Args>
    T& EmplaceBack(Args&&... args) {
        // check the space
        if (m_Size >= m_Capacity)
            ReAlloc(m_Size + m_Size);
 
        // Placement new
        new (&m_Data[m_Size]) T(std::forward<Args>(args)...);
        return m_Data[m_Size++];
    }

The test result is:

Move
Move
Destroy
Destroy
0, 0, 0
1, 1, 1
1, 2, 3
---------------------------

Move happens only twice, during reallocation; every object is constructed directly in place.

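PopBack and the destructor

The destructor was left out so far to keep the output simple, but it is indispensable in practice; without it memory is leaked.

We also add PopBack, and the combination of the two causes a very serious problem.

PopBack and destructor API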

int main() {
    Vector<Vector3> vec;
    vec.EmplaceBack();
    vec.EmplaceBack(1.0f);
    vec.EmplaceBack(1.0f, 2.0f, 3.0f);
    PrintVector(vec);
    vec.PopBack();
    vec.PopBack();
    PrintVector(vec);
 
    return 0;
}

PopBack and destructor implementation

    void PopBack() {
        if (m_Size > 0) {
            --m_Size;
            m_Data[m_Size].~T();
        }
    }
 
    ~Vector() { delete[] m_Data; }

Running the test above now prints:

Move
Move
Destroy
Destroy
0, 0, 0
1, 1, 1
1, 2, 3
---------------------------
Destroy
Destroy
0, 0, 0
---------------------------
Destroy
Destroy
Destroy
Destroy

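If our Vector3 class holds a pointer to a block of memory, PopBack calls the Vector3 destructor once, and the delete[] in the Vector destructor then calls the destructor again on the same slot, so that block of memory is deleted twice.

Next we fix this problem:

::operator new / delete

Destruction API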

class Vector3 {
public:
    Vector3() {
        m_MemoryBlock = new int[5];
    }
    Vector3(float scalar)
        : x(scalar), y(scalar), z(scalar) {
        m_MemoryBlock = new int[5];
    }
    Vector3(float x, float y, float z)
        : x(x), y(y), z(z) {
        m_MemoryBlock = new int[5];
    }
 
    Vector3(const Vector3& other) = delete;
 
    Vector3(Vector3&& other)
        : x(other.x), y(other.y), z(other.z) {
        std::cout << "Move" << std::endl;
        m_MemoryBlock = other.m_MemoryBlock;
        other.m_MemoryBlock = nullptr;
    }
    ~Vector3() {
        std::cout << "Destroy" << std::endl;
        delete[] m_MemoryBlock;
    }
 
    Vector3& operator=(const Vector3& other) {
        std::cout << "Copy" << std::endl;
        x = other.x;
        y = other.y;
        z = other.z;
        return *this;
    }
    Vector3& operator=(Vector3&& other) {
        std::cout << "Move" << std::endl;
        x = other.x;
        y = other.y;
        z = other.z;
        return *this;
    }
    friend std::ostream& operator<<(std::ostream&, const Vector3&);
private:
    float x = 0.0f, y = 0.0f, z = 0.0f;
    int* m_MemoryBlock = nullptr;
 
};
 
std::ostream& operator<<(std::ostream& os, const Vector3& vec) {
    os << vec.x << ", " << vec.y << ", " << vec.z;
    return os;
}
 
int main() {
    {
        Vector<Vector3> vec;
        vec.EmplaceBack();
        vec.EmplaceBack(1.0f);
        vec.EmplaceBack(1.0f, 2.0f, 3.0f);
        PrintVector(vec);
        vec.PopBack();
        vec.PopBack();
        PrintVector(vec);
    }
    std::cout << "hello" << std::endl;
    return 0;
}

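hello is never printed, so the program presumably terminated abnormally, which is consistent with the double delete described above.

Correct memory management implementation

The approach is to separate the two phases that new and delete each bundle together: allocation and deallocation call ::operator new and ::operator delete, while construction and destruction are done explicitly with placement new and direct destructor calls.

The implementation is as follows: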

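    ~Vector() {
        Clear();
        ::operator delete(m_Data, m_Capacity * sizeof(T));
    }

    void Clear() {
        // destroy only the slots that actually hold an element
        for (size_t i = 0; i < m_Size; ++i)
            m_Data[i].~T();

        m_Size = 0;
    }

    void ReAlloc(size_t newCapacity) {
        // allocate raw, uninitialized storage for the new block
        T* newBlock = (T*)::operator new(newCapacity * sizeof(T));

        // when shrinking, only the first newCapacity elements survive
        size_t newSize = m_Size < newCapacity ? m_Size : newCapacity;

        // move-construct the surviving elements into the new block
        for (size_t i = 0; i < newSize; ++i)
            new (&newBlock[i]) T(std::move(m_Data[i]));

        // destroy every old element; Clear() is not used here
        // because it would also reset m_Size to 0
        for (size_t i = 0; i < m_Size; ++i)
            m_Data[i].~T();

        // release the old storage and update the members
        ::operator delete(m_Data, m_Capacity * sizeof(T));
        m_Data = newBlock;
        m_Capacity = newCapacity;
        m_Size = newSize;
    }

The main change is that destructor calls now happen explicitly and only for slots that actually hold an element (in Clear, PopBack and ReAlloc), while storage is allocated and released with ::operator new / delete. ReAlloc destroys the old elements with its own loop instead of calling Clear: Clear also resets m_Size to 0, which would make the vector forget the elements it has just moved into the new block. Note: the sized overload of ::operator delete used here is only supported since C++14.

The output:

Move
Move
Destroy
Destroy
0, 0, 0
1, 1, 1
1, 2, 3
---------------------------
Destroy
Destroy
0, 0, 0
---------------------------
Destroy
hello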

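For more on operator new versus the new operator, see my earlier post: 第八条《More Effective C++》学习-CSDN博客 (notes on Item 8 of More Effective C++)

C++ 中 new 操作符内幕：new operator、operator new、placement new - slgkaifa - 博客园 (cnblogs.com)

If you use placement new to create an object in some memory, you should avoid using the delete operator on that memory.

The delete operator calls operator delete to release the memory, but the memory containing the object was not originally allocated by operator new; placement new just returns the pointer that was passed to it. Who knows where that pointer came from? Instead, you should undo the effect of the constructor by calling the object's destructor explicitly: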

// functions that allocate and free memory in shared memory
void * mallocShared(size_t size);
void freeShared(void *memory);

void *sharedMemory = mallocShared(sizeof(Widget));
Widget *pw =                                   // as shown above,
    constructWidgetInBuffer(sharedMemory, 10); // using placement new

...
delete pw;       // undefined! the shared memory came from
                 // mallocShared, not from operator new

pw->~Widget();   // fine: destroys the Widget pointed to by pw,
                 // but does not release the memory containing the Widget

freeShared(pw);  // fine: releases the shared memory pointed to by pw,
                 // but does not call the destructor