CS144 lab1

Analysis

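Lab 1 asks for a StreamReassembler class that turns an unreliable byte stream (substrings may cover one another, partially overlap, or arrive out of order) into a reliable one. The unreliable stream arrives as strings of arbitrary length (a string may be empty and still carry a valid eof signal), and the reassembled bytes must be written into the byte stream implemented in lab0 (the test program calls lab0's read function to read them back).
The capacity parameter is the maximum number of bytes the StreamReassembler can handle; any bytes beyond that range should be discarded.
The unreliable cases are: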
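
The capacity rule can be sketched as clipping each incoming substring to an accepted index window. The sketch below assumes the window is [first_unassembled, first_unread + capacity), which is one reading of the lab handout and is not what the code later in this post does; `clip_to_window` and all its parameter names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of the lab starter code.
// Keeps only the bytes of `data` (first byte at stream position `index`) that fall
// inside the assumed window [first_unassembled, first_unread + capacity).
// Returns the stream position of the first kept byte and the kept bytes.
std::pair<size_t, std::string> clip_to_window(const std::string &data, size_t index,
                                              size_t first_unassembled, size_t first_unread,
                                              size_t capacity) {
    assert(first_unread <= first_unassembled);  // unread bytes were assembled earlier
    const size_t window_end = first_unread + capacity;
    const size_t begin = std::max(index, first_unassembled);
    const size_t end = std::min(index + data.size(), window_end);
    if (begin >= end) {
        return {begin, std::string()};  // nothing of this substring fits
    }
    return {begin, data.substr(begin - index, end - begin)};
}
```

For example, with capacity 8, nothing read yet and bytes 0..2 already assembled, `clip_to_window("abcdefghij", 1, 3, 0, 8)` keeps "cdefg" starting at position 3: the first two bytes are already assembled and the last three lie past the window.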

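  1. Cover

String a starts at index 10 with length 5; string b starts at index 5 with length 25. String b covers string a completely.

  2. Partial overlap

String a starts at index 10 with length 10; string b starts at index 5 with length 10. Strings a and b partially overlap.

  3. Out-of-order arrival

String a starts at index 10 and string b starts at index 5; string a may arrive before string b.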
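
The cover and partial-overlap cases above can be told apart by comparing half-open ranges [start, start + len). This is only an illustrative sketch; `Relation` and `relate` are hypothetical names that do not appear in the lab code.

```cpp
#include <cstddef>

// Hypothetical names; each segment is the half-open range [start, start + len).
enum class Relation { Disjoint, Covers, CoveredBy, PartialOverlap };

// Describes segment a relative to segment b.
Relation relate(size_t a_start, size_t a_len, size_t b_start, size_t b_len) {
    const size_t a_end = a_start + a_len;
    const size_t b_end = b_start + b_len;
    if (a_end <= b_start || b_end <= a_start) {
        return Relation::Disjoint;  // no shared byte
    }
    if (a_start <= b_start && b_end <= a_end) {
        return Relation::Covers;  // a contains every byte of b
    }
    if (b_start <= a_start && a_end <= b_end) {
        return Relation::CoveredBy;  // b contains every byte of a
    }
    return Relation::PartialOverlap;  // they share some bytes, neither contains the other
}
```

With the numbers from the list, `relate(10, 5, 5, 25)` is `CoveredBy` (case 1) and `relate(10, 10, 5, 10)` is `PartialOverlap` (case 2).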

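The first idea is to use an unordered_map to store each arriving substring keyed by its first index. The fast lookup of unordered_map saves some work and directly handles the cover case where two substrings share the same first index; the remaining cover, partial-overlap, and out-of-order cases are then handled by additional logic.
The second idea is to break the strings into single bytes: store every character in an unordered_map keyed by its own index, and let the map's unique keys remove duplicates. A further thought was to replace unordered_map with map. Unlike unordered_map, map keeps its elements ordered, so finding the next character to write becomes even cheaper. But map has to maintain that order every time a new character arrives, which has its own cost, and it is not obvious which of the two wins, so in the end I stayed with unordered_map.
The first method needs more complicated logic to handle cover, partial overlap, and out-of-order arrival, but it performs better. The second method has simple code, but its time cost grows linearly with the length of the byte stream, and so does its space cost in the worst case, so it is not usable when the stream is very long (the longest test stream in this lab is only a little over a hundred thousand bytes, so the method still works here).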
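
For comparison, here is a minimal sketch of the map variant that was considered and not chosen. It is not the code used in this post: `ByteBuffer`, `store` and `drain` are hypothetical names, and capacity and eof handling are left out. Because std::map is ordered, the next byte to write, if present, is always at `begin()`.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-alone sketch: one map entry per byte, keyed by stream index.
struct ByteBuffer {
    std::map<size_t, char> pending;  // bytes waiting to be assembled, ordered by index
    size_t next = 0;                 // index of the next byte to assemble

    // Store the bytes of s, whose first byte sits at stream position index.
    void store(const std::string &s, size_t index) {
        for (size_t i = 0; i < s.size(); ++i) {
            if (index + i >= next) {        // skip bytes that were already assembled
                pending[index + i] = s[i];  // a repeated index overwrites, so duplicates collapse
            }
        }
    }

    // Remove and return the contiguous run of bytes that starts at next.
    std::string drain() {
        std::string out;
        auto it = pending.begin();
        while (it != pending.end() && it->first == next) {
            out += it->second;
            it = pending.erase(it);  // erase returns the iterator to the following element
            ++next;
        }
        return out;
    }
};
```

Storing "cdefg" at index 2 and draining yields nothing; after storing "abcd" at index 0, `drain()` returns "abcdefg".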

Code for the conventional approach

Run results:

stream_reassembler.cc:

#include "stream_reassembler.hh"


#include <unordered_set>


// Dummy implementation of a stream reassembler.

// For Lab 1, please replace with a real implementation that passes the
// automated checks run by `make check_lab1`.

// You will need to add private members to the class declaration in `stream_reassembler.hh`

template <typename... Targs>
void DUMMY_CODE(Targs &&... /* unused */) {}

using namespace std;

StreamReassembler::StreamReassembler(const size_t capacity) : _output(capacity), _capacity(capacity) {buffer_.erase(1);}

//! \details This function accepts a substring (aka a segment) of bytes,
//! possibly out-of-order, from the logical stream, and assembles any newly
//! contiguous substrings and writes them into the output stream in order.
void StreamReassembler::write_less_capacity(const string &s) {
    _output.write(s);  
    num_ += s.size();
    next_ += s.size();
}

void StreamReassembler::write_over_capacity(const string &s) {
    _output.write(string(s.cbegin(), s.cbegin() + (_capacity - num_)));
    next_ += _capacity - num_;
    num_ = _capacity;
}



void StreamReassembler::string_to_write(const string &s) { // write s into the output byte stream
    size_t len = s.size();
    num_ = _output.bytes_written() - _output.bytes_read();
    if (len <= _capacity - num_) {  // is the remaining capacity enough for all of s?
        write_less_capacity(s);  // yes: write all of s
    } else {
        write_over_capacity(s);  // no: write only the part that fits
    }
}

auto StreamReassembler::FindMap(unordered_map<size_t, std::string> &umap, size_t index) { // find the entry in umap that contains byte index (next_); that entry is ready to be reassembled
    for(auto it = umap.cbegin(); it != umap.cend(); ++it) {
        // half-open range check, so an empty string never matches (size() - 1 would wrap around)
        if(it -> first <= index && index < it -> first + it -> second.size())
            return it;
    }
    return umap.cend();
}


void StreamReassembler::BufToWrite() { // collect every string in buffer_ that can be reassembled, then write them to the output byte stream in one go
    string s;
    size_t tempNext = next_; // work on a copy of next_, so advancing by len in this loop does not double up with the advance by s.size() in string_to_write
    auto it = FindMap(buffer_, tempNext);
    while(it != buffer_.cend()) { // not end: the entry it points to can be written to the output stream, in part or in full
        size_t len = it -> second.size() - (tempNext - it -> first);
        string tempS = string(it -> second.cend() - len, it -> second.cend());
        s += tempS;
        tempNext += len;
        buffer_.erase(it);
        it = FindMap(buffer_, tempNext);
    }
    if(s.size() > 0) {
        string_to_write(s);
    }
    if (eof_flag_ && (next_ == eof_index_)) { // must stay outside the s.size() > 0 branch, because the incoming stream may be empty
        _output.end_input();
    }
}

void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) {
    if (eof) {
        eof_index_ = index + data.size();
        eof_flag_ = true;
    }
    if(buffer_.find(index) == buffer_.end()) {  // buffer_ holds no string starting at index
        buffer_[index] = data;
    } else { // buffer_ already holds a string starting at index: keep the longer one and drop the shorter
        if(buffer_[index].size() < data.size())
            buffer_[index] = data;
    }
    BufToWrite();
}

size_t StreamReassembler::unassembled_bytes() const {
    unordered_set<size_t> uset;
    for(auto &p : buffer_) {
        size_t index = p.first;
        for(size_t i = 0; i < p.second.size(); ++i) {
            if(index + i >= next_) uset.insert(index + i); // the check skips bytes that are already assembled (they may arrive again after assembly, in which case the erase in BufToWrite's loop never removes them)
        }
    }
    return uset.size();
}

bool StreamReassembler::empty() const {
    return unassembled_bytes() == 0; 
}

stream_reassembler.hh:

#ifndef SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH
#define SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH

#include "byte_stream.hh"

#include <cstdint>
#include <string>
#include <unordered_map>


//! \brief A class that assembles a series of excerpts from a byte stream (possibly out of order,
//! possibly overlapping) into an in-order byte stream.
class StreamReassembler {
  private:
    // Your code here -- add private members as necessary.

    ByteStream _output;                                               //!< The reassembled in-order byte stream
    size_t _capacity;                                                 //!< The maximum number of bytes
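    std::unordered_map<size_t, std::string> buffer_  = {{1, "a"}};  // substrings not yet reassembled, keyed by first index (the placeholder entry is erased in the constructor)
    size_t next_ = 0;        // one past the last reassembled byte, i.e. the index of the next byte to reassemble
    size_t num_ = 0;         // reassembled bytes still held in _output (written but not yet read)
    bool eof_flag_ = false;  // whether the last string has arrived (it may be buffered, not necessarily assembled yet)
    size_t eof_index_ = -1;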

  public:
    //! \brief Construct a `StreamReassembler` that will store up to `capacity` bytes.
    //! \note This capacity limits both the bytes that have been reassembled,
    //! and those that have not yet been reassembled.
    StreamReassembler(const size_t capacity);

    //! \brief Receive a substring and write any newly contiguous bytes into the stream.
    //!
    //! The StreamReassembler will stay within the memory limits of the `capacity`.
    //! Bytes that would exceed the capacity are silently discarded.
    //!
    //! \param data the substring
    //! \param index indicates the index (place in sequence) of the first byte in `data`
    //! \param eof the last byte of `data` will be the last byte in the entire stream
    void push_substring(const std::string &data, const uint64_t index, const bool eof);

    void write_less_capacity(const std::string &s);
    void write_over_capacity(const std::string &s);
    void string_to_write(const std::string &s);
    auto FindMap(std::unordered_map<size_t, std::string> &umap, size_t index);
    void BufToWrite();

    //! \name Access the reassembled byte stream
    //!@{
    const ByteStream &stream_out() const { return _output; }
    ByteStream &stream_out() {return _output;}
    //!@}

    //! The number of bytes in the substrings stored but not yet reassembled
    //!
    //! \note If the byte at a particular index has been pushed more than once, it
    //! should only be counted once for the purpose of this function.
    size_t unassembled_bytes() const;

    //! \brief Is the internal state empty (other than the output stream)?
    //! \returns `true` if no substrings are waiting to be assembled
    bool empty() const;
};

#endif  // SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH
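
The test that FindMap relies on, whether a stored string contains the byte at next_, and the "take the tail from next_ onward" step in BufToWrite can be tried in isolation. A minimal sketch with hypothetical free functions `contains` and `suffix_from`; it uses a half-open comparison, which also behaves well for empty strings:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true when stream position `index` lies inside the string
// `data` whose first byte is at position `start`. Writing the test as
// index < start + size avoids `size() - 1`, which wraps around for an empty
// string stored at start 0.
bool contains(size_t start, const std::string &data, size_t index) {
    return start <= index && index < start + data.size();
}

// The part of `data` from stream position `index` onward.
// Only meaningful when contains(start, data, index) is true.
std::string suffix_from(size_t start, const std::string &data, size_t index) {
    return data.substr(index - start);
}
```

For a string "abcde" stored at index 5 and next_ equal to 7, `contains` is true and `suffix_from` yields "cde", the bytes that are still new.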

Code for the byte-by-byte approach

Run results:

stream_reassembler.cc:

#include "stream_reassembler.hh"


// Dummy implementation of a stream reassembler.

// For Lab 1, please replace with a real implementation that passes the
// automated checks run by `make check_lab1`.

// You will need to add private members to the class declaration in `stream_reassembler.hh`

template <typename... Targs>
void DUMMY_CODE(Targs &&... /* unused */) {}

using namespace std;

StreamReassembler::StreamReassembler(const size_t capacity) : _output(capacity), _capacity(capacity) {buffer_.erase(1);}

//! \details This function accepts a substring (aka a segment) of bytes,
//! possibly out-of-order, from the logical stream, and assembles any newly
//! contiguous substrings and writes them into the output stream in order.
void StreamReassembler::write_less_capacity(const string &s) {
    _output.write(s);  
    num_ += s.size();
    next_ += s.size();
}

void StreamReassembler::write_over_capacity(const string &s) {
    _output.write(string(s.cbegin(), s.cbegin() + (_capacity - num_)));
    next_ += _capacity - num_;
    num_ = _capacity;
}



void StreamReassembler::string_to_write(const string &s) {
    size_t len = s.size();
    num_ = _output.bytes_written() - _output.bytes_read();
    if (len <= _capacity - num_) {  
        write_less_capacity(s);
    } else {
        write_over_capacity(s);
    }
}

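void StreamReassembler::SToC(const string &s, int index) { // store each character of s in buffer_; the unordered_map keys remove duplicates
    for(size_t i = 0; i < s.size(); i++) {
        const size_t pos = static_cast<size_t>(index) + i;
        if(pos >= next_) { // skip bytes that are already assembled, so they are not counted as unassembled
            buffer_[pos] = s[i];
        }
    }
}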

void StreamReassembler::BufToWrite() { // take characters out of buffer_ in order, build a string, then write it to the output byte stream in one go
    string s;
    while(buffer_.find(next_) != buffer_.cend()) {
        s += buffer_[next_];
        buffer_.erase(next_);
        ++next_;
    }
    if(s.size() > 0) {
        next_ -= s.size();
        string_to_write(s);
    }
    if (eof_flag_ && (next_ == eof_index_)) { // must stay outside the s.size() > 0 branch, because the incoming stream may be empty
        _output.end_input();
    }
}

void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) {
    if (eof) { // the substring carrying eof is not necessarily written out right away; other substrings may still have to fill the gap in front of it
        eof_index_ = index + data.size();
        eof_flag_ = true;
    }
    SToC(data, index);
    BufToWrite();
}

size_t StreamReassembler::unassembled_bytes() const {
    return buffer_.size();
}

bool StreamReassembler::empty() const {
    return unassembled_bytes() == 0; 
}

stream_reassembler.hh:

#ifndef SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH
#define SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH

#include "byte_stream.hh"

#include <cstdint>
#include <string>
#include <unordered_map>


//! \brief A class that assembles a series of excerpts from a byte stream (possibly out of order,
//! possibly overlapping) into an in-order byte stream.
class StreamReassembler {
  private:
    // Your code here -- add private members as necessary.

    ByteStream _output;                                               //!< The reassembled in-order byte stream
    size_t _capacity;                                                 //!< The maximum number of bytes
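    std::unordered_map<size_t, char> buffer_  = {{1, 'a'}};  // bytes not yet reassembled, keyed by index (the placeholder entry is erased in the constructor)
    size_t next_ = 0;        // one past the last reassembled byte, i.e. the index of the next byte to reassemble
    size_t num_ = 0;         // reassembled bytes still held in _output (written but not yet read)
    bool eof_flag_ = false;  // whether the last string has arrived (it may be buffered, not necessarily assembled yet)
    size_t eof_index_ = -1;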

  public:
    //! \brief Construct a `StreamReassembler` that will store up to `capacity` bytes.
    //! \note This capacity limits both the bytes that have been reassembled,
    //! and those that have not yet been reassembled.
    StreamReassembler(const size_t capacity);

    //! \brief Receive a substring and write any newly contiguous bytes into the stream.
    //!
    //! The StreamReassembler will stay within the memory limits of the `capacity`.
    //! Bytes that would exceed the capacity are silently discarded.
    //!
    //! \param data the substring
    //! \param index indicates the index (place in sequence) of the first byte in `data`
    //! \param eof the last byte of `data` will be the last byte in the entire stream
    void push_substring(const std::string &data, const uint64_t index, const bool eof);

    void write_less_capacity(const std::string &s);
    void write_over_capacity(const std::string &s);
    void string_to_write(const std::string &s);
    void SToC(const std::string &s, int index);
    void BufToWrite();

    //! \name Access the reassembled byte stream
    //!@{
    const ByteStream &stream_out() const { return _output; }
    ByteStream &stream_out() { BufToWrite(); return _output;}
    //!@}

    //! The number of bytes in the substrings stored but not yet reassembled
    //!
    //! \note If the byte at a particular index has been pushed more than once, it
    //! should only be counted once for the purpose of this function.
    size_t unassembled_bytes() const;

    //! \brief Is the internal state empty (other than the output stream)?
    //! \returns `true` if no substrings are waiting to be assembled
    bool empty() const;
};

#endif  // SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH
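
Both versions treat eof the same way: remember where the stream ends when the eof substring arrives, and close the output only once next_ reaches that index. A small sketch of that bookkeeping on its own; `EofTracker`, `note` and `finished` are hypothetical names, not part of the lab code.

```cpp
#include <cstddef>

// Hypothetical stand-alone sketch of the eof bookkeeping.
struct EofTracker {
    bool seen = false;     // has a substring with eof set arrived?
    size_t end_index = 0;  // one past the last byte of the whole stream

    // Call for every arriving substring of length len starting at index.
    void note(size_t index, size_t len, bool eof) {
        if (eof) {
            seen = true;
            end_index = index + len;
        }
    }

    // True once every byte before end_index has been assembled.
    bool finished(size_t next) const { return seen && next == end_index; }
};
```

If the eof substring covers bytes 5..7 and arrives while next_ is still 5, `finished(5)` is false; it becomes true only when next_ reaches 8. An empty eof substring at index 0 finishes the stream immediately.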
