352. Data Stream as Disjoint Intervals

无差别刷题

Tags: interval sort array leetcode

Article link: https://blog.csdn.net/wilzxu/article/details/88467120

Method 1:
- Complexity
- Pitfalls:
Method 2:
- Complexity
- Pitfalls:

Given a data stream input of non-negative integers a1, a2, …, an, …, summarize the numbers seen so far as a list of disjoint intervals.

For example, suppose the integers from the data stream are 1, 3, 7, 2, 6, …, then the summary will be:

[1, 1]
[1, 1], [3, 3]
[1, 1], [3, 3], [7, 7]
[1, 3], [7, 7]
[1, 3], [6, 7]
Follow up:
What if there are lots of merges and the number of disjoint intervals is small compared to the data stream’s size?
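To make the expected summaries concrete, here is a brute-force baseline, not one of the methods discussed in this post: it stores every distinct value in a std::set and rebuilds the interval list on demand. The helper name summarize and the use of std::pair for an interval are assumptions made only for this sketch.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Baseline sketch: rebuild the summary from the set of all values seen so far.
// `summarize` is a hypothetical helper, not part of the original problem's API.
std::vector<std::pair<int, int>> summarize(const std::set<int>& seen) {
    std::vector<std::pair<int, int>> out;
    for (int x : seen) {                 // std::set iterates in ascending order
        assert(x >= 0);                  // the stream holds non-negative integers
        if (!out.empty() && out.back().second + 1 == x) {
            out.back().second = x;       // x extends the last run by one
        } else {
            out.push_back({x, x});       // x starts a new run
        }
    }
    return out;
}
```

Inserting 1, 3, 7, 2, 6 into the set one at a time and calling summarize after each step should reproduce the five summaries listed in the example. Each rebuild walks every stored value, so this is only useful as a reference to check faster solutions against.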

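Method 1:

Reference: https://blog.csdn.net/qq508618087/article/details/51553166

Approach:

This recalls the online approach to 128. Longest Consecutive Sequence, where a hash map stores the length of the interval each point belongs to. That idea does not carry over directly: hashing the endpoints gives no way to keep track of all intervals. The first approach to this problem scans every existing interval each time a new number arrives, which is not very efficient. The follow-up asks about the case with few intervals and many merges, which means the final insertion step happens often, and because the container is a vector, each insertion shifts the elements after it.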

Complexity

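Time complexity: O(kN), with k the number of stored intervals and N the number of addNum calls (each call scans all stored intervals)
Space complexity: O(N), since an extra tmp array is used on every call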

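Pitfalls:

Wrap the new number as Interval(val, val).
The merge condition is interval.end + 1 >= v.start && interval.start <= v.end + 1.
There should be a more efficient way to modify result in place. Without one, you have to build a new tmp, move the results into it, and assign it back to result at the end, because calling push_back on result while iterating over it is unsafe: it invalidates the iterators and can run out of bounds.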
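The merge condition above can be isolated as a small predicate and compared with a plain overlap test. This is a sketch: the helper names mergeable and overlaps are hypothetical, the Interval struct mirrors the LeetCode definition, and the widening to long long is an added precaution against end + 1 overflowing at INT_MAX.

```cpp
#include <cassert>

struct Interval {
    int start;
    int end;
    Interval() : start(0), end(0) {}
    Interval(int s, int e) : start(s), end(e) {}
};

// Hypothetical helper: true when a and b overlap or are directly adjacent,
// i.e. a.end + 1 >= b.start && a.start <= b.end + 1.
bool mergeable(const Interval& a, const Interval& b) {
    assert(a.start <= a.end && b.start <= b.end);
    return static_cast<long long>(a.end) + 1 >= b.start &&
           a.start <= static_cast<long long>(b.end) + 1;
}

// Plain overlap test, for comparison: adjacency does not count here.
bool overlaps(const Interval& a, const Interval& b) {
    return a.end >= b.start && a.start <= b.end;
}
```

For example, [1, 1] and [2, 2] are mergeable but do not overlap, while [1, 1] and [3, 3] satisfy neither test.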

/**
 * Definition for an interval.
 * struct Interval {
 *     int start;
 *     int end;
 *     Interval() : start(0), end(0) {}
 *     Interval(int s, int e) : start(s), end(e) {}
 * };
 */
class SummaryRanges {
public:
    /** Initialize your data structure here. */
    SummaryRanges() {
        
    }
    
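    void addNum(int val) {
        int cur = 0;                      // position where the merged interval will be inserted
        Interval v = Interval(val, val);  // the new number as a single-point interval
        vector<Interval> tmp;
        
        for (const auto& interval : result) {
            if (interval.end + 1 < v.start) {
                // Strictly left of v and not adjacent: keep it, shift the insert position.
                tmp.push_back(interval);
                cur++;
            }
            else if (v.end + 1 < interval.start) {
                // Strictly right of v and not adjacent: keep it.
                tmp.push_back(interval);
            }
            else {
                // Overlapping or adjacent: absorb it into v.
                v.start = min(interval.start, v.start);
                v.end = max(interval.end, v.end);
            }
        }
        tmp.insert(tmp.begin() + cur, v);
        result = tmp;
    }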
    
    vector<Interval> getIntervals() {
        return result;
    }
private: 
    vector<Interval> result;
};

/**
 * Your SummaryRanges object will be instantiated and called as such:
 * SummaryRanges* obj = new SummaryRanges();
 * obj->addNum(val);
 * vector<Interval> param_2 = obj->getIntervals();
 */

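Method 2:

This method relies mainly on iterators. std::lower_bound (with a custom cmp that orders intervals by start) finds the first interval whose start is not less than val, that is, the first interval not ordered before Interval(val, val); call its iterator it. From this interval onward, every interval that satisfies (it->start <= val + 1) must merge with Interval(val, val). In addition, an interval before the lower_bound position (possibly a single integer) has start < val, but it still merges if it satisfies (it->end >= val - 1), so it-- has to be applied as needed. With the iterator in hand, each merge only needs vector::erase(it) to delete the interval in place. At the end, the merged interval is added with vector::insert(it, Interval(start, end)), that is, just before the position where it stopped (the next interval that does not merge). Modifying in place through iterators is much faster.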
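As a sketch of the lookup step, the helper below returns the index that std::lower_bound lands on when intervals are compared by start only. The name firstStartNotLess is hypothetical, and the Interval struct mirrors the LeetCode definition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Interval {
    int start;
    int end;
    Interval() : start(0), end(0) {}
    Interval(int s, int e) : start(s), end(e) {}
};

// Hypothetical helper: index of the first interval whose start is >= val,
// or vec.size() when no such interval exists. vec must be sorted by start.
std::size_t firstStartNotLess(const std::vector<Interval>& vec, int val) {
    assert(val >= 0);  // the stream holds non-negative integers
    auto cmp = [](const Interval& a, const Interval& b) { return a.start < b.start; };
    auto it = std::lower_bound(vec.begin(), vec.end(), Interval(val, val), cmp);
    return static_cast<std::size_t>(it - vec.begin());
}
```

With the intervals [1, 3], [6, 7], [10, 10], looking up val = 2 gives index 1, even though 2 lies inside [1, 3] at index 0. This is why the interval just before the returned position has to be checked as well.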

Complexity

Time complexity: O(kN)
Space complexity: O(1)

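Pitfalls:

The merge condition here is
it->start <= val + 1 && val <= it->end + 1
This differs from the condition in merge intervals, because adjacent integers also need to be merged here.
Do not miss the possible interval in front of the lower_bound position. (Correction) Use a while loop to step back rather than a single check; since the stored intervals stay disjoint and non-adjacent, it moves back by at most one interval.
vector::insert(it, Interval()) inserts before the given iterator.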
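The iterator behaviour that Method 2 depends on can be checked in isolation. In this sketch, the helper eraseThenInsert is hypothetical and works on plain ints for brevity: vector::erase returns an iterator to the element that followed the erased one, and vector::insert places the new element before the iterator it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: erase v[pos], then insert `value` where it used to be.
std::vector<int> eraseThenInsert(std::vector<int> v, std::size_t pos, int value) {
    assert(pos < v.size());
    auto it = v.begin() + static_cast<std::ptrdiff_t>(pos);
    it = v.erase(it);     // it now points at the element after the erased one (or end())
    v.insert(it, value);  // value goes in front of that element
    return v;
}
```

Erasing the last element returns end(), and inserting at end() appends, so the same two calls also work when the merged interval belongs at the back of the vector.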

class SummaryRanges {
public:
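    void addNum(int val) {
        // Order intervals by start only.
        auto cmp = [](const Interval& a, const Interval& b) { return a.start < b.start; };
        // First interval whose start is >= val.
        auto it = lower_bound(vec.begin(), vec.end(), Interval(val, val), cmp);
        int start = val, end = val;
        // Step back while the preceding interval touches or contains val.
        while (it != vec.begin() && (it-1)->end+1 >= val) it--;
        // Absorb every interval that overlaps or is adjacent to val.
        while (it != vec.end() && val+1 >= it->start && val-1 <= it->end)
        {
            start = min(start, it->start);
            end = max(end, it->end);
            it = vec.erase(it);  // erase returns the iterator to the next element
        }
        // Insert the merged interval before the first interval that was not merged.
        vec.insert(it, Interval(start, end));
    }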
    
    vector<Interval> getIntervals() {
        return vec;
    }
private:
    vector<Interval> vec;
};
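For the follow-up (many merges, few intervals), one commonly used alternative is an ordered map keyed by interval start, which avoids shifting vector elements on every insert. This is a different technique from the two methods in this post, sketched under assumptions: the class name SummaryRangesMap is hypothetical, and intervals are returned as std::pair<int, int> instead of Interval to keep the block self-contained.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Sketch only: key = interval start, value = interval end.
class SummaryRangesMap {
public:
    void addNum(int val) {
        assert(val >= 0);  // the stream holds non-negative integers
        int start = val, end = val;
        auto it = ranges.lower_bound(val);        // first interval with start >= val
        if (it != ranges.begin()) {
            auto prev = std::prev(it);            // interval with start < val
            if (prev->second >= val) return;      // val is already covered
            if (prev->second + 1 == val) {        // left neighbour is adjacent
                start = prev->first;
                ranges.erase(prev);               // erasing prev leaves it valid
            }
        }
        if (it != ranges.end() && it->first - 1 <= val) {
            end = it->second;                     // right neighbour starts at val or val + 1
            ranges.erase(it);
        }
        ranges[start] = end;
    }

    std::vector<std::pair<int, int>> getIntervals() const {
        return std::vector<std::pair<int, int>>(ranges.begin(), ranges.end());
    }

private:
    std::map<int, int> ranges;
};
```

Because the stored intervals stay disjoint and non-adjacent, at most one neighbour on each side can merge, so each addNum does a constant number of map operations, each O(log k) for k stored intervals. getIntervals still copies all k intervals.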