Binary search and so forth

Binary search is simple in concept but surprisingly error-prone to implement, so it is worth keeping a tested version around for later use. Mine looks like this:

static int BinarySearch(TListRef list, int start, int count, const T &item, const IComparer<T> &comparer)
{
    int low = start;
    int high = start + count;
    int mid;

    while (low < high)
    {
        mid = low + (high - low)/2;    // written this way so that low + high cannot overflow
        const T & v = list[mid];
        int comp = comparer.Compare(item, v);
        if (comp < 0)
        {
            high = mid;
        }
        else if(comp > 0)
        {
            low = mid + 1;
        }
        else
        {
            return mid;    // found, returning the position
        }
    }
    return -(low + 1);    // not found: low is the insertion position, encoded as -(low + 1)
}
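
The negative return value is easy to misread, so here is a minimal sketch of how a caller might decode it. BinarySearchInts, InsertionPoint and InsertUnique are hypothetical names introduced only for this illustration; the first is a stand-in for the routine above, specialised to std::vector<int> and operator< so that the example compiles on its own.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for BinarySearch above, specialised to
// std::vector<int> and operator< (an assumption made for this sketch).
static int BinarySearchInts(const std::vector<int> &list, int start, int count, int item)
{
    assert(start >= 0 && count >= 0 && start + count <= static_cast<int>(list.size()));
    int low = start;
    int high = start + count;
    while (low < high)
    {
        int mid = low + (high - low) / 2;
        if (item < list[mid])       high = mid;
        else if (list[mid] < item)  low = mid + 1;
        else                        return mid;     // found
    }
    return -(low + 1);                              // not found
}

// Decodes a result: a non-negative value is already a position,
// a negative value r encodes the insertion position -(r + 1).
static int InsertionPoint(int result)
{
    return result >= 0 ? result : -(result + 1);
}

// Inserts item only if it is absent, keeping the list sorted;
// returns the index of the item either way.
static int InsertUnique(std::vector<int> &list, int item)
{
    int r = BinarySearchInts(list, 0, static_cast<int>(list.size()), item);
    if (r >= 0) return r;
    int pos = InsertionPoint(r);
    list.insert(list.begin() + pos, item);
    return pos;
}
```

Encoding the miss as -(low + 1) rather than -low keeps position 0 unambiguous: a miss that belongs at the front comes back as -1, not 0.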


Two other subroutines related to binary search are also useful: starting from the position returned by the binary search, they find the boundaries of the run of identical items around it in the sorted list.

The first one is FindLeftmostMatch(), which returns the index of the first item in the run; by definition that item always exists. 'start' is the starting point of the subsequence of the list under consideration; if the whole list is to be processed, it should be 0. 'index' is a position where 'item' is known to occur, such as the one returned by the binary search. Both this routine and the next one use a doubling-step strategy, whose time complexity can be shown to be O(log(n)), where n is the distance between the original position and the boundary. (The proof needs a bit of mathematics.)

static int FindLeftmostMatch(TListRef list, int start, int index, const T &item, 
    const IComparer<T> &comparer)
{
    int comp = comparer.Compare(list[start], item);
    if (comp == 0)
    {
        return start;
    }

    int step = 1;
    int lastIndex = index;
    for (index -= step; index >= start; step += step, index -= step)
    {
        comp = comparer.Compare(list[index], item);
        if (comp < 0) break;

        lastIndex = index;
    }

    if (index < start)
    {
        index = start;
    }

    // list[index] < list[lastIndex] = item
    // the result must be (index, lastIndex]
    // the following process is similar to binary search 
    
    int high = lastIndex;
    int low = index;
    int mid;

    while (low < high - 1)
    {
        mid = low + (high - low)/2;    // written this way so that low + high cannot overflow
        const T & v = list[mid];
        int comp = comparer.Compare(v, item);
        if (comp < 0)
        {
            low = mid;
        }
        else // comp == 0
        {
            high = mid;
        }
    }
    return high;
}
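
The O(log(n)) bound for the doubling-step strategy can be sketched as follows. This is an informal argument rather than a rigorous proof; it takes n to be the distance from the original position to the boundary, and it ignores the one extra comparison against list[start] (or list[end-1]) at the top of each routine.

```latex
\begin{align*}
\text{offset of the $k$-th probe} &= 1 + 2 + 4 + \dots + 2^{k-1} = 2^{k} - 1,\\
\text{the stepping loop stops at the first $k$ with } 2^{k} - 1 &> n
  \;\Longrightarrow\; k = \lfloor \log_2 (n+1) \rfloor + 1,\\
\text{width of the remaining bracket} &\le (2^{k} - 1) - (2^{k-1} - 1) = 2^{k-1},\\
\text{bisection steps} &\le \log_2 2^{k-1} = k - 1,\\
\text{total comparisons} &\le k + (k - 1) = 2\lfloor \log_2 (n+1) \rfloor + 1 = O(\log n).
\end{align*}
```

Clamping the last probe to the edge of the range only makes the bracket narrower, so the bound still holds in that case.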

The other one is FindFirstSuccessor(), which returns the index of the first item after the run; if the run sits at the end of the sequence, it returns 'end'. Likewise, if the whole list is considered, 'end' should be the length of the list.

static int FindFirstSuccessor(TListRef list, int end, int index, const T &item, 
    const IComparer<T> &comparer)
{
    int comp = comparer.Compare(item, list[end-1]);
    if (comp == 0)
    {
        return end;    // the run extends to the end of the range
    }

    int step = 1;
    int lastIndex = index;
    for (index += step; index < end; step += step, index += step)
    {
        int comp = comparer.Compare(item, list[index]);
        if (comp < 0) break;

        lastIndex = index;
    }

    if (index >= end)
    {
        index = end;
    }

    // item = list[lastIndex] < list[index] (or index == end)
    // the result must be in (lastIndex, index]
    // the following process is similar to binary search 

    int low = lastIndex;
    int high = index;
    int mid;
    
    while (low < high - 1)
    {
        mid = low + (high - low)/2;    // written this way so that low + high cannot overflow
        const T & v = list[mid];
        int comp = comparer.Compare(item, v);
        if (comp < 0)
        {
            high = mid;
        }
        else // comp == 0
        {
            low = mid;
        }
    }
    return high;    // low is the last match; high is the first item after the run
}
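
Since these two routines are the less tested ones, a self-contained sketch that can be checked against the standard library may help. LeftmostMatchInts and FirstSuccessorInts are hypothetical int-only versions written for this illustration, assuming an ascending std::vector<int> and list[index] == item. They follow the same doubling-then-bisecting idea, except that the successor version clamps to end - 1 so that the bisection invariant refers to a real element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the index of the first element equal to item in [start, index].
static int LeftmostMatchInts(const std::vector<int> &list, int start, int index, int item)
{
    assert(start >= 0 && start <= index && list[index] == item);
    if (list[start] == item) return start;

    // Step left with doubling strides until an element smaller than item is seen.
    int step = 1;
    int lastIndex = index;                  // last position known to equal item
    for (index -= step; index >= start; step += step, index -= step)
    {
        if (list[index] < item) break;
        lastIndex = index;
    }
    if (index < start) index = start;       // list[start] < item was checked above

    // Invariant: list[low] < item == list[high]; the answer lies in (low, high].
    int low = index;
    int high = lastIndex;
    while (low < high - 1)
    {
        int mid = low + (high - low) / 2;
        if (list[mid] < item) low = mid; else high = mid;
    }
    return high;
}

// Returns the index of the first element greater than item in (index, end],
// or end if the run of equal items reaches the end of the range.
static int FirstSuccessorInts(const std::vector<int> &list, int end, int index, int item)
{
    assert(index >= 0 && index < end && list[index] == item);
    if (list[end - 1] == item) return end;

    // Step right with doubling strides until an element greater than item is seen.
    int step = 1;
    int lastIndex = index;                  // last position known to equal item
    for (index += step; index < end; step += step, index += step)
    {
        if (item < list[index]) break;
        lastIndex = index;
    }
    if (index >= end) index = end - 1;      // list[end - 1] > item was checked above

    // Invariant: list[low] == item < list[high]; the answer lies in (low, high].
    int low = lastIndex;
    int high = index;
    while (low < high - 1)
    {
        int mid = low + (high - low) / 2;
        if (item < list[mid]) high = mid; else low = mid;
    }
    return high;
}
```

For a sorted vector, the results should agree with std::lower_bound and std::upper_bound from every starting index inside a run, which gives a cheap way to test both routines.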

The binary search subroutine has been reasonably tested. The other two have not been properly tested and may therefore be subject to revision.

It has since been found that the previous implementation did have fatal issues that overran the stack. The current versions are again not guaranteed to be flawless; they look less attractive but converge a little faster. The complexity of the previous ones, if implemented properly, would still be O(log(n)).
