求最大全1子矩阵的一些探讨

最近烦恼了一阵关于最大全1子矩阵的问题。流行的解法似乎是把0设为大负数然后就可以用求最大和子矩阵的算法了。

……不过总感觉有很大优化改进的余地,尚在思考中……

 

anyway,回到最大全1子矩阵。自己想了另一个算法:

比如对于矩阵:

0 1 1 0 0

0 1 0 1 1

1 0 1 1 1

0 1 1 1 1

1 1 1 1 1

首先处理每一行,对连续的1序列逐个累加,逢0reset:

比如1 0 1 1 1就变换为 1 0 1 2 3

0 1 1 0 0 就变换为 0 1 2 0 0

这样每个位置的数即表示从此位置往左共有多少连续的1

然后处理变换过后的每一列,对每个数,若〉0,则往上往下分别遍历至第一个小于其值的数为止,计算行数,行数x数值即是一个可能的全1子矩阵的面积。找出最大的即可。

比如对于:

0

1

3

2

0

4

 

对于第二行1,向上向下遍历,>=1的行共3行,所以此子矩阵宽度1,高度3,面积为3

对于第三行3,向上向下遍历,>=3的行共1行,所以此子矩阵宽度3,高度1,面积为3

对于第四行2,向上向下遍历,>=2的行共2行,所以此子矩阵宽度2,高度2,面积为4

等等。

 

复杂度分析:

从时间复杂度上来说此算法和最大和算法(i=1...矩阵最大行数;j=i...矩阵最大行数,对于包含行i至j子矩阵分算每列和得到一维数组然后动态规划求最大连续和,遍历所有可能 i , j 组合)并无区别,设矩阵行数H,列数W,其复杂度应该都为O(H^2 * W)。但是以上的方法比较而言避免了很多不必要的比较和计算,所以应该快一些。

缺点是占用更多空间。

 

python编码验证,以上方法的运算时间是最大和方法的30~50分之一。当矩阵尺寸为300时,最大和方法运算时间200~300秒+,以上方法运算时间5秒+。

代码如下:

首先是生成测试矩阵的代码:

import random
import time
import string

matrix_size = 8
one_percentage = 0.85

random.seed()

example_matrix = []

for i in range(matrix_size):
    row = []
    for j in range(matrix_size):
        v = 1
        if random.random() > one_percentage:
            v = 0
        row.append(v)
    example_matrix.append(row[:])


 

然后是记录比较/运算次数的代码:

 

comp_count = 0
add_count = 0
mul_count = 0
swap_count = 0

is_counting = True
#is_counting = False

start_time = None

def reset():
    if is_counting:
        global comp_count
        global add_count
        global mul_count
        global swap_count
        comp_count = 0
        add_count = 0
        mul_count = 0
        swap_count = 0
    global start_time
    start_time = time.clock()
    
def report():
    global start_time
    end_time = time.clock()
    print("---Process time: %g sec(s)---" % \
    (end_time - start_time,))
    if is_counting:
        global comp_count
        global add_count
        global mul_count
        global swap_count
        print 'reporting:'
        print 'compare:', comp_count
        print 'add:', add_count
        print 'mul:', mul_count
        print 'swap:', swap_count
        print 'total calculation:', add_count + mul_count + swap_count


def comp(a, b):
    if is_counting:
        global comp_count
        comp_count += 1
    if a > b:
        return 1
    if a < b:
        return -1
    return 0

def add(a, b):
    if is_counting:
        global add_count
        add_count += 1
    return a+b

def mul(a, b):
    if is_counting:
        global mul_count
        mul_count += 1
    return a*b


然后是最大和子矩阵的代码:
# Get continuous sub sequence with max sum
def getMax(l):
    start_i = 0
    sum = l[0]
    # max_start: inclusive
    max_start = 0
    # max_end: exclusive
    max_end = 0
    max_sum = l[0]
    
    for i in range(1, len(l)):
        if comp(sum, 0) < 0:
            sum = l[i]
            start_i = i
        else:
            sum = add(sum, l[i])
        if comp(sum, max_sum) > 0:
            max_start = start_i
            # max_end: exclusive
            max_end = i + 1
            max_sum = sum
    return max_start, max_end, max_sum


#reset()
#
#t = [31, -41, 59, 26, -53, 58, 97, -93, -23, 84]
#print getMax(t)
#
#report()

# start_row and end_row: start_row inclusive, end_row exclusive
def getMatrixColumnSum(matrix, start_row, end_row):
    #h = len(matrix)
    w = len(matrix[0])
    sum = matrix[start_row][:]
    for i in range(start_row + 1, end_row):
        for j in range(0, w):
            sum[j] = add(sum[j], matrix[i][j])
    return sum

# well actually can optimize it a bit
def addTwoRows(row1, row2):
    w = len(row1)
    for i in range(w):
        row1[i] = add(row1[i], row2[i])


def getMaxSubMatrixSum(matrix):
    h = len(matrix)
    # w = len(matrix[0])
    max_sum = matrix[0][0]
    # for now just testing so not going to record actual sub matrix area,
    # only output max sum.
    for i in range(h):
        column_sum = matrix[i][:]
        for j in range(i, h):
            if comp(j, i) > 0:
                addTwoRows(column_sum, matrix[j])
            column_max_sum = getMax(column_sum)[2]
            if comp(column_max_sum, max_sum) > 0:
                max_sum = column_max_sum
    
    return max_sum

#for i in range(12):
#    for j in range(11):
#        if example_matrix[i][j] <= 0:
#            example_matrix[i][j] = - 132
#
#reset()
#print getMaxSubMatrixSum(example_matrix)
#report()


 

然后是较快的算法:

def process_row(row):
    processed_row = []
    accu = 0
    for i in range(len(row)):
        if comp(row[i], 0) == 0:
            processed_row.append(0)
            accu = 0
        else:
            accu = add(accu, row[i])
            processed_row.append(accu)
            
    return processed_row

def getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, column_num):
    h = len(p_m)
    row_sum = p_m[0][column_num]
    max_sum = row_sum
    max_row_num = [0]*h
    for i in range(h):
        if comp(p_m[i][column_num], 0) == 0:
            # pass
            continue
        else:
            if comp(max_row_num[i], 0) > 0:
                # pass: previously examined same value 
                continue
            max_row_num[i] = 1
            same_value_rows = []
            
            # can merge these 2 loops into 1
            # but for now leave it be.
            for j in range(i-1, -1, -1):
                if comp(p_m[j][column_num], p_m[i][column_num]) < 0:
                    break;
                else:
                    max_row_num[i] = add(max_row_num[i], 1)
                    if comp(p_m[j][column_num], p_m[i][column_num]) == 0:
                        same_value_rows.append(j)
            for j in range(i+1, h):
                if comp(p_m[j][column_num], p_m[i][column_num]) < 0:
                    break;
                else:
                    max_row_num[i] = add(max_row_num[i], 1)
                    if comp(p_m[j][column_num], p_m[i][column_num]) == 0:
                        same_value_rows.append(j)
            for j in same_value_rows:
                max_row_num[j] = max_row_num[i]
            sum = mul(max_row_num[i], p_m[i][column_num])
            if comp(sum, max_sum) > 0:
                max_sum = sum
                    
    return max_sum

            
def getMaxSubMatrix(matrix):
    p_m = []
    for row in matrix:
        p_m.append(process_row(row))
    max_sum = getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, 0)
    for i in range(1, len(matrix[0])):
        sum = getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, i)
        if comp(sum, max_sum) > 0:
            max_sum = sum
            
    return max_sum



#print 'new method:'
#reset()
#print getMaxSubMatrix(example_matrix)
#report()
#        
#
#
#
#
#for i in range(len(example_matrix)):
#    for j in range(len(example_matrix[0])):
#        if example_matrix[i][j] <= 0:
#            example_matrix[i][j] = - 132
#
#print
#print 'old method:'
#reset()
#print getMaxSubMatrixSum(example_matrix)
#report()


 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值