求最大全1子矩阵的一些探讨

最新推荐文章于 2018-08-30 11:11:52 发布

festony

最新推荐文章于 2018-08-30 11:11:52 发布

阅读量2.4k

点赞数

分类专栏： python 算法

本文链接：https://blog.csdn.net/festony/article/details/8558235

版权

python 同时被 2 个专栏收录

16 篇文章 0 订阅

订阅专栏

算法

1 篇文章 0 订阅

订阅专栏

最近烦恼了一阵关于最大全1子矩阵的问题。流行的解法似乎是把0设为大负数然后就可以用求最大和子矩阵的算法了。

……不过总感觉有很大优化改进的余地，尚在思考中……

anyway，回到最大全1子矩阵。自己想了另一个算法：

比如对于矩阵：

0 1 1 0 0

0 1 0 1 1

1 0 1 1 1

0 1 1 1 1

1 1 1 1 1

首先处理每一行，对连续的1序列逐个累加，逢0reset：

比如1 0 1 1 1就变换为 1 0 1 2 3

0 1 1 0 0 就变换为 0 1 2 0 0

这样每个位置的数即表示从此位置往左共有多少连续的1

然后处理变换过后的每一列，对每个数，若〉0，则往上往下分别遍历至第一个小于其值的数为止，计算行数，行数x数值即是一个可能的全1子矩阵的面积。找出最大的即可。

比如对于：

对于第二行1，向上向下遍历，>=1的行共3行，所以此子矩阵宽度1，高度3，面积为3

对于第三行3，向上向下遍历，>=3的行共1行，所以此子矩阵宽度3，高度1，面积为3

对于第四行2，向上向下遍历，>=2的行共2行，所以此子矩阵宽度2，高度2，面积为4

等等。

复杂度分析：

从时间复杂度上来说此算法和最大和算法（i=1...矩阵最大行数；j=i...矩阵最大行数，对于包含行i至j子矩阵分算每列和得到一维数组然后动态规划求最大连续和，遍历所有可能 i , j 组合）并无区别，设矩阵行数H，列数W，其复杂度应该都为O(H^2 * W)。但是以上的方法比较而言避免了很多不必要的比较和计算，所以应该快一些。

缺点是占用更多空间。

python编码验证，以上方法的运算时间是最大和方法的30~50分之一。当矩阵尺寸为300时，最大和方法运算时间200~300秒+，以上方法运算时间5秒+。

代码如下：

首先是生成测试矩阵的代码：

import random
import time
import string

matrix_size = 8
one_percentage = 0.85

random.seed()

example_matrix = []

for i in range(matrix_size):
    row = []
    for j in range(matrix_size):
        v = 1
        if random.random() > one_percentage:
            v = 0
        row.append(v)
    example_matrix.append(row[:])

然后是记录比较/运算次数的代码：

comp_count = 0
add_count = 0
mul_count = 0
swap_count = 0

is_counting = True
#is_counting = False

start_time = None

def reset():
    if is_counting:
        global comp_count
        global add_count
        global mul_count
        global swap_count
        comp_count = 0
        add_count = 0
        mul_count = 0
        swap_count = 0
    global start_time
    start_time = time.clock()
    
def report():
    global start_time
    end_time = time.clock()
    print("---Process time: %g sec(s)---" % \
    (end_time - start_time,))
    if is_counting:
        global comp_count
        global add_count
        global mul_count
        global swap_count
        print 'reporting:'
        print 'compare:', comp_count
        print 'add:', add_count
        print 'mul:', mul_count
        print 'swap:', swap_count
        print 'total calculation:', add_count + mul_count + swap_count


def comp(a, b):
    if is_counting:
        global comp_count
        comp_count += 1
    if a > b:
        return 1
    if a < b:
        return -1
    return 0

def add(a, b):
    if is_counting:
        global add_count
        add_count += 1
    return a+b

def mul(a, b):
    if is_counting:
        global mul_count
        mul_count += 1
    return a*b

然后是最大和子矩阵的代码：

# Get continuous sub sequence with max sum
def getMax(l):
    start_i = 0
    sum = l[0]
    # max_start: inclusive
    max_start = 0
    # max_end: exclusive
    max_end = 0
    max_sum = l[0]
    
    for i in range(1, len(l)):
        if comp(sum, 0) < 0:
            sum = l[i]
            start_i = i
        else:
            sum = add(sum, l[i])
        if comp(sum, max_sum) > 0:
            max_start = start_i
            # max_end: exclusive
            max_end = i + 1
            max_sum = sum
    return max_start, max_end, max_sum


#reset()
#
#t = [31, -41, 59, 26, -53, 58, 97, -93, -23, 84]
#print getMax(t)
#
#report()

# start_row and end_row: start_row inclusive, end_row exclusive
def getMatrixColumnSum(matrix, start_row, end_row):
    #h = len(matrix)
    w = len(matrix[0])
    sum = matrix[start_row][:]
    for i in range(start_row + 1, end_row):
        for j in range(0, w):
            sum[j] = add(sum[j], matrix[i][j])
    return sum

# well actually can optimize it a bit
def addTwoRows(row1, row2):
    w = len(row1)
    for i in range(w):
        row1[i] = add(row1[i], row2[i])


def getMaxSubMatrixSum(matrix):
    h = len(matrix)
    # w = len(matrix[0])
    max_sum = matrix[0][0]
    # for now just testing so not going to record actual sub matrix area,
    # only output max sum.
    for i in range(h):
        column_sum = matrix[i][:]
        for j in range(i, h):
            if comp(j, i) > 0:
                addTwoRows(column_sum, matrix[j])
            column_max_sum = getMax(column_sum)[2]
            if comp(column_max_sum, max_sum) > 0:
                max_sum = column_max_sum
    
    return max_sum

#for i in range(12):
#    for j in range(11):
#        if example_matrix[i][j] <= 0:
#            example_matrix[i][j] = - 132
#
#reset()
#print getMaxSubMatrixSum(example_matrix)
#report()

然后是较快的算法：

def process_row(row):
    processed_row = []
    accu = 0
    for i in range(len(row)):
        if comp(row[i], 0) == 0:
            processed_row.append(0)
            accu = 0
        else:
            accu = add(accu, row[i])
            processed_row.append(accu)
            
    return processed_row

def getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, column_num):
    h = len(p_m)
    row_sum = p_m[0][column_num]
    max_sum = row_sum
    max_row_num = [0]*h
    for i in range(h):
        if comp(p_m[i][column_num], 0) == 0:
            # pass
            continue
        else:
            if comp(max_row_num[i], 0) > 0:
                # pass: previously examined same value 
                continue
            max_row_num[i] = 1
            same_value_rows = []
            
            # can merge these 2 loops into 1
            # but for now leave it be.
            for j in range(i-1, -1, -1):
                if comp(p_m[j][column_num], p_m[i][column_num]) < 0:
                    break;
                else:
                    max_row_num[i] = add(max_row_num[i], 1)
                    if comp(p_m[j][column_num], p_m[i][column_num]) == 0:
                        same_value_rows.append(j)
            for j in range(i+1, h):
                if comp(p_m[j][column_num], p_m[i][column_num]) < 0:
                    break;
                else:
                    max_row_num[i] = add(max_row_num[i], 1)
                    if comp(p_m[j][column_num], p_m[i][column_num]) == 0:
                        same_value_rows.append(j)
            for j in same_value_rows:
                max_row_num[j] = max_row_num[i]
            sum = mul(max_row_num[i], p_m[i][column_num])
            if comp(sum, max_sum) > 0:
                max_sum = sum
                    
    return max_sum

            
def getMaxSubMatrix(matrix):
    p_m = []
    for row in matrix:
        p_m.append(process_row(row))
    max_sum = getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, 0)
    for i in range(1, len(matrix[0])):
        sum = getMaxSubMatrixForSingleColumnFromProcessedMatrix(p_m, i)
        if comp(sum, max_sum) > 0:
            max_sum = sum
            
    return max_sum



#print 'new method:'
#reset()
#print getMaxSubMatrix(example_matrix)
#report()
#        
#
#
#
#
#for i in range(len(example_matrix)):
#    for j in range(len(example_matrix[0])):
#        if example_matrix[i][j] <= 0:
#            example_matrix[i][j] = - 132
#
#print
#print 'old method:'
#reset()
#print getMaxSubMatrixSum(example_matrix)
#report()

festony

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
求最大全1子矩阵的一些探讨

最近烦恼了一阵关于最大全1子矩阵的问题。流行的解法似乎是把0设为大负数然后就可以用求最大和子矩阵的算法了。……不过总感觉有很大优化改进的余地，尚在思考中…… anyway，回到最大全1子矩阵。自己想了另一个算法：比如对于矩阵：0 1 1 0 00 1 0 1 11 0 1 1 10 1 1 1 11 1 1 1 1首先处理每一行，对连续的1序列逐个累加，逢0
复制链接

扫一扫

专栏目录