Dynamic Programming: the Longest Increasing Subsequence Problem (Outputting the LIS in O(n log n)), Python 3 Implementation

liuxiang15

Category: Homework. Tags: dynamic programming, O(n log n) LIS output, Longest increasing subsequence, LIS, CLRS 15.4-6

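It took me three days, on and off, to get the code and comments written, which was a bit draining.

Because I worked step by step, from simple to complex, there are quite a few functions.

This is exercise 15.4-6 from Introduction to Algorithms (CLRS):

Problem statement: give an O(n lg n)-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers.

After reading the problem, one simple approach comes to mind.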

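Method 1: O(n log n) sort + O(n^2) LCS + deduplication (O(n))

I had already written a DP algorithm for the longest common subsequence (LCS). Let the original array be A. Then:

1. Sort A with an O(n log n) sorting algorithm to get the sorted array B

2. Compute the LCS of A and B

The overall complexity is O(n^2). One issue needs attention, though: what if A contains duplicate elements?

For example, A=[1,3,8,6,4,6,7] sorts to B=[1,3,4,6,6,7,8]

and one LCS of A and B is then [1,3,6,6,7], which is not strictly increasing.

A single O(n) pass over the LCS can remove the repeated values, but the result is not guaranteed to be a longest strictly increasing subsequence: for A=[5,5,5,1,2,3], [5,5,5] is a valid LCS and collapses to [5]. Removing the duplicates from B before computing the LCS (also an O(n) pass, since B is sorted) avoids the problem.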
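
Method 1 can be sketched roughly as follows. This is only an illustration under my own assumptions: the function name lis_via_lcs is hypothetical, the LCS table is a textbook suffix-based formulation rather than the author's earlier LCS code, and duplicates are dropped from the sorted array before the LCS step so that the result is strictly increasing.

```python
def lis_via_lcs(a):
    """Return one longest strictly increasing subsequence of a, via LCS."""
    # Sort and drop duplicates so the LCS cannot repeat a value.
    b = sorted(set(a))
    n, m = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the top-left corner to recover one LCS.
    result = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


print(lis_via_lcs([1, 3, 8, 6, 4, 6, 7]))  # [1, 3, 4, 6, 7]
print(lis_via_lcs([5, 5, 5, 1, 2, 3]))     # [1, 2, 3]
```

The table costs O(n^2) time and space, so this version is mainly useful as a reference to check the faster methods against.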

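Method 2: O(n^2) to get the LIS length

Dynamic programming is the natural fit here, because a candidate subsequence of length i can be seen as a candidate subsequence of length i-1 followed by a number larger than its last element.

Reference blog: https://segmentfault.com/a/1190000012748540

Python 3 implementation of the O(n^2) LIS length: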

# coding=utf-8
"""
Created on 2019.2.14-2018.2.16
@author: 刘祥
References: https://www.cnblogs.com/zhangbaochong/p/5793965.html
            https://segmentfault.com/a/1190000012754802
"""

# numpy is only used to generate random test data in deal_input
import numpy as np

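class LIS(object):
    '''
    Dynamic programming: O(n^2) and O(n log n) algorithms for the longest increasing subsequence (LIS)
    list: the array to compute the LIS of; either entered by the user or generated by the program
    '''

    def __init__(self, list=None):
        # Default to None so that instances do not share one mutable list
        self.list = [] if list is None else list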

    # Description: compute the LIS length with the O(n^2) DP algorithm, using two nested loops
    #   The outer loop scans the input one element at a time; call the current element X
    #   The inner loop looks to the left of X (already scanned) for an element E smaller than X, so that X can be appended to the LIS ending at E
    #   Reference blog: https://www.cnblogs.com/zhangbaochong/p/5793965.html
    # Parameters: self.list
    # Returns: the length of the LIS (0 for an empty list)
    def lis_length_dp_n2(self):
        list_len = len(self.list)
        if list_len < 1:
            return 0
        pos = [-1]*list_len         # pos[i]: index of the element before self.list[i] in the LIS ending at self.list[i]
        dp_len = [1]*list_len       # dp_len[i]: length of the LIS ending at self.list[i]
        for i in range(1, list_len):
            for j in range(0, i):
                if self.list[j] < self.list[i] and dp_len[i] < dp_len[j]+1:
                    dp_len[i] = dp_len[j]+1
                    pos[i] = j
        max_len = max(dp_len)
        last_pos = dp_len.index(max_len)    # index of the first maximum
        lis = []
        for i in range(max_len):
            lis.append(self.list[last_pos])
            last_pos = pos[last_pos]
        lis.reverse()                       # collected back to front, so reverse
        print("in lis_length_dp_n2 a LIS is: "+str(lis))
        return max_len
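
To make the two arrays concrete, here is a small standalone trace of the same recurrence on the array used in the manual test later in this post. The helper name lis_dp_table is hypothetical and exists only for this illustration; it mirrors the loop above but returns the tables instead of printing the LIS.

```python
def lis_dp_table(nums):
    """Return (dp_len, pos) as defined in lis_length_dp_n2, for a plain list."""
    n = len(nums)
    dp_len = [1] * n   # dp_len[i]: length of the LIS ending at nums[i]
    pos = [-1] * n     # pos[i]: index of the previous element in that LIS
    for i in range(1, n):
        for j in range(i):
            if nums[j] < nums[i] and dp_len[i] < dp_len[j] + 1:
                dp_len[i] = dp_len[j] + 1
                pos[i] = j
    return dp_len, pos


dp_len, pos = lis_dp_table([5, 9, 4, 1, 3, 7, 6, 7])
print(dp_len)  # [1, 2, 1, 1, 2, 3, 3, 4]
print(pos)     # [-1, 0, -1, -1, 3, 4, 4, 6]
```

Following pos back from index 7 gives 7 -> 6 -> 4 -> 3, that is, the values 7, 6, 3, 1, which reversed is the LIS [1, 3, 6, 7].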

Method 3: O(n log n) to get the LIS length

Reference blog: https://segmentfault.com/a/1190000012754802

Add the following member functions to the class:

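    # Description: return the index of the first value in dp_len that is >= target
    # Parameters: the array dp_len (sorted in ascending order), target
    # Returns: the index of the first value in dp_len that is >= target; len(dp_len) if every value is smaller than target; 0 if the array is empty
    def lower_bound(self, dp_len, target):
        low = 0
        high = len(dp_len)-1
        pos = len(dp_len)
        if pos == 0:
            return pos
        if target > dp_len[high]:
            return pos
        while low<high:
            mid = (low+high)//2
            if dp_len[mid] < target:
                low = mid+1
            else:
                high = mid
        # low == high here; this also covers a one-element array, where the loop never runs
        return low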

    # Description: compute the LIS length with the O(n log n) DP algorithm.
    #   Reference blog: https://segmentfault.com/a/1190000012754802
    # Parameters: self.list
    # Returns: the length of the LIS
    # Note: the elements left in dp_len at the end are not necessarily an actual LIS
    def lis_length_dp_nlogn(self):
        list_len = len(self.list)
        dp_len = []         # dp_len[k]: smallest tail value of any increasing subsequence of length k+1 seen so far; dp_len is always strictly increasing
        print(self.list)
        for i in range(list_len):
            pos = self.lower_bound(dp_len, self.list[i])
            if pos == len(dp_len):
                dp_len.append(self.list[i])
            else:
                dp_len[pos] = self.list[i]
            # print(dp_len)
        print(dp_len)
        print("in lis_length_dp_nlogn length of a LIS is: ",len(dp_len))
        return len(dp_len)
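
For comparison, the same idea can be written with bisect.bisect_left from the Python standard library, which performs the same "first position with value >= target" search as lower_bound. This is a sketch, not the code from the referenced blog; the function name lis_tails is an assumption made for this example.

```python
from bisect import bisect_left


def lis_tails(nums):
    """Return the tails array; its length is the LIS length."""
    tails = []  # tails[k]: smallest tail of an increasing subsequence of length k+1
    for x in nums:
        k = bisect_left(tails, x)  # first index with tails[k] >= x
        if k == len(tails):
            tails.append(x)        # x extends the longest subsequence so far
        else:
            tails[k] = x           # x is a smaller tail for length k+1
    return tails


print(len(lis_tails([5, 9, 4, 1, 3, 7, 6, 7])))  # 4
print(lis_tails([2, 5, 1]))                      # [1, 5]
```

The second call shows why the array itself cannot simply be returned as the answer: [1, 5] has the right length, but 1 comes after 5 in the input, so it is not a subsequence.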

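Method 4: O(n log n) to get the LIS itself

For this part I followed my roommate @易滔's code (thanks, 易滔!); it is essentially his C++ code ported to Python 3.

To be honest, this code left me completely baffled at first, mainly because I had not understood what the two arrays pre_cursor and last_min stand for.

pre_cursor[i] stores the index in self.list of the predecessor of self.list[i] in the LIS ending at self.list[i]
last_min[L] stores the index in self.list of the smallest tail value among the increasing subsequences of length L

A brief walkthrough of the get_lis_dp_nlogn function

The first 5 lines of the function body initialize variables; the array traversal starts on line 6:

First, the O(log n) binary_search_new function finds max, the number of elements that precede self.list[i] in the LIS ending at self.list[i]

Then comes the key step, pre_cursor[i] = last_min[max], where last_min[max] is exactly the index of the smallest tail element among the increasing subsequences of length max

Next, i_len = max + 1: appending self.list[i] makes the length max+1

Then the index of the smallest tail value for that length is updated

last_min[i_len] == -1 means that no increasing subsequence of length i_len has been recorded yet: either self.list[i] is the first element scanned, or appending it produces a subsequence longer than any seen before

Then the LIS length and the LIS end position are updated

During backtracking, the LIS can be read off easily by following the pre_cursor array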
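
The roles of pre_cursor and last_min are easier to see on a concrete input. The sketch below is a standalone illustration, not the author's implementation: the function name lis_with_trace and the extra tails list are my own assumptions, and bisect_left from the standard library stands in for the hand-written binary search.

```python
from bisect import bisect_left


def lis_with_trace(nums):
    """Return (lis, pre_cursor, last_min) for a plain list of numbers."""
    pre_cursor = [-1] * len(nums)  # predecessor index of each element
    last_min = [-1]                # last_min[L]: index of the smallest tail for length L; slot 0 is a sentinel
    tails = []                     # tails[L-1] == nums[last_min[L]], kept so bisect can compare values
    for i, x in enumerate(nums):
        k = bisect_left(tails, x)  # k elements can precede x
        pre_cursor[i] = last_min[k]
        if k == len(tails):        # first subsequence of length k+1
            tails.append(x)
            last_min.append(i)
        elif x < tails[k]:         # strictly smaller tail for length k+1
            tails[k] = x
            last_min[k + 1] = i
    # Backtrack from the tail of the longest subsequence.
    lis = []
    pos = last_min[-1]
    while pos != -1:
        lis.append(nums[pos])
        pos = pre_cursor[pos]
    lis.reverse()
    return lis, pre_cursor, last_min


lis, pre_cursor, last_min = lis_with_trace([5, 9, 4, 1, 3, 7, 6, 7])
print(lis)         # [1, 3, 6, 7]
print(pre_cursor)  # [-1, 0, -1, -1, 3, 4, 4, 6]
print(last_min)    # [-1, 3, 4, 6, 7]
```

Reading the trace: last_min[4] = 7 says the best subsequence of length 4 ends at index 7, and pre_cursor links 7 -> 6 -> 4 -> 3 give the values 7, 6, 3, 1 in reverse order.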

    # Description: get the LIS itself with the O(n log n) DP algorithm.
    # Parameters: self.list
    # Returns: the LIS
    def get_lis_dp_nlogn(self):
        list_len = len(self.list)
        pre_cursor = [-1]*(list_len+1)       # pre_cursor[i]: index in self.list of the predecessor of self.list[i] in the LIS ending at self.list[i]
        last_min = [-1]*(list_len+1)         # last_min[L]: index in self.list of the smallest tail value among increasing subsequences of length L
        max_lis_len = 0                      # length of the longest LIS found so far
        max_lis_end_pos = -1
        for i in range(list_len):
            # max = self.binary_search(last_min, 1, i, self.list[i])  # start is 1 because lengths start at 1; last_min[0] stays -1 as a sentinel
            max = self.binary_search_new(last_min, 1, i, self.list[i])
            pre_cursor[i] = last_min[max]
            i_len = max + 1
            # update the index of the smallest tail value for this length
            if last_min[i_len] == -1 or self.list[i] < self.list[last_min[i_len]]:
                last_min[i_len] = i
            # update the LIS length and the LIS end position
            if i_len > max_lis_len:
                max_lis_len = i_len
                max_lis_end_pos = i

        # backtrack to recover the LIS
        lis = [-1]*max_lis_len
        curr_len = max_lis_len-1
        curr_pos = max_lis_end_pos
        while curr_len >= 0:
            lis[curr_len] = self.list[curr_pos]
            curr_pos = pre_cursor[curr_pos]
            curr_len -= 1
        print("One LIS from the nlogn algorithm is", str(lis))
        return lis
    
    # Description: recursive O(log n) binary search for the number of elements that precede target when target is the last element of an LIS
    # Parameters: self.list
    #   last_min: last_min[L] is the index in self.list of the smallest tail value among increasing subsequences of length L
    #   start, end: the search range is the index interval [start, end]
    #   target: the value to compare against
    # Returns: the number of elements that precede target when target is the last element of an LIS
    def binary_search(self, last_min, start, end, target):
        if start > end:
            # print("start > end")
            return start-1  # only happens on the first call, where i == 0 gives the range [1, 0]
        mid = (start+end) // 2
        if last_min[mid] == -1 or target <= self.list[last_min[mid]]:
            if start == mid:
                return start-1
            else:
                # return self.binary_search(last_min, start, mid-1, target)
                return self.binary_search(last_min, start, mid, target)
        else:
            if mid == end:
                return end
            else:
                return self.binary_search(last_min, mid+1, end, target)
    
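    # Description: iterative O(log n) binary search for the number of elements that precede target when target is the last element of an LIS
    # Parameters: self.list
    #   last_min: last_min[L] is the index in self.list of the smallest tail value among increasing subsequences of length L
    #   start, end: the search range is the index interval [start, end]
    #   target: the value to compare against
    # Returns: the number of elements that precede target when target is the last element of an LIS
    def binary_search_new(self, last_min, start, end, target):
        # nothing has been scanned yet, so nothing can precede target
        if end == 0:
            return 0
        low = start
        high = end
        while low <= high:
            mid = (low+high)//2
            if last_min[mid] == -1 or target <= self.list[last_min[mid]]:   # note: less than or equal
                if low == mid:     # here high == low or high == low+1
                    return low-1
                else:
                    # high = mid-1  # this line appears to behave the same as the one below
                    high = mid
            else:
                if mid == high:  # here low == mid == high
                    return high
                else:
                    low = mid+1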
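Functions added for easier testing

    # Description: command-line prompts and reading user input
    #   First read the test method: 1 means typing the array elements in by hand; 2 means entering an array size and letting the program generate the elements
    #   For 1, read the array elements and store them in self.list
    # Parameters: self.list
    # Returns: nothing
    def deal_input(self):
        test_method = int(input("Choose a test method:\nEnter 1 to type the array elements in by hand, or 2 to enter an array size and let the program generate the elements\n"))
        if test_method == 1:
            input_str = input("Enter the array elements, separated by spaces, then press Enter: ")
            for num in input_str.split():
                self.list.append(int(num))
            # print(self.list)
        elif test_method == 2:
            size = int(input("Enter the array size: "))
            # np.random.random_integers is deprecated; randint excludes the upper bound, so 10001 keeps the range 0..10000
            self.list = np.random.randint(0, 10001, size=size).tolist()
            print("Randomly generated array of size %d: %s"%(size, str(self.list)))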

Test code outside the class:

test = LIS()
test.deal_input()
test.get_lis_dp_nlogn()

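Sample runs:

Automatic test

C:\Users\liuxiang15\Desktop\homework3>python longest_increasing_subsequence.py
Choose a test method:
Enter 1 to type the array elements in by hand, or 2 to enter an array size and let the program generate the elements
2
Enter the array size: 20
Randomly generated array of size 20: [2356, 485, 7490, 7245, 2280, 1820, 5160, 7817, 6303, 5832, 553, 2266, 2430, 9268, 8846, 4507, 4588, 6657, 8618, 8371]
One LIS from the nlogn algorithm is [485, 553, 2266, 2430, 4507, 4588, 6657, 8618]

Manual input

C:\Users\liuxiang15\Desktop\homework3>python longest_increasing_subsequence.py
Choose a test method:
Enter 1 to type the array elements in by hand, or 2 to enter an array size and let the program generate the elements
1
Enter the array elements, separated by spaces, then press Enter: 5 9 4 1 3 7 6 7
One LIS from the nlogn algorithm is [1, 3, 6, 7]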

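My apologies: my abilities are limited and my understanding of these algorithms is not yet deep enough!

Criticism and corrections are welcome in the comments below!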
