100 Key and Difficult Problems in Data Structures and Algorithms

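This blog post collects 100 typical data structure and algorithm problems, including reservoir sampling, AUC computation, Frog Jump, stock trading, and binary tree traversal. Working through them gives a deeper understanding of how these algorithms are implemented and applied, and improves problem-solving ability in programming.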

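Tricky problems

1420. Build Array Where You Can Find The Maximum Exactly K Comparisons

Reservoir sampling

cnblogs (博客园): Reservoir sampling algorithm

Sample $K$ items from $N$ items uniformly and without replacement in $O(N)$ time.

It is used when the size of the data is not known in advance, and it guarantees that every item is selected with equal probability.

  • Algorithm steps

Let the data sequence have size $n$ and let $k$ be the number of samples required.

First build an array that holds $k$ elements and put the first $k$ elements of the sequence into it.

Then, starting from element $k+1$, decide with probability $k/n$ (where $n$ is the number of elements seen so far, which grows as the stream is read) whether the element replaces one in the array; every element in the array is equally likely to be the one replaced. After all elements have been visited, the elements left in the array are the required sample.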
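
The steps above can be sketched for a general $k$ as follows. This is a minimal illustration rather than the author's solution: the function name `reservoir_sample` and the explicit `rng` parameter are assumptions introduced here so that the run is reproducible.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Draw k items uniformly without replacement from a stream of unknown length."""
    reservoir: List[T] = []
    for n, item in enumerate(stream, start=1):  # n = number of items seen so far
        if n <= k:
            reservoir.append(item)      # the first k items fill the reservoir
        else:
            j = rng.randrange(n)        # uniform integer in [0, n)
            if j < k:                   # true with probability k / n
                reservoir[j] = item     # evict a uniformly chosen slot
    return reservoir


sample = reservoir_sample(iter(range(1000)), 5, random.Random(0))
short = reservoir_sample(range(3), 5, random.Random(1))
print(sample, short)
```

When the stream holds fewer than `k` items, the sketch simply returns all of them.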

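import random


class Solution:

    def __init__(self, head: ListNode):
        """
        @param head The linked list's head.
        Note that the head is guaranteed to be not null, so it contains at least one node.
        ListNode is the node class supplied by the judge (fields: val, next).
        """
        self.K = 1  # reservoir size; this problem needs a single value
        self.head = head

    def getRandom(self) -> int:
        """
        Returns a random node's value.
        """
        res = []  # fresh reservoir on every call, so repeated calls keep no stale state
        p = self.head
        i = 0  # index of the current node
        while p:
            if i < self.K:
                # the first K nodes fill the reservoir
                res.append(p.val)
            else:
                # node i replaces a random slot with probability K / (i + 1)
                ix = random.randint(0, i)
                if ix < self.K:
                    res[ix] = p.val
            i += 1
            p = p.next
        return res[0]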

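ByteDance recommendation-algorithm interview notes: passed three technical rounds

This problem is also reservoir sampling.

The shuffle algorithm is essentially reservoir sampling too, with the replacement step turned into a swap.

Full shuffle and a shuffle that draws m items: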

from random import randint


def shuffle(nums):
    n = len(nums)
    for i in range(n):
        ri = randint(0, i)
        nums[i], nums[ri] = nums[ri], nums[i]
    return nums


def shuffle_m(nums, m):
    n = len(nums)
    for i in range(n):
        ri = randint(0, i)
        if ri < m:
            nums[i], nums[ri] = nums[ri], nums[i]
    return nums[:m]


print(shuffle(list(range(10))))
print(shuffle_m(list(range(10)),3))

Alias Sampling

Alias Method: a discrete sampling method with O(1) time per draw

python3 Alias Sample

import random
from typing import List


class Solution:

    def __init__(self, w: List[int]):
        N = len(w)
        sum_ = sum(w)
        prob = [p / sum_ for p in w]
        alias = [0] * N
        alias_prob = [p * N for p in prob]
        small_q = []
        large_q = []
        for i, p in enumerate(alias_prob):
            if p < 1:
                small_q.append(i)
            else:
                large_q.append(i)
        while small_q and large_q:
            small = small_q.pop(0)
            large = large_q.pop(0)
            alias[small] = large
            alias_prob[large] -= (1 - alias_prob[small])
            if alias_prob[large] < 1:
                small_q.append(large)
            else:
                large_q.append(large)
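        # whatever is left over equals 1 up to rounding error; pin it to exactly 1
        for i in small_q + large_q:
            alias_prob[i] = 1.0
        self.alias = alias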
        self.N = N
        self.alias_prob = alias_prob

    def pickIndex(self) -> int:
        ix = random.randint(0, self.N - 1)
        p = random.random()
        if p < self.alias_prob[ix]:
            return ix
        else:
            return self.alias[ix]

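Other algorithms related to random numbers:

  • Linear congruential method

linear congruential generator (LCG)

$RandSeed = (A \times RandSeed + B) \bmod M$

The period of an LCG is at most $M$, and in most cases it is shorter. For the LCG to reach the full period $M$ (with $B \neq 0$), the following conditions (the Hull-Dobell theorem) should hold:

  • $B$ and $M$ are coprime.
  • $A - 1$ is divisible by every prime factor of $M$.
  • If $M$ is a multiple of 4, then $A - 1$ is also a multiple of 4.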
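
As a small check of the full-period claim, the sketch below measures the cycle length of an LCG for two toy parameter sets. The helper name `cycle_length` and the parameter values are assumptions chosen for illustration: $(A, B, M) = (5, 3, 16)$ satisfies the Hull-Dobell conditions, while $(3, 3, 16)$ does not because $A - 1 = 2$ is not a multiple of 4.

```python
def cycle_length(seed: int, a: int, b: int, m: int) -> int:
    """Length of the cycle that the LCG state eventually falls into."""
    first_seen = {}          # state -> step at which it first appeared
    state, step = seed, 0
    while state not in first_seen:
        first_seen[state] = step
        state = (a * state + b) % m   # RandSeed = (A * RandSeed + B) % M
        step += 1
    return step - first_seen[state]


full = cycle_length(0, 5, 3, 16)      # parameters meeting the full-period conditions
short = cycle_length(0, 3, 3, 16)     # A - 1 is not a multiple of 4
print(full, short)
```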

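  • Generating a Gaussian distribution with the central limit theorem

Inverse transform method

In general, if a probability distribution has the distribution function $y = F(x)$, then $y$ ranges over 0 to 1. Take its inverse function $G$ and feed it random numbers drawn uniformly from 0 to 1; the outputs are random numbers that follow the distribution:

$y = G(x)$, where $x$ now denotes the uniform input.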
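
A minimal sketch of the inverse transform method, using the exponential distribution as an example because its distribution function $F(x) = 1 - e^{-\lambda x}$ has the closed-form inverse $G(u) = -\ln(1 - u)/\lambda$. The choice of distribution and the name `sample_exponential` are assumptions made here for illustration.

```python
import math
import random


def sample_exponential(lam: float, rng: random.Random) -> float:
    """Draw from Exp(lam) by inverting its CDF F(x) = 1 - exp(-lam * x)."""
    u = rng.random()                   # uniform in [0, 1)
    return -math.log(1.0 - u) / lam    # 1 - u lies in (0, 1], so log() is defined


rng = random.Random(0)
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(mean)  # expected to be close to 1 / lam = 0.5
```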

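Central limit theorem

Generate a large number of batches; in each batch draw 12 random numbers uniform on $[0,1]$ and sum them. The sum has variance $1/12 \times 12 = 1$ and mean $1/2 \times 12 = 6$. Then subtract 6. By the central limit theorem each shifted sum is approximately $\mathcal{N}(0,1)$, so with enough batches the histogram of the results approaches the standard normal shape.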

import numpy as np
import pylab as plt

n = 12
N = 5000
x = np.zeros([N])
for j in range(N):
    a = np.random.rand(n)
    u = sum(a)
    x[j] = u - n * 0.5
plt.hist(x)
plt.show()

Box Muller

import numpy as np
import pylab as plt

N = 1000
# 1 - rand() lies in (0, 1], which keeps log() away from 0
x1 = 1.0 - np.random.rand(N)
x2 = np.random.rand(N)
y1 = np.sqrt(-2 * np.log(x1)) * np.cos(2 * np.pi * x2)
y2 = np.sqrt(-2 * np.log(x1)) * np.sin(2 * np.pi * x2)
# 1-D arrays, so hist() sees a single dataset of 2N samples
y = np.hstack([y1, y2])
plt.hist(y)
plt.show()

  • Generating a Gaussian distribution with accept-reject sampling

Inverse sampling (Inverse Sampling) and rejection sampling (Reject Sampling) explained in detail
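
One possible accept-reject scheme for the standard normal is sketched below; it is not taken from the linked article. It assumes an Exp(1) proposal for the half-normal density: a draw $x$ is accepted with probability $e^{-(x-1)^2/2}$, and a random sign is attached afterwards. The function name `sample_standard_normal` is hypothetical.

```python
import math
import random


def sample_standard_normal(rng: random.Random) -> float:
    """Accept-reject sampling of N(0, 1) with an Exp(1) proposal for |X|."""
    while True:
        x = rng.expovariate(1.0)                        # proposal draw, x >= 0
        accept_prob = math.exp(-0.5 * (x - 1.0) ** 2)   # target / (c * proposal)
        if rng.random() <= accept_prob:
            # x now follows the half-normal; flip a fair coin for the sign
            return x if rng.random() < 0.5 else -x


rng = random.Random(0)
draws = [sample_standard_normal(rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)  # expected to be close to 0 and 1
```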

  • Estimating $\pi$ with accept-reject sampling
from math import sqrt
from random import random

def sample_pi(max_cnt=100000):
    acc_cnt=0
    for i in  range(max_cnt):
        x=random()
        y=random()
        if sqrt(x**2+y**2)<1:
            acc_cnt+=1
    print("pi =",acc_cnt/max_cnt*4)

sample_pi()
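  • Generating a uniform distribution from Gaussians

Can a normal distribution be used to generate a uniform distribution?

$\frac{1}{\pi}\arctan\left(\frac{z_0}{z_1}\right) + 0.5$

To generate a two-dimensional standard normal distribution, take two independent standard normal variables X and Y and pair them as (X, Y).

The two-dimensional standard normal distribution is isotropic in Cartesian coordinates: the angle between the direction of the point (X, Y) and the X axis (or any fixed direction) is uniformly distributed.

This is easy to see, since $\tan(\theta) = \frac{y}{x}$. Dividing the arctangent by $\pi$ maps its range $(-\pi/2, \pi/2)$ onto an interval of length 1, and adding 0.5 shifts that interval to $(0, 1)$.

In other words, use Box-Muller in reverse.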
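
The idea can be checked numerically with the sketch below. It assumes the arctangent is divided by $\pi$ so that the result lands in $(0, 1)$; the function name `uniform_from_gaussians` is introduced here only for illustration.

```python
import math
import random


def uniform_from_gaussians(rng: random.Random) -> float:
    """Map two independent N(0, 1) draws to one value uniform on (0, 1)."""
    z0 = rng.gauss(0.0, 1.0)
    z1 = rng.gauss(0.0, 1.0)
    while z1 == 0.0:                  # guard against division by zero
        z1 = rng.gauss(0.0, 1.0)
    # z0 / z1 is Cauchy distributed, and arctan(t) / pi + 0.5 is the Cauchy CDF
    return math.atan(z0 / z1) / math.pi + 0.5


rng = random.Random(0)
us = [uniform_from_gaussians(rng) for _ in range(100_000)]
mean = sum(us) / len(us)
below_quarter = sum(1 for u in us if u < 0.25) / len(us)
print(mean, below_quarter)  # expected to be close to 0.5 and 0.25
```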


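Computing AUC

Sort the (label, sample) pairs by prob. If the model ranks well, the result should be $[0,0,1,1,1]$, with $M=3$ positive samples and $N=2$ negative samples.

Taking only the positive samples gives the rank list rankList $= [3,4,5]$, which sums to 12. With the rank formula $AUC = \frac{\sum_{i \in pos} rank_i - M(M+1)/2}{M \times N}$, the subtracted term is $3 \times 4 / 2 = 6$, so the numerator is $12 - 6 = 6$; the denominator is $3 \times 2 = 6$, so the result is 1, which matches the definition.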

def calAUC(prob,labels):
  f = list(zip(prob,labels))
  rank = [values2 for values1,values2 in sorted(f,key=lambda x:x[0])]
  rankList = [i+1 for i in range(len(rank)) if rank[i]==1]
  posNum = 0
  negNum = 0
  for i in range(len(labels)):
    if(labels[i]==1):
      posNum+=1
    else:
      negNum+=1
  auc = 0
  auc = (sum(rankList)- (posNum*(posNum+1))/2)/(posNum*negNum)
  print(auc)
  return auc
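
As a sanity check, the rank formula can be compared with the pairwise definition of AUC: the fraction of (positive, negative) pairs in which the positive sample scores higher. The sketch below assumes there are no tied scores; the names `auc_by_rank` and `auc_by_pairs` are introduced here for illustration.

```python
from typing import Sequence


def auc_by_rank(prob: Sequence[float], labels: Sequence[int]) -> float:
    """AUC from the sum of the ranks of the positive samples."""
    order = sorted(range(len(prob)), key=lambda i: prob[i])
    rank_sum = sum(r for r, i in enumerate(order, start=1) if labels[i] == 1)
    m = sum(1 for y in labels if y == 1)   # number of positives
    n = len(labels) - m                    # number of negatives
    return (rank_sum - m * (m + 1) / 2) / (m * n)


def auc_by_pairs(prob: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [p for p, y in zip(prob, labels) if y == 1]
    neg = [p for p, y in zip(prob, labels) if y != 1]
    wins = sum(1 for p in pos for q in neg if p > q)
    return wins / (len(pos) * len(neg))


perfect = auc_by_rank([0.1, 0.2, 0.3, 0.4, 0.5], [0, 0, 1, 1, 1])
by_rank = auc_by_rank([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
by_pairs = auc_by_pairs([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(perfect, by_rank, by_pairs)
```

With tied scores the two functions can disagree, because the rank version breaks ties by position; averaging the ranks of tied samples is the usual remedy.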

1363. Largest Multiple of Three

from typing import List


class Solution:
    def largestMultipleOfThree(self, digits: List[int]) -> str:
        n=len(digits)
        count=[0]*10
        modulo=[0]*3
        s=0
        for digit in digits:
            s+=digit
            count[digit]+=1
            modulo[digit%3]+=1
        if s%3==1:
            if modulo[1]>=1:
                rcnt=1
                rmod=1
            else:
                rcnt=2
                rmod=2
        elif s%3==2:
            if modulo[2]>=1:
                rcnt=1
                rmod=2
            else:
                rcnt=2
                rmod=1
        else:
            rcnt=0
            rmod=0
        ans=""
        for i in range(10):
            for j in range(count[i]):
                if rcnt>0 and i%3==rmod:
                    rcnt-=1
                else:
                    ans+=str(i)
                
        if len(ans)>0 and ans[-1]=='0':
            return '0'
        return ans[::-1]

403. Frog Jump

Time complexity: $O(N^2)$

from typing import List


class Solution:
    def canCross(self, stones: List[int]) -> bool:
        mapper={}
        for stone in stones:
            mapper[stone]=set()
        mapper[0].add(0)
        for stone in stones:
            for k in mapper[stone]:
                for step in range(k-1,k+2):
                    if step>0 and stone+step in mapper:
                        mapper[stone+step].add(step)
        return len(mapper[stones[-1]])>0

Unique Paths (DP with obstacles)

62. Unique Paths

class Solution:
    def uniquePaths(self, m: int, n: int) -> int:
        dp=[[0]*(n+1) for _ in range(m+1)]
        dp[1][1]=1
        for i in range(1, m+1):
            for j in range(1, n+1):
                if i==1 and j==1:
                    continue
                dp[i][j]=dp[i-1][j]+dp[i][j-1]
        return dp[m][n]

63. Unique Paths II

from typing import List


class Solution:
    def uniquePathsWithObstacles(self, obstacleGrid: List[List[int]]) -> int:
        m=len(obstacleGrid)
        n=len(obstacleGrid[0])
        dp=[[0]*(n+1) for _ in range(m+1)]
        # dp[1][1]=1
        for i in range(1, m+1):
            for j in range(1, n+1):
                if obstacleGrid[i-1][j-1]==1:
                    continue
                if i==1 and j==1:
                    dp[1][1]=1
                    continue
                dp[i][j]=dp[i-1][j]+dp[i][j-1]

        return dp[m][n]

Hard stock problems

188. Best Time to Buy and Sell Stock IV

from math import inf
from typing import List


class Solution:
    def maxProfit(self, k: int, prices: List[int]) -> int:
        N=len(prices)
        K=k
        dp=[[[0]*2 for _ in range(K+1)] for _ in range(N+1)]
        for k in range(K+1):
            dp[0][k][1]=-inf
        for i in range(N+1):
            dp[i][0][1]=-inf
        for i in range(1, N+1):
            # for k in range(K,0,-1):
            for k in range(1, K+1):
                # not holding: rest, or sell today
                dp[i][k][0]=max(dp[i-1][k][0], dp[i-1][k][1]+prices[i-1])
                # holding: rest, or buy today (this opens the k-th transaction)
                dp[i][k][1]=max(dp[i-1][k][1], dp[i-1][k-1][0]-prices[i-1])
        return dp[N][K][0]

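309. Best Time to Buy and Sell Stock with Cooldown

from math import inf
from typing import List


class Solution:
    def maxProfit(self, prices: List[int]) -> int:
        N=len(prices)
        dp=[[0]*2 for _ in range(N+1)]
        dp[0][1]=-inf
        for i in range(1, N+1):
            # not holding: rest, or sell today
            dp[i][0]=max(dp[i-1][0],dp[i-1][1]+prices[i-1])
            # holding: rest, or buy today (the previous sale must be at least two days back)
            prev=dp[i-2][0] if i>=2 else 0  # avoid the negative index dp[-1] on day 1
            dp[i][1]=max(dp[i-1][1],prev-prices[i-1])
        return dp[N][0]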

714. Best Time to Buy and Sell Stock with Transaction Fee

from math import inf
from typing import List


class Solution:
    def maxProfit(self, prices: List[int], fee: int) -> int:
        if not prices:
            return 0
        N = len(prices)
        dp_i_0 = 0
        dp_i_1 = -inf
        for i in range(N):
            dp_i_0 = max(dp_i_0, dp_i_1 + prices[i] - fee)
            dp_i_1 = max(dp_i_1, dp_i_0 - prices[i])
        return dp_i_0

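315. Count of Smaller Numbers After Self

This is much like counting inversions, except that the sort is performed on indices (the original positions must be tracked).

As with inversions, the counting happens in the merge step; everything else is essentially unchanged.

from typing import List


class Solution: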
    def countSmaller(self, nums: List[int]) -> List[int]:

        counts = [0] * len(nums)
        index = [i for i in range(len(nums))]

        def merge_sort(l, r):
            if l >= r:
                return
            mid = (r + l) // 2
            merge_sort(l, mid)
            merge_sort(mid + 1, r)
            # count
            b = mid + 1
            for a in range(l, mid + 1):
                while b <= r and nums[index[b]] < nums[index[a]]:
                    b += 1
                counts[index[a]] += b - mid - 1
            # merge
            a = l
            b = mid + 1
            merged = []
            while a <= mid and b <= r:
                if nums[index[a]] <= nums[index[b]]:
                    merged.append(index[a])
                    a += 1
                else:
                    merged.append(index[b])
                    b += 1
            while a <= mid:
                merged.append(index[a])
                a += 1
            while b <= r:
                merged.append(index[b])
                b += 1
            index[l:r + 1] = merged

        merge_sort(0, len(nums) - 1)
        # print(index)
        return counts
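
For comparison, the inversion count that the text refers to can be sketched with the same merge-sort skeleton. This is an illustrative version, not the author's code; the function name `count_inversions` is an assumption.

```python
from typing import List, Tuple


def count_inversions(nums: List[int]) -> int:
    """Count pairs (i, j) with i < j and nums[i] > nums[j] via merge sort."""

    def sort_count(arr: List[int]) -> Tuple[List[int], int]:
        if len(arr) <= 1:
            return arr, 0
        mid = len(arr) // 2
        left, inv_left = sort_count(arr[:mid])
        right, inv_right = sort_count(arr[mid:])
        merged: List[int] = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
                inv += len(left) - i   # every remaining left item exceeds right[j]
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(nums))[1]


print(count_inversions([7, 5, 6, 4]))
```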

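255. Verify Preorder Sequence in Binary Search Tree

Values decrease while being pushed and increase while being popped.

Python3 illustrated solution, stack

from math import inf
from typing import List


class Solution:
    def verifyPreorder(self, preorder: List[int]) -> bool:
        root=-inf
        stack=[]
        # locally decreasing (the stack), globally increasing (the popped lower bound)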
        for val in preorder:
            if val<root:
                return False
            while stack and stack[-1]<val:
                root=stack.pop()
            stack.append(val)
        return True

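剑指 Offer 33. Postorder Traversal Sequence of a Binary Search Tree

Interview problem 33. Postorder traversal sequence of a binary search tree (recursive divide and conquer / monotonic stack, clearly illustrated)

from math import inf
from typing import List


class Solution:
    def verifyPostorder(self, postorder: List[int]) -> bool:
        root=inf
        stack=[]
        # locally increasing (the stack), globally decreasing (the popped upper bound)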
        for val in reversed(postorder):
            if val>root:
                return False
            while stack and stack[-1]>val:
                root=stack.pop()
            stack.append(val)
        return True

376. Wiggle Subsequence

from typing import List


class Solution:
    def wiggleMaxLength(self, nums: List[int]) -> int:
        n=len(nums)
        if n<=1:
            return n
        pre_delta=nums[1]-nums[0]
        ans=1 if pre_delta==0 else 2
        for i in range(2,n):
            delta=nums[i]-nums[i-1]
            if (delta<0 and pre_delta>=0) or (delta>0 and pre_delta<=0):
                ans+=1
                pre_delta=delta
        return ans

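582. Kill Process

  • Union-find

Consider an extreme case in which the tree degenerates into a linked list: without path compression the time complexity shoots up to $O(N^2)$ and the solution gets TLE.

With a hashmap-based union-find, a missing node (x not in parent) can simply be handled by setting parent[x]=x and returning x, which removes the need for any initialization.

from typing import List


class Solution: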
    def killProcess(self, pid: List[int], ppid: List[int], kill: int) -> List[int]:
        result = [kill]
        parent = dict(zip(pid, ppid))
        parent[0] = 0
        for i in parent.keys():
            x = self.find(i, parent, kill)
            if x == kill:
                result.append(i)
        return result

    def find(self, x, parent, kill):
        # stop early at kill
        if parent[x] == kill:
            return kill
        # path compression
        if parent[x] != x:
            parent[x] = self.find(parent[x], parent, kill)
        return parent[x]
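
The lazy-initialization trick mentioned above for a hashmap-based union-find can be sketched on its own as follows. The class name `LazyUnionFind` is hypothetical, and the iterative path compression is one possible choice rather than the author's.

```python
class LazyUnionFind:
    """Union-find backed by a dict; nodes are registered on first use."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        if x not in self.parent:       # unseen node: it becomes its own root
            self.parent[x] = x
            return x
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # path compression: point every node on the path at the root
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


uf = LazyUnionFind()
uf.union(1, 2)
uf.union(2, 3)
print(uf.find(1) == uf.find(3), uf.find(4))
```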
  • Simulate the tree

This is indeed a good approach.

Either way the idea is similar; this version runs in $O(N)$.

# Simulate the tree (hash map)
class Node:
    def __init__(self,val,children):
        self.val = val
        self.children = children 

class Solution:
    def killProcess(self, pid: List[int], ppid: List[int], kill: int) -> List[int]:
        def getAllChildren(p, l):
            """Recursively collect all descendant processes."""
            for n in p.children:
                l.append(n.val)
                getAllChildren(n,l)

        mapping = {}
        for val in pid:
            mapping[val] = Node(val,[])
        for i in range(len(ppid)): # link each parent to its child
            if ppid[i] >0:
                cur = mapping[ppid[i]]
                cur.children.append(mapping[pid[i]])

        l  = []
        l.append(kill)
        getAllChildren(mapping[kill],l)
        return l 
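
The same $O(N)$ idea can also be written without recursion, which sidesteps Python's recursion limit on very deep process chains. The sketch below is an alternative under that assumption, not the author's code; it uses a children map and breadth-first search, and the name `kill_process` is introduced here.

```python
from collections import defaultdict, deque
from typing import Dict, List


def kill_process(pid: List[int], ppid: List[int], kill: int) -> List[int]:
    """Return kill and all of its descendants, in breadth-first order."""
    children: Dict[int, List[int]] = defaultdict(list)
    for child, parent in zip(pid, ppid):
        children[parent].append(child)

    killed: List[int] = []
    queue = deque([kill])
    while queue:
        cur = queue.popleft()
        killed.append(cur)
        queue.extend(children.get(cur, []))   # .get avoids creating empty entries
    return killed


print(kill_process([1, 3, 10, 5], [3, 0, 5, 3], 5))
```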

718. Maximum Length of Repeated Subarray

  • Dynamic programming

$O(NM)$

from typing import List


class Solution:
    def findLength(self, A: List[int], B: List[int]) -> int:
        n, m = len(A), len(B)
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        ans = 0
        for i in range(1,n+1):
            for j in range(1,m+1):
                dp[i][j] = dp[i - 1][j - 1] + 1 if A[i-1] == B[j-1] else 0
                ans = max(ans, dp[i][j])
        return ans

  • Sliding window

from typing import List


class Solution:
    def findLength(self, A: List[int], B: List[int]) -> int:
        def maxLength(addA: int, addB: int, length: int) -> int:
            ret = k = 0
            for i in range(length):
                if A[addA + i] == B[addB + i]:
                    k += 1
                    ret = max(ret, k)
                else:
                    k = 0
            return ret

        n, m = len(A), len(B)
        ret = 0
        for i in range(n):
            length = min(m, n - i)
            ret = max(ret, maxLength(i, 0, length))
        for i in range(m):
            length = min(n, m - i)
            ret = max(ret, maxLength(0, i, length))
        return ret

C++ version of the same idea:

#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    int findLength(vector<int> &A, vector<int> &B) {
        int m = A.size(), n = B.size(), res = 0;
        // enumerate every alignment (offset) between A and B
        for (int diff = -(m - 1); diff <= n - 1; ++diff) {
            // scan the overlapping part
            for (int i = max(0, -diff), l = 0; i < min(m, n - diff); ++i) {
                l = (A[i] == B[i + diff]) ? (l + 1) : 0;
                res = max(res, l);
            }
        }
        return res;
    }
};

剑指 Offer 41. Find Median from Data Stream

#include <functional>
#include <queue>
#include <vector>
using namespace std;

class MedianFinder {
public:
    priority_queue<int> lo;                             // max-heap: smaller half
    priority_queue<int, vector<int>, greater<int> > hi; // min-heap: larger half

    void addNum(int num) {
        lo.push(num);
        hi.push(lo.top());
        lo.pop();
        // keep lo at least as large as hi
        if (lo.size() < hi.size()) {
            lo.push(hi.top());
            hi.pop();
        }
    }

    double findMedian() {
        if (lo.size() == hi.size()) {
            return (lo.top() + hi.top()) / 2.0;
        } else {
            return lo.top();
        }
    }
};

516. Longest Palindromic Subsequence

class Solution:
    def longestPalindromeSubseq(self, s: str) -> int:
        n = len(s)
        dp = [[0] * n for _ in range(n)]
        # initialization 1: every single character is a palindrome
        for i in range(n):
            dp[i][i] = 1
        # initialization 2: adjacent pairs
        for i in range(n - 1):
            dp[i][i + 1] = 2 if s[i] == s[i + 1] else 1
        # iterate
        # gap between the two endpoints
        for k in range(2,n):
            # start
            for i in range(n - k):
                # end
                j = i + k
                if s[i] == s[j]:
                    dp[i][j] = dp[i + 1][j - 1] + 2
                else:
                    dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])
        return dp[0][n - 1]

97. Interleaving String

class Solution:
    def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
        l1=len(s1)
        l2=len(s2)
        l3=len(s3)
        if l1+l2!=l3:
            return False
        dp=[[False]*(l2+1) for _ in range(l1+1)]
        dp[0][0]=True
        for i in range(1,l1+1):
            dp[i][0]=(dp[i-1][0] and s1[i-1]==s3[i-1])
        for i in range(1,l2+1):
            dp[0][i]=(dp[0][i-1] and s2[i-1]==s3[i-1])
        for i in range(1,l1+1):
            for j in range(1,l2+1):
                if s3[i+j-1]==s1[i-1] and dp[i-1][j]:
                    dp[i][j]=True
                elif s3[i+j-1]==s2[j-1] and dp[i][j-1]:
                    dp[i][j]=True
        return dp[l1][l2]

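416. Partition Equal Subset Sum

todo: the condition could be changed to splitting the items into two subsets as evenly as possible, and to recovering which items were chosen

This problem is really a 0/1 knapsack in disguise.

from typing import List


class Solution:
    def canPartition(self, nums: List[int]) -> bool:
        L = len(nums)
        sum_ = sum(nums)
        if sum_ % 2:
            return False
        target = sum_ // 2
        # L + 1 rows, target + 1 columns
        dp = [[False] * (target + 1) for _ in range(L + 1)]
        # capacity 0 can always be filled exactly
        dp[0][0] = True
        # work on a copy so that nums is 1-indexed without mutating the input
        nums = [0] + nums
        # run the DP
        for i in range(1, L+1):
            for j in range(target + 1):
                if j >= nums[i]:
                    dp[i][j] = dp[i - 1][j] or dp[i - 1][j - nums[i]]
                else:
                    dp[i][j] = dp[i - 1][j]
            # the first i items already fill the knapsack exactly
            if dp[i][target]:
                return True
        return False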
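
Because each row of the table depends only on the previous row, the 0/1 knapsack view also allows a one-dimensional variant. The sketch below is an illustrative alternative, not the author's code; the name `can_partition` is an assumption, and capacities are scanned from high to low so that each number is used at most once.

```python
from typing import List


def can_partition(nums: List[int]) -> bool:
    """0/1 knapsack with a rolling 1-D table: can some subset sum to half the total?"""
    total = sum(nums)
    if total % 2:
        return False
    target = total // 2
    reachable = [False] * (target + 1)
    reachable[0] = True                       # the empty subset sums to 0
    for x in nums:
        # go downwards so that reachable[cap - x] still refers to the previous item row
        for cap in range(target, x - 1, -1):
            if reachable[cap - x]:
                reachable[cap] = True
        if reachable[target]:
            return True
    return reachable[target]


print(can_partition([1, 5, 11, 5]), can_partition([1, 2, 3, 5]))
```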

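312. Burst Balloons

Classic dynamic programming: the Burst Balloons problem

To be studied further when time permits.

from typing import List


class Solution: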
    def maxCoins(self, nums: List[int]) -> int:
        n = len(nums)
        points = [1] * (n + 2)
        for i in range(n):
            points[i + 1] = nums[i]
        dp = [[0] * (n + 2) for _ in range(n + 2)]
        for itv in range(2, n+2):  # [2,n+1]
            for i in range(0, n - itv + 2):
                j = i + itv
                for k in range(i + 1, j):
                    dp[i][j] = max(
                        dp[i][j],
                        dp[i][k] + dp[k][j] + points[i] * points[j] * points[k],
                    )
        return dp[0][n + 1]

583. Delete Operation for Two Strings

def longestCommonSubsequence(text1: str, text2: str) -> int:
    l1=len(text1)
    l2=len(text2)
    dp=[[0]*(l2+1) for _ in range(l1+1)]
    for i in range(1, l1+1):
        for j in range(1, l2+