Python String Processing: 15 Practice Problems

Latest recommended article published 2025-03-13 18:40:23

小炫y

Tags: python, programming languages, algorithms

Article link: https://blog.csdn.net/weixin_44740756/article/details/132475743

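This article presents a range of string-processing algorithms: longest substring without repeating characters, palindrome validation, reversing word order, string-to-integer conversion, longest palindromic substring, longest common prefix, case swapping, length of the last word, vowel reversal, bracket validation, index of the first unique character, anagram search, palindrome partitioning, minimum window substring, string permutation, and most frequent words.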

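Problem 1: Length of the Longest Substring Without Repeating Characters
Given a string, find the length of the longest substring that contains no repeated characters.
Example:
Input: "abcabcbb"
Output: 3
Explanation: The longest substring without repeating characters is "abc", which has length 3.
Solution: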

def length_of_longest_substring(s):
    max_length = 0
    start = 0
    char_index = {}

    for i, char in enumerate(s):
        if char in char_index and char_index[char] >= start:
            start = char_index[char] + 1
        char_index[char] = i
        max_length = max(max_length, i - start + 1)

    return max_length
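
The explanation in the example names the substring itself ("abc"), while the solution only returns its length. As a minimal sketch, the same sliding window can also remember where the best window started. The function name `longest_unique_substring` and the variables `best_start` and `best_len` are illustrative and not part of the original solution.

```python
def longest_unique_substring(s):
    """Return the first longest substring of s with no repeated characters."""
    last_seen = {}              # character -> index of its most recent occurrence
    start = 0                   # left edge of the current window
    best_start, best_len = 0, 0

    for i, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge past it
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1

    return s[best_start:best_start + best_len]


print(longest_unique_substring("abcabcbb"))  # abc
```

Because the comparison is strict, ties keep the earliest window.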

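Problem 2: Valid Palindrome
Given a string, determine whether it is a palindrome. Consider only letters and digits, and ignore case.
Example:
Input: "A man, a plan, a canal: Panama"
Output: True
Solution: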

def is_palindrome(s):
    s = ''.join(filter(str.isalnum, s)).lower()
    return s == s[::-1]
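
The solution above builds a filtered, lowercased copy of the string and then a reversed copy. If the extra memory matters, a two-pointer scan is a common alternative. The sketch below is one possible version; the name `is_palindrome_two_pointers` is an assumption made for illustration, and it lowercases character by character rather than the whole string at once.

```python
def is_palindrome_two_pointers(s):
    """Check for a palindrome in place, skipping non-alphanumeric characters."""
    left, right = 0, len(s) - 1
    while left < right:
        if not s[left].isalnum():
            left += 1                      # skip punctuation/space on the left
        elif not s[right].isalnum():
            right -= 1                     # skip punctuation/space on the right
        else:
            if s[left].lower() != s[right].lower():
                return False
            left += 1
            right -= 1
    return True


print(is_palindrome_two_pointers("A man, a plan, a canal: Panama"))  # True
print(is_palindrome_two_pointers("race a car"))                      # False
```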

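Problem 3: Reverse the Order of Words in a String
Given a string, reverse the order of its words. Words are separated by whitespace, and the result joins them with single spaces.
Example:
Input: "the sky is blue"
Output: "blue is sky the"
Solution: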

def reverse_words(s):
    words = s.split()
    return ' '.join(words[::-1])

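Problem 4: String to Integer (atoi)
Implement a function that converts a string to an integer. The solution below skips leading whitespace, accepts an optional sign, stops at the first non-digit character, and clamps the result to the 32-bit signed range.
Example:
Input: "42"
Output: 42
Solution: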

def my_atoi(s):
    s = s.strip()
    if not s:
        return 0
    
    sign = 1
    if s[0] in ['+', '-']:
        if s[0] == '-':
            sign = -1
        s = s[1:]
    
    num = 0
    for char in s:
        if not char.isdigit():
            break
        num = num * 10 + int(char)
    
    num = min(max(num * sign, -2**31), 2**31 - 1)
    return num
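
The parsing rules in `my_atoi` (optional whitespace, optional sign, leading digits, clamp) can also be expressed as a single regular expression. The following is a rough sketch under that assumption, not a replacement for the solution above; `atoi_regex`, `_LEADING_INT`, `INT_MIN` and `INT_MAX` are illustrative names. The pattern deliberately matches only the ASCII digits 0-9.

```python
import re

INT_MIN, INT_MAX = -2**31, 2**31 - 1

# Optional leading whitespace, optional sign, then one or more ASCII digits
_LEADING_INT = re.compile(r"\s*([+-]?[0-9]+)")


def atoi_regex(s):
    """Parse a leading integer from s and clamp it to the 32-bit signed range."""
    match = _LEADING_INT.match(s)
    if match is None:
        return 0                      # no leading integer found
    value = int(match.group(1))       # int() accepts an explicit '+' or '-'
    return max(INT_MIN, min(INT_MAX, value))


print(atoi_regex("   -42"))            # -42
print(atoi_regex("4193 with words"))   # 4193
print(atoi_regex("-91283472332"))      # -2147483648
```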

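Problem 5: Longest Palindromic Substring
Given a string, find its longest palindromic substring.
Example:
Input: "babad"
Output: "bab" ("aba" is an equally long palindrome; the solution below returns the first one it finds)
Solution: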

def longest_palindrome(s):
    def expand_around_center(left, right):
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return s[left + 1:right]

    longest = ""
    for i in range(len(s)):
        # Odd-length palindrome centered at i
        palindrome_odd = expand_around_center(i, i)
        # Even-length palindrome centered between i and i + 1
        palindrome_even = expand_around_center(i, i + 1)
        
        if len(palindrome_odd) > len(longest):
            longest = palindrome_odd
        if len(palindrome_even) > len(longest):
            longest = palindrome_even
    
    return longest

Problem 1: Longest Common Prefix
Write a function that finds the longest common prefix among an array of strings.
Example:
Input: ["flower","flow","flight"]
Output: "fl"
Solution:

def longest_common_prefix(strs):
    if not strs:
        return ""
    
    min_length = min(len(s) for s in strs)
    common_prefix = ""
    
    for i in range(min_length):
        char = strs[0][i]
        if all(s[i] == char for s in strs):
            common_prefix += char
        else:
            break
    
    return common_prefix
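
The column-by-column comparison above maps naturally onto `zip`, which yields one tuple per character position and stops at the shortest string. This is only a sketch of an alternative formulation; the name `longest_common_prefix_zip` is made up for illustration.

```python
def longest_common_prefix_zip(strs):
    """Return the longest prefix shared by every string in strs."""
    prefix = []
    # zip(*strs) yields one tuple per column and stops at the shortest string
    for column in zip(*strs):
        if len(set(column)) != 1:     # characters in this column differ
            break
        prefix.append(column[0])
    return "".join(prefix)


print(longest_common_prefix_zip(["flower", "flow", "flight"]))  # fl
```

An empty input list needs no special case here, because `zip()` with no arguments yields nothing.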

Problem 2: Swap Case
Given a string, convert its uppercase letters to lowercase and its lowercase letters to uppercase.
Example:
Input: "Hello World"
Output: "hELLO wORLD"
Solution:

def swap_case(s):
    return s.swapcase()

Problem 3: Length of the Last Word
Given a string made up of words separated by spaces, find the length of the last word.
Example:
Input: "Hello World"
Output: 5
Solution:

def length_of_last_word(s):
    words = s.split()
    if words:
        return len(words[-1])
    return 0

Problem 4: Reverse the Vowels of a String
Write a function that takes a string as input and reverses only its vowels.
Example:
Input: "hello"
Output: "holle"
Solution:

def reverse_vowels(s):
    vowels = 'aeiouAEIOU'
    s = list(s)
    left, right = 0, len(s) - 1
    
    while left < right:
        while left < right and s[left] not in vowels:
            left += 1
        while left < right and s[right] not in vowels:
            right -= 1
        
        s[left], s[right] = s[right], s[left]
        left += 1
        right -= 1
    
    return ''.join(s)

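Problem 5: Valid Parentheses
Given a string containing only the three kinds of brackets '(', ')', '{', '}', '[' and ']', determine whether the string is valid: every opening bracket must be closed by the same kind of bracket, in the correct order.
Example:
Input: "()[]{}"
Output: True
Solution: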

def is_valid(s):
    stack = []
    mapping = {')': '(', '}': '{', ']': '['}
    
    for char in s:
        if char in mapping:
            top_element = stack.pop() if stack else '#'
            if mapping[char] != top_element:
                return False
        else:
            stack.append(char)
    
    return not stack

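Problem 6: First Unique Character in a String
Given a string, find its first non-repeating character and return its index. If there is none, return -1.
Example:
Input: "leetcode"
Output: 0
Solution: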

def first_uniq_char(s):
    char_count = {}
    
    for char in s:
        char_count[char] = char_count.get(char, 0) + 1
    
    for i, char in enumerate(s):
        if char_count[char] == 1:
            return i
    
    return -1

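Problem 7: Find All Anagrams in a String
Given two strings s and p, find the start indices of all anagrams of p in s. The solution below assumes both strings contain only lowercase English letters.
Example:
Input: s = "cbaebabacd", p = "abc"
Output: [0, 6]
Solution: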

def find_anagrams(s, p):
    p_count = [0] * 26
    s_count = [0] * 26
    result = []
    
    for char in p:
        p_count[ord(char) - ord('a')] += 1
    
    for i in range(len(s)):
        s_count[ord(s[i]) - ord('a')] += 1
        
        if i >= len(p):
            s_count[ord(s[i - len(p)]) - ord('a')] -= 1
        
        if s_count == p_count:
            result.append(i - len(p) + 1)
    
    return result
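
The 26-slot arrays above index by `ord(char) - ord('a')`, so they only work for lowercase a-z. If the input may contain other characters, one option is to count with `collections.Counter` instead. The sketch below assumes that requirement; `find_anagrams_counter` is an illustrative name, and zero-count entries are deleted so that the equality check behaves the same across Python versions.

```python
from collections import Counter


def find_anagrams_counter(s, p):
    """Return start indices of all anagrams of p in s, for arbitrary characters."""
    n, m = len(s), len(p)
    if m == 0 or m > n:
        return []

    p_count = Counter(p)
    window = Counter(s[:m])                 # counts for the first window
    result = [0] if window == p_count else []

    for i in range(m, n):
        window[s[i]] += 1                   # character entering on the right
        out = s[i - m]                      # character leaving on the left
        window[out] -= 1
        if window[out] == 0:
            del window[out]                 # keep the counter free of zero entries
        if window == p_count:
            result.append(i - m + 1)

    return result


print(find_anagrams_counter("cbaebabacd", "abc"))  # [0, 6]
```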

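Problem 8: Palindrome Partitioning
Given a string, partition it into substrings so that every substring is a palindrome. Return all possible partitions.
Example:
Input: "aab"
Output: [["a","a","b"], ["aa","b"]]
Solution: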

def partition(s):
    def is_palindrome(sub):
        return sub == sub[::-1]
    
    def backtrack(start, path):
        if start == len(s):
            result.append(path[:])
            return
        
        for end in range(start + 1, len(s) + 1):
            if is_palindrome(s[start:end]):
                path.append(s[start:end])
                backtrack(end, path)
                path.pop()
    
    result = []
    backtrack(0, [])
    return result
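
The backtracking above re-checks `s[start:end]` by slicing and reversing on every call. A common refinement is to precompute which substrings are palindromes with a small dynamic-programming table and let the backtracking look the answer up. This is a sketch of that idea, not the original author's method; `partition_with_table` and `is_pal` are illustrative names.

```python
def partition_with_table(s):
    """Return all palindrome partitions of s, using a precomputed palindrome table."""
    n = len(s)
    # is_pal[i][j] is True when s[i:j + 1] is a palindrome
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            # Ends match, and the inside is empty, a single char, or a palindrome
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True

    result, path = [], []

    def backtrack(start):
        if start == n:
            result.append(path[:])          # copy the finished partition
            return
        for end in range(start, n):
            if is_pal[start][end]:
                path.append(s[start:end + 1])
                backtrack(end + 1)
                path.pop()

    backtrack(0)
    return result


print(partition_with_table("aab"))  # [['a', 'a', 'b'], ['aa', 'b']]
```

The table costs O(n^2) time and space to build, after which each palindrome check is a constant-time lookup.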

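Problem 9: Minimum Window Substring
Given a string S and a string T, find the shortest substring of S that contains every character of T, including duplicates. If no such substring exists, return an empty string.
Example:
Input: S = "ADOBECODEBANC", T = "ABC"
Output: "BANC"
Solution: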

def min_window(s, t):
    from collections import Counter
    
    if not s or not t:
        return ""
    
    t_count = Counter(t)
    required_chars = len(t_count)
    
    left, right = 0, 0
    formed_chars = 0
    window_count = {}
    ans = float("inf"), None, None
    
    while right < len(s):
        char = s[right]
        window_count[char] = window_count.get(char, 0) + 1
        
        if char in t_count and window_count[char] == t_count[char]:
            formed_chars += 1
        
        while left <= right and formed_chars == required_chars:
            char = s[left]
            
            if right - left + 1 < ans[0]:
                ans = right - left + 1, left, right
            
            window_count[char] -= 1
            if char in t_count and window_count[char] < t_count[char]:
                formed_chars -= 1
            
            left += 1
        
        right += 1
    
    return "" if ans[0] == float("inf") else s[ans[1]:ans[2] + 1]

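Problem 10: Permutation in String
Given two strings s1 and s2, write a function that determines whether s2 contains a permutation of s1 as a substring.
Example:
Input: s1 = "ab" s2 = "eidbaooo"
Output: True
Solution: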

def check_inclusion(s1, s2):
    from collections import Counter
    
    s1_count = Counter(s1)
    window_count = Counter()
    left = 0
    
    for right, char in enumerate(s2):
        window_count[char] += 1
        
        # Shrink from the left until the window holds no more copies of
        # `char` than s1 does (a missing key in s1_count counts as 0)
        while window_count[char] > s1_count[char]:
            window_count[s2[left]] -= 1
            left += 1
        
        # No character is over-represented, so a window of length len(s1)
        # must be a permutation of s1
        if right - left + 1 == len(s1):
            return True
    
    return False

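Problem: Most Frequent Words
Given the text of an English passage, find the k most frequent words and return them as a list sorted by descending frequency. Ignore case; words with the same frequency are listed in alphabetical order.
Example:
Input: text = "This is a sample text. This text contains some sample words."
k = 3
Output: ["sample", "text", "this"]
Solution: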


def top_k_frequent_words(text, k):
    from collections import Counter  # counts how often each element occurs
    import re  # regular expressions, used to extract the words
    
    # Lowercase the text, then extract every run of word characters;
    # the pattern skips punctuation and whitespace
    words = re.findall(r'\w+', text.lower())
    
    # Count how often each word occurs
    word_count = Counter(words)
    
    # Sort by descending frequency; ties are broken in ascending alphabetical order
    sorted_words = sorted(word_count, key=lambda word: (-word_count[word], word))
    
    # Return the first k words
    return sorted_words[:k]
Walkthrough:

Given the text "This is a sample text. This text contains some sample words." and k = 3.

Text processing:

After lowercasing: "this is a sample text. this text contains some sample words."

After extracting the words with \w+ (which drops the punctuation): ["this", "is", "a", "sample", "text", "this", "text", "contains", "some", "sample", "words"]

Word counting:
The counts are: {"this": 2, "is": 1, "a": 1, "sample": 2, "text": 2, "contains": 1, "some": 1, "words": 1}

Sorting by descending frequency:
The sorted list is: ["sample", "text", "this", "a", "contains", "is", "some", "words"]

Returning the first k words:
The first 3 words are: ["sample", "text", "this"]

In short, the problem asks us to process a piece of English text, find the k most frequent words, and return them sorted by descending frequency.
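Approach:

Text processing: First, process the given English text. Convert it to lowercase and extract the words with a regular expression, which also discards the punctuation, so that the words can be counted.

Word counting: Python's collections.Counter makes it easy to count how many times each word occurs. It builds a counter with the words as keys and their counts as values.

Sorting by descending frequency: Next, sort the words in the counter by descending frequency. When two words have the same frequency, sort them alphabetically so that the result is deterministic.

Returning the first k words: Finally, take the first k words of the sorted list as the answer.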
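
Sorting every distinct word is simple but does more work than needed when k is small. As a hedged alternative, `heapq.nsmallest` can select the k best entries using the same sort key. The function name `top_k_frequent_words_heap` is an assumption made for this sketch.

```python
import heapq
import re
from collections import Counter


def top_k_frequent_words_heap(text, k):
    """Return the k most frequent words, ties broken alphabetically."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    # nsmallest with key (-count, word) yields highest counts first,
    # and alphabetical order among words with equal counts
    return heapq.nsmallest(k, counts, key=lambda w: (-counts[w], w))


text = "This is a sample text. This text contains some sample words."
print(top_k_frequent_words_heap(text, 3))  # ['sample', 'text', 'this']
```

Note that `\w+` also matches digits and underscores, so tokens such as "2024" or "foo_bar" count as words in both versions.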

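Problem 1: A Variant of Minimum Window Substring
Given a string s and a set of characters t, find the shortest substring of s that contains every character of t. The time complexity must be O(n).

Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"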

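def min_window_substring(s, t):
    from collections import Counter
    
    if not s or not t:
        return ""  # nothing to search in, or nothing to cover
    
    t_count = Counter(t)  # how many of each character the window must contain
    required_chars = len(t_count)  # number of distinct characters to satisfy
    
    left, right = 0, 0  # left and right edges of the sliding window
    formed_chars = 0  # distinct characters whose required count is currently met
    window_count = {}  # character counts inside the current window
    ans = float("inf"), None, None  # (length, left, right) of the best window so far
    
    while right < len(s):
        char = s[right]  # character entering the window on the right
        window_count[char] = window_count.get(char, 0) + 1
        
        if char in t_count and window_count[char] == t_count[char]:
            formed_chars += 1  # this character's required count has just been reached
        
        while left <= right and formed_chars == required_chars:
            char = s[left]  # character about to leave the window on the left
            
            if right - left + 1 < ans[0]:
                ans = right - left + 1, left, right  # record a smaller valid window
            
            window_count[char] -= 1  # shrink the window from the left
            if char in t_count and window_count[char] < t_count[char]:
                formed_chars -= 1  # the window no longer covers this character
            
            left += 1
        
        right += 1  # grow the window to the right
    
    # Slice once at the end so the scan itself stays O(n)
    return "" if ans[0] == float("inf") else s[ans[1]:ans[2] + 1]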

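This solution uses a sliding window: the right edge advances until the window covers every character of t, and then the left edge advances for as long as the window remains valid, recording the smallest window seen.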

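Problem 2: Longest Run of Consecutive Characters
Given a string s, find its longest contiguous substring in which every character is exactly one code point after the character before it (for example "cde").
Input: "abacdefg"
Output: "cdefg"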

def longest_continuous_subsequence(s):
    if not s:
        return ""
    
    max_length = 1  # length of the longest run found so far
    start = 0  # start index of the current run
    longest_subsequence = s[0]  # longest run found so far
    
    for i in range(1, len(s)):
        if ord(s[i]) != ord(s[i - 1]) + 1:
            start = i  # the run is broken, so a new run starts here
        if i - start + 1 > max_length:
            max_length = i - start + 1  # update the best length
            longest_subsequence = s[start:i + 1]  # update the best run
    
    return longest_subsequence  # return the longest run

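The function finds the longest run in a single pass over the string. Whenever the run is broken, it resets the start index, and whenever the current run becomes longer than the best one so far, it records the new best run.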
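
The test `ord(s[i]) == ord(s[i - 1]) + 1` in the code above means that, within a run, the value `ord(character) - index` stays constant. That observation allows a compact cross-check with `itertools.groupby`. This is only a sketch for comparison; `longest_consecutive_run` is an illustrative name.

```python
from itertools import groupby


def longest_consecutive_run(s):
    """Return the first longest substring whose characters ascend by one code point."""
    best = ""
    # Neighbouring characters share the key ord(ch) - index exactly when
    # each one is one code point after the previous one
    for _, group in groupby(enumerate(s), key=lambda pair: ord(pair[1]) - pair[0]):
        run = "".join(ch for _, ch in group)
        if len(run) > len(best):
            best = run
    return best


print(longest_consecutive_run("abacdefg"))  # cdefg
```

For the input "abacdefg" the runs are "ab", "a" and "cdefg", so the longest is "cdefg".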