Advanced Programming for AI, Advanced Python: Chapter 4, Math-Related Modules

Original article, last modified 2025-11-06 19:38:56

License: CC 4.0 BY-SA

Tags: #AI #python #programming-languages

First published 2025-11-06 19:32:35

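Contents

Advanced Programming for AI, Advanced Python
Preface
1. `re` module: regular expressions
Common functions
Common metacharacters

2. `operator` module: efficient operator functions
Common functions

3. `math` module: mathematical functions
Common functions

4. `random` module: pseudo-random number generation
Common functions

5. `hashlib` module: hash algorithms
Common hashes
Security practice: password hashing

6. `statistics` module: statistical calculations
Common functions

7. Module comparison and typical use cases
8. Best-practice recommendations
Summary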

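Preface

This chapter covers modules related to mathematics. It describes six commonly used modules in the Python standard library (re, operator, math, random, hashlib, and statistics), with usage examples covering their core features, typical usage, and best practices.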

1. `re` module: regular expressions

Used for string pattern matching and text processing.

Common functions

import re

text = "Contact: alice@example.com, bob123@gmail.com"

# 1. Search (returns the first match, or None if there is none)
# Note: this email pattern is simplified for illustration, not a full validator
match = re.search(r'\b\w+@\w+\.\w+\b', text)
if match:
    print("Found email:", match.group())  # alice@example.com

# 2. Find all matches
emails = re.findall(r'\b\w+@\w+\.\w+\b', text)
print("All emails:", emails)  # ['alice@example.com', 'bob123@gmail.com']

# 3. Substitute
cleaned = re.sub(r'\d+', '#', "ID: 12345, Code: 67890")
print(cleaned)  # ID: #, Code: #

# 4. Split
parts = re.split(r'[,\s]+', "apple, banana  orange")
print(parts)  # ['apple', 'banana', 'orange']

# 5. Compile a pattern once and reuse it (helps when the pattern is used many times)
pattern = re.compile(r'\d{3}-\d{4}')
phones = pattern.findall("Call 123-4567 or 987-6543")
print(phones)  # ['123-4567', '987-6543']
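Common metacharacters

Symbol	Meaning
`\d`	Digit: `[0-9]` in ASCII mode; any Unicode decimal digit for `str` patterns by default
`\w`	Word character: `[a-zA-Z0-9_]` in ASCII mode; Unicode letters and digits plus underscore for `str` patterns by default
`\s`	Whitespace (space, tab, newline, etc.)
`.`	Any character except newline (newline too with `re.DOTALL`)
`^` `$`	Start / end of the string (of each line with `re.MULTILINE`)
`*` `+` `?`	Zero or more / one or more / zero or one repetition
`{n,m}`	Between n and m repetitions
`[]`	Character set
`()`	Group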

✅ Tip: write complex patterns in re.VERBOSE mode to make them more readable.
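As a minimal sketch of that tip, the pattern below spreads a simplified email matcher over several commented lines. The name EMAIL_PATTERN and the group names user and domain are hypothetical, chosen only for this illustration, and the pattern is deliberately not a complete email validator.

```python
import re

# Hypothetical pattern for illustration: a simplified email matcher.
# In VERBOSE mode, whitespace in the pattern is ignored and "#" starts a comment.
EMAIL_PATTERN = re.compile(
    r"""
    \b
    (?P<user>[\w.+-]+)     # local part: word characters, dot, plus, hyphen
    @
    (?P<domain>
        [\w-]+             # first domain label
        (?:\.[\w-]+)+      # one or more further dot-separated labels
    )
    \b
    """,
    re.VERBOSE,
)

text = "Contact: alice@example.com, bob123@gmail.com"
matches = [(m.group("user"), m.group("domain")) for m in EMAIL_PATTERN.finditer(text)]
print(matches)  # [('alice', 'example.com'), ('bob123', 'gmail.com')]
```

Named groups pair well with verbose patterns because each part of the match can be read back by name instead of by position.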

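2. `operator` module: efficient operator functions

Wraps Python operators (such as +, ==, and []) as functions, which is handy with higher-order functions such as sorted and map.

Common functions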

import operator

# Arithmetic
print(operator.add(2, 3))      # 5
print(operator.mul(4, 5))      # 20

# Comparison
print(operator.eq(10, 10))     # True (==)
print(operator.lt(3, 5))       # True (<)

# Attribute / item access
students = [
    {'name': 'Alice', 'score': 90},
    {'name': 'Bob', 'score': 85}
]

# Sort by score (instead of a lambda)
sorted_students = sorted(students, key=operator.itemgetter('score'))
print(sorted_students)

# Multi-key sort: by name first, then by score
sorted_by_name_score = sorted(students,
                              key=operator.itemgetter('name', 'score'))

# Get an object attribute
class Person:
    def __init__(self, name):
        self.name = name

p = Person("Charlie")
print(operator.attrgetter('name')(p))  # Charlie

✅ Advantage: usually a little faster than an equivalent lambda, and it states the intent more directly.
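A few more places where an operator function can stand in for a lambda are sketched below. The sample data and variable names are made up for illustration; the calls themselves (functools.reduce, operator.mul, operator.methodcaller, operator.itemgetter) are standard library functions.

```python
import operator
from functools import reduce

# Product of a list: operator.mul replaces lambda a, b: a * b.
product = reduce(operator.mul, [1, 2, 3, 4], 1)

# methodcaller builds a callable that invokes the named method on its argument.
to_upper = operator.methodcaller("upper")
shouted = list(map(to_upper, ["alice", "bob"]))

# itemgetter with several indexes returns a tuple, which suits multi-key sorting.
rows = [("Bob", 85), ("Alice", 90), ("Alice", 70)]
by_name_then_score = sorted(rows, key=operator.itemgetter(0, 1))

print(product)             # 24
print(shouted)             # ['ALICE', 'BOB']
print(by_name_then_score)  # [('Alice', 70), ('Alice', 90), ('Bob', 85)]
```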

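3. `math` module: mathematical functions

Provides mathematical functions for real numbers, mostly operating on floats (complex numbers are not supported; use cmath for those).

Common functions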

import math

# Basic constants
print(math.pi)     # 3.141592653589793
print(math.e)      # 2.718281828459045

# Powers and logarithms
print(math.sqrt(16))       # 4.0
print(math.pow(2, 3))      # 8.0
print(math.log(10))        # natural logarithm, ≈ 2.303
print(math.log10(100))     # 2.0

# Trigonometric functions (arguments in radians)
print(math.sin(math.pi/2)) # 1.0
print(math.degrees(math.pi)) # 180.0

# Rounding (these return int)
print(math.ceil(3.2))      # 4
print(math.floor(3.8))     # 3
print(math.trunc(-3.7))    # -3

# Greatest common divisor (Python 3.5+)
print(math.gcd(48, 18))    # 6

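⚠️ Note: most functions convert their arguments to float and return a float. The exceptions include ceil, floor, and trunc, which return an int for float input, and integer functions such as gcd.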
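The short sketch below checks those return types and shows two helpers that matter once floats are involved: math.fsum for accurate summation and math.isclose for tolerant comparison. The variable names are illustrative only; math.isqrt requires Python 3.8 or later.

```python
import math

# Rounding helpers return int for float input.
floor_value = math.floor(3.8)
ceil_value = math.ceil(3.2)
rounding_returns_int = isinstance(floor_value, int) and isinstance(ceil_value, int)

# sqrt returns a float even for a perfect square; isqrt stays in integers.
float_root = math.sqrt(16)   # 4.0
int_root = math.isqrt(17)    # 4 (integer part of the square root)

# Summing floats accumulates rounding error; fsum tracks partial sums to avoid it.
naive_total = sum([0.1] * 10)
precise_total = math.fsum([0.1] * 10)

# Compare floats with a tolerance rather than with ==.
totals_match = math.isclose(naive_total, precise_total)

print(rounding_returns_int, float_root, int_root)
print(naive_total, precise_total, totals_match)
```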

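4. `random` module: pseudo-random number generation

Used to generate random numbers, make random selections, shuffle sequences, and so on.

Common functions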

import random

# Set the seed (makes results reproducible)
random.seed(42)

# Random float in [0.0, 1.0)
print(random.random())

# Random float in a given range
print(random.uniform(1.5, 10.5))  # between 1.5 and 10.5

# Random integer (both endpoints included)
print(random.randint(1, 10))      # 1 to 10

# Pick one element from a sequence
colors = ['red', 'green', 'blue']
print(random.choice(colors))

# Random sample (without replacement)
print(random.sample(range(1, 100), 5))  # 5 distinct numbers

# Shuffle a list (in place)
deck = list(range(1, 53))
random.shuffle(deck)
print(deck[:5])  # first 5 cards

# Weighted random selection, with replacement (Python 3.6+)
choices = random.choices(['A', 'B', 'C'], weights=[1, 2, 3], k=10)

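🔒 Security note:
random is a deterministic pseudo-random generator and is not suitable for cryptography or security tokens. Use the secrets module for those.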
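For comparison, here is a minimal sketch of the secrets module producing security-sensitive values. The token sizes, the 12-character password length, and the variable names are arbitrary choices for this illustration.

```python
import secrets
import string

# Random token as 32 hexadecimal characters (16 random bytes).
token = secrets.token_hex(16)

# URL-safe token, for example for a password-reset link.
url_token = secrets.token_urlsafe(16)

# Random password built from letters and digits.
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

# Random integer in [0, 1_000_000), for example a 6-digit one-time code.
pin = secrets.randbelow(10**6)

print(token, url_token, password, f"{pin:06d}")
```

Unlike random, secrets has no seed to set: its values come from the operating system's randomness source and are not meant to be reproducible.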

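5. `hashlib` module: hash algorithms

Generates **message digests (MD5, the SHA family, and others)**, commonly used for integrity checks and, through salted key-derivation functions, for password storage.

Common hashes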

import hashlib

data = b"Hello, world!"

# MD5 (not recommended for security purposes)
md5_hash = hashlib.md5(data).hexdigest()
print("MD5:", md5_hash)

# SHA-256 (recommended)
sha256_hash = hashlib.sha256(data).hexdigest()
print("SHA256:", sha256_hash)

# Incremental hashing (suits large files); SHA-1 is shown only to demonstrate update()
sha1 = hashlib.sha1()
sha1.update(b"Part 1")
sha1.update(b"Part 2")
print("SHA1:", sha1.hexdigest())

# Algorithms available in this interpreter
print(hashlib.algorithms_available)
# {'md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512', ...}
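The incremental update() pattern is what makes hashing a large file practical, since the file never has to fit in memory. Below is a minimal sketch; the helper name sha256_of_file and the 64 KiB chunk size are assumptions made for this example, and the temporary file exists only so the snippet can run on its own.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hello, world!")
    demo_path = tmp.name

try:
    file_digest = sha256_of_file(demo_path)
finally:
    os.remove(demo_path)

# The chunked digest equals the one-shot digest of the same bytes.
print(file_digest == hashlib.sha256(b"Hello, world!").hexdigest())  # True
```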

Security practice: password hashing

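import hashlib
import os
from typing import Optional

def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    if salt is None:
        salt = os.urandom(32)  # 32-byte random salt
    # 100000 iterations as in this example; tune for your hardware. The Python docs
    # (as of 2022) suggest hundreds of thousands of iterations for SHA-256.
    pwd_hash = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)
    return pwd_hash, salt

# Store both pwd_hash and salt
hashed, salt = hash_password("my_secret")
# To verify, hash the candidate password again with the same salt and compare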

✅ Recommended: hash passwords with hashlib.pbkdf2_hmac, hashlib.scrypt, or the third-party bcrypt package.
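The verification step can be sketched as follows. The helper names derive_key and verify_password and the ITERATIONS constant are hypothetical, introduced only for this illustration. The comparison uses hmac.compare_digest, which compares in constant time and so avoids leaking how many leading bytes matched.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value for this sketch; tune for your hardware


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a PBKDF2-HMAC-SHA256 key from a password and salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    return hmac.compare_digest(derive_key(password, salt), expected)


# At registration: generate a salt, then store both the salt and the derived key.
salt = os.urandom(32)
stored = derive_key("my_secret", salt)

# At login: check the candidate password against the stored values.
ok = verify_password("my_secret", salt, stored)
bad = verify_password("wrong_guess", salt, stored)
print(ok, bad)  # True False
```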

6. `statistics` module: statistical calculations

Provides basic statistical functions for analyzing numeric data.

Common functions

import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5]

# Central tendency
print("Mean:", statistics.mean(data))        # 3.125
print("Median:", statistics.median(data))    # 3.5
print("Mode:", statistics.mode(data))        # 4

# Spread
print("Variance:", statistics.variance(data))          # sample variance
print("Standard deviation:", statistics.stdev(data))   # sample standard deviation

# Others (both require Python 3.8+)
print("Geometric mean:", statistics.geometric_mean([1, 2, 4]))  # ≈ 2.0
print("Quantiles:", statistics.quantiles(data, n=4))  # quartiles

⚠️ Note:

- variance and stdev compute sample statistics (denominator n-1)
- For the population variance use pvariance; for the population standard deviation use pstdev
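The difference between the two denominators can be checked by hand. The sketch below recomputes both variances from the sum of squared deviations and compares them with the library results; the variable names are illustrative only.

```python
import statistics

data = [1, 2, 2, 3, 4, 4, 4, 5]
n = len(data)
mean = statistics.mean(data)

# Sum of squared deviations from the mean.
squared_dev = sum((x - mean) ** 2 for x in data)

sample_var = statistics.variance(data)       # divides by n - 1
population_var = statistics.pvariance(data)  # divides by n

print(abs(sample_var - squared_dev / (n - 1)) < 1e-12)   # True
print(abs(population_var - squared_dev / n) < 1e-12)     # True
print(sample_var > population_var)                       # True
```

The sample variance is always the larger of the two for the same data, because it divides the same sum by a smaller number.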

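7. Module comparison and typical use cases

Module	Main purpose	Typical use cases
`re`	Text pattern matching	Log parsing, data cleaning, input-format validation
`operator`	Operators as functions	Key functions for higher-order functions (such as `sorted`)
`math`	Mathematical computation	Scientific computing, geometry, engineering formulas
`random`	Random number generation	Games, simulation, sampling, test-data generation
`hashlib`	Hash digests	File integrity checks, password storage, digests as input to digital signatures
`statistics`	Statistical analysis	Data exploration, simple reports, teaching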

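8. Best-practice recommendations

Regular expressions:
- Use re.VERBOSE for complex patterns
- Avoid overuse (plain string methods such as str.split() are sometimes simpler and faster)
Random numbers:
- Use the secrets module for security-sensitive values
- Call random.seed() in tests to get reproducible results
Hashing:
- Never store passwords as plain MD5/SHA-1 digests; use a key-derivation function
- Always salt passwords
Statistics:
- Prefer numpy/pandas over statistics for large datasets
- Mind the difference between sample and population statistics
Math:
- Python's int is already exact; use fractions or decimal for exact rational or decimal arithmetic
- Use cmath for complex numbers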
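As a small illustration of the exact-arithmetic point: Python's int already has arbitrary precision, so the rounding surprises come from float. The sketch below contrasts a float sum with the same sum in decimal and fractions; the variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum is slightly off.
float_sum = 0.1 + 0.2

# Decimal works in base 10; build values from strings to avoid float rounding.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Fraction stores an exact numerator and denominator.
fraction_sum = Fraction(1, 10) + Fraction(2, 10)

# Plain int is exact at any size.
big_power = 2 ** 100

print(float_sum == 0.3)                 # False
print(decimal_sum == Decimal("0.3"))    # True
print(fraction_sum == Fraction(3, 10))  # True
print(big_power)                        # 1267650600228229401496703205376
```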

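Summary

With these six modules in hand, you can handle common text-processing, math, randomness, hashing, and statistics tasks efficiently. Keep at it!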
