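requirements.txt files usually require each Python package to be at or above some version number, as if everything would work as long as the packages are new enough. In practice I often hit errors that can only be fixed by downgrading a package, while other packages depend on a newer version of that same package, so finding a mutually compatible set of versions is hard.
It recently occurred to me to install, for every package, the latest version released on or before the date the project files (especially requirements.txt) were last updated. That approximates the newest packages the author had at the time, so the dependencies between packages should be consistent.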
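One way to get that date, assuming the project is a git checkout and `git` is on the PATH, is to ask git for the last commit that touched the file. This is only a sketch. The helper names `parse_commit_date` and `last_commit_date` are illustrative and not part of the script below.

```python
import subprocess
from datetime import datetime


def parse_commit_date(iso_timestamp):
    """Extract the YYYY-MM-DD part of git's ISO 8601 commit timestamp."""
    date_part = iso_timestamp.strip()[:10]
    datetime.strptime(date_part, '%Y-%m-%d')  # raises ValueError if malformed
    return date_part


def last_commit_date(repo_dir, file_path='requirements.txt'):
    """Return the date of the last commit that touched file_path, or None."""
    result = subprocess.run(
        ['git', 'log', '-1', '--format=%cI', '--', file_path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    # git prints nothing if the file has no history in this repository
    if not result.stdout.strip():
        return None
    return parse_commit_date(result.stdout)
```

The date is taken in the committer's own time zone, so it can differ by a day from what a web interface shows.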
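So I wrote a script that takes a requirements.txt file and produces output in the same format, rewriting only the package lines. For example, the original line
gradio>=4.26.0
becomes
gradio==4.32.1 # released on 2024-05-30, orig req >=4.26.0
Blank lines and comment-only lines are kept as they are.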
import re
from datetime import datetime

import requests


def get_latest_version_before_date(package_name, target_date):
    """
    Return the version most recently uploaded on or before target_date,
    together with its upload date (YYYY-MM-DD).
    """
    url = f"https://pypi.org/pypi/{package_name}/json"
    response = requests.get(url, timeout=30)
    if response.status_code != 200:
        print(f"Error fetching data for {package_name}")
        return None, None
    data = response.json()
    releases = data['releases']
    latest_version = None
    latest_date = datetime.min
    latest_release_date = None
    # Walk every version and every file uploaded for it.
    # Note: "latest" means latest by upload time, not by version number,
    # and pre-releases and yanked releases are not filtered out.
    for version, release_info in releases.items():
        for info in release_info:
            release_date = datetime.strptime(info['upload_time'], '%Y-%m-%dT%H:%M:%S')
            if latest_date < release_date <= target_date:
                latest_version = version
                latest_date = release_date
                latest_release_date = release_date.strftime('%Y-%m-%d')
    return latest_version, latest_release_date


def read_requirements(file_path):
    """
    Read requirements.txt and return (original line, package name, version constraint) tuples.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    packages = []
    # Package name (must start with a letter or digit, so option lines such as
    # "-r other.txt" are not mistaken for packages) plus an optional constraint
    pattern = r'([A-Za-z0-9][A-Za-z0-9._\-]*)([<>=~!]+[\d.]+)?'
    for line in lines:
        line = line.strip()
        if line == '' or line.startswith('#'):
            # Keep blank lines and comment lines
            packages.append((line, None, None))
        else:
            match = re.match(pattern, line)
            if match:
                package_name = match.group(1)
                version_constraint = match.group(2) if match.group(2) else ""
                packages.append((line, package_name, version_constraint))
            else:
                packages.append((line, None, None))  # Leave non-standard lines untouched
    return packages


def generate_requirements(file_path, target_date_str):
    """
    Build the lines of a requirements.txt pinned to the given date.
    """
    packages = read_requirements(file_path)
    target_date = datetime.strptime(target_date_str, '%Y-%m-%d')
    # Move the cutoff to the end of the target day
    target_date = target_date.replace(hour=23, minute=59, second=59)
    updated_requirements = []
    for original_line, package, constraint in packages:
        if package:
            version, release_date = get_latest_version_before_date(package, target_date)
            if version and release_date:
                updated_line = f"{package}=={version} # released on {release_date}, orig req {constraint}"
            else:
                updated_line = f"{original_line} # No suitable version found before {target_date.strftime('%Y-%m-%d')}"
        else:
            updated_line = original_line
        print(updated_line)  # Print each line as soon as its package has been looked up
        updated_requirements.append(updated_line)
    return updated_requirements


# Example usage
requirements_file = r"C:\Users\abc\Downloads\requirements.txt"  # Path to your requirements.txt
date = '2024-05-29'  # Target date
new_versions = generate_requirements(requirements_file, date)
# Write the pinned versions to a new requirements file
output_file = r"C:\Users\abc\Downloads\new_requirements.txt"
with open(output_file, 'w', encoding='utf-8') as f:
    f.write("\n".join(new_versions) + "\n")
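The script above takes whichever file was uploaded last before the cutoff, so a release candidate or a release that was later yanked can end up pinned. Below is a minimal sketch of a stricter selection step, assuming the `releases` mapping has the shape returned by the PyPI JSON API (a list of file entries per version, each with `upload_time` and `yanked`). The function name `pick_release` and the `FINAL_RELEASE` regex are illustrative. The regex is only a heuristic for spotting final releases. If the `packaging` library is installed, its `Version.is_prerelease` property is the more reliable check.

```python
import re
from datetime import datetime

# Heuristic: plain dotted numbers, optionally with a .postN suffix, count as final releases
FINAL_RELEASE = re.compile(r'^\d+(\.\d+)*(\.post\d+)?$')


def pick_release(releases, cutoff):
    """Return (version, 'YYYY-MM-DD') for the last final, non-yanked release
    first uploaded on or before cutoff, or (None, None) if there is none."""
    best_version, best_date = None, None
    for version, files in releases.items():
        if not FINAL_RELEASE.match(version):
            continue  # skip alphas, betas, release candidates and dev builds
        live = [f for f in files if not f.get('yanked', False)]
        if not live:
            continue  # no files at all, or every file was yanked
        # Treat a version as released when its first file was uploaded
        first_upload = min(
            datetime.strptime(f['upload_time'], '%Y-%m-%dT%H:%M:%S') for f in live
        )
        if first_upload <= cutoff and (best_date is None or first_upload > best_date):
            best_version, best_date = version, first_upload
    if best_version is None:
        return None, None
    return best_version, best_date.strftime('%Y-%m-%d')
```

Inside `get_latest_version_before_date`, the nested loop could then be replaced by `return pick_release(data['releases'], target_date)`.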
Take the requirements.txt of ChatGLM3 as an example (Commits on May 30, 2024):
# basic requirements
transformers==4.40.0
cpm_kernels>=1.0.11
torch>=2.3.0
vllm>=0.4.2
gradio>=4.26.0
sentencepiece>=0.2.0
sentence_transformers>=2.7.0
accelerate>=0.29.2
streamlit>=1.33.0
fastapi>=0.110.0
loguru~=0.7.2
mdtex2html>=1.3.0
latex2mathml>=3.77.0
jupyter_client>=8.6.1
# for openai demo
openai>=1.30.1
pydantic>=2.7.1
sse-starlette>=2.1.0
uvicorn>=0.29.0
timm>=0.9.16
tiktoken>=0.6.0
# for langchain demo
langchain>=0.2.1
langchain_community>=0.2.0
langchainhub>=0.1.15
arxiv>=2.1.0
Running the script gives the package versions as of 2024-05-30:
# basic requirements
transformers==4.41.2 # released on 2024-05-30, orig req ==4.40.0
cpm_kernels==1.0.11 # released on 2022-03-07, orig req >=1.0.11
torch==2.3.0 # released on 2024-04-24, orig req >=2.3.0
vllm==0.4.2 # released on 2024-05-05, orig req >=0.4.2
gradio==4.32.1 # released on 2024-05-30, orig req >=4.26.0
sentencepiece==0.2.0 # released on 2024-02-19, orig req >=0.2.0
sentence_transformers==3.0.0 # released on 2024-05-28, orig req >=2.7.0
accelerate==0.30.1 # released on 2024-05-10, orig req >=0.29.2
streamlit==1.35.0 # released on 2024-05-23, orig req >=1.33.0
fastapi==0.111.0 # released on 2024-05-03, orig req >=0.110.0
loguru==0.7.2 # released on 2023-09-11, orig req ~=0.7.2
mdtex2html==1.3.0 # released on 2024-01-26, orig req >=1.3.0
latex2mathml==3.77.0 # released on 2023-12-06, orig req >=3.77.0
jupyter_client==8.6.2 # released on 2024-05-23, orig req >=8.6.1
# for openai demo
openai==1.30.5 # released on 2024-05-30, orig req >=1.30.1
pydantic==2.7.2 # released on 2024-05-28, orig req >=2.7.1
sse-starlette==2.1.0 # released on 2024-04-05, orig req >=2.1.0
uvicorn==0.30.0 # released on 2024-05-28, orig req >=0.29.0
timm==1.0.3 # released on 2024-05-15, orig req >=0.9.16
tiktoken==0.7.0 # released on 2024-05-13, orig req >=0.6.0
# for langchain demo
langchain==0.2.1 # released on 2024-05-23, orig req >=0.2.1
langchain_community==0.2.1 # released on 2024-05-23, orig req >=0.2.0
langchainhub==0.1.17 # released on 2024-05-29, orig req >=0.1.15
arxiv==2.1.0 # released on 2023-12-18, orig req >=2.1.0
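Of course, large packages such as PyTorch and transformers still need care at install time. For PyTorch you may even need to copy the install command from the official website to match your CUDA version. For transformers, the history of requirements.txt shows a downgrade to a pinned version (transformers>=4.41.0 became transformers==4.40.0), which deserves particular attention. Because the script does not look at the original constraint, the output above lists transformers==4.41.2 even though the file pins 4.40.0, so lines like this should be checked by hand.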
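A minimal sketch of keeping such exact pins automatically, assuming that an `==` constraint in the original file should always win over the version found by date. `pinned_line` is a hypothetical helper, not part of the script above. It would replace the f-string that builds `updated_line` in `generate_requirements`.

```python
def pinned_line(package, constraint, dated_version, release_date):
    """Build one output line, keeping the author's own exact pin if there is one."""
    if constraint.startswith('=='):
        # The author chose an exact version on purpose, so keep it
        return f"{package}{constraint} # kept original pin, latest by date was {dated_version}"
    return f"{package}=={dated_version} # released on {release_date}, orig req {constraint}"


# Values taken from the ChatGLM3 output above
print(pinned_line('transformers', '==4.40.0', '4.41.2', '2024-05-30'))
print(pinned_line('gradio', '>=4.26.0', '4.32.1', '2024-05-30'))
```

Upper bounds such as `<2.0` are still ignored in this sketch and would need a real version comparison to honor.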