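The motivation: storage anxiety
Nexus is usually deployed on a server to share a private repository within a team, or for personal use, and it ends up hosting our hard work. But a team that builds frequently and packages its artifacts as Docker images will inevitably consume a large amount of disk space on that server. I did not want that: I worried that excessive storage use would crowd out the other services on the same host and threaten their stability. So I decided to clean up on a schedule according to some policy.
How Nexus cleans up
It works much like Harbor: a component is first marked as deleted, at which point users can no longer see or pull it; then a blob store compaction task runs on the server and physically frees the space it occupied.
The cleanup that ships with Nexus
The built-in snapshot cleanup policy for Maven works very well, but the one for Docker is a mess.

You first use the Asset Name Matcher to select which images a policy affects, and then set two thresholds: a component age in days, and a number of days without a download, beyond which images may be deleted. Settings this simple cannot do either of the things I need: delete the redundant images pushed in bulk over a short period, and keep a stable image version that has not been pulled or pushed for a long time but may still be pulled later.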
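Advantages of the script
It ignores last-download time and age: the tags of each image are sorted by creation time and the newest N are kept
Everything that will be marked for deletion is printed
It supports automated cleanup via environment variables or script arguments
Usage instructions are at the bottom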
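The retention rule described above can be sketched in a few lines. This is a minimal illustration, not the script itself: the function name `split_keep_delete` and the `(image, tag, created)` tuples are assumptions made for the example, and the sample data is invented.

```python
from collections import defaultdict
from datetime import datetime, timezone


def split_keep_delete(tags, retain):
    """Split (image, tag, created) tuples into (keep, delete) lists.

    Per image: "latest" is always kept, the other tags are sorted
    newest-first and only the first `retain` of them survive.
    """
    by_image = defaultdict(list)
    for image, tag, created in tags:
        by_image[image].append((image, tag, created))

    keep, delete = [], []
    for items in by_image.values():
        latest = [t for t in items if t[1] == "latest"]
        others = sorted(
            (t for t in items if t[1] != "latest"),
            key=lambda t: t[2],
            reverse=True,
        )
        keep += latest + others[:retain]
        delete += others[retain:]
    return keep, delete


def day(n):
    # Helper for the invented sample data below.
    return datetime(2025, 1, n, tzinfo=timezone.utc)


tags = [
    ("app", "latest", day(5)),
    ("app", "v1", day(1)),
    ("app", "v2", day(2)),
    ("app", "v3", day(3)),
    ("tool", "v1", day(1)),
]
keep, delete = split_keep_delete(tags, retain=2)
print([f"{image}:{tag}" for image, tag, _ in delete])  # ['app:v1']
```

Neither the age of a tag nor its last download matters here: only its rank among the tags of the same image decides whether it stays.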
#!/usr/bin/env python3
"""Clean up Docker image tags in Nexus repositories.

The script walks the configured Nexus repositories, counts the tags of each
image, prints friendly Chinese-language messages, and after confirmation
deletes the old tags that exceed the retain count.
"""
from __future__ import annotations

import argparse
import os
import sys
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional

import requests
from requests import Session
from requests.auth import HTTPBasicAuth
from requests.exceptions import HTTPError

# ---------------------------------------------------------------------------
# Configurable defaults (edit here, or override with environment variables /
# command-line arguments)
# ---------------------------------------------------------------------------
NEXUS_BASE_URL = "https://your-nexus.com"  # your Nexus address
NEXUS_USERNAME = "admin"  # username
NEXUS_PASSWORD = "your_password"  # password
TARGET_REPOSITORIES = [
    "docker-hosted",  # names of the repositories to clean
    "docker-releases",
]
RETAIN_COUNT = 4  # number of tags to keep (not counting latest)
REQUEST_TIMEOUT = 30  # HTTP request timeout (seconds)
ENV_PREFIX = "NEXUS_CLEANUP_"


@dataclass
class ComponentRecord:
    component_id: str
    repository: str
    name: str
    version: str
    created: Optional[datetime]

    def created_text(self) -> str:
        if not self.created:
            return "<未知>"
        return self.created.astimezone().strftime("%Y-%m-%d %H:%M:%S %Z")


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="清理 Nexus 仓库中的 Docker 镜像标签。"
    )
    parser.add_argument(
        "--base-url",
        default=os.environ.get(f"{ENV_PREFIX}BASE_URL", NEXUS_BASE_URL),
        help="Nexus 服务器基础地址(默认:%(default)s)",
    )
    parser.add_argument(
        "--username",
        default=os.environ.get(f"{ENV_PREFIX}USERNAME", NEXUS_USERNAME),
        help="Nexus 登录账号(默认:%(default)s)",
    )
    parser.add_argument(
        "--password",
        default=os.environ.get(f"{ENV_PREFIX}PASSWORD", NEXUS_PASSWORD),
        help="Nexus 登录密码(默认:配置或环境变量)",
    )
    parser.add_argument(
        "--repos",
        nargs="*",
        # The environment variable is one space-separated string; split it
        # into a list so it is not later iterated character by character.
        default=os.environ.get(f"{ENV_PREFIX}REPOSITORIES", "").split() or None,
        help="要处理的仓库名称(空格分隔),会覆盖默认配置。",
    )
    parser.add_argument(
        "--retain",
        type=int,
        default=int(os.environ.get(f"{ENV_PREFIX}RETAIN", RETAIN_COUNT)),
        help="每个镜像除 latest 外保留的标签数量(默认:%(default)s)",
    )
    parser.add_argument(
        "--yes",
        "-y",
        action="store_true",
        help="跳过交互确认,直接执行删除。",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="仅预览操作,不执行删除。",
    )
    parser.add_argument(
        "--insecure",
        action="store_true",
        help="跳过 TLS 证书校验(不安全)。",
    )
    return parser.parse_args()


def build_session(username: str, password: str, verify: bool) -> Session:
    session = requests.Session()
    session.auth = HTTPBasicAuth(username, password)
    session.verify = verify
    session.headers.update({"Accept": "application/json"})
    return session


def fetch_components(
    session: Session, base_url: str, repository: str
) -> Iterable[ComponentRecord]:
    endpoint = f"{base_url.rstrip('/')}/service/rest/v1/components"
    params: Dict[str, str] = {"repository": repository}
    while True:
        response = session.get(endpoint, params=params, timeout=REQUEST_TIMEOUT)
        try:
            response.raise_for_status()
        except HTTPError as exc:
            raise RuntimeError(
                f"获取仓库 “{repository}” 组件列表失败:{exc}"
            ) from exc
        payload = response.json()
        for item in payload.get("items", []):
            component_id = item.get("id")
            name = item.get("name", "<未知>")
            version = item.get("version", "<未知>")
            created = extract_created(item.get("assets", []))
            yield ComponentRecord(
                component_id=component_id,
                repository=repository,
                name=name,
                version=version,
                created=created,
            )
        continuation = payload.get("continuationToken")
        if not continuation:
            break
        params["continuationToken"] = continuation


def extract_created(assets: List[dict]) -> Optional[datetime]:
    timestamps: List[datetime] = []
    for asset in assets:
        raw = asset.get("blobCreated")
        if not raw:
            continue
        parsed = parse_iso_datetime(raw)
        if parsed:
            timestamps.append(parsed)
    if not timestamps:
        return None
    return max(timestamps)


def parse_iso_datetime(timestamp: str) -> Optional[datetime]:
    try:
        if timestamp.endswith("Z"):
            timestamp = timestamp[:-1] + "+00:00"
        return datetime.fromisoformat(timestamp).astimezone(timezone.utc)
    except ValueError:
        return None


def group_components(components: Iterable[ComponentRecord]) -> Dict[str, List[ComponentRecord]]:
    grouped: Dict[str, List[ComponentRecord]] = defaultdict(list)
    for component in components:
        grouped[component.name].append(component)
    return grouped


def classify_components(
    components: List[ComponentRecord], retain: int
) -> tuple[List[ComponentRecord], List[ComponentRecord]]:
    latest_components = [c for c in components if c.version == "latest"]
    other_components = [c for c in components if c.version != "latest"]
    other_components.sort(
        key=lambda c: c.created or datetime.min.replace(tzinfo=timezone.utc), reverse=True
    )
    keep = latest_components + other_components[:retain]
    delete = other_components[retain:]
    keep.sort(
        key=lambda c: c.created or datetime.min.replace(tzinfo=timezone.utc), reverse=True
    )
    return keep, delete


def prompt_confirmation(total: int) -> bool:
    answer = input(f"确认删除 {total} 个组件?输入 y/N: ").strip().lower()
    return answer == "y"


def delete_component(session: Session, base_url: str, component: ComponentRecord) -> bool:
    endpoint = f"{base_url.rstrip('/')}/service/rest/v1/components/{component.component_id}"
    response = session.delete(endpoint, timeout=REQUEST_TIMEOUT)
    if response.status_code == 204:
        return True
    try:
        response.raise_for_status()
    except HTTPError as exc:
        raise RuntimeError(
            f"删除组件 {component.component_id} ({component.name}:{component.version}) 失败:{exc}"
        ) from exc
    return False


def ensure_repositories(repos_argument: Optional[List[str]]) -> List[str]:
    if repos_argument:
        repos = [item.strip() for item in repos_argument if item.strip()]
    else:
        repos = TARGET_REPOSITORIES
    if not repos:
        raise SystemExit("未指定仓库名称,请配置 TARGET_REPOSITORIES 或使用 --repos 参数。")
    return repos


def main() -> None:
    args = parse_args()
    repositories = ensure_repositories(args.repos)
    base_url = args.base_url
    username = args.username
    password = args.password
    retain = max(args.retain, 0)
    session = build_session(username, password, verify=not args.insecure)
    if args.insecure:
        print("[!] 已关闭 TLS 证书校验,请谨慎操作。")
    pending_deletions: List[ComponentRecord] = []
    for repository in repositories:
        print("\n" + "=" * 72)
        print(f"仓库:{repository}")
        print("=" * 72)
        try:
            components = list(fetch_components(session, base_url, repository))
        except RuntimeError as exc:
            print(f"[错误] {exc}")
            continue
        if not components:
            print("未找到任何组件。")
            continue
        grouped = group_components(components)
        for image_name, records in sorted(grouped.items()):
            print(f"\n镜像:{image_name}(标签总数:{len(records)})")
            keep, to_delete = classify_components(records, retain=retain)
            for item in keep:
                marker = "最新" if item.version == "latest" else ""
                print(f" 保留 : {item.version:<20} {item.created_text()} {marker}")
            if not to_delete:
                print(" 跳过 : 标签数量未超过保留阈值,无需删除。")
                continue
            print(" 删除 :")
            for item in to_delete:
                print(f" {item.version:<20} {item.created_text()}")
                pending_deletions.append(item)
    if not pending_deletions:
        print("\n没有需要删除的组件,操作结束。")
        return
    print("\n汇总:")
    print(f" 待删除组件数量:{len(pending_deletions)}")
    print(f" 涉及仓库 :{', '.join(repositories)}")
    if args.dry_run:
        print("已开启预览模式,不执行删除。")
        return
    if not args.yes and not prompt_confirmation(len(pending_deletions)):
        print("操作已被用户取消。")
        return
    print("\n开始删除组件……")
    failures = 0
    for item in pending_deletions:
        try:
            success = delete_component(session, base_url, item)
        except RuntimeError as exc:
            print(f" [错误] {exc}")
            failures += 1
            continue
        if success:
            print(f" 已删除 {item.repository} :: {item.name}:{item.version}")
    if failures:
        print(f"\n执行完毕,但有 {failures} 个删除失败。")
        sys.exit(1)
    print("\n所有组件均已成功删除。")


if __name__ == "__main__":
    main()
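Usage
1. Edit the configuration at the top of the Python file, then run the script directly with python xxx.py. It lists what it would delete and asks for interactive confirmation before marking anything.
2.1 Automating via environment variables
If you leave the environment variable prefix (ENV_PREFIX) at the top of the script unchanged, the following environment variables supply the Nexus settings the script runs against (PowerShell syntax):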
$env:NEXUS_CLEANUP_BASE_URL = "https://your-nexus.com"
$env:NEXUS_CLEANUP_USERNAME = "admin"
$env:NEXUS_CLEANUP_PASSWORD = "your_password"
$env:NEXUS_CLEANUP_REPOSITORIES = "docker-hosted docker-releases"
$env:NEXUS_CLEANUP_RETAIN = "4"
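One detail worth spelling out: NEXUS_CLEANUP_REPOSITORIES carries several names in a single space-separated string, so it has to be split into a list before use. The following is a small sketch of the intended precedence (command-line argument, then environment variable, then built-in default); `resolve_repos` and `DEFAULT_REPOS` are hypothetical names used only for this illustration.

```python
import argparse

ENV_PREFIX = "NEXUS_CLEANUP_"
DEFAULT_REPOS = ["docker-hosted"]  # hypothetical built-in default


def resolve_repos(argv, environ):
    """Return the repository list: CLI first, then environment, then default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--repos", nargs="*", default=None)
    args = parser.parse_args(argv)
    if args.repos:
        return args.repos
    # "docker-hosted docker-releases" -> ["docker-hosted", "docker-releases"]
    from_env = environ.get(f"{ENV_PREFIX}REPOSITORIES", "").split()
    return from_env or DEFAULT_REPOS


env = {"NEXUS_CLEANUP_REPOSITORIES": "docker-hosted docker-releases"}
print(resolve_repos([], env))                        # ['docker-hosted', 'docker-releases']
print(resolve_repos(["--repos", "only-this"], env))  # ['only-this']
print(resolve_repos([], {}))                         # ['docker-hosted']
```

Passing the environment as a plain mapping (in real use, `os.environ`) keeps the function easy to try out without touching the shell.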
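2.2 Automating via arguments
Parameter     Description                                     Default
--base-url    Nexus server URL                                NEXUS_BASE_URL at the top of the script
--username    Login username                                  NEXUS_USERNAME at the top of the script
--password    Login password                                  NEXUS_PASSWORD at the top of the script
--repos       Repository names (separated by spaces)          TARGET_REPOSITORIES at the top of the script
--retain      Number of tags to keep (not counting latest)    4
--yes / -y    Skip the confirmation prompt                    off
--dry-run     Preview only                                    off
--insecure    Skip TLS certificate verification               off; turn it on if your HTTPS certificate is self-signed or otherwise untrusted (plain HTTP does not need it)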
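For example (PowerShell):
python nexus_docker_cleanup.py `
    --base-url https://nexus.example.com `
    --username admin `
    --password secret `
    --repos docker-hosted docker-releases `
    --retain 3 `
    --yes
Here nexus_docker_cleanup.py is whatever name you gave my script when you copied it to your computer or server.
Note that every "delete" printed by this script, and every "delete" in this article, refers only to the marking step in Nexus. To actually free disk space you still need to run Nexus's blob store compaction scheduled task.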
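If you would rather trigger that compaction from a script than wait for its schedule, the Nexus tasks REST API can run an existing task on demand. The sketch below rests on assumptions you should check against your Nexus version: that a compaction task has already been created in the admin UI, that its type id is `blobstore.compact`, and that the task list fits in one page (pagination is ignored here). `run_compact_tasks` is a name made up for this example; `session` is expected to behave like a `requests.Session`.

```python
COMPACT_TASK_TYPE = "blobstore.compact"  # assumed type id; verify on your Nexus


def compact_task_ids(payload):
    """Pick the ids of compaction tasks out of a task-list response body."""
    return [
        task["id"]
        for task in payload.get("items", [])
        if task.get("type") == COMPACT_TASK_TYPE
    ]


def run_compact_tasks(session, base_url, timeout=30):
    """Start every existing compaction task and return the ids that were started."""
    root = f"{base_url.rstrip('/')}/service/rest/v1/tasks"
    response = session.get(root, timeout=timeout)
    response.raise_for_status()
    started = []
    for task_id in compact_task_ids(response.json()):
        run = session.post(f"{root}/{task_id}/run", timeout=timeout)
        run.raise_for_status()
        started.append(task_id)
    return started
```

With the cleanup script's own helper this could be called as `run_compact_tasks(build_session(username, password, verify=True), base_url)` once the deletions have finished.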
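If you have read this far and the script solved your problem, a like and a bookmark would be much appreciated.
December 2, 2025: V2. In addition to the global default retain count, a retain count can now be set for individual images. For example, a temporary stopgap image needs only 1 tag kept and must not take up extra space, while a production image may need 20 historical versions kept so that k8s can roll back.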
#!/usr/bin/env python3
"""Clean up Docker image tags in Nexus repositories (with per-image retain counts).

The script walks the configured Nexus repositories, counts the tags of each
image, prints friendly Chinese-language messages, and after confirmation
deletes the old tags that exceed the retain count.
"""
from __future__ import annotations

import argparse
import os
import sys
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional

import requests
from requests import Session
from requests.auth import HTTPBasicAuth
from requests.exceptions import HTTPError

# ---------------------------------------------------------------------------
# Configurable defaults
# ---------------------------------------------------------------------------
NEXUS_BASE_URL = "https://nexus.example.com"
NEXUS_USERNAME = "admin"
NEXUS_PASSWORD = "adminpasswd"
TARGET_REPOSITORIES = [
    "docker",
    "public-docker",
]
RETAIN_COUNT = 4  # global default retain count
REQUEST_TIMEOUT = 30
ENV_PREFIX = "NEXUS_CLEANUP_"


@dataclass
class ComponentRecord:
    component_id: str
    repository: str
    name: str
    version: str
    created: Optional[datetime]

    def created_text(self) -> str:
        if not self.created:
            return "<未知>"
        return self.created.astimezone().strftime("%Y-%m-%d %H:%M:%S %Z")


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="清理 Nexus 仓库中的 Docker 镜像标签。"
    )
    parser.add_argument(
        "--base-url",
        default=os.environ.get(f"{ENV_PREFIX}BASE_URL", NEXUS_BASE_URL),
        help="Nexus 服务器基础地址",
    )
    parser.add_argument(
        "--username",
        default=os.environ.get(f"{ENV_PREFIX}USERNAME", NEXUS_USERNAME),
        help="Nexus 登录账号",
    )
    parser.add_argument(
        "--password",
        default=os.environ.get(f"{ENV_PREFIX}PASSWORD", NEXUS_PASSWORD),
        help="Nexus 登录密码",
    )
    parser.add_argument(
        "--repos",
        nargs="*",
        # The environment variable is one space-separated string; split it
        # into a list so it is not later iterated character by character.
        default=os.environ.get(f"{ENV_PREFIX}REPOSITORIES", "").split() or None,
        help="要处理的仓库名称(空格分隔)",
    )
    parser.add_argument(
        "--retain",
        type=int,
        default=int(os.environ.get(f"{ENV_PREFIX}RETAIN", RETAIN_COUNT)),
        help="全局默认保留数量(除 latest 外)",
    )
    # --- new argument ---
    parser.add_argument(
        "--custom-retain",
        action="append",
        help="针对特定镜像设置保留数量,格式:镜像名=数量。可多次使用。(例如: --custom-retain mrs-playwright=1)",
    )
    # --------------------
    parser.add_argument(
        "--yes", "-y",
        action="store_true",
        help="跳过交互确认,直接执行删除。",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="仅预览操作,不执行删除。",
    )
    parser.add_argument(
        "--insecure",
        action="store_true",
        help="跳过 TLS 证书校验。",
    )
    return parser.parse_args()


def parse_custom_rules(rules: Optional[List[str]]) -> Dict[str, int]:
    """Parse the list of per-image rules into an {image name: count} dict."""
    custom_map: Dict[str, int] = {}
    if not rules:
        return custom_map
    for rule in rules:
        if "=" not in rule:
            print(f"[警告] 忽略无效的规则格式: {rule} (应为 name=count)")
            continue
        name, count_str = rule.split("=", 1)
        name = name.strip()
        try:
            count = int(count_str.strip())
            # Clamp to 0, like the global value: a negative count would make
            # the list slices in classify_components misbehave.
            custom_map[name] = max(count, 0)
        except ValueError:
            print(f"[警告] 忽略无效的数量值: {rule}")
    return custom_map


def build_session(username: str, password: str, verify: bool) -> Session:
    session = requests.Session()
    session.auth = HTTPBasicAuth(username, password)
    session.verify = verify
    session.headers.update({"Accept": "application/json"})
    return session


def fetch_components(
    session: Session, base_url: str, repository: str
) -> Iterable[ComponentRecord]:
    endpoint = f"{base_url.rstrip('/')}/service/rest/v1/components"
    params: Dict[str, str] = {"repository": repository}
    while True:
        response = session.get(endpoint, params=params, timeout=REQUEST_TIMEOUT)
        try:
            response.raise_for_status()
        except HTTPError as exc:
            raise RuntimeError(
                f"获取仓库 “{repository}” 组件列表失败:{exc}"
            ) from exc
        payload = response.json()
        for item in payload.get("items", []):
            component_id = item.get("id")
            name = item.get("name", "<未知>")
            version = item.get("version", "<未知>")
            created = extract_created(item.get("assets", []))
            yield ComponentRecord(
                component_id=component_id,
                repository=repository,
                name=name,
                version=version,
                created=created,
            )
        continuation = payload.get("continuationToken")
        if not continuation:
            break
        params["continuationToken"] = continuation


def extract_created(assets: List[dict]) -> Optional[datetime]:
    timestamps: List[datetime] = []
    for asset in assets:
        raw = asset.get("blobCreated")
        if not raw:
            continue
        parsed = parse_iso_datetime(raw)
        if parsed:
            timestamps.append(parsed)
    if not timestamps:
        return None
    return max(timestamps)


def parse_iso_datetime(timestamp: str) -> Optional[datetime]:
    try:
        if timestamp.endswith("Z"):
            timestamp = timestamp[:-1] + "+00:00"
        return datetime.fromisoformat(timestamp).astimezone(timezone.utc)
    except ValueError:
        return None


def group_components(components: Iterable[ComponentRecord]) -> Dict[str, List[ComponentRecord]]:
    grouped: Dict[str, List[ComponentRecord]] = defaultdict(list)
    for component in components:
        grouped[component.name].append(component)
    return grouped


def classify_components(
    components: List[ComponentRecord], retain: int
) -> tuple[List[ComponentRecord], List[ComponentRecord]]:
    latest_components = [c for c in components if c.version == "latest"]
    other_components = [c for c in components if c.version != "latest"]
    other_components.sort(
        key=lambda c: c.created or datetime.min.replace(tzinfo=timezone.utc), reverse=True
    )
    keep = latest_components + other_components[:retain]
    delete = other_components[retain:]
    keep.sort(
        key=lambda c: c.created or datetime.min.replace(tzinfo=timezone.utc), reverse=True
    )
    return keep, delete


def prompt_confirmation(total: int) -> bool:
    answer = input(f"确认删除 {total} 个组件?输入 y/N: ").strip().lower()
    return answer == "y"


def delete_component(session: Session, base_url: str, component: ComponentRecord) -> bool:
    endpoint = f"{base_url.rstrip('/')}/service/rest/v1/components/{component.component_id}"
    response = session.delete(endpoint, timeout=REQUEST_TIMEOUT)
    if response.status_code == 204:
        return True
    try:
        response.raise_for_status()
    except HTTPError as exc:
        raise RuntimeError(
            f"删除组件 {component.component_id} ({component.name}:{component.version}) 失败:{exc}"
        ) from exc
    return False


def ensure_repositories(repos_argument: Optional[List[str]]) -> List[str]:
    if repos_argument:
        repos = [item.strip() for item in repos_argument if item.strip()]
    else:
        repos = TARGET_REPOSITORIES
    if not repos:
        raise SystemExit("未指定仓库名称,请配置 TARGET_REPOSITORIES 或使用 --repos 参数。")
    return repos


def main() -> None:
    args = parse_args()
    repositories = ensure_repositories(args.repos)
    base_url = args.base_url
    username = args.username
    password = args.password
    global_retain = max(args.retain, 0)
    # Parse the per-image rules
    custom_retain_map = parse_custom_rules(args.custom_retain)
    session = build_session(username, password, verify=not args.insecure)
    if args.insecure:
        print("[!] 已关闭 TLS 证书校验,请谨慎操作。")
    pending_deletions: List[ComponentRecord] = []
    for repository in repositories:
        print("\n" + "=" * 72)
        print(f"仓库:{repository}")
        print("=" * 72)
        try:
            components = list(fetch_components(session, base_url, repository))
        except RuntimeError as exc:
            print(f"[错误] {exc}")
            continue
        if not components:
            print("未找到任何组件。")
            continue
        grouped = group_components(components)
        for image_name, records in sorted(grouped.items()):
            # Pick the retain policy for this image: look in
            # custom_retain_map first, fall back to global_retain.
            if image_name in custom_retain_map:
                current_retain = custom_retain_map[image_name]
                policy_source = "自定义策略"
            else:
                current_retain = global_retain
                policy_source = "全局默认"
            print(f"\n镜像:{image_name}(标签总数:{len(records)} | 保留:{current_retain} [{policy_source}])")
            keep, to_delete = classify_components(records, retain=current_retain)
            for item in keep:
                marker = "最新" if item.version == "latest" else ""
                print(f" 保留 : {item.version:<20} {item.created_text()} {marker}")
            if not to_delete:
                print(" 跳过 : 标签数量未超过保留阈值,无需删除。")
                continue
            print(" 删除 :")
            for item in to_delete:
                print(f" {item.version:<20} {item.created_text()}")
                pending_deletions.append(item)
    if not pending_deletions:
        print("\n没有需要删除的组件,操作结束。")
        return
    print("\n汇总:")
    print(f" 待删除组件数量:{len(pending_deletions)}")
    print(f" 涉及仓库 :{', '.join(repositories)}")
    if args.dry_run:
        print("已开启预览模式,不执行删除。")
        return
    if not args.yes and not prompt_confirmation(len(pending_deletions)):
        print("操作已被用户取消。")
        return
    print("\n开始删除组件……")
    failures = 0
    for item in pending_deletions:
        try:
            success = delete_component(session, base_url, item)
        except RuntimeError as exc:
            print(f" [错误] {exc}")
            failures += 1
            continue
        if success:
            print(f" 已删除 {item.repository} :: {item.name}:{item.version}")
    if failures:
        print(f"\n执行完毕,但有 {failures} 个删除失败。")
        sys.exit(1)
    print("\n所有组件均已成功删除。")


if __name__ == "__main__":
    main()
Usage is the same as above. Only one method is shown here; the others work the same way.
python3 /opt/scripts/nexus_docker_cleanup.py \
    --base-url https://nexus.example.win \
    --username example \
    --password exampleaaa \
    --repos docker public-docker \
    --custom-retain mrs-playwright=1 \
    --custom-retain komgagen=1 \
    --custom-retain docker/mlntfy-agent=1 \
    --custom-retain docker/mlntfy-core=1 \
    --yes
This keeps only 1 tag for mrs-playwright, komgagen, and so on. The rules are key-value pairs: the key is the image name and the value is the number of tags to keep.
Sample output:
2025/12/02 21:19:48 镜像:my-mihomo(标签总数:1 | 保留:4 [全局默认])
2025/12/02 21:19:48 保留 : latest 2025-11-13 17:12:59 CST 最新
2025/12/02 21:19:48 跳过 : 标签数量未超过保留阈值,无需删除。
2025/12/02 21:19:48
2025/12/02 21:19:48 ========================================================================
2025/12/02 21:19:48 仓库:public-docker
2025/12/02 21:19:48 ========================================================================
2025/12/02 21:19:48 未找到任何组件。
2025/12/02 21:19:48
2025/12/02 21:19:48 汇总:
2025/12/02 21:19:48 待删除组件数量:6
2025/12/02 21:19:48 涉及仓库 :docker, public-docker
2025/12/02 21:19:48
2025/12/02 21:19:48 开始删除组件……
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-agent:52
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-agent:51
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-agent:49
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-core:52
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-core:51
2025/12/02 21:19:48 已删除 docker :: docker/mlntfy-core:49
2025/12/02 21:19:48
2025/12/02 21:19:48 所有组件均已成功删除。
2025/12/02 21:19:48 执行脚本 定时清理ml256的nexus的docker仓库的冗余镜像 成功